Rule Creation Guide

Florian Roth edited this page Oct 13, 2022 · 10 revisions

Rule Creation

Sigma is a very flexible standard with many optional fields. This guide will help you create a Sigma rule that aligns with the other community rules in our repository.

Rule Template

The best way to start is to use an existing rule that comes close to what you plan to write.

Make sure that the following fields are set in a rule that you would like to push to our public repository:

title: a short capitalised title with less than 50 characters
id: generate one here https://www.uuidgenerator.net/version4
status: experimental
description: A description of what your rule is meant to detect 
references:
    - A list of all references that can help a reader or analyst understand the meaning of a triggered rule
tags:
    - attack.execution  # example MITRE ATT&CK category
    - attack.t1059      # example MITRE ATT&CK technique id
    - car.2014-04-003   # example CAR id
author: Michael Haag, Florian Roth, Markus Neis  # example, a list of authors
date: 2018/04/06  # Rule date
logsource:                      # important for the field mapping in predefined or your additional config files
    category: process_creation  # In this example we choose the category 'process_creation'
    product: windows            # the respective product
detection:
    selection:
        FieldName: 'StringValue'
        FieldName: IntegerValue
        FieldName|modifier: 'Value'
    condition: selection
fields:
    - fields in the log source that are important to investigate further
falsepositives:
    - describe possible false positive conditions to help the analysts in their investigation
level: one of five levels (informational, low, medium, high, critical)

Common Pitfalls

Title

  • Don't use a prefix in the title like "Detects .."
  • Use a short title with less than 50 characters as an alert name
  • Save any explanation for the description
  • Use title casing (e.g. 'Suspicious PowerShell CommandLine' and not 'Suspicious powershell commandline')

Bad Examples:

  • Detects a process execution in a Windows folder that shouldn't contain executables (unnecessary prefix, too long, all lower case, contains an explanation)
  • Detects process injection (unnecessary prefix, too general, lower case first characters)

Good Examples:

  • Process Injection Using Iexplore.exe
  • Suspicious PowerShell Cmdline with JAB
  • Certutil Lolbin Decode Use

ID

No known pitfalls. We use the optional field id in our repo to provide a unique identifier that never changes, while all other field values of that rule could change over time. You can simply create an ID in the form of a version 4 UUID, for example with the generator linked in the rule template above.

Status

Every new rule has the status experimental. It gets the status test after months of productive use without any negative feedback from the community. After ~1 year of use without significant modifications apart from filters, we classify a rule as stable.

Description

The best descriptions start with Detects .... Please don't just reuse the title. Try to describe as well as possible what it means if that rule triggers. Analysts who see such a rule trigger should get a good understanding of what a match could possibly indicate.

Bad examples:

  • Rule detects w3wp.exe spawn bitsadmin.exe
  • New whoami process start

Good examples:

  • Detects a suspicious Background Intelligent Transfer Service execution by the IIS web server service
  • Detects the execution of whoami, which could be part of administrative activity but is also often used by attackers that have exploited some local privilege escalation or remote code execution vulnerability. The command whoami reveals the current user context. Administrators usually know which user they've used to log in. Attackers usually need to evaluate the user context after successful exploitation.

References

The values must be a list.

Use links to web pages or documents only.

  • don't link to EVTX files, PCAPs or other raw content
  • don't include links to MITRE ATT&CK techniques (we use the tags for that)

The links used in the list can be, for example:

  • links to a blog post or tweet
  • links to a project page of a certain hack tool
  • links to the manual page of a builtin Windows tool
  • links to advisories
  • links to discussions that better explain the detected threat
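
As a rough illustration of the points above, a references list could look like the following sketch. The URLs are placeholders on example.com and do not point to real write-ups.

```yaml
references:
    - https://example.com/blog/webshell-bitsadmin-analysis  # placeholder: blog post describing the technique
    - https://example.com/advisories/2022-001               # placeholder: vendor advisory
```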

Author

  • The author field is a string, not a list
  • Separate multiple authors with commas
  • If you use a special character like @ for a Twitter handle, you have to enclose the value in single quotes, e.g. author: '@cyb3rops'
  • You may add the type of contribution in round brackets, e.g. author: 'John Galt (idea), Florian Roth (rule)'

Date

We use the optional field date in our public rules to show the creation date of the rule without requiring a git log. Changing a rule that has already been published in the master branch requires you to add a field named modified that indicates the date of the modification.

We use the format YYYY/MM/DD, i.e. %Y/%m/%d in Python's strftime notation.

Reasons to change the modified date:

  • changed title
  • changed detection section
  • changed level
  • changed logsource (rare)

You don't need to change the modified date for changes in all other field values.
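
A minimal sketch of how the two fields sit next to each other in a rule. Both dates are illustrative; modified is only added once a published rule changes in one of the ways listed above.

```yaml
date: 2018/04/06      # creation date, format YYYY/MM/DD
modified: 2022/10/13  # illustrative: date of the last change to title, detection, level or logsource
```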

Tags

In our public ruleset, we use tags from MITRE ATT&CK, CAR and tags for CVE numbers.

Examples

tags: 
   - attack.credential_access
   - attack.t1003.002
   - car.2013-07-001
   - cve.2020.10189

  • Use lower-case tags only
  • We use . or - as divider in tag names
  • Replace space with an underscore _

Log Source

This is a more difficult section. There are two options:

  1. The log source already exists
  2. There is not a single rule in our repo for that log source

In case 1 please use one of these rules as a template. In case 2, check the existing rules in the different folders to get a feeling for the use of the three identifiers that you can use in this section:

  • product (e.g. linux, windows, cisco)
  • service (e.g. sysmon, ldapd, dhcp)
  • category (e.g. process_creation)

Note that these identifiers are used in the config files in ./tools/config used by sigmac (old) or [pySigma](https://github.com/SigmaHQ/pySigma/) (new) to map a specific log source's fields to the fields that you use in your rule. If you create a new log source, it would be great if you could add an appropriate mapping to all the current config files for the different backends (qradar, helk, splunk etc.). Otherwise, other users or we maintainers have to do that.
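
The three identifiers can be used alone or in combination. The following sketch shows three separate logsource sections, one per rule, separated here by YAML document markers. The linux/auditd pair is an assumption chosen for illustration and is not taken from this guide.

```yaml
# generic category, narrowed to one product
logsource:
    category: process_creation
    product: windows
---
# a specific service of a product
logsource:
    product: windows
    service: sysmon
---
# illustrative assumption: another product/service pair
logsource:
    product: linux
    service: auditd
```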

Detection

The detection section is very flexible, but we see common errors or styling issues in this section that require a rework by us maintainers.

  1. If your list consists of a single element, don't use a list (see examples below)
  2. Use only lowercase identifiers
  3. Put comments on lines if you like to (use 2 spaces to separate the expression from your comment, e.g. - 'cmd.exe' # command line)
  4. Don't use regular expressions unless you really have to (e.g. instead of CommandLine|re: '\\payload.*\skeyset' use CommandLine|contains|all with the values \payload and keyset).
  5. In new sources use the field names as they appear in the log source, remove spaces and keep hyphens (e.g. SAM User Account becomes SAMUserAccount)
  6. Don't use SIEM specific logic in your condition
  7. Create a pull request (all pull requests will automatically be checked for syntax errors, conformance with our standards and false positives)
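
The first four points can be sketched as follows. The field names (Image, ParentImage, CommandLine) and all values are illustrative assumptions for a process_creation rule, not a rule taken from the repository.

```yaml
detection:
    selection:  # lowercase identifier
        # single value: a plain scalar instead of a one-element list
        Image|endswith: '\bitsadmin.exe'
        # several alternatives: a list, any one of the values may match
        ParentImage|endswith:
            - '\w3wp.exe'  # IIS worker process
            - '\php-cgi.exe'
        # replaces CommandLine|re: '\\payload.*\skeyset'
        CommandLine|contains|all:
            - '\payload'
            - 'keyset'
    condition: selection
```

All three fields in the selection have to match, while the values of the ParentImage list are alternatives.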

Backslashes

Backslashes have two functions in Sigma:

  • Backslash as plain value
  • Backslash as prefix to escape characters with special meanings: the backslash \ itself, as well as the wildcards * and ?.

Handling the backslash in this way has the advantage that values that contain single backslashes (the common case) can be expressed in a plain way. On the other hand, some corner cases require additional escaping:

  • Values that contain only single backslashes can be expressed in the plain way: C:\Windows\System32\cmd.exe
  • Don't escape single backslashes with a backslash, write the plain value from the previous example instead of C:\\Windows\\System32\\cmd.exe.
  • If you want to express two plain backslashes use four of them: \\\\foo\bar results in the value \\foo\bar.
  • Write \\\\ if you want two backslashes
  • Write \* if you want a plain wildcard * as resulting value.
  • Write \\* if you want a plain backslash followed by a wildcard * as resulting value.
  • Write \\\* if you want a plain backslash followed by a plain * as resulting value.
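
The cases above could look like this in a rule. This is only a sketch: the Image and CommandLine fields and the C:\Temp path are assumptions for illustration. The values are written as single-quoted YAML strings, because YAML itself does not interpret backslashes inside single quotes, so only the Sigma escaping rules apply.

```yaml
detection:
    selection_plain:
        Image: 'C:\Windows\System32\cmd.exe'  # single backslashes, written as is
    selection_double:
        CommandLine|contains: '\\\\foo\bar'   # resulting value: \\foo\bar
    selection_asterisk:
        CommandLine|contains: '\*'            # resulting value: a plain *
    selection_wildcard:
        Image: 'C:\Temp\\*'                   # C:\Temp\ followed by a wildcard
    selection_literal:
        Image: 'C:\Temp\\\*'                  # C:\Temp\ followed by a plain *
    condition: 1 of selection_*
```

Note the corner case in selection_wildcard: writing 'C:\Temp\*' instead would turn the trailing * into a plain asterisk.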

Be aware that backslashes have special semantics in regular expressions. Under some circumstances, more backslashes are required than in plain values. Example:

    CommandLine|re: ...\\Microsoft...

Intention: Backslash followed by a plain M. Result: escaped M (\M). This is invalid in Perl-compatible regular expressions used by many target systems and will cause an error in the new Sigmatools. Fix:

    CommandLine|re: ...\\\\Microsoft...

Value Modifiers

Even though it is technically possible to chain value modifiers arbitrarily, not all combinations make sense. The following ordering rules should be followed:

  • Modifiers that add wildcards (startswith, endswith and contains) must not be followed by encoding modifiers (base64, base64offset) because they will also encode the wildcards themselves, causing the loss of their special functionality.
  • The value modifier chain must not end with character set encoding modifiers (utf16, utf16le, utf16be and wide). The resulting values are internally represented as byte sequences instead of text strings and contain null characters, which are usually difficult to handle in queries. Therefore they should be followed by an encoding modifier (base64, base64offset).
  • Usually it doesn't make sense to combine the re type modifier with any other modifier.
  • Generally, all could be put at an arbitrary position because all modifiers can handle single values as well as lists, but by convention this modifier should be put at the end.

Some common combinations are:

  • |contains|all: All values in the list are contained in the logged value. This is useful to express command line parameters in an order-agnostic way.
  • |utf16|base64offset|contains: value is Base64-encoded UTF16 and might be contained anywhere in a value (e.g. as part of a bigger Base64 value).
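
A short sketch of both combinations, assuming a process_creation log source with a CommandLine field. The search strings are illustrative only.

```yaml
detection:
    selection_params:
        # both values must appear somewhere in the command line, in any order
        CommandLine|contains|all:
            - ' -decode '
            - '.exe'
    selection_encoded:
        # character set encoding first, then Base64, wildcards last
        CommandLine|utf16le|base64offset|contains: 'http://'
    condition: 1 of selection_*
```

The order in selection_encoded follows the rules above: contains comes after base64offset so that the added wildcards are not encoded, and the chain does not end with a character set modifier.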

Fields

These are the fields that are very helpful in the evaluation of a certain event. For example, it is helpful to know the parent process of a process that contains suspicious strings in its command line parameters.

These fields could be extracted automatically and presented to the analyst in order to speed up the analysis.
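
For a rule on process creation events, such a list might look like the following sketch. The field names are assumptions based on Sysmon-style process creation logs.

```yaml
fields:
    - CommandLine
    - ParentImage        # the parent process mentioned above
    - ParentCommandLine
```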

False Positive

Think about possible false positive conditions that could also trigger the rule. This list should contain useful hints for an analyst. E.g. the comment "Legitimate processes that delete the shadow copies" can be a hint for an analyst to check for backup processes on that system or ask for any unusual administrative activity that involved the deletion of the local volume shadow copies.
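
In the rule itself, the hint from the example above is simply an entry in a list. The second entry is an illustrative assumption.

```yaml
falsepositives:
    - Legitimate processes that delete the shadow copies
    - Backup software that rotates old snapshots  # illustrative assumption
```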

Level

Apart from informational, the four remaining levels can be divided into two categories.

  1. Rules that have informative character and should be displayed in a list or bar chart (low, medium)
  2. Rules that should trigger a dedicated alert (high, critical)

Apply the following guidelines when setting a level:

  • Rules of level critical should never trigger a false positive and be of high relevance
  • Rules of level high trigger on threats of high relevance that have to be reviewed manually (rare false positives > baselining required)
  • Rules of level high and critical indicate an incident (if not a false positive)
  • Rules of level low and medium indicate suspicious activity and policy violations
  • Rules of level informational have informative character and are often used for compliance or correlation purposes

External Guides

The following links lead to web pages that explain the rule creation process with screenshots or certain aspects of the expression language.

Tools

We recommend the use of Visual Studio Code with the Sigma Extension.