Defender’s Toolkit 102: Sigma Rules

Syed Hasan
7 min read · Mar 6, 2021

Now that the intelligence community is finally reaching maturity, advisories shared between organizations often contain useful detection use-cases. If we were to travel back a few years, the average analyst would dread manually converting these use-cases into searchable queries for their logging platform or SIEM. What we sorely needed was a standard: a way to write a query once and search it everywhere. That is precisely what Sigma provides.

What is Sigma?

Sigma is the brainchild of Florian Roth and Thomas Patzke. To quote the GitHub page of the open-source tool, Sigma:

is a generic and open signature format that allows you to describe relevant log events in a straightforward manner.

Let’s go back to our problem. A single use-case had to be converted separately for every SIEM or centralized logging platform used by SOCs all over the world. Luckily, Sigma rules (a use-case written in the Sigma specification is called a rule) give analysts a strict format to describe a particular use-case once and convert it for whichever platform they want.

Writing Sigma rules is fairly easy. Rules follow the YAML serialization format (we’ll discuss the specification later), which is flexible enough to accommodate custom fields. One of the best features is that Sigma doesn’t restrict you to a particular log file or format; you can even apply rules via grep on the command line. The re-usability of rules and the ease of sharing them are among the many reasons why researchers (like myself) love this tool!

Writing Sigma Rules

We’ve previously discussed how a Sigma rule is encapsulated in a YAML file. This YAML file follows a specification set by the authors of Sigma, which is available on the GitHub page as well. To summarize that information, the following sections constitute a good rule:

  • Metadata (Title, ID, Author, References, Tags, Level)
  • Log Source (the source the data is collected from, e.g. the Windows Security log channel)
  • Detections (selections, filters, and conditions)
  • False-positives
  • Optional (and custom) tags

The official specification on GitHub has a detailed explanation of every section and the fields within each one. If you’d like a thorough review of the specification, head to the link. If you’d like a quick introduction to writing Sigma rules, read ahead.

A Sample Rule to Detect Mimikatz

How to Start Writing Sigma Rules?

Let’s break the process down into small, achievable steps.

Step 1: Acquire the Sigma repository [Optional]

The official Sigma repository contains all the documentation, sample rules, and the compiler required to convert Sigma rules into queries. To get started, fetch the Sigma repository and store it on your disk:

git clone https://github.com/SigmaHQ/sigma

Step 2: Create a YAML file

Let’s start writing the Sigma rule by creating a new YAML file locally. You can use your favorite development environment or editor (even Notepad, if you love excruciating pain). A code editor lets you format, syntax-check, and highlight your YAML without hitting your head against a wall. Once the file is created, copy the skeleton of the specification into it:

title:
id:
status:
description:
author:
references:
logsource:
    category:
    product:
    service:
    definition:
detection:
    condition:
fields:
falsepositives:
level:
tags:

Step 3: Provide Input to Attributes

Our next step is to fill in the required attributes using information from the source of the detection use-case. For this particular example, I’ll cheat and open one of the rules from the Sigma repository, which are available under the .\rules\ directory.

Here are a few attributes which are almost always available to us in advisories:

title: Mimikatz Command Line
id: a642964e-bead-4bed-8910-1bb4d63e3b4d
description: Detection well-known mimikatz command line arguments
author: Teymur Kheirkhabarov, oscd.community
date: 2019/10/22
modified: 2020/09/01
references:
- https://www.slideshare.net/heirhabarov/hunting-for-credentials-dumping-in-windows-environment
falsepositives:
- Legitimate Administrator using tool for password recovery
level: medium
status: experimental

Here, the status attribute defines the maturity of the rule itself. It can have either one of these values:

  • Stable — Considered stable; usable in production environments and dashboards
  • Test — Almost stable; may still require some fine-tuning
  • Experimental — Can be noisy or generate false positives; needs tuning before production use

Similarly, the level attribute defines the criticality of the rule itself. If an alert is generated based on the rule, how urgently should you respond? Based on internal processes, the following four values can be used for the level attribute and responded to accordingly:

  • Low
  • Medium
  • High
  • Critical

Next up — we have the logsource and detection sections. Let’s cover them one by one. From the same rule:

logsource:
    category: process_creation
    product: windows

The logsource section describes the log data to which the detection logic applies. It can have three major attributes:

  • Category — log files which fall under a particular category e.g. DNS server logs, process_creation, file_event logs, etc.
  • Product — log files generated by a particular product e.g. windows (Eventlog), linux, splunk, etc.
  • Service — Subset of a product’s log e.g. security, powershell, sysmon, etc.

For our example, we’ve used the process_creation category which caters to generic logs of process creation under the Windows Eventlog.
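For comparison, a rule that targets a specific Windows channel would set service instead of category. A minimal sketch, assuming we want process-creation events from the Security channel (the event ID and field here are my illustration, not part of the article’s example):

```yaml
logsource:
    product: windows
    service: security        # the Windows Security event log channel
detection:
    selection:
        EventID: 4688        # "a new process has been created"
    condition: selection
```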

detection:
    selection_1:
        CommandLine|contains:
            - DumpCreds
            - invoke-mimikatz
    selection_2:
        CommandLine|contains:
            - rpc
            - token
            - crypto
            - dpapi
            - sekurlsa
            - kerberos
            - lsadump
            - privilege
            - process
    selection_3:
        CommandLine|contains:
            - '::'
    condition: selection_1 or selection_2 and selection_3

For the detection section, we’ve defined the following attributes:

  • Selections — What you actually wish to select/search from the log data
  • Conditions — How should the selection or filters be evaluated

Here, each selection (a search identifier) matches against the CommandLine field in the log data and uses the transformation modifier contains to check whether the listed keywords are present. A short list of common modifiers:

  • contains
  • all
  • base64
  • endswith
  • startswith
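Modifiers are appended to the field name with a pipe character and can be chained. A hypothetical sketch combining a few of them (the values are illustrative, not taken from the rule above):

```yaml
detection:
    selection_img:
        Image|endswith: '\mimikatz.exe'      # matches the end of the image path
    selection_args:
        CommandLine|contains|all:            # every listed value must appear
            - 'sekurlsa'
            - 'logonpasswords'
    condition: selection_img or selection_args
```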

We can also use wildcard characters to match a wide range of values in the log data. For example, instead of a hardcoded path, we can find executions of cmd.exe with ‘*\cmd.exe’. All three selections follow the same format. Finally, the condition attribute combines the three selections; since and binds tighter than or, the rule matches on selection_1 alone, or on selection_2 and selection_3 together.
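Sticking with the wildcard idea, a minimal sketch of a selection that matches cmd.exe regardless of which directory it runs from (the rule itself is hypothetical; Image is a common process_creation field):

```yaml
detection:
    selection:
        Image: '*\cmd.exe'    # '*' covers any parent path, e.g. C:\Windows\System32\
    condition: selection
```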

For conditions, we can evaluate the expressions using:

  • Logical AND/OR operations
  • 1 of selection* or all of selection* — you might recognize these from YARA as well
  • Negation using ‘not’ — e.g. not selection
  • Grouping expressions by using parenthesis — e.g. (selection1 and selection2) or selection3
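Putting these operators together, a hypothetical detection might chain them like so (all names and values here are illustrative):

```yaml
detection:
    selection_parent:
        ParentImage|endswith: '\winword.exe'
    selection_child:
        Image|endswith: '\powershell.exe'
    filter_update:
        CommandLine|contains: 'OfficeUpdate'   # hypothetical benign updater
    condition: all of selection* and not filter_update
```

Here, all of selection* requires both selections to match, while not excludes events that hit the filter.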

Mind you — there is much more to learn here about modifiers (type vs. transformation modifiers), various categories, lists and maps in selections, grouping of conditions and aggregations. Sadly, this article can’t contain it all. I, once again, urge you to read the specification (thoroughly) once so you can easily write better rules.

Step 4: Compiling the Rule

In order to convert a Sigma rule into a searchable query for your SIEM or logging platform, you can use the Sigmac tool shipped with Sigma itself. Head into the .\tools\ directory and execute Sigmac:

python .\sigmac -h

Sigmac allows you to convert a particular rule for a target like Splunk, QRadar, or even PowerShell. Sigmac also uses field mappings to convert fields used in the rule into actual fields available for the desired target. For example, the CommandLine field might be converted into “Command Line” for your custom solution (based on your target/configuration).
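A minimal configuration file illustrating such a mapping might look like this (the mapped field names are hypothetical; real, ready-made configurations ship with the Sigma tools):

```yaml
title: Custom field mapping            # hypothetical configuration
fieldmappings:
    CommandLine: 'Command Line'        # rule field -> field name in your SIEM
    Image: 'Process Path'
```

You would then pass this file to Sigmac via the -c flag, just like the bundled configurations.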

In order to view all available targets, configurations, and modifiers, simply run:

python .\sigmac --lists

To finally convert the rule for a target (e.g. Splunk), pass the -c flag so the configuration sets the field mappings correctly:

python .\sigmac -t splunk -c splunk-windows ..\rules\windows\process_creation\win_mimikatz_command_line.yml

You can now go ahead and use the rule on your Splunk instance!

To read more about Configurations and Field Mapping — refer to the source: https://github.com/SigmaHQ/sigma/blob/master/tools/README.md

A quicker way of writing Sigma rules is to browse the .\rules\ directory, find a rule with similar detections or selections, and edit it. This can save you some time; just make sure you change the metadata (especially the id) so the newly written rule is unique. It’s fairly common to take inspiration from the available rules, especially during the learning phase.

Using Uncoder.io

SOC Prime has also released a tool named Uncoder. Using the web interface, you can easily write and play around with Sigma rules on the web. It allows you to write, edit, test, and compile the rules for a variety of targets.

Sample Rule translated via Uncoder

Conclusion

Writing Sigma rules is quite easy. There are more sophisticated conditions and selections as well, but you can grasp them in time. The authors have done a tremendous job documenting the tool and its compiler for newcomers. Practice writing Sigma rules every now and then so they can aid your incident response process and empower your SOC in the longer run.
