*************
Writing Rules
*************

Gene rules have been designed to be simple to understand and to parse. To avoid
developing a custom rule parser, we decided to use a **JSON** document as the
container. Thus every rule, to work properly, has to follow a specific format,
which is described on this page.

Getting the format supported by the engine
==========================================

A starting point to understand the format of a rule is to use the ``gene``
command line utility to get the format supported by your engine::

    gene -template

This command should return a **JSON** document containing all the fields used
by your current version of ``gene``.

.. note::

    When dealing with **JSON** documents on the command line, there is an
    amazing utility that you must try out if you do not know it yet. This tool
    is called ``jq``. If you are running Linux you can probably install it from
    your package manager, otherwise visit the ``jq`` website.

Rule Structure
==============

.. code-block:: JSON

    {
      "Name": "",
      "Tags": [],
      "Meta": {
        "EventIDs": [],
        "Channels": [],
        "Computers": [],
        "Traces": [],
        "Criticality": 0,
        "Disable": false
      },
      "Matches": [],
      "Condition": ""
    }

.. note::

    The fields present in the template shown above are the ones used by the
    engine. It means that **any** additional field will not impact the engine.
    This trick can be used to document the rule. It is good practice to add
    information such as **Author**, **Comments** and any relevant **Links** in
    the **Meta** section.

.. table:: Field Definition

    +------------+------------+----------------------------------------------------+
    | Field      | Type       | Description                                        |
    +============+============+====================================================+
    | Name       | string     | Name of the rule                                   |
    +------------+------------+----------------------------------------------------+
    | Tags       | []string   | Contains a list of tags related to the rule. It    |
    |            |            | can be used to group rules according to their      |
    |            |            | tag(s).                                            |
    +------------+------------+----------------------------------------------------+
    | Meta       | dict       | Contains information related to the trigger of     |
    |            |            | the rule. It is matched against the "System"       |
    |            |            | section of Windows events to speed up the          |
    |            |            | match.                                             |
    +------------+------------+----------------------------------------------------+
    | EventIDs   | []int      | List of Windows Event IDs the rule should match    |
    |            |            | against. If empty, the rule applies to any         |
    |            |            | Event ID of the ``Channels`` (see next row).       |
    +------------+------------+----------------------------------------------------+
    | Channels   | []string   | List of channels the rule should apply to. If      |
    |            |            | empty, the rule applies to any event of any        |
    |            |            | channel.                                           |
    +------------+------------+----------------------------------------------------+
    | Computers  | []string   | List of computer names the rule should apply to.   |
    |            |            | If empty, the rule applies to all computers.       |
    +------------+------------+----------------------------------------------------+
    | Traces     | []string   | List of traces used to trace other events related  |
    |            |            | to the rule. A rule can be used to generate        |
    |            |            | dynamic rules with information from the event      |
    |            |            | which matched the rule. The syntax of each trace   |
    |            |            | must follow `Traces Format`_.                      |
    +------------+------------+----------------------------------------------------+
    |Criticality |0 < int < 10| The criticality level attributed to the events     |
    |            |            | matching the rule. If an event matches several     |
    |            |            | rules, the criticality levels are summed and       |
    |            |            | capped at 10.                                      |
    +------------+------------+----------------------------------------------------+
    | Disable    | bool       | Boolean value used to disable the rule.            |
    +------------+------------+----------------------------------------------------+
    | Matches    | []string   | List of **Matches**, which should follow the       |
    |            |            | syntax of `Matches Format`_.                       |
    +------------+------------+----------------------------------------------------+
    | Condition  | string     | String implementing the logic on the **Matches**   |
    |            |            | to trigger the rule. The syntax should be          |
    |            |            | compliant with `Condition Format`_.                |
    +------------+------------+----------------------------------------------------+

.. note::

    The more precise the **EventIDs** and **Channels** fields are, the faster
    the rule is. This information is mainly used to filter out irrelevant
    events.

Matches Format
--------------

A **Match** can be seen as an atomic check which is done on every Windows Event
(pre-filtered using the **Meta** section of the rule) going through the engine.
Every match can be referenced one or more times in the **Condition** to create
complex matching rules. Currently, the latest version of the engine supports
two kinds of **Matches**.

.. important::

    It is very important to remember that **Matches** only apply to the fields
    located under the ``EventData`` section of Windows Events.

Field Matches
^^^^^^^^^^^^^

A **Field Match** is basically an **equality** or a **regex** check done on a
given **field value**. This kind of **Match** brings flexibility to the engine
since anything can be matched through regular expressions.

**Syntax:** ``$VAR_NAME: FIELD OPERATOR 'VALUE'``

.. table:: Field Match Symbols Definition

    +------------+----------------------------------------------------------------+
    | Symbols    | Description                                                    |
    +============+================================================================+
    | VAR_NAME   | Name of the variable used to access the result of the **Match**|
    |            | in the **Condition**; it must be preceded by a ``$``           |
    +------------+----------------------------------------------------------------+
    | FIELD      | Field to match in the ``EventData`` section of Windows Events  |
    +------------+----------------------------------------------------------------+
    | OPERATOR   | Operator to use for the match:                                 |
    |            |                                                                |
    |            | * ``=``: equal operator                                        |
    |            | * ``~=``: regexp operator (compiles VALUE as a regex)          |
    |            | * ``>``: greater than operator (only for ``int`` fields)       |
    |            | * ``<``: lower than operator (only for ``int`` fields)         |
    |            | * ``&=``: flag test operator (expects an ``int`` field)        |
    +------------+----------------------------------------------------------------+
    | VALUE      | Must be surrounded by **single quotes** ``'``. This is the     |
    |            | **value/regex** to match against to make **$VAR_NAME = true**  |
    +------------+----------------------------------------------------------------+

Match Workflow::

    +-------+                   +---------+
    | Event |                   |  Match  |
    +-------+                   +---------+
        |        +----------+        |
        +------> |  Engine  | <------+
                 +----------+
                      |
        +---------------------------+
        | Extracts value from FIELD |
        +---------------------------+
                      |
        +---------------------------+
        | Does value match VALUE    |
        | according to OPERATOR ?   |
        +---------------------------+
                      |
                      ^
                YES  / \  NO
                    /   \
    +------------------+ +-------------------+
    | $VAR_NAME = true | | $VAR_NAME = false |
    +------------------+ +-------------------+
                    \   /
                     \ /
                      v
                      |
           +--------------------+
           | $VAR_NAME value is |
           | used in condition  |
           +--------------------+

.. note::

    Any regular expression must follow the Go regexp syntax.

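The match workflow above can be sketched in a few lines of Python. This is only
an illustration and not the engine's code: the function name
``eval_field_match``, the handling of missing or non-integer fields, and the
reading of ``&=`` as "every bit of VALUE is set in the field" are assumptions
made for the example. Python's ``re`` module stands in for the Go regexp syntax
the engine expects.

```python
import re


def eval_field_match(event_data, field, operator, value):
    """Return the boolean that a Field Match would assign to its $VAR_NAME.

    event_data is a plain dict standing in for the EventData section.
    """
    if field not in event_data:
        return False  # assumption: a missing field simply does not match
    actual = str(event_data[field])
    if operator == "=":
        return actual == value
    if operator == "~=":
        # unanchored search, so patterns must carry their own ^ or $ anchors
        return re.search(value, actual) is not None
    if operator not in (">", "<", "&="):
        raise ValueError(f"unsupported operator: {operator}")
    try:
        # base 0 accepts both decimal ("16") and hexadecimal ("0x10") strings
        left, right = int(actual, 0), int(value, 0)
    except ValueError:
        return False  # assumption: non-integer values never match
    if operator == ">":
        return left > right
    if operator == "<":
        return left < right
    # "&=": assumed to mean that every bit set in VALUE is also set in the field
    return (left & right) == right


# Hypothetical EventData of a Sysmon ProcessAccess event
event_data = {
    "TargetImage": r"C:\Windows\System32\lsass.exe",
    "GrantedAccess": "0x1410",
}
lsass = eval_field_match(event_data, "TargetImage", "~=", r"(?i:\\lsass\.exe$)")
ga = eval_field_match(event_data, "GrantedAccess", "&=", "0x10")
print(lsass and ga)
```

The regular expression is written here as it looks once the **JSON** document
has been parsed, which is why the path separator is escaped with ``\\`` and not
with the ``\\\\`` needed inside a rule file.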
Example
"""""""

The following snippet shows a rule used to catch Windows Event log clearing
attempts using ``wevtutil.exe``.

.. code-block:: JSON

    {
      "Name": "EventClearing",
      "Tags": ["PostExploit"],
      "Meta": {
        "EventIDs": [1],
        "Channels": ["Microsoft-Windows-Sysmon/Operational"],
        "Computers": [],
        "Criticality": 8,
        "Author": "@0xrawsec"
      },
      "Matches": [
        "$im: Image ~= '(?i:\\\\wevtutil\\.exe$)'",
        "$cmd: CommandLine ~= '(?i: cl | clear-log )'"
      ],
      "Condition": "$im and $cmd"
    }

.. warning::

    In order to match a single ``\`` Windows path separator, we need to use
    ``\\\\`` when using the ``~=`` operator and ``\\`` when using the ``=``
    operator. This is due to the **JSON** document parsing process, which
    consumes one level of escaping.

The following additional example shows how to detect a suspicious access to
``lsass.exe`` with the help of the ``&=`` operator. Basically, we want to
trigger this alert on any **ProcessAccess** event targeting ``lsass.exe`` where
the **GrantedAccess** contains the process **read access flag 0x10**.

.. code-block:: JSON

    {
      "Name": "SuspiciousLsassAccess",
      "Tags": ["Mimikatz", "Credentials", "Lsass"],
      "Meta": {
        "EventIDs": [10],
        "Channels": ["Microsoft-Windows-Sysmon/Operational"],
        "Computers": [],
        "Criticality": 8,
        "Author": "0xrawsec"
      },
      "Matches": [
        "$ctwdef: CallTrace ~= '(?i:windows defender)'",
        "$ga: GrantedAccess &= '0x10'",
        "$lsass: TargetImage ~= '(?i:\\\\lsass\\.exe$)'"
      ],
      "Condition": "$lsass and $ga and !$ctwdef"
    }

Container Matches
^^^^^^^^^^^^^^^^^

A **Container Match** is a little bit more advanced since it can be used to
extract a part of a **field value** and check it against a container. For
instance, with this kind of **Match**, we are able to extract the **domain**
information contained in Windows DNS-Client logs and check it against a
blacklist. Although implementing this use case would be possible with
**Field Matches**, it would be much slower because of the regex engine.
In addition, the rule would need to be updated for every new entry to check,
whereas with a **Container Match** only the container (a simple separate file)
needs to be updated. The speed is provided by the container, which is
implemented as a set data structure.

**Syntax:** ``$VAR_NAME: extract('REGEXP', FIELD) in CONTAINER``

.. table:: Container Match Symbols Definition

    +------------+----------------------------------------------------------------+
    | Symbols    | Description                                                    |
    +============+================================================================+
    | VAR_NAME   | Name of the variable used to access the result of the **Match**|
    |            | in the **Condition**; it must be preceded by a ``$``           |
    +------------+----------------------------------------------------------------+
    | FIELD      | Field to extract from                                          |
    +------------+----------------------------------------------------------------+
    | REGEXP     | Regular expression used to extract a value from FIELD and check|
    |            | it against a **CONTAINER**. **REGEXP** must follow **named**   |
    |            | regexp syntax ``(?P<name>re)``                                 |
    +------------+----------------------------------------------------------------+
    | CONTAINER  | Container to use to check the extracted value                  |
    +------------+----------------------------------------------------------------+

.. important::

    * If a rule makes use of an **undefined container**, the rule will be
      disabled at runtime and a warning message will be printed.
    * A given container is shared across all the rules loaded into the engine.
    * Any regular expression must follow the Go regexp syntax.

Example
"""""""

This rule shows an example of how to extract domains and sub-domains from
**Windows DNS-Client** logs and check them against a blacklist.

.. code-block:: JSON

    {
      "Name": "BlacklistedDomain",
      "Tags": ["DNS"],
      "Meta": {
        "EventIDs": [],
        "Channels": ["Microsoft-Windows-DNS-Client/Operational"],
        "Computers": [],
        "Criticality": 10,
        "Author": "@0xrawsec",
        "Comment": ""
      },
      "Matches": [
        "$domainBL: extract('(?P\\w+\\.\\w+$)',QueryName) in blacklist",
        "$subdomainBL: extract('(?P\\w+\\.\\w+\\.\\w+$)',QueryName) in blacklist",
        "$subsubdomainBL: extract('(?P\\w+\\.\\w+\\.\\w+\\.\\w+$)',QueryName) in blacklist"
      ],
      "Condition": "$domainBL or $subdomainBL or $subsubdomainBL"
    }

Traces Format
-------------

A trace is used to generate a new rule **on the fly**, derived from both the
rule which triggered and the **Windows Event** which matched. This feature
allows the engine to do some **basic** correlation. The generated rule is very
basic and has a single match.

**Syntax:** ``EVENT_IDS:CHANNELS: NEW_FIELD OPERATOR EVT_VAL_FIELD``

.. table:: Trace Symbols Definition

    +---------------+----------------------------------------------------------------+
    | Symbols       | Description                                                    |
    +===============+================================================================+
    | EVENT_IDS     | Comma separated list of **Windows Event IDs** used to set      |
    |               | the **EventIDs** field of the generated rule. If empty, the    |
    |               | default is to inherit from **the rule defining the trace**.    |
    +---------------+----------------------------------------------------------------+
    | CHANNELS      | Comma separated list of **Windows Event Log Channels** used to |
    |               | set the **Channels** field of the generated rule. If empty, the|
    |               | default is to inherit from **the rule defining the trace**.    |
    +---------------+----------------------------------------------------------------+
    | NEW_FIELD     | **Field name** to use for the **single Match** of the generated|
    |               | rule.                                                          |
    +---------------+----------------------------------------------------------------+
    | OPERATOR      | Operator to use for the match:                                 |
    |               |                                                                |
    |               | * ``=``: equal operator                                        |
    |               | * ``~=``: regexp operator (compiles VALUE as a regex)          |
    +---------------+----------------------------------------------------------------+
    | EVT_VAL_FIELD | Name of the field in the matching **Windows Event** to extract |
    |               | the value from; it is used as **VALUE** in the generated rule  |
    +---------------+----------------------------------------------------------------+

.. note::

    The keywords ``any``, ``ANY`` or ``*`` can be used instead of a comma
    separated list in **EVENT_IDS** and **CHANNELS** to apply the trace to any
    Event ID and any Channel, respectively.

The concept behind traces can be a little hard to understand (and also to
explain). That is why the following snippet shows what a rule generated from a
trace would look like (you would not see it otherwise, since it is generated
dynamically).

.. code-block:: JSON

    {
      "Name": "GENERATED_NAME",
      "Tags": ["inherited from triggering rule"],
      "Meta": {
        "EventIDs": ["inherited from triggering rule OR set from trace"],
        "Channels": ["inherited from triggering rule OR set from trace"],
        "Computers": ["inherited from triggering rule"],
        "Traces": [
          "inherited from triggering rule"
        ],
        "Criticality": "inherited from triggering rule"
      },
      "Matches": [
        "$m: NEW_FIELD OPERATOR 'ValueOf(EVT_VAL_FIELD) extracted from Matching Event'"
      ],
      "Condition": "$m"
    }

.. warning::

    * Trace generation is not enabled by default by the engine. In order to
      enable it, use the ``-trace`` command line switch.
    * When trace mode is enabled, many rules can be generated at runtime and
      the engine will by design become slower as the number of rules grows.
    * If **X** traces are defined, **X** rules will be generated at runtime
      when **trace mode** is enabled and the rule matches a **Windows Event**.

Example
^^^^^^^

The following rule will generate rules to trace **any Event ID** from channel
**Microsoft-Windows-Sysmon/Operational** where either the **ProcessGuid** or
the **ParentProcessGuid** is equal to the **ProcessGuid** of the event which
triggered the rule.

.. code-block:: JSON

    {
      "Name": "MaliciousLsassAccess",
      "Tags": ["Mimikatz", "Credentials", "Lsass"],
      "Meta": {
        "EventIDs": [10],
        "Channels": ["Microsoft-Windows-Sysmon/Operational"],
        "Computers": [],
        "Traces": [
          "*::ProcessGuid = ProcessGuid",
          "*::ParentProcessGuid = ProcessGuid"
        ],
        "Criticality": 10,
        "Author": "0xrawsec"
      },
      "Matches": [
        "$ct: CallTrace ~= 'UNKNOWN'",
        "$lsass: TargetImage ~= '(?i:\\\\lsass\\.exe$)'"
      ],
      "Condition": "$lsass and $ct"
    }

Condition Format
----------------

A condition applies logic to the different **Matches** defined in the rule. If
the result of the computation of the **Condition** is **true**, the event is
considered as matching the rule.

.. table:: Allowed Symbols in Condition

    +---------+----------------------------------------------------------------+
    | Symbols | Description                                                    |
    +=========+================================================================+
    | ``$var``| Variable referencing a **Match**                               |
    +---------+----------------------------------------------------------------+
    | ``()``  | Used to group / prioritize some logical expressions            |
    +---------+----------------------------------------------------------------+
    | ``!``   | Negates a **Match** or a grouped expression                    |
    +---------+----------------------------------------------------------------+
    | ``AND`` | AND logical operator                                           |
    +---------+                                                                |
    | ``and`` |                                                                |
    +---------+                                                                |
    | ``&&``  |                                                                |
    +---------+----------------------------------------------------------------+
    | ``OR``  | OR logical operator                                            |
    +---------+                                                                |
    | ``or``  |                                                                |
    +---------+                                                                |
    | ``||``  |                                                                |
    +---------+----------------------------------------------------------------+

.. important::

    For every **Windows Event** tested against a rule, the **Condition** is
    evaluated in real time **from left to right**. As a consequence, the order
    of the variables to check might have a small impact on the rule
    performance. For more efficiency, always try to put the most restrictive
    ones first.

Example
^^^^^^^

The following rule is used to match suspicious explicit network logons. It
shows an example of an advanced condition.

.. code-block:: JSON

    {
      "Name": "ExplicitNetworkLogon",
      "Tags": ["Lateral", "Security"],
      "Meta": {
        "EventIDs": [4624],
        "Channels": ["Security"],
        "Computers": [],
        "Criticality": 5,
        "Author": "@0xrawsec"
      },
      "Matches": [
        "$logt: LogonType = '3'",
        "$user: TargetUserName = 'ANONYMOUS LOGON'",
        "$iplh1: IpAddress = '-'",
        "$iplh2: IpAddress = '127.0.0.1'",
        "$enddol: TargetUserName ~= '\\$$'"
      ],
      "Condition": "$logt and !($user or $iplh1 or $iplh2 or $enddol)"
    }

Regular Expression Templates
----------------------------

Since **Gene version 1.4**, we have introduced regular expression templates.
As the number of rules grew, we noticed that we often reuse the same regular
expressions. It is tedious to remember where a similar regex was used in order
to copy and paste it into the new rule being written. In addition, if a mistake
is found in such a shared regex, all the places where it is used need to be
updated, which is error prone. So the purpose of this feature is simply to make
the life of the rule writer easier.

There is no rocket science behind regular expression templates. Templates are
defined in a file (or files) and can then be used in the **Matches** following
a given syntax.

**File Extension**: ``.tpl`` (must be located in the rule directory in use)

**Syntax**

* **definition:** ``TEMPLATE_NAME: 'REGULAR_EXPRESSION'``
* **use:** ``{{TEMPLATE_NAME}}``

Example
^^^^^^^

Here is an example of a few **regex templates**:

.. code-block:: bash

    # Paths
    windows: '(?i:C:\\Windows\\)'
    system: '(?i:C:\\Windows\\Sys(wow64|tem32)\\)'
    programfiles: '(?i:C:\\(PROGRA~2|Program Files.*?)\\)'

    #Extensions
    script-exts: '(?i:(\.ps1|\.bat|\.cmd|\.vb|\.vbs|\.vbscript|\.vbe|\.js|\.jse|\.ws|\.wsf))'
    exec-exts: '(?i:(\.acm|\.ax|\.com|\.cpl|\.dic|\.dll|\.drv|\.ds|\.efi|\.exe|\.grm|\.iec|\.ime|\.lex|\.msstyles|\.mui|\.ocx|\.olb|\.rll|\.rs|\.scr|\.sys|\.tlb|\.tsp))'

.. important::

    To escape characters, just use a single ``\``; this differs from the
    escaping required in rules.

To make use of the templates previously defined:

.. code-block:: JSON

    {
      "Name": "HeurDropper",
      "Tags": ["Heuristics", "CreateFile"],
      "Meta": {
        "EventIDs": [11],
        "Channels": ["Microsoft-Windows-Sysmon/Operational"],
        "Computers": [],
        "Criticality": 8,
        "Author": "0xrawsec",
        "Comments": "Experimental rule to detect executable files dropped by common utilities"
      },
      "Matches": [
        "$susp: Image ~= '{{suspicious}}$'",
        "$target: TargetFilename ~= '({{exec-exts}}|{{script-exts}})$'"
      ],
      "Condition": "$susp and $target"
    }

In order to debug the rules using templates, we have introduced a new feature
in the ``gene`` command line utility. One can use the ``-dump`` command line
switch to dump the rule as it is after template replacement.

.. code-block:: bash

    > gene -dump HeurDropper -r ./gene-rules | jq
    {
      "Name": "HeurDropper",
      "Tags": [
        "Heuristics",
        "CreateFile"
      ],
      "Meta": {
        "EventIDs": [
          11
        ],
        "Channels": [
          "Microsoft-Windows-Sysmon/Operational"
        ],
        "Computers": [],
        "Traces": null,
        "Criticality": 8,
        "Disable": false
      },
      "Matches": [
        "$susp: Image ~= '(?i:\\\\(certutil|rundll32|powershell|wscript|cscript|cmd|mshta|regsvr32|msbuild|installutil|regasm)\\.exe)$'",
        "$target: TargetFilename ~= '((?i:(\\.acm|\\.ax|\\.com|\\.cpl|\\.dic|\\.dll|\\.drv|\\.ds|\\.efi|\\.exe|\\.grm|\\.iec|\\.ime|\\.lex|\\.msstyles|\\.mui|\\.ocx|\\.olb|\\.rll|\\.rs|\\.scr|\\.sys|\\.tlb|\\.tsp))|(?i:(\\.ps1|\\.bat|\\.cmd|\\.vb|\\.vbs|\\.vbscript|\\.vbe|\\.js|\\.jse|\\.ws|\\.wsf)))$'"
      ],
      "Condition": "$susp and $target"
    }

.. note::

    As you can see in the dumped rule, each single ``\`` becomes ``\\``; this
    is due to **JSON** string escaping.

.. note::

    Notice how easy it is now to add a new extension to the list so that it
    impacts all the rules using this template.
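
The template replacement described in this section can be sketched as follows.
This is a minimal illustration under stated assumptions, not the way ``gene``
implements it: the helper names ``parse_templates`` and ``expand`` are
hypothetical, and the parsing rules (``#`` comment lines, split on the first
colon, surrounding single quotes dropped) are inferred from the ``.tpl``
example above.

```python
import json
import re


def parse_templates(text):
    """Parse ``NAME: 'REGEX'`` lines, skipping blank lines and # comments."""
    templates = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # split on the first colon only: the regex itself may contain colons
        name, _, regex = line.partition(":")
        regex = regex.strip()
        if len(regex) >= 2 and regex[0] == regex[-1] == "'":
            regex = regex[1:-1]  # drop the surrounding single quotes
        templates[name.strip()] = regex
    return templates


def expand(match, templates):
    """Replace every {{NAME}} placeholder in a Match string."""
    # a function replacement is inserted verbatim, so backslashes survive;
    # an unknown template name raises KeyError
    return re.sub(r"\{\{([\w-]+)\}\}", lambda m: templates[m.group(1)], match)


# Shortened, hypothetical template file content
tpl_text = r"""
# Extensions
script-exts: '(?i:(\.ps1|\.bat))'
"""
templates = parse_templates(tpl_text)
expanded = expand("$target: TargetFilename ~= '{{script-exts}}$'", templates)
print(expanded)              # the Match after template replacement
print(json.dumps(expanded))  # the same Match serialized as a JSON string
```

The second ``print`` shows the same effect as the dumped rule: serializing the
expanded Match as a **JSON** string doubles every backslash.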