*************
Writing Rules
*************
Gene rules have been designed to be simple to understand and to parse.
To avoid developing a custom rule parser, we decided to use **JSON**
documents as containers. Thus, to work properly, every rule has to follow a
specific format, which is described on this page.
Getting the format supported by the engine
==========================================
A starting point to understand the format of a rule is to use the ``gene`` command
line utility to get the format supported by your engine::
gene -template
This command should return a **JSON** document containing all the fields used
by your current version of ``gene``.
.. note::
When dealing with **JSON** documents in command line there is an amazing utility
that you must try out if you do not know it yet. This tool is called ``jq``. If
you are running Linux you can probably install it from your package manager,
otherwise visit `jq `_
Rule Structure
==============
.. code-block:: JSON
{
"Name": "",
"Tags": [],
"Meta": {
"EventIDs": [],
"Channels": [],
"Computers": [],
"Traces": [],
"Criticality": 0,
"Disable": false
},
"Matches": [],
"Condition": ""
}
.. note::
The fields present in the template shown above are the ones used by the engine.
It means that **any** additional field is ignored by the engine. This
can be used to document the rule. It is good practice to add information
such as **Author**, **Comments** and, where relevant, **Links** in the **Meta** section.
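
As a rough illustration of the note above, the following Python sketch separates the fields listed in the template from the additional, documentation-only ones. The helper ``extra_fields`` and the sample rule are hypothetical, and the sets of known fields are simply copied from the template shown above; they may differ for your version of ``gene``.

```python
import json

# Fields copied from the template above; your engine version may differ.
TOP_LEVEL = {"Name", "Tags", "Meta", "Matches", "Condition"}
META = {"EventIDs", "Channels", "Computers", "Traces", "Criticality", "Disable"}


def extra_fields(rule):
    """Return the fields of a rule that are not part of the template."""
    extras = [key for key in rule if key not in TOP_LEVEL]
    extras += ["Meta." + key for key in rule.get("Meta", {}) if key not in META]
    return sorted(extras)


# Hypothetical rule documented with two additional Meta fields.
rule = json.loads("""
{
  "Name": "Example",
  "Tags": [],
  "Meta": {"EventIDs": [1], "Criticality": 5, "Author": "someone", "Links": []},
  "Matches": [],
  "Condition": ""
}
""")

print(extra_fields(rule))
```

Running it lists ``Meta.Author`` and ``Meta.Links``, the two fields that only serve as documentation.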
.. table:: Field Definition
+------------+------------+----------------------------------------------------+
| Field | Type | Description |
+============+============+====================================================+
| Name | string | Name of the rule |
| | | |
+------------+------------+----------------------------------------------------+
| Tags | []string | Contains a list of tags related to the rule. It |
| | | can be used to group rules |
| | | according to their tag(s). |
+------------+------------+----------------------------------------------------+
| Meta | dict | Contains information related to the trigger of the |
| | | rule. This information is matched against the |
| | | "System" section of the Windows events to speed up |
| | | the match. |
+------------+------------+----------------------------------------------------+
| EventIDs | []int | List of Windows Event IDs the rule should match |
| | | against. If empty the rule will apply against any |
| | | Event ID of the ``Channels`` (see next row). |
+------------+------------+----------------------------------------------------+
| Channels | []string | List of channels the rule should apply on. If |
| | | empty, the rule will apply against any event of any|
| | | channel. |
+------------+------------+----------------------------------------------------+
| Computers | []string | List of computer names the rule should apply on. |
| | | If empty, the rule applies on all the computers. |
+------------+------------+----------------------------------------------------+
| Traces | []string | List of traces used to trace other events related |
| | | to the rule. A rule can be used to generate |
| | | dynamic rules with information from the event which|
| | | matched the rule. The syntax of each trace must |
| | | follow `Traces Format`_. |
+------------+------------+----------------------------------------------------+
|Criticality |0<=int<=10 | The criticality level attributed to the events |
| | | matching the rule. If an event matches several |
| | | rules, their criticality levels are added up and |
| | | the total is capped at 10. |
+------------+------------+----------------------------------------------------+
| Disable | bool | Boolean value used to disable the rule. |
+------------+------------+----------------------------------------------------+
| Matches | []string | List of **Matches**, should follow the syntax of |
| | | `Matches Format`_ |
+------------+------------+----------------------------------------------------+
| Condition | string | String implementing the logic on the **Matches** to|
| | | trigger the rule. The syntax should be compliant |
| | | with `Condition Format`_ |
+------------+------------+----------------------------------------------------+
.. note::
The more precise the **EventIDs** and **Channels** fields are, the faster the rule is.
This information is mainly used to filter out irrelevant events.
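
The pre-filtering role of the **Meta** section can be pictured with the Python sketch below. It is only an illustration under assumptions: the function names are hypothetical, the comparison is assumed to be an exact, case-sensitive one, and the real engine may proceed differently. It follows the table above: an empty list means "no restriction" and criticality levels are added up to a maximum of 10.

```python
def meta_applies(meta, event_id, channel, computer):
    """Return True if an event passes the Meta pre-filter of a rule."""
    if meta.get("Disable", False):
        return False
    checks = (("EventIDs", event_id),
              ("Channels", channel),
              ("Computers", computer))
    for key, value in checks:
        allowed = meta.get(key) or []
        # An empty list means "no restriction" for that field.
        if allowed and value not in allowed:
            return False
    return True


def combined_criticality(levels):
    """Add the criticality of every matching rule, capped at 10."""
    return min(sum(levels), 10)


meta = {
    "EventIDs": [1],
    "Channels": ["Microsoft-Windows-Sysmon/Operational"],
    "Computers": [],
}
sysmon = "Microsoft-Windows-Sysmon/Operational"
print(meta_applies(meta, 1, sysmon, "HOST-01"))   # Event ID and channel fit
print(meta_applies(meta, 10, sysmon, "HOST-01"))  # Event ID filtered out
print(combined_criticality([8, 5]))               # capped at 10
```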
Matches Format
--------------
A **Match** can be seen as an atomic check done on every Windows Event
(pre-filtered using the **Meta** section of the rule) going through the engine. Every
match can be referenced once or more in the **Condition** to create complex
matching rules. Currently, the latest version of the engine supports two kinds of
**Matches**: **Field Matches** and **Container Matches**.
.. important::
It is very important to remember that **Matches** only apply to the fields
located under the ``EventData`` section of Windows Events.
Field Matches
^^^^^^^^^^^^^
A **Field Match** is basically a check (**equality**, **regex**, integer comparison
or flag test) done on a given **field value**. This kind of **Match** brings
flexibility to the engine since anything can be matched through regular expressions.
**Syntax:** ``$VAR_NAME: FIELD OPERATOR 'VALUE'``
.. table:: Field Match Symbols Definition
+------------+----------------------------------------------------------------+
| Symbols | Description |
+============+================================================================+
| VAR_NAME | Name of the variable used to access the result of the **Match**|
| | in the **Condition**, it must be preceded by a ``$`` |
+------------+----------------------------------------------------------------+
| FIELD | Field to match in the ``EventData`` section of Windows Events |
+------------+----------------------------------------------------------------+
| OPERATOR | Operator to use for the match: |
| | * ``=`` : equal operator |
| | * ``~=`` : regexp operator (VALUE is compiled as a regex) |
| | * ``>`` : greater than operator (only for ``int`` fields) |
| | * ``<`` : less than operator (only for ``int`` fields) |
| | * ``&=`` : flag test operator (only for ``int`` fields) |
+------------+----------------------------------------------------------------+
| VALUE | Must be surrounded by **single quotes** ``'``. This is the |
| | **value/regex** the field is matched against; **$VAR_NAME** |
| | is **true** when it matches |
+------------+----------------------------------------------------------------+
Match Workflow::

       +-------+               +---------+
       | Event |               |  Match  |
       +-------+               +---------+
           |      +----------+      |
           +----> |  Engine  | <----+
                  +----------+
                        |
          +---------------------------+
          | Extracts value from FIELD |
          +---------------------------+
                        |
          +---------------------------+
          |  Does value match VALUE   |
          |  according to OPERATOR ?  |
          +---------------------------+
                        |
                        ^
                  YES /   \ NO
                    /       \
    +------------------+ +-------------------+
    | $VAR_NAME = true | | $VAR_NAME = false |
    +------------------+ +-------------------+
                    \       /
                      \   /
                        v
                        |
              +--------------------+
              | $VAR_NAME value is |
              | used in condition  |
              +--------------------+

.. note::
Any regular expression must follow `Go regexp syntax `_.
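
The workflow above can be approximated by the following Python sketch. It is not the engine's implementation: the function ``field_match`` is hypothetical, Python's ``re`` module stands in for Go's regexp package, ``&=`` is assumed to mean "all the bits of VALUE are set in the field", and a missing field is assumed to make the **Match** false.

```python
import re


def field_match(event_data, field, operator, value):
    """Evaluate one Field Match against the EventData of an event."""
    if field not in event_data:
        return False
    actual = event_data[field]
    if operator == "=":
        return actual == value
    if operator == "~=":
        return re.search(value, actual) is not None
    # The remaining operators expect integer fields; base 0 accepts
    # both decimal ("16") and hexadecimal ("0x10") notations.
    try:
        left, right = int(actual, 0), int(value, 0)
    except ValueError:
        return False
    if operator == ">":
        return left > right
    if operator == "<":
        return left < right
    if operator == "&=":
        return (left & right) == right
    raise ValueError("unknown operator: " + operator)


# Hypothetical EventData content used for illustration.
event = {
    "Image": r"C:\Windows\System32\wevtutil.exe",
    "GrantedAccess": "0x1410",
}
print(field_match(event, "Image", "~=", r"(?i:\\wevtutil\.exe$)"))
print(field_match(event, "GrantedAccess", "&=", "0x10"))
```

Both calls evaluate to true for this event, so the corresponding variables would be true in the **Condition**.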
Example
"""""""
The following snippet shows a rule used to catch Windows Event log clearing attempts
using ``wevtutil.exe``.
.. code-block:: JSON
{
"Name": "EventClearing",
"Tags": ["PostExploit"],
"Meta": {
"EventIDs": [1],
"Channels": ["Microsoft-Windows-Sysmon/Operational"],
"Computers": [],
"Criticality": 8,
"Author": "@0xrawsec"
},
"Matches": [
"$im: Image ~= '(?i:\\\\wevtutil\\.exe$)'",
"$cmd: CommandLine ~= '(?i: cl | clear-log )'"
],
"Condition": "$im and $cmd"
}
.. warning::
In order to match a single ``\`` Windows path separator, we need to use ``\\\\``
with the ``~=`` operator and ``\\`` with the ``=`` operator. This is due to the **JSON**
document parsing process, which consumes one level of escaping.
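
The two levels of escaping described in the warning above can be checked with a few lines of Python. This is only a demonstration of the principle: Python's ``json`` and ``re`` modules stand in for the engine's JSON decoder and regex engine, and the way the regex is pulled out of the **Match** is a simplification.

```python
import json
import re

# The "Matches" entry exactly as it is written in the rule file.
raw = r'''["$im: Image ~= '(?i:\\\\wevtutil\\.exe$)'"]'''

match = json.loads(raw)[0]
# After JSON decoding, every "\\" pair has become a single "\".
regex = match.split("'")[1]
print(regex)  # (?i:\\wevtutil\.exe$)

# The regex engine then reads "\\" as one literal backslash.
path = r"C:\Windows\System32\wevtutil.exe"
print(re.search(regex, path) is not None)
```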
The following additional example shows how to detect a suspicious access to ``lsass.exe`` with the help
of the ``&=`` operator. Basically, we want to trigger this alert on any **ProcessAccess**
events targeting ``lsass.exe`` where the **GrantedAccess** contains process
**read access flag 0x10**.
.. code-block:: JSON
{
"Name": "SuspiciousLsassAccess",
"Tags": ["Mimikatz", "Credentials", "Lsass"],
"Meta": {
"EventIDs": [10],
"Channels": ["Microsoft-Windows-Sysmon/Operational"],
"Computers": [],
"Criticality": 8,
"Author": "0xrawsec"
},
"Matches": [
"$ctwdef: CallTrace ~= '(?i:windows defender)'",
"$ga: GrantedAccess &= '0x10'",
"$lsass: TargetImage ~= '(?i:\\\\lsass\\.exe$)'"
],
"Condition": "$lsass and $ga and !$ctwdef"
}
Container Matches
^^^^^^^^^^^^^^^^^
A **Container Match** is a little more advanced since it extracts
a part of a **field value** and checks it against a container. For
instance, with this kind of **Match**, we can extract the **domain**
contained in Windows DNS-Client logs and check it against a blacklist. Although
this use case could be implemented with **Field Matches**, it
would be much slower because of the regex engine. In addition, the rule would need to be
updated for every new entry to check, whereas with a **Container Match** only the container
(a simple separate file) needs to be updated. The speed comes from the
container, which is implemented as a set data structure.
**Syntax:** ``$VAR_NAME: extract('REGEXP', FIELD) in CONTAINER``
.. table:: Container Match Symbols Definition
+------------+----------------------------------------------------------------+
| Symbols | Description |
+============+================================================================+
| VAR_NAME | Name of the variable used to access the result of the **Match**|
| | in the **Condition**, it must be preceded by a ``$`` |
+------------+----------------------------------------------------------------+
| FIELD | Field to extract from |
+------------+----------------------------------------------------------------+
| REGEXP | Regular expression used to extract a value from FIELD and check|
| | it against a **CONTAINER**. **REGEXP** must follow **named** |
| | regexp syntax ``(?P<name>re)`` |
+------------+----------------------------------------------------------------+
| CONTAINER | Container to use to check the extracted value |
+------------+----------------------------------------------------------------+
.. important::
* If a rule makes use of an **undefined container**, the rule will be disabled
at runtime and a warning message will be printed.
* A given container is shared across all the rules loaded into the engine
* Any regular expression must follow `Go regexp syntax `_.
Example
"""""""
This rule shows how to extract domains and sub-domains from **Windows
DNS-Client** logs and check them against a blacklist.
.. code-block:: JSON
{
"Name": "BlacklistedDomain",
"Tags": ["DNS"],
"Meta": {
"EventIDs": [],
"Channels": ["Microsoft-Windows-DNS-Client/Operational"],
"Computers": [],
"Criticality": 10,
"Author": "@0xrawsec",
"Comment": ""
},
"Matches": [
"$domainBL: extract('(?P<domain>\\w+\\.\\w+$)',QueryName) in blacklist",
"$subdomainBL: extract('(?P<domain>\\w+\\.\\w+\\.\\w+$)',QueryName) in blacklist",
"$subsubdomainBL: extract('(?P<domain>\\w+\\.\\w+\\.\\w+\\.\\w+$)',QueryName) in blacklist"
],
"Condition": "$domainBL or $subdomainBL or $subsubdomainBL"
}
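
To make the mechanism more concrete, here is a Python sketch of what such a **Match** could do. Everything in it is an assumption made for illustration: the function ``container_match``, the group name ``domain``, the content of the container and the queried name are hypothetical, and Python's ``re`` module stands in for Go's regexp package.

```python
import re

# Hypothetical container content; in practice it is loaded from a file.
blacklist = {"evil.example", "cdn.bad.example"}


def container_match(regexp, value, container):
    """Extract a named group from value and look it up in container."""
    found = re.search(regexp, value)
    if found is None:
        return False
    # Use the first named group, whatever its name is.
    extracted = next(iter(found.groupdict().values()), None)
    return extracted in container


query = "www.cdn.bad.example"
# Extracts "bad.example", which is not in the container.
print(container_match(r"(?P<domain>\w+\.\w+$)", query, blacklist))
# Extracts "cdn.bad.example", which is in the container.
print(container_match(r"(?P<domain>\w+\.\w+\.\w+$)", query, blacklist))
```

The lookup in a set does not depend on the number of entries, which is why adding entries to the container is cheaper than growing a regular expression.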
Traces Format
-------------
A trace is used to generate a new rule **on the fly** derived from both the rule
which triggered and the **Windows Event** which matched. This feature allows
the engine to do some **basic** correlation. The rule generated is very basic
and has a single match.
**Syntax:** ``EVENT_IDS:CHANNELS: NEW_FIELD OPERATOR EVT_VAL_FIELD``
.. table:: Trace Symbols Definition
+---------------+----------------------------------------------------------------+
| Symbols | Description |
+===============+================================================================+
| EVENT_IDS | Comma separated list of **Windows Event IDs** used to set |
| | EventIDs field of the new rule. If empty, default is to |
| | inherit from **the rule defining the trace**. |
+---------------+----------------------------------------------------------------+
| CHANNELS | Comma separated list of **Windows Event Log Channels** used to |
| | set **Channels** field of the generated rule. If empty, default|
| | is to inherit from **the rule defining the trace**. |
+---------------+----------------------------------------------------------------+
| NEW_FIELD | **Field name** to use for the **single Match** of the generated|
| | rule. |
+---------------+----------------------------------------------------------------+
| OPERATOR | Operator to use for the match: |
| | * ``=`` : equal operator |
| | * ``~=`` : regexp operator (the value is compiled as a regex)|
+---------------+----------------------------------------------------------------+
| EVT_VAL_FIELD | Name of the field in the matching **Windows Event** whose value|
| | is extracted and used as **VALUE** in the generated rule |
+---------------+----------------------------------------------------------------+
.. note::
Keywords ``any``, ``ANY`` or ``*`` can be used instead of a comma separated list
in **EVENT_IDS** and **CHANNELS** to apply the trace to any Event ID
and any Channel, respectively.
The concept behind traces can be a little hard to grasp.
For that reason, the following snippet shows what a rule generated
from a trace would look like (you would never see it otherwise, since it is generated
dynamically).
.. code-block:: JSON
{
"Name": "GENERATED_NAME",
"Tags": ["inherited from triggering rule"],
"Meta": {
"EventIDs": ["inherited from triggering rule OR set from trace"],
"Channels": ["inherited from triggering rule OR set from trace"],
"Computers": ["inherited from triggering rule"],
"Traces": [
"inherited from triggering rule"
],
"Criticality": "inherited from triggering rule"
},
"Matches": [
"$m: NEW_FIELD OPERATOR 'ValueOf(EVT_VAL_FIELD) extracted from Matching Event'"
],
"Condition": "$m"
}
.. warning::
* Trace generation is not enabled by default; to enable
it, use the ``-trace`` command line switch
* When trace mode is enabled, many rules can be generated at runtime and the
engine will, by design, become slower as the number of rules grows
* If a rule defines **X** traces, **X** rules are generated at runtime when
**trace mode** is enabled and the rule matches a **Windows Event**
Example
^^^^^^^
The following rule will generate rules to trace **any Event ID** from channel
**Microsoft-Windows-Sysmon/Operational** where either the **ProcessGuid** or
**ParentProcessGuid** is equal to the **ProcessGuid** of the event which triggered the
rule.
.. code-block:: JSON
{
"Name": "MaliciousLsassAccess",
"Tags": ["Mimikatz", "Credentials", "Lsass"],
"Meta": {
"EventIDs": [10],
"Channels": ["Microsoft-Windows-Sysmon/Operational"],
"Computers": [],
"Traces": [
"*::ProcessGuid = ProcessGuid",
"*::ParentProcessGuid = ProcessGuid"
],
"Criticality": 10,
"Author": "0xrawsec"
},
"Matches": [
"$ct: CallTrace ~= 'UNKNOWN'",
"$lsass: TargetImage ~= '(?i:\\\\lsass\\.exe$)'"
],
"Condition": "$lsass and $ct"
}
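
The Python sketch below mimics how a rule could be derived from a trace, following the generated rule layout shown earlier. It is an approximation under assumptions: the function ``rule_from_trace`` and the GUID value are hypothetical, the ``any`` keywords are translated into an empty list (which means "no restriction" according to the field definitions), and the real engine may name and build the rule differently.

```python
def rule_from_trace(trace, rule, event_data):
    """Build the rule a trace would generate for a matching event."""
    event_ids, channels, match = (p.strip() for p in trace.split(":", 2))
    new_field, operator, evt_val_field = match.split()
    meta = rule["Meta"]

    def resolve(spec, inherited, convert):
        if spec in ("any", "ANY", "*"):
            return []               # empty list: no restriction
        if not spec:
            return list(inherited)  # inherit from the triggering rule
        return [convert(item.strip()) for item in spec.split(",")]

    value = event_data[evt_val_field]
    return {
        "Name": "GENERATED_NAME",
        "Tags": list(rule["Tags"]),
        "Meta": {
            "EventIDs": resolve(event_ids, meta["EventIDs"], int),
            "Channels": resolve(channels, meta["Channels"], str),
            "Computers": list(meta["Computers"]),
            "Traces": list(meta["Traces"]),
            "Criticality": meta["Criticality"],
        },
        "Matches": ["$m: %s %s '%s'" % (new_field, operator, value)],
        "Condition": "$m",
    }


rule = {
    "Name": "MaliciousLsassAccess",
    "Tags": ["Mimikatz", "Credentials", "Lsass"],
    "Meta": {
        "EventIDs": [10],
        "Channels": ["Microsoft-Windows-Sysmon/Operational"],
        "Computers": [],
        "Traces": ["*::ProcessGuid = ProcessGuid",
                   "*::ParentProcessGuid = ProcessGuid"],
        "Criticality": 10,
    },
}
# Hypothetical EventData of the event which matched the rule.
event = {"ProcessGuid": "{A1B2C3D4-0000-0000-0000-000000000001}"}

generated = [rule_from_trace(t, rule, event) for t in rule["Meta"]["Traces"]]
for new_rule in generated:
    print(new_rule["Meta"]["EventIDs"], new_rule["Matches"])
```

Two traces are defined, so two rules are produced, each with a single **Match** on the ``ProcessGuid`` value of the triggering event.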
Condition Format
----------------
A condition applies Boolean logic to the different **Matches** defined in the rule.
If the **Condition** evaluates to **true**, the event is
considered as matching the rule.
.. table:: Allowed Symbols in Condition
+---------+----------------------------------------------------------------+
| Symbols | Description |
+=========+================================================================+
| ``$var``| Variable referencing a **Match** |
+---------+----------------------------------------------------------------+
| ``()`` | Used to group / prioritize some logical expressions |
+---------+----------------------------------------------------------------+
| ``!`` | Negates a **Match** or a grouped expression |
+---------+----------------------------------------------------------------+
| ``AND`` | AND logical operator |
+---------+ |
| ``and`` | |
+---------+ |
| ``&&`` | |
+---------+----------------------------------------------------------------+
| ``OR`` | OR logical operator |
+---------+ |
| ``or`` | |
+---------+ |
| ``||`` | |
+---------+----------------------------------------------------------------+
.. important::
For every **Windows Event** tested against a rule, the **Condition** is evaluated in real
time **from left to right**. As a consequence, the order of the variables
might have a small impact on the performance of the rule. For more efficiency,
always try to put the most restrictive ones first.
Example
^^^^^^^
The following rule, used to match suspicious explicit network logons, shows
an example of an advanced condition.
.. code-block:: JSON
{
"Name": "ExplicitNetworkLogon",
"Tags": ["Lateral", "Security"],
"Meta": {
"EventIDs": [4624],
"Channels": ["Security"],
"Computers": [],
"Criticality": 5,
"Author": "@0xrawsec"
},
"Matches": [
"$logt: LogonType = '3'",
"$user: TargetUserName = 'ANONYMOUS LOGON'",
"$iplh1: IpAddress = '-'",
"$iplh2: IpAddress = '127.0.0.1'",
"$enddol: TargetUserName ~= '\\$$'"
],
"Condition": "$logt and !($user or $iplh1 or $iplh2 or $enddol)"
}
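
The effect of a left to right evaluation can be observed with the small Python evaluator below. It is a sketch, not the engine's parser: the functions are hypothetical, ``and`` and ``or`` are assumed to have the same precedence, and a **Match** is assumed to be skipped as soon as its result can no longer change the outcome. The truth values given to each variable are made up for the example.

```python
import re

TOKEN = re.compile(r"\s*(\$\w+|\(|\)|!|&&|\|\||and\b|AND\b|or\b|OR\b)")


def tokenize(condition):
    tokens, pos = [], 0
    condition = condition.rstrip()
    while pos < len(condition):
        found = TOKEN.match(condition, pos)
        if found is None:
            raise ValueError("unexpected input at position %d" % pos)
        tokens.append(found.group(1))
        pos = found.end()
    return tokens


def evaluate(condition, matches):
    """Evaluate condition; matches maps "$var" to a zero-argument callable."""
    tokens = tokenize(condition)
    pos = 0

    def operand(active):
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "!":
            return not operand(active)
        if token == "(":
            value = expression(active)
            pos += 1  # skip the closing ")"
            return value
        # Only run the Match when its result can still change the outcome.
        return matches[token]() if active else False

    def expression(active):
        nonlocal pos
        value = operand(active)
        while pos < len(tokens) and tokens[pos] != ")":
            op = tokens[pos].lower()
            pos += 1
            if op in ("and", "&&"):
                value = operand(active and value) and value
            else:
                value = operand(active and not value) or value
        return value

    return expression(True)


calls = []


def match(name, result):
    """Build a fake Match which records that it has been evaluated."""
    def run():
        calls.append(name)
        return result
    return run


matches = {
    "$logt": match("$logt", True),
    "$user": match("$user", False),
    "$iplh1": match("$iplh1", False),
    "$iplh2": match("$iplh2", True),
    "$enddol": match("$enddol", False),
}
condition = "$logt and !($user or $iplh1 or $iplh2 or $enddol)"
result = evaluate(condition, matches)
print(result, calls)
```

With these values the condition is false and ``$enddol`` is never evaluated, because ``$iplh2`` already decides the result of the group. Had ``$logt`` been false, none of the other **Matches** would have been evaluated.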
Regular Expression Templates
----------------------------
Regular expression templates were introduced in **Gene version 1.4**. As
the number of rules grows, the same regular expressions tend to be used over
and over. It is tedious to remember where a similar regex was used
in order to copy and paste it into a new rule. In addition, if a
mistake is found in such a shared regex, every place where it is used needs to
be updated, which is error prone. The purpose of this feature is
simply to make the rule writer's life easier.
There is no rocket science behind regular expression templates. Templates
are defined in a file (or files) and can then be used in the **Matches** following a
given syntax.
**File Extension**: ``.tpl`` (the file must be located in the rule directory in use)
**Syntax**
* **definition:** ``TEMPLATE_NAME: 'REGULAR_EXPRESSION'``
* **use:** ``{{TEMPLATE_NAME}}``
Example
^^^^^^^
Here are a few examples of **regex templates**:
.. code-block:: bash
# Paths
windows: '(?i:C:\\Windows\\)'
system: '(?i:C:\\Windows\\Sys(wow64|tem32)\\)'
programfiles: '(?i:C:\\(PROGRA~2|Program Files.*?)\\)'
#Extensions
script-exts: '(?i:(\.ps1|\.bat|\.cmd|\.vb|\.vbs|\.vbscript|\.vbe|\.js|\.jse|\.ws|\.wsf))'
exec-exts: '(?i:(\.acm|\.ax|\.com|\.cpl|\.dic|\.dll|\.drv|\.ds|\.efi|\.exe|\.grm|\.iec|\.ime|\.lex|\.msstyles|\.mui|\.ocx|\.olb|\.rll|\.rs|\.scr|\.sys|\.tlb|\.tsp))'
.. important::
To escape characters, use a single ``\``. This differs from the way characters are
escaped in rules.
The templates can then be used in a rule as follows (``suspicious`` is another template, not listed above):
.. code-block:: JSON
{
"Name": "HeurDropper",
"Tags": ["Heuristics", "CreateFile"],
"Meta": {
"EventIDs": [11],
"Channels": ["Microsoft-Windows-Sysmon/Operational"],
"Computers": [],
"Criticality": 8,
"Author": "0xrawsec",
"Comments": "Experimental rule to detect executable files dropped by common utilities"
},
"Matches": [
"$susp: Image ~= '{{suspicious}}$'",
"$target: TargetFilename ~= '({{exec-exts}}|{{script-exts}})$'"
],
"Condition": "$susp and $target"
}
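
Template replacement itself can be pictured with the Python sketch below. It is an approximation written for this page: the functions ``load_templates`` and ``expand`` are hypothetical, the template lists are shortened, and the engine may parse ``.tpl`` files differently.

```python
import json
import re

TEMPLATE_LINE = re.compile(r"^\s*([\w-]+)\s*:\s*'(.*)'\s*$")


def load_templates(text):
    """Parse the content of a .tpl file into a name -> regex mapping."""
    templates = {}
    for line in text.splitlines():
        if line.strip().startswith("#"):
            continue  # comment line
        found = TEMPLATE_LINE.match(line)
        if found:
            templates[found.group(1)] = found.group(2)
    return templates


def expand(match, templates):
    """Replace every {{NAME}} of a Match by the corresponding regex."""
    return re.sub(r"\{\{([\w-]+)\}\}",
                  lambda m: templates[m.group(1)], match)


# Shortened versions of the templates shown above.
tpl = r"""
# Extensions
script-exts: '(?i:(\.ps1|\.bat))'
exec-exts: '(?i:(\.exe|\.dll))'
"""

templates = load_templates(tpl)
expanded = expand(
    "$target: TargetFilename ~= '({{exec-exts}}|{{script-exts}})$'",
    templates,
)
print(expanded)
# Serializing to JSON doubles every backslash.
print(json.dumps(expanded))
```

The first line printed contains single ``\`` characters, as written in the template file, while the JSON form printed on the second line contains ``\\``.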
In order to debug the rules using templates, we have introduced a new feature
in the ``gene`` command line utility. One can use the ``-dump`` command line switch
to dump the rule as it is after template replacement.
.. code-block:: bash
> gene -dump HeurDropper -r ./gene-rules | jq
{
"Name": "HeurDropper",
"Tags": [
"Heuristics",
"CreateFile"
],
"Meta": {
"EventIDs": [
11
],
"Channels": [
"Microsoft-Windows-Sysmon/Operational"
],
"Computers": [],
"Traces": null,
"Criticality": 8,
"Disable": false
},
"Matches": [
"$susp: Image ~= '(?i:\\\\(certutil|rundll32|powershell|wscript|cscript|cmd|mshta|regsvr32|msbuild|installutil|regasm)\\.exe)$'",
"$target: TargetFilename ~= '((?i:(\\.acm|\\.ax|\\.com|\\.cpl|\\.dic|\\.dll|\\.drv|\\.ds|\\.efi|\\.exe|\\.grm|\\.iec|\\.ime|\\.lex|\\.msstyles|\\.mui|\\.ocx|\\.olb|\\.rll|\\.rs|\\.scr|\\.sys|\\.tlb|\\.tsp))|(?i:(\\.ps1|\\.bat|\\.cmd|\\.vb|\\.vbs|\\.vbscript|\\.vbe|\\.js|\\.jse|\\.ws|\\.wsf)))$'"
],
"Condition": "$susp and $target"
}
.. note::
As you can see in the dumped rule, the single ``\`` becomes ``\\``; this is due
to the **JSON** encoding of the dumped rule.
.. note::
Notice how easy it is now just to add a new extension to the list so that it
impacts all the rules using this template.