*************
Writing Rules
*************
Gene rules are designed to be simple to understand and to parse.
To avoid developing a custom rule parser, rules are written as **JSON**
documents. To work properly, every rule has to follow the specific
format described on this page.
Getting the format supported by the engine
==========================================
A starting point to understand the format of a rule is to use the ``gene`` command
line utility to get the format supported by your engine::
gene -template
This command should return a **JSON** document containing all the fields used
by your current version of ``gene``.
.. note::
When dealing with **JSON** documents on the command line, there is an amazing utility
that you should try out if you do not know it yet. This tool is called ``jq``. If
you are running Linux you can probably install it from your package manager,
otherwise visit the ``jq`` project website.
Rule Structure
==============
.. code-block:: JSON
{
"Name": "",
"Tags": [],
"Meta": {
"Events": {},
"Computers": [],
"ATTACK": [
{
"ID": "",
"Tactic": "",
"Reference": ""
}
],
"Criticality": 0,
"Disable": false,
"Filter": false,
"Schema": "2.0.0"
},
"Matches": [],
"Condition": "",
"Actions": []
}
.. note::
The fields present in the template shown above are the ones used by the engine.
**Any** additional field is ignored by the engine, which can be used to
document the rule. It is good practice to add information such as **Author**,
**Comments** and relevant **Links** in the **Meta** section.
.. table:: Field Definition
+------------+----------------------+----------------------------------------------------+
| Field      | Type                 | Description                                        |
+============+======================+====================================================+
| Name       | string               | Name of the rule                                   |
+------------+----------------------+----------------------------------------------------+
| Tags       | []string             | Contains a list of tags related to the rule. It    |
|            |                      | can be used to group rules according to their      |
|            |                      | tag(s).                                            |
+------------+----------------------+----------------------------------------------------+
| Meta       | dict                 | Contains information related to the triggering of  |
|            |                      | the rule. It is matched against the "System"       |
|            |                      | section of Windows events to speed up matching.    |
+------------+----------------------+----------------------------------------------------+
| Events     | map[string][]int     | Map of Windows event log channels to the list of   |
|            |                      | Event IDs the rule should match against. If the    |
|            |                      | list of a channel is empty, the rule applies to    |
|            |                      | any Event ID of that channel.                      |
+------------+----------------------+----------------------------------------------------+
| Computers  | []string             | List of computer names the rule should apply to.   |
|            |                      | If empty, the rule applies to all computers.       |
+------------+----------------------+----------------------------------------------------+
| ATTACK     | []map[string]string  | List of ATT&CK techniques corresponding to the     |
|            |                      | detection rule. See `MITRE ATT&CK Integration`_.   |
+------------+----------------------+----------------------------------------------------+
| Criticality| 0 < int <= 10        | The criticality level attributed to the events     |
|            |                      | matching the rule. If an event matches several     |
|            |                      | rules, their criticality levels are added up,      |
|            |                      | capped at 10.                                      |
+------------+----------------------+----------------------------------------------------+
| Disable    | bool                 | Boolean value used to disable the rule.            |
+------------+----------------------+----------------------------------------------------+
| Filter     | bool                 | Boolean value used to flag this rule as being a    |
|            |                      | **filter**. A filter rule is used to filter in     |
|            |                      | some wanted events without assigning any           |
|            |                      | criticality to them. It can be used to show events |
|            |                      | bringing contextual information.                   |
+------------+----------------------+----------------------------------------------------+
| Schema     | string               | **Schema** version of the rule. This has been      |
|            |                      | introduced to solve incompatibility issues between |
|            |                      | the engine and the rules.                          |
+------------+----------------------+----------------------------------------------------+
| Matches    | []string             | List of **Matches**, should follow the syntax of   |
|            |                      | `Matches Format`_                                  |
+------------+----------------------+----------------------------------------------------+
| Condition  | string               | String implementing the logic on the **Matches** to|
|            |                      | trigger the rule. The syntax should be compliant   |
|            |                      | with `Condition Format`_                           |
+------------+----------------------+----------------------------------------------------+
| Actions    | []string             | This field is used to encode **Actions** to be     |
|            |                      | taken when the rule triggers. It is up to the code |
|            |                      | making use of the Gene engine to implement         |
|            |                      | **action handlers**. The Gene command-line utility |
|            |                      | does not implement any **action handler**.         |
+------------+----------------------+----------------------------------------------------+
.. important::
The more precise the **Events** field, the faster the rule.
This information is used to pre-filter relevant events.
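The field definitions above can be turned into a quick sanity check before loading rules. The following Python sketch is only an illustration: ``validate_rule`` and ``combined_criticality`` are hypothetical helpers, not part of Gene, and the assumption that a **filter** rule may keep a criticality of 0 is ours.

```python
def validate_rule(rule):
    """Return a list of problems found in a rule (empty list if none)."""
    problems = []
    if not isinstance(rule.get("Name"), str) or not rule["Name"]:
        problems.append("Name must be a non-empty string")

    meta = rule.get("Meta", {})
    events = meta.get("Events", {})
    if not isinstance(events, dict):
        problems.append("Meta.Events must map channels to lists of Event IDs")
    else:
        for channel, ids in events.items():
            if not isinstance(ids, list) or not all(isinstance(i, int) for i in ids):
                problems.append(f"Meta.Events[{channel!r}] must be a list of integers")

    # Assumption: filter rules do not need a criticality.
    criticality = meta.get("Criticality", 0)
    is_filter = meta.get("Filter", False)
    if not is_filter and not (isinstance(criticality, int) and 0 < criticality <= 10):
        problems.append("Meta.Criticality must be an integer between 1 and 10")

    if not isinstance(rule.get("Matches", []), list):
        problems.append("Matches must be a list of strings")
    if not isinstance(rule.get("Condition", ""), str):
        problems.append("Condition must be a string")
    return problems


def combined_criticality(levels):
    """Criticality levels of all matching rules are added, capped at 10."""
    return min(sum(levels), 10)


rule = {
    "Name": "EventClearing",
    "Meta": {
        "Events": {"Microsoft-Windows-Sysmon/Operational": [1]},
        "Criticality": 8,
    },
    "Matches": [],
    "Condition": "",
}
print(validate_rule(rule))           # []
print(combined_criticality([8, 5]))  # 10
```

Such a check only covers the shape of the document; the syntax of **Matches** and **Condition** is still validated by the engine itself.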
Matches Format
--------------
A **Match** can be seen as an atomic check done on every Windows Event
(pre-filtered using the **Meta** section of the rule) going through the engine. Every
match can be referenced one or more times in the **Condition** to create complex
matching rules. Currently, the latest version of the engine supports two kinds of
**Matches**.
.. important::
It is very important to remember that **Matches** only apply to the fields
located under the ``EventData`` section of Windows Events.
Field Matches
^^^^^^^^^^^^^
.. warning::
**Indirect Match** expressions are only available since **v1.6**
A **Field Match** is an **equality** or a **regex** check done on a
given **field value**. This kind of **Match** brings flexibility to the engine since
anything can be matched through regular expressions. **Field Matches** come in two
flavours, namely **Direct** and **Indirect**. A **Direct Match** is used to match
against values (regexes, strings ...) known in advance when the rule is written.
An **Indirect Match** aims at matching against a value present in another field of the
event.
**Direct Match Syntax:** ``$VAR_NAME: FIELD OPERATOR 'VALUE'``
**Indirect Match Syntax:** ``$VAR_NAME: FIELD = @OTHER_FIELD``
.. table:: Field Match Symbols Definition
+---------------------+----------------------------------------------------------------+
| Symbols             | Description                                                    |
+=====================+================================================================+
| VAR_NAME            | Name of the variable used to access the result of the **Match**|
|                     | in the **Condition**, it must be preceded by a ``$``           |
+---------------------+----------------------------------------------------------------+
| FIELD | OTHER_FIELD | Field to match with in ``EventData`` section of Windows Events |
+---------------------+----------------------------------------------------------------+
| OPERATOR            | Operator to use for the match:                                 |
|                     |                                                                |
|                     | * ``=`` : equal operator                                       |
|                     | * ``~=`` : regexp operator (compiles VALUE as a regex)         |
|                     | * ``>`` : greater than operator (only for ``int`` fields)      |
|                     | * ``<`` : lower than operator (only for ``int`` fields)        |
|                     | * ``&=`` : flag test operator (expects an ``int`` field)       |
+---------------------+----------------------------------------------------------------+
| VALUE               | Must be surrounded by **single quotes** ``'``. This is the     |
|                     | **value/regex** to match against to make **$VAR_NAME = true**  |
+---------------------+----------------------------------------------------------------+
Match Workflow::
+-------+               +---------+
| Event |               |  Match  |
+-------+               +---------+
    |      +----------+      |
    +----> |  Engine  | <----+
           +----------+
                |
  +---------------------------+
  | Extracts value from FIELD |
  +---------------------------+
                |
  +---------------------------+
  |  Does value match VALUE   |
  |  according to OPERATOR ?  |
  +---------------------------+
                |
                ^
          YES  / \  NO
              /   \
+------------------+ +-------------------+
| $VAR_NAME = true | | $VAR_NAME = false |
+------------------+ +-------------------+
              \   /
               \ /
                v
                |
      +--------------------+
      | $VAR_NAME value is |
      | used in condition  |
      +--------------------+
.. note::
Any regular expression must follow the Go regexp syntax (documented with the Go ``regexp/syntax`` package).
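To make the workflow more concrete, here is a minimal Python sketch of how a **Direct Match** could be parsed and evaluated. It is not the engine's code: ``parse_match`` and ``evaluate_match`` are hypothetical names, Python's ``re`` module stands in for Go's ``regexp`` package, and the exact semantics of ``&=`` (all bits of VALUE set in the field) are an assumption. **Indirect Matches** are not covered.

```python
import re

# Grammar taken from the Direct Match syntax: $VAR_NAME: FIELD OPERATOR 'VALUE'
MATCH_RE = re.compile(
    r"^\$(?P<var>\w+):\s*(?P<field>\w+)\s*(?P<op>~=|&=|=|>|<)\s*'(?P<value>.*)'$"
)


def parse_match(expr):
    """Split a Direct Match expression into (var, field, operator, value)."""
    m = MATCH_RE.match(expr.strip())
    if m is None:
        raise ValueError(f"not a direct field match: {expr!r}")
    return m.group("var"), m.group("field"), m.group("op"), m.group("value")


def evaluate_match(expr, event_data):
    """Return (variable name, boolean result) for one EventData section."""
    var, field, op, value = parse_match(expr)
    if field not in event_data:
        return var, False
    actual = event_data[field]
    if op == "=":
        return var, actual == value
    if op == "~=":
        return var, re.search(value, actual) is not None
    # Remaining operators work on integers; base 0 accepts "16" and "0x10".
    left, right = int(actual, 0), int(value, 0)
    if op == ">":
        return var, left > right
    if op == "<":
        return var, left < right
    # "&=": assumed to be true when every bit of VALUE is set in the field.
    return var, (left & right) == right


event_data = {
    "Image": r"C:\Windows\System32\wevtutil.exe",
    "GrantedAccess": "0x1410",
}
print(evaluate_match(r"$im: Image ~= '(?i:\\wevtutil\.exe$)'", event_data))  # ('im', True)
print(evaluate_match("$ga: GrantedAccess &= '0x10'", event_data))            # ('ga', True)
```

The expressions are given here as they look once the rule file has been decoded, so each backslash appears only as many times as the regex itself requires.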
Example
"""""""
The following snippet shows a rule used to catch Windows Event log clearing attempts
using ``wevtutil.exe``.
.. code-block:: JSON
{
"Name": "EventClearing",
"Tags": [
"PostExploit"
],
"Meta": {
"Events": {
"Microsoft-Windows-Sysmon/Operational": [
1
]
},
"ATTACK": [
{
"ID": "T1070",
"Tactic": "defense-evasion",
"Reference": "https://attack.mitre.org/techniques/T1070"
}
],
"Criticality": 8,
"Schema": "2.0.0"
},
"Matches": [
"$im: Image ~= '(?i:\\\\wevtutil\\.exe$)'",
"$cmd: CommandLine ~= '(?i: cl | clear-log )'"
],
"Condition": "$im and $cmd",
"Actions": null
}
.. warning::
**Windows path separator** ``\`` **escaping:**
* When using the ``~=`` **operator**: needs to be escaped **twice** ``\\\\`` (once for the JSON parser and once for the regex parser)
* When using the ``=`` **operator**: needs to be escaped **once** ``\\`` (for the JSON parser)
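The two escaping levels can be observed by decoding the JSON text first and only then handing the result to a regex engine. The sketch below uses Python's ``json`` and ``re`` modules as stand-ins for the parsers used by the engine; the variable names are ours.

```python
import json
import re

# Text exactly as it would appear inside a rule file (hence the raw strings).
regex_in_json = r'"(?i:\\\\wevtutil\\.exe$)"'
path_in_json = r'"C:\\Windows\\System32\\wevtutil.exe"'

# First level: the JSON parser turns every "\\" into a single "\".
regex_source = json.loads(regex_in_json)
path_value = json.loads(path_in_json)
print(regex_source)  # (?i:\\wevtutil\.exe$)
print(path_value)    # C:\Windows\System32\wevtutil.exe

# Second level: the regex parser reads "\\" as one literal backslash.
matches = re.search(regex_source, path_value) is not None
print(matches)       # True
```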
The following additional example shows how to detect suspicious access to ``lsass.exe`` with the help
of the ``&=`` operator. We want to trigger this alert on any **ProcessAccess**
event targeting ``lsass.exe`` where **GrantedAccess** contains the process
**read access flag 0x10**.
.. code-block:: JSON
{
"Name": "SuspiciousLsassAccess",
"Tags": [
"Mimikatz",
"Credentials",
"Lsass"
],
"Meta": {
"Events": {
"Microsoft-Windows-Sysmon/Operational": [
10
]
},
"ATTACK": [
{
"ID": "T1003",
"Tactic": "Credential Access",
"Reference": "https://attack.mitre.org/techniques/T1003/"
}
],
"Criticality": 8,
"Schema": "2.0.0"
},
"Matches": [
"$ctwdef: CallTrace ~= '(?i:windows defender)'",
"$ga: GrantedAccess &= '0x10'",
"$lsass: TargetImage ~= '(?i:\\\\lsass\\.exe$)'",
"$wmiprvse: SourceImage ~= '(?i:(?i:C:\\\\Windows\\\\Sys(wow64|tem32)\\\\)wbem\\\\wmiprvse\\.exe)'",
"$taskmgr: SourceImage ~= '(?i:(?i:C:\\\\Windows\\\\Sys(wow64|tem32)\\\\)taskmgr\\.exe)'",
"$boot: SourceImage ~= '(?i:C:\\\\Windows\\\\system32\\\\(wininit|csrss)\\.exe)'"
],
"Condition": "$lsass and $ga and !($ctwdef or $wmiprvse or $taskmgr or $boot)"
}
Container Matches
^^^^^^^^^^^^^^^^^
A **Container Match** is a little more advanced since it can be used to extract
a part of a **field value** and check it against a container. For
instance, with this kind of **Match**, we are able to extract a **domain**
contained in **Windows DNS-Client logs** and check it against a blacklist. Although
implementing this use case would be possible with **Field Matches**, it
would be much slower because of the regex engine. In addition, the rule would need to be updated
for every new entry to check. With a **Container Match**, only the container
(a simple separate file) needs to be updated. The speed comes from the
container being implemented as a set data structure.
**Syntax:** ``$VAR_NAME: extract('REGEXP', FIELD) in CONTAINER``
.. table:: Container Match Symbols Definition
+------------+----------------------------------------------------------------+
| Symbols    | Description                                                    |
+============+================================================================+
| VAR_NAME   | Name of the variable used to access the result of the **Match**|
|            | in the **Condition**, it must be preceded by a ``$``           |
+------------+----------------------------------------------------------------+
| FIELD      | Field to extract from                                          |
+------------+----------------------------------------------------------------+
| REGEXP     | Regular expression used to extract a value from FIELD and check|
|            | it against a **CONTAINER**. **REGEXP** must follow **named**   |
|            | regexp syntax ``(?P<name>re)``                                 |
+------------+----------------------------------------------------------------+
| CONTAINER  | Container to use to check the extracted value                  |
+------------+----------------------------------------------------------------+
.. important::
* If a rule makes use of an **undefined container**, the rule will be disabled
at runtime and a warning message will be printed.
* A given container is shared across all the rules loaded into the engine
* Any regular expression must follow the Go regexp syntax (documented with the Go ``regexp/syntax`` package).
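The idea behind a **Container Match** can be sketched in a few lines of Python: extract a value with a named group, then look it up in a set. This is only an illustration under assumptions: ``container_match`` is a hypothetical helper, the container content is made up, and whether the engine normalizes case before the lookup is not covered here.

```python
import re


def container_match(regexp, field, container, event_data):
    """Extract a value from FIELD with REGEXP and test it against CONTAINER."""
    value = event_data.get(field)
    if value is None:
        return False
    m = re.search(regexp, value)
    if m is None:
        return False
    # The value to check is the one captured by the (named) group.
    return m.group(1) in container


# A container behaves like a set: membership tests do not depend on its size.
blacklist = {"malware.example"}
event_data = {"QueryName": "cdn.malware.example"}

print(container_match(r"(?P<domain>\w+\.\w+$)", "QueryName", blacklist, event_data))
# True: "malware.example" is extracted and found in the container
print(container_match(r"(?P<domain>\w+\.\w+\.\w+$)", "QueryName", blacklist, event_data))
# False: "cdn.malware.example" is not in the container
```

Adding a new entry only requires updating the container, the **Match** itself stays unchanged.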
Example
"""""""
This rule shows how to extract domains and sub-domains from **Windows
DNS-Client** logs and check them against a blacklist.
.. code-block:: JSON
{
"Name": "BlacklistedDomain",
"Tags": [
"DNS"
],
"Meta": {
"Events": {
"Microsoft-Windows-DNS-Client/Operational": []
},
"Criticality": 10,
"Schema": "2.0.0"
},
"Matches": [
"$domainBL: extract('(?P<domain>\\w+\\.\\w+$)',QueryName) in blacklist",
"$subdomainBL: extract('(?P<domain>\\w+\\.\\w+\\.\\w+$)',QueryName) in blacklist",
"$subsubdomainBL: extract('(?P<domain>\\w+\\.\\w+\\.\\w+\\.\\w+$)',QueryName) in blacklist"
],
"Condition": "$domainBL or $subdomainBL or $subsubdomainBL"
}
Condition Format
----------------
A **Condition** applies logic to the different **Matches** defined in the rule.
If the **Condition** evaluates to **true**, the event is
considered as matching the rule.
.. table:: Allowed Symbols in Condition
+---------+----------------------------------------------------------------+
| Symbols | Description                                                    |
+=========+================================================================+
| ``$var``| Variable referencing a **Match**                               |
+---------+----------------------------------------------------------------+
| ``()``  | Used to group / prioritize some logical expressions            |
+---------+----------------------------------------------------------------+
| ``!``   | Negates a **Match** or a grouped expression                    |
+---------+----------------------------------------------------------------+
| ``AND`` | AND logical operator                                           |
+---------+                                                                |
| ``and`` |                                                                |
+---------+                                                                |
| ``&&``  |                                                                |
+---------+----------------------------------------------------------------+
| ``OR``  | OR logical operator                                            |
+---------+                                                                |
| ``or``  |                                                                |
+---------+                                                                |
| ``||``  |                                                                |
+---------+----------------------------------------------------------------+
.. important::
**Matches** are evaluated on demand, in the same order as their **variables** appear in the **Condition**.
So the order of the **variables** has an impact on the **rule speed**. A good practice is to put the most selective **Matches** first,
to abort the condition evaluation as soon as possible and prevent useless **Matches** from being evaluated.
Example
^^^^^^^
The following rule matches suspicious explicit network logons and shows
a case where the order of the **variables** in the condition matters.
We match on ``LogonType`` first, since it is mandatory for the condition to be met:
the evaluation aborts after the first **Match** for every event whose ``LogonType``
is not **3**.
.. code-block:: JSON
{
"Name": "ExplicitNetworkLogon",
"Tags": [
"Lateral",
"Security"
],
"Meta": {
"Events": {
"Security": [
4624
]
},
"Criticality": 5,
"Schema": "2.0.0"
},
"Matches": [
"$logt: LogonType = '3'",
"$user: TargetUserName = 'ANONYMOUS LOGON'",
"$iplh1: IpAddress = '-'",
"$iplh2: IpAddress = '127.0.0.1'",
"$enddol: TargetUserName ~= '\\$$'"
],
"Condition": "$logt and !($user or $iplh1 or $iplh2 or $enddol)"
}
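The effect of the variable order can be reproduced with a small Python sketch. It relies on the short-circuit behaviour of Python's ``and`` / ``or`` as a stand-in for the engine's evaluation, and records which **Matches** would actually be evaluated; ``evaluate`` is a hypothetical helper written for this condition only.

```python
def evaluate(results):
    """Evaluate "$logt and !($user or $iplh1 or $iplh2 or $enddol)".

    ``results`` maps each variable to the outcome its Match would have.
    Returns the outcome of the condition and the Matches really evaluated.
    """
    order = []

    def match(name):
        # A Match is only computed when the condition reaches its variable.
        order.append(name)
        return results[name]

    outcome = match("logt") and not (
        match("user") or match("iplh1") or match("iplh2") or match("enddol")
    )
    return outcome, order


# Interactive logon (LogonType is not 3): only one Match is evaluated.
print(evaluate({"logt": False, "user": False, "iplh1": False,
                "iplh2": False, "enddol": False}))
# (False, ['logt'])

# Network logon of ANONYMOUS LOGON: evaluation stops at $user.
print(evaluate({"logt": True, "user": True, "iplh1": False,
                "iplh2": False, "enddol": False}))
# (False, ['logt', 'user'])
```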
Regular Expression Templates
----------------------------
.. warning::
Templates use the **TOML** format (see https://toml.io/en/) since **v2.0.0**
Regex templates have been introduced to remove the burden of maintaining rules
sharing the same regular expressions. Let's take the common example of suspicious
binaries we want to create rules on. A basic matching regex would look like this:
``(?i:\\(certutil|rundll32)\.exe)``. Assuming this regex is used in **several rules**,
it is a big burden to update all of them whenever we want to add a new executable name
to the list. The idea behind regex templates is to centralize such shared regexes
inside **configuration file(s)** for easier maintenance.
**File Extension**: ``.toml`` (the file must be located in the rule directory in use)
**Syntax**
* **definition:** ``TEMPLATE_NAME = 'REGULAR_EXPRESSION'``
* **usage in Match:** ``{{TEMPLATE_NAME}}``
Example
^^^^^^^
Here is an example of a few **regex templates**:
.. code-block:: toml
# Extensions
script-exts = '(?i:(\.ps1|\.bat|\.cmd|\.vb|\.vbs|\.vbscript|\.vbe|\.js|\.jse|\.ws|\.wsf))'
exec-exts = '(?i:(\.acm|\.ax|\.com|\.cpl|\.dic|\.dll|\.drv|\.ds|\.efi|\.exe|\.grm|\.iec|\.ime|\.lex|\.msstyles|\.mui|\.ocx|\.olb|\.rll|\.rs|\.scr|\.sys|\.tlb|\.tsp|\.winmd|\.node))'
# Exe to monitor
suspicious = '(?i:\\(certutil|rundll32|powershell|wscript|cscript|cmd|mshta|regsvr32|msbuild|installutil|regasm)\.exe)'
.. important::
Only Go regexp special characters **need to be escaped**.
* **Windows path separator** ``\`` needs to be escaped only once (i.e. ``\\``) in template definitions.
The templates defined above can then be used in a rule:
.. code-block:: JSON
{
"Name": "HeurDropper",
"Tags": [
"Heuristics",
"CreateFile"
],
"Meta": {
"Events": {
"Microsoft-Windows-Sysmon/Operational": [
11
]
},
"Criticality": 8,
"Author": "0xrawsec",
"Comments": "Experimental rule to detect executable files dropped by common utilities",
"Schema": "2.0.0"
},
"Matches": [
"$susp: Image ~= '{{suspicious}}$'",
"$target: TargetFilename ~= '({{exec-exts}}|{{script-exts}})$'",
"$poltest: TargetFilename ~= '(?i:C:\\\\Users\\\\.*?\\\\AppData\\\\Local\\\\Temp\\\\__PSScriptPolicyTest_.*?\\.ps1)'"
],
"Condition": "$susp and $target and !$poltest"
}
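Template replacement can be pictured as a plain textual substitution of every ``{{TEMPLATE_NAME}}`` placeholder. The Python sketch below is an assumption about the principle, not the engine's implementation: ``expand`` is a hypothetical helper and the templates are given as a dictionary instead of being read from a ``.toml`` file.

```python
import json
import re

# Shortened versions of the templates, as they are written in the TOML file.
templates = {
    "suspicious": r"(?i:\\(certutil|rundll32)\.exe)",
    "script-exts": r"(?i:(\.ps1|\.bat))",
}

PLACEHOLDER = re.compile(r"\{\{([\w-]+)\}\}")


def expand(match, templates):
    """Replace every {{name}} placeholder by the corresponding template."""
    # A function is used as replacement so that the backslashes of the
    # template are inserted verbatim.
    return PLACEHOLDER.sub(lambda m: templates[m.group(1)], match)


expanded = expand("$susp: Image ~= '{{suspicious}}$'", templates)
print(expanded)
# $susp: Image ~= '(?i:\\(certutil|rundll32)\.exe)$'

# Once written back as JSON, every backslash is doubled.
print(json.dumps(expanded))
# "$susp: Image ~= '(?i:\\\\(certutil|rundll32)\\.exe)$'"
```

An unknown template name raises a ``KeyError`` in this sketch; how the engine reports such a case is not covered here.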
To debug rules using templates, the ``gene`` command line utility provides
the ``-dump`` command line switch, which dumps a rule as it is after
template replacement.
.. code-block:: bash
> gene -dump HeurDropper -r ./gene-rules | jq
{
"Name": "HeurDropper",
"Tags": [
"Heuristics",
"CreateFile"
],
"Meta": {
"Events": {
"Microsoft-Windows-Sysmon/Operational": [
11
]
},
"Criticality": 8,
"Schema": "2.0.0"
},
"Matches": [
"$susp: Image ~= '(?i:\\\\(certutil|rundll32|powershell|wscript|cscript|cmd|mshta|regsvr32|msbuild|installutil|regasm)\\.exe)$'",
"$target: TargetFilename ~= '((?i:(\\.acm|\\.ax|\\.com|\\.cpl|\\.dic|\\.dll|\\.drv|\\.ds|\\.efi|\\.exe|\\.grm|\\.iec|\\.ime|\\.lex|\\.msstyles|\\.mui|\\.ocx|\\.olb|\\.rll|\\.rs|\\.scr|\\.sys|\\.tlb|\\.tsp|\\.winmd|\\.node))|(?i:(\\.ps1|\\.bat|\\.cmd|\\.vb|\\.vbs|\\.vbscript|\\.vbe|\\.js|\\.jse|\\.ws|\\.wsf)))$'",
"$poltest: TargetFilename ~= '(?i:C:\\\\Users\\\\.*?\\\\AppData\\\\Local\\\\Temp\\\\__PSScriptPolicyTest_.*?\\.ps1)'"
],
"Condition": "$susp and $target and !$poltest"
}
.. note::
As you can see in the dumped rule, a single ``\`` becomes ``\\``. This is due
to the encoding of special characters in **JSON**.
.. note::
Adding a new extension to the list is now easy: a single change
impacts all the rules using this template.
MITRE ATT&CK Integration
------------------------
Gene has full support for the MITRE ATT&CK framework through the **ATTACK** field of the
**Meta** section of the rule definition. What is documented there is purely informational
and can be displayed in the reported alerts.
Example
^^^^^^^
Consider the following rule matching suspicious ADS (Alternate Data Stream) creation.
.. code-block:: JSON
{
"Name": "ExecutableADS",
"Tags": [
"ADS"
],
"Meta": {
"Events": {
"Microsoft-Windows-Sysmon/Operational": [
15
]
},
"ATTACK": [
{
"ID": "T1096",
"Tactic": "defense-evasion",
"Reference": "https://attack.mitre.org/techniques/T1096"
}
],
"Criticality": 10,
"Schema": "2.0.0"
},
"Matches": [
"$unk: Hash = 'Unknown'",
"$impash: Hash ~= '(?i:(IMPHASH=00000000000000000000000000000000))'"
],
"Condition": "!($impash or $unk)"
}
The alert reported would look like the following.
.. code-block:: JSON
{
"Event": {
"EventData": {
"CreationUtcTime": "2018-02-23 13:17:31.176",
"Hash": "SHA1=E8B4D84A28E5EA17272416EC45726964FDF25883,MD5=09F7401D56F2393C6CA534FF0241A590,SHA256=6766717B8AFAFE46B5FD66C7082CCCE6B382CBEA982C73CB651E35DC8187ACE1,IMPHASH=68E56344CAB250384904953E978B70A9",
"Image": "C:\\Windows\\system32\\cmd.exe",
"ProcessGuid": "{49F1AF32-12C5-5A90-0000-00100AEA0B00}",
"ProcessId": "2100",
"TargetFilename": "C:\\Users\\CALDUS~1\\AppData\\Local\\Temp\\test.txt:malicious.exe",
"UtcTime": "2018-02-23 13:17:31.192"
},
"GeneInfo": {
"ATTACK": [
{
"ID": "T1096",
"Tactic": "defense-evasion",
"Reference": "https://attack.mitre.org/techniques/T1096"
}
],
"Criticality": 10,
"Signature": [
"ExecutableADS"
]
},
"System": {
"...": "..."
}
}
}
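Since the **ATTACK** information is carried over to the ``GeneInfo`` section of the alert, it is easy to consume downstream. The following Python sketch lists the techniques attached to an alert; ``attack_summary`` is a hypothetical helper and the alert is reduced to the fields needed for the illustration.

```python
import json

# Reduced version of the alert shown above.
alert_text = """
{
  "Event": {
    "GeneInfo": {
      "ATTACK": [
        {
          "ID": "T1096",
          "Tactic": "defense-evasion",
          "Reference": "https://attack.mitre.org/techniques/T1096"
        }
      ],
      "Criticality": 10,
      "Signature": ["ExecutableADS"]
    }
  }
}
"""


def attack_summary(alert):
    """Return the (technique ID, tactic) pairs attached to an alert."""
    info = alert.get("Event", {}).get("GeneInfo", {})
    # Alerts raised by rules without ATTACK information yield an empty list.
    return [(t["ID"], t["Tactic"]) for t in info.get("ATTACK") or []]


alert = json.loads(alert_text)
print(attack_summary(alert))  # [('T1096', 'defense-evasion')]
```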