Syntax
Extract fields from the MDM
In Message Query Language (MQL), fields are extracted from the MDM by specifying the field name. To get a subfield, join the parent field and the subfield with a . in between.
For example, to retrieve the sender's display name:
sender.display_name
Literal values
Strings can be specified in two forms: escaped and unescaped.
Escaped strings are surrounded by double quotes ", and special characters can be escaped with \. If an unrecognized character follows \, it is treated as an invalid escape sequence and causes a syntax error.
"hello world"
"line 1\nline2\nline3"
"unicode characters like ✉️ are supported"
Unescaped (raw) strings are surrounded by single quotes '. No escape sequences are supported within a raw string. However, two single quotes '' can be used to insert a single quote character.
'hello world'
'escaping apostrophes isn''t that difficult'
'this back\slash is interpreted literally'
Escape sequences
\r: carriage return CR (ASCII 0x0d)
\n: new line LF (ASCII 0x0a)
\t: tab (ASCII 0x09)
\': single quote ' (ASCII 0x27)
\": double quote " (ASCII 0x22)
\\: backslash \ (ASCII 0x5c)
\u{xxxxxxxx}: Unicode code point between 0x01 and 0x10ffff
Unicode escape sequences can include any valid Unicode code point, including ASCII characters, non-printable characters, or other Unicode characters. Between 2 and 8 hex digits are recognized within { and }.
Example Unicode escapes:
\u{0a}: identical to \n for a newline
\u{0398}: Greek capital theta: Θ
\u{200f}: Unicode right-to-left mark
\u{1f4ec}: open mailbox emoji: 📬
\u{0001f4ec}: open mailbox emoji, with optional leading zeros: 📬
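The string rules above can be modelled in a few lines of Python. This is only a sketch of the behavior described in this section, not MQL's actual parser: the function names decode_escaped and decode_raw are hypothetical, and both take the text between the quotes rather than a full string literal.

```python
import re

# Simple escapes from the table above: the character after the backslash
# maps to the character it produces.
SIMPLE_ESCAPES = {
    "r": "\r", "n": "\n", "t": "\t",
    "'": "'", '"': '"', "\\": "\\",
}

# u{...} with between 2 and 8 hex digits, as described above.
UNICODE_ESCAPE = re.compile(r"u\{([0-9a-fA-F]{2,8})\}")


def decode_escaped(body: str) -> str:
    """Decode the text between the double quotes of an escaped string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        rest = body[i + 1:]
        match = UNICODE_ESCAPE.match(rest)
        if match:
            code_point = int(match.group(1), 16)
            if not 0x01 <= code_point <= 0x10FFFF:
                raise ValueError("code point out of range")
            out.append(chr(code_point))
            i += 1 + match.end()  # skip the backslash and the u{...} part
        elif rest[:1] in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[rest[0]])
            i += 2
        else:
            # Unrecognized character after the backslash: a syntax error in MQL.
            raise ValueError(f"invalid escape sequence at offset {i}")
    return "".join(out)


def decode_raw(body: str) -> str:
    """Decode the text between the single quotes of a raw string."""
    # Backslashes are literal; only a doubled single quote is special.
    return body.replace("''", "'")
```

With this sketch, decode_escaped applied to the six characters \u{0a} and to the two characters \n both produce a single newline, while decode_raw leaves a backslash untouched.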
Comparing values
MQL provides eight built-in operators to compare two values. The syntax is <left> <operator> <right>:
sender.email.domain.root_domain == "sublimesecurity.com"
Numbers can be compared with the operators below. MQL can represent unsigned, signed, or floating-point values.
<: less than
<=: less than or equal to
==: equal to
!=: not equal to
>=: greater than or equal to
>: greater than
Note: When two values of different numeric types are compared, they are automatically converted to compatible types. For example, in the comparison 1 < 1.5, the integer 1 is compared to the floating-point value 1.5. First, the 1 is converted to the floating-point value 1.0, and then the comparison is performed. This means that an expression like 3 == 3.14 will always evaluate to false because it is converted to 3.0 == 3.14.
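The conversion described in the note can be sketched in Python as follows. The helper names promote and compare_equal are hypothetical, and the sketch only models the integer-to-float case mentioned above, not MQL's handling of unsigned versus signed integers.

```python
def promote(left, right):
    """Convert a mixed int/float pair to two floats; leave other pairs alone."""
    if isinstance(left, float) or isinstance(right, float):
        return float(left), float(right)
    return left, right


def compare_equal(left, right):
    """Model == on numbers: promote first, then compare."""
    left, right = promote(left, right)
    return left == right
```

Here promote(1, 1.5) yields the pair (1.0, 1.5), and compare_equal(3, 3.14) is false because the comparison performed is 3.0 == 3.14.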
Strings support two additional operators for case-insensitive equality. Unless explicitly specified, assume that strings are compared case-sensitively (meaning that "a" and "A" are distinct). Range operators, such as <, use lexicographical ordering and are always case-sensitive.
case-sensitive comparisons
<: less than
<=: less than or equal to
==: equal to ("Abc" == "abc" evaluates as false)
!=: not equal to
>=: greater than or equal to
>: greater than
case-insensitive comparisons
=~: case-insensitive equality ("Abc" =~ "abc" evaluates as true)
!~: case-insensitive inequality ("Abc" !~ "abc" evaluates as false)
Booleans can only be compared with == or !=. Booleans also have dedicated operators for traditional boolean logic, such as and (see the Boolean logic section below).
Range checking
A common pattern when comparing values is to check whether a value is within a given range. MQL provides syntactic sugar for this kind of comparison: <lower> <operator> x <operator> <upper>. Some examples:
strings.levenshtein(sender.email.email, "[email protected]") > 4 and
strings.levenshtein(sender.email.email, "[email protected]") <= 7
# more compact form
4 < strings.levenshtein(sender.email.email, "[email protected]") <= 7
'abc' <= subject.subject < 'xyz'
Like other comparisons, both strings and numbers are supported; however, range checking only supports the following operators:
<: less than
<=: less than or equal to
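As a rough model of the desugaring, a range check behaves like two comparisons joined by and. The Python sketch below assumes that reading; range_check and RANGE_OPERATORS are hypothetical names, and Python's string ordering is only a stand-in for MQL's lexicographical ordering.

```python
import operator

# Only < and <= are allowed in a range check, per the list above.
RANGE_OPERATORS = {"<": operator.lt, "<=": operator.le}


def range_check(lower, lower_op, value, upper_op, upper):
    """Evaluate `lower lower_op value upper_op upper` as two joined comparisons."""
    if lower_op not in RANGE_OPERATORS or upper_op not in RANGE_OPERATORS:
        raise ValueError("range checks only support < and <=")
    lower_ok = RANGE_OPERATORS[lower_op](lower, value)
    upper_ok = RANGE_OPERATORS[upper_op](value, upper)
    return lower_ok and upper_ok
```

For example, range_check(4, "<", 7, "<=", 7) is true because the upper bound is inclusive, while range_check(4, "<", 4, "<=", 7) is false because the lower bound is exclusive.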
Boolean logic
Multiple boolean expressions can be combined with traditional boolean operators. Use the and, or, and not keywords for boolean operations:
sender.email.email == "[email protected]" and subject.subject == "Password reset request"
sender.email.email == "[email protected]" or subject.subject == "Password reset request"
not (sender.email.email == "[email protected]")
Check against multiple values with in
One common pattern that often combines comparisons with boolean logic is to check the same field against multiple literal values. Instead of combining similar equality checks (==) with a logical or, use the in keyword to check a value against a set of values.
For example, to detect subjects commonly used in BEC attacks:
subject.subject == "Urgent" or
subject.subject == "Can you help" or
subject.subject == "Quick errand"
# more compact form with `in`
subject.subject in ("Urgent", "Can you help", "Quick errand")
To check the inverse and ensure that a value is not in a list of values, use not in:
subject.subject not in ("Urgent", "Can you help", "Quick errand")
For a case-insensitive check, use in~:
subject.subject in~ ("Urgent", "Can you help", "Quick errand")
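The difference between in and in~ can be illustrated with a small Python model. The names is_in and is_in_ci are hypothetical, and the sketch assumes that Python's casefold() is a reasonable approximation of MQL's case-insensitive matching, which this page does not specify in detail.

```python
def is_in(value, candidates):
    """Model `value in (...)`: a case-sensitive equality check per candidate."""
    return any(value == candidate for candidate in candidates)


def is_in_ci(value, candidates):
    """Model `value in~ (...)`: compare after folding case (an assumption)."""
    folded = value.casefold()
    return any(folded == candidate.casefold() for candidate in candidates)


SUBJECTS = ("Urgent", "Can you help", "Quick errand")
```

With these definitions, is_in("urgent", SUBJECTS) is false because the casing differs, whereas is_in_ci("urgent", SUBJECTS) is true. The not in form is simply the negation of is_in.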
Checking array fields with in
The syntax for in and in~ can also be used to look up a single value in an array field or in the result of a function that returns an array. The syntax x in array_field is simply shorthand for any(array_field, . == x).
For example, with file.explode and an explicit any over the identifiers:
any(attachments, .file_extension in~ ('html', 'htm') and
any(file.explode(.),
any(.scan.javascript.identifiers, . == "unescape"))
)
The same check, using in against .scan.javascript.identifiers:
any(attachments, .file_extension in~ ('html', 'htm') and
any(file.explode(.),
"unescape" in .scan.javascript.identifiers)
)
Matching several boolean expressions with of
Occasionally when writing rules, a minimum number of boolean clauses must match. With and, all the clauses must be true, and with or, at least one clause has to be true. Use of as a hybrid operator when checking for a minimum number of matches.
X of (...) evaluates true if at least X terms evaluate true. The basic structure for of follows:
X of (clause1, clause2, ..., clauseN)
Using 1 of (...) is identical to an or over all of the terms. Similarly, if X is the same as the total number of terms, then it's equivalent to and. The minimum number of clauses to check with of must be between 1 and the total number of clauses between ( and ).
For example,
# returns true
3 of (true, false, true, true)
# returns false
3 of (true, false, false, true)
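The semantics of of can be sketched in Python as a count of true clauses compared against the minimum. The function name at_least is hypothetical; rejecting an out-of-range minimum with an exception is an assumption standing in for whatever error MQL reports.

```python
def at_least(minimum, clauses):
    """Model `minimum of (clause1, ..., clauseN)`."""
    clauses = list(clauses)
    # The minimum must be between 1 and the number of clauses.
    if not 1 <= minimum <= len(clauses):
        raise ValueError("minimum must be between 1 and the number of clauses")
    true_count = sum(1 for clause in clauses if clause)
    return true_count >= minimum
```

Applied to the two examples above, at_least(3, [True, False, True, True]) is true and at_least(3, [True, False, False, True]) is false. A minimum of 1 behaves like or, and a minimum equal to the number of clauses behaves like and.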
Checking a named list with in
A value can be checked against a list by using the in $list or not in $list syntax. See the full reference of currently available lists.
sender.email.domain.domain in $alexa_1m
sender.email.domain.domain not in $alexa_1m
Creating arrays
Create arrays on the fly with a list of literal or dynamic values. Arrays can be combined with array functions such as any and all to consolidate logic.
To create an array, enclose a comma-separated list of values in [ and ]:
["foo", "bar", "baz"]
[body.plain.text, body.html.text]
For example, to check if either body.plain.text or body.html.text contains a Social Security Number:
any([body.plain.text, body.html.text],
regex.contains(., '\b(\d\d\d)-(\d\d)-(\d\d\d\d)\b')
)
Using any with a custom array is equivalent to writing an or with two regex.contains calls:
regex.contains(body.plain.text, '\b(\d\d\d)-(\d\d)-(\d\d\d\d)\b') or
regex.contains(body.html.text, '\b(\d\d\d)-(\d\d)-(\d\d\d\d)\b')
To check if any of the recipients uses a disposable email provider:
any([recipients.to, recipients.cc, recipients.bcc],
any(., .email.domain.domain in $disposable_email_providers)
)
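The equivalence between any over an ad hoc array and a chain of or checks can be mirrored in Python. This is only an analogy: Python's re module is not MQL's regex engine, and the helper names below are hypothetical. Both arguments are assumed to be strings.

```python
import re

# Same pattern as the Social Security Number example above.
SSN_PATTERN = re.compile(r"\b(\d\d\d)-(\d\d)-(\d\d\d\d)\b")


def any_body_has_ssn(plain_text, html_text):
    """Array form: apply one predicate to every element of a small list."""
    return any(SSN_PATTERN.search(text) for text in [plain_text, html_text])


def either_body_has_ssn(plain_text, html_text):
    """Expanded form: the same predicate written out twice and joined by or."""
    return bool(SSN_PATTERN.search(plain_text) or SSN_PATTERN.search(html_text))
```

Both functions return the same result for any pair of strings; the array form just avoids repeating the predicate.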
See array functions for the full list of array functions.
Arithmetic
All numbers support standard arithmetic operations. When two numbers of different types are used in arithmetic, they are first converted to a matching type. That means 1 * 2.0 is automatically converted to 1.0 * 2.0. The supported arithmetic operations are:
+: Add two numbers
-: Subtract two numbers
*: Multiply two numbers
/: Divide two numbers. If both values are integers, this indicates integer division, meaning that 5 / 2 is 2. To get 2.5, add a decimal point to either side: 5 / 2.0, 5.0 / 2, and 5.0 / 2.0 are all equivalent.
%: Modulo of two numbers
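The division rule can be modelled in Python, where the distinction between integer and floating-point division has to be made explicit. The function name divide is hypothetical. The sketch assumes non-negative operands and a non-zero divisor, because this page does not say how MQL rounds negative quotients or handles division by zero.

```python
def divide(left, right):
    """Model the / operator for non-negative operands (an assumption)."""
    if isinstance(left, int) and isinstance(right, int):
        # Both sides are integers: integer division, so 5 / 2 gives 2.
        return left // right
    # Otherwise both sides are promoted to floating point first.
    return float(left) / float(right)
```

So divide(5, 2) gives 2, while divide(5, 2.0), divide(5.0, 2), and divide(5.0, 2.0) all give 2.5.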
Order of operations
The precedence of operations, ordered from highest to lowest:
( ... ): parentheses
*, /, and %: multiplication, division, and modulus
+ and -: addition and subtraction
<, <=, ==, =~, !=, !~, >=, >, in: comparisons
of: matching multiple boolean terms
not: boolean NOT
and: boolean AND
or: boolean OR
Common logic errors
Missing parentheses
The following rule will match any message with "microsoft team" in the body, regardless of whether type.inbound is true, because and binds more tightly than or.
type.inbound
and strings.ilike(subject.subject, "*microsoft team*")
or strings.ilike(body.html.inner_text, "*microsoft team*")
^^-- oops!
That's the same as typing:
(
type.inbound
and strings.ilike(subject.subject, "*microsoft team*")
) or strings.ilike(body.html.inner_text, "*microsoft team*")
^^-- oops!
If you intend for the or to cover the two ilike checks, you need parentheses to make the order of operations explicit. That way, type.inbound must always evaluate as true:
type.inbound
and (
strings.ilike(subject.subject, "*microsoft team*")
or strings.ilike(body.html.inner_text, "*microsoft team*")
)
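Python happens to share the relative precedence of and and or listed above, so the pitfall can be reproduced directly. In this sketch the three boolean parameters are hypothetical stand-ins for type.inbound and the two strings.ilike checks.

```python
def without_parentheses(inbound, subject_match, body_match):
    # Parsed as (inbound and subject_match) or body_match.
    return inbound and subject_match or body_match


def with_parentheses(inbound, subject_match, body_match):
    # The or is evaluated first, so inbound must always be true.
    return inbound and (subject_match or body_match)
```

For a message that is not inbound but has a matching body, without_parentheses(False, False, True) is true while with_parentheses(False, False, True) is false.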
Comments
MQL supports single-line comments beginning with //.
type.inbound
// check that the sender's domain does not use the ru TLD
and sender.email.domain.tld != 'ru'