All regular expression functions below currently use RE2 syntax (specifically, the Go flavor of RE2). RE2 shares many features with PCRE, but is typically more efficient and omits PCRE's computationally expensive features. For more details, see the official documentation for RE2.
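To make the difference from PCRE concrete, here is a minimal sketch using Go's standard `regexp` package, which implements the Go flavor of RE2. The helper name `compiles` is hypothetical and exists only for this illustration. Backreferences and lookaheads are two well-known PCRE features that RE2 rejects at compile time.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper: it reports whether Go's RE2-based
// regexp package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`\d{3}-\d{4}`)) // ordinary syntax shared with PCRE
	fmt.Println(compiles(`(a)\1`))       // backreference: PCRE only
	fmt.Println(compiles(`foo(?=bar)`))  // lookahead: PCRE only
}
```

If a pattern copied from a PCRE-based tool fails to compile, an unsupported feature like these is a likely cause.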

📘
Single quote ' vs double quote "
Use single quotes for regular expressions so that escape sequences such as \d don't have to be escaped twice.
For instance, to find three consecutive digits anywhere in a string, use regex_search(field, '\d\d\d') so that each backslash does not need to be doubled. This is equivalent to regex_search(field, "\\d\\d\\d").
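The same distinction exists in other languages. As an analogy only (this is Go, not the query language documented here), Go's backtick raw strings play the role of single quotes, and its double-quoted strings require doubled backslashes. The helper `hasMatch` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// hasMatch is a hypothetical helper: it reports whether pattern matches
// anywhere in s.
func hasMatch(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	raw := `\d\d\d`        // raw string: backslashes reach the regex engine untouched
	escaped := "\\d\\d\\d" // interpreted string: every backslash must be doubled

	fmt.Println(raw == escaped) // both spell the same three-digit pattern
	fmt.Println(hasMatch(raw, "order 123 shipped"))
}
```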

`match` / `imatch`

regex.match(input: string, pattern: regexp, ...) -> bool
regex.imatch(input: string, pattern: regexp, ...) -> bool

match is used to match an _entire string_ against a list of regular expressions. If at least one of the regular expressions matches the entire string, match returns true. match performs a case-sensitive regular expression match, while imatch is case-insensitive. Both functions match against the entire string, so add leading and trailing wildcards (.* or .*?) to search for a substring within the input string.

If input is null, then match and imatch will return null.

Examples

regex.match("[email protected]", ".*[email protected]") -> true

# use imatch for case-insensitive matches
regex.match("[email protected]", "@sublimesecurity.com") -> false
regex.imatch("[email protected]", "@sublimesecurity.com") -> true

# if multiple regular expressions are provided, only one needs to match
regex.match("[email protected]", "@.*.org$", "@.*.com$", "@.*.gov$") -> true
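The whole-string semantics described above can be sketched in Go's RE2-based `regexp` package. This is an illustration under stated assumptions, not the actual implementation: the helper `fullMatch`, its `caseInsensitive` parameter, and the sample address `alice@example.com` are all hypothetical. Wrapping each pattern in `\A(?:...)\z` forces it to cover the entire input, and the `(?i)` flag stands in for imatch.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch is a hypothetical helper: it returns true if at least one
// pattern matches the ENTIRE input string.
func fullMatch(input string, caseInsensitive bool, patterns ...string) bool {
	for _, p := range patterns {
		expr := `\A(?:` + p + `)\z` // anchor both ends of the input
		if caseInsensitive {
			expr = `(?i)` + expr
		}
		if regexp.MustCompile(expr).MatchString(input) {
			return true
		}
	}
	return false
}

func main() {
	addr := "alice@example.com" // hypothetical sample input

	fmt.Println(fullMatch(addr, false, `.*@example\.com`)) // leading wildcard covers "alice"
	fmt.Println(fullMatch(addr, false, `@example\.com`))   // no wildcard: not the whole string
	fmt.Println(fullMatch("Alice@EXAMPLE.com", true, `.*@example\.com`))
	fmt.Println(fullMatch(addr, false, `.*\.org`, `.*\.com`)) // only one pattern needs to match
}
```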

`contains` / `icontains`

regex.contains(input: string, pattern: regexp, ...) -> bool
regex.icontains(input: string, pattern: regexp, ...) -> bool

regex.contains is used to check whether a string contains a substring that matches at least one of a list of regular expressions. Unlike regex.match, the full string does not need to match. regex.contains(field, '\bfoo\b') has the same behavior as regex.match(field, '.*\bfoo\b.*').

For case-insensitive regular expression matching, use regex.icontains.

Examples

regex.contains("[email protected]", "@(google|sublimesecurity)") -> true

# use icontains for case-insensitive substring
regex.icontains("[email protected]", "@(google|sublimesecurity)") -> true

# if multiple regular expressions are provided, only one needs to match
regex.contains("[email protected]", "@google", "@sublimesecurity") -> true
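The stated equivalence between a substring search and a wildcard-wrapped whole-string match can be checked with a short Go sketch. The helpers `containsAny` and `matchesWhole` and the sample inputs are hypothetical, and the inputs are kept to a single line because `.` does not match a newline by default in Go's `regexp` package.

```go
package main

import (
	"fmt"
	"regexp"
)

// containsAny is a hypothetical helper: true if any pattern matches
// somewhere inside input.
func containsAny(input string, patterns ...string) bool {
	for _, p := range patterns {
		if regexp.MustCompile(p).MatchString(input) {
			return true
		}
	}
	return false
}

// matchesWhole is a hypothetical helper: true if pattern matches the
// entire input string.
func matchesWhole(input, pattern string) bool {
	return regexp.MustCompile(`\A(?:` + pattern + `)\z`).MatchString(input)
}

func main() {
	// For each input, the two results printed on a line should agree.
	for _, s := range []string{"a foo b", "food", "foo"} {
		fmt.Println(containsAny(s, `\bfoo\b`), matchesWhole(s, `.*\bfoo\b.*`))
	}
	// With several patterns, only one needs to match.
	fmt.Println(containsAny("user@example.com", `@google`, `@example`))
}
```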

`count` / `icount`

regex.count(input: string, pattern: regexp) -> int
regex.icount(input: string, pattern: regexp) -> int

regex.count is used to count the number of times pattern matches input. Matches are greedy. For example, regex.count('hello', '.*') will be 1.

For case-insensitive regular expression matching, use regex.icount.

Examples

# Find large numbers of unusual characters
regex.count(body.current_thread.text, '[^\x00-\x7F]') > 20

# Find multiple uses of excessive punctuation
regex.count(body.current_thread.text, '[!?.]{2,}') > 3
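Counting non-overlapping, greedy matches can be sketched in Go's RE2-based `regexp` package as follows. The helper `count` and the sample strings are hypothetical; the sketch only illustrates the counting behavior described above, including why `.*` against `hello` yields a single match.

```go
package main

import (
	"fmt"
	"regexp"
)

// count is a hypothetical helper: it returns the number of successive
// non-overlapping matches of pattern in input.
func count(input, pattern string) int {
	return len(regexp.MustCompile(pattern).FindAllStringIndex(input, -1))
}

func main() {
	// Greedy: ".*" consumes the whole string in one match.
	fmt.Println(count("hello", `.*`))

	// Runs of two or more punctuation marks: "!!", "??", "...".
	fmt.Println(count("Wow!! Really?? No...", `[!?.]{2,}`))

	// Characters outside the ASCII range: "ï" and "é".
	fmt.Println(count("naïve café", `[^\x00-\x7F]`))
}
```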

`extract` / `iextract`

regex.extract(input: string, pattern: regexp) -> [RegexMatch]
regex.iextract(input: string, pattern: regexp) -> [RegexMatch]

regex.extract is used to return all regular expression matches within a string, including submatches for capture groups. This is similar to regex.contains, but instead of returning a boolean true/false for whether a match exists, it returns the complete match and the individual submatches.

The returned fields for one of the matches:

full_match: The text matched by the complete regular expression, including all capture groups.
groups: A list of the strings matched by each capture group. Its length always equals the number of capture groups; a group that matched nothing is returned as "" rather than null.
named_groups: A mapping of string -> string for named capture groups. In RE2 syntax, a group is named with (?P<my_capture>.*), where "my_capture" becomes one of the keys of the mapping. The value .named_groups["my_capture"] is the string matched by that group.

For case-insensitive regular expression extraction, use regex.iextract.

Examples

// With positional capture groups
any(regex.iextract(sender.display_name, '\A(.*)\((?:via )?Google'),
    any(.groups, . in~ $org_display_names)
)

// With named capture groups
any(regex.iextract(sender.display_name,
                   '\A(?P<sender_display_name>.*)\((?:via )?Google'
                  ),
    .named_groups["sender_display_name"] in~ $org_display_names
)
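The shape of the returned matches can be approximated in Go's RE2-based `regexp` package. This is a sketch under stated assumptions: the `RegexMatch` struct below only mirrors the three fields described above, and the `extract` helper and the sample display name are hypothetical rather than the actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// RegexMatch mirrors the documented fields; the Go type itself is an assumption.
type RegexMatch struct {
	FullMatch   string
	Groups      []string
	NamedGroups map[string]string
}

// extract is a hypothetical helper: it returns every match of pattern in
// input together with its positional and named capture groups.
func extract(input, pattern string) []RegexMatch {
	re := regexp.MustCompile(pattern)
	names := re.SubexpNames() // names[0] is always "" (the full match)
	var out []RegexMatch
	for _, m := range re.FindAllStringSubmatch(input, -1) {
		rm := RegexMatch{FullMatch: m[0], Groups: m[1:], NamedGroups: map[string]string{}}
		for i, name := range names {
			if i > 0 && name != "" {
				rm.NamedGroups[name] = m[i]
			}
		}
		out = append(out, rm)
	}
	return out
}

func main() {
	pattern := `\A(?P<sender_display_name>.*)\((?:via )?Google`
	for _, m := range extract("Alice Smith (via Google Docs)", pattern) {
		fmt.Printf("%q %q %q\n", m.FullMatch, m.Groups, m.NamedGroups["sender_display_name"])
	}
	// A capture group that does not participate yields "" rather than a null value.
	fmt.Printf("%q\n", extract("ab", `(a)(x)?b`)[0].Groups)
}
```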
