RegEx functions

Functions in the regex module

All regular expression functions below currently use RE2 for the regular expression syntax. RE2 shares many features with PCRE, but is typically more efficient and omits computationally expensive PCRE features such as backreferences and lookaround assertions. For more details, see the official documentation for RE2.

RE2 regular expression tester

Single quote ' vs double quote "

Use single quotes for regular expressions so that escape sequences such as \d don't have to be escaped twice.

For instance, to find three consecutive digits anywhere in a string, use regex_search(field, '\d\d\d'), which avoids escaping each backslash a second time. This is equivalent to regex_search(field, "\\d\\d\\d").
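The same trade-off exists in other languages, which may make the rule easier to remember. The sketch below is a Python analogy only, not MQL: a Python raw string stands in for a single-quoted pattern and a regular string for a double-quoted one. The helper name has_three_digits is hypothetical.

```python
import re

# Analogy only: a raw string behaves like a single-quoted pattern,
# a regular string behaves like a double-quoted one.
raw_pattern = r'\d\d\d'        # backslashes reach the regex engine unchanged
escaped_pattern = "\\d\\d\\d"  # every backslash has to be escaped a second time

# Both spellings produce exactly the same pattern text.
assert raw_pattern == escaped_pattern


def has_three_digits(text: str) -> bool:
    """Return True if text contains three consecutive digits anywhere."""
    return re.search(raw_pattern, text) is not None
```

The raw form is shorter and harder to get wrong, which is the same reason single quotes are recommended above.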

match / imatch

regex.match(input: string, pattern: string, ...) -> bool
regex.imatch(input: string, pattern: string, ...) -> bool

match checks an entire string against a list of regular expressions. If at least one of the regular expressions matches the entire string, match returns true. match is case-sensitive, while imatch is case-insensitive. Both functions match against the entire string, so add leading and trailing wildcards (.* or .*?) to search for a substring within the input string.

If input is null, then match and imatch will return null.

Examples

regex.match("[email protected]", ".*[email protected]") -> true

# use imatch for case-insensitive matches
regex.match("[email protected]", ".*@sublimesecurity.com") -> false
regex.imatch("[email protected]", ".*@sublimesecurity.com") -> true

# if multiple regular expressions are provided, only one needs to match
regex.match("[email protected]", ".*@.*.org$", ".*@.*.com$", ".*@.*.gov$") -> true
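
The rules described for match and imatch (whole-string matching, any one pattern is enough, null in gives null out) can be approximated outside MQL. The following is a minimal Python sketch, assuming Python's re module as a stand-in for RE2; the helper match_any and the address alice@example.com are hypothetical and not part of the regex module.

```python
import re
from typing import Optional


def match_any(value: Optional[str], *patterns: str,
              ignore_case: bool = False) -> Optional[bool]:
    """Approximate match / imatch: True if any pattern matches the WHOLE string."""
    if value is None:
        # Mirrors the documented behavior: null input yields null.
        return None
    flags = re.IGNORECASE if ignore_case else 0
    # re.fullmatch succeeds only when the pattern covers the entire input.
    return any(re.fullmatch(p, value, flags) is not None for p in patterns)


# Without a leading wildcard the pattern cannot cover the local part.
print(match_any("alice@example.com", r"@example\.com"))
# A leading .* lets the pattern span the whole address.
print(match_any("alice@example.com", r".*@example\.com"))
# Case-insensitive variant, comparable to imatch.
print(match_any("alice@EXAMPLE.com", r".*@example\.com", ignore_case=True))
```

The first call prints False because the pattern does not cover the whole string, which is the most common surprise when moving from substring search to match.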

contains / icontains

regex.contains(input: string, pattern: string, ...) -> bool
regex.icontains(input: string, pattern: string, ...) -> bool

regex.contains checks whether a string contains a substring that matches at least one of a list of regular expressions. Unlike regex.match, the full string does not need to match. regex.contains(field, '\bfoo\b') has the same behavior as regex.match(field, '.*\bfoo\b.*').

For case-insensitive regular expression matching, use regex.icontains.

Examples

regex.contains("[email protected]", "@(google|sublimesecurity)") -> true

# use icontains for case-insensitive substring
regex.icontains("[email protected]", "@(google|sublimesecurity)") -> true

# if multiple regular expressions are provided, only one needs to match
regex.contains("[email protected]", "@google", "@sublimesecurity") -> true
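
The relationship between contains and match stated above can be checked with a small Python sketch. It assumes Python's re module in place of RE2 and single-line inputs; the helpers contains_any and match_any, and the example.com addresses, are hypothetical illustrations rather than MQL functions.

```python
import re


def contains_any(value: str, *patterns: str, ignore_case: bool = False) -> bool:
    """Approximate contains / icontains: True if any pattern matches a substring."""
    flags = re.IGNORECASE if ignore_case else 0
    return any(re.search(p, value, flags) is not None for p in patterns)


def match_any(value: str, *patterns: str, ignore_case: bool = False) -> bool:
    """Approximate match / imatch: True if any pattern matches the whole string."""
    flags = re.IGNORECASE if ignore_case else 0
    return any(re.fullmatch(p, value, flags) is not None for p in patterns)


# Substring search and a wildcard-wrapped whole-string match agree.
for text in ["a foo b", "food", "foo"]:
    assert contains_any(text, r"\bfoo\b") == match_any(text, r".*\bfoo\b.*")

# Case sensitivity: the plain variant misses the upper-case domain.
print(contains_any("bob@EXAMPLE.com", r"@(google|example)"))
print(contains_any("bob@EXAMPLE.com", r"@(google|example)", ignore_case=True))
```

The two print calls show False and then True, which is the difference between contains and icontains on input whose case differs from the pattern.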

count / icount

regex.count(input: string, pattern: string) -> int
regex.icount(input: string, pattern: string) -> int

regex.count returns the number of times pattern matches input. Matches are greedy. For example, regex.count('hello', '.*') returns 1, because the single match consumes the whole string.

For case-insensitive regular expression matching, use regex.icount.

Examples

# Find large numbers of unusual characters
regex.count(body.current_thread.text, '[^\x00-\x7F]') > 20

# Find multiple uses of excessive punctuation
regex.count(body.current_thread.text, '[!?.]{2,}') > 3
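
To see how the counts in these examples come about, the sketch below approximates count and icount in Python. It assumes Python's re module rather than RE2, and the helper count_matches and the sample strings are hypothetical. One known difference: Python reports an extra empty match at the end of the string for patterns that can match nothing, such as '.*', so the sketch sticks to patterns that always consume at least one character.

```python
import re


def count_matches(value: str, pattern: str, ignore_case: bool = False) -> int:
    """Approximate count / icount: number of non-overlapping matches, left to right."""
    flags = re.IGNORECASE if ignore_case else 0
    return sum(1 for _ in re.finditer(pattern, value, flags))


# Greedy matching: "ll" is consumed as one match, not two.
assert count_matches("hello", r"l+") == 1

# Runs of two or more punctuation marks: "!!!", "??" and "..." (the final "." is alone).
assert count_matches("Act now!!! Really?? Yes... ok.", r"[!?.]{2,}") == 3

# Characters outside the ASCII range: e-acute and o-umlaut.
assert count_matches("h\u00e9llo w\u00f6rld", r"[^\x00-\x7F]") == 2

# Case-insensitive counting, comparable to icount.
assert count_matches("Hello HELLO", r"hello", ignore_case=True) == 2
```

Because each run of punctuation counts once regardless of its length, a threshold such as > 3 measures how many separate bursts appear, not how many punctuation characters there are in total.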