Grammar Regular Expression Engine

Updated 11 months ago by admin

This section describes the regular expression syntax that the Advanced DLP add-on supports.

The DLP engine parser interprets regular expressions almost identically to UNIX regular expression syntax.


The following table describes the base regular expression operators available in the DLP engine and the pattern that each operator matches.


Operator    Matched Pattern

\           Quote the next metacharacter
^           Match the beginning of a line
$           Match the end of a line
.           Match any character (except newline)
()          Used for grouping to force operator precedence
[xy]        The character x or y
[x-z]       The range of characters between x and z
[^z]        Any character except z

Notes on [^z]:
For performance reasons, explicitly list all the characters that you want to match rather than using this operator.
To use negated character classes in case-insensitive entities, include letters in both cases, for example [^Zz] rather than [^z].
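
The base operators can be tried out with a short sketch. It uses Python's re module as a stand-in, on the assumption that it behaves like the DLP engine for these basic operators; the DLP engine itself is not invoked, and the sample strings and variable names are illustrative only.

```python
import re

# Python's re module stands in for the DLP engine; edge cases may differ.

# \ quotes the next metacharacter, so "\." matches a literal dot only.
literal_dot = re.search(r"3\.14", "pi is 3.14") is not None
no_literal_dot = re.search(r"3\.14", "3x14") is not None

# ^ and $ anchor a match to the beginning and end of a line
# (Python needs re.MULTILINE for line-by-line anchoring).
lines = "ID: 42\nname: x\nID: 7"
ids = re.findall(r"^ID: [0-9]+$", lines, flags=re.MULTILINE)

# . matches any character except newline.
dot_hits = re.findall(r"a.c", "abc a\nc a-c")

# () groups, so + applies to the whole "ab" sequence.
grouped = re.fullmatch(r"(ab)+", "ababab") is not None

# Character classes: [xy], [x-z], and [^z].
either = re.findall(r"[xy]", "xyz")
ranged = re.findall(r"[x-z]", "wxyz")
negated = re.findall(r"[^z]", "xyz")

# Case-insensitive matching with both cases listed in the negated class.
both_cases = re.findall(r"[^Zz]", "aZz", flags=re.IGNORECASE)
```

Listing both cases in [^Zz] is harmless in Python, so the same pattern can be written once and used in either environment.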



Quantifier  Matched Pattern

*           Match 0 or more times
+           Match 1 or more times
?           Match 0 or 1 times
{n}         Match exactly n times
{n,}        Match at least n times
{n,m}       Match at least n times, but no more than m times
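
The quantifiers can be compared side by side with a small sketch. It again uses Python's re module as a stand-in for the DLP engine (an assumption); the helper name `full` is hypothetical and exists only for this illustration.

```python
import re

def full(pattern, text):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# *     : 0 or more times
star = [full(r"ab*", s) for s in ("a", "ab", "abbb")]

# +     : 1 or more times
plus = [full(r"ab+", s) for s in ("a", "ab", "abbb")]

# ?     : 0 or 1 times
optional = [full(r"colou?r", s) for s in ("color", "colour", "colouur")]

# {n}   : exactly n times
exact = [full(r"[0-9]{4}", s) for s in ("123", "1234", "12345")]

# {n,}  : at least n times
at_least = [full(r"[0-9]{2,}", s) for s in ("1", "12", "12345")]

# {n,m} : at least n times, but no more than m times
bounded = [full(r"[0-9]{2,4}", s) for s in ("1", "123", "12345")]
```

Using a full match rather than a search keeps the upper bound of {n,m} visible: a search would still find four digits inside a five-digit string.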



Metacharacter  Matched Pattern

\t             Match tab
\n             Match newline
\r             Match return
\f             Match formfeed
\a             Match alarm (bell, beep, and so on)
\e             Match escape
\v             Match vertical tab
\021           Match octal character (in this example, 21 octal)
\xF0           Match hex character (in this example, F0 hex)
\x{263a}       Match wide hex character (Unicode)
\w             Match word character: [A-Za-z0-9_]
\W             Match non-word character: [^A-Za-z0-9_]
\s             Match whitespace character: [ \t\n\r]. This metacharacter also includes \n and \r
\S             Match non-whitespace character: [^ \t\n\r]
\d             Match digit character: [0-9]
\D             Match non-digit character: [^0-9]
\b             Match word boundary
\B             Match non-word boundary
\A             Match start of string (never match at line breaks)
\Z             Match end of string. Never match at line breaks; only match at the end of the final buffer of text submitted for matching
\p{...}        Match any character that belongs to the specified Unicode character class. For example, \p{Sc} matches any currency symbol. You can omit the braces for single-character class names: \p{C} and \pC are equivalent
\P{...}        Match any character that does not belong to the specified Unicode character class. For example, \P{Sc} matches any character that is not a currency symbol. You can omit the braces for single-character class names: \P{C} and \PC are equivalent

For performance reasons, it is recommended that you avoid using negated character classes where possible
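
The following sketch illustrates several of the metacharacters, with Python's re module standing in for the DLP engine (an assumption). Python's re does not support \p{...}, \P{...}, \e, or the wide hex form, so the Unicode class check is approximated with the standard unicodedata module. The helper `in_category` and all sample strings are hypothetical.

```python
import re
import unicodedata

text = "Card 4111-1111 expires\t12/29\nend"

# \d, \w, and \s shorthand classes.
digits = re.findall(r"\d+", text)
words = re.findall(r"\w+", "user_1 paid $20")
parts = re.split(r"\s+", "a \t b\r\nc")

# \b matches at a word boundary, \B everywhere else.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")
inner_starts = [m.start() for m in re.finditer(r"\Bcat", "cat concat")]

# ^ matches at every line start in multiline mode;
# \A and \Z match only at the start and end of the whole string.
two_lines = "first\nsecond"
line_starts = re.findall(r"^\w+", two_lines, flags=re.MULTILINE)
string_start = re.findall(r"\A\w+", two_lines, flags=re.MULTILINE)
string_end = re.findall(r"\w+\Z", two_lines, flags=re.MULTILINE)

# Octal and hex escapes: \021 is character 17, \xF0 is character 240.
octal_ok = re.fullmatch(r"\021", chr(0o21)) is not None
hex_ok = re.fullmatch(r"\xF0", chr(0xF0)) is not None

def in_category(ch, prefix):
    """Approximate \\p{...}: True if the character's Unicode general
    category starts with the given prefix (for example "Sc" or "C")."""
    return unicodedata.category(ch).startswith(prefix)

price = "Price: $5 or \u20ac4 or \u00a5500"
currency = [ch for ch in price if in_category(ch, "Sc")]          # like \p{Sc}
not_currency = [ch for ch in price if not in_category(ch, "Sc")]  # like \P{Sc}
```

The negated list is far larger than the positive one, which shows in miniature why negated classes match much more input than an explicit list of wanted characters.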
