DLP Dictionaries

Updated 2 years ago by admin

The Email Security product provides a number of built-in dictionaries to assist organisations with Data Loss Prevention (DLP). The dictionaries are used to detect potentially high-risk data leaving the organisation in email messages.

DLP Dictionaries are applied to the message body only.

The DLP dictionaries consist of Regular Expressions and keywords.

The DLP dictionaries can be applied to Message Rules using any condition that supports dictionaries, such as the Body condition.

The "Use with" column names a companion dictionary. For best accuracy, use both dictionaries in the same rule, each in its own Body condition. This reduces false positives.
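The idea of pairing a RegEx dictionary with its keyword dictionary can be sketched as follows. This is only an illustration: the pattern, the keyword list, and the function name rule_matches are hypothetical stand-ins, not the product's actual dictionaries or rule engine.

```python
import re

# Hypothetical stand-ins for a RegEx dictionary and its companion keyword dictionary.
CARD_LIKE = re.compile(r"\b(?:\d[ -]?){13,18}\d\b")
CARD_KEYWORDS = ("credit card", "card number", "expiry", "cvv")


def rule_matches(body: str) -> bool:
    """Return True only when both Body conditions match the message body."""
    has_pattern = CARD_LIKE.search(body) is not None
    lowered = body.lower()
    has_keyword = any(word in lowered for word in CARD_KEYWORDS)
    return has_pattern and has_keyword
```

A long order reference on its own would satisfy the pattern condition, but without a nearby keyword the combined rule does not fire, which is how the pairing cuts false positives.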

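Each entry below gives the dictionary name, a description (format and pattern) and, where applicable, the dictionary to use with it.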

AWS Keys (RegEx)

Format: an access key has two parts: an access key ID (such as AKIAIOSFODNN7EXAMPLE) and a secret key (such as wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY)

Pattern: either the access key ID or the secret key must be present

Use with: AWS Keys (Keywords)
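A minimal sketch of the "either part present" check, assuming the shapes of the two examples above: an ID of "AKIA" plus 16 uppercase letters or digits, and a secret of 40 letters, digits, "/" or "+". These expressions are inferred from the examples and are not the dictionary's actual expressions.

```python
import re

# Assumed shapes, inferred from the two example values in the entry above.
ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
SECRET_KEY = re.compile(r"(?<![A-Za-z0-9/+=])[A-Za-z0-9/+]{40}(?![A-Za-z0-9/+=])")


def has_aws_key(body: str) -> bool:
    """True when the body holds something shaped like an access key ID or a secret key."""
    return bool(ACCESS_KEY_ID.search(body) or SECRET_KEY.search(body))
```

The secret key shape is very loose (any 40-character base64-like run), which is one reason to pair it with the keyword dictionary.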

Azure DocumentDB Auth Key (RegEx)

Format: The string "DocumentDb" followed by the characters and strings outlined in the pattern below.

Pattern:

  • The string "DocumentDb"
  • Any combination of between 3-200 lower- or uppercase letters, digits, symbols, special characters, or spaces
  • A greater than symbol (>), an equal sign (=), a quotation mark ("), or an apostrophe (')
  • Any combination of 86 lower- or uppercase letters, digits, forward slash (/), or plus sign (+)
  • Two equal signs (=)
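The five bullets above translate fairly directly into a regular expression. The sketch below is a literal reading of the list, not the dictionary's actual expression; the name has_documentdb_key is hypothetical, and excluding newlines from the free-form run is an assumption.

```python
import re

# Literal translation of the bullets above, one fragment per bullet.
DOCUMENTDB_KEY = re.compile(
    r"DocumentDb"          # the string "DocumentDb"
    r".{3,200}"            # 3-200 characters of any kind (newlines excluded here)
    r"[>=\"']"             # one of > = " '
    r"[A-Za-z0-9/+]{86}"   # 86 letters, digits, / or +
    r"=="                  # two equal signs
)


def has_documentdb_key(body: str) -> bool:
    """True when the body contains text shaped like a DocumentDB auth key."""
    return DOCUMENTDB_KEY.search(body) is not None
```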

Azure Publish Setting Password (RegEx)

Format: The string "userpwd=" followed by an alphanumeric string.

Pattern:

  • the string "userpwd="
  • any combination of 60 lowercase letters or digits
  • a quotation mark (")
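Read literally, the three bullets above give a short expression. This is a sketch of that reading only; whether the dictionary matches case-insensitively is not stated, so the example stays case-sensitive.

```python
import re

# "userpwd=" + exactly 60 lowercase letters or digits + a closing quotation mark.
PUBLISH_PASSWORD = re.compile(r'userpwd=[a-z0-9]{60}"')


def has_publish_password(body: str) -> bool:
    """True when the body contains text shaped like a publish setting password."""
    return PUBLISH_PASSWORD.search(body) is not None
```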

Azure Storage Account Key (RegEx)

Format: The string "DefaultEndpointsProtocol" followed by the characters and strings outlined in the pattern below, including the string "AccountKey".

Pattern:

  • the string "DefaultEndpointsProtocol"
  • zero to two whitespace characters
  • an equal sign (=)
  • zero to two whitespace characters
  • any combination of between 1-200 lower- or uppercase letters, digits, symbols, special characters, or spaces
  • the string "AccountKey"
  • zero to two whitespace characters
  • an equal sign (=)
  • zero to two whitespace characters
  • any combination of 86 characters that are lower- or uppercase letters, digits, forward slash (/), or plus sign (+)
  • two equal signs (=)
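As a sketch, the bullet list above can be written as one expression, one fragment per bullet. It is a literal reading of the list rather than the dictionary's actual expression, and excluding newlines from the free-form run is an assumption.

```python
import re

STORAGE_KEY = re.compile(
    r"DefaultEndpointsProtocol"  # the string "DefaultEndpointsProtocol"
    r"\s{0,2}=\s{0,2}"           # equal sign with up to two whitespace characters each side
    r".{1,200}"                  # 1-200 characters of any kind (newlines excluded here)
    r"AccountKey"                # the string "AccountKey"
    r"\s{0,2}=\s{0,2}"           # equal sign with up to two whitespace characters each side
    r"[A-Za-z0-9/+]{86}"         # 86 letters, digits, / or +
    r"=="                        # two equal signs
)


def has_storage_key(body: str) -> bool:
    """True when the body contains text shaped like a storage connection string with a key."""
    return STORAGE_KEY.search(body) is not None
```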

Card Number (RegEx)

Format: 14 digits that can be formatted or unformatted (dddddddddddddd) and must pass the Luhn test.

Pattern: A complex pattern that detects cards from all major brands worldwide, including Visa, MasterCard, Discover, JCB, American Express, gift cards, and Diners Club cards.

The pattern checks for a prefix from a valid card issuer and computes the Luhn checksum, which every credit card number must pass.

Use with: Card Number (Keywords)
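The Luhn test mentioned above is a public checksum algorithm. A minimal sketch is shown below; the function name luhn_valid and the handling of spaces and dashes are choices made for this example, and the issuer prefix check is not included.

```python
def luhn_valid(number: str) -> bool:
    """Return True when the digits in `number` pass the Luhn checksum."""
    compact = number.replace(" ", "").replace("-", "")
    if not (compact.isascii() and compact.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(compact)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0
```

For example, the widely published test number 4111 1111 1111 1111 passes, while changing its last digit to 2 makes it fail.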

Date of Birth (RegEx)

Format: a date represented in a known UK or US format

Pattern: must include a prefix "Date of birth:" or "Birthday:"

Use with: Date of Birth (Keywords)
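A rough sketch of a prefix-plus-date match is given below. The entry does not list which UK and US date formats are recognised, so this example assumes numeric day/month/year or month/day/year with "/", "." or "-" separators, and assumes the prefix is matched case-insensitively.

```python
import re

# Prefix required by the entry above, followed by an assumed numeric date format.
DATE_OF_BIRTH = re.compile(
    r"(?:Date of birth|Birthday):\s*(\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4})",
    re.IGNORECASE,
)


def find_dates_of_birth(body: str) -> list[str]:
    """Return the date strings that follow a recognised prefix."""
    return DATE_OF_BIRTH.findall(body)
```

A date without one of the two prefixes is ignored, which keeps ordinary dates such as invoice or meeting dates from matching.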

Email Address (RegEx)

Format: a prefix to the left of the @ symbol, the @ symbol, and a domain to the right of it. The domain must contain a dot followed by a further 2-3 characters.

Pattern:

  • prefix: letters (a-z), numbers, underscores, periods, and dashes. An underscore, period, or dash must be followed by one or more letters or numbers.
  • @
  • domain part (before dot): letters, numbers, dashes.
  • dot
  • 2-3 characters (a-z)
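The pattern above can be sketched as a single expression. This is a reading of the bullets, not the dictionary's actual expression; matching case-insensitively is an assumption.

```python
import re

EMAIL = re.compile(
    r"[a-z0-9]+(?:[._-][a-z0-9]+)*"  # prefix: separators must be followed by letters or numbers
    r"@"
    r"[a-z0-9-]+"                    # domain part before the dot
    r"\."
    r"[a-z]{2,3}",                   # 2-3 letters after the dot
    re.IGNORECASE,
)


def is_email_shaped(candidate: str) -> bool:
    """True when the whole candidate string fits the described pattern."""
    return EMAIL.fullmatch(candidate) is not None
```

As described, the pattern does not accept endings longer than three letters, so an address such as user@example.info would not fit this reading.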

International Banking Account Number, IBAN (RegEx)

Format: country code (two letters) plus check digits (two digits) plus BBAN (up to 30 characters)

Pattern: must include all of the following:

  • Two-letter ISO country code + two checksum digits + Basic Bank Account Number (BBAN)
  • All IBANs are digits only
  • BBAN is broken down into:
      b - national bank code
      c - account number
      s - branch code
      x - national check digit

The format for each country is slightly different. The IBAN dictionary covers these 60 countries: ad, ae, al, at, az, ba, be, bg, bh, ch, cr, cy, cz, de, dk, do, ee, es, fi, fo, fr, gb, ge, gi, gl, gr, hr, hu, ie, il, is, it, kw, kz, lb, li, lt, lu, lv, mc, md, me, mk, mr, mt, mu, nl, no, pl, pt, ro, rs, sa, se, si, sk, sm, tn, tr, vg
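The two checksum digits are defined by the IBAN standard (ISO 13616) as a mod-97 check. The sketch below shows that standard calculation; whether the dictionary itself performs it is not stated here, and the function name iban_checksum_ok is hypothetical. Per-country length and layout checks are not included.

```python
def iban_checksum_ok(iban: str) -> bool:
    """Return True when the IBAN passes the standard mod-97 check."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # Replace letters with numbers: A=10, B=11, ..., Z=35 (digits stay as they are).
    numeric = "".join(str(int(char, 36)) for char in rearranged)
    return int(numeric) % 97 == 1
```

For example, the commonly published sample DE89 3704 0044 0532 0130 00 passes, and changing any single digit makes it fail.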

IP Address (RegEx)

Format:

IPv4: Complex pattern that accounts for formatted (periods) and unformatted (no periods) versions of IPv4 addresses

IPv6: Complex pattern that accounts for formatted IPv6 addresses (which include colons)

Pattern: N/A
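Since no pattern is published for this dictionary, the sketch below takes a different route: it pulls out candidate tokens and validates them with Python's standard ipaddress module. It only handles formatted addresses (with periods or colons), not the unformatted IPv4 form, and the function name find_ip_addresses is hypothetical.

```python
import ipaddress
import re

# Runs of hex digits, colons and periods are treated as candidates.
CANDIDATE = re.compile(r"[0-9A-Fa-f:.]+")


def find_ip_addresses(body: str) -> list[str]:
    """Return the formatted IPv4 and IPv6 addresses found in the body."""
    found = []
    for token in CANDIDATE.findall(body):
        token = token.strip(".")  # drop sentence punctuation around the token
        try:
            found.append(str(ipaddress.ip_address(token)))
        except ValueError:
            continue  # not a valid address, ignore it
    return found
```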

Password (RegEx)

Format: the password must contain at least one lowercase character, one uppercase character, one digit, and one special character, and must be 8 to 14 characters long.

Pattern: must contain all of the following, in no particular order:

  • At least one digit [0-9]
  • At least one lowercase character [a-z]
  • At least one uppercase character [A-Z]
  • At least one special character [*.!@#$%^&(){}[]:;<>,.?/~_+-=|\]
  • At least 8 characters in length, but no more than 14

Use with: Password (Keywords)
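The conditions listed above can be checked one by one. The sketch below uses plain character tests instead of a regular expression, so it illustrates the rules rather than the dictionary's actual expression; the function name looks_like_password is hypothetical.

```python
import string

# Special characters copied from the list above.
SPECIALS = set("*.!@#$%^&(){}[]:;<>,.?/~_+-=|\\")


def looks_like_password(token: str) -> bool:
    """True when the token meets every listed condition."""
    return (
        8 <= len(token) <= 14
        and any(char in string.digits for char in token)
        and any(char in string.ascii_lowercase for char in token)
        and any(char in string.ascii_uppercase for char in token)
        and any(char in SPECIALS for char in token)
    )
```

Many ordinary strings (product codes, for instance) can meet these conditions by chance, which is why the entry is paired with a keyword dictionary.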

SWIFT Code (RegEx)

Format: four letters followed by 5-31 letters or digits

Pattern: four letters followed by 5-31 letters or digits, made up of:

  • Four-letter bank code (not case sensitive)
  • An optional space
  • 4-28 letters or digits (the Basic Bank Account Number (BBAN))
  • An optional space
  • 1-3 letters or digits (remainder of the BBAN)

Use with: SWIFT Code (Keywords)
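The SWIFT Code pattern described above can be written as a short expression. This is a literal reading of the five bullets, not the dictionary's actual expression, and the function name is hypothetical.

```python
import re

SWIFT_CODE = re.compile(
    r"[A-Za-z]{4}"        # four-letter bank code, not case sensitive
    r" ?"                 # an optional space
    r"[A-Za-z0-9]{4,28}"  # 4-28 letters or digits
    r" ?"                 # an optional space
    r"[A-Za-z0-9]{1,3}"   # 1-3 letters or digits
)


def is_swift_shaped(candidate: str) -> bool:
    """True when the whole candidate string fits the described pattern."""
    return SWIFT_CODE.fullmatch(candidate) is not None
```

Because the pattern is broad (many ordinary words of nine or more letters would fit), pairing it with the keyword dictionary is especially useful here.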

Rule Configuration

Example rule to detect Credit Card Numbers on outbound email and quarantine to a "DLP" area for review by the administrator:

