Extracts lines matching a regular expression. Often, you use a capturing group in this method to narrow down text you matched in an anchor.

Parameters
Examples
Notes

Parameters

Note: For the full list of parameters available for this method, see Global parameters for methods. The following table shows parameters most relevant to or specific to this method.

keyvaluedescription
id (required)regexSpecifies to include the anchor line in the method's output.
pattern (required)Javascript-flavored regexReturns the first capturing group. To capture more than one group, you can use one field for each group, then concatenate them with the Concatenate computed field method.
Double escape special characters since the regex is in a JSON object. For example, \\s, not \s , to represent a whitespace character.
Sensible throws an error if you specify a pattern that can match an empty string, for example, .*.
flagsJavascript-flavored regex flags. default: noneFlags to apply to the regex. For example, "i" for case-insensitive.
stopMatch object or array of match objects. default: noneStops extraction at the matched line. Matched line isn't included in the method output. If unspecified, matches to the end of the document.

Examples

The following example narrows down text matched by an anchor line by using the Regex method. The Regex method extracts the last four digits in a customer ID.

Config

{
  "fields": [
    {
      "id": "last_4_digits_customer_id",
      "type": "number",
      "anchor": {
        "match": [
          {
            "type": "startsWith",
            "text": "customer id"
          },
          {
            "type": "regex",
            "pattern": "^[A-Z]{4}\\d{4}$",
          }
        ]
      },
      "method": {
        "id": "regex",
        "pattern": "\\d{4}$"
      }
    }
  ]
}

Example document
The following image shows the example document used with this example config:

Click to enlarge

Example documentDownload link

Output

{
  "last_4_digits_customer_id": {
    "source": "7890",
    "value": 7890,
    "type": "number"
  }
}

Notes

Alternatives to this method include the Passthrough method with a custom type. For example, for the preceding example document, the following config returns similar output:

{
  "fields": [
     {
      "id": "last_4_digits_customer_id",
      "anchor": {
        "match": [
          // find an 8-character ID that follows the line
          // 'customer id'
          {
            "type": "startsWith",
            "text": "customer id"
          },
          {
            "type": "regex",
            "pattern": "^[A-Z]{4}\\d{4}$",
          }
        ]
      },
      // custom type outputs last 4 digits of the id
      "type": {
        "id": "custom",
        "pattern": "^[A-Z]{4}(\\d{4})$",
        "type": "last_4_digits"
      },
      "method": {
        "id": "passthrough"
      }
    }
    ]
}

The preceding config returns:

{
  "last_4_digits_customer_id": {
    "source": "ABCD7890",
    "value": "7890",
    "type": "last_4_digits"
  }
}