Method object

A Method object defines how to expand out from an Anchor line and extract target data. You configure a Method object inside a Field object.

For a list of methods, see Methods.

Parameters
Examples

Parameters

The following global parameters are available to all methods:

Key: id
Value: box, checkbox, column, documentRange, fixedTable, intersection, invoice, keyValue, label, nearestCheckbox, passthrough, regex, region, row, signature, table, textTable, topic
Description: See Methods.

Key: tiebreaker
Value: integer (zero-based index), ordinal (first, second, third, last), comparison (>, <), or join. Default: join
Description: If the method returns multiple elements (for example, a Row method), specifies which element to extract from the returned array.
  • integer: Returns the zero-indexed nth element in the returned lines array, using Sensible's default line sorting. For example, 0 returns the first line, -1 returns the last line, and -2 returns the second-to-last line in the array.
  • ordinal: Returns the first, second, third, or last element, using Sensible's default line sorting.
  • comparison: Returns the first or last element, sorted alphanumerically using Unicode values. If you want to compare numeric amounts and ignore non-numbers, then add a numeric type such as type: currency as a top-level parameter to the field.
  • join: Returns all elements in the returned array as a single string, delimited by whitespaces.

Key: lineFilters
Value: Match object
Description: Filters out the specified lines from the method match. For example, if the Box method extracts unwanted footer lines from a box, you can filter out the lines with this parameter.

Key: typeFilters
Value: array of Types
Description: Filters out the specified types from the method results. For example, for a target box containing a delivery date, a street address, and delivery notes, you can filter out the lines containing Date and Address types in order to extract the delivery notes. Note that less strict types, such as Name and Currency types, are less useful in this filter than stricter types such as the Phone Number type. For an example, see the Examples section.

Key: wordFilters
Value: string array
Description: Filters out the specified strings from the method results.

Key: whitespaceFilter
Value: spaces, all
Description: Removes extra whitespace.
  • spaces: Removes only extra spaces.
  • all: Removes all whitespace characters, including newlines.

Key: xRangeFilter
Value: object
Description: Defines left and right boundaries in which to capture lines. For example, in combination with the Document Range method, the X Range Filter parameter defines a "column" that's bounded at the top and bottom by text. This column excludes any lines that partially fall outside the defined rectangular region. Parameters:
  • start (right, left): Defines the starting point of the "column" at either the right or left boundary of the anchor line.
  • offsetX: Adjusts the horizontal position of the starting point defined by the Start parameter.
  • width: The width of the page region to capture, in inches.
For an example, see the Examples section.

Key: xMajorSort (deprecated)
Value: boolean
Description: Deprecated. Use the Sort Lines parameter instead.

Key: sortLines
Value: readingOrderLeftToRight
Description: Set this parameter to readingOrderLeftToRight to sort lines whose height and vertical position are misaligned. For example, with misaligned handwritten text, slight jitter in the vertical positions of lines can cause Sensible to incorrectly sort lines that a human reader interprets as following left to right. The Sort Lines parameter corrects this problem by sorting lines by their likely reading order.
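
As a minimal sketch of the Tiebreaker parameter described above, the following config defines three hypothetical fields that use the Row method with an integer, an ordinal, and a comparison tiebreaker. The field IDs and the anchor text "total" are assumptions for illustration and aren't taken from a Sensible example document, so no output is shown.

```json
{
  "fields": [
    {
      "id": "example_last_line_in_row",
      "anchor": "total",
      "method": {
        "id": "row",
        "tiebreaker": -1
      }
    },
    {
      "id": "example_second_line_in_row",
      "anchor": "total",
      "method": {
        "id": "row",
        "tiebreaker": "second"
      }
    },
    {
      "id": "example_amount_by_comparison",
      "type": "currency",
      "anchor": "total",
      "method": {
        "id": "row",
        "tiebreaker": ">"
      }
    }
  ]
}
```

The third field sets a numeric type at the top level of the field so that the comparison applies to numeric amounts rather than to Unicode ordering. If you omit the Tiebreaker parameter, the default, join, returns all the row's lines as a single whitespace-delimited string.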

Examples

Sort Lines example

PROBLEM

In the following example, the handwritten text "Nash" is slightly taller than the text "Steve", so Sensible interprets "Nash" as preceding "Steve" (reversing the order interpreted by a human reader) and outputs "Nash Steve" as the name:

[Image: handwritten name field in which "Nash" is slightly taller than "Steve"]

SOLUTION

To reliably capture the first and last name in their left-to-right order, set "sortLines": "readingOrderLeftToRight".

Config

{
  "fields": [
    {
      "id": "_name_joint_owner_raw",
      "match": "last",
      "anchor": {
        "match": {
          "type": "startsWith",
          "text": "Name",
          "isCaseSensitive": true
        }
      },
      "method": {
        "id": "region",
        "sortLines": "readingOrderLeftToRight",
        "start": "left",
        "width": 2.5,
        "height": 0.4,
        "offsetX": 0.2,
        "offsetY": -0.45
      }
    }
  ]
}

PDF

The following image shows the example PDF used with this example config:

[Image: example PDF]

Example PDF: Download link

To run this example, verify the document type uses Google OCR (click the gear icon for the Document Type and select Google):

[Image: Document Type settings with Google OCR selected]

Output

{
  "_name_joint_owner_raw": {
    "type": "string",
    "value": "Steve Nash"
  }
}

Type Filters example

The following example shows how to use the Type Filters parameter to extract delivery notes from a box.

Config

{
  "fields": [
    {
      "id": "delivery_notes",
      "type": "string",
      "anchor": "delivery information",
      "method": {
        "id": "box",
        "offsetY": 1,
        "typeFilters": [
          "address",
          {
            "id": "date",
            "format": "%b_%D_%y"
          }
        ]
      }
    }
  ]
}

PDF

The following image shows the example PDF used with this example config:

[Image: example PDF]

Example PDF: Download link

Output

{
  "delivery_notes": {
    "type": "string",
    "value": "Please leave package at door"
  }
}
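
As a related sketch, the other global filter parameters can be layered onto the same Box method. The following variation of the preceding config adds the Line Filters, Word Filters, and Whitespace Filter parameters. The filtered values ("page" and "Notes:") and the field ID are hypothetical: they assume the box contains a footer line and a label that you want to drop, and they aren't taken from the example PDF, so no output is shown.

```json
{
  "fields": [
    {
      "id": "example_delivery_notes_filtered",
      "type": "string",
      "anchor": "delivery information",
      "method": {
        "id": "box",
        "offsetY": 1,
        "typeFilters": [
          "address",
          {
            "id": "date",
            "format": "%b_%D_%y"
          }
        ],
        "lineFilters": {
          "type": "startsWith",
          "text": "page"
        },
        "wordFilters": ["Notes:"],
        "whitespaceFilter": "spaces"
      }
    }
  ]
}
```

In this sketch, Line Filters drops whole lines that match the Match object, Word Filters drops the listed strings from the lines that remain, and Whitespace Filter set to spaces removes extra spaces without removing newlines.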

X Range Filter example

In combination with the Document Range method, the X Range Filter parameter defines a "column" that's bounded at the top and bottom by text.

The following image shows how to use this parameter to extract a "cell" of text that doesn't fit other methods:

[Image: the supplier address "cell" in the example PDF]

In this example, the X Range Filter parameter is the best option:

  • Document Range by itself isn't a good option, because it captures the address of the importer as well as the supplier.
  • The Fixed Table and Table methods aren't the best options, because the table formatting is hard to recognize.
  • The Text Table method with "detectMultipleLinesPerRow": true configured is an alternate solution for this example.

Try out this example in the Sensible app using the following PDF and config:

Example PDF: Download link

This example uses the following config:

{
  "fields": [
    {
      "id": "mailing_address_supplier",
      "anchor": {
        "match": {
          "text": "supplier",
          "type": "startsWith"
        }
      },
      "method": {
        "id": "documentRange",
        "xRangeFilter": {
          "start": "left",
          "offsetX": -0.5,
          "width": 2
        },
        "stop": {
          "text": "type of business",
          "type": "includes"
        }
      }
    }
  ]
}
