Anchor object
An anchor is a string, Match object, or array of Match objects.
An anchor's behavior depends on its field's method:
Category | Required? | Notes |
---|---|---|
Layout-based methods | required | An anchor is a computationally quick way to narrow down the rough location of the data you want to extract in a document. After locating the anchor, Sensible uses a layout-based "method" to spatially expand out from the anchor and extract the data you want. |
LLM-based methods | optional | An anchor is a test for running a field or skipping it. If the anchor is present in the document, Sensible searches the whole document for the target data. If the anchor is missing, Sensible returns null for the field. You can use this behavior to define backup, or "fallback", fields. For an exception to this behavior, see the Multimodal Engine parameter. |
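The anchor's role as a run-or-skip test for LLM-based methods can be modeled with a short Python sketch. This illustrates the behavior described in the table and is not Sensible's implementation: `run_llm_field` and `extract` are hypothetical names, `extract` stands in for whichever LLM-based method the field uses, and the anchor is assumed to be a plain string. The Multimodal Engine exception isn't modeled.

```python
from typing import Callable, List, Optional


def run_llm_field(
    lines: List[str],
    anchor: Optional[str],
    extract: Callable[[List[str]], Optional[str]],
) -> Optional[str]:
    """Run `extract` on the whole document unless a given anchor is missing.

    Hypothetical model: `extract` is a stand-in for an LLM-based method.
    """
    if anchor is not None:
        # String anchors are case-insensitive "includes" matches.
        present = any(anchor.lower() in line.lower() for line in lines)
        if not present:
            return None  # anchor missing: the field is skipped and returns null
    # Anchor present, or no anchor defined: search the whole document.
    return extract(lines)
```

Under this model, a backup field is a second field that runs when the first one returns null, for example because its anchor text doesn't appear in the document.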
Define anchors using simple or complex syntax. The following example shows simple syntax:
{
  "fields": [
    {
      "id": "simple_label",
      "anchor": "this is a string to anchor on",
      "method": {
        "id": "label",
        "position": "below"
      }
    }
  ]
}
Behind the scenes, Sensible expands a string anchor to a case-insensitive "includes" match. For example, Sensible automatically expands the anchor in the preceding example to:
"anchor": {
  "match": {
    "type": "includes",
    "text": "this is a string to anchor on"
  }
},
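As a rough illustration of this expansion, the following Python sketch converts a string anchor to the object form and applies a case-insensitive includes test to a line. The function names `expand_anchor` and `includes_match` are hypothetical and aren't part of Sensible; anchors that are already objects or arrays pass through unchanged.

```python
from typing import Any, Dict, List, Union

Anchor = Union[str, Dict[str, Any], List[Dict[str, Any]]]


def expand_anchor(anchor: Anchor) -> Anchor:
    """Expand a bare string into an Anchor object with an includes match."""
    if isinstance(anchor, str):
        return {"match": {"type": "includes", "text": anchor}}
    return anchor  # already a Match object, Match array, or Anchor object


def includes_match(match: Dict[str, Any], line: str) -> bool:
    """Case-insensitive substring test for an includes match."""
    return match["text"].lower() in line.lower()


expanded = expand_anchor("this is a string to anchor on")
```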
Parameters
An Anchor object has the following top-level parameters:
key | values | description |
---|---|---|
match (required, except for string anchors) | Match object or array of Match objects | See Match object. |
start | string, Match object, or Match object array | Start the search for the anchor's Match parameter at a line of text in the document, and ignore all the text that precedes the Start line. The terms "preceding" and "succeeding" primarily mean above and below the Start line, respectively. For more information, see Line sorting. You can extract anchor output with the Passthrough method. |
end | string, Match object, or Match object array | Stop the search for the anchor's Match parameter at a line of text in the document, and ignore all the text that follows the End line. The terms "preceding" and "succeeding" primarily mean above and below the End line, respectively. For more information, see Line sorting. If unspecified, the anchor searches for matches to the end of the document. |
includeEnd | boolean | Whether to include the matching End line in the anchor output. |
Examples
Here's an example of an Anchor object that uses all these parameters:
{
  "fields": [
    {
      "id": "simple_label",
      "anchor": {
        "start": "My section heading. Start matching at the start of this line",
        "end": "My footer text. Stop matching before it",
        "includeEnd": true,
        "match": [
          {
            "type": "includes",
            "text": "Only finds anchor if you match this string in a line that is between the start and end lines"
          }
        ]
      },
      "method": {
        "id": "label",
        "position": "below"
      }
    }
  ]
}
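The way the Start and End lines bound the search can be sketched in Python. This is a simplified model, not Sensible's implementation: `find_anchor` is a hypothetical name, the document is assumed to be a list of text lines already in sorted order, all parameters are treated as plain strings with case-insensitive includes matching, and `include_end` is assumed to default to false and to extend the searched range by the End line.

```python
from typing import List, Optional


def includes(line: str, text: str) -> bool:
    """Case-insensitive substring test."""
    return text.lower() in line.lower()


def find_anchor(
    lines: List[str],
    match: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
    include_end: bool = False,
) -> Optional[int]:
    """Return the index of the first matching line in the window, or None."""
    first = 0
    if start is not None:
        hits = [i for i, line in enumerate(lines) if includes(line, start)]
        if not hits:
            return None  # assumption: no Start line means no anchor
        first = hits[0]  # text before the Start line is ignored

    last = len(lines)  # no End: search to the end of the document
    if end is not None:
        for i in range(first, len(lines)):
            if includes(lines[i], end):
                # Assumption: include_end adds the End line to the window.
                last = i + 1 if include_end else i
                break

    for i in range(first, last):
        if includes(lines[i], match):
            return i
    return None
```

For example, given a document whose first line also contains the match text, passing `start` skips that earlier line, and passing `end` keeps the search from reaching matches that appear after the End line.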
Notes
For information about complex anchor syntax, see Anchor nuances.