SenseML is a query language that lets you extract structured data from PDF documents. A field is the basic SenseML query unit for extracting a piece of document data. The output of a field is a JSON key-value pair that structures the extracted data.

Here's a simple example of a field:

{
  "fields": [
    {
      "id": "name_of_output_key",
      "anchor": "an anchor is some text to match. An anchor can be an array of matches",
      "method": {
        "id": "label",
        "position": "below"
      }
    }
  ]
}

The following image shows this example in the Sensible app:

As the preceding image shows, here's the output of the example field:

{
  "name_of_output_key": {
    "type": "string",
    "value": "Below the matching anchor, this is the data to extract. The anchor is a label for this data."
  }
}
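
Downstream code can read the extracted value from output like this by looking up the field's ID. The following is a minimal Python sketch, assuming the output is available as a JSON string. `field_value` is a hypothetical helper written for this illustration, not part of any Sensible SDK:

```python
import json

# The example output shown above, held as a JSON string.
OUTPUT_JSON = """
{
  "name_of_output_key": {
    "type": "string",
    "value": "Below the matching anchor, this is the data to extract. The anchor is a label for this data."
  }
}
"""


def field_value(output, field_id):
    """Return the extracted value for a field ID, or None if the key is absent.

    Hypothetical helper for illustration only.
    """
    entry = output.get(field_id)
    if entry is None:
        return None
    return entry.get("value")


output = json.loads(OUTPUT_JSON)
print(field_value(output, "name_of_output_key"))
```

Because the field's ID becomes the output key, the lookup needs no knowledge of the anchor or method that produced the value.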

The preceding field example shows the following key concepts:

| key | description |
| --- | --- |
| field | A query that extracts data in relationship to matched text. Its ID is the key for the extracted data. In this example, name_of_output_key. |
| anchor | Matched text that helps narrow down a location in the PDF from which to extract data. In this example, "an anchor is some text to match...". |
| method | Defines how to expand out from the anchor and extract data. In this example, the Label method extracts data that's below the anchor ("position": "below"). For a list of methods, see Methods. |
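
To make the relationship between field, anchor, and method concrete, here is a toy Python model of a field that uses the Label method with "position": "below". It is only a sketch under strong assumptions: it treats the document as a plain list of text lines, whereas a real extraction works from the PDF itself. The function name `extract_label_below`, the sample lines, and the choice to return `None` when nothing matches are all hypothetical and are not Sensible's implementation:

```python
def extract_label_below(lines, field):
    """Toy model: find the anchor text, then take the next non-empty line.

    Illustrative only. The document is simplified to a list of text lines.
    """
    method = field["method"]
    if method.get("id") != "label" or method.get("position") != "below":
        raise ValueError("this sketch only models the Label method with position 'below'")

    anchor = field["anchor"]
    for i, line in enumerate(lines):
        if anchor in line:
            # Expand out from the anchor: the first non-empty line below it.
            for candidate in lines[i + 1:]:
                if candidate.strip():
                    return {field["id"]: {"type": "string", "value": candidate.strip()}}
            break
    # No anchor match, or nothing below it (assumed representation for this sketch).
    return {field["id"]: None}


# Hypothetical field and document lines, invented for this illustration.
field = {
    "id": "policy_number",
    "anchor": "Policy number",
    "method": {"id": "label", "position": "below"},
}
lines = ["Policy number", "123456789", "Policy period"]

result = extract_label_below(lines, field)
print(result)
```

In this sketch the field's ID becomes the output key, the anchor locates a position in the text, and the method decides which nearby text to return.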

For a more complete SenseML example, see the SenseML introduction.

