Color coding

This topic describes color-coded symbols that the SenseML editor overlays on documents in the Sensible app. These overlays visually represent how SenseML queries operate on documents. Use these symbols to author and troubleshoot queries.

Symbol             Represents
Yellow box         anchor
Blue box           captured method data
Green box          box, region, table, or chunk
Green point        starting point for recognizing a box
Green brackets     ranges for sections
Dotted blue box    discarded method data
Dotted yellow box  discarded anchor data
Pink box           fingerprint
Purple box         line details

Yellow box

Yellow boxes represent anchors. For more information about anchors, see Anchors.

For example, the following image shows:

  • an anchor line outlined in yellow ("Here is a good candidate")
  • a line output by the Label method outlined in blue ("And here's the text below")

The query used for the preceding image is:

{
  "fields": [
    {
      "id": "simple_label",
      "anchor": "here is a good candidate",
      "method": {
        "id": "label",
        "position": "below"
      }
    }
  ]
}    

Blue box

Blue boxes represent method output. For more information about methods, see Method object.

For example, the following image shows:

  • an anchor line outlined in yellow ("Here is a good candidate")
  • a line output by the Label method outlined in blue ("And here's the text below")

The query used for the preceding image is:

{
  "fields": [
    {
      "id": "simple_label",
      "anchor": "here is a good candidate",
      "method": {
        "id": "label",
        "position": "below"
      }
    }
  ]
}    

Green box

Green boxes represent boxes, regions, tables, or chunks.

Green point

Green points represent the following:

  • a starting point for recognizing a box or checkbox

  • a starting point for defining the coordinates of a region
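As a rough illustration of the second case, the following sketch shows a Region method whose start parameter sets the point from which the region's coordinates are measured. The field id, anchor text, and all numeric values are hypothetical. The parameter names (start, offsetX, offsetY, width, height) are assumed from the Region method reference, so verify them there before relying on this sketch:

```json
{
  "fields": [
    {
      "id": "hypothetical_region_field",
      "anchor": "hypothetical anchor text",
      "method": {
        "id": "region",
        "start": "below",
        "offsetX": 0,
        "offsetY": 0.1,
        "width": 4,
        "height": 1
      }
    }
  ]
}
```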

Green points can be useful for troubleshooting. For example, in the following image, Sensible can't recognize the box. The green point provides a visual clue about the problem: it falls on the box border itself because the query specifies "position": "left".

If you instead search for the box borders starting from the right edge of the anchor line's boundaries ("position": "right"), the green point is far enough inside the borders for Sensible to recognize the box:

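A minimal sketch of a query that takes this approach follows. The field id and anchor text are hypothetical. Only the Box method's "position": "right" setting comes from the description above:

```json
{
  "fields": [
    {
      "id": "hypothetical_box_field",
      "anchor": "hypothetical anchor text",
      "method": {
        "id": "box",
        "position": "right"
      }
    }
  ]
}
```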

Green brackets

Green brackets represent the start and end of each section in a section group:

Dotted blue box

Dotted blue boxes represent discarded method data. Sensible methods filter out captured data depending on parameters you set in the field, the anchor, and the method.

For example, in the following image, a Row method captures everything to the right of the text "Python", but a tiebreaker selects "0" (dark blue box) and discards "first" (dotted blue box).

The query used for the preceding image is:

{
  "fields": [
    {
      "id": "filtered_by_tiebreaker",
      "anchor": "Javascript",
      "method": {
        "id": "row",
        "position": "right",
        "tiebreaker": "second"
      }
    }
  ]
}

Common parameters resulting in filtering include:

  • the field's data type (currency, date, address, etc.)
  • the method's stop
  • the method's tiebreaker
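To make these parameters concrete, here is a minimal sketch of a field that combines two of them: a data type and a tiebreaker. The field id and anchor text are hypothetical. Assuming a document where several lines sit to the right of the anchor, you would expect the editor to show any lines that fail the currency type, or that lose the tiebreaker, as dotted blue boxes:

```json
{
  "fields": [
    {
      "id": "hypothetical_total_amount",
      "type": "currency",
      "anchor": "hypothetical anchor text",
      "method": {
        "id": "row",
        "position": "right",
        "tiebreaker": "first"
      }
    }
  ]
}
```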

Dotted yellow box

Dotted yellow boxes represent discarded anchor data, for example, in queries that return null.

For example, for the following config:

{
  "fields": [
    {
      "id": "anchors_candidates_filtered_by_method",
      "anchor": "python",
      "match": "first",
      "method": {
        "id": "label",
        "position": "right"
      }
    }
  ]
}

Sensible filters out "python" strings that don't meet the Label method's proximity requirements. For example, in the following image, Sensible represents the "python" string with a dotted yellow box to show that it doesn't work with the Label method (it would, however, work with the Row method):

Common parameters resulting in filtering include:

  • the field's data type (currency, date, address, etc.)
  • the field's match method (first, last, all)
  • the anchor's start and end
  • the method's id (for example, a Label method filters out all lines that aren't close to other lines)
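As noted earlier in this section, the "python" anchor that the Label method discards would work with the Row method. A minimal sketch of that variant follows. It changes only the method id of the preceding config, and the field id is hypothetical:

```json
{
  "fields": [
    {
      "id": "anchor_kept_by_row_method",
      "anchor": "python",
      "match": "first",
      "method": {
        "id": "row",
        "position": "right"
      }
    }
  ]
}
```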

Pink box

Pink boxes represent matching fingerprint tests. In the following image, the text highlighted in pink matches a fingerprint test.

The query used for the preceding image is:

{
  "fingerprint": {
    "tests": [
      {
        "page": "first",
        "match": [
          {
            "text": "anyco auto insurance",
            "type": "startsWith"
          }
        ]
      }
    ]
  }
}

Note: The pink highlighting for fingerprint text isn't compatible with preprocessors that change line indices, such as Split Lines and Merge Lines.

Purple box

If you click on a line, it changes to a purple box and shows the following details:

  • underlying extracted text
  • coordinates of the line's boundaries
  • SenseML and extracted output that relies on that text

You can select multiple lines to see their combined details.
