Sept 2023

In the last month, we released new quality scores for extractions, enabled extracting large tables with the NLP Table method, and made advanced LLM prompt configuration options available in the Sensible Instruct editor for the List and Table methods.

New feature: Measure extraction coverage

The Sensible API extraction endpoints now return a coverage score, which measures how fully an extraction captured the data in the document. It's a percentage comparing non-null, validated fields to the total fields returned by a config for a document. For example, a score of 70% for an extraction with no validation errors means that 30% of fields were null. You can filter past extractions using this measure in the Sensible app.


A low percentage can indicate a poor-quality extraction, or it can indicate that a document type is sparsely filled out. For example, supplemental forms in insurance applications or supplemental schedules in tax forms can return many nulls, since these forms are often left blank. For more information about extraction coverage, see Accuracy measures.
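
As a rough illustration of the arithmetic, the sketch below computes a coverage percentage from a set of extracted fields. It assumes a simplified data shape (a dictionary of field values plus a set of field names that failed validation); the function name and field names are hypothetical, and this is not Sensible's implementation or response format.

```python
from typing import Any, Dict, Set


def coverage_score(fields: Dict[str, Any], fields_with_errors: Set[str]) -> float:
    """Return the percentage of fields that are non-null and passed validation.

    Hypothetical helper: the input shape is an assumption for illustration.
    """
    if not fields:
        return 0.0
    covered = sum(
        1
        for name, value in fields.items()
        if value is not None and name not in fields_with_errors
    )
    return 100.0 * covered / len(fields)


# Ten hypothetical fields, three of them null, no validation errors.
fields = {f"field_{i}": ("value" if i < 7 else None) for i in range(10)}
print(f"{coverage_score(fields, set()):.0f}%")  # prints "70%"
```

Under the same assumptions, a field that is populated but fails validation also lowers the score, so a low percentage can come from nulls, validation errors, or both.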

Improvement: Extract large NLP tables using new parameter

You can now extract large tables using the NLP Table method's new Rewrite Table parameter. If you set this parameter to false, Sensible skips restructuring the table, which improves performance and avoids exceeding LLM token limits for tables larger than 4,000 tokens. This parameter is true by default so that you can prompt the LLM to split or merge columns or otherwise restructure the table.

Improvement: Superscript and subscript formatting for Paragraph type

For the Paragraph type, you can now annotate the extracted text to indicate superscripts and subscripts in the source document, for example for footnotes or for chemical symbols. If you set the Annotate Superscript And Subscript parameter to true, Sensible annotates superscript and subscript text with [^...] and [_...], respectively.
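
If you post-process annotated paragraph text, a small parser can separate the markers from the surrounding text. The sketch below assumes the annotations appear literally as `[^text]` and `[_text]` and that the annotated text contains no closing bracket; the function names and the sample string are hypothetical.

```python
import re
from typing import List, Tuple

# Matches "[^...]" (superscript) or "[_...]" (subscript); assumed annotation format.
ANNOTATION = re.compile(r"\[([\^_])([^\]]*)\]")


def find_annotations(text: str) -> List[Tuple[str, str]]:
    """List (kind, content) pairs for each annotation, in order of appearance."""
    kinds = {"^": "superscript", "_": "subscript"}
    return [(kinds[m.group(1)], m.group(2)) for m in ANNOTATION.finditer(text)]


def strip_annotations(text: str) -> str:
    """Remove the annotation markers but keep the annotated characters."""
    return ANNOTATION.sub(lambda m: m.group(2), text)


sample = "H[_2]O boils at 100 degrees[^1]"
print(find_annotations(sample))
print(strip_annotations(sample))
```

For the sample string, the first call reports one subscript ("2") and one superscript ("1"), and the second returns the text with the brackets and markers removed.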

UX improvement: Advanced Query method options in Sensible Instruct

Several of the advanced prompt configuration parameters we released in July are now available in the Sensible Instruct editor in addition to the SenseML editor:


You can configure the following parameters in Sensible Instruct for the Table and List methods in individual fields:

  • Page Hinting
  • Prompt Introduction
  • Context Description
  • Chunk Size
  • Chunk Overlap Percentage
  • Chunk Count
  • Confidence Signals

You can now also configure confidence signals for individual Query fields under advanced settings.

For more information, see Advanced prompt configuration.