July 2023

In the last month, we released new advanced prompt configuration options for methods powered by large language models (LLMs), introduced confidence signals as a more nuanced alternative to confidence scores for data extracted by LLMs, and made several improvements to existing features.

New feature: Advanced LLM prompt configuration

For methods powered by LLMs, we've introduced advanced configurability at both the config and field levels:

  • The chunk-related parameters released last month are now available for all Sensible Instruct methods and at the config level through the NLP preprocessor.
  • You can now configure an introduction to your prompt using the Prompt Introduction parameter.
  • You can now describe your context, for example, describing the document type, using the Context Description parameter.
  • You can add or remove page metadata from the prompt using the Page Hinting parameter.

For more information, see Advanced prompt configuration.
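
For illustration, here's a minimal sketch of how these options might fit together in a config, written as a Python dict that mirrors the SenseML JSON. The key names shown (the nlp preprocessor type, promptIntroduction, contextDescription, pageHinting, and chunkCount) are assumptions inferred from the parameter names above, so treat the Advanced prompt configuration docs as the source of truth for the exact syntax.

```python
import json

# Minimal sketch only: key names below are assumed from the parameter names
# described above, not copied from documented SenseML syntax.
config = {
    "preprocessors": [
        {
            "type": "nlp",  # config-level LLM prompt settings
            "promptIntroduction": "You are extracting data from an insurance document.",
            "contextDescription": "The document is an auto insurance declarations page.",
            "pageHinting": True,  # keep page metadata in the prompt
            "chunkCount": 5,      # chunk-related parameter released last month
        }
    ],
    "fields": [
        {
            "id": "policy_number",
            "method": {
                "id": "query",
                "description": "What is the policy number?",
                "pageHinting": False,  # field-level override of the config-level setting
            },
        }
    ],
}

print(json.dumps(config, indent=2))  # preview the config as SenseML-style JSON
```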

New feature: Confidence signals for LLMs

For data extracted by LLMs, Sensible asks the LLM to report any uncertainties it has about the accuracy of the extracted data. For example, an LLM can report "multiple possible answers" or "ambiguous query". You can enable this troubleshooting information for the Query method using the new Confidence Signals parameter. For more information, see Confidence signals.
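
As a rough sketch, enabling the parameter on a query field and reading the resulting signal might look like the following; the confidenceSignals key and the output shape are assumptions based on the description above, not the documented schema.

```python
# Hypothetical query field with confidence signals enabled. The key name
# "confidenceSignals" is assumed from the parameter name above.
query_field = {
    "id": "claim_amount",
    "method": {
        "id": "query",
        "description": "What is the total claim amount?",
        "confidenceSignals": True,  # ask the LLM to report uncertainties
    },
}

# Illustrative output shape only: the extracted value can be accompanied by a
# signal such as "multiple possible answers" or "ambiguous query".
example_output = {
    "claim_amount": {
        "value": "$12,400",
        "confidenceSignal": "multiple possible answers",
    },
}
print(example_output["claim_amount"]["confidenceSignal"])
```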

Improvement: Return document filename in API extractions

For the extraction endpoints, you can now optionally return the filename of the document from which you extracted data. To return the name, pass it in the new Document Name query parameter. For an example, see Extract doc at your URL.
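
As a sketch of the request, the call below passes the filename alongside an extract-from-URL request so it can be echoed back in the response. The endpoint path, the document_name parameter name, and the document type shown are assumptions; see the Extract doc at your URL reference for the exact request shape.

```python
import requests

SENSIBLE_API_KEY = "YOUR_API_KEY"  # placeholder

response = requests.post(
    # Assumed endpoint path and example document type:
    "https://api.sensible.so/v0/extract_from_url/auto_insurance_quote",
    params={"document_name": "smith_quote_2023.pdf"},  # assumed query parameter name
    headers={"Authorization": f"Bearer {SENSIBLE_API_KEY}"},
    json={"document_url": "https://example.com/smith_quote_2023.pdf"},
)
print(response.json())  # extraction output, now including the document name
```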

Improvement: Specify configuration in extraction API endpoints

In the Sensible extraction API endpoints, you can now specify the name of the config to use for extracting data from a document. To specify the config, use the new config_name path parameter. Sensible then uses the specified config instead of automatically returning the best-scoring extraction for the document type. For an example, see Extract data from a document using specified config.
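
The following sketch shows the idea: the config name is appended to the extraction path so Sensible runs that config rather than scoring all configs in the document type. The exact path shape, the example document type (auto_insurance_quote), and the example config name (anyco_quotes) are assumptions; see the API reference for the documented request.

```python
import requests

SENSIBLE_API_KEY = "YOUR_API_KEY"  # placeholder

# Assumed path shape: /extract/{document_type}/{config_name}
with open("smith_quote_2023.pdf", "rb") as pdf:  # placeholder local file
    response = requests.post(
        "https://api.sensible.so/v0/extract/auto_insurance_quote/anyco_quotes",
        headers={
            "Authorization": f"Bearer {SENSIBLE_API_KEY}",
            "Content-Type": "application/pdf",
        },
        data=pdf,
    )
print(response.json())  # extraction produced by the specified config
```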

Improvement: Consistency in naming Question versus Query method

We've resolved a naming inconsistency between the Sensible Instruct and SenseML editor methods, which used the terms Question and Query interchangeably to refer to the same method. Now both use the term Query. The SenseML Question method is now deprecated. For more information, see Query.
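
In practice the change is a one-word rename in your config's method block; the snippet below is illustrative, assuming the method IDs match the names used above.

```python
# Deprecated: the SenseML method previously named "question".
deprecated_field = {
    "id": "policy_number",
    "method": {"id": "question", "description": "What is the policy number?"},
}

# Current: both Sensible Instruct and the SenseML editor use "query".
current_field = {
    "id": "policy_number",
    "method": {"id": "query", "description": "What is the policy number?"},
}
```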

Improvement: Configure OCR engine in OCR preprocessor

You can now configure which OCR engine to use in the OCR preprocessor, instead of using the default Amazon engine. For more information, see the Engine parameter in the OCR preprocessor topic.
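
For illustration, an OCR preprocessor that overrides the default engine might look like the sketch below; the engine key and the "microsoft" value are assumptions based on the parameter name above, so check the OCR preprocessor topic for the supported engine names.

```python
# Hypothetical OCR preprocessor overriding the default Amazon engine.
config = {
    "preprocessors": [
        {
            "type": "ocr",
            "engine": "microsoft",  # assumed value; default is the Amazon engine
        }
    ],
    "fields": [],
}
```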