Advanced LLM prompt configuration

To extract data from a document using LLM-based methods, Sensible has to submit a part of the document to the LLM. Submitting a part of the document instead of the whole document improves performance and accuracy. This document excerpt is called a prompt's context.

To troubleshoot LLM-based methods, you can configure a prompt's context using one of the following approaches:

(Default) Locate context by scoring document chunks
Locate context by summarizing pages
Locate context by chaining prompts
Locate multimodal (non-text) data

For information about configuring each of these approaches, see the following sections.

(Default) Locate context by scoring document chunks

Sensible's default method for locating context is to split the document into chunks, score them for relevancy using embeddings, and then return the top-scoring chunks as context:

Click to enlarge

The advantage of this approach is that it's fast. The disadvantage is that it can be brittle.

The following steps outline this default approach and provide configuration details:

Sensible splits the document into chunks. A chunk is less than or equal to a page in length. Parameters that configure this step include:
- Chunk Count parameter.
- Page Range parameter
- Note: Defaults for chunking vary by LLM-based method. For example, the default for the Chunk Count parameter is 5 for the Query Group method and 20 for the List method, and each method has a default chunk size.
Sensible selects the most relevant chunks and combines them with page-number metadata to create a "context". Parameters that configure this step include:
- LLM Engine parameter
Sensible creates a full prompt for the LLM that includes the context and the descriptive prompts you configure in the method. Sensible sends the full prompt to the LLM.
Sensible returns the LLM's response.

The details for this general process vary for each LLM-based method. For more information, see the Notes section for each method's SenseML reference topic, for example, List method.

Locate context by summarizing pages

When you set the Search By Summarization parameter to true for supported LLM-based methods, Sensible finds context using LLM-generated page summaries. Sensible uses a completion-only retrieval-augmented generation (RAG) strategy:

Sensible prompts an LLM to summarize each page in the document.
Sensible prompts a second LLM to return the pages most relevant to your prompt based on the summaries.
Sensible uses those pages' text as the context.

Click to enlarge

This strategy is useful for long documents in which multiple mentions of the same concept make finding relevant context difficult, for example, long legal documents.

Locate context by chaining prompts

When you specify the Source IDs parameter for supported LLM-based methods, Sensible prompts an LLM to answer questions about other fields' extracted data. In this case, the context is predetermined: it's the output from the other fields.

For example, you use the Text Table method to extract the following data into a snacks_rank field:

snack       annual regional sales
apples      $100k
corn chips  $200k
bananas     $150k

If you create a Query Group method with the prompt what is the best-selling snack?, and specify snacks_rank as the context using the Source IDs parameter, then Sensible searches for answers to your question (corn chips) only in the extracted snacks_rank table rather than in the entire document:

Click to enlarge

Use other fields as context to:

Reformat or otherwise transform the outputs of other fields.
Compute or generate new data from the output of other fields
Narrow down the context for your prompts to a specific part of the document.
Troubleshoot or simplify complex prompts that aren't performing reliably. Break the prompt into several simpler parts, and chain them together using successive Source ID parameters in the fields array.

To use other fields as context, configure the Source Ids parameter for the Query Group or List methods.

Locate multimodal (non-text) data

When you configure the Multimodal Engine parameter for the Query Group method, you can extract from non-text data, such as photographs, charts, or illustrations.

For example, for the following image, you can prompt, "are the buildings multistory? return true or false".

Click to enlarge

When you extract multimodal data, Sensible sends an image of the relevant document region as context to the LLM. Using the Region parameter, you can configure to locate the context using a manually specified anchor and region coordinates. If you specify automatic, Sensible selects the top-scoring document chunks and sends them as images.

Troubleshooting

See the following tips for troubleshooting situations in which large language model (LLM)-based extraction methods return inaccurate responses, nulls, or errors.

Fix error messages

Error message

ConfigurationError: LLM response format is invalid

Notes

Reword the prompt in simpler terms, chain the prompt using the Source Ids parameter, or avoid specifying a format in the prompt for the extracted data. Or, add a fallback field to bypass the error if the original query is working for most documents and you're only seeing the error intermittently. See the following section for more information about fallbacks.

Background: Sensible returns this error when the LLM doesn't return its response in the JSON format that Sensible specifies in the backend for full prompts. This can occur when your description parameters prompt the LLM to return data in a specific format that conflicts with the expected JSON format.

Interpret confidence signals

Confidence signals are an alternative to confidence scores and to error messages. For information about troubleshooting LLM confidence signals, such as multiple_possible_answers or answer_maybe_be_incomplete, see Qualifying LLM accuracy.

Create fallbacks for null responses or false positives

Sometimes an LLM prompt works for the majority of documents in a document type, but returns null or an inaccurate response (a "false positive") for a minority of documents. Rather than rewrite the prompt, which can cause regressions, create fallbacks targeted at the failing documents. For more information, see Fallback strategies.

Trace source context

Tracing the document's source text, or context, for an LLM's answer can help you determine if the LLM is misinterpreting the correct text, or targeting the wrong text.

You can view the source text for an LLM's answer highlighted in the document:

In the visual output pane, click the Location icon next to a field to view its source text in the document. For more information about how location highlighting works and its limitations, see Location highlighting.

Click to enlarge