Getting started

Overview

Welcome! Sensible is a developer-first platform for extracting structured data from documents such as business forms in PDF format. Sensible is highly configurable: you can get simple data about text and images in documents in minutes by leveraging GPT-4 and other large language models (LLMs), or you can tackle complex and idiosyncratic document formatting with Sensible's powerful layout-based document primitives.

See the following list for an overview of going live with Sensible:

  • Learn to extract data, or use out-of-the-box supported document types
  • Integrate using Sensible's API, SDKs, quick-extract UI, or other tools
  • Run quality control on extracted data
  • Monitor extracted data in production

This guide gets you started with the first step, extracting data.

Learn to extract data

Let's get started by extracting document data from an example bank statement. In just a few minutes, we'll author a prompt for a large language model (LLM) that extracts a checking account number.

In this guide, you'll:

  • Extract data from an example document using a natural-language description of your target data, for example, a checking account number.
  • Publish your prompt as part of a "config."
  • Test your config against a second, similar document to ensure it extracts the same target data.

Get an account

  1. Get an account at sensible.so. If you don't have an account, you can still read along to get a rough idea of how things work.

  2. Log in to the Sensible app.

View an example

  1. To view an example bank statement extraction, navigate to https://app.sensible.so/editor/instruct/?d=sensible_instruct_basics&c=bank_statement&g=bank_statement.

    Sensible displays an example document in the left pane, and fields of extracted data in the right pane.

Take the following steps to create a prompt to extract more data from the document.

Auto-extract data

To extract document data automatically, take the following steps:

  1. Click Query group.
  2. Click Auto generate, then click Generate.
  3. Sensible automatically generates queries and extracts their answers from the document.
  4. (Optional) Add more queries by clicking Suggest queries, selecting the field IDs that interest you, and clicking Add selected queries.

To test the automatically generated extraction configuration with another document, see Test the prompt. To author your own extraction configurations, see the following steps.

Manually configure extraction

  1. To author your own LLM prompts to extract data points from the document, click Query group.
  2. Edit the query group by entering "checking account number (not savings)" in the query field, then click Extract.
  3. Sensible displays the extracted account number, 8347-32348, in the Extracted data section.
  4. Click Back to fields.

Congratulations! You extracted the checking account number from the bank statement.
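
If it helps to picture the prompt as data, the query you just entered can be thought of as a small config fragment. The following Python sketch is an illustration only: the key names (fields, method, queryGroup, queries, id, description) are assumptions rather than a schema confirmed by this guide, and the field ID checking_account_number is hypothetical.

```python
import json

# Hypothetical config fragment: one query group containing a single
# natural-language query. All key names are assumptions for illustration.
config = {
    "fields": [
        {
            "method": {
                "id": "queryGroup",
                "queries": [
                    {
                        "id": "checking_account_number",
                        "description": "checking account number (not savings)",
                    }
                ],
            }
        }
    ]
}


def query_descriptions(cfg):
    """Return a mapping of query ID to its natural-language description."""
    found = {}
    for field in cfg.get("fields", []):
        for query in field.get("method", {}).get("queries", []):
            found[query["id"]] = query["description"]
    return found


print(json.dumps(query_descriptions(config), indent=2))
```

Keeping the description close to the wording you would use with a colleague, including the "(not savings)" qualifier, is what lets the same prompt work on statements that list several accounts.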

Publish the prompt

To extract checking account numbers from other bank statements in production, publish the "config" containing your prompt.

Click Publish configuration, click Production, then click Publish to production.
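
Once a config is published, an application can send documents to it. The sketch below only builds the HTTP request and does not send it. The base URL, the extract path, the environment query parameter, the bearer-token header, and the YOUR_API_KEY placeholder are all assumptions to verify against Sensible's API reference. The document type name sensible_instruct_basics is taken from the d parameter of the example editor URL in this guide.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.sensible.so/v0"  # assumed base URL, verify before use


def build_extract_request(document_type, pdf_bytes, api_key, environment="production"):
    """Build (but do not send) a POST request that submits a PDF for extraction."""
    query = urllib.parse.urlencode({"environment": environment})
    url = f"{API_BASE}/extract/{urllib.parse.quote(document_type)}?{query}"
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/pdf",
        },
        method="POST",
    )


# Placeholder bytes stand in for a real PDF file read from disk.
request = build_extract_request(
    "sensible_instruct_basics", b"%PDF-1.7 placeholder", "YOUR_API_KEY"
)
print(request.get_method(), request.full_url)
```

To actually send the request, you could pass it to urllib.request.urlopen and decode the response body as JSON. Building the request separately makes it easy to inspect the URL and headers first.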

Test the prompt

Let's see if the config containing your prompt works with other bank statements. To test the prompt, take the following steps:

  1. Navigate to https://app.sensible.so/editor/instruct/?d=sensible_instruct_basics&c=bank_statement&g=bank_statement_2. Notice that the left pane now displays a statement for a different customer.
  2. In the right pane, scroll down to the checking account number field you authored in previous steps. Verify that the extracted information automatically updated to reflect the second example document. For example, the account number updated from 8347-32348 to 9876-12345.

Your prompt successfully extracted the checking account number from another document. Great!
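
The same check can be expressed in code: given extraction results for two documents, confirm that each one contains a non-empty value for the field you care about. This is a minimal sketch, not Sensible's output format. The {"field_id": {"value": ...}} shape and the field ID checking_account_number are assumptions for illustration; the account numbers are the ones shown in this guide.

```python
# Hypothetical extraction results for the two example statements.
# The {"field_id": {"value": ...}} shape is an assumption for illustration.
first_statement = {"checking_account_number": {"value": "8347-32348"}}
second_statement = {"checking_account_number": {"value": "9876-12345"}}


def missing_fields(expected_ids, extracted):
    """Return the expected field IDs that are absent or have an empty value."""
    missing = []
    for field_id in expected_ids:
        field = extracted.get(field_id) or {}
        if not field.get("value"):
            missing.append(field_id)
    return missing


expected = ["checking_account_number"]
for name, result in [("first", first_statement), ("second", second_statement)]:
    gaps = missing_fields(expected, result)
    print(name, "OK" if not gaps else f"missing: {gaps}")
```

A check like this confirms only that the field was populated. Whether the value is the correct account number still needs a human review or a known-good reference value.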

(Optional) Extract more data

Try extracting more complex pieces of information. For example, try extracting the time period for each account using the List method. See the accounts_list field in this config for an example of using the List method.

Publish the config to save your changes.

(Optional) Extract from your own documents

To extract data from your documents, first check if they're on Sensible's list of out-of-the-box supported document types. If not, configure your custom extractions by using the interactive tutorial or taking the following steps:

  1. To exit the Sensible Instruct editor, click Sensible in the upper left corner.
  2. Click the Document types tab. Create a new document type, then click the type you created to edit it.
  3. In the document type's Reference documents tab, upload your own example document.
  4. In the document type's Configurations tab, create a new test configuration, and click the configuration you created to edit it.
  5. Write prompts in the configuration editor to extract data using what you learned in previous steps.

Next

Learn more about extraction

Integrate

Get extracted document data out of Sensible and put it to work in Excel files, databases, and other destinations. See Integrating.
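
As a rough illustration of putting extracted data to work, the sketch below flattens extracted fields from several documents into CSV text, which Excel and most databases can import. The input shape ({"field_id": {"value": ...}}) and the field ID are assumptions for illustration, not Sensible's documented output; the document names and account numbers come from the two examples in this guide.

```python
import csv
import io


def fields_to_csv(documents):
    """Flatten extracted fields from several documents into CSV text."""
    # Collect every field ID seen across documents for the header row.
    field_ids = sorted({fid for doc in documents.values() for fid in doc})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document"] + field_ids)
    for name, fields in documents.items():
        # Missing or null fields become empty cells.
        row = [name] + [(fields.get(fid) or {}).get("value", "") for fid in field_ids]
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical extraction results keyed by example document name.
extractions = {
    "bank_statement": {"checking_account_number": {"value": "8347-32348"}},
    "bank_statement_2": {"checking_account_number": {"value": "9876-12345"}},
}
print(fields_to_csv(extractions))
```

Writing to an in-memory buffer keeps the function easy to test. To produce a file, write the returned string to a path of your choice.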