Developer quickstart
Introduction
Sensible extracts structured data from documents, for example PDFs of business forms, using SenseML, a JSON-formatted query language. Write SenseML to extract from your custom documents, or leverage our open-source SenseML configuration library for common business document types.
In this quickstart, use an example SenseML configuration and example PDF to get a quick "hello world" API response.
- If you instead want a guided tour of SenseML concepts, see Getting started.
- If you instead want to explore SenseML without much explanation, sign up for a free account and check out our interactive in-app tutorials: extract_your_first_data; tables and rows; checkboxes, paragraphs, and regions; and a blank-slate challenge.
Extract example document data
To run an API call and return extracted, structured data from a downloaded example document:
-
Get an account at sensible.so.
NOTE In the Sensible app, don't rename the default doc type (senseml_basics) or delete the 1_extract_your_first_data config, or this example fails.
-
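Copy the following code example. The first version uses backslash (\) line continuations for macOS and Linux shells; the second uses caret (^) line continuations for the Windows command prompt: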
curl -L https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/1_extract_your_first_data.pdf \
--output 1_extract_your_first_data.pdf && \
curl --request POST \
--url "https://api.sensible.so/v0/extract/senseml_basics" \
--header "Authorization: Bearer <YOUR_API_TOKEN>" \
--header "Content-Type: application/pdf" \
--data-binary "@1_extract_your_first_data.pdf"
curl -L https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/1_extract_your_first_data.pdf ^
--output 1_extract_your_first_data.pdf && ^
curl --request POST ^
--url "https://api.sensible.so/v0/extract/senseml_basics" ^
--header "Authorization: Bearer <YOUR_API_TOKEN>" ^
--header "Content-Type: application/pdf" ^
--data-binary "@1_extract_your_first_data.pdf"
-
Replace <YOUR_API_TOKEN> with your API key in the preceding code example. Find your key on your account page.
-
Run the code sample in a command prompt. The API returns a parsed_document object with the extracted data, as well as metadata about the extraction, in a response like the following:
{
"id":"153753e0-5673-466f-aa61-5175200c210d",
"created":"2022-02-02T21:30:01.981Z",
"status":"COMPLETE",
"type":"senseml_basics",
"configuration":"1_extract_your_first_data",
"parsed_document":{
"your_first_extracted_field":{
"type":"string",
"value":"Welcome to your first document"
}
},
"validations":[
],
"validation_summary":{
"fields":1,
"fields_present":1,
"errors":0,
"warnings":0,
"skipped":0
},
"classification_summary":[
{
"configuration":"1_extract_your_first_data",
"score":{
"value":1,
"fields_present":1,
"penalties":0
}
},
{
"configuration":"2_tables_and_rows",
"score":{
"value":0,
"fields_present":0,
"penalties":0
}
},
{
"configuration":"3_checkboxes_paragraphs_and_regions",
"score":{
"value":0,
"fields_present":0,
"penalties":0
}
},
{
"configuration":"4_extract_from_scratch",
"score":{
"value":0,
"fields_present":0,
"penalties":0
}
}
],
"errors":[
]
}
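If you prefer to make the same call from a script, the following is a minimal Python sketch of the steps above, using only the standard library. The endpoint, headers, and response keys come from the curl example and the sample response; the helper names (build_extract_request, extract, summarize) are hypothetical and not part of any Sensible SDK. The summarize helper assumes simple fields shaped like the one in the sample response and skips anything else.

```python
import json
import urllib.request

# Endpoint and doc type are taken from the curl example above.
EXTRACT_URL = "https://api.sensible.so/v0/extract/senseml_basics"


def build_extract_request(pdf_bytes: bytes, api_token: str) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    return urllib.request.Request(
        EXTRACT_URL,
        data=pdf_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/pdf",
        },
    )


def extract(pdf_path: str, api_token: str) -> dict:
    """Send the PDF and return the decoded JSON response (needs network access)."""
    with open(pdf_path, "rb") as f:
        request = build_extract_request(f.read(), api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize(response: dict) -> dict:
    """Collect extracted field values and the best-scoring configuration."""
    # Assumption: each simple field is an object with a "value" key, as in
    # the sample response; fields with any other shape are skipped.
    fields = {
        name: field["value"]
        for name, field in response["parsed_document"].items()
        if isinstance(field, dict) and "value" in field
    }
    # classification_summary lists every config in the doc type with a score.
    best = max(
        response.get("classification_summary", []),
        key=lambda entry: entry["score"]["value"],
        default=None,
    )
    return {
        "status": response["status"],
        "fields": fields,
        "best_configuration": best["configuration"] if best else None,
    }
```

To try it, download the example PDF as in the curl example, then call summarize(extract("1_extract_your_first_data.pdf", "<YOUR_API_TOKEN>")) with your own API key substituted for the placeholder.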
(Optional) See how it works in the Sensible app
To see this example in the Sensible app:
-
Log into the Sensible app.
-
Navigate to the first tutorial config.
-
Visually examine the example PDF (middle pane), config (left pane), and extracted data (right pane) to better understand the API call you just ran.
Next
- Learn concepts with more detailed examples in the Getting Started Guide
- Check out the SenseML method reference docs to write your own extractions
- See the API reference and example code