Nov 2023

In the last month, we released an open-source JavaScript SDK for Node that offers convenient access to the Sensible API, added support for extracting data from Word documents, and raised the page limit for the List method from 2 pages to 20 pages. We also made advanced table options available in the Sensible Instruct editor and released other minor improvements.

New feature: Node SDK

Our new open-source Node SDK simplifies extracting data from documents and classifying document types. For example, you can now asynchronously extract from local files with just one method call, instead of two API calls. For more information, see the Node SDK quickstart and Node SDK documentation.

Improvement: List method extracts up to 20 pages of data

The List method's maximum page limit is now 20 pages, up from the former limit of 2 pages. As a result, you can now use the List method as an alternative to sections for long runs of repeating data with simple layouts.

New feature: Support for Word documents

Sensible now supports extracting data from Microsoft Word documents (DOC and DOCX file types) in addition to PDF, PNG, JPEG, and TIFF file types. Sensible also now supports classifying Word documents by type. For more information, see Supported file types.

UX improvement: Advanced Table option in Sensible Instruct

The advanced Rewrite Table parameter we released in July 2023 is now available in the Sensible Instruct editor in addition to the SenseML editor.

Improvement: New configurable recognition for Paragraph type

With the new Paragraph Break Threshold parameter for the Paragraph type, you can now configure the size of the space that Sensible recognizes as a paragraph break. If you set the Annotate Superscript and Subscript parameter to true, you can also output end-of-page breaks as [EOP] annotations.

Improvement: Test development configs with quick extractions

You can now choose between production-version and development-version configs when you extract from documents using the Sensible app's Quick extraction tab. This improvement makes testing new configs in bulk easier. When you're satisfied with your new config edits, publish them to production.

Improvement: Ensure consistent OCR between portfolio and single-document files

We introduced a new level for the OCR Level parameter. Set the new level 5 for document types you use to process both single documents and portfolios, so that your OCR level is consistent between single documents and portfolios. At level 5, Sensible renders and tests each page in the document to determine whether to run OCR on the page. For more information, see OCR Level.
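
To illustrate why a per-page decision gives consistent results, here is a minimal TypeScript sketch. It is not Sensible's implementation: the release note does not describe the test Sensible applies to each rendered page, so the `PageInfo` shape, the `needsOcr` check, and the `minChars` threshold below are all hypothetical stand-ins. The sketch only shows that when each page is judged on its own, a page gets the same OCR decision whether it arrives alone or inside a portfolio.

```typescript
// Hypothetical page summary for illustration; not a Sensible data structure.
interface PageInfo {
  pageNumber: number;
  embeddedText: string; // text found in the file's text layer, if any
}

// Hypothetical per-page test (an assumption, not Sensible's actual check):
// flag a page for OCR when its text layer has fewer than `minChars`
// non-whitespace characters.
function needsOcr(page: PageInfo, minChars: number = 20): boolean {
  const visibleChars = page.embeddedText.replace(/\s/g, "").length;
  return visibleChars < minChars;
}

// Judge every page independently and return the page numbers to OCR.
// Because no decision depends on neighboring pages, the result for a page
// is the same in a single document and in a portfolio.
function pagesToOcr(pages: PageInfo[], minChars: number = 20): number[] {
  return pages
    .filter((page) => needsOcr(page, minChars))
    .map((page) => page.pageNumber);
}

// A two-page portfolio: a digital page followed by a scanned page.
const portfolio: PageInfo[] = [
  { pageNumber: 1, embeddedText: "Policy number 12345, effective 2023-11-01" },
  { pageNumber: 2, embeddedText: "" },
];

console.log(pagesToOcr(portfolio));
```

In this sketch only page 2 is flagged, and it would be flagged the same way if it were submitted as a one-page document.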