October 2025

In the last month, Sensible released advanced configuration options for layout-based extraction methods and added support for extracting from XLSM files.

Improvement: Preprocessor now splits lines on matched pages

When preprocessing your document, you can now split lines on matching pages using the new Match parameter. For example, use this parameter to target typewritten pages in a document that otherwise contains pages with digital fonts.

Improvement: Advanced configurability for Box method

For the Box method, you can now relax the criteria by which Sensible determines that a box "contains" lines. For example, use the new Percent Overlap X and Percent Overlap Y parameters to extract poorly aligned box contents.
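To make the idea of relaxed containment concrete, the following is a minimal sketch of how a percent-overlap test could work in general. It is an illustration only, not Sensible's implementation: the `Rect` type, the `box_contains_line` function, the 100 percent defaults, and the exact overlap formula are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in page coordinates (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def overlap_percent(lo_a: float, hi_a: float, lo_b: float, hi_b: float) -> float:
    """Percent of the interval [lo_a, hi_a] that lies inside [lo_b, hi_b]."""
    length = hi_a - lo_a
    if length <= 0:
        return 0.0
    covered = min(hi_a, hi_b) - max(lo_a, lo_b)
    return max(0.0, covered) / length * 100.0


def box_contains_line(
    box: Rect,
    line: Rect,
    percent_overlap_x: float = 100.0,  # assumed default: strict containment
    percent_overlap_y: float = 100.0,  # assumed default: strict containment
) -> bool:
    """Treat the line as inside the box if enough of it overlaps on each axis."""
    x = overlap_percent(line.left, line.right, box.left, box.right)
    y = overlap_percent(line.top, line.bottom, box.top, box.bottom)
    return x >= percent_overlap_x and y >= percent_overlap_y


box = Rect(left=0, top=0, right=10, bottom=10)
# A poorly aligned line: half of its width spills past the box's right edge.
line = Rect(left=8, top=2, right=12, bottom=4)

strict = box_contains_line(box, line)
relaxed = box_contains_line(box, line, percent_overlap_x=50.0)
```

In this sketch the strict check rejects the misaligned line, while lowering the horizontal threshold to 50 percent accepts it. Lowering a threshold trades precision for recall, so a relaxed value may also pull in text from neighboring boxes.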

Improvement: Output document's file type

With the Get File Metadata method's new Content Type enum, you can output the document's MIME content type to the extraction's parsed_document output. For example, you can use the Conditional method to extract a set of fields based on whether a document's file type is image/jpeg. For more information, see the Get File Metadata method.
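The branching described above can be sketched in a few lines. This is a hypothetical illustration of conditional logic on a MIME content type, not Sensible configuration syntax: the function name and the fieldset names are invented for the example.

```python
def choose_fieldset(content_type: str) -> str:
    """Pick a set of fields to extract based on a MIME content type.

    The returned fieldset names are hypothetical placeholders.
    """
    # MIME types are case-insensitive and may carry parameters after ';',
    # so normalize before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "image/jpeg":
        return "scanned_image_fields"
    return "default_fields"


jpeg_choice = choose_fieldset("image/jpeg")
pdf_choice = choose_fieldset("application/pdf")
```

Normalizing the string first keeps the comparison robust to variations such as `IMAGE/JPEG`.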

Improvement: Reduce large output size when postprocessing

When you specify a completely custom schema for your extracted document data using a JsonLogic postprocessor, you can now suppress output in the default parsed_document schema. For example, if you want to reduce the size of the extracted output or you're interested solely in the postprocessor output, you can set the new Keep Parsed Document parameter in the postprocessor to false. Note that setting this parameter to false also disables human review and Excel output.

New feature: Support for XLSM documents

Sensible now supports data extraction and classification for XLSM documents. For more information, see Supported file types.