Quickstart PDF to Excel

In this quickstart, extract data from an example tax form PDF and convert the data to a spreadsheet with no coding involved.

  • If you're a developer, see the developer quickstart for a "hello world" API call.
  • If you instead want a guided tour of SenseML concepts so you can extract data from your own custom documents, see Getting started.

Introduction

If you're trying to convert a PDF into an Excel spreadsheet, you'll often find tools that visually map the PDF layout onto a spreadsheet, with no meaningful relationship between the extracted text and the underlying cells.

In contrast, this tutorial shows you how to use Sensible to convert document tables, checkboxes, paragraphs, and even complex repeating section layouts into meaningfully labeled column/row pairs and linked sheets. You can convert documents formatted as PDFs, PNGs, and JPEGs.
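This tutorial needs no code, but a short sketch may help show what "meaningfully labeled column/row pairs" means in practice. The Python below flattens a small set of extracted fields into labeled rows and serializes them as CSV, which spreadsheet applications can open. The field names, values, and JSON shape are hypothetical, chosen only for illustration. This is not Sensible's conversion logic: for example, it flattens repeating sections into indexed rows rather than producing linked sheets.

```python
import csv
import io

# Hypothetical extraction output: the field names, values, and shape are
# made up for illustration and are not taken from the example tax form.
extracted = {
    "filing_status": {"type": "string", "value": "Single"},
    "total_income": {"type": "number", "value": 52000},
    "dependents": [
        {"name": {"type": "string", "value": "A. Example"}},
        {"name": {"type": "string", "value": "B. Example"}},
    ],
}


def to_rows(fields):
    """Flatten extracted fields into (label, value) rows.

    A single field becomes one row. A list of repeating sections becomes
    one row per sub-field of each item, labeled with the item's index.
    """
    rows = []
    for label, field in fields.items():
        if isinstance(field, list):
            for index, item in enumerate(field, start=1):
                for sub_label, sub_field in item.items():
                    rows.append((f"{label}[{index}].{sub_label}", sub_field["value"]))
        else:
            rows.append((label, field["value"]))
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["field", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_rows(extracted)
print(to_csv(rows))
```

Each row pairs a field label with its value, so a cell's meaning does not depend on where the text sat on the page.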

Convert document data to Excel spreadsheet

  1. Get an account at sensible.so.

  2. Navigate to Sensible's open-source configuration library to choose an example document type. For this tutorial, select Tax forms.

  3. Select Clone to account to copy example tax forms and associated configurations for extracting data from those forms to your account.

  4. Download the following example tax form:

    Example PDF: Download link
  5. Navigate to the quick extraction tab.

  6. Upload the document you downloaded in the previous step.

  7. Select tax_form in the Document type dropdown and click Run extraction.

    Sensible extracts data from the document and displays it as JSON in the Extraction pane.

  8. Select Download Excel to convert the extracted data to Excel.

The following spreadsheet shows the example output:

  9. (Optional) View the document and its configuration in the Sensible app at https://app.sensible.so/editor/?d=tax_forms&c=1040_2021&g=1040_2021_sample to explore or tweak the SenseML rules for extracting data from this tax form.
