Getting started with email extraction
Introduction
You can automatically extract structured data from email bodies and attachments by forwarding them to Sensible.
The following image shows an overview of email extraction:
flowchart TD
A[User receives email] --> B[User forwards email to Sensible]
B --> C[Sensible classifies and extracts data]
C --> D[User gets extracted data via webhook]
Implementation overview
To implement this workflow, take the following general steps:
- Determine email filters
  - Determine a set of similar emails from which you want to extract data. For example, you're in PropTech and you want to extract data from residential lease applications.
  - Determine email filtering criteria for the set of emails. In a later step, use the filters to automatically forward these emails to a Sensible email address.
- Configure data extraction
  - In the Sensible app, define a document type for each email attachment in the lease application emails from which you want to extract data. You can optionally define a document type for the email body. In this example, the lease application emails include driver_license, pay_stubs, leases, email_body_lease_applications, and other document types.
- (Optional) Configure data destination
  - By default, view the extracted data in the Sensible app. Optionally, you can also define webhooks to receive the extracted data.
- Create email processor
  - When you've completed the preceding steps, contact Sensible to create an email processor. An email processor contains the specified document types, webhook URLs, and forwarding email aliases. You can now start forwarding emails to the processor and receive extracted data.
- (Optional) Send a test email
  - Download sample documents and send a test email to view an example extraction.
- (Optional) Test in dev
  - Make changes to your extraction configs and test in a dev environment before going into production.
Getting started
Let's walk through an example of implementing an email processor. In this example implementation, you're in PropTech and you want to extract data from lease applications addressed to the property manager "Sensible Property." Lease application emails to this property manager typically include the following attachments:
- driver's license
- signed lease
- a single PDF file containing multiple documents (a "portfolio" file):
- tax statement
- bank statement
- paystub
The following image shows an example email:
You'll create a residential_lease_applications email processor to handle emails like this one.
Determine email filters
- Determine your filtering criteria for forwarding Sensible Property lease applications. For example, filter by emails addressed to [email protected].
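As a rough illustration of this criterion, the following Python sketch checks whether a message is addressed to a given recipient. The address `leasing@example.com` and the function name `should_forward` are hypothetical placeholders, not part of Sensible. In practice you would usually express the same rule in your mail provider's filter settings rather than in code.

```python
from email.message import EmailMessage
from email.utils import getaddresses

# Hypothetical address for illustration only; substitute the address your
# lease applications are actually sent to.
LEASING_ADDRESS = "leasing@example.com"


def should_forward(message: EmailMessage, target: str = LEASING_ADDRESS) -> bool:
    """Return True if the message lists the target address in To or Cc."""
    headers = message.get_all("To", []) + message.get_all("Cc", [])
    # getaddresses parses "Name <addr>" entries into (name, addr) pairs.
    recipients = {addr.lower() for _name, addr in getaddresses(headers)}
    return target.lower() in recipients
```

Matching on the recipient address alone keeps the filter simple. If the mailbox also receives unrelated mail, you might combine it with a subject or attachment check.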
Configure data classification and extraction
To configure email data classification and extraction in your Sensible account, take the following steps.
Create out-of-the-box document types
Create document types to classify and extract from the email attachments:
- Follow the steps in Out-of-the-box extractions to add extraction support for the following document types to your account:
- driver_license document type
- pay_stubs document type
- bank_statements document type
- 1040s document type
(Optional) Create custom document types
Sensible doesn't provide out-of-the-box extraction support for leases. To create support in your account, take the following steps:
- Create a document type for leases. In the Document Types tab, click New document type. In the dialog, take the following steps:
  - Name the document type leases.
  - Upload the following example document:
    Example document Download link
  - Name the config sensibleproperties for the fictional property management company in this example.
- After you create the document type, edit the config you created. Paste the following code into the left pane:
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "searchBySummarization": "page",
        "queries": [
          {
            "id": "tenancy_terms_start",
            "description": "tenancy terms start date",
            "type": "date"
          },
          {
            "id": "tenancy_terms_end",
            "description": "tenancy terms end date",
            "type": "date"
          },
          {
            "id": "monthly_rents_dollars",
            "description": "monthly rents in dollars",
            "type": "currency"
          }
        ]
      }
    }
  ]
}
- (Optional) Create a document type for lease application email bodies:
  - Follow the preceding steps to create a document type named email_body_lease_applications with a config named sensibleproperties. Upload the following example document:
    Example document Download link
    Note: This example document is a PDF exported from an email body for testing. In production, Sensible automatically converts email bodies to PDFs.
  - In the config, paste the following code:
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "searchBySummarization": "page",
        "queries": [
          {
            "id": "applicant_name",
            "description": "What is the name of the applicant?",
            "type": "string"
          },
          {
            "id": "date_sent",
            "description": "What is the date the email was sent?",
            // this type formats the extracted data as an ISO 8601 date
            "type": "date"
          },
          {
            "id": "attachment_count",
            "description": "How many attachments are included in the email?",
            "type": "string"
          }
        ]
      }
    }
  ]
}
How it works: email processors and document types
Your residential_lease_applications email processor uses the document types you configured in previous steps for classification and extraction:
- The email processor classifies each attachment against the document types you specify for the email processor:
  - If you specify to process all attachments as portfolio files, Sensible automatically segments each document by its page range in the file, and classifies each document in each file against all the document types you specify.
  - If you specify to process all attachments as single-file documents, Sensible classifies each file as a single document type.
  If you expect a mix of portfolio and single-document files, then specify to process them all as portfolio files. Note that this setting can add extra processing time for single-document files.
- You specify one document type for the email body, for example, email_body_lease_applications. The email processor extracts data using that document type.
flowchart TD
A[email processor] -->|classify attachments| B[attachment document types]
A --> C[body document type]
B --> D[extract data]
C --> D[extract data]
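The routing described above can be modeled roughly as follows. This is a conceptual sketch only, not Sensible's implementation: the names `ClassificationTask` and `plan_tasks` are invented for illustration, and the real processor also segments portfolio files by page range, which this model leaves out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClassificationTask:
    source: str                 # attachment filename, or "body"
    mode: str                   # "portfolio", "single", or "body"
    candidate_types: List[str]  # document types to classify against


def plan_tasks(
    attachments: List[str],
    attachment_types: List[str],
    body_type: Optional[str] = None,
    portfolio: bool = True,
) -> List[ClassificationTask]:
    """Model which document types each part of an email is matched against."""
    mode = "portfolio" if portfolio else "single"
    # Every attachment is classified against all attachment document types.
    tasks = [
        ClassificationTask(name, mode, list(attachment_types))
        for name in attachments
    ]
    # The body is not classified; it uses the one document type you specify.
    if body_type is not None:
        tasks.append(ClassificationTask("body", "body", [body_type]))
    return tasks
```

For the lease application example, you would call `plan_tasks` with the attachment filenames, the five attachment document types, and `body_type="email_body_lease_applications"`.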
Each document type contains configs, or collections of SenseML queries for extracting document data. Configs handle variations in a document type. For example, each config in the pay_stubs document type handles a different paystub software vendor, such as Gusto, ADP, or Paylocity. When you edit configs, you can publish them to a development environment for testing before publishing them to production.
(Optional) Configure data destination
To receive extracted email data, you have the following options:
- By default, view and download the extracted data in the Sensible app on the Extraction history tab:
- Implement webhooks as destinations for the extracted data. You can specify a webhook for each environment to which you publish your configs. See the following sections for more information about environments.
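If you implement a webhook, a minimal receiver might look like the following Python sketch, which uses only the standard library. It assumes the webhook delivers a JSON body containing a `parsed_document` object keyed by field ID, with each field holding a `value`. Treat that payload shape as an assumption and confirm it against Sensible's API reference. The names `summarize_extraction`, `ExtractionWebhook`, and `serve` are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_extraction(payload) -> dict:
    """Flatten {"field": {"value": ...}} entries into {"field": value}.

    Assumes the payload carries extracted fields under "parsed_document".
    """
    if not isinstance(payload, dict):
        return {}
    parsed = payload.get("parsed_document") or {}
    return {
        field: (data.get("value") if isinstance(data, dict) else data)
        for field, data in parsed.items()
    }


class ExtractionWebhook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)  # reject bodies that are not valid JSON
            self.end_headers()
            return
        print(summarize_extraction(payload))
        self.send_response(200)  # acknowledge receipt
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted (hypothetical local setup)."""
    HTTPServer(("", port), ExtractionWebhook).serve_forever()
```

Call `serve()` to start listening locally. A production endpoint would also need HTTPS and some way to verify that requests come from Sensible.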
Create email processor
In the preceding steps, you configured the necessary prerequisites for an email processor that can handle lease applications. Contact Sensible to create the email processor. Provide the following details:
- the name of the email processor, for example, residential_lease_applications.
- the names of the document types you created in your account (driver_license, pay_stubs, bank_statements, 1040s, leases, and email_body_lease_applications).
- whether you expect the attachments to include any multi-document portfolio attachments. In this example, you expect portfolio file attachments in addition to single-document file attachments, so specify portfolio.
- (optional) the environment prefix you want to use for your development environment, for example, dev or development.
- (optional) the URL of each webhook you implemented and which environment each corresponds to.
After creating the email processor, Sensible provides you with the email address for the processor, for example: [email protected].
Forward your lease application emails to this address.
(Optional) send a test email
Send a test email with attachments to the processor you created. You can download example documents from the following locations:
| document | link |
|---|---|
| Driver's license | Download link |
| Portfolio file containing bank statement, paystub, and tax statement | Download link |
| Lease | Download link |
For the body, use the following text:
Dear Anita Patel,
I hope you’re doing well. I’m writing to formally submit my application for the rental unit at 123 Sample St unit #3. I am very interested in leasing this apartment and have attached all the necessary documents for your review.
Please find attached:
- Signed lease agreement
- Proof of income (recent pay stub)
- Copy of my ID (driver’s license)
Please let me know if you need any additional information or if there are any next steps in the approval process.
Thank you for your time and consideration. I look forward to your response.
Best regards,
Brenda Sample
(505) 123 4567
[email protected]
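If you prefer to script the test email rather than send it from a mail client, the following Python sketch composes a message with the body text and attachments. It assumes the sample documents are PDFs. The function name `build_test_email` and all addresses in the usage note are placeholders, so use the processor address Sensible gave you.

```python
from email.message import EmailMessage
from typing import Iterable, Tuple


def build_test_email(
    sender: str,
    processor_address: str,
    body: str,
    attachments: Iterable[Tuple[str, bytes]],
) -> EmailMessage:
    """Compose a test email; attachments are (filename, file bytes) pairs."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = processor_address
    msg["Subject"] = "Lease application"
    msg.set_content(body)
    for filename, data in attachments:
        # Assumes each sample document is a PDF.
        msg.add_attachment(
            data, maintype="application", subtype="pdf", filename=filename
        )
    return msg
```

To send the message, read each downloaded file with `pathlib.Path(...).read_bytes()`, build the message, and pass it to `smtplib.SMTP(...).send_message(msg)` using your own mail server's host and credentials.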
You should get back an extraction response for each attachment at the webhook you specified.
In the Sensible app, click each extraction to view its data. For example, the paystub extraction includes the extracted fields employer_name: Delta Airlines and employee_name: Brenda Sample:
(Optional) test in dev
If you make a change to a config, you can test it in the development environment before going live in production.
For example, say you make the following change in your config in the email_body_lease_applications document type:
{
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "searchBySummarization": "page",
        "queries": [
          /* old prompt was 'What is the name of the applicant?'
             new simplified prompts ask for last and first names separately */
          {
            "id": "applicant_first_name",
            "description": "Applicant first name",
            "type": "string"
          },
          {
            "id": "applicant_last_name",
            "description": "Applicant last name",
            "type": "string"
          }
        ]
      }
    }
  ]
}
To test the change in the development environment:
- Publish the config to the development environment.
- Add the development environment prefix you specified in a previous step to the forwarding address, for example, development.residential_lease_applications.abc_xyz@app.sensible.so. If you omit the environment prefix, Sensible defaults to the production environment.
View the results in the Sensible app, or in the webhook you specified for the development environment in a previous step.
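The prefix rule above amounts to prepending the environment name and a dot to the processor address. A minimal sketch, with the hypothetical helper name `forwarding_address`:

```python
from typing import Optional


def forwarding_address(processor_address: str, environment: Optional[str] = None) -> str:
    """Prepend an environment prefix to the processor's email address.

    With no prefix, the address is returned unchanged, which the processor
    treats as the production environment.
    """
    if not environment:
        return processor_address
    return f"{environment}.{processor_address}"
```

For example, `forwarding_address("residential_lease_applications.abc_xyz@app.sensible.so", "development")` produces the development address shown in the steps above.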