Use the following methods to extract structured data from documents.

Layout-based methods

| Method | Notes |
| --- | --- |
| Box | Extracts contents from boxes with continuous borders. |
| Checkbox | Extracts true/false for the selection status of checkboxes. |
| Column | Extracts text aligned in a column, from an anchor down to the bottom of the page. |
| Document Range | Extracts text in a range, or extracts image metadata (coordinates). Simpler alternative to the advanced Paragraph method. |
| Fixed Table | Extracts tables where column headings never vary. |
| Intersection | Extracts a target line at the intersection of a horizontal line defined by one anchor and a vertical line defined by a second anchor. |
| Label | Extracts a line of text that's close to another line. |
| Nearest Checkbox | Extracts true/false for the selection status of the checkbox nearest to the anchor. |
| Paragraph | Extracts paragraphs that span only part of the page width, for example in columnar layouts. |
| Passthrough | Extracts anchor text, optionally using RegEx. |
| Regex | Extracts text matching RegEx. Use RegEx capturing groups in this method, in combination with the Passthrough method, to clean up extracted data. |
| Region | Extracts data from a rectangular region defined by coordinates. Faster alternative to the Box method. |
| Row | Extracts text aligned in a row. |
| Signature | Extracts true/false for the signed status of a region. |
| Text Table | Extracts tables using solely text-positioning data (fast but limited). |
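
To make the geometry behind the Intersection method concrete, here is a minimal Python sketch. It is an illustration only, not the product's actual implementation or configuration syntax: the `Line` class, the `find_anchor` and `intersection` helpers, and the sample coordinates are all hypothetical, and it assumes each line of text comes with a bounding box whose origin is at the top left of the page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    """Hypothetical line of text with a bounding box (origin at top left)."""
    text: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def find_anchor(lines: List[Line], text: str) -> Optional[Line]:
    """Return the first line containing the anchor text (case-insensitive)."""
    for line in lines:
        if text.lower() in line.text.lower():
            return line
    return None


def intersection(lines: List[Line], row_anchor_text: str,
                 col_anchor_text: str) -> Optional[str]:
    """Return the text of the line at the crossing point of the two anchors."""
    row_anchor = find_anchor(lines, row_anchor_text)
    col_anchor = find_anchor(lines, col_anchor_text)
    if row_anchor is None or col_anchor is None:
        return None
    # Horizontal line through the vertical center of the first anchor.
    y = row_anchor.y + row_anchor.height / 2
    # Vertical line through the horizontal center of the second anchor.
    x = col_anchor.x + col_anchor.width / 2
    for line in lines:
        if line is row_anchor or line is col_anchor:
            continue
        inside_x = line.x <= x <= line.x + line.width
        inside_y = line.y <= y <= line.y + line.height
        if inside_x and inside_y:
            return line.text
    return None


# Made-up sample data: a two-column layout with a heading row.
lines = [
    Line("Item", 1.0, 1.0, 1.0, 0.2),
    Line("Total", 4.0, 1.0, 1.0, 0.2),
    Line("Widgets", 1.0, 2.0, 1.5, 0.2),
    Line("$120.00", 4.0, 2.0, 1.2, 0.2),
]

print(intersection(lines, "Widgets", "Total"))  # prints $120.00
```

In this sketch the first anchor ("Widgets") fixes the row and the second ("Total") fixes the column, so the target is whichever other line contains the crossing point. A real extractor would likely need tolerances for skewed or misaligned text, which this sketch omits.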

Large language model (LLM)-based methods

See LLM-based methods.