Split lines

Splits lines distributed along a horizontal axis. This preprocessor is most useful for typewriter-style documents that use whitespace for formatting.


Note: For the full list of parameters available for this method, see Global parameters for methods. The following table shows parameters most relevant to or specific to this method.

type (required) | splitLines | Splits lines distributed along a horizontal axis.
minSpaces (required) | number | The number of consecutive whitespace characters ( ) at or above which to split lines.
separator | string | Modifies the Min Spaces parameter to split on the specified character, for example "-", instead of the default whitespace character. For example, if you specify "-" for this parameter and 2 for the Min Spaces parameter, then Sensible splits lines when it finds --.
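
The splitting rule described by these parameters can be sketched in a few lines of Python. This is only an illustration of the rule as stated on this page, not Sensible's implementation: the function name `split_line` and its arguments `min_spaces` and `separator` are hypothetical stand-ins for the config parameters, and the sketch works on a plain string rather than on positioned lines in a document.

```python
import re


def split_line(text: str, min_spaces: int, separator: str = " ") -> list[str]:
    """Split text wherever `separator` repeats at least `min_spaces` times.

    Hypothetical helper that mirrors the minSpaces and separator parameters;
    it is an assumption for illustration, not part of any Sensible API.
    """
    if min_spaces < 1:
        raise ValueError("min_spaces must be at least 1")
    # Match a run of min_spaces or more consecutive separator characters.
    pattern = "(?:" + re.escape(separator) + "){" + str(min_spaces) + ",}"
    # Drop empty pieces produced by runs at the start or end of the text.
    return [part for part in re.split(pattern, text) if part]
```

Under these assumptions, `split_line("Policy number   18-376-190", 3)` returns `["Policy number", "18-376-190"]`, because the single space inside "Policy number" is below the threshold. With a separator, `split_line("name--value", 2, "-")` returns `["name", "value"]`, while `split_line("18-376-190", 2, "-")` leaves the text in one piece because each hyphen stands alone.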


The following example shows how to fix undersplit lines in a "typewritten" style PDF. The Split Lines preprocessor preserves columns and rows in this document.


Without the Split Lines preprocessor, Sensible merges the lines too aggressively:

[Image: lines merged without the Split Lines preprocessor]


Config

{
  "preprocessors": [
    {
      "type": "splitLines",
      "minSpaces": 3
    }
  ],
  "fields": [
    {
      "id": "policy_number",
      "method": {
        "id": "row"
      },
      "anchor": "policy number"
    }
  ]
}

Example document

The following image shows the example PDF used with this example config:

[Image: example PDF]

Example PDF: Download link

Output

{
  "policy_number": {
    "type": "string",
    "value": "18-376-190"
  }
}
