Split lines
Splits lines distributed along a horizontal axis. This preprocessor is most useful for typewriter-style documents that use whitespace for formatting.
Parameters
Note: For additional parameters available for this method, see Global parameters for methods. The following table shows parameters most relevant to or specific to this method.
key | value | description |
---|---|---|
type (required) | splitLines | Splits lines distributed along a horizontal axis. |
minSpaces (required) | number | The number of consecutive whitespace characters (  ) at or above which to split lines. |
separator | string | Modifies the Min Spaces parameter to split on the specified character, for example "-", instead of the default whitespace character. For example, if you specify "-" for this parameter and 2 for the Min Spaces parameter, then Sensible splits lines when it finds -- . |
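The splitting rule described in the table can be illustrated with a minimal Python sketch. This is not Sensible's implementation: the function name `split_line` and its arguments are hypothetical, and the sketch assumes that the default whitespace character is a literal space and that a line is split wherever the separator repeats at least `min_spaces` times.

```python
import re


def split_line(text: str, min_spaces: int, separator: str = " ") -> list[str]:
    """Hypothetical illustration of the minSpaces and separator parameters.

    Splits `text` wherever `separator` occurs at least `min_spaces` times
    in a row. Assumes the default separator is a literal space.
    """
    if min_spaces < 1:
        raise ValueError("min_spaces must be at least 1")
    # Match a run of `min_spaces` or more consecutive separators.
    pattern = re.compile("(?:" + re.escape(separator) + "){" + str(min_spaces) + ",}")
    # Drop empty fragments and trim leftover whitespace around each piece.
    return [part.strip() for part in pattern.split(text) if part.strip()]


# Runs of three or more spaces split the line; single spaces do not.
columns = split_line("Policy number   18-376-190     Effective date", 3)

# With "-" as the separator and a minimum of 2, only "--" splits the line.
dashed = split_line("name--value-x", 2, "-")
```

In this sketch, a single space inside "Policy number" stays intact because it falls below the threshold, which mirrors how a higher Min Spaces value keeps words within a column together.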
Examples
The following example shows how to fix undersplit lines in a typewriter-style document. The Split Lines preprocessor preserves columns and rows in this document.
PROBLEM
Without the Split Lines preprocessor, Sensible merges the lines too aggressively:
SOLUTION
Config
{
  "preprocessors": [
    {
      "type": "splitLines",
      "minSpaces": 3
    }
  ],
  "fields": [
    {
      "id": "policy_number",
      "method": {
        "id": "row"
      },
      "anchor": "policy number"
    }
  ]
}
Example document
The following image shows the example document used with this example config:
Example document | Download link |
---|---|
Output
{
  "policy_number": {
    "type": "string",
    "value": "18-376-190"
  }
}