Data forms render alongside the spatial document viewer, creating a connected workflow where users can select text in the document and extract values into form fields. Two extraction mechanisms are available: direct extract for verbatim text copying, and AI extraction for LLM-powered multi-field inference.
Direct Extract
Direct extract copies selected text from the document viewer into a form field verbatim, applying automatic type conversion (e.g., parsing a date string into a date value). It is the simplest extraction method and is best suited to fields where the document text exactly matches the desired attribute value. Enable direct extract by setting allowDirectExtract: true in the editorOptions on a v2:attributeEditor.
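A minimal sketch of such a configuration, written here as a TypeScript object literal. Only allowDirectExtract, editorOptions, and the v2:attributeEditor type name come from the text above; the surrounding shape (the type and props keys, the tagPath prop, and the example path) is assumed for illustration and may differ from the actual form schema.

```typescript
// Hypothetical shape of a form component entry. Only the
// `editorOptions.allowDirectExtract` flag and the `v2:attributeEditor`
// type name are taken from the documentation; the rest is assumed.
interface AttributeEditorComponent {
  type: "v2:attributeEditor";
  props: {
    tagPath: string; // assumed prop naming the field this editor binds to
    editorOptions?: { allowDirectExtract?: boolean };
  };
}

const invoiceDateEditor: AttributeEditorComponent = {
  type: "v2:attributeEditor",
  props: {
    tagPath: "invoice/invoiceDate", // illustrative path, not a real taxonomy
    editorOptions: { allowDirectExtract: true },
  },
};
```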
AI Extraction on Attribute Editors
AI extraction sends the page text and the user’s selected text to an LLM, which extracts values for multiple target fields simultaneously. This is useful when a single text selection contains information for several related fields, for example when selecting an invoice header block to populate the invoice number, date, and vendor name at once. Enable AI extraction by adding an aiExtraction object to editorOptions.
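The invoice header scenario could be configured roughly as follows. This is a sketch only: the interfaces mirror the property names documented for AIExtractionConfig, but the tag paths and descriptions are made up, and treating targetPaths as required at the attribute level is an assumption.

```typescript
// Types mirror the documented property names; they are written out here
// so the example is self-contained, not copied from a published SDK.
interface AIExtractionTarget {
  tagPath: string;
  description?: string;
}

interface AIExtractionConfig {
  prompt?: string;
  promptRef?: string;
  modelType?: "SMALL" | "LARGE";
  targetPaths: AIExtractionTarget[]; // assumed required for attribute editors
}

const aiEditorOptions: { aiExtraction: AIExtractionConfig } = {
  aiExtraction: {
    // Use either `promptRef` or `prompt`, never both.
    promptRef: "acme/extract-invoice",
    modelType: "SMALL",
    targetPaths: [
      // Illustrative tag paths; replace with paths from your own taxonomy.
      { tagPath: "invoice/invoiceNumber", description: "Invoice identifier" },
      { tagPath: "invoice/invoiceDate", description: "Date the invoice was issued" },
      { tagPath: "invoice/vendorName", description: "Name of the issuing vendor" },
    ],
  },
};
```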
The AIExtractionConfig object accepts the following properties:
| Property | Type | Description |
|---|---|---|
| prompt | string | Inline prompt text sent to the LLM. Mutually exclusive with promptRef. |
| promptRef | string | Reference to a stored prompt template (e.g., "acme/extract-invoice"). Mutually exclusive with prompt. |
| modelType | "SMALL" \| "LARGE" | Model size. SMALL is faster and cheaper; LARGE is more capable. Defaults to SMALL. |
| targetPaths | AIExtractionTarget[] | The fields the LLM should populate. Each entry specifies a tagPath and an optional description to help the model understand what to extract. |
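Two of the rules in the table lend themselves to a quick pre-flight check: prompt and promptRef are mutually exclusive, and modelType falls back to SMALL. The helpers below are hypothetical (they are not part of any documented API) and only illustrate those two rules; whether a config with neither prompt nor promptRef is valid is not stated here, so that case is deliberately not flagged.

```typescript
type ModelType = "SMALL" | "LARGE";

interface PromptSettings {
  prompt?: string;
  promptRef?: string;
  modelType?: ModelType;
}

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means the two documented constraints are satisfied.
function validatePromptSettings(config: PromptSettings): string[] {
  const problems: string[] = [];
  if (config.prompt !== undefined && config.promptRef !== undefined) {
    problems.push("prompt and promptRef are mutually exclusive; set only one");
  }
  return problems;
}

// Hypothetical helper: applies the documented default of SMALL.
function effectiveModelType(config: PromptSettings): ModelType {
  return config.modelType ?? "SMALL";
}
```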
When aiExtraction is configured, showAddFromSelection is implied: the editor displays a sparkle button whenever the user has an active text selection in the document viewer. Clicking the button triggers the LLM call, and the returned values are written into each target field. An empty field shows the placeholder “Select text in document, then click to extract” by default.
AI Extraction on Grids
Grids support their own AI extraction configuration for tabular and repeating data. When configured on a v2:grid, an “AI Extract” button appears in the grid toolbar. Clicking it sends the page text and selection to an LLM, which extracts multiple rows and creates a data object for each one.
The AIGridExtractionConfig object shares the same prompt, promptRef, and modelType properties as the attribute-level config. The key difference is targetPaths: when it is omitted, the grid automatically derives targets from all enabled, non-group children of the grid’s taxon, so you only need to specify targetPaths if you want to limit or annotate the extracted fields.
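The target derivation rule can be sketched as a small function. The TaxonNode shape below is a simplified assumption (real taxon objects likely carry more fields and different names), and resolveGridTargets is a hypothetical helper, not the grid's actual implementation. It also assumes that only an absent targetPaths counts as "omitted"; how an empty array is treated is not stated.

```typescript
// Simplified, assumed taxon shape used only for this illustration.
interface TaxonNode {
  path: string;
  description?: string;
  enabled: boolean;
  group: boolean;
  children?: TaxonNode[];
}

interface AIExtractionTarget {
  tagPath: string;
  description?: string;
}

// Returns the explicit targets when provided; otherwise derives one target
// per enabled, non-group direct child of the grid's taxon.
function resolveGridTargets(
  gridTaxon: TaxonNode,
  targetPaths?: AIExtractionTarget[],
): AIExtractionTarget[] {
  if (targetPaths !== undefined) {
    return targetPaths;
  }
  return (gridTaxon.children ?? [])
    .filter((child) => child.enabled && !child.group)
    .map((child): AIExtractionTarget =>
      child.description !== undefined
        ? { tagPath: child.path, description: child.description }
        : { tagPath: child.path },
    );
}
```

Passing an explicit targetPaths array short-circuits the derivation, which matches the documented use of targetPaths to limit or annotate the extracted fields.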
Choosing an Extraction Method
| | Direct Extract | AI Extraction |
|---|---|---|
| Mechanism | Verbatim text copy | LLM inference |
| Fields | Single field | Multiple fields at once |
| Accuracy | Exact match | Interpreted by model |
| Speed | Instant | Network round-trip |
| Best for | Simple values, corrections | Complex or multi-field extraction |
For details on v2:attributeEditor and v2:grid props, see Data Components.