Overview
Data definitions in Kodexa provide the structure and rules for extracting, validating, and processing information from documents. They define what data to extract, how to validate it, and how to present it to users.
What Are Data Definitions?
Data definitions are the blueprints for your document processing workflows. They specify:
- Structure: What data elements exist and how they relate
- Types: What kind of data each field contains (text, numbers, dates, etc.)
- Sources: Where data comes from (document content, metadata, calculations, review)
- Validation: Business rules and data quality checks
- Behavior: Formulas, selection options, event scripts, and validation cascades that run when data changes
Data Definition Structure
Data elements, groups, sources, and extraction behavior
Event-Based Scripting
Reactive JavaScript scripts that run when modeled data changes
Formulas
Calculation logic for derived and computed fields
Validation and Formatting
Business rules, exceptions, and reviewer-facing visual cues
Core Concepts
Data Structure
Data definitions are hierarchical structures of data elements that define what to extract from documents. In configuration and API payloads, those elements are still stored under the taxons field.
Example use cases:
- Invoice data extraction (vendor, line items, totals)
- Contract metadata (parties, dates, terms)
- Form processing (applicant info, answers, signatures)
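As a rough illustration of the hierarchy described above, the sketch below models an invoice data definition as nested Python dictionaries and walks it to list every element path. Only the taxons key comes from this page. The name, group, and children keys and the element_paths helper are assumptions made for illustration and may not match the real schema.

```python
from typing import Iterator

# Hypothetical invoice data definition. Only the top-level "taxons" key is
# taken from the documentation; "name", "group", and "children" are assumed.
invoice_definition = {
    "taxons": [
        {"name": "vendor", "group": False},
        {
            "name": "line_items",
            "group": True,
            "children": [
                {"name": "description", "group": False},
                {"name": "amount", "group": False},
            ],
        },
        {"name": "total", "group": False},
    ]
}


def element_paths(elements: list[dict], prefix: str = "") -> Iterator[str]:
    """Yield a slash-separated path for every element, depth first."""
    for element in elements:
        path = f"{prefix}/{element['name']}" if prefix else element["name"]
        yield path
        # Groups may nest further elements; leaf elements have no children.
        yield from element_paths(element.get("children", []), path)


paths = list(element_paths(invoice_definition["taxons"]))
print(paths)
```

Listing paths this way is one simple check that groups and fields nest as intended before any extraction is configured.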
Data Definition Structure
Learn how data elements, groups, value sources, and extraction behavior fit together
Data Types
Kodexa supports rich data types for accurate extraction and validation, grouped into Basic Types, Specialized Types, and Complex Types. They include:
- STRING - Text of any length
- NUMBER - Numeric values
- BOOLEAN - True/false values
- DATE - Calendar dates
- DATE_TIME - Dates with timestamps
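To make the role of these types concrete, here is a minimal sketch of converting raw extracted text into a typed value for each type name listed above. The type names come from this page. The parsing rules (ISO 8601 dates, comma thousands separators, the accepted boolean spellings) and the coerce helper are assumptions for illustration, not a description of how Kodexa parses values.

```python
from datetime import date, datetime
from decimal import Decimal


def _parse_boolean(raw: str) -> bool:
    """Accept a small, assumed set of spellings for true and false."""
    lowered = raw.strip().lower()
    if lowered in {"true", "yes", "1"}:
        return True
    if lowered in {"false", "no", "0"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Maps each documented type name to an illustrative parser.
PARSERS = {
    "STRING": str,
    "NUMBER": lambda raw: Decimal(raw.replace(",", "")),
    "BOOLEAN": _parse_boolean,
    "DATE": date.fromisoformat,
    "DATE_TIME": datetime.fromisoformat,
}


def coerce(type_name: str, raw: str):
    """Convert raw text to the Python value for the given type name."""
    try:
        parser = PARSERS[type_name]
    except KeyError:
        raise ValueError(f"unsupported type: {type_name}") from None
    return parser(raw)


print(coerce("NUMBER", "1,250.50"))
print(coerce("DATE", "2024-03-15"))
```

Choosing the narrowest type that fits a field lets type-level checks catch malformed values before any business rule runs.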
Data Sources
Define where each data element gets its value:
Document Extraction
Extract directly from document content using AI/ML models
Metadata
Pull from document properties and system fields
Formulas
Calculate from other fields
Review
Fields populated during human review
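The four sources above can be pictured as a simple dispatch: each element declares its source, and the value is looked up or computed accordingly. The sketch below assumes hypothetical names throughout (the ValueSource enum, its members, and resolve_value); it shows the idea only and is not the Kodexa API.

```python
from enum import Enum
from typing import Callable


class ValueSource(Enum):
    """Hypothetical labels for the four sources described above."""
    DOCUMENT = "document"
    METADATA = "metadata"
    FORMULA = "formula"
    REVIEW = "review"


def resolve_value(
    source: ValueSource,
    name: str,
    *,
    extracted: dict,
    metadata: dict,
    formulas: dict[str, Callable[[dict], object]],
    reviewed: dict,
):
    """Return the value of element `name` according to its declared source."""
    if source is ValueSource.DOCUMENT:
        return extracted.get(name)
    if source is ValueSource.METADATA:
        return metadata.get(name)
    if source is ValueSource.FORMULA:
        # A formula is calculated from other (already extracted) fields.
        return formulas[name](extracted)
    return reviewed.get(name)


extracted = {"subtotal": 100.0, "tax": 8.0}
formulas = {"total": lambda fields: fields["subtotal"] + fields["tax"]}

total = resolve_value(
    ValueSource.FORMULA,
    "total",
    extracted=extracted,
    metadata={"filename": "invoice-001.pdf"},
    formulas=formulas,
    reviewed={},
)
print(total)
```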
Common Patterns
Invoice Processing
Extract structured data from invoices.
Contract Metadata
Capture key contract information.
Form Data
Process form submissions.
Validation and Quality
Validation Rules
Define business rules to ensure data quality.
Validation and Conditional Formatting
Learn the exact validationRules schema, conditional formatting schema, formula language, and runtime behavior.
Conditional Formatting
Apply visual cues based on data values.
Best Practices
Design Principles
Start Simple, Iterate
Begin with core fields and add complexity as needed. Don’t over-engineer initial data definitions.
Start with:
- Essential fields only
- Basic data types
- Simple validation
Add later:
- Computed fields
- Complex validations
- Conditional formatting
Use Semantic Definitions Well
Write clear, specific extraction prompts.
Organize with Groups
Use groups to:
- Organize related fields logically
- Handle repeating structures (line items, signatories)
- Improve UI presentation
Repeating groups: Collections
Validate Strategically
Critical validations (non-overridable):
- Required fields
- Data type constraints
- Business logic rules
Warnings (overridable):
- Unusual values
- Formatting issues
- Threshold warnings
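One way to picture this split is a rule list in which each rule records whether a reviewer may override it. The sketch below assumes that checks such as unusual values and thresholds are overridable warnings. The Rule class, the validate function, and the 10,000 threshold are hypothetical and do not reflect the validationRules schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    message: str
    check: Callable[[dict], bool]  # returns True when the record passes
    overridable: bool              # False means a critical validation


RULES = [
    # Critical: a required field must be present.
    Rule("total is required", lambda r: r.get("total") is not None, overridable=False),
    # Warning: an unusually high value is flagged but can be accepted.
    Rule(
        "total is unusually high",
        lambda r: r.get("total") is None or r["total"] <= 10_000,
        overridable=True,
    ),
]


def validate(record: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the failing rules."""
    errors: list[str] = []
    warnings: list[str] = []
    for rule in RULES:
        if not rule.check(record):
            (warnings if rule.overridable else errors).append(rule.message)
    return errors, warnings


errors, warnings = validate({"total": 25_000})
print(errors, warnings)
```

Keeping the two lists separate lets a review screen block submission on errors while still allowing a reviewer to accept warnings.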
Naming Conventions
Use consistent naming across your data definitions.
Getting Started
Understand Your Documents
Analyze the documents you’ll process:
- What data needs to be extracted?
- What’s the document structure?
- What validations are needed?
Design Your Data Definition
Sketch out the data structure:
- List all required fields
- Group related fields
- Identify repeating sections
Configure Data Elements
For each field, define:
- Data type
- Value source
- Semantic definition
- Validation rules
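The checklist above can be expressed as a small completeness check over a field's configuration. This is a sketch under stated assumptions: the DataElement class, its attribute names, and missing_settings are hypothetical helpers for planning a definition, not part of any Kodexa SDK.

```python
from dataclasses import dataclass, field


@dataclass
class DataElement:
    """Hypothetical planning record for one field in a data definition."""
    name: str
    data_type: str = ""
    value_source: str = ""
    semantic_definition: str = ""
    validation_rules: list[str] = field(default_factory=list)


def missing_settings(element: DataElement) -> list[str]:
    """List which of the four checklist items are still unset."""
    missing = []
    if not element.data_type:
        missing.append("data type")
    if not element.value_source:
        missing.append("value source")
    if not element.semantic_definition:
        missing.append("semantic definition")
    if not element.validation_rules:
        missing.append("validation rules")
    return missing


invoice_date = DataElement(name="invoice_date", data_type="DATE", value_source="document")
print(missing_settings(invoice_date))
```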
Learn More
Data Definition Structure
Data elements, groups, value sources, and configuration options
Formula Reference
Built-in functions for calculations and validations
Validation and Formatting
Complete guide to validation rules, conditional formats, and formula behavior
Event-Based Scripting
Add reactive JavaScript behavior to the data model
API Documentation
Programmatic access to data definition management
Examples
Invoice Data Definition
Complete invoice extraction example
Contract Data Definition
Contract metadata extraction example
Form Data Definition
Form data processing example
Purchase Order Data Definition
Purchase order extraction example
