Kodexa is an AI-powered document processing platform that transforms unstructured documents into high-quality structured data. It combines AI capabilities with document processing infrastructure so that organizations can extract, process, and manage information from their documents.
Core Platform Concepts
Document Processing & AI Integration
- Documents are parsed into standardized structures that can be processed by various AI models
- Generative AI capabilities are deeply integrated to handle unstructured data, tabular content, and even low-quality scans
- The platform is model vendor-agnostic, allowing you to use the best AI models for specific use cases
- Built-in guardrails ensure data lineage and quality
Task-Based Architecture
The platform is built around the concept of Tasks, which provide:
- A framework for AI/human collaboration
- Context for processing multiple related documents
- Integration points for workflow management
- Support for comments, assignments, and progress tracking
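To make the Task concept concrete, here is a minimal sketch of a task that groups related documents and tracks comments, assignment, and progress. The `Task` and `Comment` classes, their fields, and the status values are hypothetical illustrations, not the Kodexa data model or SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class Task:
    """Hypothetical task record; field names and statuses are assumptions."""
    title: str
    document_ids: List[str] = field(default_factory=list)  # related documents
    assignee: Optional[str] = None
    status: str = "TODO"
    comments: List[Comment] = field(default_factory=list)

    def add_comment(self, author: str, text: str) -> None:
        self.comments.append(Comment(author, text))

    def assign(self, user: str) -> None:
        # Assigning a task also marks it as in progress.
        self.assignee = user
        self.status = "IN_PROGRESS"

    def complete(self) -> None:
        self.status = "DONE"


# An AI step leaves a comment, then a human reviewer picks the task up.
task = Task(title="Review invoice batch", document_ids=["doc-1", "doc-2"])
task.add_comment("ai-assistant", "Total on doc-2 has low confidence.")
task.assign("reviewer@example.com")
```

In this sketch both the AI step and the human reviewer act on the same record, which is one simple way to model AI/human collaboration around a shared set of documents.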
Infrastructure Components
Core Services
- Operational Data Store for managing metadata, lineage, and orchestration
- Event-bus infrastructure for scalable processing
- S3-based Storage Layer for Data Lake capabilities
- OpenSearch Index Services for monitoring and reporting
Developer Tools
- RESTful API supporting all platform capabilities
- Python SDK for easy integration
- Studio interface for designing, testing, and debugging implementations
- Workflow interface for managing human-in-the-loop tasks
Platform Features
- Rich Human-in-the-Loop tools with feedback mechanisms
- Powerful validation and rules engine
- Comprehensive logging and analytics
- Blue/Green deployment support for AI models
- Event-based processing architecture
Resource-Driven Design
The Kodexa platform lets you define shareable “resources”, which are the building blocks of AI-driven document automation.
Metadata Classes
The Kodexa platform uses a hierarchy of metadata classes to represent various components and configurations:
Action
Represents a specific action in the system. Actions are discrete operations that can be performed within the Kodexa platform, such as processing documents, triggering workflows, or executing custom logic.
AssistantDefinition
Defines an AI assistant's capabilities. This class encapsulates the configuration, behavior, and functionality of AI assistants used in the platform for various tasks such as document analysis, question answering, or task automation.
CredentialDefinition
Defines credential types and their properties. This class is used to specify different types of authentication and authorization credentials used across the platform, ensuring secure access to various resources and services.
Dashboard
Represents a dashboard configuration. Dashboards provide a visual interface for users to monitor, analyze, and interact with data and processes within the Kodexa platform.
DataForm
Defines structure for data input forms. This class is used to create and manage forms for data entry, ensuring consistent and structured data collection across the platform.
ExtensionPack
Represents a package of platform extensions. Extension packs allow for the addition of new functionality, integrations, or customizations to the Kodexa platform, enhancing its capabilities and adaptability.
GuidanceSet
Defines a set of guidance rules or instructions. Guidance sets provide structured information to guide users or automated processes through complex tasks or decision-making scenarios.
ModelRuntime
Represents a runtime environment for models. This class defines the configuration and requirements for executing machine learning or AI models within the Kodexa platform, ensuring proper resource allocation and execution.
Pipeline
Defines a sequence of processing steps. Pipelines orchestrate the flow of data and operations, allowing for complex, multi-stage processing of documents or data within the platform.
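As an illustration of multi-stage processing, the sketch below chains plain Python functions into a sequence of steps. The step functions and `run_pipeline` are hypothetical names chosen for this example; they do not represent the Kodexa Pipeline API.

```python
from typing import Callable, Dict, List

# A step takes a document (here just a dict) and returns an updated copy.
Step = Callable[[Dict], Dict]


def normalize_text(doc: Dict) -> Dict:
    # Collapse runs of whitespace into single spaces.
    return {**doc, "text": " ".join(doc["text"].split())}


def count_words(doc: Dict) -> Dict:
    # Add a derived field computed from the normalized text.
    return {**doc, "word_count": len(doc["text"].split())}


def run_pipeline(doc: Dict, steps: List[Step]) -> Dict:
    # Apply each step in order, feeding its output to the next step.
    for step in steps:
        doc = step(doc)
    return doc


result = run_pipeline(
    {"text": "  Invoice   total:  42 USD "},
    [normalize_text, count_words],
)
```

Because each step returns a new dict rather than mutating its input, the steps can be reordered or tested in isolation.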
ProjectTemplate
Represents a template for creating projects. Project templates provide predefined structures, configurations, and resources to streamline the creation of new projects within the Kodexa platform.
Prompt
Defines a prompt template for AI interactions. This class is used to create structured prompts for AI models, ensuring consistent and effective communication between users and AI assistants.
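A prompt template can be pictured as a string with named placeholders that are filled in per request. The sketch below uses Python's standard `string.Template`; the template wording, `EXTRACT_PROMPT`, and `render_prompt` are assumptions made for illustration, not the format Kodexa uses.

```python
from string import Template

# Hypothetical template: $field_name and $document_text are placeholders.
EXTRACT_PROMPT = Template(
    "Extract the field '$field_name' from the document below.\n"
    "Respond with the value only.\n\n"
    "Document:\n$document_text"
)


def render_prompt(field_name: str, document_text: str) -> str:
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which catches incomplete prompts before they reach a model.
    return EXTRACT_PROMPT.substitute(
        field_name=field_name, document_text=document_text
    )


prompt = render_prompt("invoice_number", "Invoice No. INV-001")
```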
RuleSet
Represents a set of business or processing rules. Rule sets define logical conditions and actions to be applied to data or processes, enabling dynamic and configurable behavior within the platform.
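The idea of conditions paired with actions can be sketched as a list of rules evaluated against a record. In this hypothetical example the "action" is simply reporting a message; the `Rule` class and `evaluate` function are illustrative and are not the Kodexa rules engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict], bool]  # returns True when the rule fires
    message: str                       # reported when the rule fires


def evaluate(rules: List[Rule], record: Dict) -> List[str]:
    # Collect the message of every rule whose condition holds.
    return [rule.message for rule in rules if rule.condition(record)]


rules = [
    Rule("missing_total", lambda r: r.get("total") is None, "Total is missing"),
    Rule("negative_total", lambda r: (r.get("total") or 0) < 0, "Total is negative"),
]

issues = evaluate(rules, {"invoice_number": "INV-001", "total": -5})
```

Keeping rules as data, separate from the code that evaluates them, is what makes behavior configurable: adding a check means appending a `Rule`, not changing `evaluate`.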
Store
Represents a data store configuration. This class defines the properties and settings for various data storage solutions used within the Kodexa platform, ensuring proper data management and access.
Taxonomy
Defines a hierarchical classification system. Taxonomies provide a structured way to categorize and organize information within the platform, facilitating efficient data retrieval and analysis.
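A hierarchical classification can be modeled as a tree whose nodes each have a name and child nodes. The sketch below is a hypothetical structure (the `Taxon` class, the `/` path separator, and the invoice labels are assumptions) that lists every path in such a tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Taxon:
    name: str
    children: List["Taxon"] = field(default_factory=list)


def iter_paths(taxon: Taxon, prefix: str = "") -> Iterator[str]:
    # Depth-first walk that yields the full path of each node.
    path = f"{prefix}/{taxon.name}" if prefix else taxon.name
    yield path
    for child in taxon.children:
        yield from iter_paths(child, path)


invoice = Taxon(
    "Invoice",
    [
        Taxon("Vendor", [Taxon("Name"), Taxon("Address")]),
        Taxon("Total"),
    ],
)

paths = list(iter_paths(invoice))
```

Full paths such as `Invoice/Vendor/Name` give each label an unambiguous identifier, which is useful when the same leaf name appears under different parents.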