Kodexa is an AI-powered document processing platform that transforms unstructured documents into high-quality structured data. It combines AI models with document processing infrastructure so that organizations can extract, process, and manage the information in their documents.

Core Platform Concepts

Document Processing & AI Integration

  • Documents are parsed into standardized structures that can be processed by various AI models
  • Generative AI capabilities are deeply integrated to handle unstructured data, tabular content, and even low-quality scans
  • The platform is model vendor-agnostic, allowing you to use the best AI models for specific use cases
  • Built-in guardrails preserve data lineage and help maintain data quality

Task-Based Architecture

The platform is built around the concept of Tasks, which provide:
  • A framework for AI/human collaboration
  • Context for processing multiple related documents
  • Integration points for workflow management
  • Support for comments, assignments, and progress tracking
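
To make the Task concept above more concrete, here is a minimal sketch of what a task record grouping related documents, an assignee, comments, and a status might look like. It is illustrative only: `Task`, `TaskStatus`, `Comment`, and every field and method name are hypothetical and are not taken from the Kodexa SDK or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TaskStatus(Enum):
    OPEN = "open"
    IN_PROGRESS = "in_progress"
    DONE = "done"


@dataclass
class Comment:
    author: str  # a person or an AI model (hypothetical field)
    text: str


@dataclass
class Task:
    """Hypothetical task record; not the Kodexa data model."""

    title: str
    document_ids: List[str] = field(default_factory=list)  # related documents
    assignee: Optional[str] = None
    status: TaskStatus = TaskStatus.OPEN
    comments: List[Comment] = field(default_factory=list)

    def assign(self, user: str) -> None:
        # Assigning the task also marks it as being worked on.
        self.assignee = user
        self.status = TaskStatus.IN_PROGRESS

    def add_comment(self, author: str, text: str) -> None:
        self.comments.append(Comment(author, text))

    def complete(self) -> None:
        self.status = TaskStatus.DONE
```

Keeping the documents, comments, and status on one record is what lets a reviewer and a model work from the same context in this sketch.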

Infrastructure Components

  • Operational Data Store for managing metadata, lineage, and orchestration
  • Event-bus infrastructure for scalable processing
  • S3-based Storage Layer for Data Lake capabilities
  • OpenSearch Index Services for monitoring and reporting
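
The event-bus component listed above follows the general publish/subscribe pattern. The sketch below shows that pattern in a toy, in-process form. It assumes nothing about how Kodexa implements its bus: `EventBus`, its methods, and the `document.uploaded` event name mentioned afterwards are all hypothetical, and a production system would typically use a message broker rather than direct function calls.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    """Toy in-process publish/subscribe bus (illustrative, not Kodexa's)."""

    def __init__(self) -> None:
        # Maps an event type to the handlers subscribed to it.
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Dict[str, Any]) -> int:
        """Call every subscriber of event_type; return how many were called."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

A consumer would subscribe a handler to an event type such as `document.uploaded` and react whenever a producer publishes it; producers and consumers never call each other directly, which is what allows processing steps to scale independently.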

Developer Tools

  • Studio web interface for managing organizations, projects, and document review
  • RESTful API supporting all platform capabilities
  • Python SDK for programmatic integration
  • CLI tools for local development and GitOps workflows

Platform Features

  • Human-in-the-loop review tools, with feedback mechanisms, available through the Workspace
  • Powerful validation and rules engine
  • Comprehensive reporting and analytics
  • Blue/Green deployment support for AI models
  • Event-based processing architecture
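
As a rough illustration of the idea behind a validation and rules engine, the sketch below checks an extracted record against a list of named rules and reports which ones fail. It is a generic pattern, not Kodexa's rules engine: `Rule`, `validate`, the `total` field, and the example rules are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    name: str
    check: Callable[[Dict[str, Any]], bool]  # returns True when the record passes
    message: str


def validate(record: Dict[str, Any], rules: List[Rule]) -> List[str]:
    """Return one message per rule that the record fails."""
    return [f"{r.name}: {r.message}" for r in rules if not r.check(record)]


# Hypothetical rules for an extracted invoice record.
RULES = [
    Rule("total_present", lambda r: r.get("total") is not None, "total is missing"),
    Rule(
        "total_non_negative",
        # A missing total is reported by the rule above, so treat it as 0 here.
        lambda r: (r.get("total") or 0) >= 0,
        "total must not be negative",
    ),
]
```

Calling `validate(record, RULES)` returns an empty list when every rule passes; failed rules could then be routed to a human reviewer.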