The Kodexa Document SDK is a cross-platform library for working with hierarchical structured documents stored in KDDB (Kodexa Document Database) format. It is aimed at document processing pipelines, extraction workflows, and content analysis tools.

What is KDDB?

KDDB is a SQLite-based format for storing and manipulating structured documents. It provides:
  • Hierarchical Structure: Documents are organized as trees of content nodes
  • Rich Metadata: Support for features, tags, and document-level metadata
  • High Performance: In-memory mode delivers ~100x faster operations
  • Cross-Platform: Same document format works across Python, TypeScript, and more

Key Features

Documents are composed of ContentNodes arranged in a tree structure. Each node can have:
  • Content: Text or data stored in the node
  • Type: Classification like “paragraph”, “heading”, “table”, etc.
  • Features: Key-value metadata attached to nodes
  • Tags: Annotations with optional confidence scores and values
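To make the node model concrete, here is a minimal sketch of such a tree in plain Python. The `Node` and `Tag` classes, their field names, and the `walk` helper are hypothetical stand-ins for illustration only; they are not the SDK's own classes, whose names and signatures may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, Optional


@dataclass
class Tag:
    """Illustrative annotation with an optional value and confidence score."""
    name: str
    value: Optional[str] = None
    confidence: Optional[float] = None


@dataclass
class Node:
    """Illustrative content node: a type, optional content, features, tags, children."""
    node_type: str
    content: Optional[str] = None
    features: dict[str, Any] = field(default_factory=dict)
    tags: list[Tag] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        # Attach the child and return it so calls can be chained.
        self.children.append(child)
        return child

    def walk(self) -> Iterator["Node"]:
        # Depth-first, pre-order traversal of this node and its descendants.
        yield self
        for child in self.children:
            yield from child.walk()


# Build a small tree: document -> section -> (heading, paragraph)
root = Node("document")
section = root.add_child(Node("section"))
heading = section.add_child(Node("heading", "Invoice 1001"))
paragraph = section.add_child(Node("paragraph", "Total due: 42.00"))

# Features are key-value metadata; tags carry an optional value and confidence.
paragraph.features["page"] = 1
paragraph.tags.append(Tag("total", value="42.00", confidence=0.93))

node_types = [node.node_type for node in root.walk()]
```

Keeping features as a plain dictionary and tags as separate objects mirrors the distinction drawn above: features describe a node, while tags annotate it and may carry a confidence score.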
Query documents using an XPath-style selector syntax:
//paragraph                           # All paragraphs
//paragraph[contains(@content, 'invoice')]  # Paragraphs containing "invoice"
//section/paragraph                   # Direct child paragraphs of sections
//*[@tag='important']                 # Any node tagged as important
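The selectors above follow XPath conventions, so the difference between `//` (any descendant) and `/` (direct child) can be illustrated with the limited XPath subset in Python's standard `xml.etree.ElementTree`. This is only an analogy under that assumption: it does not use the Kodexa selector engine, the sample XML is made up, and ElementTree has no `contains()` function, so that filter is written in plain Python.

```python
import xml.etree.ElementTree as ET

# Made-up sample tree; element names mirror the node types used above.
SAMPLE = """
<document>
  <section>
    <heading tag="important">Billing</heading>
    <paragraph>Your invoice is attached.</paragraph>
    <table><paragraph>Nested cell text</paragraph></table>
  </section>
  <paragraph>Thank you.</paragraph>
</document>
"""

root = ET.fromstring(SAMPLE.strip())

# //paragraph : every paragraph at any depth
all_paragraphs = root.findall(".//paragraph")

# //section/paragraph : only paragraphs that are direct children of a section
direct_paragraphs = root.findall(".//section/paragraph")

# //*[@tag='important'] : any element carrying the attribute tag="important"
important = root.findall(".//*[@tag='important']")

# contains(...) is not supported by ElementTree, so filter in Python instead
invoice_paragraphs = [
    p for p in root.iter("paragraph") if "invoice" in (p.text or "")
]
```

Note that the paragraph nested inside the table is matched by the descendant selector but not by the direct-child selector.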
Choose the right mode for your use case:
  • In-Memory: ~1ms document creation, ideal for processing pipelines
  • File-Based: Persistent storage, ideal for long-term document management
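Since KDDB is SQLite-based, the trade-off between the two modes can be sketched with Python's standard `sqlite3` module. This is an assumption-laden illustration of the underlying idea, not the SDK's API: the `nodes` table, the helper functions, and the file name are all hypothetical, and no timing is measured here.

```python
import os
import sqlite3
import tempfile

# Hypothetical single-table schema, used only for this illustration.
SCHEMA = "CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, content TEXT)"


def store_node(target: str, content: str) -> None:
    """Open a database, insert one row, commit, and close the connection."""
    conn = sqlite3.connect(target)
    try:
        conn.execute(SCHEMA)
        conn.execute("INSERT INTO nodes (content) VALUES (?)", (content,))
        conn.commit()
    finally:
        conn.close()


def count_nodes(target: str) -> int:
    """Open a fresh connection and count the rows it can see."""
    conn = sqlite3.connect(target)
    try:
        conn.execute(SCHEMA)
        return conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
    finally:
        conn.close()


def compare_modes() -> tuple[int, int]:
    # In-memory: each ":memory:" connection is a separate database that
    # disappears when the connection closes, so nothing is visible afterwards.
    store_node(":memory:", "scratch paragraph")
    in_memory_count = count_nodes(":memory:")

    # File-based: the row is still there when the file is reopened.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.db")
        store_node(path, "persistent paragraph")
        file_count = count_nodes(path)

    return in_memory_count, file_count


in_memory_count, file_count = compare_modes()
```

In a pipeline, the in-memory style suits intermediate documents that are discarded after each step, while anything that must survive the process needs to be written to a file.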
Load and save documents in multiple ways:
  • KDDB files (native SQLite format)
  • Bytes/Blobs (for API responses)
  • JSON (for debugging and interoperability)
  • Text (automatic paragraph parsing)
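As a rough picture of how the text, JSON, and bytes forms relate, the sketch below turns plain text into a paragraph tree and round-trips it through JSON and UTF-8 bytes using only the standard library. The blank-line splitting rule, the dictionary layout, and the `text_to_document` helper are assumptions made for illustration; the SDK's actual paragraph parsing and its JSON and KDDB byte layouts are not described here and may differ.

```python
import json
import re


def text_to_document(text: str) -> dict:
    """Split text on blank lines and wrap each block as a paragraph node.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    return {
        "type": "document",
        "children": [{"type": "paragraph", "content": b} for b in blocks],
    }


doc = text_to_document(
    "First paragraph.\n\nSecond paragraph,\nwrapped over two lines.\n"
)

# JSON form: human-readable, convenient for debugging and interchange.
as_json = json.dumps(doc, indent=2)

# Bytes form: what would typically travel in an API request or response body.
# (These are JSON bytes, not KDDB bytes.)
as_bytes = as_json.encode("utf-8")

# Round-trip back to the in-memory structure.
restored = json.loads(as_bytes.decode("utf-8"))
```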

Quick Comparison

| Feature | Python | TypeScript |
|---|---|---|
| Package | kodexa-document | @kodexa-ai/document-wasm-ts |
| Runtime | Python 3.12+ | Node.js 16+ / Modern Browsers |
| Backend | Go via CFFI | Go via WebAssembly |
| Performance | ~100x faster in-memory | ~5x faster than pure JS |
| Use Cases | Pipelines, ML, Servers | Web Apps, APIs, Frontends |

Next Steps