Content Structures & Mixins - Kodexa Developer Portal

Not all content is a PDF. The Kodexa Document format (KDDB) uses mixins to adapt the same underlying content node tree to different kinds of content — spatial documents with bounding boxes, editable markdown content, email messages, and more. This guide explains the content structure system, the available mixins, and how the UI, selectors, and extraction pipeline work with each one.

What is a Mixin?

A mixin is a label on a document that tells the platform how to interpret its content node tree. It determines:

Which viewer/editor the UI renders for the document
What node types are expected in the tree
What features are available on content nodes
How the content was produced (spatial parsing, markdown conversion, email ingestion, etc.)

Mixins are set on the document’s mixins field and are composable — a document can use multiple mixins when one builds on another.

Content Structure Overview

Mixin	Root Node	Tree Shape	UI Component	Use Cases
`spatial`	`document`	Deep — page → content-area → line → word	Spatial viewer with bounding boxes	PDFs, scans, images, PowerPoints
`text`	`document`	Shallow — text nodes	Text viewer	Plain text files
`markdown`	`document`	Shallow — block-level nodes	Block editor	News articles, rich-text content, general writing
`email`	`email`	Shallow — block-level nodes	Email header panel + block editor	Email messages
`workbook`	`workbook`	Medium — workbook → worksheet → row → cell	Spreadsheet viewer	Excel files, spreadsheets

All content structures share the same underlying KDDB format, the same selector language, and the same extraction pipeline. The mixin simply changes how the content node tree is organized and how the UI presents it.
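The overview table can also be read as a lookup from mixin name to root node and UI component. The sketch below encodes the table as plain Python data. `ContentStructure`, `STRUCTURES`, and `describe` are illustrative names invented for this example and are not part of the Kodexa SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentStructure:
    """One row of the overview table (illustrative, not a Kodexa SDK type)."""
    mixin: str
    root_node: str
    ui_component: str


# Values copied from the overview table above.
STRUCTURES = {
    s.mixin: s
    for s in [
        ContentStructure("spatial", "document", "Spatial viewer with bounding boxes"),
        ContentStructure("text", "document", "Text viewer"),
        ContentStructure("markdown", "document", "Block editor"),
        ContentStructure("email", "email", "Email header panel + block editor"),
        ContentStructure("workbook", "workbook", "Spreadsheet viewer"),
    ]
}


def describe(mixin: str) -> str:
    """Return a one-line summary for a known mixin, or raise KeyError."""
    s = STRUCTURES[mixin]
    return f"{s.mixin}: root <{s.root_node}>, shown in {s.ui_component}"


print(describe("email"))
```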

Spatial Content Structure

The spatial mixin is the original Kodexa content structure, designed for documents where physical layout matters — PDFs, scanned images, PowerPoints, and similar formats.

Tree structure

document
  └── page (index: 0)
      ├── content-area
      │   ├── line: "ACME Corp"
      │   │   ├── word: "ACME"
      │   │   └── word: "Corp"
      │   └── line: "Invoice #12345"
      │       ├── word: "Invoice"
      │       └── word: "#12345"
      └── content-area
          ├── line: "Widget A  $1,234.00"
          └── line: "Widget B  $567.89"

Key characteristics

Deep tree — document → page → content-area → line → word
Bounding boxes — Every node carries spatial coordinates [x, y, width, height] describing its position on the page
Page-based — Content is organized by physical pages
Word-level granularity — Individual words are nodes, enabling precise tagging and spatial queries
Read-only in UI — Users view the spatial layout and tag content, but don’t edit the text directly

Spatial features

Content nodes in a spatial document carry spatial features:

spatial:bbox  → [0.5, 1.2, 3.4, 1.5]    # Position on page
spatial:rotate → 90                       # Rotation in degrees
format:font   → "Arial"                  # Font information
format:bold   → true                     # Text formatting
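As a rough illustration of working with bounding boxes, the sketch below computes the box that encloses a set of word boxes, which is one way a line-level box could be derived. It assumes the [x, y, width, height] layout described above. The coordinate values are made up, and `union_bbox` is a hypothetical helper, not an SDK function.

```python
from typing import List, Sequence

BBox = List[float]  # [x, y, width, height], as described above


def union_bbox(boxes: Sequence[BBox]) -> BBox:
    """Smallest [x, y, width, height] box that encloses every input box."""
    if not boxes:
        raise ValueError("need at least one box")
    left = min(b[0] for b in boxes)
    top = min(b[1] for b in boxes)
    right = max(b[0] + b[2] for b in boxes)
    bottom = max(b[1] + b[3] for b in boxes)
    return [left, top, right - left, bottom - top]


# Two made-up word boxes that sit on the same line.
words = [[1.0, 2.0, 3.0, 1.0], [5.0, 2.0, 2.0, 1.0]]
line_box = union_bbox(words)
print(line_box)
```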

Example selectors

//page                              # All pages
//line                              # All lines across all pages
//word[contains(@content, 'Total')] # Words containing 'Total'
//page[0]//line                     # All lines on the first page
//*[hasTag('invoice_number')]       # Nodes tagged as invoice number
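To make the selector semantics concrete, here is a toy model of what a few of these selectors ask for, written as plain tree walks. `Node`, `descendants`, and `by_type` are stand-ins invented for this sketch. In practice you would pass the selector string to the platform rather than walk the tree yourself.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set


@dataclass
class Node:
    """Toy stand-in for a content node (not the Kodexa SDK class)."""
    node_type: str
    content: str = ""
    children: List["Node"] = field(default_factory=list)
    tags: Set[str] = field(default_factory=set)


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node below `node` in document order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def by_type(root: Node, node_type: str) -> List[Node]:
    """Rough equivalent of //node_type."""
    return [n for n in descendants(root) if n.node_type == node_type]


doc = Node("document", children=[
    Node("page", children=[
        Node("content-area", children=[
            Node("line", "Invoice #12345", children=[
                Node("word", "Invoice"),
                Node("word", "#12345", tags={"invoice_number"}),
            ]),
        ]),
    ]),
])

# Rough equivalent of //word[contains(@content, 'Invoice')]
matches = [w.content for w in by_type(doc, "word") if "Invoice" in w.content]
# Rough equivalent of //*[hasTag('invoice_number')]
tagged = [n.content for n in descendants(doc) if "invoice_number" in n.tags]
```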

Text Content Structure

The text mixin is the simplest structure, used for plain text content without spatial information or rich formatting.

Tree structure

document
  └── (text content nodes)

Key characteristics

Shallow tree — Minimal hierarchy
No bounding boxes — No spatial positioning
No formatting — Plain text only
View-only in UI — Rendered as plain text

Markdown Content Structure

The markdown mixin represents rich-text content as a tree of block-level content nodes. Each block is an independently editable region in the UI, and inline formatting (bold, italic, links, inline code) is stored as markdown syntax within the block’s content.

Why block-level?

A full markdown AST would create nodes for every bold span, link, and inline code segment. This is unnecessarily complex for editing — every keystroke in a bold word would require tree restructuring. Instead, the markdown mixin uses block-level nodes only:

Block nodes — heading, paragraph, list, code-block, etc. — are ContentNodes in the tree
Inline formatting — **bold**, *italic*, [links](url), `code` — stays as markdown syntax within each block’s content string

This keeps the tree manageable, makes editing straightforward, and still allows selectors to query at the block level.
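The block-level idea can be sketched in a few lines: split the source into blocks and leave inline syntax alone. The example below is a deliberately minimal splitter that only recognizes headings and paragraphs separated by blank lines. It is not the converter Kodexa uses, and the dictionary shape is an assumption made for illustration.

```python
import re
from typing import Dict, List

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_blocks(markdown: str) -> List[Dict[str, object]]:
    """Split markdown into heading/paragraph blocks separated by blank lines.

    Inline syntax (**bold**, [links](url), `code`) is left untouched in
    each block's content string.
    """
    blocks: List[Dict[str, object]] = []
    for chunk in re.split(r"\n\s*\n", markdown.strip()):
        text = " ".join(line.strip() for line in chunk.splitlines())
        if not text:
            continue  # empty input produces no blocks
        match = HEADING.match(text)
        if match:
            blocks.append({"type": "heading", "level": len(match.group(1)),
                           "content": match.group(2)})
        else:
            blocks.append({"type": "paragraph", "content": text})
    return blocks


source = "# Market Update\n\nStocks saw **significant gains** today.\n"
blocks = split_blocks(source)
```

Note that the bold span stays inside the paragraph's content string, so no extra nodes are created for it.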

Tree structure

document
  ├── heading (level: 1, "Breaking News: Market Update")
  ├── paragraph ("The stock market saw **significant gains** today...")
  ├── heading (level: 2, "Key Highlights")
  ├── list (ordered: false)
  │   ├── list-item ("S&P 500 up 2.3%")
  │   ├── list-item ("Tech sector leads gains")
  │   └── list-item ("Bond yields decline")
  ├── blockquote ("> Analysts expect continued growth...")
  ├── image (src: "chart.png", alt: "Market chart")
  ├── code-block (language: "json", "{ \"sp500\": 5234.12 }")
  ├── table
  │   ├── row
  │   │   ├── cell ("Index")
  │   │   └── cell ("Change")
  │   └── row
  │       ├── cell ("S&P 500")
  │       └── cell ("+2.3%")
  └── horizontal-rule

Block node types

Block node types fall into three groups: text blocks, container blocks, and media and structural blocks. The three tables below cover them in that order.

Node Type	Features	Content
`heading`	`markdown:level` (1-6)	Inline markdown text
`paragraph`	—	Inline markdown text
`blockquote`	—	Inline markdown text

These blocks contain text with optional inline markdown formatting. The content is stored in content_parts and rendered as rich text in the editor.

heading content:    "Breaking **News**: Market Update"
paragraph content:  "The [S&P 500](https://...) hit a new high."
blockquote content: "Analysts expect *continued* growth."

Node Type	Features	Content
`list`	`markdown:ordered` (bool)	Empty — children are `list-item` nodes
`list-item`	`markdown:checked` (bool, optional)	Inline markdown text
`table`	—	Empty — uses `row` → `cell` children

Container blocks have children that hold the actual content. Lists contain list-item nodes; tables use the existing row → cell node types from the spatial structure.

The optional markdown:checked feature on list-item supports task list items (- [x] Done, - [ ] Todo).

Node Type	Features	Content
`code-block`	`markdown:language` (string)	Raw code text
`image`	`markdown:src`, `markdown:alt`	Empty
`horizontal-rule`	—	Empty

Code blocks store raw text (no inline markdown processing) with an optional language identifier for syntax highlighting. Images reference external URLs via features. Horizontal rules are simple dividers with no content.
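One way to see how the features in these tables carry meaning is to render blocks back to markdown text. The sketch below does that for a few block types. `Block` and `render` are hypothetical names, and storing features in a dictionary keyed by strings such as `markdown:level` is a simplification for this example, not the SDK's feature API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Block:
    """Toy block node: a type, a content string, features, and children."""
    node_type: str
    content: str = ""
    features: Dict[str, object] = field(default_factory=dict)
    children: List["Block"] = field(default_factory=list)


def render(block: Block) -> str:
    """Turn one block back into markdown text."""
    if block.node_type == "heading":
        return "#" * int(block.features["markdown:level"]) + " " + block.content
    if block.node_type == "list":
        ordered = bool(block.features.get("markdown:ordered", False))
        lines = []
        for number, item in enumerate(block.children, start=1):
            marker = f"{number}." if ordered else "-"
            checked = item.features.get("markdown:checked")
            box = "" if checked is None else ("[x] " if checked else "[ ] ")
            lines.append(f"{marker} {box}{item.content}")
        return "\n".join(lines)
    if block.node_type == "code-block":
        language = block.features.get("markdown:language", "")
        return f"~~~{language}\n{block.content}\n~~~"
    if block.node_type == "horizontal-rule":
        return "---"
    return block.content  # paragraph and anything not handled above


todo = Block("list", features={"markdown:ordered": False}, children=[
    Block("list-item", "Done", {"markdown:checked": True}),
    Block("list-item", "Todo", {"markdown:checked": False}),
])
print(render(todo))
```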

Key characteristics

Shallow tree — document → blocks, with nesting only for lists and tables
No bounding boxes — Content is not spatially positioned
Editable in UI — Block-based editor where each ContentNode is an editable block
Inline markdown — Rich formatting preserved as markdown syntax within block content

Example selectors

//heading                                                      # All headings
//heading[hasFeatureValue('markdown', 'level', '1')]           # All h1 headings
//paragraph                                                    # All paragraphs
//list-item                                                    # All list items
//blockquote                                                   # All blockquotes
//code-block[hasFeatureValue('markdown', 'language', 'python')] # Python code blocks
//table/row/cell                                               # All table cells

Email Content Structure

The email mixin composes the markdown mixin to represent email messages. Email-specific metadata (from, to, subject, date) is stored in document metadata, while the email body uses the same block-level markdown nodes.

Tree structure

email (root)
  ├── paragraph ("Hello team,")
  ├── paragraph ("Here are the Q4 results:")
  ├── list (ordered: false)
  │   ├── list-item ("Revenue: **$12.4M** (+15%)")
  │   └── list-item ("Operating margin: **23%**")
  ├── blockquote ("> From the previous quarterly report...")
  └── paragraph ("Best regards,\nPhil")

Document metadata

Email headers are stored as document-level metadata, not as content nodes:

Field	Type	Description
`from`	string	Sender email address
`to`	string[]	Recipient addresses
`cc`	string[]	CC addresses
`bcc`	string[]	BCC addresses
`subject`	string	Email subject line
`date`	datetime	Send date/time
`messageId`	string	RFC 2822 Message-ID
`inReplyTo`	string	Parent message ID for threading
`threadId`	string	Conversation thread identifier
`headers`	object	Additional raw email headers
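To show where these fields might come from, the sketch below fills most of them from a raw message using Python's standard `email` package. This is an illustration of the metadata shape only, not how Kodexa ingests email. `extract_metadata` is a hypothetical helper, the sample message is made up, and `threadId` and `headers` are left out because the source does not say how they are derived.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime
from typing import Any, Dict, List

RAW = (
    "From: phil@example.com\n"
    "To: team@example.com, finance@example.com\n"
    "Subject: Q4 results\n"
    "Date: Mon, 06 Jan 2025 09:30:00 +0000\n"
    "Message-ID: <abc123@example.com>\n"
    "\n"
    "Hello team,\n"
)


def extract_metadata(raw: str) -> Dict[str, Any]:
    """Map standard headers onto the field names in the table above."""
    msg = message_from_string(raw)

    def addresses(header: str) -> List[str]:
        # getaddresses handles comma-separated lists and display names.
        return [addr for _, addr in getaddresses(msg.get_all(header, []))]

    return {
        "from": addresses("From")[0] if msg["From"] else None,
        "to": addresses("To"),
        "cc": addresses("Cc"),
        "bcc": addresses("Bcc"),
        "subject": msg["Subject"],
        "date": parsedate_to_datetime(msg["Date"]) if msg["Date"] else None,
        "messageId": msg["Message-ID"],
        "inReplyTo": msg["In-Reply-To"],  # None when the header is absent
    }


meta = extract_metadata(RAW)
```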

Attachments

Email attachments are not embedded in the email KDDB. Each attachment becomes its own document family in the store with the appropriate mixin:

A PDF attachment → spatial mixin with its own KDDB
A text file attachment → text mixin
A forwarded email → email mixin

The parent email’s metadata links to attachment document families. This keeps KDDBs focused and allows each attachment to go through its own processing pipeline.
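The attachment rules above amount to a small routing table. The sketch below expresses the three listed cases as a lookup keyed by MIME type. The MIME types chosen and the `mixin_for_attachment` helper are assumptions for illustration; the platform's actual routing logic is not described here.

```python
from typing import Optional

# The three cases listed above, keyed by an assumed MIME type.
MIXIN_BY_CONTENT_TYPE = {
    "application/pdf": "spatial",
    "text/plain": "text",
    "message/rfc822": "email",
}


def mixin_for_attachment(content_type: str) -> Optional[str]:
    """Return the mixin for an attachment, or None if the type is not listed."""
    # Drop parameters such as '; name="a.pdf"' and normalize case.
    base = content_type.split(";", 1)[0].strip().lower()
    return MIXIN_BY_CONTENT_TYPE.get(base)


print(mixin_for_attachment('Application/PDF; name="invoice.pdf"'))
```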

Key characteristics

Root node is email (not document) — distinguishes email from general markdown
Composable — The email mixin includes the markdown mixin for the body
Headers in metadata — Email-specific data lives in document metadata, not the tree
Same block editor — The body uses the same block editor as the markdown mixin
Same selectors — Query the body with the same selector syntax

Example selectors

//email/paragraph    # Body paragraphs
//email//list-item   # All list items in the email body
//email/blockquote   # Quoted text (often from replies)

Workbook Content Structure

The workbook mixin represents spreadsheet content — Excel files and similar tabular formats. The content node tree mirrors the workbook’s structure: sheets contain rows, rows contain cells.

Tree structure

workbook (root)
  ├── worksheet ("Income Statement")
  │   ├── row
  │   │   ├── cell: "Revenue" (ref: A1)
  │   │   ├── cell: "Q1" (ref: B1)
  │   │   └── cell: "Q2" (ref: C1)
  │   └── row
  │       ├── cell: "Product A" (ref: A2)
  │       ├── cell: "1,234.00" (ref: B2)
  │       └── cell: "1,456.00" (ref: C2)
  └── worksheet ("Balance Sheet")
      └── row
          └── cell: "Assets" (ref: A1)

Cell features

Content nodes in a workbook document carry cell-specific features:

workbook:ref     → "B2"                    # Cell reference (column letter + row number)
workbook:sheet   → "Income Statement"      # Parent worksheet name
workbook:formula → "=SUM(B2:B10)"          # Original formula (if cell contains one)
workbook:merge   → "A1:D1"                 # Merged cell range (on top-left cell only)
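Cell references and merged ranges use the familiar spreadsheet notation, so they are straightforward to decode. The sketch below converts a reference such as `B2` into 1-based column and row numbers and expands a merged range into its cells. `parse_ref` and `cells_in_range` are hypothetical helpers written for this example, not SDK functions.

```python
import re
from typing import List, Tuple

CELL_REF = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def parse_ref(ref: str) -> Tuple[int, int]:
    """Convert a reference like 'B2' to a 1-based (column, row) pair."""
    match = CELL_REF.match(ref)
    if match is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters:
        # Column letters are a base-26 number with digits A=1 .. Z=26.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return column, int(digits)


def cells_in_range(merge: str) -> List[Tuple[int, int]]:
    """Expand a range such as 'A1:D1' into (column, row) pairs, row by row."""
    start, end = merge.split(":")
    (c1, r1), (c2, r2) = parse_ref(start), parse_ref(end)
    return [(c, r) for r in range(r1, r2 + 1) for c in range(c1, c2 + 1)]


print(parse_ref("B2"))
print(cells_in_range("A1:D1"))
```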

Key characteristics

Medium-depth tree — workbook → worksheet → row → cell
No bounding boxes — Cells are addressed by reference, not spatial coordinates
Cell references — Every cell carries a workbook:ref feature mapping it to its Excel address
Formula preservation — Formulas are stored alongside calculated values
Read-only in UI — Users view the spreadsheet and tag cells for extraction, but don’t edit values
Sheet tabs — Multiple worksheets render as tabs, similar to Excel

Example selectors

//worksheet                                                    # All worksheets
//cell                                                         # All cells across all sheets
//cell[hasFeatureValue('workbook', 'ref', 'B2')]              # Cell at reference B2
//cell[contains(@content, 'Revenue')]                          # Cells containing 'Revenue'
//worksheet[contains(@content, 'Income')]//cell               # All cells in sheets with 'Income' in the name
//*[hasTag('revenue/total')]                                   # Nodes tagged as revenue total

The Block Editor

The markdown and email mixins share a block-based editor in the Kodexa UI. Each content node is rendered as an independently editable block, similar to Notion or Google Docs.

How editing maps to the KDDB

Every user action in the editor corresponds directly to a KDDB content node operation:

User Action	KDDB Operation
Edit text in a block	Update `content_parts` on the ContentNode
Reorder blocks (drag & drop)	Update `index` on affected ContentNodes
Change block type (e.g., paragraph → heading)	Update `node_type`, add/remove features
Delete a block	Remove ContentNode from tree
Add a new block	Create new ContentNode at the target index
Split a block (press Enter)	Split content_parts, create new ContentNode after current
Merge blocks (Backspace at start)	Merge content into previous node, remove current
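The split and merge rows in the table can be sketched against a simple list of blocks, where list position stands in for the node's `index`. `Block`, `split_block`, and `merge_with_previous` are illustrative names, and making the new block a paragraph on split is an assumption, since the table does not say which type it receives.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Toy block: list position plays the role of the node's index."""
    node_type: str
    content: str


def split_block(blocks: List[Block], index: int, offset: int) -> None:
    """Enter at `offset`: keep the head, insert the tail as a new block."""
    current = blocks[index]
    tail = Block("paragraph", current.content[offset:])  # assumed type
    current.content = current.content[:offset]
    blocks.insert(index + 1, tail)


def merge_with_previous(blocks: List[Block], index: int) -> None:
    """Backspace at the start of a block: append it to the previous one."""
    if index == 0:
        return  # nothing to merge into
    blocks[index - 1].content += blocks[index].content
    del blocks[index]


blocks = [Block("paragraph", "Hello team")]
split_block(blocks, 0, 5)
after_split = [b.content for b in blocks]
merge_with_previous(blocks, 1)
after_merge = [b.content for b in blocks]
```

Splitting and then merging returns the original content, which is a useful property to check in any editor built on these operations.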

Block type selection

Users can change a block’s type using a toolbar or / command. Compatible conversions include:

paragraph ↔ heading ↔ blockquote
paragraph → list (wraps in a list with one list-item)
paragraph → code-block
Any text block → horizontal-rule (clears content)
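These conversion rules can be captured in a small predicate. The sketch below is one reading of the list above: it assumes the `↔` chain means any of paragraph, heading, and blockquote can convert to any other, and that "text block" means those same three types. `can_convert` is a hypothetical helper.

```python
TEXT_BLOCKS = {"paragraph", "heading", "blockquote"}
PARAGRAPH_ONLY_TARGETS = {"list", "code-block"}


def can_convert(source: str, target: str) -> bool:
    """Return True if the rules above allow changing `source` into `target`."""
    if source == target or source not in TEXT_BLOCKS:
        return False
    if target in TEXT_BLOCKS or target == "horizontal-rule":
        return True  # text blocks interconvert; any may become a rule
    return source == "paragraph" and target in PARAGRAPH_ONLY_TARGETS


print(can_convert("paragraph", "list"))
print(can_convert("heading", "list"))
```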

Extraction Across All Content Structures

The extraction pipeline works the same way regardless of mixin: tags are applied to content nodes, grouped by index, and converted to data objects.

This means you can extract structured data from markdown and email content using the same data definitions, tagging models, and extraction engine that work with spatial documents. For example:

News articles — Extract entities, topics, dates, and quotes from markdown content
Emails — Extract action items, deadlines, and referenced documents from email bodies
Reports — Extract metrics, summaries, and key findings from rich-text documents
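The "tag, group by index, convert" flow described in this section can be pictured with a simplified sketch. It assumes each tagged node carries a tag name, a group index, and its content, and that nodes sharing a tag and index are joined with a space. `TaggedNode`, `to_data_objects`, and the tag names are invented for illustration; the real pipeline's data model is richer than this.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class TaggedNode(NamedTuple):
    tag: str      # hypothetical tag path, e.g. "item/name"
    index: int    # tagged nodes with the same index belong together
    content: str


def to_data_objects(nodes: List[TaggedNode]) -> List[Dict[str, str]]:
    """Group tagged nodes by index and build one dict per group."""
    grouped: Dict[int, Dict[str, str]] = defaultdict(dict)
    for node in nodes:
        existing = grouped[node.index].get(node.tag)
        # Several nodes can share a tag (e.g. two words); join their text.
        grouped[node.index][node.tag] = (
            node.content if existing is None else f"{existing} {node.content}"
        )
    return [grouped[i] for i in sorted(grouped)]


nodes = [
    TaggedNode("item/name", 0, "Widget"),
    TaggedNode("item/name", 0, "A"),
    TaggedNode("item/price", 0, "$1,234.00"),
    TaggedNode("item/name", 1, "Widget"),
    TaggedNode("item/name", 1, "B"),
    TaggedNode("item/price", 1, "$567.89"),
]
objects = to_data_objects(nodes)
```

The same function works whether the nodes came from words on a PDF page, paragraphs in an email body, or cells in a worksheet, which is the point of keeping extraction independent of the mixin.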

Choosing a Content Structure

Use spatial when...

You’re processing PDFs, scanned documents, images, or PowerPoints where physical layout matters. The spatial mixin preserves bounding boxes, page structure, and word-level positioning — essential for document understanding, table extraction, and content that needs to be visually overlaid on the original document.

Use text when...

You have plain text content with no formatting or layout requirements. This is the simplest structure and is appropriate for log files, raw text exports, or content that will be processed purely for its text.

Use markdown when...

You have rich-text content that users need to view and edit — news articles, reports, knowledge base entries, or any content that benefits from structured blocks (headings, lists, code, tables) with inline formatting. The block editor makes this content interactive.

Use email when...

You’re ingesting email messages. The email mixin gives you structured metadata (from, to, subject, date, threading) plus a markdown body that users can view and edit. Attachments become their own document families with appropriate mixins.

Use workbook when...

You’re processing Excel files, spreadsheets, or tabular data where cell structure matters. The workbook mixin preserves cell references, formulas, merged cells, and worksheet organization — essential for financial data extraction, tabular analysis, and content that needs to be visually mapped to a spreadsheet grid.

What’s Next?

Document Structure Deep Dive

Detailed look at content nodes, spatial data, and how the tree maps to real documents.

SDK Reference

Work with documents programmatically in Python and TypeScript.

Data Definitions

Define taxonomies to extract structured data from any content structure.

Selectors

Query content nodes using the XPath-like selector language.