AI agents (Claude Code, custom tooling, etc.) can use kdx document commands to programmatically analyze, search, annotate, and extract structured data from complex documents. The CLI's JSON output and composable commands make it well suited to agent workflows.
The Agent Workflow
A typical agent workflow follows this pipeline:
info → stats → text → grep/locate → tag → data create → data set-attribute
Each step narrows focus from document-level understanding down to precise node-level annotation.
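An agent harness might drive this pipeline through a thin wrapper that builds the argument list for each step and parses the JSON that comes back. The following is a minimal Python sketch under stated assumptions, not part of the CLI: `build_argv`, `parse_output`, and `run_kdx` are hypothetical helper names, `kdx` is assumed to be on the PATH, and positional arguments such as a page or node number are assumed to precede the document path, as in the examples in this guide.

```python
import json
import subprocess


def build_argv(subcommand, document, positional=(), **flags):
    """Build an argument list, e.g. ['kdx', 'document', 'locate', 'doc.kddb', '--max', '10']."""
    argv = ["kdx", "document", *subcommand.split(), *map(str, positional), document]
    for name, value in flags.items():
        # The Python keyword node_id becomes the CLI flag --node-id.
        flag = "--" + name.replace("_", "-")
        argv += [flag] if value is True else [flag, str(value)]
    return argv


def parse_output(text):
    """Accept either a single JSON document or JSON Lines and return a list of records."""
    text = text.strip()
    if not text:
        return []
    try:
        return [json.loads(text)]
    except json.JSONDecodeError:
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def run_kdx(subcommand, document, positional=(), **flags):
    """Run one kdx step and return its parsed records (requires kdx to be installed)."""
    argv = build_argv(subcommand, document, positional, **flags)
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return parse_output(result.stdout)
```

Passing the command as a list rather than a shell string avoids quoting problems with patterns and values that contain `$` or backslashes.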
Step 1: Understand the Document
# Get document overview
kdx document info doc.kddb
# Get quantitative summary
kdx document stats doc.kddb
The agent learns how many pages, nodes, tags, and data objects exist before diving in.
Step 2: Read Content
# Read specific pages
kdx document text doc.kddb --pages 1:3
# Read a single page
kdx document page 5 doc.kddb
The agent reads page content to understand what the document contains and identify sections of interest.
Step 3: Search for Content
# Regex search across the document
kdx document grep "revenue|income" doc.kddb
# Multi-criteria search
kdx document find doc.kddb --contains "total" --type line --page 3
These commands return JSON with node IDs and match positions for further processing.
Step 4: Locate Nodes for Tagging
# Find nodes with match positions
kdx document locate doc.kddb --pattern '\$[\d,]+\.\d{2}' --type word --max 10
The locate command returns nodeId, matchStart, matchEnd, and matchText, which together give an agent everything it needs for precise annotation.
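Before tagging, an agent may want to confirm that the reported offsets really select the matched text. The sketch below assumes that matchStart and matchEnd are zero-based, end-exclusive offsets into the node's content, which is consistent with the sample records in this guide but is not stated explicitly. `matched_span` is a hypothetical helper name.

```python
import json


def matched_span(record):
    """Return the substring selected by the offsets, checking it against matchText if present."""
    span = record["content"][record["matchStart"]:record["matchEnd"]]
    expected = record.get("matchText")
    if expected is not None and span != expected:
        raise ValueError(f"offsets select {span!r} but matchText is {expected!r}")
    return span


# A record shaped like the locate output shown in this guide.
line = (
    '{"nodeId":4530,"type":"word","content":"$45,678,000","page":12,'
    '"matchStart":0,"matchEnd":11,"matchText":"$45,678,000"}'
)
record = json.loads(line)
print(record["nodeId"], matched_span(record))
```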
Step 5: Tag Nodes
# Tag a node found by locate
kdx document tag doc.kddb --node-id 245 --name "invoice/amount" --value '$1,234.56'
The output includes a tagUuid that links the tag to the node for provenance tracking.
Step 6: Create Structured Data
# Create a data object
kdx document data create doc.kddb --path "INVOICE"
# Set attributes on the data object
kdx document data set-attribute doc.kddb \
--object-id 1 --tag "total_amount" --value "1234.56" \
--type CURRENCY --tag-uuid "a1b2c3d4-..."
The --tag-uuid flag links the attribute back to its source node in the document.
Example: Processing a Financial Document
This walkthrough shows how an agent would process a 50-page financial filing to extract key figures.
1. Assess the Document
$ kdx document info filing.kddb
{
"uuid": "e25dab60-cbdf-499f-857e-ff9c82a19d87",
"version": "6.0.0",
"statistics": {
"nodeCount": 12847,
"pageCount": 50,
"tagCount": 0
}
}
The agent sees 50 pages and no existing tags, so this is a fresh document to process.
2. Find Key Sections
$ kdx document grep "Total Revenue" filing.kddb --max 5
{"nodeId":4521,"type":"line","content":"Total Revenue $45,678,000","page":12,"matchStart":0,"matchEnd":13}
{"nodeId":8932,"type":"line","content":"Total Revenue for Fiscal Year","page":28,"matchStart":0,"matchEnd":13}
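With the grep hits in hand, the agent can decide which pages to read next. The sketch below collects the page numbers from the hits and merges consecutive pages into `start:end` strings for `--pages`. It assumes the range is inclusive on both ends, as the `12:12` example in this guide suggests. `page_ranges` is a hypothetical helper name.

```python
import json


def page_ranges(hits):
    """Merge the pages that contain hits into inclusive 'start:end' range strings."""
    pages = sorted({hit["page"] for hit in hits})
    ranges = []
    for page in pages:
        if ranges and page == ranges[-1][1] + 1:
            ranges[-1][1] = page  # extend the current run of consecutive pages
        else:
            ranges.append([page, page])
    return [f"{start}:{end}" for start, end in ranges]


# The two hits from the grep output above.
output = (
    '{"nodeId":4521,"type":"line","content":"Total Revenue $45,678,000","page":12,"matchStart":0,"matchEnd":13}\n'
    '{"nodeId":8932,"type":"line","content":"Total Revenue for Fiscal Year","page":28,"matchStart":0,"matchEnd":13}\n'
)
hits = [json.loads(line) for line in output.splitlines()]
print(page_ranges(hits))
```

Each resulting string can then be passed to `kdx document text` as the value of `--pages`.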
3. Read the Revenue Page
$ kdx document text filing.kddb --pages 12:12
--- Page 12 ---
CONSOLIDATED STATEMENTS OF INCOME
(In thousands)
Total Revenue $45,678,000
Cost of Goods Sold $28,456,000
Gross Profit $17,222,000
...
4. Locate Specific Values
$ kdx document locate filing.kddb --pattern '\$[\d,]+' --type word --page 12
{"nodeId":4530,"type":"word","content":"$45,678,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$45,678,000"}
{"nodeId":4538,"type":"word","content":"$28,456,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$28,456,000"}
{"nodeId":4545,"type":"word","content":"$17,222,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$17,222,000"}
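The matched text still carries a currency symbol and thousands separators, while the attribute value set in the next step is a plain number. One way to bridge the two is a small normalizer like the sketch below. `normalize_amount` is a hypothetical helper, it handles only a leading `$` and comma separators, and it does not apply any scale note such as "(In thousands)".

```python
from decimal import Decimal, InvalidOperation


def normalize_amount(match_text):
    """Strip a leading '$' and thousands separators, returning a plain numeric string."""
    cleaned = match_text.strip().lstrip("$").replace(",", "")
    try:
        value = Decimal(cleaned)
    except InvalidOperation as exc:
        raise ValueError(f"not a recognizable amount: {match_text!r}") from exc
    return str(value)


print(normalize_amount("$45,678,000"))
print(normalize_amount("$1,234.56"))
```

Using `Decimal` rather than `float` keeps cents exact when the amount has a fractional part.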
5. Tag and Create Data
# Tag the revenue node
$ kdx document tag filing.kddb --node-id 4530 --name "financials/total_revenue"
{"nodeId":4530,"tag":"financials/total_revenue","status":"tagged","tagId":1,"tagUuid":"f1a2b3c4-d5e6-7890-abcd-ef1234567890"}
# Create data object
$ kdx document data create filing.kddb --path "FINANCIALS/INCOME_STATEMENT"
{"id":1,"path":"FINANCIALS/INCOME_STATEMENT"}
# Set attribute linked to source
$ kdx document data set-attribute filing.kddb \
--object-id 1 --tag "total_revenue" --value "45678000" \
--type CURRENCY --tag-uuid "f1a2b3c4-d5e6-7890-abcd-ef1234567890"
{"id":1,"dataObjectId":1,"tag":"total_revenue","value":"45678000","type":"CURRENCY"}
All commands produce JSON Lines (JSONL) by default, with one JSON object per line. This format streams well, and agents can parse it line by line:
{"nodeId":100,"type":"word","content":"Revenue","page":1}
{"nodeId":101,"type":"word","content":"$1,234","page":1}
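Because each line is a complete JSON object, an agent can consume results as they arrive instead of buffering the whole output. The generator below is a minimal sketch of that idea. `iter_jsonl` is a hypothetical name, and it accepts any iterable of strings, such as a file object or a process's standard output.

```python
import io
import json


def iter_jsonl(lines):
    """Yield one parsed object per non-blank line, reporting the line number on bad input."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number} is not valid JSON: {exc}") from exc


# The two sample records above, read as a stream.
sample = io.StringIO(
    '{"nodeId":100,"type":"word","content":"Revenue","page":1}\n'
    '{"nodeId":101,"type":"word","content":"$1,234","page":1}\n'
)
records = list(iter_jsonl(sample))
print([r["nodeId"] for r in records])  # prints [100, 101]
```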
Use --pretty for human-readable debugging:
{
"nodeId": 100,
"type": "word",
"content": "Revenue",
"page": 1
}
Best Practices for Agent Developers
Limit Results
Always use --max to prevent overwhelming output on large documents:
kdx document locate doc.kddb --pattern ".*" --max 20
Focus by Page
Use --page to work on one page at a time instead of the entire document:
kdx document locate doc.kddb --pattern "amount" --page 5
Chain Commands
The intended workflow chains outputs from one command into the next:
locate returns nodeId → use with tag --node-id
tag returns tagUuid → use with data set-attribute --tag-uuid
data create returns id → use with data set-attribute --object-id
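The three hand-offs listed above can be sketched in Python as two functions that take the previous command's parsed output and build the next command's arguments. The function names are hypothetical. The flags and field names are the ones used in this guide, and the sample records are copied from the walkthrough.

```python
import json


def tag_argv(document, locate_record, name):
    """locate returns nodeId, which feeds tag --node-id."""
    return ["kdx", "document", "tag", document,
            "--node-id", str(locate_record["nodeId"]), "--name", name]


def set_attribute_argv(document, data_object, tag_result, tag, value, value_type):
    """data create returns id and tag returns tagUuid; both feed data set-attribute."""
    return ["kdx", "document", "data", "set-attribute", document,
            "--object-id", str(data_object["id"]),
            "--tag", tag, "--value", value, "--type", value_type,
            "--tag-uuid", tag_result["tagUuid"]]


locate_record = json.loads('{"nodeId":4530,"type":"word","content":"$45,678,000","page":12}')
tag_result = json.loads('{"nodeId":4530,"tag":"financials/total_revenue","status":"tagged",'
                        '"tagId":1,"tagUuid":"f1a2b3c4-d5e6-7890-abcd-ef1234567890"}')
data_object = json.loads('{"id":1,"path":"FINANCIALS/INCOME_STATEMENT"}')

print(" ".join(tag_argv("filing.kddb", locate_record, "financials/total_revenue")))
print(" ".join(set_attribute_argv("filing.kddb", data_object, tag_result,
                                  "total_revenue", "45678000", "CURRENCY")))
```

Reading the identifiers from parsed output, rather than hard-coding them, keeps the provenance link between the attribute and its source node intact.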
Use Node Type Filters
Filter by node type to get the right granularity:
--type word for individual tokens (amounts, dates, names)
--type line for full lines of text
--type paragraph for paragraph-level content
Verify Before Writing
Use read-only commands (info, stats, text, grep, locate, node) to understand the document before using write commands (tag, data create, data set-attribute).
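A harness can enforce this separation with a simple guard that refuses write commands until the agent has opted in. The sketch below is illustrative only. `is_write_command` and `check_allowed` are hypothetical names, and the set of write commands is taken from the ones named in this guide.

```python
# Write commands named in this guide; everything else is treated as read-only.
WRITE_COMMANDS = {("tag",), ("data", "create"), ("data", "set-attribute")}


def is_write_command(argv):
    """True when argv (beginning 'kdx document ...') names one of the write commands."""
    sub = tuple(argv[2:4])
    return sub[:1] in WRITE_COMMANDS or sub in WRITE_COMMANDS


def check_allowed(argv, allow_writes=False):
    """Return argv unchanged, or raise if it would modify the document without opt-in."""
    if is_write_command(argv) and not allow_writes:
        raise PermissionError("write command blocked: " + " ".join(argv[2:4]))
    return argv


print(is_write_command(["kdx", "document", "locate", "doc.kddb", "--pattern", "amount"]))
print(is_write_command(["kdx", "document", "data", "create", "doc.kddb", "--path", "INVOICE"]))
```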
Inspect Nodes Before Tagging
Use node to verify a node’s content and context before tagging:
kdx document node 245 doc.kddb --tags --children
Command Reference
| Command | Mode | Purpose |
|---|---|---|
| info | Read | Document summary |
| stats | Read | Detailed statistics |
| text | Read | Page text extraction |
| grep | Read | Regex content search |
| find | Read | Multi-criteria search |
| locate | Read | Node discovery with match positions |
| node | Read | Single node inspection |
| tags | Read | List all tags |
| tag | Write | Annotate a node |
| data create | Write | Create data object |
| data set-attribute | Write | Set attribute on data object |
| audit | Read | Revision history |