
Store Commands

The kdx store command provides document store operations that go beyond basic CRUD: uploading files, reprocessing document families, monitoring processing, viewing statistics, and reindexing.

Available Commands

Command    Description
upload     Upload a file to a document store
reprocess  Reprocess document families based on a filter
watch      Monitor document processing progress
stats      Show store statistics
reindex    Trigger store reindexing

Upload Documents

Upload files (PDF, images, documents) to a document store for processing.
kdx store upload <store-ref> <file-path>

Parameters

Parameter  Description
store-ref  Store reference in format org/store-name
file-path  Path to the file to upload

Examples

# Upload a PDF to a store
kdx store upload satori/my-project-processing ./invoice.pdf

# Upload with full path
kdx store upload my-org/document-store /path/to/financial-statement.pdf

Output

Uploading invoice.pdf to store satori/my-project-processing...
✓ Document family created: a700ca3e-38e7-4e63-8974-59c2e9058cf3
The command returns the document family ID, which you can pass to kdx store watch to monitor processing.
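To pass the ID to another command without copying it by hand, you can parse it out of the upload output. The helper below is a minimal sketch: extract_family_id is a hypothetical name, and it assumes the success line keeps the "Document family created: <id>" shape shown above and is written to standard output.

```shell
# Print the document family ID found in `kdx store upload` output read on stdin.
# Assumption: the success line contains "Document family created: <id>".
extract_family_id() {
  awk -F'Document family created: ' '
    NF > 1 && !seen {
      split($2, parts, " ")   # keep only the first word after the label
      print parts[1]
      seen = 1
    }
  '
}
```

For example, family_id="$(kdx store upload satori/my-project-processing ./invoice.pdf | extract_family_id)" would capture the ID for a later kdx store watch "$family_id".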

Reprocess Documents

Reprocess document families in a store, typically used to retry failed documents or rerun processing with a different assistant.
kdx store reprocess <store-ref> --assistant-id <id> --filter <expression> [flags]

Parameters

Parameter  Description
store-ref  Store reference in format org/store-name:version

Flags

Flag            Default     Description
--assistant-id  (required)  Assistant ID to use for reprocessing
--filter        (required)  Filter expression for selecting document families
--dry-run       false       Preview matching documents without triggering reprocessing
--page-size     100         Page size when fetching document families
Both --assistant-id and --filter are required. The store-level command always sends an explicit assistantIds list to the batch reprocess endpoint, even though the document-family API can auto-detect assistants when called directly. To find assistant IDs, run kdx get assistants.

Common Filters

Filter                                                  Description
statistics.recentExecutions.execution.status=='FAILED'  Documents with failed processing
pendingProcessing==true                                 Documents stuck in pending state

Examples

# Reprocess all failed documents
kdx store reprocess my-org/my-store:1.0.0 \
  --assistant-id abc123 \
  --filter "statistics.recentExecutions.execution.status=='FAILED'"

# Preview what would be reprocessed (dry run)
kdx store reprocess my-org/my-store:1.0.0 \
  --assistant-id abc123 \
  --filter "statistics.recentExecutions.execution.status=='FAILED'" \
  --dry-run

# Reprocess documents stuck in pending state
kdx store reprocess my-org/my-store:1.0.0 \
  --assistant-id abc123 \
  --filter "pendingProcessing==true"
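The status filter mixes double and single quotes, which is easy to get wrong when a script assembles it. The sketch below builds the expression from a status value; status_filter is a hypothetical helper, and the idea that statuses other than FAILED exist is an assumption, since FAILED is the only one shown here.

```shell
# Print the recent-execution status filter for the given status value.
# Only letters, digits, and underscores are accepted so the value cannot
# break out of the single quotes inside the expression.
status_filter() {
  case "$1" in
    '' | *[!A-Za-z0-9_]*)
      echo "invalid status: $1" >&2
      return 1
      ;;
  esac
  printf "statistics.recentExecutions.execution.status=='%s'\n" "$1"
}
```

The result can then be passed as a single quoted argument, for example --filter "$(status_filter FAILED)".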

Output

Found 12 document families matching filter
Reprocessing triggered for 12 document families in store 'my-org/my-store:1.0.0'
With --dry-run:
Found 12 document families matching filter

[DRY RUN] The following document families would be reprocessed:
  1. a700ca3e-38e7-4e63-8974-59c2e9058cf3
  2. b812df4a-51a2-4f78-9b21-6e8c3d7af912
  3. c923ef5b-62b3-5f89-ac32-7f9d4e8bf023
  ...
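If you want to review or count the previewed IDs in a script, the numbered lines can be extracted from the dry-run output. This is a sketch under the assumption that each entry keeps the "  N. <id>" layout shown above; dry_run_ids is a hypothetical helper, not part of kdx.

```shell
# Print one document family ID per line from `--dry-run` output read on stdin.
# Assumption: entries are numbered lines of the form "  1. <id>".
dry_run_ids() {
  sed -n 's/^[[:space:]]*[0-9][0-9]*\.[[:space:]]*\([^[:space:]][^[:space:]]*\)[[:space:]]*$/\1/p'
}
```

Piping the result through wc -l gives the number of listed families. If the CLI abbreviates long lists, as the "..." in the sample suggests it might, that count can be lower than the "Found N" total.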

Reprocess Failed Documents Workflow

A common workflow for retrying failed documents:
# 1. Find the assistant ID
kdx get assistants -o json

# 2. Preview failed documents
kdx store reprocess my-org/my-store:1.0.0 \
  --assistant-id abc123 \
  --filter "statistics.recentExecutions.execution.status=='FAILED'" \
  --dry-run

# 3. Trigger reprocessing
kdx store reprocess my-org/my-store:1.0.0 \
  --assistant-id abc123 \
  --filter "statistics.recentExecutions.execution.status=='FAILED'"

# 4. Monitor a specific document's progress
kdx store watch <document-family-id>
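Steps 2 and 3 above can be wrapped in one function so the real run never happens without a preview. This is a sketch, not a kdx feature: reprocess_failed and the confirmation prompt are hypothetical, and it assumes kdx exits with a non-zero status when a command fails.

```shell
# Preview failed document families, ask for confirmation, then trigger reprocessing.
# Usage: reprocess_failed <store-ref> <assistant-id>
reprocess_failed() (
  store_ref="$1"
  assistant_id="$2"
  filter="statistics.recentExecutions.execution.status=='FAILED'"

  # Step 2: dry run. Stop here if the preview itself fails (assumed non-zero exit).
  kdx store reprocess "$store_ref" --assistant-id "$assistant_id" \
    --filter "$filter" --dry-run || exit 1

  # Hypothetical safety prompt; the answer is read from standard input.
  printf 'Trigger reprocessing for %s? [y/N] ' "$store_ref" >&2
  read -r answer
  case "$answer" in
    y | Y)
      # Step 3: the same command without --dry-run.
      kdx store reprocess "$store_ref" --assistant-id "$assistant_id" \
        --filter "$filter"
      ;;
    *)
      echo "Aborted." >&2
      exit 1
      ;;
  esac
)
```

Calling reprocess_failed my-org/my-store:1.0.0 abc123 prints the preview and waits for y before sending the real request.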

Watch Processing Progress

Monitor a document family as it progresses through the processing pipeline.
kdx store watch <document-family-id> [flags]

Flags

Flag             Default    Description
--label          PROCESSED  Target label to wait for
--timeout        600        Timeout in seconds
--poll-interval  3          Poll interval in seconds

Processing Labels

Documents typically progress through these labels:
Label       Description
PREPARED    Document parsed and prepared
FIRST-PASS  Initial LLM extraction complete
LABELED     Second pass extraction complete
PROCESSED   Final transformation complete

Examples

# Watch until fully processed (default)
kdx store watch a700ca3e-38e7-4e63-8974-59c2e9058cf3

# Watch for specific label
kdx store watch a700ca3e-38e7-4e63-8974-59c2e9058cf3 --label LABELED

# Custom timeout and poll interval
kdx store watch a700ca3e-38e7-4e63-8974-59c2e9058cf3 --timeout 300 --poll-interval 5

Output

The watch command shows real-time progress:
Watching document family a700ca3e-38e7-4e63-8974-59c2e9058cf3 for label: PROCESSED
Timeout: 600s, Poll interval: 3s

[  0s] RUNNING    | Labels: none | PDF Parser
[ 42s] RUNNING    | Labels: none | AWS Textract
[ 73s] RUNNING    | Labels: none | Kodexa LLM Data Labeling
[389s] RUNNING    | Labels: FIRST-PASS | Kodexa LLM Data Labeling
[472s] RUNNING    | Labels: LABELED | Spreading Transformer
[523s] RUNNING    | Labels: PROCESSED | Note Creator

✓ Document family reached label: PROCESSED

Error Detection

The watch command immediately detects processing failures:
[ 24s] FAILED     | Labels: none | AWS Textract: Unable to build document

✗ Processing failed: AWS Textract: Unable to build document
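When watch output is captured in a log, a script can classify the run from its final lines. The sketch below matches the phrases that appear in the sample output and in the timeout error quoted under Troubleshooting; watch_outcome is a hypothetical helper, and whether kdx writes these lines to standard output or standard error is not stated here, so capture both if unsure.

```shell
# Classify captured `kdx store watch` output read on stdin.
# Prints "processed", "failed", "timeout", or "unknown".
watch_outcome() {
  awk '
    /Document family reached label:/                 { result = "processed" }
    /Processing failed:/                             { result = "failed" }
    /timeout: document family did not reach label/   { result = "timeout" }
    END { print (result == "" ? "unknown" : result) }
  '
}
```

For example, kdx store watch "$family_id" 2>&1 | watch_outcome would print a single word suitable for a case statement.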

Store Statistics

View statistics for a document store.
kdx store stats <store-name>

Example

kdx store stats my-document-store

Output

Store: my-document-store
====================
Documents: 150
Storage Size: 2.5 GB
Index Status: READY
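A script that needs to wait for a particular index status can read a single field from this output. The helper below is a sketch that assumes the "Key: value" layout shown above; stats_field is a hypothetical name.

```shell
# Print the value of one "Key: value" field from `kdx store stats` output on stdin.
# Usage: kdx store stats <store-name> | stats_field "Index Status"
stats_field() {
  awk -v key="$1" -F': ' '
    $1 == key && !seen { print $2; seen = 1 }
  '
}
```

With the sample output, stats_field "Index Status" prints READY and stats_field "Storage Size" prints 2.5 GB.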

Reindex Store

Trigger a reindexing operation for a store.
kdx store reindex <store-name> [flags]

Flags

Flag     Description
--force  Force reindex even if already in progress

Example

# Standard reindex
kdx store reindex my-document-store

# Force reindex
kdx store reindex my-document-store --force

Complete Upload and Monitor Workflow

Here’s a complete example workflow for uploading and monitoring a document:
# 1. Upload document
kdx store upload satori/project-processing ./financial-report.pdf
# Output: ✓ Document family created: abc123

# 2. Monitor processing
kdx store watch abc123 --timeout 600

# 3. Once processed, get the extracted data
kdx document-family data abc123 -o extracted-data.json
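The three steps can also be chained in a script so the ID never has to be copied by hand. The function below is a sketch under two assumptions: the upload success line ends with the document family ID, and kdx store watch exits with a non-zero status on failure or timeout. upload_watch_fetch is a hypothetical name.

```shell
# Upload a file, wait for processing, then save the extracted data.
# Usage: upload_watch_fetch <store-ref> <file-path> <output-file> [timeout-seconds]
upload_watch_fetch() (
  store_ref="$1"
  file_path="$2"
  output_file="$3"
  timeout="${4:-600}"

  # 1. Upload and keep the output so the ID can be parsed from it.
  upload_output="$(kdx store upload "$store_ref" "$file_path")" || exit 1
  printf '%s\n' "$upload_output"

  # Assumption: the ID is the last word on the "Document family created:" line.
  family_id="$(printf '%s\n' "$upload_output" |
    awk '/Document family created:/ { print $NF }')"
  [ -n "$family_id" ] || { echo "could not find a document family ID" >&2; exit 1; }

  # 2. Monitor processing; stop if watch reports failure (assumed non-zero exit).
  kdx store watch "$family_id" --timeout "$timeout" || exit 1

  # 3. Fetch the extracted data.
  kdx document-family data "$family_id" -o "$output_file"
)
```

For example, upload_watch_fetch satori/project-processing ./financial-report.pdf extracted-data.json runs the same sequence as the commands above.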

Troubleshooting

File Already Exists

Error: upload failed (400): {"errors":{"*":"document.pdf already exists"}}
Solution: Use a different filename or upload to a different project.

Store Not Found

Error: upload failed (404): Store not found
Solution: Verify the store reference format is correct: org/store-name
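A quick format check before calling kdx can catch a malformed reference early. The sketch below accepts org/store-name with an optional :version suffix, as used by the reprocess command; is_store_ref is a hypothetical helper, and the allowed character set (letters, digits, dots, underscores, hyphens) is an assumption rather than a documented rule.

```shell
# Return 0 if the argument looks like org/store-name or org/store-name:version.
# Assumption: each part uses only letters, digits, dots, underscores, and hyphens.
is_store_ref() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9._-]+/[A-Za-z0-9._-]+(:[A-Za-z0-9._-]+)?$'
}
```

A script might guard an upload with is_store_ref "$store_ref" || echo "expected org/store-name" >&2. A passing check only confirms the shape of the reference, not that the store exists.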

Processing Timeout

Error: timeout: document family did not reach label 'PROCESSED' within 600 seconds
Solution:
  • Increase the timeout with --timeout
  • Run kdx store watch again to check whether the document is still processing
  • Allow more time for large documents, which take longer to process