The Kodexa CLI provides commands for working with documents in document stores. This page explains how to query, download, upload, and manage documents.

Querying Documents

The query command allows you to search for and manipulate documents in a document store:
kodexa query <store-ref> [<query>]
For example, to search for all documents in a store:
kodexa query org-name/store-name "*"

Filtering Results

You can filter results using specific queries:
kodexa query org-name/store-name "content:invoice"
For more complex conditions, add --filter to use filter syntax instead of query syntax:
kodexa query org-name/store-name "created>2023-01-01" --filter

Sorting Results

Sort results by specific fields:
kodexa query org-name/store-name "*" --sort "name:asc"
kodexa query org-name/store-name "*" --sort "modified:desc"

Pagination and Streaming

For large result sets, use pagination:
kodexa query org-name/store-name "*" --page 2 --pageSize 20
For continuous processing, use streaming:
kodexa query org-name/store-name "*" --stream
Limit the number of results when streaming:
kodexa query org-name/store-name "*" --stream --limit 100
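The pagination flags above can be driven from a small loop when several pages are needed in one run. The following is a minimal sketch, not part of the Kodexa CLI: the function name query_pages and the KODEXA_CMD override are hypothetical, and it assumes pages are numbered from 1, as the --page examples on this page suggest.

```shell
# Sketch only: KODEXA_CMD is a hypothetical override, not a Kodexa feature.
# Set it to "echo kodexa" to print the commands instead of running them.
KODEXA_CMD="${KODEXA_CMD:-kodexa}"

# query_pages <store-ref> <query> <page-count> <page-size>
# Requests pages 1..page-count in order, using the documented
# --page and --pageSize flags.
query_pages() {
  qp_page=1
  while [ "$qp_page" -le "$3" ]; do
    # $KODEXA_CMD is unquoted on purpose so "echo kodexa" splits into two words.
    $KODEXA_CMD query "$1" "$2" --page "$qp_page" --pageSize "$4"
    qp_page=$((qp_page + 1))
  done
}
```

For example, query_pages org-name/store-name "content:invoice" 3 20 would request the first three pages of 20 results each. The loop does not detect the last page, so the page count has to be chosen by the caller.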

Multithreaded Operations

Speed up operations with multiple threads:
kodexa query org-name/store-name "*" --stream --threads 10

Downloading Documents

Download KDDB Format

Download documents in Kodexa Document Database (KDDB) format:
kodexa query org-name/store-name "*" --download
This saves files as <document-path>.kddb

Download Native Files

Download the original native files:
kodexa query org-name/store-name "*" --download-native
This saves files as <document-path>.native

Download Extracted Data

Download extracted data as JSON:
kodexa query org-name/store-name "*" --download-extracted-data
This saves files as <document-path>-extracted_data.json
Specify a project ID for extracted data:
kodexa query org-name/store-name "*" --download-extracted-data --project-id project123
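The three download options each use a different naming pattern, which matters when a script needs to check what was written. As a rough sketch, the helper below prints the file names described above for a given document path. The function name expected_download_paths is hypothetical, and the sketch makes no assumption about which directory the CLI writes to.

```shell
# expected_download_paths <document-path>
# Prints the file names described on this page, one per line:
#   --download                -> <document-path>.kddb
#   --download-native         -> <document-path>.native
#   --download-extracted-data -> <document-path>-extracted_data.json
expected_download_paths() {
  printf '%s.kddb\n' "$1"
  printf '%s.native\n' "$1"
  printf '%s-extracted_data.json\n' "$1"
}
```

Calling expected_download_paths invoices/march prints invoices/march.kddb, invoices/march.native, and invoices/march-extracted_data.json.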

Reprocessing Documents

Reprocess documents using a specific assistant:
kodexa query org-name/store-name "*" --stream --reprocess assistant-id
Note: Reprocessing requires streaming mode (--stream).

Labeling Documents

Add labels to documents:
kodexa query org-name/store-name "*" --stream --add-label invoice
Remove labels:
kodexa query org-name/store-name "*" --stream --remove-label draft

Watching for Changes

Monitor and refresh results periodically:
kodexa query org-name/store-name "*" --watch 30
This refreshes the results every 30 seconds.

Deleting Documents

Delete documents matching a query:
kodexa query org-name/store-name "*" --stream --delete
You’ll be prompted to confirm the deletion.
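Because deletion acts on every document the query matches, it can help to list the matches with the same query before deleting. The sketch below wraps both steps in shell functions so the store and query are typed once. The function names and the KODEXA_CMD override are hypothetical, and only flags documented on this page are used.

```shell
# Sketch only: KODEXA_CMD is a hypothetical override, not a Kodexa feature.
# Set it to "echo kodexa" to print the commands instead of running them.
KODEXA_CMD="${KODEXA_CMD:-kodexa}"

# preview_matches <store-ref> <query> [limit]
# Streams up to [limit] matching documents (default 100) without changing them.
preview_matches() {
  $KODEXA_CMD query "$1" "$2" --stream --limit "${3:-100}"
}

# delete_matches <store-ref> <query>
# Runs the same query with --delete; the CLI is documented to ask for confirmation.
delete_matches() {
  $KODEXA_CMD query "$1" "$2" --stream --delete
}
```

A typical sequence would be preview_matches org-name/store-name "labels:draft" followed by delete_matches with the same two arguments. The preview is capped by the limit, so it may show fewer documents than the delete will affect.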

Uploading Documents

The upload command allows you to upload files to a document store:
kodexa upload <store-ref> <file-paths>
For example, to upload a single file:
kodexa upload org-name/store-name /path/to/document.pdf
To upload multiple files:
kodexa upload org-name/store-name /path/to/document1.pdf /path/to/document2.pdf

Upload With External Data

You can attach external data to uploaded documents:
kodexa upload org-name/store-name /path/to/document.pdf --external-data
This looks for a matching JSON file (e.g., document.json for document.pdf) and attaches its content as external data to the uploaded document.
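Before uploading with --external-data, it may be worth checking that each file actually has a matching JSON file. The sketch below assumes the JSON file sits in the same directory and shares the base name (document.json for document.pdf), which is how the example above reads. Both function names are hypothetical.

```shell
# sidecar_json_for <file-path>
# Prints the JSON path assumed to pair with a file: the same path with its
# last extension replaced by .json. Expects a path whose file name has an extension.
sidecar_json_for() {
  printf '%s.json\n' "${1%.*}"
}

# missing_sidecars <file>...
# Prints every given file that has no matching JSON file next to it.
missing_sidecars() {
  for ms_file in "$@"; do
    [ -f "$(sidecar_json_for "$ms_file")" ] || printf '%s\n' "$ms_file"
  done
}
```

Running missing_sidecars /path/to/documents/*.pdf lists the PDFs that would be uploaded without external data, under the naming assumption stated above.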

Multithreaded Uploads

For faster uploads of multiple files:
kodexa upload org-name/store-name /path/to/documents/*.pdf --threads 10
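In the command above the shell, not the CLI, expands *.pdf. In POSIX shells a pattern that matches nothing is passed through literally, so an empty directory would send the text "*.pdf" as a file path. The sketch below guards against that case. The function name upload_pdfs and the KODEXA_CMD override are hypothetical.

```shell
# Sketch only: KODEXA_CMD is a hypothetical override, not a Kodexa feature.
# Set it to "echo kodexa" to print the command instead of running it.
KODEXA_CMD="${KODEXA_CMD:-kodexa}"

# upload_pdfs <store-ref> <directory> [threads]
# Uploads every .pdf file in <directory> with the documented --threads flag.
upload_pdfs() {
  up_store="$1"
  up_dir="$2"
  up_threads="${3:-10}"
  # Replace the function's arguments with the expanded file list.
  set -- "$up_dir"/*.pdf
  # If nothing matched, $1 is still the literal pattern and does not exist.
  if [ ! -e "$1" ]; then
    echo "no PDF files found in $up_dir" >&2
    return 1
  fi
  $KODEXA_CMD upload "$up_store" "$@" --threads "$up_threads"
}
```

For example, upload_pdfs org-name/store-name /path/to/documents 10 would run the same upload as the command above, but stop with an error message if the directory contains no PDFs.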

Advanced Examples

Search and download with multiple options

kodexa query org-name/store-name "path:*.pdf" --stream --download --download-extracted-data --threads 5

Reprocess all invoices with a new assistant

kodexa query org-name/store-name "labels:invoice" --stream --reprocess assistant-123

Monitor document store for changes

kodexa query org-name/store-name "*" --watch 60 --page 1 --pageSize 50

Complex filtering with sorting

kodexa query org-name/store-name "created>2023-01-01 AND modified<2024-01-01" --filter --sort "modified:desc"

Bulk operations with confirmation

# Add label to recent documents
kodexa query org-name/store-name "created>2024-01-01" --stream --add-label processed --threads 10

# Delete old drafts
kodexa query org-name/store-name "labels:draft AND modified<2023-01-01" --stream --delete

Upload with parallel processing

# Upload all PDFs from a directory with external metadata
kodexa upload org-name/store-name /data/documents/*.pdf --external-data --threads 10

Query Options Reference

| Option | Description | Example |
| --- | --- | --- |
| --filter | Use filter syntax instead of query syntax | --filter |
| --page | Page number for pagination | --page 2 |
| --pageSize | Number of items per page | --pageSize 20 |
| --sort | Sort results by field | --sort "name:asc" |
| --stream | Stream results instead of paginating | --stream |
| --limit | Limit number of results in streaming | --limit 100 |
| --threads | Number of threads for operations | --threads 10 |
| --download | Download documents in KDDB format | --download |
| --download-native | Download original files | --download-native |
| --download-extracted-data | Download extracted data as JSON | --download-extracted-data |
| --project-id | Project ID for extracted data | --project-id proj123 |
| --reprocess | Reprocess with assistant ID | --reprocess asst123 |
| --add-label | Add label to documents | --add-label invoice |
| --remove-label | Remove label from documents | --remove-label draft |
| --delete | Delete matching documents | --delete |
| --watch | Refresh results every n seconds | --watch 30 |

Upload Options Reference

| Option | Description | Example |
| --- | --- | --- |
| --threads | Number of threads for parallel uploads | --threads 10 |
| --external-data | Attach JSON metadata from matching files | --external-data |