# Knowledge System

> The Kodexa Knowledge System captures document metadata and configures intelligent processing behaviors, connecting what you know to what you do.

The Knowledge System is how Kodexa captures information about documents and uses that information to customize processing. It connects **what you know** about documents to **what you do** with them.


Knowledge connects document context to behavior inside Activity execution

## The Two Sides of Knowledge

```mermaid
flowchart LR
    subgraph know["What We Know"]
        direction TB
        FT["Knowledge Feature Type<br/>e.g., 'Vendor'"]
        F["Knowledge Feature<br/>e.g., 'Acme Corp'"]
        FT --> F
    end
    subgraph do["What We Do"]
        direction TB
        IT["Knowledge Item Type<br/>e.g., 'Prompt Override'"]
        I["Knowledge Item<br/>e.g., 'SEC extraction prompt'"]
        IT --> I
    end
    subgraph bridge["The Bridge"]
        KS[Knowledge Set]
    end
    F --> KS
    KS --> I
    DOC[Document] -.->|has feature| F
    I -.->|applied to| DOC
    style know fill:#e0f2fe
    style do fill:#fef3c7
    style bridge fill:#d1fae5
```

### "What We Know" - Document Metadata

| Concept                    | Purpose                                     | Example                               |
| -------------------------- | ------------------------------------------- | ------------------------------------- |
| **Knowledge Feature Type** | Template defining a category of metadata    | "Vendor", "Document Type", "Language" |
| **Knowledge Feature**      | Reusable instance (shared across documents) | "Acme Corp", "10K", "English"         |

You define Feature Types once, then create Features of that type. Multiple documents can share the same Feature (e.g., 50 invoices all linked to "Acme Corp").

### "What We Do" - Configurable Behaviors

| Concept                 | Purpose                                     | Example                                 |
| ----------------------- | ------------------------------------------- | --------------------------------------- |
| **Knowledge Item Type** | Template defining a configurable capability | "Prompt Override", "Validation Rule"    |
| **Knowledge Item**      | Specific configured behavior                | "Use SEC prompt for revenue extraction" |

Item Types define what can be configured. Items are the actual configurations with specific values.

### The Bridge - Knowledge Sets

| Concept           | Purpose                    | Example                                                |
| ----------------- | -------------------------- | ------------------------------------------------------ |
| **Knowledge Set** | Connects features to items | "When Document Type = 10K, use SEC extraction prompts" |

Knowledge Sets are the rules that say "when a document has *these features*, apply *these items*."

## When Do You Need This?

* Track which vendor invoices came from, classify document types, identify languages
* Use different extraction prompts for different document types
* Apply specific validation rules based on document characteristics
* Let agents propose knowledge configurations for human approval

## Quick Example: Vendor Tracking

**Goal:** Track which vendor each invoice comes from.

**Step 1: Create a Feature Type**

```yaml
slug: vendor
name: Vendor
description: The vendor that issued this invoice
options:
  - name: vendorId
    type: string
    label: Vendor ID
extendedOptions:
  - name: displayName
    type: string
    label: Display Name
```

**Step 2: Create Features**

```yaml
# Feature 1
featureType: vendor
properties:
  vendorId: "V001"
extendedProperties:
  displayName: "Acme Corporation"

# Feature 2
featureType: vendor
properties:
  vendorId: "V002"
extendedProperties:
  displayName: "Globex Inc"
```

**Step 3: Link to Documents**

As invoices are processed, they get linked to the appropriate vendor feature. Now you can:

* Search for all documents from a specific vendor
* See which vendor a document belongs to
* Trigger different processing based on vendor

## Quick Example: Customizing Extraction

**Goal:** Use different extraction prompts for 10K vs 10Q documents.

**Step 1: Create Feature Type + Features**

```yaml
# Feature Type
slug: sec-filing-type
name: SEC Filing Type
options:
  - name: filingType
    type: string

# Features
- featureType: sec-filing-type
  properties: { filingType: "10K" }
- featureType: sec-filing-type
  properties: { filingType: "10Q" }
```

**Step 2: Create Item Type + Items**

```yaml
# Item Type
slug: extraction-prompt
name: Extraction Prompt Override
options:
  - name: promptText
    type: text

# Items
- knowledgeItemType: extraction-prompt
  title: "10K Revenue Prompt"
  properties:
    promptText: "Extract annual revenue from the 10K financial statements..."
- knowledgeItemType: extraction-prompt
  title: "10Q Revenue Prompt"
  properties:
    promptText: "Extract quarterly revenue from the 10Q financial statements..."
```

**Step 3: Create Knowledge Sets**

```yaml
# When document is 10K, use 10K prompt
- name: "10K Extraction Rules"
  features:
    - featureType: sec-filing-type
      properties: { filingType: "10K" }
  items:
    - itemRef: "10K Revenue Prompt"

# When document is 10Q, use 10Q prompt
- name: "10Q Extraction Rules"
  features:
    - featureType: sec-filing-type
      properties: { filingType: "10Q" }
  items:
    - itemRef: "10Q Revenue Prompt"
```

## Detailed Guides

* Define categories of document metadata with natural keys and display properties
* Define configurable capabilities like prompt overrides and validation rules
* End-to-end guide: different prompts for different document types
* End-to-end guide: conditional validation based on document features
* How agents build and consume knowledge with human-in-the-loop approval

## Knowledge Set Attachments

Knowledge sets can have **set-level file attachments**: files that belong to the knowledge set itself rather than to individual items. These are useful for storing reference documents, images, templates, or other supporting files that apply to the entire set.

### Uploading Attachments

Upload attachments via the API using a multipart form POST:

```
POST /api/knowledge-sets/{id}/attachments
```

Each attachment includes:

| Field          | Description                                                                                                 |
| -------------- | ----------------------------------------------------------------------------------------------------------- |
| `attachmentId` | Custom slug identifier (e.g., `company-logo`). Optional on upload; generated from the file name if omitted. |
| `contentHash`  | SHA-256 hash of the file content, used for deduplication and storage.                                       |
| `fileName`     | Original file name.                                                                                         |
| `size`         | File size in bytes.                                                                                         |
| `contentType`  | MIME type of the file.                                                                                      |
| `uploadedAt`   | Timestamp of when the file was uploaded.                                                                    |
| `uploadedBy`   | User who uploaded the file.                                                                                 |

### Referencing Attachments in Markdown

Attachments can be referenced in knowledge item markdown content using the `attachment://` protocol:

```markdown
![Company Logo](attachment://company-logo)

See the reference template: ![](attachment://ref-template)
```

When the platform renders the markdown, it resolves `attachment://` references to presigned download URLs for the corresponding files.

### CLI Sync Format

When defining knowledge sets via the CLI sync YAML format, attachments can be declared alongside items and features:

```yaml
name: "Invoice Processing Rules"
features:
  - featureType: vendor
    properties: { vendorId: "V001" }
items:
  - itemRef: "extraction-prompt-override"
attachments:
  - attachmentId: sample-invoice
    attachmentPath: ./attachments/sample-invoice.pdf
  - attachmentId: vendor-logo
    attachmentPath: ./attachments/vendor-logo.png
```

### Managing Attachments

Use the following API endpoints to manage set-level attachments:

* **List**: `GET /api/knowledge-sets/{id}/attachments`
* **Upload**: `POST /api/knowledge-sets/{id}/attachments`
* **Download**: `GET /api/knowledge-sets/{id}/attachments/{attachmentId}`
* **Delete**: `DELETE /api/knowledge-sets/{id}/attachments/{attachmentId}`

See the [Knowledge Sets API Reference](/api-reference/knowledge-sets/get-knowledge-sets-id) for full details.

## Expression-Based Matching

Knowledge sets use **expression trees** to define when a set of items should be applied to a document. Expressions support logical operators for flexible feature matching.

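
As a rough illustration of how an expression tree of this kind might be evaluated, the sketch below walks nested `FEATURE`, `AND`, `OR`, and `NOT` nodes (the operators listed in the next section) against the set of feature slugs on a document. The `Expr` class, the `matches` function, and the `vendor/globex` slug are hypothetical names chosen for this example; they are not the platform's data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Expr:
    """Hypothetical expression node; not the platform's real schema."""
    op: str                                   # "FEATURE", "AND", "OR", or "NOT"
    slug: str | None = None                   # only used by FEATURE leaves
    children: list[Expr] = field(default_factory=list)


def matches(expr: Expr, features: set[str]) -> bool:
    """Return True if the expression holds for the given feature slugs."""
    if expr.op == "FEATURE":
        return expr.slug in features
    if expr.op == "AND":
        return all(matches(child, features) for child in expr.children)
    if expr.op == "OR":
        return any(matches(child, features) for child in expr.children)
    if expr.op == "NOT":
        (child,) = expr.children              # NOT takes exactly one child
        return not matches(child, features)
    raise ValueError(f"unknown operator: {expr.op}")


# 10K filings that are not from Acme Corp
rule = Expr("AND", children=[
    Expr("FEATURE", slug="sec-filing-type/10K"),
    Expr("NOT", children=[Expr("FEATURE", slug="vendor/acme-corp")]),
])

print(matches(rule, {"sec-filing-type/10K", "vendor/globex"}))     # True
print(matches(rule, {"sec-filing-type/10K", "vendor/acme-corp"}))  # False
```

Representing every operator as the same node type keeps the evaluator to a single recursive function; a real implementation may well store expressions differently.
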
### Expression Operators

| Operator  | Behavior                                                                 |
| --------- | ------------------------------------------------------------------------ |
| `FEATURE` | Leaf node: matches if the document has the referenced feature (by slug)  |
| `AND`     | All child expressions must evaluate to true                              |
| `OR`      | At least one child expression must evaluate to true                      |
| `NOT`     | The child expression must evaluate to false                              |

### Example

To match documents that have **both** the "10K" filing type **and** the "Acme Corp" vendor feature:

```
AND
├── FEATURE: sec-filing-type/10K
└── FEATURE: vendor/acme-corp
```

To match documents that are **either** 10K **or** 10Q filings:

```
OR
├── FEATURE: sec-filing-type/10K
└── FEATURE: sec-filing-type/10Q
```

To match documents that are 10K filings but **not** from Acme Corp:

```
AND
├── FEATURE: sec-filing-type/10K
└── NOT
    └── FEATURE: vendor/acme-corp
```

### How Assessment Works

When a document's features change (e.g., a new feature is assigned via an intake, script step, or agent), the platform evaluates all knowledge sets against the document's current feature set. The assessment produces four categories:

| Category             | Meaning                                                           |
| -------------------- | ----------------------------------------------------------------- |
| **New Matches**      | Knowledge sets that now match but did not before                  |
| **Still Matching**   | Knowledge sets that continue to match                             |
| **No Longer Match**  | Knowledge sets that previously matched but no longer do           |
| **Snapshot Changed** | Knowledge sets that still match but whose items have been updated |

This drives automatic reprocessing: when a document gains or loses a knowledge set match, the platform can trigger the appropriate processing pipeline.

## Reference

For GitOps deployment of knowledge resources, see:

* [Metadata Sync](/guides/kdx-cli/sync/overview) - Deploy via `kdx sync`
* [Resource Deployments](/guides/deployment/resource-deployments) - CI/CD integration
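
Returning to the assessment step described under "How Assessment Works": the four categories can be pictured as a comparison between the knowledge sets that matched before a feature change and those that match after it. The sketch below is only an illustration under stated assumptions. The `assess` function is hypothetical, each argument maps a matching knowledge set's name to an assumed fingerprint of its items, and the sketch treats "Snapshot Changed" as separate from "Still Matching", which the platform may not do.

```python
from __future__ import annotations


def assess(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Classify knowledge sets by comparing matches before and after a change.

    Keys are knowledge set names; values are a hypothetical fingerprint
    (for example a hash) of the set's items at match time.
    """
    result: dict[str, list[str]] = {
        "new_matches": [],
        "still_matching": [],
        "no_longer_match": [],
        "snapshot_changed": [],
    }
    for name in sorted(before.keys() | after.keys()):
        if name not in before:
            result["new_matches"].append(name)
        elif name not in after:
            result["no_longer_match"].append(name)
        elif before[name] != after[name]:
            # Still matches, but the items behind the set were updated
            result["snapshot_changed"].append(name)
        else:
            result["still_matching"].append(name)
    return result


# Hypothetical data: set names from the examples above, made-up fingerprints
before = {"10K Extraction Rules": "v1", "Invoice Processing Rules": "v1"}
after = {"10K Extraction Rules": "v2", "10Q Extraction Rules": "v1"}
print(assess(before, after))
```
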