The Knowledge System is how Kodexa captures information about documents and uses that information to customize processing. It connects what you know about documents to what you do with them.

The Two Sides of Knowledge

“What We Know” - Document Metadata

| Concept | Purpose | Example |
| --- | --- | --- |
| Knowledge Feature Type | Template defining a category of metadata | “Vendor”, “Document Type”, “Language” |
| Knowledge Feature | Reusable instance (shared across documents) | “Acme Corp”, “10K”, “English” |

You define Feature Types once, then create Features of that type. Multiple documents can share the same Feature (e.g., 50 invoices all linked to “Acme Corp”).

“What We Do” - Configurable Behaviors

| Concept | Purpose | Example |
| --- | --- | --- |
| Knowledge Item Type | Template defining a configurable capability | “Prompt Override”, “Validation Rule” |
| Knowledge Item | Specific configured behavior | “Use SEC prompt for revenue extraction” |

Item Types define what can be configured. Items are the actual configurations with specific values.

The Bridge - Knowledge Sets

| Concept | Purpose | Example |
| --- | --- | --- |
| Knowledge Set | Connects features to items | “When Document Type = 10K, use SEC extraction prompts” |

Knowledge Sets are the rules that say “when a document has these features, apply these items.”

When Do You Need This?

Track Document Metadata

Track which vendor invoices came from, classify document types, identify languages

Customize Extraction

Use different extraction prompts for different document types

Apply Validation Rules

Apply specific validation rules based on document characteristics

Automate Configuration

Let agents propose knowledge configurations for human approval

Quick Example: Vendor Tracking

Goal: Track which vendor each invoice comes from.

Step 1: Create a Feature Type
slug: vendor
name: Vendor
description: The vendor that issued this invoice
options:
  - name: vendorId
    type: string
    label: Vendor ID
extendedOptions:
  - name: displayName
    type: string
    label: Display Name
Step 2: Create Features
# Feature 1
featureType: vendor
properties:
  vendorId: "V001"
extendedProperties:
  displayName: "Acme Corporation"
---
# Feature 2
featureType: vendor
properties:
  vendorId: "V002"
extendedProperties:
  displayName: "Globex Inc"
Step 3: Link to Documents

As invoices are processed, they are linked to the appropriate vendor feature. Now you can:
  • Search for all documents from a specific vendor
  • See which vendor a document belongs to
  • Trigger different processing based on vendor
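
The linking step can be pictured as a many-to-one mapping from documents to shared features. The Python sketch below is illustrative only: `Feature`, `FeatureIndex`, `link`, `documents_for`, and `features_of` are hypothetical names rather than part of the Kodexa API, and the invoice IDs are made up. It shows how one vendor feature can be shared by several invoices and then used for lookup in both directions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A reusable feature instance, identified by its type slug and natural key."""
    feature_type: str
    key: str


class FeatureIndex:
    """Hypothetical in-memory index of document-to-feature links."""

    def __init__(self):
        self._docs_by_feature = defaultdict(set)
        self._features_by_doc = defaultdict(set)

    def link(self, document_id, feature):
        # Record the link in both directions so either lookup is cheap.
        self._docs_by_feature[feature].add(document_id)
        self._features_by_doc[document_id].add(feature)

    def documents_for(self, feature):
        """All documents linked to the given feature, sorted by ID."""
        return sorted(self._docs_by_feature[feature])

    def features_of(self, document_id, feature_type):
        """Keys of the features of one type linked to a document."""
        return sorted(
            f.key
            for f in self._features_by_doc[document_id]
            if f.feature_type == feature_type
        )


acme = Feature("vendor", "V001")
globex = Feature("vendor", "V002")

index = FeatureIndex()
index.link("invoice-001", acme)
index.link("invoice-002", acme)
index.link("invoice-003", globex)

print(index.documents_for(acme))                   # documents from Acme
print(index.features_of("invoice-003", "vendor"))  # vendor of one invoice
```

Because the feature is a shared instance rather than a per-document copy, renaming a vendor or searching by vendor touches one record instead of every invoice.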

Quick Example: Customizing Extraction

Goal: Use different extraction prompts for 10K vs 10Q documents.

Step 1: Create Feature Type + Features
# Feature Type
slug: sec-filing-type
name: SEC Filing Type
options:
  - name: filingType
    type: string
---
# Features
- featureType: sec-filing-type
  properties: { filingType: "10K" }

- featureType: sec-filing-type
  properties: { filingType: "10Q" }
Step 2: Create Item Type + Items
# Item Type
slug: extraction-prompt
name: Extraction Prompt Override
options:
  - name: promptText
    type: text
---
# Items
- knowledgeItemType: extraction-prompt
  title: "10K Revenue Prompt"
  properties:
    promptText: "Extract annual revenue from the 10K financial statements..."

- knowledgeItemType: extraction-prompt
  title: "10Q Revenue Prompt"
  properties:
    promptText: "Extract quarterly revenue from the 10Q financial statements..."
Step 3: Create Knowledge Sets
# When document is 10K, use 10K prompt
- name: "10K Extraction Rules"
  features:
    - featureType: sec-filing-type
      properties: { filingType: "10K" }
  items:
    - itemRef: "10K Revenue Prompt"

# When document is 10Q, use 10Q prompt
- name: "10Q Extraction Rules"
  features:
    - featureType: sec-filing-type
      properties: { filingType: "10Q" }
  items:
    - itemRef: "10Q Revenue Prompt"
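
To make the effect of these two knowledge sets concrete, the following Python sketch resolves which items apply to a document. It is a simplified model, not Kodexa's implementation: `applicable_items` is a hypothetical helper, features are reduced to `(featureType, value)` tuples, and a set is assumed to match only when all of its listed features are present on the document.

```python
def applicable_items(document_features, knowledge_sets):
    """Collect item references from every set whose features are all present.

    Assumption: the features listed on a knowledge set are combined with AND.
    """
    matched = []
    for knowledge_set in knowledge_sets:
        if all(f in document_features for f in knowledge_set["features"]):
            matched.extend(knowledge_set["items"])
    return matched


# The two knowledge sets from Step 3, reduced to plain Python data.
knowledge_sets = [
    {
        "name": "10K Extraction Rules",
        "features": [("sec-filing-type", "10K")],
        "items": ["10K Revenue Prompt"],
    },
    {
        "name": "10Q Extraction Rules",
        "features": [("sec-filing-type", "10Q")],
        "items": ["10Q Revenue Prompt"],
    },
]

# A 10K filing that also carries a vendor feature.
annual_report = {("sec-filing-type", "10K"), ("vendor", "V001")}
quarterly_report = {("sec-filing-type", "10Q")}

print(applicable_items(annual_report, knowledge_sets))
print(applicable_items(quarterly_report, knowledge_sets))
```

A document with neither filing type matches no set, so it would fall back to whatever default prompt the extraction step uses.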

Detailed Guides

Knowledge Feature Types

Define categories of document metadata with natural keys and display properties

Knowledge Item Types

Define configurable capabilities like prompt overrides and validation rules

Customizing Extraction

End-to-end guide: different prompts for different document types

Adding Validation Rules

End-to-end guide: conditional validation based on document features

Knowledge and Agents

How agents build and consume knowledge with human-in-the-loop approval

Expression-Based Matching

Knowledge sets use expression trees to define when a set of items should be applied to a document. Expressions support logical operators for flexible feature matching.

Expression Operators

| Operator | Behavior |
| --- | --- |
| FEATURE | Leaf node: matches if the document has the referenced feature (by slug) |
| AND | All child expressions must evaluate to true |
| OR | At least one child expression must evaluate to true |
| NOT | The child expression must evaluate to false |

Example

To match documents that have both the “10K” filing type and the “Acme Corp” vendor feature:
AND
├── FEATURE: sec-filing-type/10K
└── FEATURE: vendor/acme-corp
To match documents that are either 10K or 10Q filings:
OR
├── FEATURE: sec-filing-type/10K
└── FEATURE: sec-filing-type/10Q
To match documents that are 10K filings but not from Acme Corp:
AND
├── FEATURE: sec-filing-type/10K
└── NOT
    └── FEATURE: vendor/acme-corp
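
The three trees above can be evaluated with a short recursive function. The Python sketch below is one possible reading of the operator table, not the platform's own evaluator: the dictionary layout (`op`, `ref`, `children`), the `evaluate` and `feature` helpers, and the `vendor/globex` reference are all assumptions made for illustration.

```python
def evaluate(expr, features):
    """Evaluate an expression tree against a set of feature references."""
    op = expr["op"]
    if op == "FEATURE":
        # Leaf node: true when the document carries the referenced feature.
        return expr["ref"] in features
    if op == "AND":
        return all(evaluate(child, features) for child in expr["children"])
    if op == "OR":
        return any(evaluate(child, features) for child in expr["children"])
    if op == "NOT":
        # NOT takes exactly one child; unpacking fails loudly otherwise.
        (child,) = expr["children"]
        return not evaluate(child, features)
    raise ValueError(f"Unknown operator: {op}")


def feature(ref):
    """Build a FEATURE leaf node."""
    return {"op": "FEATURE", "ref": ref}


ten_k = feature("sec-filing-type/10K")
ten_q = feature("sec-filing-type/10Q")
acme = feature("vendor/acme-corp")

both = {"op": "AND", "children": [ten_k, acme]}
either = {"op": "OR", "children": [ten_k, ten_q]}
ten_k_not_acme = {
    "op": "AND",
    "children": [ten_k, {"op": "NOT", "children": [acme]}],
}

# A 10K filing from a different (hypothetical) vendor.
document = {"sec-filing-type/10K", "vendor/globex"}

print(evaluate(both, document))            # False: the Acme feature is missing
print(evaluate(either, document))          # True: the 10K branch matches
print(evaluate(ten_k_not_acme, document))  # True: 10K and not from Acme
```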

How Assessment Works

When a document’s features change (e.g., a new feature is assigned via an intake, script step, or agent), the platform evaluates all knowledge sets against the document’s current feature set. The assessment produces four categories:

| Category | Meaning |
| --- | --- |
| New Matches | Knowledge sets that now match but did not before |
| Still Matching | Knowledge sets that continue to match |
| No Longer Match | Knowledge sets that previously matched but no longer do |
| Snapshot Changed | Knowledge sets that still match but whose items have been updated |

The assessment drives automatic reprocessing: when a document gains or loses a knowledge set match, the platform can trigger the appropriate processing pipeline.
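
The four assessment categories amount to a diff between the knowledge sets that matched before and those that match now. The Python sketch below illustrates that comparison under stated assumptions: `assess` is a hypothetical function, each matching set is represented by a name mapped to an opaque snapshot value for its items, and "Still Matching" and "Snapshot Changed" are treated as mutually exclusive, which the description above does not specify. The set names and snapshot values are invented.

```python
def assess(previous, current):
    """Classify knowledge sets by comparing previous and current matches.

    Both arguments map a matching knowledge set's name to a snapshot of its
    items (any comparable value, such as a version string or a hash).
    """
    result = {
        "new_matches": [],
        "still_matching": [],
        "no_longer_match": [],
        "snapshot_changed": [],
    }
    for name in sorted(previous.keys() | current.keys()):
        if name not in previous:
            result["new_matches"].append(name)
        elif name not in current:
            result["no_longer_match"].append(name)
        elif previous[name] != current[name]:
            # Still matches, but the items behind the set were updated.
            result["snapshot_changed"].append(name)
        else:
            result["still_matching"].append(name)
    return result


previous = {
    "10K Extraction Rules": "v1",
    "Acme Rules": "v1",
    "Legacy Rules": "v3",
}
current = {
    "10K Extraction Rules": "v1",
    "Acme Rules": "v2",
    "Audit Rules": "v1",
}

for category, names in assess(previous, current).items():
    print(category, names)
```

In this model, any non-empty category other than `still_matching` is a signal that the document's processing may need to be rerun.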

Reference

For GitOps deployment of knowledge resources, see: