
Knowledge System

The Knowledge System is how Kodexa captures information about documents and uses that information to customize processing. It connects what you know about documents to what you do with them.

The Two Sides of Knowledge

“What We Know” - Document Metadata

Concept | Purpose | Example
Knowledge Feature Type | Template defining a category of metadata | “Vendor”, “Document Type”, “Language”
Knowledge Feature | Reusable instance (shared across documents) | “Acme Corp”, “10K”, “English”

You define Feature Types once, then create Features of that type. Multiple documents can share the same Feature (e.g., 50 invoices all linked to “Acme Corp”).
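To make the type/instance split concrete, here is a minimal in-memory sketch in Python. It is not the Kodexa API: the classes `FeatureType`, `Feature`, and `Document` are hypothetical stand-ins, chosen only to show one Feature being defined once and then shared by many documents.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FeatureType:
    """Hypothetical stand-in for a Knowledge Feature Type (the template)."""
    slug: str
    name: str


@dataclass(frozen=True)
class Feature:
    """Hypothetical stand-in for a Knowledge Feature (a reusable instance)."""
    feature_type: FeatureType
    value: str


@dataclass
class Document:
    """A document holds references to shared Feature objects, not copies."""
    name: str
    features: list = field(default_factory=list)


# Define the type once, then create a Feature of that type.
vendor_type = FeatureType(slug="vendor", name="Vendor")
acme = Feature(feature_type=vendor_type, value="Acme Corp")

# 50 invoices all link to the same Feature instance.
invoices = [Document(name=f"invoice-{i:03d}.pdf") for i in range(1, 51)]
for invoice in invoices:
    invoice.features.append(acme)

# Count the documents that carry the shared Feature.
shared = sum(1 for doc in invoices if acme in doc.features)
print(shared)
```

Because every invoice points at the same `acme` object, a change to the vendor record would only need to be made in one place.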

“What We Do” - Configurable Behaviors

Concept | Purpose | Example
Knowledge Item Type | Template defining a configurable capability | “Prompt Override”, “Validation Rule”
Knowledge Item | Specific configured behavior | “Use SEC prompt for revenue extraction”

Item Types define what can be configured. Items are the actual configurations with specific values.

The Bridge - Knowledge Sets

Concept | Purpose | Example
Knowledge Set | Connects features to items | “When Document Type = 10K, use SEC extraction prompts”

Knowledge Sets are the rules that say “when a document has these features, apply these items.”
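The rule "when a document has these features, apply these items" can be sketched as a subset test. This is an illustration only, not Kodexa's matching engine: the function `matching_items`, the tuple representation of features, and the "Acme 10K rules" set are hypothetical. It also assumes that a set listing several features requires all of them to be present.

```python
def matching_items(document_features, knowledge_sets):
    """Collect items from every knowledge set whose features are all present.

    Assumption: a knowledge set applies only when ALL of its features are
    on the document (AND semantics).
    """
    items = []
    for knowledge_set in knowledge_sets:
        # Set subset test: every required feature is on the document.
        if knowledge_set["features"] <= document_features:
            items.extend(knowledge_set["items"])
    return items


knowledge_sets = [
    {
        "name": "10K rules",
        "features": {("Document Type", "10K")},
        "items": ["Use SEC prompt for revenue extraction"],
    },
    {
        "name": "Acme 10K rules",  # hypothetical set, for illustration only
        "features": {("Document Type", "10K"), ("Vendor", "Acme Corp")},
        "items": ["Hypothetical Acme validation rule"],
    },
]

english_10k = {("Document Type", "10K"), ("Language", "English")}
acme_10k = {("Document Type", "10K"), ("Vendor", "Acme Corp")}

print(matching_items(english_10k, knowledge_sets))
print(matching_items(acme_10k, knowledge_sets))
```

The first document matches only the general 10K set. The second matches both sets, so it picks up both items.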

When Do You Need This?

Quick Example: Vendor Tracking

Goal: Track which vendor each invoice comes from.
Step 1: Create a Feature Type
slug: vendor
name: Vendor
description: The vendor that issued this invoice
options:
  - name: vendorId
    type: string
    label: Vendor ID
extendedOptions:
  - name: displayName
    type: string
    label: Display Name
Step 2: Create Features
# Feature 1
featureType: vendor
properties:
  vendorId: "V001"
extendedProperties:
  displayName: "Acme Corporation"

---
# Feature 2
featureType: vendor
properties:
  vendorId: "V002"
extendedProperties:
  displayName: "Globex Inc"
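In the examples above, each key under `properties` and `extendedProperties` matches a name declared under the Feature Type's `options` and `extendedOptions`. Assuming that correspondence is required (the source does not say so explicitly), a quick consistency check might look like the following sketch. The helper `undeclared_keys` is hypothetical and not part of any Kodexa library.

```python
# Feature Type from Step 1, written as a plain Python dict.
feature_type = {
    "slug": "vendor",
    "options": [{"name": "vendorId", "type": "string"}],
    "extendedOptions": [{"name": "displayName", "type": "string"}],
}


def undeclared_keys(feature, feature_type):
    """Return property keys on the feature that its type does not declare."""
    declared = {opt["name"] for opt in feature_type.get("options", [])}
    declared_ext = {opt["name"] for opt in feature_type.get("extendedOptions", [])}
    extra = set(feature.get("properties", {})) - declared
    extra_ext = set(feature.get("extendedProperties", {})) - declared_ext
    return sorted(extra | extra_ext)


good = {
    "featureType": "vendor",
    "properties": {"vendorId": "V001"},
    "extendedProperties": {"displayName": "Acme Corporation"},
}
# Hypothetical mistake: key spelled "vendorID" instead of "vendorId".
typo = {
    "featureType": "vendor",
    "properties": {"vendorID": "V002"},
    "extendedProperties": {"displayName": "Globex Inc"},
}

print(undeclared_keys(good, feature_type))
print(undeclared_keys(typo, feature_type))
```

A check like this catches misspelled keys before Features are created.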
Step 3: Link to Documents
As invoices are processed, they get linked to the appropriate vendor feature. Now you can:
  • Search for all documents from a specific vendor
  • See which vendor a document belongs to
  • Trigger different processing based on vendor
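The three capabilities above can be pictured with a small in-memory index. This is a sketch only: the `link` and `pipeline_for` helpers, the document names, and the pipeline names are hypothetical, and Kodexa's own storage and search are not shown here.

```python
from collections import defaultdict

docs_by_vendor = defaultdict(set)  # vendorId -> document names
vendor_by_doc = {}                 # document name -> vendorId


def link(document, vendor_id):
    """Record that a document belongs to a vendor Feature (both directions)."""
    docs_by_vendor[vendor_id].add(document)
    vendor_by_doc[document] = vendor_id


link("invoice-001.pdf", "V001")
link("invoice-002.pdf", "V002")
link("invoice-003.pdf", "V001")

# Hypothetical per-vendor processing; unknown vendors use a default.
pipelines = {"V001": "acme-pipeline"}


def pipeline_for(document):
    """Pick a processing pipeline based on the document's vendor."""
    return pipelines.get(vendor_by_doc[document], "default-pipeline")


print(sorted(docs_by_vendor["V001"]))     # all documents from vendor V001
print(vendor_by_doc["invoice-002.pdf"])   # vendor for one document
print(pipeline_for("invoice-002.pdf"))    # vendor-specific processing
```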

Quick Example: Customizing Extraction

Goal: Use different extraction prompts for 10K vs 10Q documents.
Step 1: Create Feature Type + Features
# Feature Type
slug: sec-filing-type
name: SEC Filing Type
options:
  - name: filingType
    type: string

---
# Features
- featureType: sec-filing-type
  properties: { filingType: "10K" }

- featureType: sec-filing-type
  properties: { filingType: "10Q" }
Step 2: Create Item Type + Items
# Item Type
slug: extraction-prompt
name: Extraction Prompt Override
options:
  - name: promptText
    type: text

---
# Items
- knowledgeItemType: extraction-prompt
  title: "10K Revenue Prompt"
  properties:
    promptText: "Extract annual revenue from the 10K financial statements..."

- knowledgeItemType: extraction-prompt
  title: "10Q Revenue Prompt"
  properties:
    promptText: "Extract quarterly revenue from the 10Q financial statements..."
Step 3: Create Knowledge Sets
# When document is 10K, use 10K prompt
- name: "10K Extraction Rules"
  features:
    - featureType: sec-filing-type
      properties: { filingType: "10K" }
  items:
    - itemRef: "10K Revenue Prompt"

# When document is 10Q, use 10Q prompt
- name: "10Q Extraction Rules"
  features:
    - featureType: sec-filing-type
      properties: { filingType: "10Q" }
  items:
    - itemRef: "10Q Revenue Prompt"
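To show how these three steps fit together, the sketch below mirrors the YAML as plain Python data and resolves which prompt applies to a document. It is an illustration under assumptions, not Kodexa's runtime: the function `resolve_prompts` is hypothetical, and it assumes that `itemRef` is resolved by the item's `title`.

```python
# Items from Step 2, keyed by title (assumed to be what itemRef points at).
prompts_by_title = {
    "10K Revenue Prompt": "Extract annual revenue from the 10K financial statements...",
    "10Q Revenue Prompt": "Extract quarterly revenue from the 10Q financial statements...",
}

# Knowledge Sets from Step 3.
knowledge_sets = [
    {
        "name": "10K Extraction Rules",
        "features": [{"featureType": "sec-filing-type",
                      "properties": {"filingType": "10K"}}],
        "items": [{"itemRef": "10K Revenue Prompt"}],
    },
    {
        "name": "10Q Extraction Rules",
        "features": [{"featureType": "sec-filing-type",
                      "properties": {"filingType": "10Q"}}],
        "items": [{"itemRef": "10Q Revenue Prompt"}],
    },
]


def resolve_prompts(document_features):
    """Return prompt texts from every set whose features the document has."""
    resolved = []
    for knowledge_set in knowledge_sets:
        if all(f in document_features for f in knowledge_set["features"]):
            resolved.extend(prompts_by_title[ref["itemRef"]]
                            for ref in knowledge_set["items"])
    return resolved


doc_10q = [{"featureType": "sec-filing-type", "properties": {"filingType": "10Q"}}]
doc_other = [{"featureType": "sec-filing-type", "properties": {"filingType": "8K"}}]

print(resolve_prompts(doc_10q))
print(resolve_prompts(doc_other))  # no matching set, so no override applies
```

A document whose features match no Knowledge Set gets an empty list here, which a caller could treat as "use the default prompt".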

Detailed Guides

Reference

For GitOps deployment of knowledge resources, see: