Knowledge Bindings - Kodexa Developer Portal

The knowledge global exposes the knowledge features and items that have been attached to a document family. Scripts can read what the knowledge engine has resolved for a family without re-running resolution, then dispatch downstream behaviour from a single source of truth.

knowledge is available in script steps only — not in intake scripts, event subscriptions, or selection-option formulas. It is read-only: every returned object is Object.freeze-d, and there is no setter or mutate API.

How to think about it

Knowledge in Kodexa is a declarative way to describe what a document is and what should happen to it — kept separate from the script’s runtime state. Two pieces work together:

Knowledge features are facts about a document family (vendor ID, language, document classification). They are attached to the family early in processing — typically by an upstream script step that reads upstream metadata, or by an intake module.
Knowledge sets are org-level rules that match on features and produce knowledge items with rich, type-checked properties: configuration, prompts, validation rules, processing models, anything you want a downstream script to consume.

When a script step runs, the knowledge engine has already evaluated the project-scoped sets and materialised the resulting items into the document family. Your script reads those items via the knowledge global — it does not re-implement the matching logic. The mental model is declare configuration as knowledge, dispatch in the script:

[upstream]                 [knowledge engine]              [your script]
   ─────                       ────────                       ──────
attach features  ──►  match against knowledge sets  ──►  read items, dispatch

Why we built it this way

Without knowledge, a script that needs vendor- or document-type-specific behaviour has to do one of:

Hard-code lookup tables inside the script body.
Round-trip to a service bridge for every document.
Stash configuration in environment variables or module options.
Re-implement set-matching logic by reading raw features and writing JS conditionals.

All of these put both logic and data in the script — making it harder to update one without redeploying the other. The knowledge global is designed so you can keep configuration in org metadata (versioned, reviewable, separately deployed) and keep the script thin enough that it rarely changes.

What knowledge is and isn’t

| Knowledge is good for | Knowledge is not for |
| --- | --- |
| Per-vendor / per-tenant configuration | Per-document runtime state (use script-local variables) |
| Lookup tables that change without code | Module-wide config (use module options) |
| Decoupling org-level policy from code | Cross-step communication (use the `{features: [...]}` return contract or doc metadata) |
| Audit trails of “what rules applied here” | High-frequency mutable state |

If the answer to “where does this value come from?” is “an org admin configures it in the UI”, knowledge is probably the right home. If the answer is “this script computes it for this document”, keep it in the script.

API surface

The global has four explicit accessors and four bare-form conveniences.

Bare accessors (single-family scripts)

When exactly one document family is in scope, scripts can use the bare forms. These are the 80% path for activity-plan script steps that operate on one family at a time.

| Accessor | Returns |
| --- | --- |
| `knowledge.features` | `KnowledgeFeature[]`: all features attached to the in-scope family |
| `knowledge.items` | `KnowledgeItem[]`: all knowledge items resolved on the in-scope family |
| `knowledge.featuresByType(typeSlug)` | `KnowledgeFeature[]` filtered by `featureType.slug === typeSlug` |
| `knowledge.itemsByType(typeSlug)` | `KnowledgeItem[]` filtered by `itemType.slug === typeSlug` |

Explicit accessors (always available)

| Accessor | Returns |
| --- | --- |
| `knowledge.getFeatures(familyId)` | `KnowledgeFeature[]` for the given family |
| `knowledge.getItems(familyId)` | `KnowledgeItem[]` for the given family |
| `knowledge.getFeaturesByType(familyId, typeSlug)` | filtered features |
| `knowledge.getItemsByType(familyId, typeSlug)` | filtered items |

The explicit forms are required when the script’s families slice has anything other than exactly one entry. The familyId you pass MUST be in the script’s families slice (see Family-access scoping).

Object shapes

Annotations below use TypeScript-style notation. Each returned array element is a frozen plain JS object with exactly these properties.

KnowledgeFeature

{
  id: string;                  // primary key of kdxa_knowledge_features
  uuid: string;                // stable UUID
  slug: string;                // computed from feature type + properties
  active: boolean;             // is_active column; not filtered server-side
  properties: object | null;   // free-form map; shape depends on featureType
  extendedProperties: object | null;  // free-form map; shape depends on featureType
  featureType: KnowledgeFeatureType | null;  // null if type was deleted
}

KnowledgeFeatureType (sub-record on each feature)

{
  slug: string;                // type identifier (lowercase by convention)
  name: string;                // display name
  description: string | null;  // long-form description
  icon: string | null;         // icon ref (e.g. "tag", "barcode")
  color: string | null;        // hex or theme colour ref
  options: KnowledgeOption[] | null;          // schema for `properties`
  extendedOptions: KnowledgeOption[] | null;  // schema for `extendedProperties`
  labelJsonPath: string | null;  // JSONPath / JSONata expression for human label
  useJSONata: boolean;           // true => labelJsonPath is JSONata, not JSONPath
}

KnowledgeItem

{
  id: string;                  // KDDB-side identifier
  uuid: string;                // stable UUID
  slug: string;                // type-scoped slug
  title: string;               // display title
  description: string | null;  // long-form description
  active: boolean;             // is_active flag from the resolved set
  sequenceOrder: number;       // ordering hint within the type
  properties: object | null;   // free-form map; shape depends on itemType
  knowledgeSetId: string | null;  // owning knowledge set
  itemType: KnowledgeItemType | null;  // null if type was deleted
}

KnowledgeItemType (sub-record on each item)

{
  slug: string;                // type identifier (lowercase by convention)
  name: string;                // display name
  description: string | null;  // long-form description
  options: KnowledgeOption[] | null;  // schema for `properties`
  supportsAttachment: boolean; // whether this type accepts file attachments
}

KnowledgeOption (schema entries inside `options` / `extendedOptions`)

{
  name: string;                // property key
  type: string;                // "string", "boolean", "selection", ...
  label: string;               // display label
  description: string;         // help text
  required: boolean;
  default: any;
  possibleValues: any[];       // present for "selection" type
}

properties is free-form. It is whatever shape the type author defined in options. If a type defines instructionMarkdown, domain, pipeline, etc., those keys live on properties — they are NOT separate top-level fields on the feature or item. To find what keys a type uses, inspect featureType.options (or itemType.options) and look at each option’s name.

// Right
const route = knowledge.itemsByType("processing-model")[0];
const pipeline = route.properties.pipeline;        // correct

// Wrong -- evaluates to undefined
const wrongPipeline = route.pipeline;              // there is no top-level pipeline field
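The lookup described above (reading each option's name from the type's options) can be sketched as a pair of small helpers. This is a minimal sketch under stated assumptions: `declaredKeys` and `missingRequired` are hypothetical helper names, and `sampleItem` is a hand-built stand-in for a record that `knowledge.itemsByType("processing-model")` might return. Whether the platform itself enforces `required` is not stated here, so the second helper is only a defensive check.

```javascript
// Hypothetical stand-in for one element of knowledge.itemsByType("processing-model").
var sampleItem = {
  slug: "ven-001-route",
  properties: { domain: "utilities", pipeline: "template" },
  itemType: {
    slug: "processing-model",
    options: [
      { name: "domain", type: "string", required: true },
      { name: "pipeline", type: "selection", required: true },
      { name: "hasLineItemsTemplate", type: "boolean", required: false }
    ]
  }
};

// Return the property keys a type declares, or [] when the type or its
// options are null (both are nullable in the shapes above).
function declaredKeys(type) {
  if (!type || !type.options) { return []; }
  return type.options.map(function (opt) { return opt.name; });
}

// Return the declared-required keys that are absent from the record's properties.
function missingRequired(record, type) {
  var props = record.properties || {};
  if (!type || !type.options) { return []; }
  return type.options
    .filter(function (opt) { return opt.required && !(opt.name in props); })
    .map(function (opt) { return opt.name; });
}

var keys = declaredKeys(sampleItem.itemType);
var missing = missingRequired(sampleItem, sampleItem.itemType);
```

In a real script step the record would come from the `knowledge` global rather than a literal, and the same helpers apply to `featureType.options`.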

Single-family rule

The bare accessors (knowledge.features, knowledge.items, knowledge.featuresByType, knowledge.itemsByType) auto-bind to families[0] when exactly one family is in scope. Anything else throws:

knowledge.<accessor> requires exactly one document family in scope (found N). Use knowledge.<explicit-accessor>(familyId) explicitly for multi-family scripts.

For multi-family script steps, you MUST iterate over families and use the explicit forms:

for (var i = 0; i < families.length; i++) {
  var fam = families[i];
  var feats = knowledge.getFeatures(fam.id);
  var route = knowledge.getItemsByType(fam.id, "processing-model")[0];
  // ...
}
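Because the explicit accessors work for any number of families, a script can be written once against them and run unchanged in single-family and multi-family steps. The sketch below illustrates that idea; `firstItemPerFamily` is a hypothetical helper, and the `families` and `knowledge` objects are stubs standing in for the globals the runtime provides.

```javascript
// Hypothetical stand-ins for the script globals; in a real script step
// `families` and `knowledge` are provided by the runtime.
var families = [{ id: "fam-1" }, { id: "fam-2" }];
var knowledge = {
  getItemsByType: function (familyId, typeSlug) {
    var data = {
      "fam-1": [{
        slug: "a",
        itemType: { slug: "processing-model" },
        properties: { pipeline: "llm" }
      }],
      "fam-2": []
    };
    return (data[familyId] || []).filter(function (it) {
      return it.itemType && it.itemType.slug === typeSlug;
    });
  }
};

// Build { familyId: firstItemOrNull } using only the explicit accessor,
// which works whether the step has one family or many.
function firstItemPerFamily(typeSlug) {
  var out = {};
  for (var i = 0; i < families.length; i++) {
    var id = families[i].id;
    var items = knowledge.getItemsByType(id, typeSlug);
    out[id] = items.length > 0 ? items[0] : null;
  }
  return out;
}

var routes = firstItemPerFamily("processing-model");
```

Families with no matching item map to `null`, so callers can tell "nothing resolved" apart from "family not processed".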

Family-access scoping

Even with the explicit accessors, the familyId you pass must be present in the script’s families slice (the families attached to the script’s task). Calling with a foreign family throws:

knowledge: familyId "<id>" is not in scope for this script

This is the primary tenant-isolation defence: the underlying KDDB loader and feature query are keyed by document_family_id only, so this check prevents a script from reading another tenant’s data by guessing UUIDs.
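When a family ID reaches the script from somewhere other than the families slice (for example, a value read from metadata), a pre-check avoids the throw. The following is a sketch only: `isInScope` and `featuresOrNull` are hypothetical helpers, and the `knowledge` stub merely imitates the documented error rather than reproducing the real check.

```javascript
// Hypothetical stand-ins for the script globals.
var families = [{ id: "fam-1" }];
var knowledge = {
  // Stub imitating the documented behaviour: throws for IDs outside `families`.
  getFeatures: function (familyId) {
    var ok = families.some(function (f) { return f.id === familyId; });
    if (!ok) {
      throw new Error('knowledge: familyId "' + familyId + '" is not in scope for this script');
    }
    return [];
  }
};

// True when the ID belongs to a family attached to this script's task.
function isInScope(familyId) {
  return families.some(function (f) { return f.id === familyId; });
}

// Return the family's features, or null when the ID is out of scope,
// instead of letting the accessor throw.
function featuresOrNull(familyId) {
  return isInScope(familyId) ? knowledge.getFeatures(familyId) : null;
}

var inScopeResult = featuresOrNull("fam-1");
var foreignResult = featuresOrNull("fam-from-elsewhere");
```

The pre-check is a convenience for the script author; the isolation guarantee still comes from the binding's own check.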

Write protection

Every object returned by knowledge.* is Object.freeze-d, recursively. Nested arrays and sub-records (e.g. featureType, itemType, properties) are also frozen.

| JS mode | Mutate attempt |
| --- | --- |
| Strict mode | throws `TypeError` |
| Sloppy mode | silently no-ops |

In both cases Object.isFrozen(obj) returns true. The contract is: knowledge is read-only from scripts. If you need to compute a derived value, build a new object instead of mutating the returned one:

var feats = knowledge.features;
// var enriched = feats[0];                  // frozen, can't assign new keys
// enriched.computed = "x";                  // throws in strict mode

var copy = Object.assign({}, feats[0], { computed: "x" });  // OK -- new object
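One caveat with `Object.assign` is that it copies only the top level, so nested records on the copy (such as `properties`) are still the frozen originals. If a script needs to change nested values, a recursive copy is one option. The sketch below uses a hypothetical `mutableCopy` helper; `deepFreeze` only simulates the recursive freeze on a hand-built record and is not the binding's implementation.

```javascript
// Simulate the recursive freeze on a sample record (stand-in only).
function deepFreeze(value) {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.keys(value).forEach(function (k) { deepFreeze(value[k]); });
    Object.freeze(value);
  }
  return value;
}

// Recursively copy plain objects and arrays into new, unfrozen ones.
function mutableCopy(value) {
  if (Array.isArray(value)) {
    return value.map(function (v) { return mutableCopy(v); });
  }
  if (value !== null && typeof value === "object") {
    var out = {};
    Object.keys(value).forEach(function (k) { out[k] = mutableCopy(value[k]); });
    return out;
  }
  return value;
}

var frozenFeature = deepFreeze({
  uuid: "f-1",
  properties: { vendorId: "VEN-001" },
  featureType: { slug: "vendor" }
});

// A shallow copy is a new object, but its nested records are still frozen.
var shallow = Object.assign({}, frozenFeature);
var shallowNestedFrozen = Object.isFrozen(shallow.properties);

// A deep copy owns its nested objects, so this write is allowed.
var deep = mutableCopy(frozenFeature);
deep.properties.vendorId = "VEN-999";

// In strict mode, writing to the frozen original throws a TypeError.
function tryMutate(obj) {
  "use strict";
  try {
    obj.properties.vendorId = "changed";
    return "no error";
  } catch (e) {
    return e instanceof TypeError ? "TypeError" : "other";
  }
}
var strictResult = tryMutate(frozenFeature);
```

The original record is untouched in every case; only the deep copy carries the new value.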

Performance characteristics

Per-script-execution caches keyed by family ID make repeated calls cheap.

| Resource | First-call cost | Subsequent calls (same family) |
| --- | --- | --- |
| Features (bare or explicit) | 1 SQL query joining `kdxa_document_family_features` -> `kdxa_knowledge_features` -> `kdxa_knowledge_feature_types` | map hit, no I/O |
| Items (bare or explicit) | 1 KDDB document load + 1 read of `getKnowledge()` | map hit, no I/O |
| `featuresByType` / `itemsByType` | filters the cached full list; no extra queries | map hit, no I/O |

Calling knowledge.featuresByType("vendor") ten times in the same script runs one query.

The KDDB load behind knowledge.items is a separate loader from loadDocument(familyId). If a script calls both, the KDDB is loaded twice (once for each cache). This is acceptable today but may be consolidated in a future revision.
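The caching behaviour described in this section can be pictured with a toy model. This is an illustrative sketch, not the orchestrator's code: `loadFeaturesFromStore`, `featureCache`, and the counter are all hypothetical names, and the loader returns canned data in place of the SQL query.

```javascript
// Hypothetical loader standing in for the SQL query; counts its invocations.
var queryCount = 0;
function loadFeaturesFromStore(familyId) {
  queryCount += 1;
  return [
    { slug: "vendor-ven-001", featureType: { slug: "vendor" } },
    { slug: "lang-en", featureType: { slug: "language" } }
  ];
}

// Cache of full feature lists, keyed by family ID, living for one script run.
var featureCache = {};
function getFeatures(familyId) {
  if (!Object.prototype.hasOwnProperty.call(featureCache, familyId)) {
    featureCache[familyId] = loadFeaturesFromStore(familyId);
  }
  return featureCache[familyId];
}

// Typed lookups filter the cached list instead of issuing another query.
function getFeaturesByType(familyId, typeSlug) {
  return getFeatures(familyId).filter(function (f) {
    return f.featureType !== null && f.featureType.slug === typeSlug;
  });
}

// Ten typed lookups on the same family trigger a single load.
for (var i = 0; i < 10; i++) {
  getFeaturesByType("fam-1", "vendor");
}
var loadsAfterTenCalls = queryCount;
```

In this model a second family would cost one more load, which matches the "keyed by family ID" wording above.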

Item ordering

knowledge.items and knowledge.itemsByType always return items sorted by:

sequenceOrder ASC
ties broken by slug ASC

This is a documented contract — scripts that index by position (e.g. items[0]) may rely on it. Without this, scripts would be at the mercy of whatever order the KDDB happens to return. Features have no documented ordering — they come back in whatever order the SQL join returns them.
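When building test fixtures that should mimic this ordering, the rule can be written as a comparator. This is a sketch under one loud assumption: slugs are compared by plain string comparison here, while the collation the platform actually uses for the tie-break is not stated on this page. `compareItems` is a hypothetical name.

```javascript
// Order by sequenceOrder ascending, then slug ascending.
// Assumption: slugs compare by plain code-unit order.
function compareItems(a, b) {
  if (a.sequenceOrder !== b.sequenceOrder) {
    return a.sequenceOrder - b.sequenceOrder;
  }
  if (a.slug < b.slug) { return -1; }
  if (a.slug > b.slug) { return 1; }
  return 0;
}

// Hand-built fixture; real items come back from knowledge.items already sorted.
var unsorted = [
  { slug: "zeta", sequenceOrder: 1 },
  { slug: "alpha", sequenceOrder: 2 },
  { slug: "beta", sequenceOrder: 1 }
];

// slice() copies first, so the fixture array itself is left untouched.
var sorted = unsorted.slice().sort(compareItems);
var order = sorted.map(function (it) { return it.slug; });
// order is ["beta", "zeta", "alpha"]
```

Copying before sorting also matters for real results, since the arrays returned by the binding are frozen and cannot be sorted in place.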

Active-features policy

The bindings return all features and items regardless of the active flag, with the boolean exposed on each instance. If you want only-active records, filter in JS:

var activeOnly = knowledge.features.filter(function (f) { return f.active; });
var activeRoutes = knowledge.itemsByType("processing-model").filter(function (it) { return it.active; });

The rationale: scripts that audit, log, or report on inactive records need to see them too. Filtering up-front would have been a footgun.

Empty-result handling

Filtered accessors return [] when nothing matches. Indexing [0] on an empty array yields undefined, and any property access after that throws. Always guard:

var route = knowledge.itemsByType("processing-model")[0];
if (!route) {
  throw new Error("no processing-model resolved -- vendor missing or unmapped");
}
// safe to use route.properties from here on
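Since `properties` is itself nullable in the object shapes, a guard on the record alone may not be enough for scripts that prefer a fallback to a throw. A small accessor can cover both cases. This is a minimal sketch; `prop` is a hypothetical helper name, and the empty array stands in for an accessor result with no matches.

```javascript
// Read one key from a record's `properties`, tolerating a missing record,
// null properties, or an absent key.
function prop(record, key, fallback) {
  if (!record || !record.properties) { return fallback; }
  var value = record.properties[key];
  return value === undefined ? fallback : value;
}

var items = [];                 // stand-in for an empty accessor result
var route = items[0];           // undefined, because nothing matched
var pipeline = prop(route, "pipeline", "unknown");
```

The helper returns stored `false` or `0` values as-is and falls back only when the key is truly absent.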

Null `featureType` / `itemType`

In pathological cases (e.g. the type was deleted via direct DB access despite the immutability claims) the enrichment lookup returns no row. The binding surfaces this as featureType: null (or itemType: null) on the affected instance rather than throwing. Null-check before accessing nested fields:

var feats = knowledge.features;
for (var i = 0; i < feats.length; i++) {
  var slug = feats[i].featureType ? feats[i].featureType.slug : "(orphaned)";
  log.info("feature", feats[i].uuid, "type:", slug);
}

itemType may also be null when a knowledge item references a type from a different organisation than the script’s context, since the type lookup is org-scoped.

Type-slug case sensitivity

featuresByType / itemsByType perform an exact string match against the type’s slug. Slugs are lowercase by convention — featuresByType("Vendor") will not match a type whose slug is "vendor".
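If a slug arrives from user input or configuration with inconsistent casing, normalising it before the lookup avoids a silent empty result. The sketch below assumes every type slug in the org really is lowercase, which the convention suggests but does not guarantee. `featuresByTypeLoose` is a hypothetical helper and the `knowledge` object is a stub with the exact-match behaviour described above.

```javascript
// Stub with exact-match behaviour (stand-in for the real global).
var knowledge = {
  featuresByType: function (typeSlug) {
    var all = [{ uuid: "f-1", featureType: { slug: "vendor" } }];
    return all.filter(function (f) {
      return f.featureType !== null && f.featureType.slug === typeSlug;
    });
  }
};

// Normalise a caller-supplied slug before the lookup.
// Assumption: all type slugs are lowercase, as the convention suggests.
function featuresByTypeLoose(typeSlug) {
  return knowledge.featuresByType(String(typeSlug).trim().toLowerCase());
}

var exact = knowledge.featuresByType("Vendor");   // no match: case differs
var loose = featuresByTypeLoose("Vendor");        // matches after lowercasing
```

If an org has mixed-case slugs, this normalisation would hide them, so it is only appropriate where the lowercase convention is known to hold.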

Kill switch

There is no runtime feature flag for the knowledge global. Emergency disable requires commenting out the registration in kodexa-orchestrator/internal/service/planner_script_adapters.go (the kb.Register(vm) call near the bottom of plannerContextBindings.Register) and redeploying.

What’s NOT here (deferred to v2)

The following are not available in v1:

Resolution metadata (knowledge.matches, ResolutionMatch objects). The engine does not currently persist per-set resolution decisions; surfacing them requires engine changes.
Per-item source / clauses. Which knowledge clause produced an item is not exposed.
Attachment-fetch accessors. itemType.supportsAttachment is exposed, but there is no accessor to fetch the underlying attachment (presigned URL, content, etc.).
Mid-script re-resolution. The bindings read what is already attached to the family; they do not re-run the knowledge engine. If you need fresh resolution, run a separate engine invocation upstream of the script.
Feature provenance read fields (feature.attachedAt, feature.attachedBy). The schema and write-side population land in v1’s parallel track so the data accumulates; the read API ships in v2.

Worked example: vendor routing

A common case-analysis pattern: a single document family carries a vendor feature emitted upstream, the knowledge engine resolves the family’s processing-model knowledge set, and a downstream script step dispatches based on the resolved item’s properties.

// Step 1 (upstream) emits the vendor feature from the family's metadata.
// Step 2 (this script step) reads the resolved processing-model item and
// returns a dispatch action.

const route = knowledge.itemsByType("processing-model")[0];
if (!route) {
    throw new Error("no processing-model resolved -- vendor missing or unmapped");
}

// route.properties is shaped by the processing-model item type's options:
//   { domain: "utilities", pipeline: "template", hasLineItemsTemplate: false }
switch (route.properties.pipeline) {
    case "template": return { action: "template_path" };
    case "llm":      return { action: "llm_path" };
    default:         return { action: "unknown" };
}

The same pattern applied to multi-family steps:

var dispatches = [];
for (var i = 0; i < families.length; i++) {
  var fam = families[i];
  var route = knowledge.getItemsByType(fam.id, "processing-model")[0];
  if (!route) {
    log.warn("no processing-model for family", fam.id);
    continue;
  }
  dispatches.push({ familyId: fam.id, pipeline: route.properties.pipeline });
}
return { dispatches: dispatches };

Common patterns

Pattern 1: Two-step feature emission then read

The canonical use of knowledge is a two-step activity-plan flow: an upstream script step emits a feature using the planner’s {features: [...]} return contract; the orchestrator runs AssessAndEnrich between steps; a downstream script reads the resolved items via knowledge.itemsByType(...).

// Step 1: emit_features (script step, dependsOn: [])
// Read upstream metadata and emit a feature so the knowledge engine
// can resolve which items apply to this document.
var doc = loadDocument(families[0].id);
var vendorId = (doc.metadata || {})["CustomFields.PrimaryVendorId"];
if (!vendorId) {
  throw new Error("PrimaryVendorId missing -- intake metadata is incomplete");
}
return {
  action: "ok",
  features: [{
    documentFamilyId: families[0].id,
    featureTypeSlug: "vendor",
    properties: { vendorId: String(vendorId) }
  }]
};

// Step 2: classify_path (script step, dependsOn: [emit_features, ocr])
// AssessAndEnrich has run between the two steps. Read the resolved item.
var route = knowledge.itemsByType("processing-model")[0];
if (!route) {
  // Useful fallback: keep the legacy path active until coverage is complete.
  log.warn("No processing-model resolved; falling back to template_single");
  return { action: "template_single" };
}
switch (route.properties.pipeline) {
  case "template": return { action: "template_path" };
  case "llm":      return { action: "llm_path" };
  default:         return { action: "unknown" };
}

The split keeps each step focused: step 1 sources data from upstream, step 2 dispatches based on org-level configuration.

Pattern 2: Per-vendor configuration without a lookup table

Older code commonly contains hard-coded tables like:

// Before: vendor-specific knowledge baked into the script
var TEMPLATE_VENDORS = ["VEN-001", "VEN-005", "VEN-018"];
var LINE_ITEM_VENDORS = ["VEN-002", "VEN-018"];
if (TEMPLATE_VENDORS.indexOf(vendorId) >= 0) { /* ... */ }

Knowledge replaces this with a declarative item per vendor:

// After: read the vendor's processing-model item; let knowledge sets resolve it
var route = knowledge.itemsByType("processing-model")[0];
if (route && route.properties.pipeline === "template") {
  // ...
  if (route.properties.hasLineItemsTemplate) { /* ... */ }
}

Now adding a new vendor is a knowledge-set edit (UI or YAML), not a code change.

Pattern 3: Conditional behaviour gated on feature presence

Sometimes you don’t need a knowledge item at all — just the existence of a feature. Use featuresByType:

// Skip OCR if the document has already been classified as machine-readable upstream.
var classifications = knowledge.featuresByType("document-classification");
var alreadyOCRed = classifications.some(function (f) {
  // properties is nullable, so guard before reading a key from it
  return f.properties && f.properties.layer === "text-extracted";
});
if (alreadyOCRed) {
  return { action: "skip_ocr" };
}

Pattern 4: Multi-vendor / batch script steps

When the step’s families slice has more than one entry, the bare accessors throw. Loop and use the explicit forms:

var dispatches = [];
for (var i = 0; i < families.length; i++) {
  var fam = families[i];
  var route = knowledge.getItemsByType(fam.id, "processing-model")[0];
  if (!route) {
    log.warn("no processing-model for family", fam.id);
    continue;
  }
  dispatches.push({ familyId: fam.id, pipeline: route.properties.pipeline });
}
return { action: "dispatched", count: dispatches.length, items: dispatches };

Pattern 5: Audit / inspection

A read-only inspection script can summarise what the engine has done:

var feats = knowledge.features;
var items = knowledge.items;
log.info("knowledge state",
  "features", feats.length,
  "items", items.length,
  "byType", items.reduce(function (acc, it) {
    var t = it.itemType ? it.itemType.slug : "(orphaned)";
    acc[t] = (acc[t] || 0) + 1;
    return acc;
  }, {})
);
return { action: "audited" };

This is useful in test pipelines, before-merge checks, and triage activity-plans where you want visibility into what was resolved.

Choosing knowledge vs. alternatives

A short decision guide for “where should this configuration live?”:

| Source of the value | Recommended home | Notes |
| --- | --- | --- |
| Org admin sets it once per vendor | Knowledge item (this binding) | Versioned, reviewable, no redeploy to update |
| Same for every project of a given type | Project template option | Set when a project is created from the template |
| Same for every document a module processes | Module option | Visible at module-config time |
| Computed at runtime from this specific document | Script-local variable | Don’t shoehorn into knowledge |
| Cross-step within one activity run | `{features: [...]}` return contract, or doc metadata | Knowledge is for stable config, not run-state |
| Upstream system pushes it on every doc | Document metadata / intake | Surface as a knowledge feature only if downstream knowledge sets need to match on it |

The recurring test: if the org admin needs to change this value tomorrow, do they have to redeploy code, redeploy the project template, or just edit a knowledge item? The last is the cheapest — favour it when the value is genuinely org-scoped.
