
# Knowledge Bindings

> Read knowledge features and items attached to a document family directly from script steps using the read-only knowledge global.

The `knowledge` global exposes the knowledge features and items that have been attached to a document family. Scripts can read what the knowledge engine has resolved for a family without re-running resolution, then dispatch downstream behaviour from a single source of truth.

<Note>
  `knowledge` is available in **script steps** only -- not in intake scripts, event subscriptions, or selection-option formulas. It is read-only: every returned object is `Object.freeze`-d, and there is no setter or mutate API.
</Note>

## How to think about it

Knowledge in Kodexa is a **declarative** way to describe what a document is and what should happen to it -- kept separate from the script's runtime state. Two pieces work together:

* **Knowledge features** are facts *about* a document family (vendor ID, language, document classification). They are attached to the family early in processing -- typically by an upstream script step that reads upstream metadata, or by an intake module.
* **Knowledge sets** are org-level rules that match on features and produce **knowledge items** with rich, type-checked properties: configuration, prompts, validation rules, processing models, anything you want a downstream script to consume.

When a script step runs, the knowledge engine has already evaluated the project-scoped sets and materialised the resulting items into the document family. Your script *reads* those items via the `knowledge` global -- it does not re-implement the matching logic.

The mental model is **declare configuration as knowledge, dispatch in the script**:

```
[upstream]                 [knowledge engine]              [your script]
   ─────                       ────────                       ──────
attach features  ──►  match against knowledge sets  ──►  read items, dispatch
```

### Why we built it this way

Without `knowledge`, a script that needs vendor- or document-type-specific behaviour has to do one of:

* Hard-code lookup tables inside the script body.
* Round-trip to a service bridge for every document.
* Stash configuration in environment variables or module options.
* Re-implement set-matching logic by reading raw features and writing JS conditionals.

All of these put both *logic* and *data* in the script -- making it harder to update one without redeploying the other. The `knowledge` global is designed so you can keep configuration in **org metadata** (versioned, reviewable, separately deployed) and keep the script thin enough that it rarely changes.

### What knowledge is and isn't

| Knowledge is good for                     | Knowledge is not for                                                                   |
| ----------------------------------------- | -------------------------------------------------------------------------------------- |
| Per-vendor / per-tenant configuration     | Per-document runtime state (use script-local variables)                                |
| Lookup tables that change without code    | Module-wide config (use module options)                                                |
| Decoupling org-level policy from code     | Cross-step communication (use the `{features: [...]}` return contract or doc metadata) |
| Audit trails of "what rules applied here" | High-frequency mutable state                                                           |

If the answer to "where does this value come from?" is "an org admin configures it in the UI", knowledge is probably the right home. If the answer is "this script computes it for this document", keep it in the script.

## API surface

The global has four explicit accessors and four bare-form conveniences.

### Bare accessors (single-family scripts)

When exactly one document family is in scope, scripts can use the bare forms. These are the 80% path for activity-plan script steps that operate on one family at a time.

| Accessor                             | Returns                                                                  |
| ------------------------------------ | ------------------------------------------------------------------------ |
| `knowledge.features`                 | `KnowledgeFeature[]` -- all features attached to the in-scope family     |
| `knowledge.items`                    | `KnowledgeItem[]` -- all knowledge items resolved on the in-scope family |
| `knowledge.featuresByType(typeSlug)` | `KnowledgeFeature[]` filtered by `featureType.slug === typeSlug`         |
| `knowledge.itemsByType(typeSlug)`    | `KnowledgeItem[]` filtered by `itemType.slug === typeSlug`               |

### Explicit accessors (always available)

| Accessor                                          | Returns                                   |
| ------------------------------------------------- | ----------------------------------------- |
| `knowledge.getFeatures(familyId)`                 | `KnowledgeFeature[]` for the given family |
| `knowledge.getItems(familyId)`                    | `KnowledgeItem[]` for the given family    |
| `knowledge.getFeaturesByType(familyId, typeSlug)` | filtered features                         |
| `knowledge.getItemsByType(familyId, typeSlug)`    | filtered items                            |

The explicit forms are required when the script's `families` slice has anything other than exactly one entry. The `familyId` you pass MUST be in the script's `families` slice (see [Family-access scoping](#family-access-scoping)).
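As a rough sketch of how the explicit forms are used in practice, the helper below collects items of one type for every in-scope family. The name `itemsByFamily` is hypothetical, and `mockFamilies` / `mockKnowledge` are stand-in mocks so the snippet runs outside a script step; inside a real step the `families` and `knowledge` globals are provided by the runtime.

```javascript
// Stand-in mocks (assumptions for illustration only). In a script step,
// `families` and `knowledge` are supplied by the runtime.
var mockFamilies = [{ id: "fam-1" }, { id: "fam-2" }];
var mockStore = {
  "fam-1": [{ slug: "acme", itemType: { slug: "processing-model" } }],
  "fam-2": [{ slug: "note", itemType: { slug: "instruction" } }]
};
var mockKnowledge = {
  getItemsByType: function (familyId, typeSlug) {
    return (mockStore[familyId] || []).filter(function (it) {
      return it.itemType !== null && it.itemType.slug === typeSlug;
    });
  }
};

// Hypothetical helper: collect items of one type per family using only the
// explicit accessor, which works for any number of in-scope families.
function itemsByFamily(kb, fams, typeSlug) {
  var out = {};
  for (var i = 0; i < fams.length; i++) {
    out[fams[i].id] = kb.getItemsByType(fams[i].id, typeSlug);
  }
  return out;
}

var byFamily = itemsByFamily(mockKnowledge, mockFamilies, "processing-model");
```

Because the explicit accessors are always available, a helper written this way should not need to change if a single-family step later becomes multi-family.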

## Object shapes

Annotations below use TypeScript-style notation. Each returned array element is a frozen plain JS object with exactly these properties.

### KnowledgeFeature

```typescript theme={null}
{
  id: string;                  // primary key of kdxa_knowledge_features
  uuid: string;                // stable UUID
  slug: string;                // computed from feature type + properties
  active: boolean;             // is_active column; not filtered server-side
  properties: object | null;   // free-form map; shape depends on featureType
  extendedProperties: object | null;  // free-form map; shape depends on featureType
  featureType: KnowledgeFeatureType | null;  // null if type was deleted
}
```

### KnowledgeFeatureType (sub-record on each feature)

```typescript theme={null}
{
  slug: string;                // type identifier (lowercase by convention)
  name: string;                // display name
  description: string | null;  // long-form description
  icon: string | null;         // icon ref (e.g. "tag", "barcode")
  color: string | null;        // hex or theme colour ref
  options: KnowledgeOption[] | null;          // schema for `properties`
  extendedOptions: KnowledgeOption[] | null;  // schema for `extendedProperties`
  labelJsonPath: string | null;  // JSONPath / JSONata expression for human label
  useJSONata: boolean;           // true => labelJsonPath is JSONata, not JSONPath
}
```

### KnowledgeItem

```typescript theme={null}
{
  id: string;                  // KDDB-side identifier
  uuid: string;                // stable UUID
  slug: string;                // type-scoped slug
  title: string;               // display title
  description: string | null;  // long-form description
  active: boolean;             // is_active flag from the resolved set
  sequenceOrder: number;       // ordering hint within the type
  properties: object | null;   // free-form map; shape depends on itemType
  knowledgeSetId: string | null;  // owning knowledge set
  itemType: KnowledgeItemType | null;  // null if type was deleted
}
```

### KnowledgeItemType (sub-record on each item)

```typescript theme={null}
{
  slug: string;                // type identifier (lowercase by convention)
  name: string;                // display name
  description: string | null;  // long-form description
  options: KnowledgeOption[] | null;  // schema for `properties`
  supportsAttachment: boolean; // whether this type accepts file attachments
}
```

### KnowledgeOption (schema entries inside `options` / `extendedOptions`)

```typescript theme={null}
{
  name: string;                // property key
  type: string;                // "string", "boolean", "selection", ...
  label: string;               // display label
  description: string;         // help text
  required: boolean;
  default: any;
  possibleValues: any[];       // present for "selection" type
}
```

<Warning>
  **`properties` is free-form.** It is whatever shape the type author defined in `options`. If a type defines `instructionMarkdown`, `domain`, `pipeline`, etc., those keys live on `properties` -- they are NOT separate top-level fields on the feature or item. To find what keys a type uses, inspect `featureType.options` (or `itemType.options`) and look at each option's `name`.

  ```javascript theme={null}
  // Right
  const route = knowledge.itemsByType("processing-model")[0];
  const pipeline = route.properties.pipeline;        // correct

  // Wrong -- evaluates to undefined
  const missing = route.pipeline;                    // there is no top-level pipeline field
  ```
</Warning>
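One way to discover which keys a type declares is to walk its `options` array. The following is a minimal sketch under stated assumptions: `declaredKeys` and `unsetKeys` are hypothetical helper names, and the `item` object is a hand-written mock that follows the shapes documented above rather than real data.

```javascript
// Mock item shaped like the KnowledgeItem documented above (values are
// made up for illustration).
var item = {
  slug: "acme-route",
  properties: { pipeline: "template", domain: "utilities" },
  itemType: {
    slug: "processing-model",
    options: [
      { name: "pipeline", type: "selection", required: true },
      { name: "domain", type: "string", required: false },
      { name: "hasLineItemsTemplate", type: "boolean", required: false }
    ]
  }
};

// Hypothetical helper: names of the keys a type declares for `properties`.
// Tolerates a null type or null options, both of which the shapes allow.
function declaredKeys(type) {
  if (!type || !type.options) { return []; }
  return type.options.map(function (opt) { return opt.name; });
}

// Hypothetical helper: declared keys that are absent from `properties`.
function unsetKeys(record, type) {
  var props = record.properties || {};
  return declaredKeys(type).filter(function (key) {
    return !Object.prototype.hasOwnProperty.call(props, key);
  });
}

var keys = declaredKeys(item.itemType);
var unset = unsetKeys(item, item.itemType);
```

In a script step the same helpers could be pointed at `feature.featureType` or `item.itemType` on a record returned by `knowledge`, since they only read from the record.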

## Single-family rule

The bare accessors (`knowledge.features`, `knowledge.items`, `knowledge.featuresByType`, `knowledge.itemsByType`) auto-bind to `families[0]` when **exactly one** family is in scope. Anything else throws:

```
knowledge.<accessor> requires exactly one document family in scope (found N). Use knowledge.<explicit-accessor>(familyId) explicitly for multi-family scripts.
```

For multi-family script steps, you MUST iterate over `families` and use the explicit forms:

```javascript theme={null}
for (var i = 0; i < families.length; i++) {
  var fam = families[i];
  var feats = knowledge.getFeatures(fam.id);
  var route = knowledge.getItemsByType(fam.id, "processing-model")[0];
  // ...
}
```

## Family-access scoping

Even with the explicit accessors, the `familyId` you pass must be present in the script's `families` slice (the families attached to the script's task). Calling with a foreign family throws:

```
knowledge: familyId "<id>" is not in scope for this script
```
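If a family ID reaches your script by some route other than iterating `families`, checking it against the slice first avoids the throw. This is a small sketch, not part of the binding: `inScope` and `scopedIds` are hypothetical helpers, and `scopedFamilies` stands in for the runtime-provided `families` slice.

```javascript
// Stand-in for the runtime-provided `families` slice (assumption).
var scopedFamilies = [{ id: "fam-1" }, { id: "fam-2" }];

// Hypothetical helper: true when familyId belongs to the in-scope slice.
function inScope(fams, familyId) {
  return fams.some(function (f) { return f.id === familyId; });
}

// Hypothetical helper: keep only candidate IDs that are safe to pass to
// the explicit accessors.
function scopedIds(fams, candidateIds) {
  return candidateIds.filter(function (id) { return inScope(fams, id); });
}

var safeIds = scopedIds(scopedFamilies, ["fam-2", "fam-9"]);
```

The binding still enforces the check itself; a pre-check like this only lets the script decide how to handle an out-of-scope ID instead of failing.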

This is the primary tenant-isolation defence: the underlying KDDB loader and feature query are keyed by `document_family_id` only, so this check prevents a script from reading another tenant's data by guessing UUIDs.

## Write protection

Every object returned by `knowledge.*` is `Object.freeze`-d, recursively. Nested arrays and sub-records (e.g. `featureType`, `itemType`, `properties`) are also frozen.

| JS mode     | Result of a mutation attempt |
| ----------- | ---------------------------- |
| Strict mode | throws `TypeError`           |
| Sloppy mode | silently no-ops              |

In both cases `Object.isFrozen(obj)` returns `true`. The contract is: **knowledge is read-only from scripts**. If you need to compute a derived value, build a new object instead of mutating the returned one:

```javascript theme={null}
var feats = knowledge.features;
// var enriched = feats[0];                  // frozen, can't assign new keys
// enriched.computed = "x";                  // throws in strict mode

var copy = Object.assign({}, feats[0], { computed: "x" });  // OK -- new object
```
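Because the freeze is recursive, a one-level copy still shares frozen nested objects such as `properties`. The sketch below illustrates this with a locally built frozen object (`frozenFeature` is a mock with made-up values, and `tryNestedWrite` is a hypothetical helper); it relies only on standard `Object.freeze`, `Object.isFrozen`, and `Object.assign` behaviour.

```javascript
// Local stand-in for a recursively frozen feature (made-up values); the
// real objects come from `knowledge.features`.
var frozenFeature = Object.freeze({
  uuid: "feat-1",
  properties: Object.freeze({ vendorId: "VEN-001" })
});

// Attempt a nested write under strict mode and report what happened.
function tryNestedWrite(feature) {
  "use strict";
  try {
    feature.properties.vendorId = "changed";
    return "written";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other";
  }
}

// Object.assign copies one level only, so the copy's `properties` is still
// the same frozen object. Copy each nested level you intend to change.
var shallowCopy = Object.assign({}, frozenFeature);
var deepCopy = Object.assign({}, frozenFeature, {
  properties: Object.assign({}, frozenFeature.properties, { computed: "x" })
});

var writeResult = tryNestedWrite(frozenFeature);
var shallowStillFrozen = Object.isFrozen(shallowCopy.properties);
```

Copying only the levels you modify keeps the original untouched while leaving the rest of the structure shared.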

## Performance characteristics

Per-script-execution caches keyed by family ID make repeated calls cheap.

| Resource                         | First-call cost                                                                                                    | Subsequent calls (same family) |
| -------------------------------- | ------------------------------------------------------------------------------------------------------------------ | ------------------------------ |
| Features (bare or explicit)      | 1 SQL query joining `kdxa_document_family_features` -> `kdxa_knowledge_features` -> `kdxa_knowledge_feature_types` | map hit, no I/O                |
| Items (bare or explicit)         | 1 KDDB document load + 1 read of `getKnowledge()`                                                                  | map hit, no I/O                |
| `featuresByType` / `itemsByType` | filters the cached full list -- no extra queries                                                                   | map hit, no I/O                |

Calling `knowledge.featuresByType("vendor")` ten times in the same script runs **one** query.
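To make the caching behaviour concrete, here is an illustrative model of a per-execution cache keyed by family ID. It is not the binding's actual implementation: `fetchFeatures`, `getFeaturesCached`, and `getFeaturesByTypeCached` are hypothetical names, and the "query" is simulated by a counter.

```javascript
// Counts how many times the simulated backing query runs.
var queryCount = 0;

// Hypothetical stand-in for the one-off feature query.
function fetchFeatures(familyId) {
  queryCount += 1;
  return [
    { uuid: familyId + "-f1", featureType: { slug: "vendor" } },
    { uuid: familyId + "-f2", featureType: { slug: "language" } }
  ];
}

// Per-execution cache keyed by family ID, modelling the documented behaviour.
var featureCache = {};
function getFeaturesCached(familyId) {
  if (!Object.prototype.hasOwnProperty.call(featureCache, familyId)) {
    featureCache[familyId] = fetchFeatures(familyId);
  }
  return featureCache[familyId];
}

// Type filtering works on the cached list, so it never triggers a query.
function getFeaturesByTypeCached(familyId, typeSlug) {
  return getFeaturesCached(familyId).filter(function (f) {
    return f.featureType !== null && f.featureType.slug === typeSlug;
  });
}

// Ten filtered reads for the same family hit the backing query once.
for (var n = 0; n < 10; n++) {
  getFeaturesByTypeCached("fam-1", "vendor");
}
```

In this model a second family would cost one further query, since the cache key is the family ID.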

<Note>
  The KDDB load behind `knowledge.items` is a separate loader from `loadDocument(familyId)`. If a script calls both, the KDDB is loaded twice (once for each cache). This is acceptable today but may be consolidated in a future revision.
</Note>

## Item ordering

`knowledge.items` and `knowledge.itemsByType` always return items sorted by:

1. `sequenceOrder` ASC
2. ties broken by `slug` ASC

This is a documented contract -- scripts that index by position (e.g. `items[0]`) may rely on it. Without this, scripts would be at the mercy of whatever order the KDDB happens to return.

Features have no documented ordering -- they come back in whatever order the SQL join returns them.
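The item ordering rule can be expressed as a comparator, which may be handy when building test fixtures or sorting a merged list yourself. This is a sketch under one assumption: slugs are compared with plain JavaScript string comparison, which the source does not specify. The name `byDocumentedOrder` and the sample items are hypothetical.

```javascript
// Comparator matching the documented order: sequenceOrder ascending, then
// slug ascending. Plain code-unit string comparison is an assumption here.
function byDocumentedOrder(a, b) {
  if (a.sequenceOrder !== b.sequenceOrder) {
    return a.sequenceOrder - b.sequenceOrder;
  }
  if (a.slug < b.slug) { return -1; }
  if (a.slug > b.slug) { return 1; }
  return 0;
}

// Made-up items, deliberately out of order.
var unsorted = [
  { slug: "zeta", sequenceOrder: 1 },
  { slug: "alpha", sequenceOrder: 2 },
  { slug: "beta", sequenceOrder: 1 }
];

// slice() first so the input array is left untouched.
var sortedSlugs = unsorted.slice().sort(byDocumentedOrder).map(function (it) {
  return it.slug;
});
```

Sorting a copy matters for arrays returned by `knowledge`, because they are frozen and an in-place `sort()` would fail.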

## Active-features policy

The bindings return **all** features and items regardless of the `active` flag, with the boolean exposed on each instance. If you want only-active records, filter in JS:

```javascript theme={null}
var activeOnly = knowledge.features.filter(function (f) { return f.active; });
var activeRoutes = knowledge.itemsByType("processing-model").filter(function (it) { return it.active; });
```

The rationale: scripts that audit, log, or report on inactive records need to see them too. Filtering server-side would silently hide those records from exactly the scripts that need them.

## Empty-result handling

Filtered accessors return `[]` when nothing matches. Indexing `[0]` on an empty array yields `undefined`, and any property access after that throws. Always guard:

```javascript theme={null}
var route = knowledge.itemsByType("processing-model")[0];
if (!route) {
  throw new Error("no processing-model resolved -- vendor missing or unmapped");
}
// safe to use route.properties from here on
```

## Null `featureType` / `itemType`

In pathological cases (e.g. the type was deleted through direct DB access, bypassing the usual immutability guarantees) the enrichment lookup returns no row. The binding surfaces this as `featureType: null` (or `itemType: null`) on the affected instance rather than throwing. Null-check before accessing nested fields:

```javascript theme={null}
var feats = knowledge.features;
for (var i = 0; i < feats.length; i++) {
  var slug = feats[i].featureType ? feats[i].featureType.slug : "(orphaned)";
  log.info("feature", feats[i].uuid, "type:", slug);
}
```

`itemType` may also be null when a knowledge item references a type from a different organisation than the script's context, since the type lookup is org-scoped.

## Type-slug case sensitivity

`featuresByType` / `itemsByType` perform an **exact** string match against the type's `slug`. Slugs are lowercase by convention -- `featuresByType("Vendor")` will not match a type whose slug is `"vendor"`.
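If a type slug comes from user input or configuration, normalising it before the lookup sidesteps the mismatch. The sketch below assumes your type slugs follow the lowercase convention; `normaliseSlug` is a hypothetical helper, and `featuresByTypeMock` is a stand-in that applies the same exact-match rule to made-up data.

```javascript
// Hypothetical helper: normalise a caller-supplied slug before lookup.
// Assumes your type slugs follow the lowercase convention.
function normaliseSlug(input) {
  return String(input).trim().toLowerCase();
}

// Stand-in for a filtered accessor, using the same exact-match rule.
var sampleFeatures = [{ featureType: { slug: "vendor" } }];
function featuresByTypeMock(typeSlug) {
  return sampleFeatures.filter(function (f) {
    return f.featureType !== null && f.featureType.slug === typeSlug;
  });
}

var missed = featuresByTypeMock("Vendor");
var matched = featuresByTypeMock(normaliseSlug(" Vendor "));
```

Normalising would be the wrong choice if any of your types deliberately use mixed-case slugs, since lowercase is a convention rather than a guarantee.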

## Kill switch

There is no runtime feature flag for the `knowledge` global. Emergency disable requires commenting out the registration in `kodexa-orchestrator/internal/service/planner_script_adapters.go` (the `kb.Register(vm)` call near the bottom of `plannerContextBindings.Register`) and redeploying.

## What's NOT here (deferred to v2)

The following are **not** available in v1:

* **Resolution metadata** (`knowledge.matches`, `ResolutionMatch` objects). The engine does not currently persist per-set resolution decisions; surfacing them requires engine changes.
* **Per-item source / clauses.** Which knowledge clause produced an item is not exposed.
* **Attachment-fetch accessors.** `itemType.supportsAttachment` is exposed, but there is no accessor to fetch the underlying attachment (presigned URL, content, etc.).
* **Mid-script re-resolution.** The bindings read what is already attached to the family; they do not re-run the knowledge engine. If you need fresh resolution, run a separate engine invocation upstream of the script.
* **Feature provenance read fields** (`feature.attachedAt`, `feature.attachedBy`). The schema and write-side population land in v1's parallel track so the data accumulates; the read API ships in v2.

## Worked example: vendor routing

A common case-analysis pattern: a single document family carries a `vendor` feature emitted upstream, the knowledge engine resolves the family's `processing-model` knowledge set, and a downstream script step dispatches based on the resolved item's properties.

```javascript theme={null}
// Step 1 (upstream) emits the vendor feature from the family's metadata.
// Step 2 (this script step) reads the resolved processing-model item and
// returns a dispatch action.

const route = knowledge.itemsByType("processing-model")[0];
if (!route) {
  throw new Error("no processing-model resolved -- vendor missing or unmapped");
}

// route.properties is shaped by the processing-model item type's options:
//   { domain: "utilities", pipeline: "template", hasLineItemsTemplate: false }
switch (route.properties.pipeline) {
  case "template": return { action: "template_path" };
  case "llm":      return { action: "llm_path" };
  default:         return { action: "unknown" };
}
```

The same pattern applied to multi-family steps:

```javascript theme={null}
var dispatches = [];
for (var i = 0; i < families.length; i++) {
  var fam = families[i];
  var route = knowledge.getItemsByType(fam.id, "processing-model")[0];
  if (!route) {
    log.warn("no processing-model for family", fam.id);
    continue;
  }
  dispatches.push({ familyId: fam.id, pipeline: route.properties.pipeline });
}
return { dispatches: dispatches };
```

## Common patterns

### Pattern 1: Two-step feature emission then read

The canonical use of `knowledge` is a two-step activity-plan flow: an upstream script step emits a feature using the planner's `{features: [...]}` return contract; the orchestrator runs `AssessAndEnrich` between steps; a downstream script reads the resolved items via `knowledge.itemsByType(...)`.

```javascript theme={null}
// Step 1: emit_features (script step, dependsOn: [])
// Read upstream metadata and emit a feature so the knowledge engine
// can resolve which items apply to this document.
var doc = loadDocument(families[0].id);
var vendorId = (doc.metadata || {})["CustomFields.PrimaryVendorId"];
if (!vendorId) {
  throw new Error("PrimaryVendorId missing -- intake metadata is incomplete");
}
return {
  action: "ok",
  features: [{
    documentFamilyId: families[0].id,
    featureTypeSlug: "vendor",
    properties: { vendorId: String(vendorId) }
  }]
};
```

```javascript theme={null}
// Step 2: classify_path (script step, dependsOn: [emit_features, ocr])
// AssessAndEnrich has run between the two steps. Read the resolved item.
var route = knowledge.itemsByType("processing-model")[0];
if (!route) {
  // Useful fallback: keep the legacy path active until coverage is complete.
  log.warn("No processing-model resolved; falling back to template_single");
  return { action: "template_single" };
}
switch (route.properties.pipeline) {
  case "template": return { action: "template_path" };
  case "llm":      return { action: "llm_path" };
  default:         return { action: "unknown" };
}
```

The split keeps each step focused: step 1 sources data from upstream, step 2 dispatches based on org-level configuration.

### Pattern 2: Per-vendor configuration without a lookup table

Older code commonly contains hard-coded tables like:

```javascript theme={null}
// Before: vendor-specific knowledge baked into the script
var TEMPLATE_VENDORS = ["VEN-001", "VEN-005", "VEN-018"];
var LINE_ITEM_VENDORS = ["VEN-002", "VEN-018"];
if (TEMPLATE_VENDORS.indexOf(vendorId) >= 0) { /* ... */ }
```

Knowledge replaces this with a declarative item per vendor:

```javascript theme={null}
// After: read the vendor's processing-model item; let knowledge sets resolve it
var route = knowledge.itemsByType("processing-model")[0];
if (route && route.properties.pipeline === "template") {
  // ...
  if (route.properties.hasLineItemsTemplate) { /* ... */ }
}
```

Now adding a new vendor is a knowledge-set edit (UI or YAML), not a code change.

### Pattern 3: Conditional behaviour gated on feature presence

Sometimes you don't need a knowledge item at all -- just the existence of a feature. Use `featuresByType`:

```javascript theme={null}
// Skip OCR if the document has already been classified as machine-readable upstream.
var classifications = knowledge.featuresByType("document-classification");
var alreadyOCRed = classifications.some(function (f) {
  // properties may be null, so guard before reading a key
  return f.properties !== null && f.properties.layer === "text-extracted";
});
if (alreadyOCRed) {
  return { action: "skip_ocr" };
}
```

### Pattern 4: Multi-vendor / batch script steps

When the step's `families` slice has more than one entry, the bare accessors throw. Loop and use the explicit forms:

```javascript theme={null}
var dispatches = [];
for (var i = 0; i < families.length; i++) {
  var fam = families[i];
  var route = knowledge.getItemsByType(fam.id, "processing-model")[0];
  if (!route) {
    log.warn("no processing-model for family", fam.id);
    continue;
  }
  dispatches.push({ familyId: fam.id, pipeline: route.properties.pipeline });
}
return { action: "dispatched", count: dispatches.length, items: dispatches };
```

### Pattern 5: Audit / inspection

A read-only inspection script can summarise what the engine has done:

```javascript theme={null}
var feats = knowledge.features;
var items = knowledge.items;
log.info("knowledge state",
  "features", feats.length,
  "items", items.length,
  "byType", items.reduce(function (acc, it) {
    var t = it.itemType ? it.itemType.slug : "(orphaned)";
    acc[t] = (acc[t] || 0) + 1;
    return acc;
  }, {})
);
return { action: "audited" };
```

This is useful in test pipelines, before-merge checks, and triage activity-plans where you want visibility into what was resolved.

## Choosing knowledge vs. alternatives

A short decision guide for "where should this configuration live?":

| Source of the value                             | Recommended home                                         | Notes                                                                                |
| ----------------------------------------------- | -------------------------------------------------------- | ------------------------------------------------------------------------------------ |
| Org admin sets it once per vendor               | **Knowledge item** (this binding)                        | Versioned, reviewable, no redeploy to update                                         |
| Same for every project of a given type          | **Project template** option                              | Set when a project is created from the template                                      |
| Same for every document a module processes      | **Module option**                                        | Visible at module-config time                                                        |
| Computed at runtime from this specific document | **Script-local variable**                                | Don't shoehorn into knowledge                                                        |
| Cross-step within one activity run              | **`{features: [...]}` return contract**, or doc metadata | Knowledge is for stable config, not run-state                                        |
| Upstream system pushes it on every doc          | **Document metadata / intake**                           | Surface as a knowledge feature only if downstream knowledge sets need to match on it |

The recurring test: *if the org admin needs to change this value tomorrow, do they have to redeploy code, redeploy the project template, or just edit a knowledge item?* The last is the cheapest -- favour it when the value is genuinely org-scoped.

## See also

* [Scripting Reference](/guides/scripting/index) -- core globals, document/data-object/attribute APIs, and execution-context details.
* [Service Bridges](/guides/scripting/service-bridges) -- calling external systems from scripts.
