# Knowledge and Agents

> How Kodexa agents build and consume knowledge with human-in-the-loop approval, capturing institutional knowledge as agents work with documents.

<Note>
  This feature is coming soon. This documentation describes the planned functionality for how Kodexa agents will interact with the Knowledge System.
</Note>

Agents in Kodexa can both **build** and **consume** knowledge, creating a feedback loop where the system learns from documents and proposes configurations for human approval.

## The Vision

```mermaid theme={null}
flowchart LR
    subgraph learn["Agents Learn"]
        DOC[Documents] --> AGENT[Agent]
        AGENT --> PROPOSE1[Propose Features]
        AGENT --> PROPOSE2[Propose Sets]
    end

    subgraph review["Humans Review"]
        PROPOSE1 --> REVIEW[Pending Review]
        PROPOSE2 --> REVIEW
        REVIEW --> HUMAN[Human Approves]
    end

    subgraph apply["System Applies"]
        HUMAN --> ACTIVE[Active Knowledge]
        ACTIVE --> PROC[Processing]
        PROC --> DOC
    end

    style learn fill:#e0f2fe
    style review fill:#fef3c7
    style apply fill:#d1fae5
```

Agents do the heavy lifting of discovering patterns and proposing configurations. Humans stay in control by reviewing and approving what goes into production.

## Building Knowledge: Feature Discovery

When an agent processes documents, it may discover new entities that should be tracked as Knowledge Features.

### The Flow

```mermaid theme={null}
flowchart TD
    DOC[New Invoice Arrives] --> AGENT[Agent Processes]
    AGENT --> CHECK{Known Vendor?}

    CHECK -->|Yes| LINK[Link to existing Feature]
    CHECK -->|No| PROPOSE[Propose new Feature]

    PROPOSE --> PENDING[Feature: Pending Review]
    PENDING --> HUMAN[Human Reviews]

    HUMAN -->|Approve| CREATE[Feature Created]
    HUMAN -->|Reject| DISCARD[Discarded]
    HUMAN -->|Modify| EDIT[Edit & Approve]

    CREATE --> LINK
    LINK --> DONE[Document Linked to Feature]
```

### Example: New Vendor Discovery

1. **Agent processes invoice** from "NewTech Solutions Inc."
2. **Agent checks** whether a vendor feature with a matching ID already exists
3. **No match found**, so the agent proposes a new feature:

```yaml theme={null}
# Agent-proposed feature (status: pending_review)
featureType: vendor
status: pending_review
proposedBy: invoice-processing-agent
confidence: 0.92

properties:
  vendorId: "NTS-2024-001"  # Extracted from invoice

extendedProperties:
  displayName: "NewTech Solutions Inc."
  address: "123 Innovation Way, Austin, TX"
  extractedFrom: "invoice-2024-03-15-001.pdf"
```

4. **Human reviews** the proposed feature in the Knowledge interface
5. **Human approves** (or modifies and approves)
6. **Feature becomes active** and available for linking
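The link-or-propose decision in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Kodexa API: the `Feature` class, the `link_or_propose` function, and the `ACME-001` vendor are hypothetical, and the field names simply mirror the YAML example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    """Hypothetical in-memory stand-in for a Knowledge Feature."""
    feature_type: str
    properties: dict
    extended_properties: dict = field(default_factory=dict)
    status: str = "active"
    proposed_by: Optional[str] = None
    confidence: Optional[float] = None


def link_or_propose(active, vendor_id, extracted, agent, confidence):
    """Return the matching active vendor feature, or a pending proposal."""
    for feature in active:
        if (feature.feature_type == "vendor"
                and feature.status == "active"
                and feature.properties.get("vendorId") == vendor_id):
            return feature  # known vendor: link to the existing feature
    # Unknown vendor: propose a feature, but leave it pending human review.
    return Feature(
        feature_type="vendor",
        properties={"vendorId": vendor_id},
        extended_properties=dict(extracted),
        status="pending_review",
        proposed_by=agent,
        confidence=confidence,
    )


# "ACME-001" is a made-up vendor used only to populate the known set.
known = [Feature("vendor", {"vendorId": "ACME-001"})]
proposal = link_or_propose(
    known,
    "NTS-2024-001",
    {"displayName": "NewTech Solutions Inc."},
    agent="invoice-processing-agent",
    confidence=0.92,
)
print(proposal.status)
```

The proposal is created with `status="pending_review"`, so nothing in this sketch can activate a feature without a separate review step.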

### What Agents Extract

Agents can propose features based on:

* **Explicit data**: Vendor names, customer IDs, document types
* **Inferred classifications**: Language, document category, urgency
* **Patterns**: Recurring entities across multiple documents

## Consuming Knowledge: Intelligent Processing

Agents have access to all active knowledge and use it to make processing decisions.

### The Flow

```mermaid theme={null}
flowchart TD
    DOC[Document Arrives] --> AGENT[Agent Analyzes]

    AGENT --> FEATURES[Identify Features]
    FEATURES --> QUERY[Query Knowledge Sets]

    QUERY --> ITEMS[Get Applicable Items]
    ITEMS --> APPLY[Apply to Processing]

    APPLY --> RESULT[Processing Result]
```

### Example: Applying Extraction Rules

1. **Document arrives** and is classified as an SEC 10-K filing
2. **Agent queries Knowledge Sets** matching "SEC Filing Type = 10K"
3. **Knowledge Set returns Items**:
   * Use the 10-K-specific extraction prompt
   * Apply annual report validation rules
   * Route to SEC compliance queue
4. **Agent applies these configurations** during processing
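One way to picture the query step is as a filter over active Knowledge Sets whose conditions all match the document's features. The sketch below assumes a simple exact-match rule; the `KnowledgeSet` class and `applicable_items` function are hypothetical, and the real matching semantics may differ.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeSet:
    """Hypothetical stand-in: conditions map a feature type to a required value."""
    name: str
    conditions: dict
    items: list
    status: str = "active"


def applicable_items(document_features, knowledge_sets):
    """Collect items from every active set whose conditions all match."""
    items = []
    for ks in knowledge_sets:
        if ks.status != "active":
            continue  # pending or rejected sets never affect processing
        if all(document_features.get(k) == v for k, v in ks.conditions.items()):
            items.extend(ks.items)
    return items


sets = [
    KnowledgeSet(
        name="SEC 10-K handling",
        conditions={"SEC Filing Type": "10K"},
        items=[
            "Use the 10-K-specific extraction prompt",
            "Apply annual report validation rules",
            "Route to SEC compliance queue",
        ],
    ),
    KnowledgeSet(
        name="Unreviewed proposal",  # illustrative only
        conditions={"SEC Filing Type": "10K"},
        items=["Skip validation"],
        status="pending_review",
    ),
]

for item in applicable_items({"SEC Filing Type": "10K"}, sets):
    print(item)
```

Only the first set contributes items here, because the second is still pending review.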

## Proposing Knowledge Sets

Agents can also propose entire Knowledge Sets: when an agent observes a recurring pattern, it pairs the conditions it detected with the actions it recommends and submits the set for review.

### The Flow

```mermaid theme={null}
flowchart TD
    AGENT[Agent Observes] --> PATTERN[Detects Pattern]
    PATTERN --> ANALYZE[Analyzes Correlation]

    ANALYZE --> PROPOSE[Proposes Knowledge Set]
    PROPOSE --> PENDING[Set: Pending Review]

    PENDING --> HUMAN[Human Reviews]
    HUMAN -->|Approve| ACTIVE[Set Activated]
    HUMAN -->|Reject| LEARN[Agent Learns]
    HUMAN -->|Modify| EDIT[Edit & Activate]
```

### Example: Pattern Discovery

**Scenario**: An agent notices that invoices from Vendor X frequently raise validation exceptions for a missing tax ID, and that users almost always override the exception with the same justification.

**Agent proposes**:

```yaml theme={null}
# Agent-proposed Knowledge Set
name: Vendor X Tax ID Exception
description: |
  Vendor X (Government Agency) is tax-exempt.
  Skip tax ID validation for their invoices.

status: pending_review
proposedBy: validation-analysis-agent
confidence: 0.88

evidence:
  - 47 invoices from Vendor X in past 90 days
  - 45 had tax ID validation overridden
  - Override reason consistently: "Government agency - tax exempt"

# Conditions
features:
  - featureTypeSlug: vendor
    properties:
      vendorId: "VENDOR-X-001"

# Actions
items:
  - itemType: validation-rule
    properties:
      ruleType: skip-field
      targetField: "vendor/tax_id"
      reason: "Government agency - tax exempt"
```

**Human reviews**:

* Sees the evidence (47 invoices, consistent overrides)
* Verifies the business logic makes sense
* Approves the Knowledge Set
* Once approved, the system automatically skips tax ID validation for Vendor X invoices
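The evidence in the proposal is a summary of override history. As a rough sketch, an agent could derive it as follows; the record shape (`override_reason`) and the `summarize_overrides` function are assumptions for illustration, and the sketch does not attempt to reproduce the `confidence` value shown in the YAML.

```python
from collections import Counter


def summarize_overrides(invoices):
    """Summarize how often a validation failure was overridden, and why."""
    total = len(invoices)
    reasons = Counter(
        inv["override_reason"]
        for inv in invoices
        if inv["override_reason"] is not None
    )
    overridden = sum(reasons.values())
    top_reason, top_count = reasons.most_common(1)[0] if reasons else (None, 0)
    return {
        "total": total,
        "overridden": overridden,
        "override_rate": overridden / total if total else 0.0,
        "top_reason": top_reason,
        "top_reason_share": top_count / overridden if overridden else 0.0,
    }


# Synthetic records shaped like the scenario: 47 invoices, 45 overridden.
reason = "Government agency - tax exempt"
invoices = [{"override_reason": reason}] * 45 + [{"override_reason": None}] * 2

summary = summarize_overrides(invoices)
print(summary["overridden"], "of", summary["total"], "overridden")
print("most common reason:", summary["top_reason"])
```

Reporting the raw counts alongside the rate lets a reviewer judge whether the sample is large enough to justify a rule.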

## The Human-in-the-Loop Principle

All agent-created knowledge requires human approval before becoming active.
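The approval rule can be read as a small state transition: a proposal starts in `pending_review` and leaves that state only through a human decision. The following sketch assumes plain dictionaries and a hypothetical `review` function; `active` as a status value and the `approvedBy` field are assumptions, while `rejectedBy` and `rejectionReason` follow the rejection example later on this page.

```python
def review(proposal, decision, reviewer, changes=None, reason=None):
    """Apply a human decision to a pending proposal and return an updated copy."""
    if proposal["status"] != "pending_review":
        raise ValueError("only pending proposals can be reviewed")
    if decision not in {"approve", "reject", "modify"}:
        raise ValueError(f"unknown decision: {decision}")

    updated = dict(proposal)  # leave the original proposal untouched
    if decision == "reject":
        updated.update(status="rejected", rejectedBy=reviewer,
                       rejectionReason=reason)
        return updated
    if decision == "modify":
        updated.update(changes or {})  # reviewer edits, then approves
    updated.update(status="active", approvedBy=reviewer)
    return updated


pending = {"name": "Vendor X Tax ID Exception", "status": "pending_review"}
approved = review(pending, "approve", reviewer="john.smith")
print(approved["status"])
```

Because `review` refuses anything that is not pending, an already active or rejected proposal cannot be re-decided by accident in this sketch.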

### Why This Matters

| Aspect                | Agent Role                      | Human Role                |
| --------------------- | ------------------------------- | ------------------------- |
| **Pattern Detection** | Analyzes thousands of documents | Reviews proposed patterns |
| **Feature Creation**  | Extracts and proposes entities  | Validates accuracy        |
| **Rule Discovery**    | Identifies correlations         | Confirms business logic   |
| **Configuration**     | Proposes settings               | Approves for production   |

### Review Interface

The Knowledge interface shows:

* **Pending Features**: Agent-proposed entities awaiting approval
* **Pending Sets**: Agent-proposed rules awaiting approval
* **Evidence**: Why the agent made the proposal
* **Confidence Score**: Agent's certainty level
* **Impact Preview**: What would change if approved

## Feedback Loop

Agents learn from human decisions:

```mermaid theme={null}
flowchart LR
    PROPOSE[Agent Proposes] --> REVIEW[Human Reviews]
    REVIEW -->|Approve| LEARN1[Agent: Good Pattern]
    REVIEW -->|Reject| LEARN2[Agent: Avoid Similar]
    REVIEW -->|Modify| LEARN3[Agent: Refine Approach]

    LEARN1 --> IMPROVE[Improved Proposals]
    LEARN2 --> IMPROVE
    LEARN3 --> IMPROVE
```

Each review decision sends a different signal back to the agent:

* **Approve** - Agent learns this pattern is valuable
* **Reject** - Agent learns to avoid similar proposals
* **Modify** - Agent learns the correct approach
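How agents incorporate this feedback is not specified here. One simple signal that could be derived from review history is the share of each agent's proposals that reviewers accepted, either as proposed or after modification. The sketch below is illustrative only; `approval_rates` and the decision labels are hypothetical.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Share of each agent's proposals that were approved or modified-and-approved."""
    accepted = defaultdict(int)
    total = defaultdict(int)
    for agent, decision in decisions:
        total[agent] += 1
        if decision in {"approve", "modify"}:
            accepted[agent] += 1
    return {agent: accepted[agent] / total[agent] for agent in total}


# Made-up review history for the two agents named in the examples.
history = [
    ("invoice-processing-agent", "approve"),
    ("invoice-processing-agent", "modify"),
    ("validation-analysis-agent", "approve"),
    ("validation-analysis-agent", "reject"),
]
rates = approval_rates(history)
for agent, rate in rates.items():
    print(agent, rate)
```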

## Configuration

### Enabling Agent Knowledge Building

```yaml theme={null}
# Assistant configuration
assistantDefinitionRef: kodexa/document-processing-assistant
options:
  knowledge:
    featureDiscovery: true      # Propose new features
    setProposal: true           # Propose knowledge sets
    confidenceThreshold: 0.75   # Minimum confidence to propose
    requireEvidence: true       # Must include evidence
```
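Read together, these options act as a gate on what reaches the review queue. The sketch below shows one plausible interpretation, assuming that a proposal at exactly the threshold is allowed through and that a missing option disables the corresponding behavior; `should_propose` and the `kind` key are hypothetical.

```python
def should_propose(proposal, options):
    """Decide whether a candidate proposal is surfaced for human review."""
    knowledge = options.get("knowledge", {})
    # Map the proposal kind to the option that enables it.
    flag = {"feature": "featureDiscovery", "set": "setProposal"}[proposal["kind"]]
    if not knowledge.get(flag, False):
        return False
    if proposal["confidence"] < knowledge.get("confidenceThreshold", 0.0):
        return False
    if knowledge.get("requireEvidence", False) and not proposal.get("evidence"):
        return False
    return True


options = {
    "knowledge": {
        "featureDiscovery": True,
        "setProposal": True,
        "confidenceThreshold": 0.75,
        "requireEvidence": True,
    }
}

candidate = {
    "kind": "set",
    "confidence": 0.88,
    "evidence": ["47 invoices from Vendor X in past 90 days"],
}
print(should_propose(candidate, options))
```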

### Review Notifications

```yaml theme={null}
# Project configuration
notifications:
  knowledgePendingReview:
    enabled: true
    channels:
      - email
      - slack
    recipients:
      - knowledge-admins@company.com
```

## Best Practices

### 1. Start with a High Confidence Threshold

Begin with `confidenceThreshold: 0.9` and lower it gradually as you gain trust in the agent's proposals.

### 2. Review Regularly

Don't let pending items pile up. Regular review keeps the feedback loop active.

### 3. Document Rejections

When rejecting a proposal, record the reason so that the decision can be understood later:

```yaml theme={null}
# Rejection with feedback
status: rejected
rejectionReason: "This pattern only applies to Q4, not year-round"
rejectedBy: john.smith
rejectedAt: 2024-03-15T10:30:00Z
```

### 4. Use a Staging Environment

Test agent knowledge building in a staging environment before enabling it in production:

```yaml theme={null}
# Development/staging only
options:
  knowledge:
    featureDiscovery: true
    setProposal: true
    autoActivate: false  # Never auto-activate, always review
```

## Coming Soon

* **Batch Review**: Review multiple proposals at once
* **Approval Workflows**: Route proposals to specific reviewers
* **A/B Testing**: Test proposed rules on a subset of documents before full activation
* **Confidence Trends**: Track agent accuracy over time

## Related Documentation

* [Knowledge System Overview](/concepts/knowledge_system) - Foundation concepts
* [Knowledge Feature Types](/concepts/knowledge_feature_types) - Define metadata categories
* [Knowledge Item Types](/concepts/knowledge_item_types) - Define configurable behaviors