Execution Pipeline

When an event occurs in the Kodexa platform — such as a document being uploaded to a store, a channel message arriving, or a scheduled job firing — the platform creates an execution to process that event through a pipeline of modules. Executions are the mechanism by which assistants do their work.

What is an Execution?

An execution represents a single run of an assistant’s pipeline. It tracks the full lifecycle from the initial trigger event through each processing step to a final success or failure outcome. Every execution is associated with an assistant, scoped to an organization, and optionally linked to a specific document family. You can view executions for an assistant in the Kodexa UI, or query them via the API:

GET /api/executions?filter=assistantId: '{assistantId}'

How Executions are Created

Executions are created automatically when domain events match an assistant’s connections:

An event occurs — for example, new content is uploaded to a document store, or a scheduled job triggers.
The platform evaluates assistant connections — each active assistant has connections that define which events it listens to (e.g., a specific store, channel, or workspace).
Subscription filtering is applied: if the connection has a subscription expression, the event is evaluated against it. The expression can check properties such as file extension, document labels, or metadata values.
An execution is created in PENDING status with the assistant’s pipeline configuration and event context attached.
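The steps above can be sketched in a few lines of Python. This is an illustrative model only, not the platform's implementation: the `Connection` and `Assistant` classes, the `sourceId` event field, and the rule of one execution per assistant per event are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Connection:
    source_id: str  # ID of the store, channel, or workspace being listened to
    subscription: Optional[Callable[[dict], bool]] = None  # optional event filter


@dataclass
class Assistant:
    assistant_id: str
    active: bool
    pipeline: dict
    connections: List[Connection] = field(default_factory=list)


def create_executions(event: dict, assistants: List[Assistant]) -> List[dict]:
    """Return one PENDING execution per assistant whose connections match the event."""
    executions = []
    for assistant in assistants:
        if not assistant.active:
            continue  # only active assistants are evaluated
        for connection in assistant.connections:
            if connection.source_id != event["sourceId"]:
                continue  # this connection listens to a different source
            if connection.subscription and not connection.subscription(event):
                continue  # the subscription filter rejected the event
            executions.append({
                "assistantId": assistant.assistant_id,
                "status": "PENDING",
                "pipeline": assistant.pipeline,
                "context": {"eventType": event["eventType"]},
            })
            break  # assumption: at most one execution per assistant per event
    return executions


event = {"eventType": "CONTENT_CREATED", "sourceId": "store-1"}
assistants = [
    Assistant("assistant-a", True, {"steps": []},
              [Connection("store-1", lambda e: e["eventType"] == "CONTENT_CREATED")]),
    Assistant("assistant-b", False, {"steps": []}, [Connection("store-1")]),
]
created = create_executions(event, assistants)
```

Here only `assistant-a` produces an execution, because `assistant-b` is inactive.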

Subscription Expressions

Assistant connections can filter events using expressions. Simple subscriptions are comma-separated event types:

CONTENT_CREATED,DOCUMENT_FAMILY_CREATED

Rich expressions support conditions on the document family:

type == "content" and hasExtensions("pdf", "docx")

type == "content" and hasLabel("needs-processing") and !hasLabel("processed")

Available functions in subscription expressions include hasLabel(), hasMixins(), hasExtensions(), and matchesPath().
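To make the semantics of the two forms concrete, here is a minimal Python sketch. It is not the platform's expression evaluator: the shape of the `family` dictionary and the helper names `matches_simple`, `has_label`, and `has_extensions` are assumptions that only mirror the expressions shown above.

```python
from pathlib import PurePosixPath


def matches_simple(subscription: str, event_type: str) -> bool:
    """Simple form: a comma-separated list of event types."""
    wanted = {part.strip() for part in subscription.split(",") if part.strip()}
    return event_type in wanted


def has_label(family: dict, label: str) -> bool:
    """Stand-in for hasLabel(): true if the family carries the label."""
    return label in family.get("labels", [])


def has_extensions(family: dict, *extensions: str) -> bool:
    """Stand-in for hasExtensions(): case-insensitive match on the file suffix."""
    suffix = PurePosixPath(family.get("path", "")).suffix.lstrip(".").lower()
    return suffix in {ext.lower() for ext in extensions}


event = {"type": "content"}
family = {"path": "invoices/2024/acme.PDF", "labels": ["needs-processing"]}

# type == "content" and hasExtensions("pdf", "docx")
rule_extensions = event["type"] == "content" and has_extensions(family, "pdf", "docx")

# type == "content" and hasLabel("needs-processing") and !hasLabel("processed")
rule_labels = (
    event["type"] == "content"
    and has_label(family, "needs-processing")
    and not has_label(family, "processed")
)
```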

Pipeline Configuration

Each assistant has a pipeline defined in its options. The pipeline is an ordered list of steps, where each step references a module and can include options and conditionals:

pipeline:
  steps:
    - ref: module://kodexa/pdf-parser
      name: Parse PDF
      stepType: MODEL
      options:
        ocr_enabled: true

    - ref: module://kodexa/invoice-extractor
      name: Extract Invoice Data
      stepType: MODEL
      options:
        confidence_threshold: 0.85
      conditional: "metadata.get('document_type') == 'invoice'"

Each step has:

Field	Description
`ref`	Module reference in the format `module://orgSlug/moduleSlug`
`name`	A human-readable name for the step
`stepType`	The type of step — typically `MODEL` for module execution
`options`	Key-value options passed to the module at runtime
`conditional`	An optional expression that determines whether the step should execute (see Data Flow Step Conditionals)
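As a rough illustration of how these fields fit together, the sketch below models the two steps from the example and decides which ones would run for a given document. The `Step` class and `should_run` helper are hypothetical, and Python's `eval` is used purely for demonstration; how the platform actually evaluates conditionals is covered in Data Flow Step Conditionals.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    ref: str
    name: str
    step_type: str = "MODEL"
    options: dict = field(default_factory=dict)
    conditional: Optional[str] = None


def should_run(step: Step, metadata: dict) -> bool:
    """A step without a conditional always runs; otherwise evaluate the expression."""
    if step.conditional is None:
        return True
    # Illustration only: eval with builtins disabled is still not safe for untrusted input.
    return bool(eval(step.conditional, {"__builtins__": {}}, {"metadata": metadata}))


steps = [
    Step("module://kodexa/pdf-parser", "Parse PDF", options={"ocr_enabled": True}),
    Step("module://kodexa/invoice-extractor", "Extract Invoice Data",
         options={"confidence_threshold": 0.85},
         conditional="metadata.get('document_type') == 'invoice'"),
]

for_invoice = [s.name for s in steps if should_run(s, {"document_type": "invoice"})]
for_receipt = [s.name for s in steps if should_run(s, {"document_type": "receipt"})]
```

With this model, an invoice runs both steps, while a receipt runs only `Parse PDF`.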

Execution Lifecycle

An execution moves through the following statuses:

PENDING --> RUNNING --> SUCCEEDED
                   \--> FAILED
                   \--> CANCELLED

PENDING

The execution has been created and is waiting for the scheduler to pick it up. Executions are prioritized — higher priority executions are scheduled first, and within the same priority level, older executions are processed first.
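The ordering rule can be expressed as a two-part sort key. In this sketch the `priority` field name and the convention that a larger number means higher priority are assumptions; `createdOn` is borrowed from the API sort parameter used elsewhere on this page.

```python
from datetime import datetime

pending = [
    {"id": "a", "priority": 1, "createdOn": datetime(2024, 1, 1, 9, 0)},
    {"id": "b", "priority": 5, "createdOn": datetime(2024, 1, 1, 9, 5)},
    {"id": "c", "priority": 5, "createdOn": datetime(2024, 1, 1, 9, 1)},
]


def scheduling_order(executions: list) -> list:
    """Highest priority first; within a priority level, oldest first."""
    return sorted(executions, key=lambda e: (-e["priority"], e["createdOn"]))


order = [e["id"] for e in scheduling_order(pending)]
```

In this example `c` and `b` share the highest priority, so the older one (`c`) is scheduled first, followed by `b`, then `a`.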

RUNNING

The scheduler has planned the execution by creating slices — one per pipeline step. Each slice is dispatched to a module runtime (a Lambda function) for processing:

The scheduler reads the pipeline steps and creates an execution slice for each step.
Each slice is enqueued to an SQS queue keyed by the module runtime.
The dispatcher polls the queue, reserves concurrency for the runtime, and invokes the Lambda function with the slice payload.
The Lambda container downloads the module, sets up the environment, and calls the module’s entry point function.
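The planning and enqueueing steps can be modelled in memory as follows. This is a simplified sketch: plain Python deques stand in for the SQS queues, no Lambda function is invoked, concurrency reservation is left out, and the slice field names (`executionId`, `stepIndex`, `runtime`) are assumptions.

```python
from collections import defaultdict, deque
from typing import Callable, Dict, List


def plan_slices(execution_id: str, steps: List[dict],
                runtime_for: Callable[[str], str]) -> List[dict]:
    """Create one slice per pipeline step, tagged with the runtime that will run it."""
    return [
        {
            "executionId": execution_id,
            "stepIndex": index,
            "ref": step["ref"],
            "runtime": runtime_for(step["ref"]),
        }
        for index, step in enumerate(steps)
    ]


def enqueue(slices: List[dict], queues: Dict[str, deque]) -> None:
    """Place each slice on the queue keyed by its module runtime."""
    for slice_ in slices:
        queues[slice_["runtime"]].append(slice_)


steps = [
    {"ref": "module://kodexa/pdf-parser"},
    {"ref": "module://kodexa/invoice-extractor"},
]
queues: Dict[str, deque] = defaultdict(deque)
slices = plan_slices("exec-1", steps, lambda ref: "kodexa/base-module-runtime")
enqueue(slices, queues)

# A dispatcher would poll the queue and hand the payload to the runtime.
next_slice = queues["kodexa/base-module-runtime"].popleft()
```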

SUCCEEDED

All slices completed successfully. The execution’s endDate is set and the status moves to SUCCEEDED.

FAILED

One or more slices failed or timed out. If a slice’s lease expires before completion, it is marked as TIMED_OUT. If the Lambda invocation itself fails, the slice is marked as FAILED. Either condition causes the overall execution to be marked FAILED.
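The roll-up from slice outcomes to the execution status can be sketched like this. The slice statuses `FAILED` and `TIMED_OUT` come from the description above; the slice-level names `SUCCEEDED` and `RUNNING`, and both function names, are assumptions for the example.

```python
from datetime import datetime
from typing import Iterable


def slice_status(completed: bool, invocation_failed: bool,
                 lease_expires_at: datetime, now: datetime) -> str:
    """Classify a single slice from its invocation result and lease."""
    if invocation_failed:
        return "FAILED"
    if completed:
        return "SUCCEEDED"
    if now >= lease_expires_at:
        return "TIMED_OUT"  # lease expired before the slice completed
    return "RUNNING"


def execution_status(slice_statuses: Iterable[str]) -> str:
    """Any failed or timed-out slice fails the execution; all succeeded means success."""
    statuses = list(slice_statuses)
    if any(status in ("FAILED", "TIMED_OUT") for status in statuses):
        return "FAILED"
    if statuses and all(status == "SUCCEEDED" for status in statuses):
        return "SUCCEEDED"
    return "RUNNING"


now = datetime(2024, 1, 1, 12, 0)
expired = slice_status(False, False, datetime(2024, 1, 1, 11, 55), now)
overall = execution_status(["SUCCEEDED", expired])
```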

CANCELLED

The execution was explicitly cancelled via the API before it completed.

Execution Context

Every execution carries a context — a JSON object that is passed to each module in the pipeline. The context is built from the triggering event and the assistant’s configuration:

Key	Description
`eventType`	The type of event that triggered the execution (e.g., `CONTENT_CREATED`)
`documentFamilyId`	The ID of the document family being processed, if applicable
`contentObjectId`	The ID of the content object that triggered the event
`storeId`	The ID of the document store
`channelId`	The ID of the channel, for channel-triggered events
`taskId`	The ID of the task, for task-triggered events
`taxonomyRefs`	References to taxonomies configured on the assistant
`completeLabel`	A label to apply to the document family when processing completes

Modules can access the execution context through the pipeline_context parameter. See Magic Parameter Injection for details.
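For orientation, a context for a document upload might look like the dictionary below. The keys are the ones listed in the table; every value is a made-up placeholder, and the `describe_trigger` helper is hypothetical. It only shows that keys such as `channelId` and `taskId` are present for some triggers and absent for others.

```python
example_context = {
    "eventType": "CONTENT_CREATED",
    "documentFamilyId": "family-001",   # placeholder ID
    "contentObjectId": "content-001",   # placeholder ID
    "storeId": "store-001",             # placeholder ID
    "taxonomyRefs": ["acme/invoice-taxonomy"],  # placeholder reference
    "completeLabel": "processed",
}


def describe_trigger(context: dict) -> str:
    """Summarise what triggered the execution, tolerating missing optional keys."""
    event_type = context.get("eventType", "UNKNOWN")
    if context.get("channelId"):
        return f"{event_type} from channel {context['channelId']}"
    if context.get("taskId"):
        return f"{event_type} from task {context['taskId']}"
    if context.get("documentFamilyId"):
        return f"{event_type} for document family {context['documentFamilyId']}"
    return event_type


summary = describe_trigger(example_context)
```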

MODEL Steps

When a pipeline step has stepType: MODEL, the platform:

Resolves the module runtime — looks up the module runtime referenced by the module (e.g., kodexa/base-module-runtime) to determine which Lambda function to invoke.
Downloads the module — the Lambda container downloads the module’s code and any module sidecars to a local directory.
Calls the entry point — by default, the runtime looks for a package called module with a function called infer. The function receives the document and any configured options.
Returns the result — the module returns the processed document, which is passed forward in the pipeline.
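A module entry point following this default convention might look like the sketch below. Several things here are assumptions: the document is represented as a plain dictionary rather than the platform's document object, the configured option arrives as a keyword argument, and the filtering logic is invented purely to give the function something to do.

```python
# module/__init__.py (hypothetical layout for a package named "module")

def infer(document: dict, confidence_threshold: float = 0.85) -> dict:
    """Keep only extracted fields at or above the threshold, then return the document."""
    document["fields"] = [
        item for item in document.get("fields", [])
        if item["confidence"] >= confidence_threshold
    ]
    return document  # the returned document is passed forward in the pipeline


processed = infer(
    {"fields": [
        {"name": "total", "confidence": 0.93},
        {"name": "vendor", "confidence": 0.41},
    ]},
    confidence_threshold=0.85,
)
```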

The options passed to a MODEL step are structured as:

{
  "model_store": "module://kodexa/invoice-extractor",
  "model_options": {
    "confidence_threshold": 0.85
  },
  "assistant_id": "abc-123-def-456",
  "runtime_parameters": {}
}

model_store identifies which module to download and run.
model_options contains the inference options configured on the step.
assistant_id links the execution back to the assistant for context.
runtime_parameters can override module runtime behavior (e.g., custom entry points).
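The mapping from a pipeline step to this structure can be sketched as a small function. The name `build_model_step_options` is hypothetical, and the platform assembles this payload itself; the sketch only shows which step field ends up under which key.

```python
from typing import Optional


def build_model_step_options(step: dict, assistant_id: str,
                             runtime_parameters: Optional[dict] = None) -> dict:
    """Map a pipeline step definition onto the MODEL step options structure."""
    return {
        "model_store": step["ref"],                       # which module to download and run
        "model_options": dict(step.get("options", {})),   # options configured on the step
        "assistant_id": assistant_id,
        "runtime_parameters": dict(runtime_parameters or {}),
    }


step = {
    "ref": "module://kodexa/invoice-extractor",
    "name": "Extract Invoice Data",
    "stepType": "MODEL",
    "options": {"confidence_threshold": 0.85},
}
payload = build_model_step_options(step, "abc-123-def-456")
```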

Deduplication

The platform prevents duplicate executions. If a PENDING execution for the same assistant and document family was created within the last minute, any new event for that combination is ignored. This avoids redundant processing when multiple events fire in quick succession.
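The deduplication check can be sketched as a lookup over existing executions. The field names and the `is_duplicate` helper are assumptions, and the one-minute window is measured here from each execution's creation time.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(minutes=1)


def is_duplicate(existing: list, assistant_id: str, family_id: str,
                 now: datetime) -> bool:
    """True if a recent PENDING execution already covers this assistant and family."""
    return any(
        execution["status"] == "PENDING"
        and execution["assistantId"] == assistant_id
        and execution["documentFamilyId"] == family_id
        and now - execution["createdOn"] <= DEDUP_WINDOW
        for execution in existing
    )


now = datetime(2024, 1, 1, 12, 0, 30)
existing = [{
    "status": "PENDING",
    "assistantId": "assistant-a",
    "documentFamilyId": "family-001",
    "createdOn": datetime(2024, 1, 1, 12, 0, 0),
}]

ignored = is_duplicate(existing, "assistant-a", "family-001", now)
accepted_other_family = not is_duplicate(existing, "assistant-a", "family-002", now)
accepted_later = not is_duplicate(existing, "assistant-a", "family-001",
                                  now + timedelta(minutes=5))
```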

Monitoring Executions

Via the API

List recent executions for an assistant:

GET /api/executions?filter=assistantId: '{assistantId}'&sort=createdOn,desc

Get a specific execution with its pipeline and context:

GET /api/executions/{executionId}

Cancel a running execution:

PUT /api/executions/{executionId}/cancel
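When calling these endpoints from code, the filter value contains spaces and quotes, so it is worth URL-encoding it. The sketch below only builds the request URLs with the Python standard library; `BASE_URL` is a placeholder host, the function names are hypothetical, and authentication is deliberately left out because it depends on your environment.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://platform.example.com"  # placeholder, replace with your platform URL


def list_executions_url(assistant_id: str, newest_first: bool = True) -> str:
    """Build the list URL with the filter (and optional sort) safely encoded."""
    params = {"filter": f"assistantId: '{assistant_id}'"}
    if newest_first:
        params["sort"] = "createdOn,desc"
    return f"{BASE_URL}/api/executions?{urlencode(params, quote_via=quote)}"


def get_execution_url(execution_id: str) -> str:
    return f"{BASE_URL}/api/executions/{quote(execution_id, safe='')}"


def cancel_execution_request(execution_id: str) -> tuple:
    """Return the HTTP method and URL for cancelling an execution."""
    return ("PUT", f"{get_execution_url(execution_id)}/cancel")


list_url = list_executions_url("abc-123")
cancel_method, cancel_url = cancel_execution_request("exec-1")
```

The resulting URLs can be passed to whichever HTTP client you already use.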

Via the UI

In the Kodexa UI, navigate to your project, select an assistant, and view the Executions tab. Each execution shows its status, duration, the pipeline steps that were run, and any errors that occurred.

Related Concepts

Assistants: the entities that own pipelines and create executions
Modules: the processing units that execute within pipeline steps
Module Runtimes: the runtime environments that host module execution
Event Handling with Modules: how modules can react to platform events
Data Flow Step Conditionals: conditional logic for skipping pipeline steps
Module Sidecars: shared code loaded alongside modules
