Intake scripts run server-side JavaScript on every document uploaded through an intake endpoint. They execute after metadata is merged but before the document is stored, giving you a chance to inspect the content, enrich metadata, reject invalid uploads, and choose which Activity Plan should run for that upload.
How It Works
When a document is uploaded to an intake:
- Metadata is merged (intake config + document metadata + upload params)
- Your script runs with access to filename, content, and metadata
- Based on the script return value, the upload is accepted or rejected
- If accepted, the document family is stored
- Kodexa starts the returned Activity Plan, or the static Activity Plan configured on the intake
Scripts run in a sandboxed JavaScript runtime with a 5-second timeout. They cannot make network requests, access the filesystem, or call platform APIs — they operate purely on the data provided to them.
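The steps above can be modelled locally as a rough sketch. Everything here is hypothetical scaffolding for reasoning about the flow, not a Kodexa API: `mergeMetadata`, `simulateIntake`, the `staticActivityPlan` property, and passing a `ctx` argument are all assumptions (real intake scripts see `filename`, `metadata`, and the other variables as globals).

```javascript
// Hypothetical local model of the intake flow; none of these names are Kodexa APIs.
function mergeMetadata(intakeConfig, documentMetadata, uploadParams) {
  // Later sources win: intake config < document metadata < upload params.
  var merged = {};
  var sources = [intakeConfig, documentMetadata, uploadParams];
  for (var i = 0; i < sources.length; i++) {
    var source = sources[i] || {};
    for (var key in source) {
      if (Object.prototype.hasOwnProperty.call(source, key)) {
        merged[key] = source[key];
      }
    }
  }
  return merged;
}

function simulateIntake(upload, intake, script) {
  // Step 1: merge metadata. Step 2: run the script.
  var metadata = mergeMetadata(intake.metadata, upload.documentMetadata, upload.params);
  var result = script({ filename: upload.filename, metadata: metadata }) || {};

  // Step 3: a rejection stops the upload before anything is stored.
  if (result.reject === true) {
    return { stored: false, status: 400, rejectReason: result.rejectReason || "" };
  }

  // Steps 4 and 5: store, then pick the returned plan or the static one.
  return {
    stored: true,
    metadata: result.metadata || metadata,
    activityPlan: result.activityPlan || intake.staticActivityPlan || null
  };
}

var outcome = simulateIntake(
  { filename: "a.pdf", documentMetadata: { source: "file" }, params: { source: "api", priority: "urgent" } },
  { metadata: { source: "intake", region: "eu" }, staticActivityPlan: "document-triage" },
  function (ctx) { return { metadata: ctx.metadata }; }
);
// outcome.metadata.source is "api" (upload params win); outcome.activityPlan is "document-triage".
```

Because the sample script returns no activityPlan, the model falls back to the static plan, which mirrors the last step in the list.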
Available Variables
| Variable | Type | Description |
|---|---|---|
| filename | string | Original uploaded filename |
| fileSize | number | File size in bytes |
| mimeType | string | Detected MIME type of the file |
| metadata | object | Mutable metadata object. Merged from: intake config < document metadata < upload params |
| document.text | string | Extracted text content (first 5 pages for PDFs) |
| document.pageCount | number | Number of pages (if detectable) |
| document.metadata | object | Read-only document-level metadata from the file |
| log | object | Write to server logs via log.debug, log.info, log.warn, log.error (variadic) |
| labels | string | Comma-separated label names from the upload request |
| statusId | string | Document status ID from the upload request |
| externalData | string | Raw external data JSON string from the upload request. After intake processing, this data is injected into the KDDB document under key "default" and accessible via doc.get_external_data(). |
| documentVersion | string | Document version from the upload request |
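Two of the variables above arrive as strings that usually need unpacking: labels is comma-separated and externalData is raw JSON. The helper below is a minimal sketch; `parseUploadFields` is a hypothetical name, and the assumption that either value may be empty or missing is not confirmed by the table.

```javascript
// Hypothetical helper: unpack the string-typed upload variables.
function parseUploadFields(labels, externalData) {
  // labels is one comma-separated string; split it and trim each name.
  var labelList = [];
  var parts = (labels || "").split(",");
  for (var i = 0; i < parts.length; i++) {
    var name = parts[i].replace(/^\s+|\s+$/g, "");
    if (name) {
      labelList.push(name);
    }
  }

  // externalData is a raw JSON string, so parse it defensively.
  var external = null;
  if (externalData) {
    try {
      external = JSON.parse(externalData);
    } catch (e) {
      external = null; // Malformed JSON is treated as "no external data".
    }
  }
  return { labels: labelList, external: external };
}

var parsed = parseUploadFields("urgent, finance ,", "{\"poNumber\":\"PO-42\"}");
// parsed.labels is ["urgent", "finance"]; parsed.external.poNumber is "PO-42".
```

Inside an intake script the call would be `parseUploadFields(labels, externalData)`, with the result copied into metadata or inputs as needed.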
Return Value
The script should return an object. All fields are optional:
return {
  metadata: metadata,               // Modified metadata object
  reject: false,                    // Reject the upload (HTTP 400)
  rejectReason: "",                 // Reason shown to caller
  activityPlan: "invoice-intake",   // Activity Plan slug or activity-plan://orgSlug/slug
  title: "Invoice: " + filename,    // Optional Activity title override
  description: "",                  // Optional Activity description override
  inputs: {                         // Optional Activity inputs
    sourceFilename: filename,
    documentType: "invoice"
  }
};
If the script returns nothing, the upload proceeds with the original metadata and falls back to the static Activity Plan configured on the intake.
Activity Plans
The activityPlan field lets an intake decide which business process should run for the uploaded document. An intake script starts one Activity. That Activity can then create zero, one, or many Tasks with CREATE_TASK steps.
Do not model human review directly in the intake script. The script classifies and routes the upload. The Activity Plan owns the workflow, and Task Templates are referenced inside the Activity Plan when the process needs human work.
Behavior
| Script returns | Result |
|---|---|
| activityPlan: "invoice-intake" | Starts the Activity Plan with that slug |
| activityPlan: "activity-plan://acme/invoice-intake" | Starts the Activity Plan identified by URI |
| No activityPlan field | Falls back to the static Activity Plan configured on the intake |
| reject: true | Rejects the upload before storage and starts no Activity |
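The two accepted forms of activityPlan, a bare slug and an activity-plan://orgSlug/slug URI, can be told apart with plain string handling. This is only an illustrative sketch: `parsePlanRef` is a hypothetical helper, and how the platform itself resolves these references is not described here.

```javascript
// Hypothetical helper: split an Activity Plan reference into its parts.
function parsePlanRef(ref) {
  var prefix = "activity-plan://";
  if (ref.indexOf(prefix) === 0) {
    var rest = ref.substring(prefix.length);
    var slash = rest.indexOf("/");
    // Both an org slug and a plan slug must be present in the URI form.
    if (slash <= 0 || slash === rest.length - 1) {
      return null;
    }
    return { orgSlug: rest.substring(0, slash), slug: rest.substring(slash + 1) };
  }
  // A bare slug carries no organisation component.
  return { orgSlug: null, slug: ref };
}

var byUri = parsePlanRef("activity-plan://acme/invoice-intake");
var bySlug = parsePlanRef("invoice-intake");
// byUri is { orgSlug: "acme", slug: "invoice-intake" }; bySlug.orgSlug is null.
```

A check like this can be useful in a script that builds plan references from metadata and wants to fail fast on a malformed value.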
Activity Start Fields
| Field | Type | Required | Default |
|---|---|---|---|
| activityPlan | string | No | Static intake Activity Plan |
| title | string | No | Activity Plan title template or default title |
| description | string | No | Empty |
| inputs | object | No | {} |
The uploaded document family is attached to the started Activity automatically. The document family ID is also recorded in the Activity trigger metadata, so your Activity steps can work from the Activity’s document context instead of passing the ID through the script.
If an upload should not start work, leave the intake’s static Activity Plan empty or route to a lightweight archival Activity Plan. Do not return an empty Task Template list to suppress work; intake scripts route to Activity Plans, not Task Templates.
Example: Route by Document Type
if (document.text.indexOf("INVOICE") >= 0) {
  metadata.documentType = "invoice";
  return {
    metadata: metadata,
    activityPlan: "invoice-intake",
    title: "Invoice: " + filename,
    inputs: {
      documentType: "invoice",
      sourceFilename: filename
    }
  };
}
if (document.text.indexOf("CONTRACT") >= 0) {
  metadata.documentType = "contract";
  return {
    metadata: metadata,
    activityPlan: "contract-intake",
    title: "Contract: " + filename,
    inputs: {
      documentType: "contract",
      sourceFilename: filename
    }
  };
}
metadata.documentType = "unknown";
return {
  metadata: metadata,
  activityPlan: "document-triage",
  title: "Triage: " + filename,
  inputs: {
    documentType: "unknown",
    sourceFilename: filename
  }
};
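The example above assumes document.text is always a string and matches uppercase keywords only. Since text may be unavailable for some file types, a more defensive variant is sketched below. The `ROUTES` table and `routeByText` function are hypothetical names, and the case-insensitive matching is a deliberate change from the original, not documented platform behaviour.

```javascript
// Hypothetical routing table; keywords and plan slugs mirror the example above.
var ROUTES = [
  { keyword: "INVOICE", type: "invoice", plan: "invoice-intake", label: "Invoice" },
  { keyword: "CONTRACT", type: "contract", plan: "contract-intake", label: "Contract" }
];

function routeByText(text, filename, metadata) {
  // Treat missing text as empty and compare case-insensitively.
  var haystack = (typeof text === "string" ? text : "").toUpperCase();
  var match = { type: "unknown", plan: "document-triage", label: "Triage" };
  for (var i = 0; i < ROUTES.length; i++) {
    if (haystack.indexOf(ROUTES[i].keyword) >= 0) {
      match = ROUTES[i]; // First matching route wins, as in the if-chain.
      break;
    }
  }
  metadata.documentType = match.type;
  return {
    metadata: metadata,
    activityPlan: match.plan,
    title: match.label + ": " + filename,
    inputs: { documentType: match.type, sourceFilename: filename }
  };
}

var routed = routeByText("Invoice #1001", "a.pdf", {});
// routed.activityPlan is "invoice-intake"; routed.title is "Invoice: a.pdf".
```

In an intake script the whole body would reduce to `return routeByText(document.text, filename, metadata);`, and adding a document type means adding one row to the table.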
Example: Activity Creates the Tasks
The intake script should not return multiple Task Templates. Route to an Activity Plan that contains the task creation logic:
var amount = document.text.match(/\$[\d,]+\.?\d*/);
return {
  metadata: metadata,
  activityPlan: "invoice-intake",
  title: "Invoice intake: " + filename,
  inputs: {
    priority: metadata.priority || "normal",
    detectedAmount: amount ? amount[0] : null,
    requiresApproval: metadata.priority === "urgent" || amount !== null
  }
};
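To make the amount detection concrete, the sketch below runs the same regular expression against a few sample strings. `detectAmount` is a hypothetical wrapper added for illustration, and the sample texts are invented.

```javascript
// Same pattern as the script above: a dollar sign, digits and commas, optional decimals.
var AMOUNT_PATTERN = /\$[\d,]+\.?\d*/;

// Hypothetical wrapper that tolerates missing text.
function detectAmount(text) {
  var match = (text || "").match(AMOUNT_PATTERN);
  return match ? match[0] : null;
}

var withDollars = detectAmount("Total due: $1,234.56 by May 1"); // "$1,234.56"
var withoutSymbol = detectAmount("Total due: 1,234.56 EUR");     // null
var firstOnly = detectAmount("Subtotal $90.00, total $99.00");   // "$90.00"
```

Two consequences are worth noting: only the first dollar amount in the text is captured, and requiresApproval becomes true whenever any dollar amount is found, regardless of its size.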
Inside invoice-intake, use Activity Plan steps to decide what human work is needed:
[
  {
    "slug": "extract",
    "kind": "EXECUTION",
    "config": {
      "moduleRef": "kodexa/invoice-extractor"
    }
  },
  {
    "slug": "route",
    "kind": "SCRIPT",
    "dependsOn": ["extract"],
    "config": {
      "scriptActions": [
        { "name": "needs-review" },
        { "name": "straight-through" }
      ],
      "scriptBody": "return { action: inputs.requiresApproval ? 'needs-review' : 'straight-through' };"
    }
  },
  {
    "slug": "analyst-review",
    "kind": "CREATE_TASK",
    "dependsOn": ["route:needs-review"],
    "config": {
      "taskTemplateRef": "invoice-review",
      "waitForCompletion": true
    }
  }
]
Example: Reject or Route
if (mimeType !== "application/pdf") {
  return {
    metadata: metadata,
    reject: true,
    rejectReason: "Only PDF files are accepted"
  };
}
metadata.sizeBucket = fileSize > 1024 * 1024 ? "large" : "standard";
return {
  metadata: metadata,
  activityPlan: fileSize > 1024 * 1024 ? "large-document-intake" : "standard-document-intake",
  inputs: {
    sizeBucket: metadata.sizeBucket,
    sourceFilename: filename
  }
};
Validation and Rejection
Return reject: true to refuse the upload before it is stored:
// Reject files over 100MB
if (fileSize > 100 * 1024 * 1024) {
  return {
    metadata: metadata,
    reject: true,
    rejectReason: "File exceeds 100MB limit"
  };
}
// Reject non-PDF files
if (!filename.endsWith(".pdf")) {
  return {
    metadata: metadata,
    reject: true,
    rejectReason: "Only PDF files are accepted"
  };
}
return {
  metadata: metadata,
  reject: false
};
The caller receives an HTTP 400 with the rejection reason:
{
  "error": "SCRIPT_REJECTED",
  "rejectReason": "File exceeds 100MB limit",
  "filename": "huge-file.bin"
}
The metadata object is mutable. Any changes are persisted on the document family:
metadata.source = "intake";
metadata.receivedAt = new Date().toISOString();
metadata.classification = document.text.indexOf("CONFIDENTIAL") >= 0
  ? "confidential"
  : "general";
return {
  metadata: metadata,
  reject: false
};
API Response
The upload response returns the created document family object. When an Activity starts successfully, the response also includes activityId:
{
  "id": "doc-family-id",
  "path": "invoice.pdf",
  "activityId": "activity-id"
}
| Field | Description |
|---|---|
| id | Created document family ID |
| path | Stored document path |
| activityId | ID of the Activity started from the returned or static Activity Plan |
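A caller can distinguish the three outcomes (rejected, stored with an Activity, stored without one) from the status code and body fields shown in this document. The function below is a sketch under assumptions: `describeUpload` is a hypothetical client-side helper, and since the success status code is not stated here, only the documented HTTP 400 rejection is checked explicitly.

```javascript
// Hypothetical client-side helper: summarise an upload response.
function describeUpload(status, body) {
  var payload = body || {};
  // Script rejections come back as HTTP 400 with error "SCRIPT_REJECTED".
  if (status === 400 && payload.error === "SCRIPT_REJECTED") {
    return "rejected: " + payload.rejectReason;
  }
  // activityId is only present when an Activity started successfully.
  if (payload.activityId) {
    return "stored " + payload.id + ", activity " + payload.activityId;
  }
  return "stored " + payload.id + ", no activity started";
}

var accepted = describeUpload(200, { id: "doc-family-id", path: "invoice.pdf", activityId: "activity-id" });
var refused = describeUpload(400, { error: "SCRIPT_REJECTED", rejectReason: "File exceeds 100MB limit" });
// accepted is "stored doc-family-id, activity activity-id"
// refused is "rejected: File exceeds 100MB limit"
```

The 200 passed in the usage line is a placeholder value for the sketch, not a documented status code.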
Shared Modules
Load reusable JavaScript modules using the Module Refs picker in the Script tab. Selected modules execute before your script, making their functions available in global scope:
// Assuming "my-org/doc-classifier" module provides classifyDocument()
var docType = classifyDocument(document.text);
return {
  metadata: metadata,
  activityPlan: docType.activityPlan || "document-triage",
  title: docType.title || "Triage: " + filename,
  inputs: {
    documentType: docType.type || "unknown",
    confidence: docType.confidence || 0,
    sourceFilename: filename
  }
};
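For context, a shared module that provides classifyDocument() might look roughly like the sketch below. The rules, plan slugs, and confidence values are invented for illustration; the actual contents of any "my-org/doc-classifier" module are unknown. The function is declared at top level so that it would land in global scope when the module executes before the intake script.

```javascript
// Hypothetical contents of a shared module such as "my-org/doc-classifier".
function classifyDocument(text) {
  // Illustrative rules only; confidence values are placeholders.
  var rules = [
    { pattern: /\binvoice\b/i, type: "invoice", activityPlan: "invoice-intake", confidence: 0.9 },
    { pattern: /\bcontract\b/i, type: "contract", activityPlan: "contract-intake", confidence: 0.8 }
  ];
  var source = typeof text === "string" ? text : "";
  for (var i = 0; i < rules.length; i++) {
    if (rules[i].pattern.test(source)) {
      return {
        type: rules[i].type,
        activityPlan: rules[i].activityPlan,
        confidence: rules[i].confidence
      };
    }
  }
  // No activityPlan or title here, so the caller's || fallbacks apply.
  return { type: "unknown", confidence: 0 };
}

var classified = classifyDocument("Invoice 12 for ACME Ltd");
// classified.type is "invoice"; classified.activityPlan is "invoice-intake".
```

Returning a partial object for unknown documents keeps the routing defaults ("document-triage", "Triage: " + filename) in the intake script rather than in the module.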
Limitations
- 5-second timeout — scripts that exceed this are terminated and the upload fails
- No network access — scripts cannot make HTTP requests or call external APIs
- No filesystem access — scripts operate only on the provided variables
- Text extraction — only the first 5 pages of PDFs are extracted; other file types may not have text available
- JavaScript runtime — supports ES5.1 JavaScript with some ES6+ features available
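Because only some ES6+ features are available, scripts that must be portable can stick to ES5.1 constructs. As one example, the sketch below replaces String.prototype.endsWith (an ES6 method) with an indexOf-based check that is also case-insensitive. `hasExtension` is a hypothetical helper, and whether endsWith is actually missing from the runtime is not stated here.

```javascript
// Hypothetical ES5.1-safe replacement for filename.endsWith(".pdf").
function hasExtension(filename, ext) {
  var name = String(filename || "").toLowerCase();
  var suffix = String(ext).toLowerCase();
  var at = name.length - suffix.length;
  // The suffix must start exactly where the last suffix.length characters begin.
  return at >= 0 && name.indexOf(suffix, at) === at;
}

var upper = hasExtension("Report.PDF", ".pdf");   // true
var nested = hasExtension("a.pdf.txt", ".pdf");   // false
var tooShort = hasExtension("pdf", ".pdf");       // false
```

Lowercasing both sides means uploads such as Report.PDF are accepted, which a plain case-sensitive suffix check would reject.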