Transactions enable you to batch multiple data object and data attribute operations into a single atomic unit. All operations are queued locally and executed in one call, providing both atomicity (all succeed or all fail) and significantly better performance for bulk operations.
Overview
Transactions are especially useful when:
- Creating many data objects and attributes at once
- Performing extraction operations that produce multiple related records
- Needing atomic rollback on failure
- Optimizing performance for bulk operations
Python Usage
In Python, use the batch_transaction() context manager:
from kodexa_document import Document
from kodexa_document.accessors import DataObjectInput, DataAttributeInput

with Document() as doc:
    root = doc.create_node("document", "Invoice #12345")
    doc.content_node = root

    with doc.batch_transaction() as tx:
        # Create a data object (returns immediately with a temporary record)
        invoice = tx.data_objects.create(DataObjectInput(
            path="/invoice"
        ))

        # Add attributes using the temporary ID
        tx.data_attributes.create(invoice['id'], DataAttributeInput(
            tag="vendor-name",
            string_value="Acme Corp",
            confidence=0.95
        ))
        tx.data_attributes.create(invoice['id'], DataAttributeInput(
            tag="total-amount",
            decimal_value=1234.56,
            confidence=0.92
        ))

        # Create child objects
        for desc, amount in [("Widget A", 100.0), ("Widget B", 250.0)]:
            line_item = tx.data_objects.create(DataObjectInput(
                parent_id=invoice['id'],
                path="/invoice/line-item"
            ))
            tx.data_attributes.create(line_item['id'], DataAttributeInput(
                tag="description",
                string_value=desc,
                confidence=0.90
            ))
            tx.data_attributes.create(line_item['id'], DataAttributeInput(
                tag="amount",
                decimal_value=amount,
                confidence=0.90
            ))

        print(f"Queued {tx.operation_count} operations")

    # All operations are committed atomically when exiting the context

    # Verify the results
    objects = doc.data_objects.get_all()
    print(f"Created {len(objects)} data objects")
Transaction Operations
The TransactionContext provides accessors that mirror the standard data accessors:
| Accessor | Method | Description |
|---|---|---|
| tx.data_objects | create(input) | Queue a data object creation |
| tx.data_objects | update(id, updates) | Queue a data object update |
| tx.data_objects | delete(id) | Queue a data object deletion |
| tx.data_attributes | create(obj_id, input) | Queue an attribute creation |
| tx.data_attributes | update(id, updates) | Queue an attribute update |
| tx.data_attributes | delete(id) | Queue an attribute deletion |
| tx.data_attributes | set_value(id, value) | Queue a value update |
| tx.data_attributes | set_confidence(id, confidence) | Queue a confidence update |
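The "all succeed or all fail" behavior for a queue of create, update, and delete operations can be pictured with a small, self-contained model. This is only a sketch and does not use the Kodexa library: `apply_atomically`, the `(kind, key, value)` tuple shape, and the dictionary store are hypothetical names chosen for illustration, and the real commit may work quite differently.

```python
import copy


def apply_atomically(store, operations):
    """Hypothetical sketch: apply every operation or leave the store untouched."""
    working = copy.deepcopy(store)  # work on a copy so a failure changes nothing
    for kind, key, value in operations:
        if kind == "create":
            if key in working:
                raise KeyError(f"{key} already exists")
            working[key] = value
        elif kind == "update":
            if key not in working:
                raise KeyError(f"{key} does not exist")
            working[key] = value
        elif kind == "delete":
            del working[key]  # raises KeyError if the key is missing
        else:
            raise ValueError(f"unknown operation: {kind}")
    # Publish the result only after every operation has succeeded.
    store.clear()
    store.update(working)


store = {"a": 1}

# Every operation is valid, so both changes become visible together.
apply_atomically(store, [("create", "b", 2), ("update", "a", 10)])
after_success = dict(store)

# The second operation fails, so the first one is not applied either.
try:
    apply_atomically(store, [("create", "c", 3), ("delete", "missing", None)])
except KeyError:
    pass
after_failure = dict(store)

print(after_success)  # {'a': 10, 'b': 2}
print(after_failure)  # {'a': 10, 'b': 2}
```

Working on a copy keeps the model short; it says nothing about how the library itself achieves atomicity.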
ID Resolution
When you create a data object within a transaction, it returns a record with a temporary ID. You can use this temporary ID to create child objects or attributes within the same transaction. The IDs are resolved to real database IDs when the transaction is committed.
with doc.batch_transaction() as tx:
    # parent gets a temporary ID
    parent = tx.data_objects.create(DataObjectInput(path="/parent"))

    # Use the temporary ID for the child - it's resolved on commit
    child = tx.data_objects.create(DataObjectInput(
        parent_id=parent['id'],
        path="/parent/child"
    ))

    # Use the child's temporary ID for attributes
    tx.data_attributes.create(child['id'], DataAttributeInput(
        tag="name",
        string_value="Child attribute"
    ))
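To make the resolution step concrete, here is a minimal sketch of how temporary IDs could be mapped to real IDs at commit time. It is a toy model, not the Kodexa implementation: `ToyTransaction`, the `tmp-` prefix, and the integer real IDs are all assumptions made for illustration.

```python
import itertools


class ToyTransaction:
    """Hypothetical stand-in for a transaction; not the Kodexa implementation."""

    def __init__(self):
        self._queue = []                     # queued create operations
        self._temp_ids = itertools.count(1)  # source of temporary IDs

    def create(self, path, parent_id=None):
        # Return immediately with a temporary record; nothing is stored yet.
        record = {
            "id": f"tmp-{next(self._temp_ids)}",
            "path": path,
            "parent_id": parent_id,
        }
        self._queue.append(record)
        return record

    def commit(self, store):
        # Assign real IDs in queue order and rewrite temporary parent references.
        resolved = {}
        for record in self._queue:
            real_id = len(store) + 1
            resolved[record["id"]] = real_id
            parent = record["parent_id"]
            store[real_id] = {
                "id": real_id,
                "path": record["path"],
                # A temporary parent ID is swapped for its real ID; None stays None.
                "parent_id": resolved.get(parent, parent),
            }
        self._queue.clear()
        return resolved


store = {}
tx = ToyTransaction()
parent = tx.create("/parent")
child = tx.create("/parent/child", parent_id=parent["id"])
mapping = tx.commit(store)

print(mapping)                # {'tmp-1': 1, 'tmp-2': 2}
print(store[2]["parent_id"])  # 1
```

Because parents are always queued before the children that reference them, a single pass in queue order is enough to resolve every reference in this model.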
Error Handling
If an exception occurs within the transaction block, all queued operations are discarded:
try:
    with doc.batch_transaction() as tx:
        tx.data_objects.create(DataObjectInput(path="/test"))
        raise ValueError("Something went wrong")
        # The create operation is NOT committed
except ValueError:
    print("Transaction rolled back")

# No data objects were created
assert len(doc.data_objects.get_all()) == 0
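The discard-on-exception behavior follows naturally from the context manager pattern. The sketch below models it with the standard library only; `toy_batch_transaction` and the list-based store are hypothetical and are not part of the Kodexa API.

```python
from contextlib import contextmanager


@contextmanager
def toy_batch_transaction(store):
    """Hypothetical model: queue locally, apply only on a clean exit."""
    queued = []
    yield queued  # the caller appends operations to this list
    # Reached only when the with-block raised nothing. If it did raise, the
    # exception propagates from the yield above and the queue is dropped.
    store.extend(queued)


store = []

# An exception inside the block means nothing reaches the store.
try:
    with toy_batch_transaction(store) as tx:
        tx.append({"path": "/test"})
        raise ValueError("Something went wrong")
except ValueError:
    pass
failed_count = len(store)

# A clean exit applies everything that was queued.
with toy_batch_transaction(store) as tx:
    tx.append({"path": "/test"})
committed_count = len(store)

print(failed_count, committed_count)  # 0 1
```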
TypeScript Usage
In TypeScript, use the transaction() method with an async callback:
import { Kodexa } from '@kodexa-ai/document-wasm-ts';

async function batchCreateData() {
  await Kodexa.init();
  const doc = await Kodexa.createDocument();

  try {
    await doc.transaction(async (tx) => {
      // Create a data object
      const invoice = tx.dataObjects.create({
        path: '/invoice'
      });

      // Add attributes using the temporary ID
      tx.dataAttributes.create(invoice.id, {
        tag: 'vendor-name',
        stringValue: 'Acme Corp',
        confidence: 0.95
      });
      tx.dataAttributes.create(invoice.id, {
        tag: 'total-amount',
        decimalValue: 1234.56,
        confidence: 0.92
      });

      // Create child objects
      const items = [
        { desc: 'Widget A', amount: 100.0 },
        { desc: 'Widget B', amount: 250.0 }
      ];
      for (const { desc, amount } of items) {
        const lineItem = tx.dataObjects.create({
          parentId: invoice.id,
          path: '/invoice/line-item'
        });
        tx.dataAttributes.create(lineItem.id, {
          tag: 'description',
          stringValue: desc,
          confidence: 0.90
        });
        tx.dataAttributes.create(lineItem.id, {
          tag: 'amount',
          decimalValue: amount,
          confidence: 0.90
        });
      }

      console.log(`Queued ${tx.operationCount} operations`);
    });
    // All operations are committed atomically

    // Verify results
    const objects = await doc.dataObjects.getAll();
    console.log(`Created ${objects.length} data objects`);
  } finally {
    doc.dispose();
  }
}
Transactions provide significant performance benefits for bulk operations:
- Without transactions: Each create/update/delete is a separate FFI/WASM call
- With transactions: All operations are batched into a single call
For operations involving dozens or hundreds of data objects and attributes, transactions can be orders of magnitude faster.
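The difference in boundary crossings can be illustrated with a toy counter. This sketch measures call counts only, not time, and `ToyBoundary` is a hypothetical stand-in for the FFI/WASM layer rather than anything in the Kodexa library; the numbered paths are made up for the example.

```python
class ToyBoundary:
    """Hypothetical stand-in for an FFI/WASM boundary that counts crossings."""

    def __init__(self):
        self.calls = 0
        self.records = []

    def apply(self, operations):
        # One crossing per invocation, however many operations it carries.
        self.calls += 1
        self.records.extend(operations)


operations = [{"path": f"/invoice/line-item/{i}"} for i in range(200)]

# Without batching: one crossing per operation.
unbatched = ToyBoundary()
for op in operations:
    unbatched.apply([op])

# With batching: a single crossing carries the whole queue.
batched = ToyBoundary()
batched.apply(operations)

print(unbatched.calls, batched.calls)  # 200 1
```

Both paths end with the same records; only the number of crossings differs, which is where any per-call overhead would accumulate.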
Best Practices
- Use transactions for bulk operations: Any time you’re creating more than a few data objects or attributes, wrap them in a transaction.
- Keep transactions focused: Don’t mix unrelated operations in the same transaction.
- Handle errors: Wrap transaction blocks in try/except (Python) or try/catch (TypeScript) to handle failures gracefully.
- Check operation count: Use tx.operation_count (Python) or tx.operationCount (TypeScript) to verify the expected number of operations before committing.
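As one way to apply the operation-count check, the sketch below computes the expected count for the invoice example and raises if the queue disagrees. The helper names and `OperationCountMismatch` are hypothetical, and the sketch assumes each queued create counts as exactly one operation, which the text above does not state explicitly.

```python
class OperationCountMismatch(Exception):
    """Raised when the queue does not hold the expected number of operations."""


def expected_operations(num_objects, attributes_per_object):
    # Assumption: one operation per object create plus one per attribute create.
    return num_objects + num_objects * attributes_per_object


def check_operation_count(actual, expected):
    # Raising inside the transaction block discards the queued operations.
    if actual != expected:
        raise OperationCountMismatch(
            f"expected {expected} operations, queued {actual}"
        )


# Invoice example: 1 invoice with 2 attributes, 2 line items with 2 attributes each.
expected = expected_operations(1, 2) + expected_operations(2, 2)
print(expected)  # 9

check_operation_count(9, expected)  # passes silently
```

In Python, a call such as `check_operation_count(tx.operation_count, expected)` would go on the last line inside the `with doc.batch_transaction() as tx:` block, so that a mismatch raises before the context exits.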