The Kodexa Agent does more than reply with text. It can drive the workspace UI directly: render decision prompts in the chat stream, open documents, focus specific fields, and send notifications you will see even if the chat pane is not the focused panel. These interactions are not a separate mode you turn on — they happen automatically when the agent decides the right action is to drive the UI instead of writing a sentence about it.

When The Agent Asks You A Question

Sometimes the agent needs a decision before it can continue. Instead of asking in free text and parsing your reply, it renders an interactive prompt right in the chat stream. You will see three flavors of these prompts:
  • Choice prompt: the agent offers a short list of options as buttons. Click one to send your choice back.
  • Confirm prompt: a yes / no question, typically shown before the agent makes a change. Click to confirm or decline.
  • Document picker: the agent needs you to pick a document from a list (for example, “which invoice are you asking about?”). Click the document you mean.
These prompts are tied to the message that asked them. Clicking an option records your choice as your next message, so the channel history reads naturally — “the agent asked X, you chose Y, then the agent continued”.
You can usually still type a free-text reply instead of clicking. The agent will accept either. Clicking is faster when the options cover what you want.
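For readers who think in data structures, the three prompt flavors and the "your choice becomes your next message" behavior can be pictured roughly as follows. This is only an illustrative sketch: the class names, fields, and the `answer` helper are hypothetical and are not part of any Kodexa API.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical models of the three prompt flavors described above.
@dataclass
class ChoicePrompt:
    question: str
    options: List[str]


@dataclass
class ConfirmPrompt:
    question: str


@dataclass
class DocumentPicker:
    question: str
    document_names: List[str]


Prompt = Union[ChoicePrompt, ConfirmPrompt, DocumentPicker]


@dataclass
class Message:
    author: str  # "agent" or "user"
    text: str


def prompt_options(prompt: Prompt) -> List[str]:
    """Return the clickable labels a prompt would offer."""
    if isinstance(prompt, ChoicePrompt):
        return list(prompt.options)
    if isinstance(prompt, ConfirmPrompt):
        return ["Yes", "No"]
    return list(prompt.document_names)


def answer(history: List[Message], prompt: Prompt, reply: str) -> List[Message]:
    """Record the agent's question and the user's reply as consecutive messages.

    The reply may be a clicked option or free text; both are accepted.
    """
    return history + [Message("agent", prompt.question), Message("user", reply)]
```

In this sketch a clicked option and a typed reply end up as the same kind of message, which is why the history reads as "the agent asked X, you chose Y".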

When The Agent Opens A Document

If the agent decides a document is relevant to the conversation, it can open it for you. You will see the workspace switch focus to the document viewer — the same view you would get by clicking the document in the task or project. This is useful in two common cases:
  • You asked about a specific document by name, and the agent is just saving you the click.
  • The agent is walking you through evidence — “here is the invoice where the total is below the line item sum”.
The chat pane stays open beside the document, so you can continue the conversation while looking at it.

When The Agent Focuses An Attribute

Once a document is open, the agent can scroll to and highlight a specific attribute — a particular field, value, or extracted region. This is the most precise UI-driving behavior the agent has: it puts your eye on exactly the part of the document the conversation is about. Try:
  • Open the invoice and focus the line items table.
  • Show me where "total_due" was extracted on the document.
The viewer scrolls; the attribute is highlighted; you can continue talking with the agent about that specific piece of the document.

Notifications

The agent can send notifications that surface as toasts in the workspace shell — short, non-blocking messages you will see even if the chat pane is collapsed or you are in another part of the workspace. Use cases:
  • A long-running operation completed.
  • The agent has a question waiting for you on a channel you are not currently looking at.
  • A change you asked for has been applied.
Notifications are informational. Clicking one usually takes you to the relevant channel or resource.

Auditability

Every UI interaction the agent performs is recorded on the channel where it happened. If the agent opened a document, focused an attribute, or asked you a question, the action is visible in the message stream. This is the source of truth for “what did the agent actually do?” — see Troubleshooting for how to use the channel history.
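As a rough mental model of reviewing a channel for what the agent actually did, the sketch below filters a message stream down to its UI actions. The `ChannelEvent` record and the `kind` labels are assumptions made for illustration; the real channel history is read in the workspace, and nothing here reflects an actual Kodexa data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelEvent:
    kind: str    # hypothetical labels, e.g. "text", "open_document"
    detail: str


# Hypothetical labels for the three UI actions named above.
UI_ACTION_KINDS = {"open_document", "focus_attribute", "prompt"}


def ui_actions(history: List[ChannelEvent]) -> List[ChannelEvent]:
    """Keep only the events where the agent drove the UI, in original order."""
    return [event for event in history if event.kind in UI_ACTION_KINDS]


history = [
    ChannelEvent("text", "Here is the invoice you asked about."),
    ChannelEvent("open_document", "invoice"),
    ChannelEvent("focus_attribute", "total_due"),
    ChannelEvent("text", "The total is below the line item sum."),
]

for event in ui_actions(history):
    print(event.kind, event.detail)
```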