curl --request POST \
  --url https://platform.kodexa.ai/api/organizations/{orgId}/ai/chat/completions \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '{
    "messages": [
      {
        "content": "<string>",
        "role": "system"
      }
    ],
    "model": "<string>",
    "max_tokens": 123,
    "n": 123,
    "stop": ["<string>"],
    "stream": true,
    "temperature": 123,
    "top_p": 123
  }'

{
  "choices": [
    {
      "finish_reason": "<string>",
      "index": 123,
      "message": {
        "content": "<string>",
        "role": "system"
      }
    }
  ],
  "created": 123,
  "id": "<string>",
  "model": "<string>",
  "object": "<string>",
  "usage": {
    "completion_tokens": 123,
    "prompt_tokens": 123,
    "total_tokens": 123
  }
}

Creates a chat completion using the specified model via the AI Gateway. Supports both synchronous and streaming (SSE) responses. Set stream: true in the request body to receive server-sent events. The request is proxied to the organization's configured AI provider (Bedrock, Azure, Gemini, etc.).
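A minimal Python sketch of building the synchronous call, assuming a hypothetical org UUID, API key, and model name; the endpoint path and headers mirror the curl example above.

```python
import json
import urllib.request

BASE_URL = "https://platform.kodexa.ai/api/organizations"

def build_chat_request(org_id: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request for the AI Gateway chat-completions endpoint."""
    url = f"{BASE_URL}/{org_id}/ai/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )

# Illustrative request body; the model name is an assumption, not a
# guaranteed identifier on your gateway.
body = {
    "messages": [{"role": "system", "content": "You are a helpful assistant."}],
    "model": "gpt-4",
    "max_tokens": 256,
    "temperature": 0.7,
}
req = build_chat_request("my-org-uuid", "my-api-key", body)
# resp = urllib.request.urlopen(req)   # uncomment to actually send
# completion = json.load(resp)
```

The call itself is left commented out so the snippet can be inspected without network access.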
API key for authentication. Create one from the Kodexa platform UI under Settings > Access Tokens.
orgId: Organization UUID. The authenticated user must be a member of this organization.

Request body for creating a chat completion. Compatible with the OpenAI Chat Completions API.

messages: A list of messages comprising the conversation; each message has a role and content.
model: ID of the model to use (e.g., 'gpt-4', 'claude-3-sonnet').
max_tokens: Maximum number of tokens to generate.
n: Number of completions to generate.
stop: Sequences where the API will stop generating.
stream: If true, partial message deltas are sent as server-sent events (SSE).
temperature: Sampling temperature between 0 and 2.
top_p: Nucleus sampling parameter.
Chat completion response. When stream: true, the response is a stream of server-sent events with Content-Type: text/event-stream.
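When stream: true is set, the events arrive as data: lines. A sketch of a parser for a buffered event stream, assuming the OpenAI-style streaming convention of one JSON chunk per data: line with incremental delta content and a [DONE] sentinel (the doc states OpenAI compatibility, but verify the chunk shape against your gateway):

```python
import json

def collect_stream_content(sse_text: str) -> str:
    """Concatenate message content from a buffered SSE chat-completion stream."""
    parts = []
    for line in sse_text.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Streaming chunks carry partial text under "delta" rather
            # than a full "message" object (OpenAI-style assumption).
            parts.append(choice.get("delta", {}).get("content", ""))
    return "".join(parts)

stream = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n'
    "data: [DONE]\n"
)
print(collect_stream_content(stream))  # -> Hello
```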
Response from the chat completion endpoint. Compatible with the OpenAI Chat Completions API response format.
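For a synchronous (non-streaming) call, extracting the assistant message and token usage from the response might look like this; the field values are illustrative, following the response schema shown above:

```python
import json

# A fabricated example response matching the documented schema.
response_json = """
{
  "choices": [{"finish_reason": "stop", "index": 0,
               "message": {"role": "assistant", "content": "Hello!"}}],
  "created": 1700000000, "id": "chatcmpl-abc", "model": "gpt-4",
  "object": "chat.completion",
  "usage": {"completion_tokens": 2, "prompt_tokens": 10, "total_tokens": 12}
}
"""
resp = json.loads(response_json)
answer = resp["choices"][0]["message"]["content"]
billed = resp["usage"]["total_tokens"]
print(answer, billed)  # -> Hello! 12
```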