OpenAI-Compatible API Integration

CozyUI includes a powerful feature that allows you to expose any of your workflows as a fully-featured, conversational agent accessible via an OpenAI-compatible API. This means you can interact with your complex, multi-step workflows using standard OpenAI client libraries and tools like Open WebUI, just as you would with models like GPT-4o.

How it Works: From Workflow to Conversational Agent

The magic of this integration lies in how it transforms a workflow with multiple inputs into a natural, multi-turn conversation.

When you expose a workflow, you decide which of its input parameters are Fixed Inputs and which are Query Inputs.

Fixed Inputs: These are values you, the workflow creator, set in advance. They remain constant for every run of the exposed workflow (e.g., a specific API key, a file path, a system prompt for an internal LLM).
Query Inputs: These are the values that need to be provided by the end-user. Instead of requiring the user to provide them all at once in a structured format, CozyUI's API will conversationally gather them.

The Conversational Flow

Let's imagine you've created a "Trip Planner" workflow that requires three Query Inputs: destination, budget, and travel_dates.

Initial Prompt: A user starts a conversation with your exposed workflow model.
User: "I want to plan a trip to Paris."
Parameter Extraction: CozyUI's internally called LLM receives this message. It analyzes the text and compares it against the required Query Inputs (destination, budget, travel_dates). It successfully extracts "destination": "Paris". During parameter extraction, information about the workflow, and nodes are injected in the context, so the more descriptive you are, the more punctual the parameter extraction will be.
Asking for Missing Information: The agent sees that budget and travel_dates are still missing. It uses another LLM call to formulate a natural follow-up question.
Agent (CozyUI): "Paris is a wonderful choice! To help plan your trip, what is your budget and what are your desired travel dates?"
Gathering More Information: The user provides the remaining details.
User: "My budget is around $2000, and I'd like to go for a week in mid-October."
Execution: The agent now has all the required Query Inputs. It combines these with any Fixed Inputs you pre-configured and starts the actual execution of your CozyUI workflow.

This entire conversational layer is handled automatically. You only define which inputs are needed, and CozyUI handles the chatbot logic for you automatically.

The Streaming Response: In thinking blocks

When your workflow executes, the API streams the response back in a format compatible with OpenAI's streaming chat completions. However, CozyUI adds a special feature for transparency: a <thinking> block.

While your workflow is running, the agent will stream real-time updates about which nodes are executing. This gives you a live view into the agent's "thought process."

WARNING

TODO insert example thought processs

These thinking blocks are usually hidden by default, keeping it out of the way.

This allows you to see the progress of the underlying workflow execution before the final answer is delivered.

Handling the Conversation ID

When you initiate a new chat with an exposed workflow, the agent's first response will include a unique Conversation ID, automatically injected into the message and wrapped in a special cozyui-conversation-identifier code block. This identifier is the key to enabling stateful, multi-turn conversations. It allows CozyUI to link all subsequent requests to the same chat session, remembering the full conversation history and any parameters you've already provided, without having to run data extraction on user messages again (which uses an LLM call).

For your client application to function correctly, it is needs to send back the entire message history with each new request, including all previous messages. Most standard OpenAI client libraries manage this history automatically, but it is an important technical detail to be aware of, if you do implement your own client.

Technical Details & API Usage

You can interact with your exposed workflows using any standard OpenAI client.

INFO

OpenAI endpoints are currently not authenticated.

API Endpoint: POST /openai/chat/completions

The request body should follow the standard OpenAI Chat Completions format. The key is to use the name you assigned to your Exposed Workflow in the model field.

Example: Calling with `curl`

bash

curl -N -X POST https://your-cozyui-instance/openai/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "trip-planner",
  "messages": [
    {
      "role": "user",
      "content": "I want to plan a trip to Paris."
    }
  ],
  "stream": true
}'

Example: Calling with Python

This is the most common use case. Make sure you have the openai library installed (pip install openai). The critical step is to point the client to your CozyUI instance by setting the base_url.

Example Python client

python

import os
from openai import OpenAI

# Point the client to your CozyUI instance's OpenAI-compatible endpoint
client = OpenAI(
    api_key="", # CozyUI currently does not require an API Key
    base_url="https://your-cozyui-instance/openai"
)

# The 'model' is the unique name you gave your Exposed Workflow
response_stream = client.chat.completions.create(
    model="trip-planner",
    messages=[
        {"role": "user", "content": "I want to plan a trip to Paris."}
    ],
    stream=True
)

print("Agent is responding...")
for chunk in response_stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end='', flush=True)

print("\n\nStream finished.")

Listing Available Models

To see a list of all workflows you have exposed via the OpenAI API, you can make a GET request to the models endpoint.

Endpoint: GET /openai/models

Example curl requestExample Response

bash

curl https://your-cozyui-instance/openai/models

json

{
  "object": "list",
  "data": [
    {
      "id": "model-1",
      "object": "model",
      "created": 1686935002,
      "owned_by": "cozyui"
    },
    {
      "id": "model-2",
      "object": "model",
      "created": 1686935002,
      "owned_by": "cozyui"
    },
    {
      "id": "model-3",
      "object": "model",
      "created": 1686935002,
      "owned_by": "cozyui"
    },
  ],
  "object": "list"
}

This will return a JSON object listing all available "models" (your exposed workflows), compatible with the standard OpenAI v1/models response format.

The model object

property	type	description
created	integer	The Unix timestamp (in seconds) when the model was created.
id	string	The model identifier, which can be referenced in the API endpoints.
object	string	The object type, which is always `model`.
owned_by	string	The organization that owns the model. Currently a static value: `cozyui`.

OpenAI-Compatible API Integration ​

How it Works: From Workflow to Conversational Agent ​

The Conversational Flow ​

The Streaming Response: In thinking blocks ​

Handling the Conversation ID ​

Technical Details & API Usage ​

Example: Calling with curl ​

Example: Calling with Python ​

Listing Available Models ​

The model object ​