Chat completions

Generate text from a conversation. Supports streaming, system prompts, and multi-turn conversations.

POST /v1/chat/completions

Request body

Parameter	Type	Description
`model`	string	Model to use. Try `qwen` or `gemma`. Required.
`messages`	array	Array of message objects with `role` and `content`. Required.
`stream`	boolean	Stream partial responses as server-sent events. Default `false`.
`temperature`	number	Sampling temperature, from 0 to 2. Lower values are more deterministic. Default `1`.
`max_tokens`	integer	Maximum number of tokens to generate. If not set, the model decides when to stop.
`top_p`	number	Nucleus sampling threshold, 0-1. Default `1`.
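
The optional parameters above are sent alongside `model` and `messages`. A minimal sketch, assuming a hypothetical helper `build_chat_request` (not part of the OpenAI SDK or this API) that checks the documented ranges before building the request body:

```python
from typing import Any, Optional


def build_chat_request(
    model: str,
    messages: list[dict[str, str]],
    *,
    stream: bool = False,
    temperature: float = 1.0,
    top_p: float = 1.0,
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build a request body within the documented ranges."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    body: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
        "top_p": top_p,
    }
    # max_tokens is left out when unset, so the model decides when to stop.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body


body = build_chat_request(
    "qwen",
    [{"role": "user", "content": "Summarize this in one sentence."}],
    temperature=0.2,
    max_tokens=100,
)
```

With the OpenAI Python client, the resulting dictionary could be passed as `client.chat.completions.create(**body)`.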

Message roles

Each message in the array has a role that determines how it’s treated.

Role	Description
`system`	Sets context for the conversation. Place it first in the `messages` array.
`user`	Messages from the user. The model generates a response to these.
`assistant`	Previous model responses. Include them to continue a multi-turn conversation.
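
For a multi-turn conversation, the usual pattern is to send the earlier turns back with each request. A rough sketch, assuming a hypothetical helper `record_reply` (not part of any SDK) that appends the model's previous answer and the user's follow-up to the history:

```python
Message = dict[str, str]

history: list[Message] = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of the Netherlands?"},
]


def record_reply(history: list[Message], reply: str, next_question: str) -> list[Message]:
    """Hypothetical helper: return a new history with the reply and follow-up appended."""
    return history + [
        {"role": "assistant", "content": reply},
        {"role": "user", "content": next_question},
    ]


history = record_reply(
    history,
    "The capital of the Netherlands is Amsterdam.",
    "How many people live there?",
)
roles = [message["role"] for message in history]
```

The updated `history` list is what would be passed as `messages` in the next request.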

Example request

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.appelon.ai/v1",
    api_key=os.environ["APPELON_API_KEY"]
)

response = client.chat.completions.create(
    model="qwen",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of the Netherlands?"}
    ]
)

print(response.choices[0].message.content)
# → "The capital of the Netherlands is Amsterdam."

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://router.appelon.ai/v1',
  apiKey: process.env.APPELON_API_KEY,
});

const response = await client.chat.completions.create({
  model: 'qwen',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of the Netherlands?' },
  ],
});

console.log(response.choices[0].message.content);
// → "The capital of the Netherlands is Amsterdam."

Response

Returns a completion object with the generated message.

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.6-35B-A3B-FP8",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of the Netherlands is Amsterdam."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 12,
    "total_tokens": 40
  }
}
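
The fields of interest are usually the message text, the `finish_reason`, and the token counts in `usage`. A small sketch that reads them from the example response above as plain JSON; the `length` value for a reply cut off by `max_tokens` is an assumption based on the OpenAI convention, since only `stop` is shown here:

```python
import json

# The example response from above, as raw JSON text.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.6-35B-A3B-FP8",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of the Netherlands is Amsterdam."
    },
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 28, "completion_tokens": 12, "total_tokens": 40}
}
"""

completion = json.loads(raw)
choice = completion["choices"][0]
text = choice["message"]["content"]

# Assumption: "length" marks a reply truncated by max_tokens (OpenAI convention).
truncated = choice["finish_reason"] == "length"

usage = completion["usage"]
tokens_add_up = usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
```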

Streaming

Set `stream` to `true` (`stream=True` in Python) to receive tokens as they’re generated.

stream = client.chat.completions.create(
    model="qwen",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

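for chunk in stream:
    # Some chunks carry no text (content is None), so skip them instead of printing "None".
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)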

const stream = await client.chat.completions.create({
  model: 'qwen',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
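
When the full text is needed after streaming finishes, the pieces can be collected as they arrive. A minimal sketch, assuming a hypothetical helper `collect_stream` and stand-in chunk objects built with `SimpleNamespace` so it runs without a network call; real chunks from the client expose the same `choices[0].delta.content` shape used in the loops above:

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Hypothetical helper: join the text pieces of a streamed response."""
    parts: list[str] = []
    for chunk in chunks:
        # Skip chunks with no choices or with no text in the delta.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


def fake_chunk(content: Any) -> SimpleNamespace:
    """Stand-in for a streamed chunk, for illustration only."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_chunks = [
    fake_chunk(None),
    fake_chunk("Once "),
    fake_chunk("upon a time"),
    SimpleNamespace(choices=[]),
]
story = collect_stream(demo_chunks)
```

In real use, the `stream` object returned by `client.chat.completions.create(..., stream=True)` would be passed in place of `demo_chunks`.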