# Appelon API Documentation

> Base URL: https://router.appelon.ai/v1
> OpenAI-compatible. Use any OpenAI SDK with a custom base_url.

---

# Quickstart

> **OpenAI-compatible API.** If you already have an OpenAI client, change the base URL to `https://router.appelon.ai/v1`, swap in your Appelon API key and a model name, and you're done.

## Create an API key

Sign up at [appelon.ai/signup](/signup) and create an API key in your dashboard. Store it as an environment variable. Never commit it to version control.

```bash
export APPELON_API_KEY="sk-your-api-key"
```

## Make your first request

Call the chat completions endpoint. The example below uses Qwen 3.6 running on GPUs in Groningen.

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["APPELON_API_KEY"],
    base_url="https://router.appelon.ai/v1"
)

response = client.chat.completions.create(
    model="qwen",
    messages=[{"role": "user", "content": "Hallo!"}]
)

print(response.choices[0].message.content)
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.APPELON_API_KEY,
  baseURL: 'https://router.appelon.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'qwen',
  messages: [{ role: 'user', content: 'Hallo!' }],
});

console.log(response.choices[0].message.content);
```

## Read the response

Responses follow the OpenAI chat completion schema. You'll find the model's output in `choices[0].message.content`.

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.6-35B-A3B-FP8",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hallo! Hoe kan ik je helpen?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
```

## Using with AI coding assistants

For full API documentation in a single file: [appelon.ai/llms.txt](/llms.txt)

This works with Claude Code, Cursor, Copilot, and other AI coding tools.

---

# Authentication

## API keys

Your API key authenticates all requests.
Include it in the `Authorization` header as a Bearer token.

```
Authorization: Bearer sk-your-api-key
```

> **Keep your key secret.** Never commit API keys to version control or expose them in client-side code. Use environment variables instead.

## Using with SDKs

Appelon's API is OpenAI-compatible. Use the official OpenAI SDK with a custom base URL.

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://router.appelon.ai/v1"
)
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-your-api-key',
  baseURL: 'https://router.appelon.ai/v1',
});
```

## Environment variables

Store your key in an environment variable for security.

```bash
# Add to .bashrc, .zshrc, or .env
export APPELON_API_KEY="sk-your-api-key"
```

With the environment variable set, initialize the client without hardcoding the key:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["APPELON_API_KEY"],
    base_url="https://router.appelon.ai/v1"
)
```

## Authentication errors

If authentication fails, you'll receive this response:

| Code | Description |
|------|-------------|
| `401` | Missing, invalid, or revoked API key. Check the `Authorization` header. |

---

# Data residency

## Where your data is processed

All inference requests are processed in our Groningen datacenter. The request and response flow is entirely within the Netherlands.

| Component | Location |
|-----------|----------|
| API Gateway | Groningen, Netherlands |
| GPU Compute | Groningen, Netherlands (NVIDIA A40, Blackwell) |
| Model weights | Groningen, Netherlands |
| Usage logs | Groningen, Netherlands |

## What we store

We log usage metadata for billing and debugging. Your prompts and responses are not stored by default.

**Stored:**

- Timestamp, model used, token counts, account ID, latency
- Used for billing and service monitoring

**Not stored:**

- Prompt content, model responses, user messages
- Your data passes through and is not retained

## Compliance

With all processing in the Netherlands, Appelon simplifies compliance with European data protection requirements.

- **GDPR:** No data transfers outside the EU. No need for SCCs or adequacy decisions.
- **Dutch law:** Processing falls under Dutch and EU jurisdiction only.
- **No US CLOUD Act exposure:** Infrastructure is not operated by US hyperscalers.

## For your DPO

Need documentation for your data protection assessment? We can provide:

- Data processing agreement (DPA)
- Technical and organizational measures (TOMs)
- Subprocessor list
- Infrastructure documentation

Contact us at [privacy@appelon.ai](mailto:privacy@appelon.ai) for compliance documentation.

---

# Chat completions

```
POST /v1/chat/completions
```

## Request body

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | string | Model to use. Try `qwen` or `gemma`. **Required** |
| `messages` | array | Array of message objects with `role` and `content`. **Required** |
| `stream` | boolean | Stream partial responses as server-sent events. Default `false`. |
| `temperature` | number | Sampling temperature, 0-2. Lower = more deterministic. Default `1`. |
| `max_tokens` | integer | Maximum tokens to generate. Model decides if not set. |
| `top_p` | number | Nucleus sampling threshold, 0-1. Default `1`. |

## Message roles

Each message in the array has a `role` that determines how it's treated.

| Role | Description |
|------|-------------|
| `system` | Sets context for the conversation. Place it first in the messages array. |
| `user` | Messages from the user. The model generates a response to this. |
| `assistant` | Previous model responses. Use for multi-turn conversations. |

## Example request

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.appelon.ai/v1",
    api_key=os.environ["APPELON_API_KEY"]
)

response = client.chat.completions.create(
    model="qwen",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of the Netherlands?"}
    ]
)

print(response.choices[0].message.content)
# → "The capital of the Netherlands is Amsterdam."
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://router.appelon.ai/v1',
  apiKey: process.env.APPELON_API_KEY,
});

const response = await client.chat.completions.create({
  model: 'qwen',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of the Netherlands?' },
  ],
});

console.log(response.choices[0].message.content);
// → "The capital of the Netherlands is Amsterdam."
```

## Response

Returns a completion object with the generated message.

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.6-35B-A3B-FP8",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of the Netherlands is Amsterdam."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 12,
    "total_tokens": 40
  }
}
```

## Streaming

Set `stream: true` to receive tokens as they're generated.
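
If you call the endpoint without an SDK, you have to parse the event stream yourself. The sketch below assumes Appelon emits the same wire format as OpenAI (`data: {json}` lines, terminated by `data: [DONE]`), which this documentation does not state explicitly. `iter_stream_text` is a hypothetical helper, and the sample events are hand-written rather than captured from the live API.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel (OpenAI convention)
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Some chunks carry no text (for example a role-only first chunk)
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events for illustration only.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Er was"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " eens"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_stream_text(sample)))
```

With the OpenAI SDK this parsing is done for you, and you iterate over the stream object directly: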

```python
stream = client.chat.completions.create(
    model="qwen",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    # delta.content can be None (for example on role-only or final chunks)
    print(chunk.choices[0].delta.content or "", end="")
```

```javascript
const stream = await client.chat.completions.create({
  model: 'qwen',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```

---

# Embeddings

```
POST /v1/embeddings
```

## Request body

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | string | Embedding model. Use `bge-m3`. **Required** |
| `input` | string or array | Text to embed. String or array of strings. **Required** |
| `encoding_format` | string | Output format: `float` (default) or `base64`. |

## Example request

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.appelon.ai/v1",
    api_key=os.environ["APPELON_API_KEY"]
)

response = client.embeddings.create(
    model="bge-m3",
    input="This is a sample text to embed."
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
# → Dimensions: 1024
```

## Response

Returns an array of embedding objects, one for each input text.

```json
{
  "object": "list",
  "model": "bge-m3",
  "data": [{
    "object": "embedding",
    "index": 0,
    "embedding": [-0.023, 0.017, 0.042, ...]
  }],
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

## Batch embeddings

Embed multiple texts in a single request for better efficiency.

```python
response = client.embeddings.create(
    model="bge-m3",
    input=[
        "First document to embed",
        "Second document to embed",
        "Third document to embed"
    ]
)

# Returns 3 embeddings in response.data
```

## Common use cases

- **Semantic search:** Embed documents and queries, then find similar documents using cosine similarity.
- **RAG:** Retrieve relevant context before generating responses with chat models.
- **Clustering & classification:** Group similar documents or classify content based on embedding similarity.

> **BGE-M3** produces 1024-dimensional vectors optimized for multilingual retrieval. It supports 100+ languages including Dutch, English, German, and French.

---

# Image generation

```
POST /v1/images/generations
```

## Request body

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | string | Image model. Use `schnell` (fast) or `dev` (quality). **Required** |
| `prompt` | string | Text description of the image to generate. **Required** |
| `size` | string | Image dimensions. Default `1024x1024`. |
| `n` | integer | Number of images to generate. Default `1`. |
| `response_format` | string | `url` (default) or `b64_json` for base64-encoded image data. |

## Example request

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.appelon.ai/v1",
    api_key=os.environ["APPELON_API_KEY"]
)

response = client.images.generate(
    model="schnell",
    prompt="A serene Dutch landscape with windmills at sunset",
    size="1024x1024"
)

image_url = response.data[0].url
print(image_url)
```

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://router.appelon.ai/v1',
  apiKey: process.env.APPELON_API_KEY,
});

const response = await client.images.generate({
  model: 'schnell',
  prompt: 'A serene Dutch landscape with windmills at sunset',
  size: '1024x1024',
});

console.log(response.data[0].url);
```

## Response

Returns an array of image objects with URLs or base64 data.

```json
{
  "created": 1234567890,
  "data": [{
    "url": "https://..."
  }]
}
```

With `response_format: "b64_json"`:

```json
{
  "created": 1234567890,
  "data": [{
    "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
  }]
}
```

## Available models

| Model | Speed | Best for |
|-------|-------|----------|
| `schnell` | ~2s | Rapid iteration, prototyping |
| `dev` | ~8s | Higher quality, better prompt adherence |

## Supported sizes

FLUX supports flexible aspect ratios. Common sizes:

| Size | Aspect ratio |
|------|--------------|
| `1024x1024` | 1:1 (square) |
| `1024x768` | 4:3 (landscape) |
| `768x1024` | 3:4 (portrait) |
| `1280x720` | 16:9 (widescreen) |
| `720x1280` | 9:16 (mobile) |

---

# Speech to text

> **Speaker diarization:** This endpoint transcribes audio and identifies who said what. For simple transcription without speaker labels, this is still the endpoint to use.

```
POST /v1/audio/diarize
```

## Request body

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | string | Transcription model. Use `whisperx`. **Required** |
| `file` | file | Audio file to transcribe. MP3, WAV, FLAC supported. **Required** |
| `language` | string | Language code (e.g., `nl`, `en`). Auto-detected if not specified. |

## Example request

The diarization endpoint is not part of the OpenAI SDK, so use a direct HTTP request.

```python
import os
import requests

# Open the audio file in a context manager so the handle is closed afterwards
with open("interview.mp3", "rb") as audio:
    response = requests.post(
        "https://router.appelon.ai/v1/audio/diarize",
        headers={"Authorization": f"Bearer {os.environ['APPELON_API_KEY']}"},
        files={"file": audio},
        data={"model": "whisperx", "language": "nl"}
    )

response.raise_for_status()  # fail early on 401 and other HTTP errors
result = response.json()
print(result["text"])
```

```bash
curl -X POST "https://router.appelon.ai/v1/audio/diarize" \
  -H "Authorization: Bearer $APPELON_API_KEY" \
  -F "file=@interview.mp3" \
  -F "model=whisperx" \
  -F "language=nl"
```

## Response with speaker labels

WhisperX provides speaker diarization: it identifies different speakers in the audio.

```json
{
  "text": "Welkom bij dit interview. Dank je wel voor de uitnodiging.",
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "Welkom bij dit interview.",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 2.8,
      "end": 5.1,
      "text": "Dank je wel voor de uitnodiging.",
      "speaker": "SPEAKER_01"
    }
  ]
}
```

## Supported formats

- MP3
- WAV
- FLAC
- M4A
- OGG

## Language support

WhisperX supports 90+ languages including Dutch, English, German, French, Spanish, and more. Language is auto-detected, but specifying it improves accuracy.

---

# Models

```
GET /v1/models
```

Lists all available models. Returns model IDs and their capabilities.

## Chat models

Text generation and conversation. Use with `/v1/chat/completions`.

### qwen

Qwen 3.6 35B (MoE, 3B active). Fast interactive model for conversations, analysis, and text generation.

- ~85 tok/s
- 128K context
- A40 GPU

**Recommended** for most use cases.

### gemma

Gemma 4 31B (Dense). Deep analysis model with strong reasoning. Best for complex tasks.

- ~85 tok/s
- 128K context
- Blackwell GPU

## Embedding models

Convert text to vectors for search, similarity, and RAG. Use with `/v1/embeddings`.

### bge-m3

BGE-M3 multilingual embeddings. State-of-the-art for retrieval, supports 100+ languages including Dutch.

- 1024 dimensions
- 8K tokens max

## Image generation

Generate images from text. Use with `/v1/images/generations`.

### flux-schnell

FLUX Schnell. Fast generation (~2s per image) for rapid iteration and prototyping.

- 1024×1024
- ~2s

### flux-dev

FLUX Dev. Higher quality output with more detail and better prompt adherence.

- 1024×1024
- ~8s

## Speech to text

Transcription with speaker diarization. Use with `/v1/audio/diarize`.

### whisperx

WhisperX with speaker diarization. Transcribes audio and identifies who said what. Supports 90+ languages including Dutch.

- MP3, WAV, FLAC
- Speaker labels

## Model aliases

Use short aliases or full model names interchangeably.

| Alias | Model |
|-------|-------|
| `qwen` | Qwen/Qwen3.6-35B-A3B-FP8 |
| `qwen-fast` | Qwen/Qwen3.6-35B-A3B-FP8 |
| `gemma` | RedHatAI/gemma-4-31B-it-FP8-block |
| `gemma-4` | RedHatAI/gemma-4-31B-it-FP8-block |
| `flux-schnell` | schnell |
| `flux-dev` | dev |
| `diarize` | whisperx |

> **Need a different model?** We can deploy additional models on request. Contact us at [support@appelon.ai](mailto:support@appelon.ai)

---

# Migrating from OpenAI

> Appelon's API is OpenAI-compatible. Your existing SDKs and tools keep working: change the base URL and API key, then update model names as mapped below.

## The one-line change

Add `base_url` to your OpenAI client initialization and swap in your Appelon API key.

### Python

```python
# Before (OpenAI)
client = OpenAI(
    api_key="sk-openai-key"
)

# After (Appelon)
client = OpenAI(
    api_key="sk-appelon-key",
    base_url="https://router.appelon.ai/v1"  # ← add this
)
```

### Node.js

```javascript
// Before (OpenAI)
const client = new OpenAI({
  apiKey: 'sk-openai-key',
});

// After (Appelon)
const client = new OpenAI({
  apiKey: 'sk-appelon-key',
  baseURL: 'https://router.appelon.ai/v1', // ← add this
});
```

## Model mapping

Update model names to use Appelon's models.

| OpenAI model | Appelon model | Use for |
|--------------|---------------|---------|
| gpt-4o | `qwen` | General chat, fast |
| gpt-4-turbo | `gemma` | Complex reasoning |
| text-embedding-3-small | `bge-m3` | Embeddings, search |
| dall-e-3 | `flux-schnell` | Image generation |
| whisper-1 | `whisperx` | Speech to text |

## Environment variables

Set environment variables to avoid changing your client initialization code.

```bash
# Add to .env or shell profile
export OPENAI_API_KEY="sk-appelon-key"
export OPENAI_BASE_URL="https://router.appelon.ai/v1"
```

With these variables set, code that uses `OpenAI()` without arguments will automatically use Appelon.
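
Before relying on a zero-argument `OpenAI()`, it can help to confirm that both variables are set and actually point at Appelon. The sketch below uses only the standard library; `points_at_appelon` and `APPELON_HOST` are hypothetical names introduced for illustration and are not part of any SDK.

```python
import os
from typing import Mapping
from urllib.parse import urlparse

# Host taken from the base URL documented above.
APPELON_HOST = "router.appelon.ai"


def points_at_appelon(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if the OpenAI SDK environment variables target Appelon."""
    base_url = env.get("OPENAI_BASE_URL", "")
    api_key = env.get("OPENAI_API_KEY", "")
    # hostname is None for an empty or malformed URL, so this is False then
    return urlparse(base_url).hostname == APPELON_HOST and bool(api_key)


configured = {
    "OPENAI_API_KEY": "sk-appelon-key",
    "OPENAI_BASE_URL": "https://router.appelon.ai/v1",
}
print(points_at_appelon(configured))                          # True
print(points_at_appelon({"OPENAI_API_KEY": "sk-openai-key"}))  # False
```

Calling `points_at_appelon()` with no argument checks the current process environment, which is useful as a startup guard in scripts that would otherwise silently send requests to the default OpenAI endpoint.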

## What works

- **Chat completions:** streaming, system prompts, multi-turn
- **Embeddings:** single and batch
- **Image generation:** FLUX models via `/v1/images/generations`
- **Transcription:** WhisperX with speaker diarization via `/v1/audio/diarize` (direct HTTP request; this endpoint is not part of the OpenAI SDK)
- **Models endpoint:** `/v1/models` lists available models

## Not yet supported

These OpenAI features are not available yet:

- Function calling / tool use
- Vision (image input)
- Assistants API
- Fine-tuning

## Using with LangChain

LangChain's OpenAI integration works out of the box.

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="qwen",
    openai_api_key="sk-appelon-key",
    openai_api_base="https://router.appelon.ai/v1"
)
```

> **Need help migrating?** Contact us at [support@appelon.ai](mailto:support@appelon.ai) and we'll help you switch.

---