Is staik compatible with the OpenAI SDK?

Yes. staik offers an OpenAI-compatible REST API, so existing OpenAI SDK code works once you set base_url to https://api.staik.se/v1 and use your staik API key.

What models are available?

staik offers qwen3.6:35b-a3b (262K context), qwen3.5:9b (262K context), gemma4:31b (96K context, vision), and bge-m3:latest (embedding via /v1/embeddings). All run on dedicated GPU hardware in Sweden. For vision, use gemma4:31b.

Where is my data stored?

All data is processed and stored on servers in Sweden. No prompts or responses leave the country. The service is fully GDPR and Schrems II compliant.


API Documentation

staik offers an OpenAI-compatible REST API. Just change the base_url.

Authentication

Authorization: Bearer sk-st-your-key-here

Base URL

https://api.staik.se/v1

Chat Completions

curl https://api.staik.se/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-st-your-key" \
  -d '{
    "model": "gemma4:31b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
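
The same request can be sketched in Python with the OpenAI SDK. This is a minimal sketch, not an official client: the helper names (make_client, ask, ask_streaming) are hypothetical, and it assumes the openai package (v1 or later) is installed. The streaming variant relies on the SSE support that the Models section describes for chat models.

```python
from typing import Any


def make_client(api_key: str) -> Any:
    """Create an OpenAI SDK client that talks to staik instead of OpenAI."""
    # Imported lazily so the helpers below can be exercised without the SDK.
    from openai import OpenAI

    return OpenAI(api_key=api_key, base_url="https://api.staik.se/v1")


def ask(client: Any, prompt: str, model: str = "gemma4:31b") -> str:
    """Send one user message and return the assistant's reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


def ask_streaming(client: Any, prompt: str, model: str = "gemma4:31b") -> str:
    """Request a streamed reply and return the concatenated text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        delta = chunk.choices[0].delta.content
        if delta:  # skip empty or None deltas
            parts.append(delta)
    return "".join(parts)
```

Calling ask(make_client("sk-st-your-key"), "Hello!") should send the same payload as the curl example above.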

Audio transcription

Transcribe audio to text with KB-Whisper large, the National Library of Sweden's Whisper model trained on 50,000 hours of Swedish speech. The endpoint is OpenAI-compatible: point the OpenAI SDK at https://api.staik.se/v1.

curl https://api.staik.se/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-st-your-key" \
  -F file=@moten.mp3 \
  -F model=kb-whisper-large \
  -F language=sv

Python (OpenAI SDK drop-in):

from openai import OpenAI
client = OpenAI(api_key="sk-st-your-key", base_url="https://api.staik.se/v1")

with open("moten.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="kb-whisper-large",  # or "whisper-1"; both route to KB-Whisper large
        file=f,
        language="sv",
    )
print(result.text)

Token consumption

Audio transcription is billed at 100 tokens per second of audio from the same token pool as the chat models. That means 1 minute of audio costs 6,000 tokens and 1 hour costs 360,000 tokens.

Plan	Token limit	Equivalent audio
Early Adopter	100,000 / day	~16 min / day
Hobby Mini	50,000 / day	~8 min / day
Hobby	250,000 / day	~41 min / day
Agent Mini	100,000 / hour	~16 min / hour
Agent	500,000 / hour	~83 min / hour
Agent Pro	1,000,000 / hour	~166 min / hour
Pay-as-you-go	purchased tokens	100K tokens = ~16 min
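
The arithmetic behind the table can be reproduced with a short sketch. The rate of 100 tokens per second comes from the text above; the function names are hypothetical, and rounding to the nearest whole token is an assumption, since the exact rounding staik applies is not stated.

```python
AUDIO_TOKENS_PER_SECOND = 100  # billing rate stated in the documentation


def audio_tokens(duration_seconds: float) -> int:
    """Estimate tokens charged for a clip (rounding is an assumption)."""
    return round(duration_seconds * AUDIO_TOKENS_PER_SECOND)


def audio_minutes(token_budget: int) -> float:
    """Minutes of audio a token budget covers if spent only on audio."""
    return token_budget / AUDIO_TOKENS_PER_SECOND / 60


print(audio_tokens(60))               # one minute of audio
print(int(audio_minutes(100_000)))    # whole minutes in a 100,000-token budget
```

The table truncates to whole minutes, which is why 100,000 tokens (about 16.7 minutes) is listed as ~16 min.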

Every transcription response includes the headers X-Audio-Duration-Seconds and X-Audio-Tokens-Charged so you can track usage per request. The maximum file size is 25 MB (roughly 30–45 minutes of audio). Supported formats are mp3, wav, m4a, ogg, flac, webm, and mp4. The response_format parameter accepts json (default), text, or verbose_json (which includes segments).
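
A client might pre-check uploads and read the usage headers along these lines. This is a sketch under assumptions: the helper names are hypothetical, the header values are assumed to be plain numbers, and 25 MB is interpreted conservatively as 25,000,000 bytes because the documentation does not say which definition of MB applies. How you obtain the response headers depends on your HTTP client.

```python
from pathlib import PurePath
from typing import Mapping, NamedTuple

SUPPORTED_FORMATS = {"mp3", "wav", "m4a", "ogg", "flac", "webm", "mp4"}
MAX_UPLOAD_BYTES = 25_000_000  # assumption: 25 MB read as decimal megabytes


class AudioUsage(NamedTuple):
    duration_seconds: float
    tokens_charged: int


def can_upload(filename: str, size_bytes: int) -> bool:
    """Check the extension and size against the documented limits."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_FORMATS and size_bytes <= MAX_UPLOAD_BYTES


def parse_audio_usage(headers: Mapping[str, str]) -> AudioUsage:
    """Read the per-request usage headers (header names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return AudioUsage(
        duration_seconds=float(lowered["x-audio-duration-seconds"]),
        # Parsed via float first in case the value is sent as "1250.0".
        tokens_charged=int(float(lowered["x-audio-tokens-charged"])),
    )
```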

Models

Choose model via the model field in your request. Each model runs on dedicated GPU hardware in Sweden.

Model	Maker	Parameters	Context window	Vision
qwen3.6:35b-a3b	Alibaba	35B MoE	262 144	—
qwen3.5:9b	Alibaba	9.7B	262 144	—
gemma4:31b	Google DeepMind	31B	98 304	✓
bge-m3:latest	BAAI	568M	8 192	embedding
kb-whisper-large	National Library of Sweden	1.5B	—	audio (sv)

Chat models support streaming (SSE). Context windows: gemma4:31b 96K, qwen3.6:35b-a3b 262K, qwen3.5:9b 262K. Only gemma4:31b has built-in vision; send images via image_url in messages. bge-m3:latest is an embedding model that returns 1024-dimensional vectors (configure pgvector columns as vector(1024)) and is called via POST /v1/embeddings. kb-whisper-large is KB's Swedish-trained Whisper model, called via POST /v1/audio/transcriptions.
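
The vision and embedding calls can be sketched as two small helpers. The message layout follows the OpenAI chat format for image_url parts; sending the image as a base64 data URL is an assumption (the documentation only says to use image_url), and the helper names are hypothetical. The client argument is an OpenAI SDK client configured with staik's base URL.

```python
import base64
from typing import Any

EMBEDDING_DIMENSIONS = 1024  # vector size stated for bge-m3:latest


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user message with text plus an inline image for gemma4:31b."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Assumption: the endpoint accepts data URLs as well as http(s) URLs.
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }


def embed(client: Any, texts: list[str]) -> list[list[float]]:
    """Embed texts with bge-m3:latest and verify the vector size."""
    response = client.embeddings.create(model="bge-m3:latest", input=texts)
    vectors = [item.embedding for item in response.data]
    for vector in vectors:
        if len(vector) != EMBEDDING_DIMENSIONS:
            raise ValueError(f"expected {EMBEDDING_DIMENSIONS} dimensions, got {len(vector)}")
    return vectors
```

The dimension check is there to catch a mismatch before vectors are written to a pgvector column declared as vector(1024).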

All models are open-weight models that we host on our own hardware in Sweden. Training data and responses are determined by the model maker, not by staik. Choose the model in the model field based on your use case. KB-Whisper is the only model we run that is trained in Sweden.

Context and max_tokens: your prompt plus max_tokens (the room reserved for the reply) must fit within the model's context window. Exceeding it returns a 400 — shorten the prompt or lower max_tokens. This matters most with gemma4:31b (96K, smaller than the others). If you omit max_tokens we apply a sensible per-model default so replies are always bounded — set it explicitly for longer or shorter replies.
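
A client-side pre-check for this rule might look like the sketch below. The context window sizes come from the Models table; the function names are hypothetical, the prompt token count is assumed to be supplied by the caller (for example from a tokenizer estimate), and treating an exact fit as allowed is an assumption.

```python
CONTEXT_WINDOWS = {
    "qwen3.6:35b-a3b": 262_144,
    "qwen3.5:9b": 262_144,
    "gemma4:31b": 98_304,
}


def fits_context(model: str, prompt_tokens: int, max_tokens: int) -> bool:
    """True if prompt plus reserved reply space fits the model's window."""
    return prompt_tokens + max_tokens <= CONTEXT_WINDOWS[model]


def max_reply_tokens(model: str, prompt_tokens: int) -> int:
    """Largest max_tokens value that still fits, or 0 if the prompt is too long."""
    return max(CONTEXT_WINDOWS[model] - prompt_tokens, 0)


# A 90,000-token prompt leaves limited room on gemma4:31b.
print(max_reply_tokens("gemma4:31b", 90_000))
```

Running a check like this before sending avoids a round trip that would otherwise end in a 400.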

Plans

Plan	Price	Token limit
Early Adopter	Free	100,000 / day
Pay-as-you-go	100K free + buy more	—
Hobby Mini	29 SEK/mo	50,000 / day
Hobby	59 SEK/mo	250,000 / day
Agent Mini	99 SEK/mo	100,000 / hour
Agent	149 SEK/mo	500,000 / hour
Agent Pro	219 SEK/mo	1,000,000 / hour

Agent tiers use a rolling 60-minute window instead of a daily cap — capacity is continuously reclaimed. Need more than 1M/h? See Agent Scale on the pricing page.
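
To illustrate how a rolling 60-minute window differs from a fixed cap, here is a client-side sketch that tracks usage locally. It is not staik's server-side accounting, which is not documented here; the class name and its methods are hypothetical, and it assumes usage stops counting exactly 3,600 seconds after it was recorded.

```python
import time
from collections import deque
from typing import Optional


class RollingTokenWindow:
    """Track tokens spent during the last window_seconds seconds."""

    def __init__(self, limit: int, window_seconds: float = 3600.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._events: deque = deque()  # (timestamp, tokens), oldest first
        self._used = 0

    def _expire(self, now: float) -> None:
        # Drop usage that has aged out of the window, reclaiming its capacity.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def record(self, tokens: int, now: Optional[float] = None) -> None:
        """Register tokens spent at time now (defaults to the monotonic clock)."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        self._events.append((now, tokens))
        self._used += tokens

    def remaining(self, now: Optional[float] = None) -> int:
        """Tokens still available in the current window."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        return max(self.limit - self._used, 0)
```

With an Agent Mini limit of 100,000 tokens, spending 60,000 tokens at minute 0 and 30,000 at minute 30 leaves 10,000 available until minute 60, when the first 60,000 are reclaimed.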

Error codes

Code	Description
400	Prompt plus max_tokens exceeds the model's context window
401	Invalid or missing API key
429	Token limit for your plan exceeded, or all slots busy
503	Model temporarily unavailable
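
One way to handle the transient codes is a retry loop with exponential backoff, sketched below. The function name, the attempt count, and the delays are illustrative choices rather than staik recommendations, and the send callable is a hypothetical stand-in for whatever performs the request and returns a status code and body.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 503}  # busy slots, exhausted limit, or model unavailable


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        is_last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last_attempt:
            return status, body
        # Wait 1x, 2x, 4x, ... the base delay before trying again.
        sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")  # the loop always returns
```

Retrying helps when a 429 means all slots are busy. If the token limit itself is exhausted, the request will keep failing until capacity is reclaimed, so a caller may want to check the rate limit headers before retrying.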

Rate limit headers

Every response includes headers showing your current token usage.

Header	Description
X-RateLimit-Limit-Tokens	Daily token limit
X-RateLimit-Used-Tokens	Tokens used today
X-RateLimit-Remaining-Tokens	Tokens remaining
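
These headers can be read into a small structure before deciding whether to send the next request. This is a sketch with hypothetical helper names, and it assumes the header values are plain integers.

```python
from typing import Mapping


def token_usage(headers: Mapping[str, str]) -> dict:
    """Extract limit, used, and remaining token counts from response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "limit": int(lowered["x-ratelimit-limit-tokens"]),
        "used": int(lowered["x-ratelimit-used-tokens"]),
        "remaining": int(lowered["x-ratelimit-remaining-tokens"]),
    }


def has_budget_for(headers: Mapping[str, str], needed_tokens: int) -> bool:
    """True if the remaining tokens cover an estimated request cost."""
    return token_usage(headers)["remaining"] >= needed_tokens
```

For example, a one-minute transcription needs 6,000 tokens, so has_budget_for(headers, 6000) indicates whether the last response left enough room for it.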
