API Documentation

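staik offers an OpenAI-compatible REST API. To use it with existing OpenAI client code, change the base_url to https://api.staik.se/v1 and authenticate with your staik API key.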

Authentication

Authorization: Bearer sk-st-your-key-here

Base URL

https://api.staik.se/v1

Chat Completions

curl https://api.staik.se/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-st-your-key" \
  -d '{
    "model": "gemma4:31b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
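
The same request can be built in Python with only the standard library. This is a minimal sketch, not an official client: build_chat_request is a hypothetical helper name, and the key is a placeholder. The function only constructs the request, so the headers and body can be inspected before anything is sent.

```python
import json
import urllib.request

BASE_URL = "https://api.staik.se/v1"  # base URL given on this page


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_chat_request(
    "sk-st-your-key",  # placeholder, replace with a real key
    "gemma4:31b",
    [{"role": "user", "content": "Hello!"}],
)
# To send it: urllib.request.urlopen(req), then json.load() the response object.
```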

Audio transcription

Transcribe audio to text with KB-Whisper large, the National Library of Sweden's Swedish-trained Whisper model (50,000 hours of Swedish speech). The endpoint is OpenAI-compatible: point the openai SDK at https://api.staik.se/v1.

curl https://api.staik.se/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-st-your-key" \
  -F file=@moten.mp3 \
  -F model=kb-whisper-large \
  -F language=sv

Python (OpenAI SDK drop-in):

from openai import OpenAI
client = OpenAI(api_key="sk-st-your-key", base_url="https://api.staik.se/v1")

with open("moten.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="kb-whisper-large",  # or "whisper-1"; both route to KB-Whisper large
        file=f,
        language="sv",
    )
print(result.text)

Token consumption

Audio transcription is billed at 100 tokens per second of audio from the same token pool as the chat models. That means 1 minute of audio costs 6,000 tokens and 1 hour costs 360,000 tokens.

| Plan | Token limit | Equivalent audio |
| --- | --- | --- |
| Early Adopter | 100,000 / day | ~16 min / day |
| Hobby Mini | 50,000 / day | ~8 min / day |
| Hobby | 250,000 / day | ~41 min / day |
| Agent Mini | 100,000 / hour | ~16 min / hour |
| Agent | 500,000 / hour | ~83 min / hour |
| Agent Pro | 1,000,000 / hour | ~166 min / hour |
| Pay-as-you-go | purchased tokens | 100K tokens = ~16 min |
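
The audio arithmetic can be reproduced with two small functions. This is a sketch based only on the stated rate of 100 tokens per second. The function names are illustrative, and how partial seconds are rounded is not documented, so the example works in whole seconds.

```python
TOKENS_PER_AUDIO_SECOND = 100  # billing rate stated in this section


def audio_tokens(duration_seconds: int) -> int:
    """Tokens charged for a clip of the given length in whole seconds."""
    return duration_seconds * TOKENS_PER_AUDIO_SECOND


def audio_minutes(token_budget: int) -> int:
    """Whole minutes of audio that fit inside a token budget."""
    return token_budget // (TOKENS_PER_AUDIO_SECOND * 60)


print(audio_tokens(60))        # one minute of audio
print(audio_minutes(250_000))  # Hobby plan daily budget
```

Rounding down to whole minutes gives the same "equivalent audio" figures as the plan rows, for example 16 minutes for a 100,000-token budget.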

Every response includes headers X-Audio-Duration-Seconds and X-Audio-Tokens-Charged so you can track usage per request. Max file size: 25 MB (~30–45 min of audio). Supported formats: mp3, wav, m4a, ogg, flac, webm, mp4. response_format: json (default), text, or verbose_json (with segments).
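
A client can check these limits before uploading and read the usage headers afterwards. The sketch below makes several assumptions: it treats "25 MB" conservatively as 25,000,000 bytes, it guesses the format from the file extension, and it expects the header values to be plain numbers. The helper names upload_problems and audio_usage are hypothetical.

```python
import os

MAX_UPLOAD_BYTES = 25 * 1000 * 1000  # conservative reading of "25 MB" (assumption)
SUPPORTED_EXTENSIONS = {"mp3", "wav", "m4a", "ogg", "flac", "webm", "mp4"}


def upload_problems(filename: str, size_bytes: int) -> list:
    """Return a list of reasons the upload would likely be rejected (empty if none)."""
    problems = []
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems


def audio_usage(headers) -> dict:
    """Read per-request audio usage from a mapping of response headers.

    A plain dict is case-sensitive; HTTP client libraries usually expose a
    case-insensitive mapping, which works here as well.
    """
    return {
        "duration_seconds": float(headers["X-Audio-Duration-Seconds"]),
        "tokens_charged": int(headers["X-Audio-Tokens-Charged"]),
    }
```

For a file on disk, call upload_problems(path, os.path.getsize(path)) before sending the request.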

Models

Choose model via the model field in your request. Each model runs on dedicated GPU hardware in Sweden.

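| Model | Maker | Parameters | Context window | Vision |
| --- | --- | --- | --- | --- |
| qwen3.6:35b-a3b | Alibaba | 35B MoE | 262,144 | no |
| qwen3.5:9b | Alibaba | 9.7B | 262,144 | no |
| gemma4:31b | Google DeepMind | 31B | 98,304 | yes |
| bge-m3:latest | BAAI | 568M | 8,192 | embedding |
| kb-whisper-large | National Library of Sweden | 1.5B | | audio (sv) |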

Chat models support streaming (SSE). Context windows: gemma4:31b 96K, qwen3.6:35b-a3b 262K, qwen3.5:9b 262K. Only gemma4:31b has built-in vision; send images via image_url in messages. bge-m3:latest is an embedding model that returns 1024-dimensional vectors (configure pgvector as vector(1024)) and is called via POST /v1/embeddings. kb-whisper-large is the Swedish-trained Whisper model from the National Library of Sweden (KB), called via POST /v1/audio/transcriptions.
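
The page names image_url but does not show the message shape. The sketch below assumes staik accepts the same content-parts format as the OpenAI Chat Completions API, with the image inlined as a base64 data URL. image_message is a hypothetical helper name, and whether remote image URLs are also accepted is not stated here.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message with text plus one inline image (OpenAI-style content parts)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


# Request body for POST /v1/chat/completions; gemma4:31b is the vision-capable model.
payload = {
    "model": "gemma4:31b",
    "messages": [image_message("What is in this picture?", b"abc")],  # b"abc" stands in for real image bytes
}
```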

The models are open weights that we host on our own hardware in Sweden. Training data and responses are determined by the model maker, not by staik. Choose the model in the model field based on your use case. KB-Whisper is the only model we run that is trained in Sweden.

Context and max_tokens: your prompt plus max_tokens (the room reserved for the reply) must fit within the model's context window. Exceeding it returns a 400 — shorten the prompt or lower max_tokens. This matters most with gemma4:31b (96K, smaller than the others). If you omit max_tokens we apply a sensible per-model default so replies are always bounded — set it explicitly for longer or shorter replies.
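
A client-side check can catch an oversized request before it returns a 400. This sketch uses the context windows from the Models table and assumes a request fits when prompt tokens plus max_tokens is at most the window size. The prompt token count has to come from your own estimate, since this page does not name a tokenizer, and the function names are illustrative.

```python
CONTEXT_WINDOW = {
    "qwen3.6:35b-a3b": 262_144,
    "qwen3.5:9b": 262_144,
    "gemma4:31b": 98_304,
}


def fits_context(model: str, prompt_tokens: int, max_tokens: int) -> bool:
    """True if the prompt plus the reserved reply space fits the model's window."""
    return prompt_tokens + max_tokens <= CONTEXT_WINDOW[model]


def max_reply_tokens(model: str, prompt_tokens: int) -> int:
    """Largest max_tokens that still fits; 0 if the prompt alone fills the window."""
    return max(0, CONTEXT_WINDOW[model] - prompt_tokens)
```

Because token estimates are approximate, leaving some headroom below the value from max_reply_tokens is safer than using it exactly.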

Plans

| Plan | Price | Token limit |
| --- | --- | --- |
| Early Adopter | Free | 100,000 / day |
| Pay-as-you-go | | 100K free + buy more |
| Hobby Mini | 29 SEK/mo | 50,000 / day |
| Hobby | 59 SEK/mo | 250,000 / day |
| Agent Mini | 99 SEK/mo | 100,000 / hour |
| Agent | 149 SEK/mo | 500,000 / hour |
| Agent Pro | 219 SEK/mo | 1,000,000 / hour |

Agent tiers use a rolling 60-minute window instead of a daily cap — capacity is continuously reclaimed. Need more than 1M/h? See Agent Scale on the pricing page.
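
To illustrate what a rolling window means for a client, the sketch below tracks usage locally and lets each charge expire 60 minutes after it was recorded. It is a client-side approximation only. How staik accounts for usage internally is not described here, and the RollingTokenBudget class is hypothetical.

```python
from collections import deque


class RollingTokenBudget:
    """Client-side estimate of tokens left in a rolling time window."""

    def __init__(self, limit: int, window_seconds: int = 3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, tokens), oldest first
        self._used = 0

    def _expire(self, now: float) -> None:
        # Drop charges that are at least one full window old.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def record(self, tokens: int, now: float) -> None:
        """Register tokens spent at time `now` (seconds)."""
        self._expire(now)
        self._events.append((now, tokens))
        self._used += tokens

    def remaining(self, now: float) -> int:
        """Tokens still available at time `now` according to local records."""
        self._expire(now)
        return max(0, self.limit - self._used)
```

In real use, pass time.monotonic() as now, and treat the rate limit headers on each response as the authoritative numbers.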

Error codes

| Code | Description |
| --- | --- |
| 401 | Invalid or missing API key |
| 429 | Daily token limit exceeded or all slots busy |
| 503 | Model temporarily unavailable |
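
One way to handle these codes is to retry 429 and 503 with exponential backoff and to fail immediately on 401, since a bad key will not fix itself. The sketch below is transport-agnostic: send is any zero-argument callable returning (status_code, body). The name call_with_retry and the delay values are assumptions, not staik recommendations.

```python
import time

RETRYABLE = {429, 503}  # busy slots or temporarily unavailable model


def call_with_retry(send, max_attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last (status_code, body) pair. max_attempts must be at least 1.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ... between attempts
```

A 429 caused by an exhausted token limit will not clear after a few seconds, so it can help to check the remaining-token header before retrying.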

Rate limit headers

Every response includes headers showing your current token usage.

| Header | Description |
| --- | --- |
| X-RateLimit-Limit-Tokens | Daily token limit |
| X-RateLimit-Used-Tokens | Tokens used today |
| X-RateLimit-Remaining-Tokens | Tokens remaining |
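
These headers can be turned into numbers for logging or throttling. The sketch assumes the values are plain decimal integers, which this page does not confirm, and token_usage is a hypothetical helper name.

```python
def token_usage(headers) -> dict:
    """Parse the three rate limit headers from a mapping of response headers."""
    limit = int(headers["X-RateLimit-Limit-Tokens"])
    used = int(headers["X-RateLimit-Used-Tokens"])
    remaining = int(headers["X-RateLimit-Remaining-Tokens"])
    return {
        "limit": limit,
        "used": used,
        "remaining": remaining,
        # Share of the limit already consumed; 0.0 if the limit is reported as 0.
        "used_fraction": used / limit if limit else 0.0,
    }
```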
