API Documentation

staik offers an OpenAI-compatible REST API. To use an existing OpenAI client, change its base_url to the staik base URL listed below.

Authentication

Send your API key as a bearer token in the Authorization header:

Authorization: Bearer sk-st-your-key-here

Base URL

https://api.staik.se/v1

Chat Completions

curl https://api.staik.se/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-st-your-key" \
  -d '{
    "model": "gemma4:31b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
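
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_chat_request, first_reply, and chat are hypothetical, and first_reply assumes the response body follows the OpenAI chat completion shape (choices[0].message.content), which this page implies but does not show.

```python
import json
import urllib.request

BASE_URL = "https://api.staik.se/v1"  # base URL from this page


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST request for /chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def first_reply(response_json):
    """Return the assistant text, ASSUMING an OpenAI-style response body."""
    return response_json["choices"][0]["message"]["content"]


def chat(api_key, model, messages, timeout=60):
    """Send the request and return the assistant text (performs network I/O)."""
    request = build_chat_request(api_key, model, messages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return first_reply(json.load(response))
```

Calling chat("sk-st-your-key", "gemma4:31b", [{"role": "user", "content": "Hello!"}]) would then send the same payload as the curl command above.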

Audio transcription

Transcribe audio to text with KB-Whisper large, the Swedish-trained Whisper model from the National Library of Sweden (KB), trained on 50,000 hours of Swedish speech. The endpoint is OpenAI-compatible: point the openai SDK at https://api.staik.se/v1.

curl https://api.staik.se/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-st-your-key" \
  -F file=@moten.mp3 \
  -F model=kb-whisper-large \
  -F language=sv

Python (OpenAI SDK drop-in):

from openai import OpenAI
client = OpenAI(api_key="sk-st-your-key", base_url="https://api.staik.se/v1")

with open("moten.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="kb-whisper-large",  # or "whisper-1"; both route to KB-Whisper large
        file=f,
        language="sv",
    )
print(result.text)

Token consumption

Audio transcription is billed at 100 tokens per second of audio from the same token pool as the chat models. That means 1 minute of audio costs 6,000 tokens and 1 hour costs 360,000 tokens.
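
The arithmetic above can be checked with a short sketch. The function names are hypothetical, and how the service rounds fractional seconds is not stated on this page, so the rounding here is an assumption.

```python
TOKENS_PER_AUDIO_SECOND = 100  # billing rate stated above


def audio_tokens(duration_seconds):
    """Estimate tokens charged for a clip (rounding is an assumption)."""
    return round(duration_seconds * TOKENS_PER_AUDIO_SECOND)


def audio_minutes(token_budget):
    """Minutes of audio that a token budget covers at the stated rate."""
    return token_budget / (TOKENS_PER_AUDIO_SECOND * 60)


print(audio_tokens(60))         # one minute of audio
print(audio_tokens(3600))       # one hour of audio
print(audio_minutes(100_000))   # budget of 100,000 tokens
```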

Plan             Token limit          Equivalent audio
Early Adopter    100,000 / day        ~16 min / day
Hobby Mini       50,000 / day         ~8 min / day
Hobby            250,000 / day        ~41 min / day
Agent Mini       100,000 / hour       ~16 min / hour
Agent            500,000 / hour       ~83 min / hour
Agent Pro        1,000,000 / hour     ~166 min / hour
Pay-as-you-go    purchased tokens     100K tokens = ~16 min

Every transcription response includes the headers X-Audio-Duration-Seconds and X-Audio-Tokens-Charged, so you can track usage per request.

Max file size: 25 MB (~30–45 min of audio)
Supported formats: mp3, wav, m4a, ogg, flac, webm, mp4
response_format: json (default), text, or verbose_json (with segments)
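
A minimal sketch of reading the two usage headers from any header mapping. The helper name audio_usage is hypothetical, and the sketch assumes both header values are plain numeric strings, which this page does not specify.

```python
def audio_usage(headers):
    """Return (duration_seconds, tokens_charged) from response headers.

    Header names are matched case-insensitively. Values are ASSUMED to be
    numeric strings; a missing header yields None for that field.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    duration = lowered.get("x-audio-duration-seconds")
    tokens = lowered.get("x-audio-tokens-charged")
    return (
        float(duration) if duration is not None else None,
        int(float(tokens)) if tokens is not None else None,
    )


# Example with hand-written header values (not real API output):
print(audio_usage({"X-Audio-Duration-Seconds": "42.5",
                   "X-Audio-Tokens-Charged": "4250"}))
```

With the openai Python SDK, response headers are available through client.audio.transcriptions.with_raw_response.create(...), whose result exposes .headers and .parse().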

Models

Choose a model with the model field in your request. Each model runs on dedicated GPU hardware in Sweden.

Model              Parameters   Context window   Vision
qwen3.6:35b-a3b    35B MoE      262,144          yes
qwen3.5:9b         9.7B         32,768           no
gemma4:31b         31B          262,144          yes
bge-m3:latest      568M         8,192            embedding
kb-whisper-large   1.5B         -                audio (sv)

Chat models use Q4_K_M quantization, support streaming (SSE), and have a 262K-token context window (32K for qwen3.5:9b). qwen3.6:35b-a3b and gemma4:31b have built-in vision: send images via image_url in messages. bge-m3:latest is an embedding model, called via POST /v1/embeddings, that returns 1024-dimensional vectors (declare the pgvector column as vector(1024)). kb-whisper-large is KB's Swedish-trained Whisper model, called via POST /v1/audio/transcriptions.
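
Two of the points above can be sketched as small helpers. Both helper names are hypothetical. The image message follows the OpenAI content-parts format, and passing the image as a base64 data URL is an assumption: this page only says that images go in image_url. The dimension check mirrors the vector(1024) column mentioned above.

```python
import base64

EMBEDDING_DIM = 1024  # bge-m3:latest vector size stated above


def image_message(prompt, image_bytes, mime_type="image/png"):
    """Build a user message with text plus one image (OpenAI-style parts).

    ASSUMPTION: the API accepts base64 data URLs in image_url.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }


def check_embedding(vector):
    """Reject vectors that would not fit a pgvector vector(1024) column."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vector
```

The message returned by image_message can be placed in the messages list of a chat completion request to qwen3.6:35b-a3b or gemma4:31b.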

Plans

Plan             Price                  Token limit
Early Adopter    Free                   100,000 / day
Pay-as-you-go    100K free + buy more   purchased tokens
Hobby Mini       29 SEK/mo              50,000 / day
Hobby            59 SEK/mo              250,000 / day
Agent Mini       99 SEK/mo              100,000 / hour
Agent            299 SEK/mo             500,000 / hour
Agent Pro        499 SEK/mo             1,000,000 / hour

Agent tiers use a rolling 60-minute window instead of a daily cap, so capacity is reclaimed continuously as earlier usage ages out of the window. Need more than 1M tokens per hour? See Agent Scale on the pricing page.
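
To illustrate how a rolling window differs from a fixed daily cap, here is a client-side sketch that tracks usage over the last 60 minutes. The class RollingTokenWindow is hypothetical, and the server's actual accounting may differ in detail; this only models the general idea that tokens stop counting once they are an hour old.

```python
from collections import deque


class RollingTokenWindow:
    """Client-side estimate of tokens used within a rolling time window."""

    def __init__(self, limit, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, tokens), oldest first
        self._used = 0

    def _expire(self, now):
        # Drop usage that has aged out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def record(self, tokens, now):
        """Register tokens spent at time `now` (non-decreasing seconds)."""
        self._expire(now)
        self._events.append((now, tokens))
        self._used += tokens

    def remaining(self, now):
        """Tokens still available at time `now` under this estimate."""
        self._expire(now)
        return max(self.limit - self._used, 0)
```

In real use, `now` would come from a clock such as time.monotonic(); passing it explicitly keeps the sketch easy to test.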

Error codes

Code   Description
401    Invalid or missing API key
429    Daily token limit exceeded or all slots busy
503    Model temporarily unavailable
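
One way a client might react to these codes is to retry 429 and 503 with exponential backoff and give up immediately on 401. This is a sketch under assumptions: call_with_retry and its parameters are hypothetical, and `send` stands for any callable that performs the request and returns a (status, body) pair.

```python
import time

RETRYABLE_STATUSES = {429, 503}  # busy slots / model temporarily unavailable


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    returns the last (status, body) pair if every attempt is retryable.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body
```

A 429 caused by an exhausted token limit will not clear after a few seconds, so a short backoff like this mainly helps with the "all slots busy" and 503 cases.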

Rate limit headers

Every response includes headers showing your current token usage.

Header                         Description
X-RateLimit-Limit-Tokens       Daily token limit
X-RateLimit-Used-Tokens        Tokens used today
X-RateLimit-Remaining-Tokens   Tokens remaining
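
These headers can be used to decide whether a planned request still fits the budget. The sketch below is illustrative: token_usage and can_afford are hypothetical names, and the header values are assumed to be integer strings.

```python
def token_usage(headers):
    """Extract limit, used, and remaining tokens from response headers.

    Header names are matched case-insensitively; values are ASSUMED to be
    integer strings. Missing headers yield None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    def read(name):
        value = lowered.get(name)
        return int(value) if value is not None else None

    return {
        "limit": read("x-ratelimit-limit-tokens"),
        "used": read("x-ratelimit-used-tokens"),
        "remaining": read("x-ratelimit-remaining-tokens"),
    }


def can_afford(headers, needed_tokens):
    """True if the last response reported enough remaining tokens."""
    remaining = token_usage(headers)["remaining"]
    return remaining is not None and remaining >= needed_tokens
```

For example, a one-minute transcription needs about 6,000 tokens, so can_afford(last_headers, 6000) gives a quick pre-check before uploading a file.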
