# OpenClaw Configuration
Configure OpenClaw with staik to use all models directly via Telegram, Discord, or the terminal.
## Configuration

Add the following to your `openclaw.json`:
```json
{
  "models": {
    "providers": {
      "staik": {
        "baseUrl": "https://api.staik.se/v1",
        "apiKey": "sk-st-your-key",
        "api": "openai-completions",
        "models": [
          {
            "id": "gemma4:31b",
            "name": "Gemma 4 31B",
            "contextWindow": 262144,
            "contextTokens": 64000,
            "maxTokens": 8192
          },
          {
            "id": "qwen3.6:35b-a3b",
            "name": "Qwen 3.6 35B A3B",
            "contextWindow": 262144,
            "contextTokens": 64000,
            "maxTokens": 8192
          },
          {
            "id": "qwen3.5:9b",
            "name": "Qwen 3.5 9B",
            "contextWindow": 32768,
            "contextTokens": 28000,
            "maxTokens": 4096
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "staik/gemma4:31b",
        "fallbacks": ["staik/qwen3.6:35b-a3b", "staik/qwen3.5:9b"]
      },
      "compaction": {
        "mode": "safeguard",
        "keepRecentTokens": 20000,
        "reserveTokens": 24000
      }
    }
  }
}
```

## Available models
| Model | Best for | Context window | Vision |
|---|---|---|---|
| `gemma4:31b` | Review, accuracy, language, vision | 262 144 | ✓ |
| `qwen3.6:35b-a3b` | Coding, complex tasks, vision | 262 144 | ✓ |
| `qwen3.5:9b` | Fast responses, simpler tasks | 32 768 | — |
Switch models with `/models` in OpenClaw. The `fallbacks` list enables automatic failover if the primary model is unavailable.
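If a model seems unavailable, you can sanity-check your key and the model IDs outside OpenClaw by calling the gateway directly. A minimal Python sketch, assuming the gateway exposes the standard OpenAI chat-completions route (the base URL and key format come from the config above; the response shape is the standard OpenAI one):

```python
import requests

BASE_URL = "https://api.staik.se/v1"   # from the provider config above
API_KEY = "sk-st-your-key"             # placeholder, as in the config

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "qwen3.5:9b",  # smallest model, cheap smoke test
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 16,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```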
Fine-tune context handling
These parameters control how OpenClaw manages conversation history and model responses.
| Parameter | Where | Description |
|---|---|---|
| `contextTokens` | per model | Max tokens of conversation history per request. Lower it if the model times out. |
| `maxTokens` | per model | Max tokens in the model response. Lower it if responses are cut off or take too long. |
| `keepRecentTokens` | compaction | Tokens of recent messages that are never compressed. Raise it for more context in short conversations. |
| `reserveTokens` | compaction | Tokens reserved for the model response during compaction. Raise it if responses get truncated. |
| `contextWindow` | per model | The model's total context window (262 144 for the large models, 32 768 for qwen3.5:9b). Do not change it. |
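As a rule of thumb implied by the table, the history you send plus the response you allow must fit inside the model's window. A hypothetical sanity check (the function and the inequality are ours, not an OpenClaw API):

```python
def check_model_config(context_window: int, context_tokens: int, max_tokens: int) -> None:
    """Rough sanity check: history plus response must fit in the window."""
    if context_tokens + max_tokens > context_window:
        raise ValueError(
            f"contextTokens ({context_tokens}) + maxTokens ({max_tokens}) "
            f"exceeds contextWindow ({context_window})"
        )

# Values from the qwen3.5:9b entry above: 28000 + 4096 <= 32768 holds.
check_model_config(32768, 28000, 4096)
```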
## Tips
- If the model stops responding mid-conversation, lower `contextTokens`.
- If responses get truncated, raise `reserveTokens`.
- If the model forgets what you just discussed, raise `keepRecentTokens`.
## Config per plan
Every request counts both prompt tokens (conversation history) and completion tokens (model response). The key to staying within budget is `contextTokens`: it controls how much history is sent with each request.
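The per-plan budget lines below follow from simple division: the daily (or hourly) budget over the tokens sent per request. A minimal sketch of that arithmetic:

```python
def rounds_per_period(budget: int, tokens_per_request: int) -> int:
    """Approximate conversation rounds a token budget allows."""
    return budget // tokens_per_request

# Early Adopter: 100K/day at ~12K tokens per request -> ~8 rounds/day
print(rounds_per_period(100_000, 12_000))
```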
### Early Adopter (100 000 tokens / day)
With 100K tokens per day you need to be frugal. Use the fast 9b model and keep context short.
```json
{
  "models": {
    "providers": {
      "staik": {
        "baseUrl": "https://api.staik.se/v1",
        "apiKey": "sk-st-your-key",
        "api": "openai-completions",
        "models": [
          {
            "id": "qwen3.5:9b",
            "name": "Qwen 3.5 9B",
            "contextWindow": 32768,
            "contextTokens": 12000,
            "maxTokens": 2048
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "staik/qwen3.5:9b"
      },
      "compaction": {
        "mode": "safeguard",
        "keepRecentTokens": 6000,
        "reserveTokens": 8000,
        "maxHistoryShare": 0.3
      }
    }
  }
}
```

**Budget:** ~12K tokens/request → ~8 conversation rounds per day
- `qwen3.5:9b` is 3x faster and uses fewer tokens
- `contextTokens: 12000` (instead of 64 000) sends 5x fewer prompt tokens per request
- `maxTokens: 2048` is sufficient for most responses
- `maxHistoryShare: 0.3` compresses history early to save tokens
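On a budget this tight it also helps to estimate a message's size before sending it. A crude heuristic sketch; the 4-characters-per-token ratio is a common rule of thumb for English text, not a measured property of these models:

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

daily_budget = 100_000
used = estimate_tokens("Summarize the attached build log and suggest fixes.")
print(f"~{used} tokens for this prompt; about {daily_budget - used} left today")
```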
### Hobby (250 000 tokens / day)
More room for longer conversations. You can use larger models and more context.
```json
{
  "models": {
    "providers": {
      "staik": {
        "baseUrl": "https://api.staik.se/v1",
        "apiKey": "sk-st-your-key",
        "api": "openai-completions",
        "models": [
          {
            "id": "qwen3.6:35b-a3b",
            "name": "Qwen 3.6 35B",
            "contextWindow": 262144,
            "contextTokens": 24000,
            "maxTokens": 4096
          },
          {
            "id": "qwen3.5:9b",
            "name": "Qwen 3.5 9B",
            "contextWindow": 32768,
            "contextTokens": 16000,
            "maxTokens": 4096
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "staik/qwen3.5:9b",
        "fallbacks": ["staik/qwen3.6:35b-a3b"]
      },
      "compaction": {
        "mode": "safeguard",
        "keepRecentTokens": 10000,
        "reserveTokens": 16000,
        "maxHistoryShare": 0.4
      }
    }
  }
}
```

**Budget:** ~16K tokens/request → ~15 conversation rounds per day
- `qwen3.5:9b` as primary, the 35B model as fallback for complex tasks
- `contextTokens: 24000` on the 35B model gives good context without blowing the budget
### Agent (500 000 tokens / hour, rolling)

The Agent plan uses a rolling window instead of a daily limit, which makes it ideal for intense sessions. Full context and all models.
```json
{
  "models": {
    "providers": {
      "staik": {
        "baseUrl": "https://api.staik.se/v1",
        "apiKey": "sk-st-your-key",
        "api": "openai-completions",
        "models": [
          {
            "id": "gemma4:31b",
            "name": "Gemma 4 31B",
            "contextWindow": 262144,
            "contextTokens": 48000,
            "maxTokens": 8192
          },
          {
            "id": "qwen3.6:35b-a3b",
            "name": "Qwen 3.6 35B",
            "contextWindow": 262144,
            "contextTokens": 48000,
            "maxTokens": 8192
          },
          {
            "id": "qwen3.5:9b",
            "name": "Qwen 3.5 9B",
            "contextWindow": 32768,
            "contextTokens": 28000,
            "maxTokens": 4096
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "staik/gemma4:31b",
        "fallbacks": ["staik/qwen3.6:35b-a3b", "staik/qwen3.5:9b"]
      },
      "compaction": {
        "mode": "safeguard",
        "keepRecentTokens": 20000,
        "reserveTokens": 24000,
        "maxHistoryShare": 0.6
      }
    }
  }
}
```

**Budget:** ~50K tokens/request → ~10 requests per hour (capacity is continuously reclaimed)
- Gemma 4 as primary: best for review and language
- All three models available with automatic fallback
- 48K context is sufficient for large code projects
- Rolling window: tokens used more than an hour ago no longer count
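A sketch of how a rolling window behaves, for intuition only (this illustrates the concept; it is not staik's actual accounting code):

```python
import time
from collections import deque

WINDOW_SECONDS = 3600
HOURLY_LIMIT = 500_000

usage = deque()  # (timestamp, tokens) pairs, oldest first

def record(tokens):
    """Log a request's token cost."""
    usage.append((time.time(), tokens))

def tokens_in_window():
    """Sum usage over the trailing hour; older requests no longer count."""
    cutoff = time.time() - WINDOW_SECONDS
    while usage and usage[0][0] < cutoff:
        usage.popleft()
    return sum(tokens for _, tokens in usage)

record(50_000)
print(HOURLY_LIMIT - tokens_in_window(), "tokens still available this hour")
```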
## Quick reference
| Plan | `contextTokens` | `maxTokens` | Models | Capacity |
|---|---|---|---|---|
| Early Adopter | 12 000 | 2 048 | qwen3.5:9b | ~8 rounds / day |
| Hobby Mini | 8 000 | 2 048 | qwen3.5:9b | ~5 rounds / day |
| Hobby | 24 000 | 4 096 | qwen3.5:9b + 35b | ~15 rounds / day |
| Agent Mini | 24 000 | 4 096 | qwen3.6:35b-a3b | ~4 requests / h |
| Agent | 48 000 | 8 192 | gemma4 + 35b + 9b | ~10 requests / h |
| Agent Pro | 64 000 | 8 192 | gemma4 + 35b + 9b | ~20 requests / h |
## Sub-agents (ACP)
When your agent works on a larger project (e.g. a single-page HTML/CSS/JS site with multiple files), OpenClaw normally splits the work across sub-agents via ACP (Agent Coordination Protocol). Each sub-agent gets an isolated context, which dramatically lowers token usage.
### Why it matters
Without ACP, the main agent has to hold the entire project state in its context between each file generation. With 5 files and 30k tokens of context per request, you burn 150k+ tokens in a single round. With ACP, each sub-agent only gets its file-specific context, which can lower token usage by 70-80%.
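A quick sketch of that arithmetic (the per-file context size for a sub-agent is an assumed figure for illustration):

```python
files = 5
full_context = 30_000   # main agent resends the whole project state each time
file_context = 6_000    # assumed file-specific context for one sub-agent

without_acp = files * full_context   # 150,000 tokens in a single round
with_acp = files * file_context      # 30,000 tokens
savings = 1 - with_acp / without_acp
print(f"{without_acp:,} vs {with_acp:,} tokens (~{savings:.0%} saved)")
```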
### Configuration

Add the following to your `openclaw.json`:
```json
{
  "agents": {
    "list": [{
      "id": "<your_agent_id>",
      "runtime": {
        "type": "acp",
        "acp": {
          "agent": "codex",
          "backend": "acpx",
          "mode": "persistent"
        }
      }
    }]
  },
  "session": {
    "threadBindings": {
      "enabled": true,
      "spawnSubagentSessions": true
    }
  }
}
```

### Verify ACP is working
Run the following directly in your chat (e.g. Telegram) to spawn a sub-agent manually:
```
/acp spawn <agent_id>
```

To see exactly what OpenClaw is doing in real time (which sub-agents are spawned, where errors occur), enable verbose logging directly in your chat:

```
/verbose
```

### Common issues

#### "Cannot start sub-agent (gateway connection closed)"
This is OpenClaw's internal gateway, not the staik API gateway. Check that:
- the `acpx` backend is installed
- `gateway.mode: "local"` is set
- the OpenClaw gateway is running on the port set in `gateway.port`
- no other processes are blocking the port (e.g. previous instances that didn't shut down)
When ACP fails, the main agent falls back to doing everything itself, which is why the entire project context gets sent in every request, and you burn tokens fast.
## Troubleshooting
### OpenClaw produces no response
If OpenClaw does not produce any response at all, try turning off streaming in your provider configuration. Add `"streaming": "off"` to your provider block:
```json
{
  "models": {
    "providers": {
      "staik": {
        "baseUrl": "https://api.staik.se/v1",
        "apiKey": "sk-st-your-key",
        "api": "openai-completions",
        "streaming": "off",
        "models": [...]
      }
    }
  }
}
```

#### Why does this help?
Some channels and configurations have trouble with streaming responses from OpenAI-compatible APIs. Turning off streaming makes OpenClaw send a regular request and wait for the full response before displaying it, which bypasses any streaming-related issues.
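The same distinction exists at the wire level: the OpenAI-compatible request body carries a `stream` flag, so a non-streaming call to the gateway looks roughly like this (a sketch; `"streaming": "off"` above is OpenClaw's own setting, the `stream` field is its wire-level counterpart):

```python
import requests

resp = requests.post(
    "https://api.staik.se/v1/chat/completions",
    headers={"Authorization": "Bearer sk-st-your-key"},
    json={
        "model": "qwen3.5:9b",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,  # wait for the complete response, no SSE chunks
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```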
### Agent asks "Should I run?" between steps
For large multi-step tasks, the agent may break work into batches and ask for confirmation between them. This comes from OpenClaw's own agent prompt or the model's default behavior, not from the staik gateway. OpenClaw lacks a built-in flag to disable this, but you can override it with a system prompt.
#### Option 1: per Telegram group
```json
{
  "channels": {
    "telegram": {
      "groups": {
        "<your_chat_id>": {
          "systemPrompt": "You are an autonomous agent. Execute ALL steps of a task in sequence without asking between them. Stop only on (1) an explicit error requiring a decision, or (2) when the entire task is done. Never use phrases like \"Should I run?\" or \"Do you want me to continue?\" — just execute the next step."
        }
      }
    }
  }
}
```

#### Option 2: workspace files (applies globally)
OpenClaw reads the agent persona from markdown files in the workspace folder (e.g. `~/.openclaw/workspace/`). Create or edit `IDENTITY.md` or `SOUL.md` with the same instruction; it then applies globally, not just per channel.
#### Other causes
- **Sub-agent could not start**: If OpenClaw reports that the gateway connection is closed, its internal sub-agent system has failed (that is OpenClaw's own gateway, not the staik API gateway). The main agent then serializes the work and often breaks it into batches.
- **Task too large for one response**: If the task requires more output than `maxTokens` allows, the agent must split it. Raise `maxTokens` to 8192 if your budget allows.
### 429 Too Many Requests mid-session
You've hit your token limit (daily for Hobby/Early Adopter, hourly rolling for Agent). Large contexts (~30K tokens/request) can exhaust the Agent limit (500K/h) in about 16 requests. Solutions:
- Lower `contextTokens` from 64000 to 16000–32000
- Use `qwen3.5:9b` as primary instead of the 35B model
- Upgrade to a higher-limit plan (see pricing)
Use the `X-RateLimit-Used-Tokens` response header to monitor usage in real time.
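A sketch of reading that header from a direct API call (the header name comes from this guide; whether the gateway attaches it to every response is an assumption):

```python
import requests

resp = requests.post(
    "https://api.staik.se/v1/chat/completions",
    headers={"Authorization": "Bearer sk-st-your-key"},
    json={
        "model": "qwen3.5:9b",
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 8,
    },
    timeout=60,
)
used = resp.headers.get("X-RateLimit-Used-Tokens")
print(f"Tokens used in the current window: {used}")
```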