# API reference (/docs/api-reference)

tokenroute exposes two surfaces:

## OpenAI-compatible (`/v1/*`) [#openai-compatible-v1]

Drop-in replacement for `api.openai.com/v1`. Same request/response shapes, just point your SDK at `https://api.tokenroute.io/v1` and use an `sk-tr-*` key. See [Quickstart](/docs/quickstart) for SDK snippets.

| Endpoint                    | Status                     |
| --------------------------- | -------------------------- |
| `POST /v1/chat/completions` | Live (streaming supported) |
| `GET /v1/models`            | Live                       |
| `POST /v1/embeddings`       | Phase B                    |

## Management API (`/api/v1/*`) [#management-api-apiv1]

Account, key, balance, and top-up management. Two auth modes:

* **Logto JWT** (Bearer) — for user-facing endpoints (`/api/v1/me`, `/api/v1/balance`, `/api/v1/topup`, `/api/v1/me/keys`). Obtained via `tokenroute login` (OAuth device-flow).
* **X-Internal-Secret** — for the admin endpoints used by paradigx storefront / partner bridge. Not for end users.

The live OpenAPI schema is at `https://api.tokenroute.io/openapi.json`.

## Discovery for agents [#discovery-for-agents]

If you need to bootstrap OIDC without the CLI:

```bash
curl https://api.tokenroute.io/api/v1/auth/discovery
```

Returns `{issuer, client_id, device_authorization_endpoint, token_endpoint, scopes, resource}` — enough to run RFC 8628 device-flow yourself.

## Rate limits [#rate-limits]

* Burst: 10 RPS / key
* Sustained: depends on tier (see `tokenroute models` for per-model pricing & limits)
* Hitting the limit returns HTTP 429 with a `Retry-After` header

## Errors [#errors]

OpenAI-style error envelope:

```json
{ "error": { "message": "...", "type": "authentication_error", "code": 401 } }
```

| Code | Type                   | Meaning                     |
| ---- | ---------------------- | --------------------------- |
| 401  | `authentication_error` | Missing / bad / revoked key |
| 402  | `insufficient_quota`   | Balance exhausted — top up  |
| 429  | `rate_limit_error`     | Slow down                   |
| 503  | `service_unavailable`  | Try again shortly           |
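If you call the gateway over raw HTTP instead of through an SDK, this envelope plus the `Retry-After` header is all you need for basic error handling. A minimal sketch using Python's `requests`; the three-attempt retry budget and the exception types are illustrative choices, not gateway requirements:

```python
import time
import requests

BASE_URL = "https://api.tokenroute.io/v1"

def chat(api_key: str, messages: list[dict], model: str = "openai/gpt-4o-mini") -> str:
    """Call /v1/chat/completions, honoring Retry-After on 429 and surfacing the error envelope otherwise."""
    for attempt in range(3):
        resp = requests.post(
            f"{BASE_URL}/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json={"model": model, "messages": messages},
            timeout=60,
        )
        if resp.status_code == 200:
            return resp.json()["choices"][0]["message"]["content"]
        if resp.status_code == 429:
            # Rate limited: wait as instructed, then retry.
            time.sleep(int(resp.headers.get("Retry-After", "1")))
            continue
        err = resp.json().get("error", {})
        if resp.status_code == 402:
            # insufficient_quota: balance exhausted, top up before retrying.
            raise RuntimeError(f"Top up required: {err.get('message')}")
        raise RuntimeError(f"{err.get('type')}: {err.get('message')}")
    raise RuntimeError("Still rate limited after 3 attempts")
```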
# CLI reference (/docs/cli)

The `tokenroute` CLI is the primary surface for managing your tokenroute account from a terminal or an AI agent. It's a Python package on PyPI (`pip install tokenroute`); a Node thin wrapper (`npx tokenroute@latest`) ships in Phase B.

## Install [#install]

```bash
pipx install tokenroute    # global, pinned
uvx tokenroute             # one-shot, no install
npx tokenroute@latest      # via the npm thin wrapper (Phase B)
```

## Global flags & env [#global-flags--env]

| Flag / env                     | Default                     | Purpose                                      |
| ------------------------------ | --------------------------- | -------------------------------------------- |
| `--json` / `TOKENROUTE_JSON=1` | off                         | Machine-parseable JSON output.               |
| `TOKENROUTE_API_URL`           | `https://api.tokenroute.io` | Override gateway base URL.                   |
| `TOKENROUTE_API_KEY`           | *(unset)*                   | `sk-tr-*` key for LLM calls — skips `login`. |

Credentials live at `~/.tokenroute/credentials.json` (owner-readable only on POSIX). The most recently created raw API key is cached at `~/.tokenroute/last_key.txt` so `env`, `test`, and `models` can find it with no arguments.

## Commands [#commands]

### `tokenroute login` [#tokenroute-login]

OAuth device-flow against Logto. Opens your browser; the CLI polls until you authorize.

```bash
$ tokenroute login
Visit: https://auth.paradigx.com/device
And enter code: NHWB-FGSR
Waiting for authorization...
OK logged in
```

### `tokenroute logout` [#tokenroute-logout]

Forget locally stored credentials.

### `tokenroute whoami` [#tokenroute-whoami]

Show current user + balance.

```bash
$ tokenroute whoami --json
{
  "id": "f4a8...",
  "display_name": "Agent",
  "email": "agent@example.com",
  "logto_sub": "logto_xyz",
  "balance_usd": "0.00",
  "status": "active"
}
```

### `tokenroute keys create --name <name>` [#tokenroute-keys-create---name-name]

Create a new `sk-tr-*` API key. The raw value is shown **once**; it's cached to `~/.tokenroute/last_key.txt` (pass `--no-cache` to skip).

```bash
$ tokenroute keys create --name fieryeye --json
{"id":"...","name":"fieryeye","key_prefix":"sk-tr-abcd","raw":"sk-tr-...","balance_usd":"0",...}
```

### `tokenroute keys list` [#tokenroute-keys-list]

Table or JSON list of your keys. Raw values are never returned.

### `tokenroute keys revoke <id>` [#tokenroute-keys-revoke-id]

Mark a key revoked. Future calls with it return 401.

### `tokenroute balance` [#tokenroute-balance]

Sum of credit across all active keys.

```bash
$ tokenroute balance --json
{"balance_usd": "5.00", "currency": "USD"}
```

### `tokenroute topup --amount <usd>` [#tokenroute-topup---amount-usd]

Get a Stripe Checkout URL to add credit. The default behavior opens it in the browser. **Agents must not auto-pay** — hand the URL to the user.

```bash
$ tokenroute topup --amount 5 --json
{"checkout_url": "https://checkout.stripe.com/c/pay/...", "session_id": "cs_...", "amount_usd": "5", "expires_at": 1700000000}
```

### `tokenroute usage --days N` [#tokenroute-usage---days-n]

Summarize spend over the last N days (1-365). Only successful requests count.

```bash
$ tokenroute usage --days 7 --json
{"range_days": 7, "total_requests": 142, "total_tokens_in": 12450, "total_tokens_out": 3200, "total_cost_usd": "0.18"}
```

### `tokenroute env [--key sk-tr-...]` [#tokenroute-env---key-sk-tr-]

Output `OPENAI_API_KEY=...` + `OPENAI_BASE_URL=...` for shell redirection.

```bash
$ tokenroute env >> .env
$ tail -2 .env
OPENAI_API_KEY=sk-tr-cached123
OPENAI_BASE_URL=https://api.tokenroute.io/v1
```

### `tokenroute test [--model X]` [#tokenroute-test---model-x]

Send a tiny chat completion to verify the gateway + your key work end-to-end. The default model is `openai/gpt-4o-mini`.

```bash
$ tokenroute test
OK connected (openai/gpt-4o-mini)
reply: OK
```

### `tokenroute models` [#tokenroute-models]

List available models with pricing.

```bash
$ tokenroute models --json | jq '.[] | .id' | head
"openai/gpt-4o-mini"
"openai/gpt-4o"
"anthropic/claude-haiku-4-5"
```

## Exit codes [#exit-codes]

| Code | Meaning                                                                |
| ---- | ---------------------------------------------------------------------- |
| 0    | Success                                                                |
| 1    | User error (4xx from API: insufficient balance, invalid input, no key) |
| 2    | Network error                                                          |
| 3    | Server error (5xx, device-flow failure)                                |
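When a script or agent drives the CLI, these exit codes are the contract to branch on. A minimal Python sketch; the backoff policy is one reasonable reading of the table above, not something the CLI mandates:

```python
import subprocess
import time

def run_tokenroute(*args: str, retries: int = 2) -> subprocess.CompletedProcess:
    """Run a tokenroute command with --json, retrying only on network/server errors (exit codes 2 and 3)."""
    cmd = ["tokenroute", *args, "--json"]
    for attempt in range(retries + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return proc
        if proc.returncode == 1:
            # User error (4xx): retrying won't help; surface to the user.
            raise RuntimeError(f"tokenroute: {proc.stderr.strip() or proc.stdout.strip()}")
        # 2 = network error, 3 = server error / device-flow failure: back off and retry.
        time.sleep(2 ** attempt)
    raise RuntimeError(f"tokenroute failed after {retries + 1} attempts (exit {proc.returncode})")

# Example: verify the cached key works end-to-end.
# result = run_tokenroute("test")
```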
# For AI agents (/docs/for-agents)

You are an AI agent (Claude Code / Codex / Cursor / OpenClaw / Hermes / etc.) and your user needs an LLM API key plumbed into a project. Here is the complete flow in the order you should run it. Every command supports `--json` and respects the `TOKENROUTE_API_KEY` env var, so your sub-agent runs don't need interactive auth.

## TL;DR sequence [#tldr-sequence]

```bash
# 1. install + login (one-time per user)
npx tokenroute@latest login
# opens browser, user logs in

# 2. create a key for this project
npx tokenroute keys create --name <project> --json
# → {"id":"...","raw":"sk-tr-XXXX",...}  ← cache `raw`, you only get it once

# 3. write env to the project
cd <project-dir>
npx tokenroute env >> .env
# → adds: OPENAI_API_KEY=sk-tr-XXXX
#         OPENAI_BASE_URL=https://api.tokenroute.io/v1

# 4. test it works
npx tokenroute test
# → OK connected, model responds OK

# 5. if balance is 0 and you can't proceed → ask user to top up
npx tokenroute topup --amount 5 --json
# → {"checkout_url":"https://checkout.stripe.com/..."}
# ⚠ HAND THIS URL TO THE USER. DO NOT AUTO-PAY.
```

## Hard constraints [#hard-constraints]

1. **Agents must NOT auto-pay.** `topup` returns a Stripe Checkout URL — relay it to the user, full stop. KYC and anti-fraud rules.
2. **`raw` API keys are shown ONCE.** On `keys create`, save the `raw` field into your project's `.env` immediately. The CLI also caches it to `~/.tokenroute/last_key.txt` so subsequent `env` / `test` / `models` calls find it without arguments.
3. **`tokenroute test` requires a model the user has credit for.** The default is `openai/gpt-4o-mini` (~$0.00015 / 1k input tokens). If credit is 0, `test` returns HTTP 402 — that's your cue to call `topup`.

## Exit codes [#exit-codes]

| Code | Meaning                                                            | What to do                                         |
| ---- | ------------------------------------------------------------------ | -------------------------------------------------- |
| 0    | Success                                                            | Continue.                                          |
| 1    | User error / 4xx from API (insufficient balance, invalid key, ...) | Surface to user; usually means top-up or re-login. |
| 2    | Network error                                                      | Retry with backoff.                                |
| 3    | Server error / 5xx / device-flow timeout                           | Retry once; if persistent, surface to user.        |

## Programmatic discovery [#programmatic-discovery]

If you need to bootstrap from scratch without the CLI:

```bash
# 1. discover OIDC config
curl https://api.tokenroute.io/api/v1/auth/discovery
# → {issuer, client_id, device_authorization_endpoint, token_endpoint, ...}

# 2. then run OAuth device-flow against `issuer` yourself
```

The full OpenAPI spec is at `https://api.tokenroute.io/openapi.json`. For LLM-friendly indexing, this site auto-publishes [`/llms.txt`](https://docs.tokenroute.io/llms.txt) and [`/llms-full.txt`](https://docs.tokenroute.io/llms-full.txt).

## Sub-agent / CI usage [#sub-agent--ci-usage]

For non-interactive flows (CI pipelines, sub-agents calling other sub-agents), skip `login` entirely:

```bash
export TOKENROUTE_API_KEY=sk-tr-...
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
```

## Common multi-product scenarios [#common-multi-product-scenarios]

### Setting up fieryeye [#setting-up-fieryeye]

```bash
cd fieryeye-project
npx tokenroute keys create --name fieryeye-prod --json   # → raw saved
npx tokenroute env >> .env                               # → OPENAI_API_KEY set
# fieryeye starts; if it can't auth, fall back to topup flow
```

### Setting up multiple sub-agents in a workspace [#setting-up-multiple-sub-agents-in-a-workspace]

```bash
# parent agent creates one key per project
for proj in api worker dashboard; do
  npx tokenroute keys create --name "$proj-prod" --json > /tmp/$proj-key.json
done
# each sub-agent reads its own .json and exports TOKENROUTE_API_KEY
```

## Troubleshooting [#troubleshooting]

| Symptom                            | Likely cause               | Fix                                  |
| ---------------------------------- | -------------------------- | ------------------------------------ |
| `tokenroute test` → HTTP 402       | Balance = 0                | Run `topup`, surface URL to user.    |
| `tokenroute test` → HTTP 401       | Key revoked or wrong       | `keys list` to see what's active.    |
| `login` hangs after browser opens  | User declined / closed tab | Exit code 3, ask user to retry.      |
| `env` says "no API key available"  | No `keys create` yet       | Run `keys create` first; then `env`. |
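The device-flow itself (see "Programmatic discovery" above) is plain RFC 8628 against the endpoints returned by the discovery call. A minimal Python sketch, assuming a standard OIDC device-flow on the issuer side; exact scope and `resource` handling may differ, so treat it as a starting point rather than a reference client:

```python
import time
import requests

DISCOVERY_URL = "https://api.tokenroute.io/api/v1/auth/discovery"

def device_flow_login() -> dict:
    """Bootstrap an access token without the CLI, via RFC 8628 device flow."""
    cfg = requests.get(DISCOVERY_URL, timeout=10).json()
    scopes = cfg.get("scopes") or []
    scope = " ".join(scopes) if isinstance(scopes, list) else scopes

    # 1. Ask the issuer for a device code + user code.
    dev = requests.post(
        cfg["device_authorization_endpoint"],
        data={"client_id": cfg["client_id"], "scope": scope},
        timeout=10,
    ).json()
    print(f"Visit: {dev['verification_uri']}")
    print(f"And enter code: {dev['user_code']}")

    # 2. Poll the token endpoint until the user approves (or the code expires).
    while True:
        time.sleep(dev.get("interval", 5))
        tok = requests.post(
            cfg["token_endpoint"],
            data={
                "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
                "device_code": dev["device_code"],
                "client_id": cfg["client_id"],
            },
            timeout=10,
        )
        body = tok.json()
        if tok.ok:
            return body  # contains access_token
        if body.get("error") not in ("authorization_pending", "slow_down"):
            raise RuntimeError(body.get("error_description", body.get("error")))
```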
# Introduction (/docs)

**tokenroute** is an OpenAI-compatible LLM API gateway that does three things:

1. **One API key, many providers.** Point any OpenAI SDK at `https://api.tokenroute.io/v1` and call OpenAI, Anthropic, Google, DeepSeek, and more — no per-vendor SDKs, no per-vendor keys.
2. **Auto-routing by prompt complexity.** The gateway scores your prompt and picks the cheapest model that can handle it (SIMPLE → REASONING tiers). You stop paying GPT-4 prices to answer "what's 2+2".
3. **Agent-first surface.** A `tokenroute` CLI + remote MCP server expose the entire lifecycle — login, key creation, top-up, usage — so Claude Code, Codex, OpenClaw, Hermes, and other agents can wire it up in a handful of commands.

## Why agent-first? [#why-agent-first]

When a coding agent installs a new LLM-powered app for a user, the painful step is *getting the API key plumbed in*. Most providers force the user to open a browser dashboard, copy a key, paste it into `.env`, and restart. We collapsed that into:

```bash
npx tokenroute login                     # OAuth device-flow, browser opens
npx tokenroute keys create --name myapp  # raw key shown once
npx tokenroute env >> .env               # OPENAI_API_KEY + BASE_URL written
```

Every CLI command supports `--json` and the `TOKENROUTE_API_KEY` env var, so sub-agents and CI pipelines can do the same flow non-interactively.

## What's next [#whats-next]

* [Quickstart](/docs/quickstart): From zero to first request in under 2 minutes.
* [For AI agents](/docs/for-agents): Designed for agents reading this directly — copy-paste ready.
* [CLI reference](/docs/cli): Every command with examples and JSON shapes.
* [API reference](/docs/api-reference): OpenAI-compatible endpoints + management API.

# Quickstart (/docs/quickstart)

Pick the path that matches how you're using tokenroute.

## A. From a Python or Node app (OpenAI SDK) [#a-from-a-python-or-node-app-openai-sdk]

You already have OpenAI SDK code? Change two lines:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenroute.io/v1",
    api_key="sk-tr-...",  # from `tokenroute keys create`
)

resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.tokenroute.io/v1",
  apiKey: "sk-tr-...",
});

const r = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(r.choices[0].message.content);
```

```bash
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer sk-tr-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
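`POST /v1/chat/completions` also supports streaming (see the [API reference](/docs/api-reference)). A minimal sketch with the OpenAI Python SDK; only the two client arguments differ from stock OpenAI usage:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenroute.io/v1",
    api_key="sk-tr-...",
)

# stream=True yields chunks as the model generates them.
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about gateways"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```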
## B. From the CLI (recommended for first-time setup) [#b-from-the-cli-recommended-for-first-time-setup]

```bash
# install (any of these)
pipx install tokenroute        # pinned global install
uvx tokenroute                 # one-shot, no install
npx tokenroute@latest          # via the npm thin wrapper (Phase B)

# log in once
tokenroute login               # opens browser for OAuth device-flow

# create your first key
tokenroute keys create --name myapp

# write OPENAI_API_KEY + BASE_URL into your project
tokenroute env >> .env

# verify it works
tokenroute test
```

## C. From an AI agent (Claude Code, Codex, etc.) [#c-from-an-ai-agent-claude-code-codex-etc]

If you're an agent reading this on behalf of a user, [hop to **For AI agents**](/docs/for-agents) for the copy-paste sequence.

## D. From Claude Code or Cursor (MCP) [#d-from-claude-code-or-cursor-mcp]

MCP support is in development (Phase B). Once live, one line wires tokenroute into Claude Code:

```bash
claude mcp add tokenroute https://mcp.tokenroute.io/mcp
```

## What you get back [#what-you-get-back]

The `model` field is namespaced `provider/model-id`. tokenroute supports OpenAI, Anthropic, Google, DeepSeek, and Mistral out of the box — full list at [`tokenroute models`](/docs/cli) or `GET /v1/models`.

## Topping up [#topping-up]

New accounts start at $0. Get a Stripe Checkout link via:

```bash
tokenroute topup --amount 5
```

Or, `POST /api/v1/topup` returns `{ checkout_url }` for programmatic flows. **Agents must not auto-pay** — hand the URL to the user.

# botu — production agent runtime on tokenroute (/docs/case-studies/botu)

# botu on tokenroute [#botu-on-tokenroute]

[botu](https://agent.botu.io) is the Paradigx production runtime for AI assistant skills — a fork of [OpenClaw](https://openclaw.dev) running on `agent.botu.io`. **All** LLM traffic from botu sandboxes goes through tokenroute as the upstream gateway, with Claude Haiku 4.5 as primary and Sonnet 4.5 as fallback.

## What's wired [#whats-wired]

botu's `openclaw.json` config points its built-in LLM router at tokenroute's OpenAI-compatible endpoint:

```jsonc
{
  "llm": {
    "providers": [
      {
        "type": "openai",
        "baseUrl": "https://api.tokenroute.io/v1",
        "apiKey": "${LITELLM_API_KEY}", // sk-tr-* from `tokenroute keys create`
        "models": [
          { "id": "anthropic/claude-haiku-4-5", "name": "Claude Haiku 4.5 (via tokenroute)", "role": "primary" },
          { "id": "anthropic/claude-sonnet-4-5", "name": "Claude Sonnet 4.5 (via tokenroute)", "role": "fallback" }
        ]
      }
    ]
  }
}
```

`LITELLM_API_KEY` is injected via the host `.env` file — botu containers never see a real Anthropic key. To agents inside botu sandboxes this is transparent: they call `https://api.anthropic.com`-shaped APIs and tokenroute does the routing.

## Why this matters for agent-first [#why-this-matters-for-agent-first]

This is the **agent runtime** half of the agent-first story:

* **Tenant isolation**: each botu tenant gets its own `sk-tr-*` key. Per-tenant spend is visible in `tokenroute usage`.
* **Provider fallback for free**: if Anthropic Haiku has a quota event, tokenroute auto-routes the next request to Sonnet (or any other configured tier) without the agent code knowing.
* **One bill**: Paradigx pays tokenroute, tokenroute pays Anthropic/OpenAI/etc. No N-way invoice reconciliation across providers.
* **Cost cap per agent run**: botu sandboxes use `tokenroute keys create --name <session>` per session, so any runaway agent loop hits a per-key balance ceiling rather than draining the whole org credit.

## How a botu deploy uses tokenroute end-to-end [#how-a-botu-deploy-uses-tokenroute-end-to-end]

```bash
# 1. Operator creates a tokenroute API key for the botu deployment
tokenroute keys create --name botu-prod --json
# → {"raw":"sk-tr-..."}

# 2. Inject it into the botu host .env (alayion in our case)
ssh alayion 'echo "LITELLM_API_KEY=sk-tr-..." >> /opt/botu/.env'

# 3. Compose up botu — it reads .env, container env vars resolve
#    ${LITELLM_API_KEY} in openclaw.json, and every LLM call from skills
#    inside botu sandboxes routes through tokenroute.
ssh alayion 'cd /opt/botu && docker compose up -d'
```

Replace `ssh alayion ...` with whatever your deploy story is — Render, Fly, Hetzner, K8s. The botu side is just "set one env var".
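The per-session key pattern from the "Cost cap per agent run" bullet above generalizes beyond botu. A minimal sketch of how a runtime might mint a scoped key before launching a sandboxed run; the session naming scheme and the launch mechanics are placeholders, not botu's actual code:

```python
import json
import os
import subprocess
import uuid

def launch_sandboxed_run(command: list[str]) -> subprocess.CompletedProcess:
    """Mint a per-session tokenroute key and hand it to a sandboxed process,
    so a runaway loop only drains that key's balance."""
    session = f"sandbox-{uuid.uuid4().hex[:8]}"  # placeholder naming scheme
    created = json.loads(
        subprocess.run(
            ["tokenroute", "keys", "create", "--name", session, "--json"],
            capture_output=True, text=True, check=True,
        ).stdout
    )

    env = os.environ.copy()
    env["OPENAI_API_KEY"] = created["raw"]  # shown once; scoped to this session
    env["OPENAI_BASE_URL"] = "https://api.tokenroute.io/v1"
    return subprocess.run(command, env=env)

# When the run finishes, revoke the key: tokenroute keys revoke <id>
```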
## Phase A → Phase B closure [#phase-a--phase-b-closure]

This is the **proof of concept** for tokenroute's bet that AI products want a *gateway* layer between them and the half-dozen LLM providers:

| Before (per-product per-provider)                      | After (botu via tokenroute)                                      |
| ------------------------------------------------------ | ----------------------------------------------------------------- |
| Maintain Anthropic + OpenAI + Gemini keys in 5 places  | One `LITELLM_API_KEY` per product deploy                          |
| Fallback logic inlined into every skill                | Server-side, transparent to agent code                            |
| Spend visibility per provider, not per use case        | Per-key in `tokenroute usage`, plus per-tenant if you scope keys  |
| KYC + payment with each vendor                         | Paradigx fronts upstream costs, customers top up tokenroute       |

## See also [#see-also]

* botu deploy compose: [`paradigx-deploy/alayion/docker-compose.yml`](https://github.com/jiangjin11/botu/blob/main/paradigx-deploy/alayion/docker-compose.yml) — note the `ANTHROPIC_API_KEY=sk-ant-real-key-is-tokenroute-not-here` decoy
* Live: `https://agent.botu.io/healthz`