# API reference (/docs/api-reference)

tokenroute exposes two surfaces:

## OpenAI-compatible (`/v1/*`) [#openai-compatible-v1]

Drop-in replacement for `api.openai.com/v1`. Same request/response shapes, just point your SDK at `https://api.tokenroute.io/v1` and use an `sk-tr-*` key. See [Quickstart](/docs/quickstart) for SDK snippets.

| Endpoint                    | Status                     |
| --------------------------- | -------------------------- |
| `POST /v1/chat/completions` | Live (streaming supported) |
| `GET /v1/models`            | Live                       |
| `POST /v1/embeddings`       | Phase B                    |

## Management API (`/api/v1/*`) [#management-api-apiv1]

Account, key, balance, and top-up management. Two auth modes:

* **Logto JWT** (Bearer) — for user-facing endpoints (`/api/v1/me`, `/api/v1/balance`, `/api/v1/topup`, `/api/v1/me/keys`). Obtained via `tokenroute login` (OAuth device-flow).
* **X-Internal-Secret** — for the admin endpoints used by paradigx storefront / partner bridge. Not for end users.

The live OpenAPI schema is at `https://api.tokenroute.io/openapi.json`.

## Discovery for agents [#discovery-for-agents]

If you need to bootstrap OIDC without the CLI:

```bash
curl https://api.tokenroute.io/api/v1/auth/discovery
```

Returns `{issuer, client_id, device_authorization_endpoint, token_endpoint, scopes, resource}` — enough to run RFC 8628 device-flow yourself.

## Rate limits [#rate-limits]

* Burst: 10 RPS / key
* Sustained: depends on tier (see `tokenroute models` for per-model pricing & limits)
* Hitting the limit returns HTTP 429 with a `Retry-After` header

## Errors [#errors]

OpenAI-style error envelope:

```json
{ "error": { "message": "...", "type": "authentication_error", "code": 401 } }
```

| Code | Type                   | Meaning                     |
| ---- | ---------------------- | --------------------------- |
| 401  | `authentication_error` | Missing / bad / revoked key |
| 402  | `insufficient_quota`   | Balance exhausted — top up  |
| 429  | `rate_limit_error`     | Slow down                   |
| 503  | `service_unavailable`  | Try again shortly           |
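If you call the gateway over raw HTTP instead of through an SDK, this envelope plus the `Retry-After` header is all you need for basic error handling. A minimal sketch using Python's `requests`; the three-attempt retry budget and the exception types are illustrative choices, not gateway requirements:

```python
import time
import requests

BASE_URL = "https://api.tokenroute.io/v1"

def chat(api_key: str, messages: list[dict], model: str = "openai/gpt-4o-mini") -> str:
    """Call /v1/chat/completions, honoring Retry-After on 429 and surfacing the error envelope otherwise."""
    for attempt in range(3):
        resp = requests.post(
            f"{BASE_URL}/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json={"model": model, "messages": messages},
            timeout=60,
        )
        if resp.status_code == 200:
            return resp.json()["choices"][0]["message"]["content"]
        if resp.status_code == 429:
            # Rate limited: wait as instructed, then retry.
            time.sleep(int(resp.headers.get("Retry-After", "1")))
            continue
        err = resp.json().get("error", {})
        if resp.status_code == 402:
            # insufficient_quota: balance exhausted, top up before retrying.
            raise RuntimeError(f"Top up required: {err.get('message')}")
        raise RuntimeError(f"{err.get('type')}: {err.get('message')}")
    raise RuntimeError("Still rate limited after 3 attempts")
```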
# CLI reference (/docs/cli)

The `tokenroute` CLI is the primary surface for managing your tokenroute account from a terminal or an AI agent. It's a Python package on PyPI (`pip install tokenroute`); a Node thin wrapper (`npx tokenroute@latest`) ships in Phase B.

## Install [#install]

```bash
pipx install tokenroute    # global, pinned
uvx tokenroute             # one-shot, no install
npx tokenroute@latest      # via the npm thin wrapper (Phase B)
```

## Global flags & env [#global-flags--env]

| Flag / env                     | Default                     | Purpose                                      |
| ------------------------------ | --------------------------- | -------------------------------------------- |
| `--json` / `TOKENROUTE_JSON=1` | off                         | Machine-parseable JSON output.               |
| `TOKENROUTE_API_URL`           | `https://api.tokenroute.io` | Override gateway base URL.                   |
| `TOKENROUTE_API_KEY`           | *(unset)*                   | `sk-tr-*` key for LLM calls — skips `login`. |

Credentials live at `~/.tokenroute/credentials.json` (owner-readable only on POSIX). The most recently created raw API key is cached at `~/.tokenroute/last_key.txt` so `env`, `test`, and `models` can find it with no arguments.

## Commands [#commands]

### `tokenroute login` [#tokenroute-login]

OAuth device-flow against Logto. Opens your browser; the CLI polls until you authorize.

```bash
$ tokenroute login
Visit: https://auth.paradigx.com/device
And enter code: NHWB-FGSR
Waiting for authorization...
OK logged in
```

### `tokenroute logout` [#tokenroute-logout]

Forget locally stored credentials.

### `tokenroute whoami` [#tokenroute-whoami]

Show current user + balance.

```bash
$ tokenroute whoami --json
{
  "id": "f4a8...",
  "display_name": "Agent",
  "email": "agent@example.com",
  "logto_sub": "logto_xyz",
  "balance_usd": "0.00",
  "status": "active"
}
```

### `tokenroute keys create --name <name>` [#tokenroute-keys-create---name-name]

Create a new `sk-tr-*` API key. The raw value is shown **once**; it's cached to `~/.tokenroute/last_key.txt` (pass `--no-cache` to skip).

```bash
$ tokenroute keys create --name fieryeye --json
{"id":"...","name":"fieryeye","key_prefix":"sk-tr-abcd","raw":"sk-tr-...","balance_usd":"0",...}
```

### `tokenroute keys list` [#tokenroute-keys-list]

Table or JSON list of your keys. Raw values are never returned.

### `tokenroute keys revoke <id>` [#tokenroute-keys-revoke-id]

Mark a key revoked. Future calls with it return 401.

### `tokenroute balance` [#tokenroute-balance]

Sum of credit across all active keys.

```bash
$ tokenroute balance --json
{"balance_usd": "5.00", "currency": "USD"}
```

### `tokenroute topup --amount <usd>` [#tokenroute-topup---amount-usd]

Get a Stripe Checkout URL to add credit. The default behavior opens it in the browser. **Agents must not auto-pay** — hand the URL to the user.

```bash
$ tokenroute topup --amount 5 --json
{"checkout_url": "https://checkout.stripe.com/c/pay/...", "session_id": "cs_...", "amount_usd": "5", "expires_at": 1700000000}
```

### `tokenroute usage --days N` [#tokenroute-usage---days-n]

Summarize spend over the last N days (1-365). Only successful requests count.

```bash
$ tokenroute usage --days 7 --json
{"range_days": 7, "total_requests": 142, "total_tokens_in": 12450, "total_tokens_out": 3200, "total_cost_usd": "0.18"}
```

### `tokenroute env [--key sk-tr-...]` [#tokenroute-env---key-sk-tr-]

Output `OPENAI_API_KEY=...` + `OPENAI_BASE_URL=...` for shell redirection.

```bash
$ tokenroute env >> .env
$ tail -2 .env
OPENAI_API_KEY=sk-tr-cached123
OPENAI_BASE_URL=https://api.tokenroute.io/v1
```

### `tokenroute test [--model X]` [#tokenroute-test---model-x]

Send a tiny chat completion to verify the gateway + your key work end-to-end. The default model is `openai/gpt-4o-mini`.

```bash
$ tokenroute test
OK connected (openai/gpt-4o-mini)
reply: OK
```

### `tokenroute models` [#tokenroute-models]

List available models with pricing.

```bash
$ tokenroute models --json | jq '.[] | .id' | head
"openai/gpt-4o-mini"
"openai/gpt-4o"
"anthropic/claude-haiku-4-5"
```

## Exit codes [#exit-codes]

| Code | Meaning                                                                |
| ---- | ---------------------------------------------------------------------- |
| 0    | Success                                                                |
| 1    | User error (4xx from API: insufficient balance, invalid input, no key) |
| 2    | Network error                                                          |
| 3    | Server error (5xx, device-flow failure)                                |
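When a script or agent drives the CLI, these exit codes are the contract to branch on. A minimal Python sketch; the backoff policy is one reasonable reading of the table above, not something the CLI mandates:

```python
import subprocess
import time

def run_tokenroute(*args: str, retries: int = 2) -> subprocess.CompletedProcess:
    """Run a tokenroute command with --json, retrying only on network/server errors (exit codes 2 and 3)."""
    cmd = ["tokenroute", *args, "--json"]
    for attempt in range(retries + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return proc
        if proc.returncode == 1:
            # User error (4xx): retrying won't help; surface to the user.
            raise RuntimeError(f"tokenroute: {proc.stderr.strip() or proc.stdout.strip()}")
        # 2 = network error, 3 = server error / device-flow failure: back off and retry.
        time.sleep(2 ** attempt)
    raise RuntimeError(f"tokenroute failed after {retries + 1} attempts (exit {proc.returncode})")

# Example: verify the cached key works end-to-end.
# result = run_tokenroute("test")
```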
# For AI agents (/docs/for-agents)

You are an AI agent (Claude Code / Codex / Cursor / OpenClaw / Hermes / etc.) and your user needs an LLM API key plumbed into a project. Here is the complete flow in the order you should run it. Every command supports `--json` and respects the `TOKENROUTE_API_KEY` env var, so your sub-agent runs don't need interactive auth.

## TL;DR sequence [#tldr-sequence]

```bash
# 1. install + login (one-time per user)
npx tokenroute@latest login
# opens browser, user logs in

# 2. create a key for this project
npx tokenroute keys create --name <project> --json
# → {"id":"...","raw":"sk-tr-XXXX",...}  ← cache `raw`, you only get it once

# 3. write env to the project
cd <project-dir>
npx tokenroute env >> .env
# → adds: OPENAI_API_KEY=sk-tr-XXXX
#         OPENAI_BASE_URL=https://api.tokenroute.io/v1

# 4. test it works
npx tokenroute test
# → OK connected, model responds OK

# 5. if balance is 0 and you can't proceed → ask user to top up
npx tokenroute topup --amount 5 --json
# → {"checkout_url":"https://checkout.stripe.com/..."}
# ⚠ HAND THIS URL TO THE USER. DO NOT AUTO-PAY.
```

## Hard constraints [#hard-constraints]

1. **Agents must NOT auto-pay.** `topup` returns a Stripe Checkout URL — relay it to the user, full stop. KYC and anti-fraud rules.
2. **`raw` API keys are shown ONCE.** On `keys create`, save the `raw` field into your project's `.env` immediately. The CLI also caches it to `~/.tokenroute/last_key.txt` so subsequent `env` / `test` / `models` calls find it without arguments.
3. **`tokenroute test` requires a model the user has credit for.** The default is `openai/gpt-4o-mini` (~$0.00015 / 1k input tokens). If credit is 0, `test` returns HTTP 402 — that's your cue to call `topup`.

## Exit codes [#exit-codes]

| Code | Meaning                                                            | What to do                                         |
| ---- | ------------------------------------------------------------------ | -------------------------------------------------- |
| 0    | Success                                                            | Continue.                                          |
| 1    | User error / 4xx from API (insufficient balance, invalid key, ...) | Surface to user; usually means top-up or re-login. |
| 2    | Network error                                                      | Retry with backoff.                                |
| 3    | Server error / 5xx / device-flow timeout                           | Retry once; if persistent, surface to user.        |

## Programmatic discovery [#programmatic-discovery]

If you need to bootstrap from scratch without the CLI:

```bash
# 1. discover OIDC config
curl https://api.tokenroute.io/api/v1/auth/discovery
# → {issuer, client_id, device_authorization_endpoint, token_endpoint, ...}

# 2. then run OAuth device-flow against `issuer` yourself
```

The full OpenAPI spec is at `https://api.tokenroute.io/openapi.json`. For LLM-friendly indexing, this site auto-publishes [`/llms.txt`](https://docs.tokenroute.io/llms.txt) and [`/llms-full.txt`](https://docs.tokenroute.io/llms-full.txt).

## Sub-agent / CI usage [#sub-agent--ci-usage]

For non-interactive flows (CI pipelines, sub-agents calling other sub-agents), skip `login` entirely:

```bash
export TOKENROUTE_API_KEY=sk-tr-...
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer $TOKENROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
```

## Common multi-product scenarios [#common-multi-product-scenarios]

### Setting up fieryeye [#setting-up-fieryeye]

```bash
cd fieryeye-project
npx tokenroute keys create --name fieryeye-prod --json   # → raw saved
npx tokenroute env >> .env                               # → OPENAI_API_KEY set
# fieryeye starts; if it can't auth, fall back to topup flow
```

### Setting up multiple sub-agents in a workspace [#setting-up-multiple-sub-agents-in-a-workspace]

```bash
# parent agent creates one key per project
for proj in api worker dashboard; do
  npx tokenroute keys create --name "$proj-prod" --json > /tmp/$proj-key.json
done
# each sub-agent reads its own .json and exports TOKENROUTE_API_KEY
```

## Troubleshooting [#troubleshooting]

| Symptom                            | Likely cause               | Fix                                  |
| ---------------------------------- | -------------------------- | ------------------------------------ |
| `tokenroute test` → HTTP 402       | Balance = 0                | Run `topup`, surface URL to user.    |
| `tokenroute test` → HTTP 401       | Key revoked or wrong       | `keys list` to see what's active.    |
| `login` hangs after browser opens  | User declined / closed tab | Exit code 3, ask user to retry.      |
| `env` says "no API key available"  | No `keys create` yet       | Run `keys create` first; then `env`. |
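The device-flow itself (see "Programmatic discovery" above) is plain RFC 8628 against the endpoints returned by the discovery call. A minimal Python sketch, assuming a standard OIDC device-flow on the issuer side; exact scope and `resource` handling may differ, so treat it as a starting point rather than a reference client:

```python
import time
import requests

DISCOVERY_URL = "https://api.tokenroute.io/api/v1/auth/discovery"

def device_flow_login() -> dict:
    """Bootstrap an access token without the CLI, via RFC 8628 device flow."""
    cfg = requests.get(DISCOVERY_URL, timeout=10).json()
    scopes = cfg.get("scopes") or []
    scope = " ".join(scopes) if isinstance(scopes, list) else scopes

    # 1. Ask the issuer for a device code + user code.
    dev = requests.post(
        cfg["device_authorization_endpoint"],
        data={"client_id": cfg["client_id"], "scope": scope},
        timeout=10,
    ).json()
    print(f"Visit: {dev['verification_uri']}")
    print(f"And enter code: {dev['user_code']}")

    # 2. Poll the token endpoint until the user approves (or the code expires).
    while True:
        time.sleep(dev.get("interval", 5))
        tok = requests.post(
            cfg["token_endpoint"],
            data={
                "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
                "device_code": dev["device_code"],
                "client_id": cfg["client_id"],
            },
            timeout=10,
        )
        body = tok.json()
        if tok.ok:
            return body  # contains access_token
        if body.get("error") not in ("authorization_pending", "slow_down"):
            raise RuntimeError(body.get("error_description", body.get("error")))
```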
# Introduction (/docs)

**tokenroute** is an OpenAI-compatible LLM API gateway that does three things:

1. **One API key, many providers.** Point any OpenAI SDK at `https://api.tokenroute.io/v1` and call OpenAI, Anthropic, Google, DeepSeek, and more — no per-vendor SDKs, no per-vendor keys.
2. **Auto-routing by prompt complexity.** The gateway scores your prompt and picks the cheapest model that can handle it (SIMPLE → REASONING tiers). You stop paying GPT-4 prices to answer "what's 2+2".
3. **Agent-first surface.** A `tokenroute` CLI + remote MCP server expose the entire lifecycle — login, key creation, top-up, usage — so Claude Code, Codex, OpenClaw, Hermes, and other agents can wire it up in a handful of commands.

## Why agent-first? [#why-agent-first]

When a coding agent installs a new LLM-powered app for a user, the painful step is *getting the API key plumbed in*. Most providers force the user to open a browser dashboard, copy a key, paste it into `.env`, and restart. We collapsed that into:

```bash
npx tokenroute login                     # OAuth device-flow, browser opens
npx tokenroute keys create --name myapp  # raw key shown once
npx tokenroute env >> .env               # OPENAI_API_KEY + BASE_URL written
```

Every CLI command supports `--json` and the `TOKENROUTE_API_KEY` env var, so sub-agents and CI pipelines can do the same flow non-interactively.

## What's next [#whats-next]

* [Quickstart](/docs/quickstart): From zero to first request in under 2 minutes.
* [For AI agents](/docs/for-agents): Designed for agents reading this directly — copy-paste ready.
* [CLI reference](/docs/cli): Every command with examples and JSON shapes.
* [API reference](/docs/api-reference): OpenAI-compatible endpoints + management API.

# Quickstart (/docs/quickstart)

Pick the path that matches how you're using tokenroute.

## A. From a Python or Node app (OpenAI SDK) [#a-from-a-python-or-node-app-openai-sdk]

You already have OpenAI SDK code? Change two lines:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenroute.io/v1",
    api_key="sk-tr-...",  # from `tokenroute keys create`
)

resp = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.tokenroute.io/v1",
  apiKey: "sk-tr-...",
});

const r = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(r.choices[0].message.content);
```

```bash
curl https://api.tokenroute.io/v1/chat/completions \
  -H "Authorization: Bearer sk-tr-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
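`POST /v1/chat/completions` also supports streaming (see the [API reference](/docs/api-reference)). A minimal sketch with the OpenAI Python SDK; only the two client arguments differ from stock OpenAI usage:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenroute.io/v1",
    api_key="sk-tr-...",
)

# stream=True yields chunks as the model generates them.
stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about gateways"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```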
## B. From the CLI (recommended for first-time setup) [#b-from-the-cli-recommended-for-first-time-setup]

```bash
# install (any of these)
pipx install tokenroute        # pinned global install
uvx tokenroute                 # one-shot, no install
npx tokenroute@latest          # via the npm thin wrapper (Phase B)

# log in once
tokenroute login               # opens browser for OAuth device-flow

# create your first key
tokenroute keys create --name myapp

# write OPENAI_API_KEY + BASE_URL into your project
tokenroute env >> .env

# verify it works
tokenroute test
```

## C. From an AI agent (Claude Code, Codex, etc.) [#c-from-an-ai-agent-claude-code-codex-etc]

If you're an agent reading this on behalf of a user, [hop to **For AI agents**](/docs/for-agents) for the copy-paste sequence.

## D. From Claude Code or Cursor (MCP) [#d-from-claude-code-or-cursor-mcp]

MCP support is in development (Phase B). Once live, one line wires tokenroute into Claude Code:

```bash
claude mcp add tokenroute https://mcp.tokenroute.io/mcp
```

## What you get back [#what-you-get-back]

The `model` field is namespaced `provider/model-id`. tokenroute supports OpenAI, Anthropic, Google, DeepSeek, and Mistral out of the box — full list at [`tokenroute models`](/docs/cli) or `GET /v1/models`.

## Topping up [#topping-up]

New accounts start at $0. Get a Stripe Checkout link via:

```bash
tokenroute topup --amount 5
```

Or, `POST /api/v1/topup` returns `{ checkout_url }` for programmatic flows. **Agents must not auto-pay** — hand the URL to the user.

# botu — production agent runtime on tokenroute (/docs/case-studies/botu)

# botu on tokenroute [#botu-on-tokenroute]

[botu](https://agent.botu.io) is the Paradigx production runtime for AI assistant skills — a fork of [OpenClaw](https://openclaw.dev) running on `agent.botu.io`. **All** LLM traffic from botu sandboxes goes through tokenroute as the upstream gateway, with Claude Haiku 4.5 as primary and Sonnet 4.5 as fallback.

## What's wired [#whats-wired]

botu's `openclaw.json` config points its built-in LLM router at tokenroute's OpenAI-compatible endpoint:

```jsonc
{
  "llm": {
    "providers": [
      {
        "type": "openai",
        "baseUrl": "https://api.tokenroute.io/v1",
        "apiKey": "${LITELLM_API_KEY}", // sk-tr-* from `tokenroute keys create`
        "models": [
          { "id": "anthropic/claude-haiku-4-5", "name": "Claude Haiku 4.5 (via tokenroute)", "role": "primary" },
          { "id": "anthropic/claude-sonnet-4-5", "name": "Claude Sonnet 4.5 (via tokenroute)", "role": "fallback" }
        ]
      }
    ]
  }
}
```

`LITELLM_API_KEY` is injected via the host `.env` file — botu containers never see a real Anthropic key. To agents inside botu sandboxes this is transparent: they call `https://api.anthropic.com`-shaped APIs and tokenroute does the routing.

## Why this matters for agent-first [#why-this-matters-for-agent-first]

This is the **agent runtime** half of the agent-first story:

* **Tenant isolation**: each botu tenant gets its own `sk-tr-*` key. Per-tenant spend is visible in `tokenroute usage`.
* **Provider fallback for free**: if Anthropic Haiku has a quota event, tokenroute auto-routes the next request to Sonnet (or any other configured tier) without the agent code knowing.
* **One bill**: Paradigx pays tokenroute, tokenroute pays Anthropic/OpenAI/etc. No N-way invoice reconciliation across providers.
* **Cost cap per agent run**: botu sandboxes use `tokenroute keys create --name <session>` per session, so any runaway agent loop hits a per-key balance ceiling rather than draining the whole org credit.

## How a botu deploy uses tokenroute end-to-end [#how-a-botu-deploy-uses-tokenroute-end-to-end]

```bash
# 1. Operator creates a tokenroute API key for the botu deployment
tokenroute keys create --name botu-prod --json
# → {"raw":"sk-tr-..."}

# 2. Inject it into the botu host .env (alayion in our case)
ssh alayion 'echo "LITELLM_API_KEY=sk-tr-..." >> /opt/botu/.env'

# 3. Compose up botu — it reads .env, container env vars resolve
#    ${LITELLM_API_KEY} in openclaw.json, and every LLM call from skills
#    inside botu sandboxes routes through tokenroute.
ssh alayion 'cd /opt/botu && docker compose up -d'
```

Replace `ssh alayion ...` with whatever your deploy story is — Render, Fly, Hetzner, K8s. The botu side is just "set one env var".
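The per-session key pattern from the "Cost cap per agent run" bullet above generalizes beyond botu. A minimal sketch of how a runtime might mint a scoped key before launching a sandboxed run; the session naming scheme and the launch mechanics are placeholders, not botu's actual code:

```python
import json
import os
import subprocess
import uuid

def launch_sandboxed_run(command: list[str]) -> subprocess.CompletedProcess:
    """Mint a per-session tokenroute key and hand it to a sandboxed process,
    so a runaway loop only drains that key's balance."""
    session = f"sandbox-{uuid.uuid4().hex[:8]}"  # placeholder naming scheme
    created = json.loads(
        subprocess.run(
            ["tokenroute", "keys", "create", "--name", session, "--json"],
            capture_output=True, text=True, check=True,
        ).stdout
    )

    env = os.environ.copy()
    env["OPENAI_API_KEY"] = created["raw"]  # shown once; scoped to this session
    env["OPENAI_BASE_URL"] = "https://api.tokenroute.io/v1"
    return subprocess.run(command, env=env)

# When the run finishes, revoke the key: tokenroute keys revoke <id>
```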
## Phase A → Phase B closure [#phase-a--phase-b-closure]

This is the **proof of concept** for tokenroute's bet that AI products want a *gateway* layer between them and the half-dozen LLM providers:

| Before (per-product per-provider)                      | After (botu via tokenroute)                                      |
| ------------------------------------------------------ | ----------------------------------------------------------------- |
| Maintain Anthropic + OpenAI + Gemini keys in 5 places  | One `LITELLM_API_KEY` per product deploy                          |
| Fallback logic inlined into every skill                | Server-side, transparent to agent code                            |
| Spend visibility per provider, not per use case        | Per-key in `tokenroute usage`, plus per-tenant if you scope keys  |
| KYC + payment with each vendor                         | Paradigx fronts upstream costs, customers top up tokenroute       |

## See also [#see-also]

* botu deploy compose: [`paradigx-deploy/alayion/docker-compose.yml`](https://github.com/jiangjin11/botu/blob/main/paradigx-deploy/alayion/docker-compose.yml) — note the `ANTHROPIC_API_KEY=sk-ant-real-key-is-tokenroute-not-here` decoy
* Live: `https://agent.botu.io/healthz`