LLM Providers

TeamWeb AI can route agent traffic through multiple LLM providers concurrently. An admin configures one or more providers in Settings; each assistant then either picks its own provider/model override or follows the deployment defaults.

Main and utility tiers

Every agent call is tagged as one of two tiers:

  • Main — the conversational turn in the agent loop. Uses the assistant’s provider_id + model_name if set, otherwise the deployment main default.
  • Utility — workflow guards, classifiers, and message summarisation. Uses the assistant’s utility_provider_id + utility_model_name if set, otherwise the deployment utility default.

The main and utility defaults can point at the same provider config or different ones — a common setup is a capable model like Claude Opus or GPT-4o on the main tier with a cheap, fast model like Claude Haiku or GPT-4o-mini on the utility tier for guards.
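
In pseudocode, resolution for either tier is a two-level fallback. The sketch below is illustrative only; the assistant field names come from the tier descriptions above, but the function and the shape of the deployment defaults are hypothetical, not TeamWeb AI’s actual internals.

    from dataclasses import dataclass


    @dataclass
    class Assistant:
        provider_id: int | None = None
        model_name: str | None = None
        utility_provider_id: int | None = None
        utility_model_name: str | None = None


    def resolve_tier(assistant: Assistant, tier: str, defaults: dict) -> tuple:
        """Return (provider_id, model_name) for a call tagged 'main' or 'utility'."""
        if tier == "utility":
            provider, model = assistant.utility_provider_id, assistant.utility_model_name
        else:
            provider, model = assistant.provider_id, assistant.model_name
        if provider is not None and model is not None:
            return provider, model
        return defaults[tier]  # admin-configured: {"main": (...), "utility": (...)}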

Workflow-internal LLM calls (guards, classifiers) currently still use the main-tier provider. The utility-tier data model is in place and per-assistant overrides persist correctly, but the workflow engine hasn’t yet been wired to route through the utility tier. Track progress in the code comment tagged TODO(llm-tiers) in app/services/workflow/orchestrator.py.

Sign in with ChatGPT / Claude — no registration required

Anthropic and OpenAI both publish public OAuth clients inside their CLI tools (Claude Code and Codex respectively). TeamWeb AI piggybacks on those same public clients, so admins never register anything with Anthropic or OpenAI — just click Connect and paste a URL back.

This matches how pi-mono and OpenClaw sign in: the authorize URL uses the vendor’s published client_id, the token exchange uses PKCE (no client secret), and the redirect URI is pinned to a localhost:<port> value on the vendor’s side.
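
A generic sketch of that pattern, assuming nothing beyond standard PKCE (RFC 7636). The authorize URL, client id, scopes, and redirect path below are placeholders; the vendor-specific values appear in the sections that follow.

    import base64
    import hashlib
    import secrets
    from urllib.parse import urlencode

    AUTHORIZE_URL = "https://vendor.example/oauth/authorize"  # vendor-specific, see below
    CLIENT_ID = "<the vendor's published public client id>"


    def pkce_pair() -> tuple[str, str]:
        """Generate a PKCE verifier and its S256 challenge (RFC 7636)."""
        verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
        digest = hashlib.sha256(verifier.encode()).digest()
        challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
        return verifier, challenge


    verifier, challenge = pkce_pair()
    consent_url = AUTHORIZE_URL + "?" + urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": "http://localhost:1455/callback",  # pinned by the vendor; exact path is a placeholder
        "scope": "<vendor-specific scopes, see below>",
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "state": secrets.token_urlsafe(16),
    })

No client secret appears anywhere; the later token exchange sends the verifier instead.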

The paste flow

Because the vendor’s OAuth client is locked to a localhost redirect URI (it was designed for CLI tools), a web deployment can’t catch the redirect cleanly. So the flow works like this:

  1. Click Connect on the provider’s row in Settings.
  2. TeamWeb AI opens a wizard page with a consent-URL link and a paste box.
  3. Click the link — the vendor’s OAuth page opens in a new tab. Complete the sign-in.
  4. Your browser gets redirected to http://localhost:1455/... (or :53692/... for Anthropic). That page shows “This site can’t be reached” — that’s expected. The authorization code is in the URL bar.
  5. Copy the full URL from the address bar and paste it into the wizard’s textarea. Submit.
  6. TeamWeb AI exchanges the code server-side, persists the refresh token (encrypted), and shows the connected account on the provider’s Settings row (a parsing sketch follows below).

If the authorization code expires before you submit it (the window is 15 minutes), just hit Connect again.
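
Server-side, step 5’s paste only needs the code and state query parameters pulled out before the token exchange. A minimal sketch (the function name is hypothetical):

    from urllib.parse import parse_qs, urlparse


    def extract_auth_code(pasted_url: str) -> tuple[str, str]:
        """Extract the authorization code and state from a pasted redirect URL."""
        query = parse_qs(urlparse(pasted_url.strip()).query)
        try:
            return query["code"][0], query["state"][0]
        except KeyError:
            raise ValueError("no authorization code found; paste the full URL from the address bar")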

OpenAI (Codex flow)

  • Authorize URL: https://auth.openai.com/oauth/authorize
  • Token URL: https://auth.openai.com/oauth/token
  • Client ID: Codex CLI’s public client (app_EMoamEEZ73f0CkXaXp7hrann).
  • Scopes: openid profile email offline_access, plus the Codex-specific id_token_add_organizations and codex_cli_simplified_flow flags that produce a long-lived refresh token.
  • Account identifier: pulled from the chatgpt_account_id field nested under the access token’s https://api.openai.com/auth JWT claim.
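
Reading that claim doesn’t require verifying the token; the payload segment of the JWT is just base64url-encoded JSON. A sketch, assuming the nested claim layout described above:

    import base64
    import json


    def chatgpt_account_id(access_token: str) -> str:
        """Decode the JWT payload (no signature check) and read the account id."""
        payload_b64 = access_token.split(".")[1]
        payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
        claims = json.loads(base64.urlsafe_b64decode(payload_b64))
        return claims["https://api.openai.com/auth"]["chatgpt_account_id"]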

Usage is drawn from your ChatGPT Plus/Pro/Business subscription quota rather than API credits. API keys still work as an automatic fallback if OAuth fails or your quota runs out.

Under the hood, the OpenAI plugin calls OpenAI’s Responses API (client.responses.create) on both paths. The API-key path hits api.openai.com; the OAuth path points at chatgpt.com/backend-api/codex with the same Codex-identity headers Codex CLI sends. Chat Completions is not used.
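
With a recent openai Python SDK, the two paths differ only in how the client is constructed. This is a sketch rather than the plugin’s actual code; the default_headers entry is a placeholder for the Codex-identity headers mentioned above, whose exact names aren’t documented here.

    from openai import OpenAI

    # oauth_access_token / account_id / api_key come from the stored provider config.
    if oauth_access_token:
        # OAuth path: same Responses API, pointed at the Codex backend.
        client = OpenAI(
            api_key=oauth_access_token,
            base_url="https://chatgpt.com/backend-api/codex",
            default_headers={"<codex-identity-header>": account_id},  # placeholder name
        )
    else:
        # API-key path: the default api.openai.com endpoint.
        client = OpenAI(api_key=api_key)

    response = client.responses.create(model="gpt-4o", input="Hello")
    print(response.output_text)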

Anthropic (Claude Code flow)

  • Authorize URL: https://claude.ai/oauth/authorize
  • Token URL: https://platform.claude.com/v1/oauth/token
  • Client ID: Claude Code’s public client.
  • Scopes: org:create_api_key user:profile user:inference user:sessions:claude_code user:mcp_servers user:file_upload.
  • The state parameter round-trips the PKCE verifier itself: the Claude OAuth server expects state and verifier to be equal.
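
Reusing the pkce_pair helper from the generic sketch above, the Claude consent URL would be built roughly like this. The client id is elided here (it ships inside Claude Code) and the redirect path is a placeholder; only the state line differs from standard PKCE.

    from urllib.parse import urlencode

    CLAUDE_CODE_CLIENT_ID = "<Claude Code's published public client id>"

    verifier, challenge = pkce_pair()
    consent_url = "https://claude.ai/oauth/authorize?" + urlencode({
        "response_type": "code",
        "client_id": CLAUDE_CODE_CLIENT_ID,
        "redirect_uri": "http://localhost:53692/callback",  # port from the paste flow; path is a placeholder
        "scope": "org:create_api_key user:profile user:inference "
                 "user:sessions:claude_code user:mcp_servers user:file_upload",
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "state": verifier,  # the Claude-specific quirk: state must equal the verifier
    })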

Usage is drawn from your Claude Pro/Max subscription rather than API credits. Like OpenAI, API keys fall back automatically.

Using API keys instead

Neither provider requires OAuth. If you prefer pay-per-request usage:

  1. Settings → (provider row) → cog → paste the API key (OpenAI keys start with sk-, Anthropic with sk-ant-).
  2. Click Save. The provider is enabled automatically.
  3. On the Settings index, pick a model under the provider’s row by clicking its Main or Utility radio button.

Both paths (API key + OAuth) can be configured simultaneously. When both are active, OAuth is tried first and API key is the automatic fallback.
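
Conceptually, the credential selection is an ordered try. The names below are hypothetical, not TeamWeb AI’s internals:

    def call_with_fallback(provider, request):
        """Try the OAuth credential first; fall back to the API key."""
        if provider.oauth_refresh_token:
            try:
                return provider.call_via_oauth(request)
            except Exception:  # e.g. quota exhausted or token revoked
                pass
        if provider.api_key:
            return provider.call_via_api_key(request)
        raise RuntimeError("provider has no usable credential")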

Adding a new model

Curated model catalogs ship with each provider plugin, but when a new model drops you don’t need to wait for a code update. Under Settings → Custom Models, add a row with:

  • plugin_name — which provider offers the model
  • model_id — the provider-native model identifier
  • display_name, context_window, input_price_per_mtok, output_price_per_mtok

Custom entries extend the dropdowns in the assistant form and playground immediately. If you use a model_id that shadows a built-in catalog entry, a warning badge appears in the list — custom values win over built-ins with the same id.
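
The shadowing rule amounts to a dict merge keyed on model_id, with custom rows applied last. A minimal sketch:

    def merged_catalog(builtin: list[dict], custom: list[dict]) -> dict[str, dict]:
        """Merge the plugin's built-in catalog with admin-supplied custom rows.

        A custom entry whose model_id shadows a built-in replaces it.
        """
        catalog = {m["model_id"]: m for m in builtin}
        catalog.update({m["model_id"]: m for m in custom})  # custom wins
        return catalog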

Keeping the pricing and capability flags on custom entries accurate is your responsibility: LLMLog cost calculations read them directly.

Per-assistant overrides

When editing an assistant, the Model accordion exposes a two-tier selector:

  • Main tier — Provider dropdown + model dropdown (or free-text for Ollama). “Use deployment default” defers entirely to the admin-configured main default.
  • Utility tier — Same structure; collapsed under “Advanced” and usually left on the default.

Model choices are the merged catalog (plugin manifest + custom_models rows) for the selected provider. Ollama is a special case — its catalog is admin-supplied via plugin config or custom models, but the assistant form accepts free-text model names to match the pull-any-model Ollama workflow.

Cross-provider playground comparisons

The playground’s Model A / Model B dropdowns now span every enabled provider’s catalog. A comparison run can pair Claude Sonnet against GPT-4o side-by-side to compare quality, cost, and latency on the same prompt.

The encoded form in the dropdown is <config_id>:<model_id>, parsed by the agent loop so the playground run routes through the chosen provider config — including its OAuth tokens, if any — without touching the assistant’s own provider binding.
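
Because model ids can themselves contain colons (Ollama tags like llama3:8b, for example), the parse has to split on the first colon only. A sketch (the function name is illustrative):

    def parse_playground_selection(encoded: str) -> tuple[str, str]:
        """Split '<config_id>:<model_id>' on the first colon only, so model
        ids that contain ':' survive intact."""
        config_id, model_id = encoded.split(":", 1)
        return config_id, model_id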