# LLM Configuration

Device Agent uses the primary agent model to generate DeviceSpecs, understand chat requests, and
call device commands. Configure at least one reachable primary agent model before starting the quick
start.

## Console Configuration

After Device Agent starts, open `http://127.0.0.1:3000`, go to **Settings → Models**, and configure
the **Agent model** area:

| Field | Notes |
| --- | --- |
| Provider | Select the LLM provider. Common entries include `openai`, `anthropic`, `qwen`, `kimi`, `ollama`, and `openai-compatible`. |
| Model | Select or enter the model ID. Dropdown options come from the selected provider's available model list; custom entries such as `openai-compatible` allow manual model IDs. |
| API key | Enter the credential for the selected provider. |
| Base URL | Set this for custom OpenAI-compatible endpoints or local model services. `openai-compatible` usually requires it. `ollama` defaults to `http://localhost:11434/v1` when left empty. |
| Max iterations | Maximum reasoning and tool-call iterations for one request. The default is usually enough. |

After saving, check whether the page indicates that a restart is required. If it does, restart
Agent Gateway before continuing.

## LLM Providers

Available providers and models can change across versions and deployments. Use the list shown in
**Settings → Models** as the source of truth.

Common entries include:

- OpenAI (`openai`)
- Anthropic (`anthropic`)
- Qwen / DashScope (`qwen`)
- Kimi (`kimi`)
- Ollama (`ollama`)
- OpenAI-compatible endpoint (`openai-compatible`)
- Google / Google Vertex (`google`, `google-vertex`)
- DeepSeek (`deepseek`)
- OpenRouter (`openrouter`)
- xAI (`xai`)
- Groq (`groq`)
- Mistral (`mistral`)
- Amazon Bedrock (`amazon-bedrock`)
- Azure OpenAI Responses (`azure-openai-responses`)
- Vercel AI Gateway (`vercel-ai-gateway`)
- GitHub Copilot (`github-copilot`)
- OpenCode (`opencode`, `opencode-go`)
- Codex (`openai-codex`)

Common configuration patterns:

| Type | Example `LLM_PROVIDER` | Configuration |
| --- | --- | --- |
| OpenAI | `openai` | Set `OPENAI_API_KEY`. Fill **Base URL** only when using a proxy or custom compatible endpoint. |
| Anthropic | `anthropic` | Set `ANTHROPIC_API_KEY`. |
| Other providers | `google`, `deepseek`, `openrouter`, `xai`, `groq`, `mistral`, and others | Select the provider and model in the console, then fill **API key**. For startup config, `LLM_API_KEY` can be used as the generic credential. |
| OpenAI-compatible endpoint | `openai-compatible` | Set a custom model ID, `LLM_BASE_URL`, and `LLM_API_KEY`. `OPENAI_API_KEY` can also provide the credential. |
| Qwen / DashScope | `qwen` | Set `QWEN_API_KEY`. Defaults to `https://dashscope.aliyuncs.com/compatible-mode/v1`. |
| Kimi | `kimi` | Set `KIMI_API_KEY`. `OPENAI_API_KEY` can also be used as a compatible credential. Defaults to `https://api.moonshot.cn/v1`. |
| Kimi Coding | `kimi-coding` | Set `KIMI_API_KEY`. |
| Ollama | `ollama` | Local model service. Defaults to `http://localhost:11434/v1` and usually does not need an API key. |
| Codex | `openai-codex` | Uses a special auth flow. See Codex Auth below. |

Common provider / model pairs include `openai` / `gpt-5.5`, `anthropic` / `claude-sonnet-4-6`, `qwen` / `qwen-plus`, and `ollama` / `llama3.2`.

### Codex Auth

`openai-codex` does not use a regular OpenAI API key. The recommended path is to sign in with the
Codex CLI on the machine running Device Agent, then let Device Agent read the default
`~/.codex/auth.json`. To use a different auth file, set its path explicitly:

```bash
LLM_PROVIDER=openai-codex
LLM_MODEL=<codex-model-id>
OPENAI_CODEX_AUTH_FILE=/path/to/auth.json
```

Or provide an access token directly:

```bash
LLM_PROVIDER=openai-codex
LLM_MODEL=<codex-model-id>
OPENAI_CODEX_ACCESS_TOKEN=...
```

When `openai-codex` is selected in the console, the regular **API key** field acts as a Codex access
token override. The **Codex auth file** field points to `auth.json`. If neither field is set,
Device Agent tries the Codex CLI default auth file.
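
Before relying on the default path, it can help to confirm which auth file would be picked up and
whether it is readable. The following is a minimal sketch, not part of Device Agent:
`check_codex_auth` is a hypothetical helper name, and the sketch assumes only that
`OPENAI_CODEX_AUTH_FILE`, when set, replaces the default `~/.codex/auth.json` as described above.

```bash
# Hypothetical helper (not a Device Agent command): report whether the Codex
# auth file that would be used is present and readable.
check_codex_auth() {
  # Use the explicit override if set, otherwise the Codex CLI default path.
  codex_auth_file="${OPENAI_CODEX_AUTH_FILE:-$HOME/.codex/auth.json}"
  if [ -r "$codex_auth_file" ]; then
    echo "found: $codex_auth_file"
  else
    echo "missing: $codex_auth_file"
  fi
}
```

Run `check_codex_auth` as the same user that runs Device Agent. A `missing` result usually means
the Codex CLI sign-in has not been completed for that user, or the override path is wrong.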

## `.env` Configuration

Use `.env` for first startup, container deployment, or environments without UI access. Common
examples are below.

OpenAI:

```bash
LLM_PROVIDER=openai
LLM_MODEL=gpt-5.5
OPENAI_API_KEY=sk-...
```
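
Anthropic follows the same three-line shape. This fragment combines the `anthropic` provider, the
`claude-sonnet-4-6` example model ID, and the `ANTHROPIC_API_KEY` variable from the provider table
above. Treat the model ID as an example and use one listed in **Settings → Models**:

```bash
LLM_PROVIDER=anthropic
LLM_MODEL=claude-sonnet-4-6
ANTHROPIC_API_KEY=sk-...
```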

Qwen:

```bash
LLM_PROVIDER=qwen
LLM_MODEL=qwen-plus
QWEN_API_KEY=sk-...
```

OpenAI-compatible endpoint:

```bash
LLM_PROVIDER=openai-compatible
LLM_MODEL=your-model-id
LLM_BASE_URL=https://llm.example.com/v1
LLM_API_KEY=sk-...
```
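
To confirm that an OpenAI-compatible endpoint is reachable with the configured credential before
starting Device Agent, one option is to request its model list. This sketch assumes the endpoint
implements the OpenAI-style `GET /models` route under the base URL, which many but not all
compatible servers do. `check_llm_endpoint` is a hypothetical helper name, not a Device Agent
command.

```bash
# Hypothetical helper: request the model list from an OpenAI-compatible endpoint.
# Assumes LLM_BASE_URL and LLM_API_KEY are exported in the current shell.
check_llm_endpoint() {
  # Drop one trailing slash so the URL does not end up with "//models".
  base_url="${LLM_BASE_URL%/}"
  # -sS: quiet but still print errors; -f: exit non-zero on HTTP errors.
  curl -sS -f -H "Authorization: Bearer ${LLM_API_KEY}" "${base_url}/models"
}
```

A JSON model list suggests that the base URL and key are usable. An HTTP error such as 401 or 404
makes `curl -f` exit with a non-zero status.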

Ollama:

```bash
LLM_PROVIDER=ollama
LLM_MODEL=llama3.2
LLM_BASE_URL=http://localhost:11434/v1
```

Codex:

```bash
LLM_PROVIDER=openai-codex
LLM_MODEL=<codex-model-id>
OPENAI_CODEX_AUTH_FILE=~/.codex/auth.json
```

At startup, values in `.env` are written into `.device_agent/config.json` or override what is stored
there. To keep values saved from the console long-term, remove the matching `.env` variables or keep
both places aligned.
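
To see which model settings a `.env` file would apply at startup, you can list the relevant variable
names. This is a minimal sketch: `list_llm_env` is a hypothetical helper, and the prefixes it matches
are only those used in the examples on this page, so other provider-specific variables may be missed.

```bash
# Hypothetical helper: print the names (not the values) of model-related
# variables set in a .env file. Prefixes are taken from this page's examples.
list_llm_env() {
  env_file="${1:-.env}"
  grep -E '^(LLM_|VISION_|OPENAI_|ANTHROPIC_|QWEN_|KIMI_)[A-Z0-9_]*=' "$env_file" | cut -d= -f1
}
```

Run `list_llm_env .env` from the directory that contains the file. Any name it prints is a setting
to remove or keep aligned if you manage that value from the console.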

## Model Requirements

The quick start needs a model that can reliably generate structured output and perform tool calls.
Use a model with tool-call support and at least an 8K context window. For longer device descriptions
or conversations with more device state, use a 32K or larger context window.

## Vision Model

The vision model is used for camera images, screenshots, and multimodal input analysis. You can
skip it for a basic device-control quick start.

| Mode | Configuration | Notes |
| --- | --- | --- |
| Auto | `VISION_PROVIDER=auto` | Default mode. Reuses the primary agent model when it supports image input. No separate API key is needed. |
| DashScope | `VISION_PROVIDER=dashscope`, `VISION_MODEL=qwen-vl-plus` or `qwen-vl-max`, `VISION_API_KEY=...` | Uses a separate Qwen VL model. Use this when the primary agent model does not support images, or when vision should be configured separately. |

The console **Vision model** area controls enablement, provider, model, API key, and timeout.
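
As a `.env` fragment, the DashScope row of the table above would look roughly like this. The
variable names and model ID come from that row; the key value is a placeholder:

```bash
VISION_PROVIDER=dashscope
VISION_MODEL=qwen-vl-plus
VISION_API_KEY=...
```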

For all configuration sources and apply behavior, see [Configuration](../configuration.md).
