How to run an AI coding agent on Crusoe with OpenCode
Run OpenCode as a fully open-source AI coding agent on infrastructure you control — no prompt logging, no black boxes, and a dual-model workflow optimized for real codebases.

Your code leaves your machine, hits an API you don't control, and gets processed on infrastructure you can't audit. Most AI coding tools work this way. You don't notice the trade-off until it matters — and by then, you've built your entire workflow around it.
OpenCode paired with Crusoe Managed Inference eliminates that trade-off. Open-source coding agent, open-source models, running on infrastructure you chose. Crusoe doesn't log or store your prompts or responses, and all data is encrypted in transit. You can see every layer of the stack, with nothing left opaque.
Here’s how to set it up from scratch.
What is OpenCode?
OpenCode is an open-source, terminal-native AI coding agent built by the team at Anomaly (the same folks behind Terminal.shop). If you've used Claude Code or GitHub Copilot Workspace, it'll feel familiar. The difference: it's fully open-source, model-agnostic, and designed from the ground up by developers who live in the terminal.
Since launching in June 2025, OpenCode has grown fast. The GitHub repo has over 134,000 stars, the project hit 650,000 monthly active users within five months, and the Discord community has over 34,000 members. Companies like Cloudflare have adopted it internally for sensitive development environments.
How it works
OpenCode opens an interactive terminal session where you describe what you want to build. It reads your existing files, writes code, runs shell commands, and iterates until the task is done. What separates it from a chat wrapper is the workflows it enables on top of that loop.
Top features
Plan and build modes
OpenCode separates thinking from doing. In Plan mode, the agent operates as a read-only architect: it analyzes your codebase, understands requirements, and produces an implementation plan without modifying any files. Once you've approved the plan, switch to Build mode and the agent becomes a hands-on engineer, writing, modifying, and deleting files to execute the work. This two-step workflow prevents the "just start coding and hope for the best" pitfall of most AI coding tools.
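A typical session follows that same rhythm. The exchange below is purely illustrative (the task is made up, and keybindings can vary by version; recent builds toggle modes with Tab):
opencode
# Plan mode: the agent reads code and proposes an approach, touching nothing
> How would you add rate limiting to the upload endpoint?
# ...review the plan it produces, then switch to Build mode and execute it
> Looks good. Implement the plan, starting with the middleware.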
Model-agnostic architecture
OpenCode supports 75+ LLM providers, including Claude, GPT, Gemini, DeepSeek, Llama, Ollama (local models), and any OpenAI-compatible endpoint (like Crusoe Managed Inference). You can assign different models to different tasks, such as a fast general model for building and a reasoning model for planning. The precise Crusoe configuration is explained later in this guide.
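That assignment is just a few lines of config. As a preview, here's the shape of the mode block you'll see in the full config later (the provider and model names below are placeholders, not real IDs):
"mode": {
  "plan": { "model": "provider/reasoning-model" },
  "build": { "model": "provider/fast-model" }
}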
LSP and MCP integration
OpenCode integrates with 20+ Language Server Protocol servers, giving it access to real code intelligence like diagnostics, go-to-definition, symbol lookup, and hover info. It also supports the Model Context Protocol (MCP) for connecting to databases, APIs, and third-party services. This means the agent isn't just reading raw text; it understands your codebase with the same tooling your editor uses.
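Connecting an MCP server is config-driven too. The sketch below follows the opencode.json shape documented at the time of writing; the server name and command are placeholders, and the exact schema may shift between versions:
"mcp": {
  "my-database": {
    "type": "local",
    "command": ["npx", "-y", "my-mcp-server"],
    "enabled": true
  }
}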
Why it matters for this guide
The model-agnostic design is the key feature here. Point OpenCode at any OpenAI-compatible API and it works. That's what makes the pairing with Crusoe Managed Inference so clean: one config file, a bearer token, and you're running frontier open-source models on Crusoe's optimized inference engine with MemoryAlloy™ technology without changing how you work.
Why Crusoe Managed Inference?
Most managed inference providers give you an API and a black box behind it. You don't know where the GPUs are, how the traffic is routed, or what happens to your prompts after they leave your machine.
Crusoe is different. The infrastructure is purpose-built for AI workloads — not repurposed cloud VMs with GPUs bolted on. The inference runs on Crusoe's own data centers, powered by MemoryAlloy™ technology (a cluster-native KV cache fabric that delivers up to 9.9x faster time-to-first-token) and fastokens, an open-source Rust BPE tokenizer built in collaboration with NVIDIA that cuts TTFT by up to 40% on long-context workloads by eliminating tokenization as a bottleneck. No cold starts, no rate limit surprises, and full clarity on where your data is going.
If you're evaluating managed inference for production, especially for sensitive codebases or regulated environments, the combination of performance and transparency matters. This is the setup that makes OpenCode viable for teams that can't afford to treat infrastructure as an afterthought.
Crusoe Intelligence Foundry hosts a range of open-source models today; the full list, with context lengths and output limits, is at console.crusoecloud.com/foundry/models.
For this guide, we're pairing GPT-OSS 120B as the builder (best tool call support) with DeepSeek V3 as the planner (a fast, capable general-purpose coding model). Together they cover the full plan-and-build workflow without breaking a sweat.
Getting started
1. Install OpenCode
brew install anomalyco/tap/opencode
Or for the desktop app (currently in beta):
brew install --cask opencode-desktop
2. Get your Crusoe API key
Generate a bearer token at console.crusoecloud.com/foundry/api-keys. You'll need this to authenticate against the Crusoe Managed Inference endpoint.
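Before wiring it into OpenCode, you can sanity-check the token directly. The endpoint is OpenAI-compatible, so it should expose the standard model-listing route (this assumes Crusoe implements /v1/models, which OpenAI-compatible proxies generally do):
curl -s https://managed-inference-api-proxy.crusoecloud.com/v1/models \
  -H 'Authorization: Bearer <your-bearer-token>'
A JSON list of model IDs means the token works; a 401 means it doesn't.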
3. Add your auth token
Open ~/.local/share/opencode/auth.json and add:
{
  "crusoe": {
    "type": "api",
    "key": "<your-bearer-token>"
  }
}
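This file now holds a live credential, so it's worth restricting who can read it. This is plain POSIX hygiene, nothing OpenCode-specific:
chmod 600 ~/.local/share/opencode/auth.json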
4. Configure OpenCode to use Crusoe models
Open ~/.config/opencode/opencode.json. This registers Crusoe as a provider, defines available models with their context limits, and sets defaults for build and plan modes.
Tip: You can copy the full config below directly, or grab it from this GitHub Gist to avoid any formatting issues.
Full config:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "crusoe": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://managed-inference-api-proxy.crusoecloud.com/v1"
      },
      "models": {
        "openai/gpt-oss-120b": {
          "name": "GPT-OSS 120B (Best Builder)",
          "limit": { "context": 131072, "output": 8192 }
        },
        "deepseek-ai/DeepSeek-V3-0324": {
          "name": "DeepSeek V3 (Best Planner)",
          "limit": { "context": 163840, "output": 8192 }
        },
        "meta-llama/Llama-3.3-70B-Instruct": {
          "name": "Llama 3.3 70B",
          "limit": { "context": 131072, "output": 8192 }
        },
        "Qwen/Qwen3-235B-A22B-Instruct-2507": {
          "name": "Qwen 3 235B (Massive Context)",
          "limit": { "context": 262144, "output": 8192 }
        },
        "nvidia/NVIDIA-Nemotron-3-Super-120B-A12B": {
          "name": "Nemotron 3 Super 120B",
          "limit": { "context": 262144, "output": 8192 }
        },
        "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B": {
          "name": "Nemotron 3 Nano 30B",
          "limit": { "context": 262144, "output": 8192 }
        },
        "google/gemma-3-12b-it": {
          "name": "Gemma 3 12B (Small Model)",
          "limit": { "context": 131072, "output": 8192 }
        }
      }
    }
  },
  "model": "crusoe/openai/gpt-oss-120b",
  "small_model": "crusoe/google/gemma-3-12b-it",
  "mode": {
    "build": {
      "model": "crusoe/openai/gpt-oss-120b"
    },
    "plan": {
      "model": "crusoe/deepseek-ai/DeepSeek-V3-0324"
    }
  },
  "permission": {
    "bash": "allow",
    "read_file": "allow",
    "ls": "allow",
    "glob": "allow",
    "grep": "allow"
  }
}
A few things to note about this config:
The default model is set to GPT-OSS 120B since it currently has the most reliable tool call support in OpenCode. Build mode uses it for the same reason. Plan mode routes to DeepSeek V3, which is fast, capable at general-purpose coding, and well-suited for analyzing codebases and producing implementation plans. Gemma 3 12B is set as the small model for lightweight tasks that don't need heavy compute.
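One practical note: a missing comma or stray brace in opencode.json tends to surface as a confusing error at launch. If you have jq installed, a one-line syntax check catches it early:
jq . ~/.config/opencode/opencode.json > /dev/null && echo "config OK"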
Using OpenCode
Once configured, run:
opencode
This opens an interactive terminal session where you describe tasks and OpenCode reads your codebase, writes new features, fixes bugs, or runs shell commands as needed.
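Prompts are free-form. A first session might look something like this (the tasks and the file path are made-up examples):
> Read the project structure and summarize the architecture.
> Add retry logic to the HTTP client in src/http.ts and update its tests.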
Switching models mid-session
Hit Ctrl+P, type switch model, and pick whichever Crusoe model fits the task. Different models for different jobs.
What actually works in practice
After running this setup across multiple projects, a few patterns have proven themselves. Here's how to use the right model for each mode:
1. Start with DeepSeek V3 for Planning (Plan mode):
- Action: Initiate every task in "Plan mode" using DeepSeek V3.
- Benefit: This model excels at reading the codebase, mapping dependencies, and drafting a solid approach before code generation begins.
- Warning: Skipping this step is the most common cause of wasted AI coding sessions; the agent starts writing without enough context, and you spend the rest of the session correcting it.
2. Switch to GPT-OSS 120B for Building (Build mode):
- Action: Transition to GPT-OSS 120B the moment you are ready to execute the plan and build the solution.
- Benefit: While DeepSeek V3 is a strong generalist, GPT-OSS is optimized for reliable tool calls. It avoids the occasional tool call failures on complex, multi-file edits that can occur with DeepSeek V3.
3. Embrace the dual-model workflow:
- Principle: Resist the urge to use a single "best" model for all tasks.
- Rationale: DeepSeek V3 provides superior plans by focusing on structural reasoning without execution attempts, while GPT-OSS writes better code due to its optimization for tool use. Utilizing both models is the intended and most effective workflow.
Best practices for session management
1. Warm up large repositories:
- When: On codebases exceeding 50 files.
- Action: Execute a broad read command before asking for specific changes.
- Example: Prompt the model with, "read the project structure and summarize the architecture."
- Goal: Provide the model with essential context to prevent hallucinating file paths or overlooking critical dependencies.
2. Know when to reset:
- When: If the agent enters a fix-loop or begins producing increasingly inaccurate output.
- Action: Immediately terminate the session and start fresh with a clear prompt.
- Rationale: Although long context windows are helpful, they also accumulate "noise." A clean, new session will consistently outperform one that has become bloated and polluted.
Explore all available models
If you want to test a model directly before adding it to your config, you can query the API:
curl -X POST https://managed-inference-api-proxy.crusoecloud.com/v1/chat/completions \
-H 'Authorization: Bearer <your-bearer-token>' \
-H 'Content-Type: application/json' \
-d '{
"model": "openai/gpt-oss-120b",
"messages": [{"role": "user", "content": "Hello"}]
}'
You can swap the model field for any model available on Crusoe Foundry. The full list, including context lengths and output limits, is at console.crusoecloud.com/foundry/models.
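Responses come back in the standard OpenAI-compatible shape, so you can also pull out just the message text, which makes quick model comparisons easier (this assumes jq is installed; the DeepSeek ID comes from the config above):
curl -s -X POST https://managed-inference-api-proxy.crusoecloud.com/v1/chat/completions \
  -H 'Authorization: Bearer <your-bearer-token>' \
  -H 'Content-Type: application/json' \
  -d '{"model": "deepseek-ai/DeepSeek-V3-0324", "messages": [{"role": "user", "content": "Hello"}]}' \
  | jq -r '.choices[0].message.content'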
Troubleshooting
Authentication errors (401/403): Double-check that your bearer token in ~/.local/share/opencode/auth.json matches the one generated at console.crusoecloud.com/foundry/api-keys. Tokens can expire — regenerate if needed.
Homebrew tap fails to link: If brew install anomalyco/tap/opencode fails, try brew untap anomalyco/tap && brew tap anomalyco/tap to refresh the tap, then reinstall.
Model not found errors: Make sure the model ID in your ~/.config/opencode/opencode.json matches the exact string from the Crusoe Foundry model list. Model IDs are case-sensitive (e.g., deepseek-ai/DeepSeek-V3-0324, not deepseek-ai/deepseek-v3-0324).
Tool calls failing with a specific model: Not all models support tool calls equally. If you're seeing errors in Build mode, switch to GPT-OSS 120B — it has the most reliable tool call support on Crusoe Foundry. You can keep your current model for Plan mode where tool calls aren't needed.
Wrapping up
The standard pitch for AI coding tools has always been: give up control, get convenience. OpenCode on Crusoe flips that. You choose the models. You choose the infrastructure. You own the config file. The setup takes 10 minutes.
The stack is modular by design. Swap in better models as they ship. Move between Plan and Build as the work demands. Point the same agent at a different endpoint if your requirements change. Nothing is locked in. Get started here.



