AI API cost comparison: running local models vs API providers
A detailed breakdown of the real costs of running AI agents: comparing OpenAI, Anthropic, DeepSeek, OpenRouter, and local models via Ollama. Find the right balance for your usage.
The two-part cost of a personal AI agent
Running OpenClaw on a personal VPS has two distinct cost components: the server itself (fixed monthly fee) and the AI model inference (variable, based on usage). Most people focus on the server cost but the AI API spend is what actually scales with use.
Understanding both — and the tradeoffs between API providers and local models — lets you build an agent that fits your budget without compromising on capability.
Server cost: the fixed floor
OpenClaw itself is lightweight. A Hetzner CX22 at EUR 4.35/month handles personal use comfortably. For local model inference via Ollama, you need more:
| Scenario | Server | Monthly Cost |
|---|---|---|
| Agent only (API models) | 2 vCPU / 4 GB RAM | EUR 4–6/month |
| Agent + small local model | 4 vCPU / 8 GB RAM | EUR 12–20/month |
| Agent + capable local model | 8 vCPU / 32 GB RAM | EUR 40–80/month |
Running local models requires substantially more RAM than running the agent framework alone. A 7B parameter model at 4-bit quantization needs roughly 5 GB of RAM just to load.
API provider pricing (as of late 2025)
All API pricing is per million tokens. A "token" is roughly 0.75 words. A typical back-and-forth message is 200–800 tokens combined input and output.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Quality tier |
|---|---|---|---|
| GPT-4o | USD 5.00 | USD 15.00 | Flagship |
| GPT-4o mini | USD 0.15 | USD 0.60 | Fast/cheap |
| Claude 3.5 Sonnet | USD 3.00 | USD 15.00 | Flagship |
| Claude 3.5 Haiku | USD 0.80 | USD 4.00 | Fast/cheap |
| DeepSeek V3 | USD 0.27 | USD 1.10 | Strong / very cheap |
| Gemini 1.5 Flash | USD 0.075 | USD 0.30 | Fast/cheap |
For context: if you send 100 messages per day averaging 500 tokens each, you consume roughly 1.5 million tokens per month (accounting for context window accumulation). At DeepSeek V3 pricing, that's under USD 2/month.
Using OpenRouter for cost optimization
OpenRouter aggregates dozens of models under a single API key and billing account. This means you can:
- Switch models without reconfiguring your agent
- Use the cheapest model for simple tasks and route complex requests to stronger models
- Access models from Anthropic, OpenAI, Meta, and others through one invoice
OpenClaw supports OpenRouter natively. Set OPENROUTER_API_KEY in your configuration and specify models by their OpenRouter identifier (deepseek/deepseek-chat, anthropic/claude-3-5-sonnet, etc.).
Local models via Ollama: when it makes sense
Ollama lets you run open-weight models (Llama, Mistral, Gemma, etc.) directly on your server with no API calls. This means:
- Zero per-token cost — pay only for the server hardware
- Complete privacy — no data leaves your VPS at all
- No rate limits — inference speed is limited only by your hardware
The tradeoff: Local models require powerful hardware, and even the best open-weight models currently trail frontier models (GPT-4o, Claude 3.5) on complex reasoning tasks.
When local models are the right choice
- You process highly sensitive data and want zero API exposure
- You have high message volume where API costs accumulate significantly
- Your use cases are straightforward (summarization, simple Q&A) where a 7B model performs adequately
- You want to experiment with fine-tuned models tailored to your needs
Recommended setup by usage level
| Profile | Recommended Setup | Estimated Monthly Cost |
|---|---|---|
| Casual user (30 msg/day) | Hetzner CX22 + GPT-4o mini | EUR 5–7/month |
| Regular user (100 msg/day) | Hetzner CX22 + DeepSeek V3 via OpenRouter | EUR 6–10/month |
| Power user (300+ msg/day) | Hetzner CPX31 + mix of DeepSeek + Claude Haiku | EUR 15–25/month |
| Privacy-first user | Hetzner CPX41 + Ollama + Llama 3.1 8B | EUR 25–40/month |
The k-claw installer lets you configure your preferred model during setup and change it at any time from the dashboard — no reinstallation required.
Stop paying per-seat. Pay once, own your agent.
OpenClaw runs on a EUR 4/month VPS. Add your own API keys. k-claw gets it installed and configured in 15 minutes.
See pricingRelated articles
What is a personal AI agent? A complete guide for 2026
Learn what personal AI agents are, how they work, and why self-hosting gives you privacy, control, and unlimited customization compared to cloud-based assistants.
How to install OpenClaw on a VPS: step-by-step guide
A complete walkthrough for installing OpenClaw on your own VPS. From choosing a server to configuring AI models and messaging channels.