A detailed breakdown of the real costs of running AI agents: comparing OpenAI, Anthropic, DeepSeek, OpenRouter, and local models via Ollama. Find the right balance for your usage.

The two-part cost of a personal AI agent

Running OpenClaw on a personal VPS has two distinct cost components: the server itself (fixed monthly fee) and the AI model inference (variable, based on usage). Most people focus on the server cost but the AI API spend is what actually scales with use.

Understanding both — and the tradeoffs between API providers and local models — lets you build an agent that fits your budget without compromising on capability.

Server cost: the fixed floor

OpenClaw itself is lightweight. A Hetzner CX22 at EUR 4.35/month handles personal use comfortably. For local model inference via Ollama, you need more:

Scenario	Server	Monthly Cost
Agent only (API models)	2 vCPU / 4 GB RAM	EUR 4–6/month
Agent + small local model	4 vCPU / 8 GB RAM	EUR 12–20/month
Agent + capable local model	8 vCPU / 32 GB RAM	EUR 40–80/month

Running local models requires substantially more RAM than running the agent framework alone. A 7B parameter model at 4-bit quantization needs roughly 5 GB of RAM just to load.

API provider pricing (as of late 2025)

All API pricing is per million tokens. A "token" is roughly 0.75 words. A typical back-and-forth message is 200–800 tokens combined input and output.

Model	Input (per 1M tokens)	Output (per 1M tokens)	Quality tier
GPT-4o	USD 5.00	USD 15.00	Flagship
GPT-4o mini	USD 0.15	USD 0.60	Fast/cheap
Claude 3.5 Sonnet	USD 3.00	USD 15.00	Flagship
Claude 3.5 Haiku	USD 0.80	USD 4.00	Fast/cheap
DeepSeek V3	USD 0.27	USD 1.10	Strong / very cheap
Gemini 1.5 Flash	USD 0.075	USD 0.30	Fast/cheap

For context: if you send 100 messages per day averaging 500 tokens each, you consume roughly 1.5 million tokens per month (accounting for context window accumulation). At DeepSeek V3 pricing, that's under USD 2/month.

Using OpenRouter for cost optimization

OpenRouter aggregates dozens of models under a single API key and billing account. This means you can:

Switch models without reconfiguring your agent
Use the cheapest model for simple tasks and route complex requests to stronger models
Access models from Anthropic, OpenAI, Meta, and others through one invoice

OpenClaw supports OpenRouter natively. Set OPENROUTER_API_KEY in your configuration and specify models by their OpenRouter identifier (deepseek/deepseek-chat, anthropic/claude-3-5-sonnet, etc.).

Local models via Ollama: when it makes sense

Ollama lets you run open-weight models (Llama, Mistral, Gemma, etc.) directly on your server with no API calls. This means:

Zero per-token cost — pay only for the server hardware
Complete privacy — no data leaves your VPS at all
No rate limits — inference speed is limited only by your hardware

The tradeoff: Local models require powerful hardware, and even the best open-weight models currently trail frontier models (GPT-4o, Claude 3.5) on complex reasoning tasks.

When local models are the right choice

You process highly sensitive data and want zero API exposure
You have high message volume where API costs accumulate significantly
Your use cases are straightforward (summarization, simple Q&A) where a 7B model performs adequately
You want to experiment with fine-tuned models tailored to your needs

Recommended setup by usage level

Profile	Recommended Setup	Estimated Monthly Cost
Casual user (30 msg/day)	Hetzner CX22 + GPT-4o mini	EUR 5–7/month
Regular user (100 msg/day)	Hetzner CX22 + DeepSeek V3 via OpenRouter	EUR 6–10/month
Power user (300+ msg/day)	Hetzner CPX31 + mix of DeepSeek + Claude Haiku	EUR 15–25/month
Privacy-first user	Hetzner CPX41 + Ollama + Llama 3.1 8B	EUR 25–40/month

The k-claw installer lets you configure your preferred model during setup and change it at any time from the dashboard — no reinstallation required.

AI API cost comparison: running local models vs API providers

The two-part cost of a personal AI agent

Server cost: the fixed floor

API provider pricing (as of late 2025)

Using OpenRouter for cost optimization

Local models via Ollama: when it makes sense

When local models are the right choice

Recommended setup by usage level

Stop paying per-seat. Pay once, own your agent.

Related articles

What is a personal AI agent? A complete guide for 2026

How to install OpenClaw on a VPS: step-by-step guide