Back to blog

AI API cost comparison: running local models vs API providers

A detailed breakdown of the real costs of running AI agents: comparing OpenAI, Anthropic, DeepSeek, OpenRouter, and local models via Ollama. Find the right balance for your usage.

K-Claw Team·October 30, 2025·3 min read

The two-part cost of a personal AI agent

Running OpenClaw on a personal VPS has two distinct cost components: the server itself (fixed monthly fee) and the AI model inference (variable, based on usage). Most people focus on the server cost but the AI API spend is what actually scales with use.

Understanding both — and the tradeoffs between API providers and local models — lets you build an agent that fits your budget without compromising on capability.

Server cost: the fixed floor

OpenClaw itself is lightweight. A Hetzner CX22 at EUR 4.35/month handles personal use comfortably. For local model inference via Ollama, you need more:

ScenarioServerMonthly Cost
Agent only (API models)2 vCPU / 4 GB RAMEUR 4–6/month
Agent + small local model4 vCPU / 8 GB RAMEUR 12–20/month
Agent + capable local model8 vCPU / 32 GB RAMEUR 40–80/month

Running local models requires substantially more RAM than running the agent framework alone. A 7B parameter model at 4-bit quantization needs roughly 5 GB of RAM just to load.

API provider pricing (as of late 2025)

All API pricing is per million tokens. A "token" is roughly 0.75 words. A typical back-and-forth message is 200–800 tokens combined input and output.

ModelInput (per 1M tokens)Output (per 1M tokens)Quality tier
GPT-4oUSD 5.00USD 15.00Flagship
GPT-4o miniUSD 0.15USD 0.60Fast/cheap
Claude 3.5 SonnetUSD 3.00USD 15.00Flagship
Claude 3.5 HaikuUSD 0.80USD 4.00Fast/cheap
DeepSeek V3USD 0.27USD 1.10Strong / very cheap
Gemini 1.5 FlashUSD 0.075USD 0.30Fast/cheap

For context: if you send 100 messages per day averaging 500 tokens each, you consume roughly 1.5 million tokens per month (accounting for context window accumulation). At DeepSeek V3 pricing, that's under USD 2/month.

Using OpenRouter for cost optimization

OpenRouter aggregates dozens of models under a single API key and billing account. This means you can:

  • Switch models without reconfiguring your agent
  • Use the cheapest model for simple tasks and route complex requests to stronger models
  • Access models from Anthropic, OpenAI, Meta, and others through one invoice

OpenClaw supports OpenRouter natively. Set OPENROUTER_API_KEY in your configuration and specify models by their OpenRouter identifier (deepseek/deepseek-chat, anthropic/claude-3-5-sonnet, etc.).

Local models via Ollama: when it makes sense

Ollama lets you run open-weight models (Llama, Mistral, Gemma, etc.) directly on your server with no API calls. This means:

  • Zero per-token cost — pay only for the server hardware
  • Complete privacy — no data leaves your VPS at all
  • No rate limits — inference speed is limited only by your hardware

The tradeoff: Local models require powerful hardware, and even the best open-weight models currently trail frontier models (GPT-4o, Claude 3.5) on complex reasoning tasks.

When local models are the right choice

  • You process highly sensitive data and want zero API exposure
  • You have high message volume where API costs accumulate significantly
  • Your use cases are straightforward (summarization, simple Q&A) where a 7B model performs adequately
  • You want to experiment with fine-tuned models tailored to your needs

Recommended setup by usage level

ProfileRecommended SetupEstimated Monthly Cost
Casual user (30 msg/day)Hetzner CX22 + GPT-4o miniEUR 5–7/month
Regular user (100 msg/day)Hetzner CX22 + DeepSeek V3 via OpenRouterEUR 6–10/month
Power user (300+ msg/day)Hetzner CPX31 + mix of DeepSeek + Claude HaikuEUR 15–25/month
Privacy-first userHetzner CPX41 + Ollama + Llama 3.1 8BEUR 25–40/month

The k-claw installer lets you configure your preferred model during setup and change it at any time from the dashboard — no reinstallation required.

Stop paying per-seat. Pay once, own your agent.

OpenClaw runs on a EUR 4/month VPS. Add your own API keys. k-claw gets it installed and configured in 15 minutes.

See pricing