Memorandum
ShurIQ Monthly Recurring Costs — April 2026 Billing Update & OpenRouter Migration Plan
From Jonathan Dubowsky, ShurIQ System Operations To Limore Shur, Shur Creative Partners Date April 3, 2026 Re April recurring subscription costs, billing consolidation timeline, cost optimization status
Summary: I just received the last of this month's recurring charges. Total monthly subscriptions: $2,900 across 8 provider accounts. Invoice attached (INV-2026-0403). $1,600 is needed by Monday — certain systems are blocked until topped up. Remaining $1,300 by Wednesday. Accompanying this memo: a cost model spreadsheet and a routing diagram showing why single-billing isn't possible yet and the path to get there.

Current Monthly Cost Breakdown

ShurIQ runs on a multi-provider architecture. Each provider serves a distinct function — inference, storage, retrieval, deployment, memory. Here's what we're paying monthly:

Provider Function Monthly Category
Google Gemini Vector store, RAG, primary inference $1,500 System-critical
Cloudflare Compute, agent sandbox, site hosting $270 System-critical
Claude Max (Anthropic) Reasoning, code gen, orchestration $200 Premium tier
OpenAI Codex Code gen, function calling, redundancy $200 Premium tier
Kimi K2 Max Cost optimization testing $200 Optimization test
ManusAI Autonomous agent execution $200 Premium tier
Letta.ai Persistent agent memory $200 Premium tier
MiniMax Batch processing, nightly research $130 Premium tier
Total Monthly Subscriptions $2,900

These are base subscriptions only. Overage costs — incurred when usage exceeds subscription limits — are a separate, variable category. Overages typically hit during intensive operations: multilingual ontology builds, large-corpus RAG ingestion, or stack-ranking runs across 1,700+ companies.

Two Distinct Cost Categories

1. Recurring Subscriptions (Predictable)

The $2,900 above. These are fixed monthly charges that keep each provider's service active. They include usage allowances — a certain number of API calls, tokens, or compute hours per month.

2. Ad-Hoc Overages (Variable)

When a specific operation exceeds the subscription ceiling. Examples:

Overages have historically ranged from $700 to $2,500 per incident, each tied to a specific capability expansion. We're actively capping these — the system architecture now includes hard limits on per-provider spend to prevent surprise charges.

Why Single Billing Isn't Possible Yet

I know you want to put down one company card in one place and manage these costs the way you manage other business expenses. We're working toward that. Here's why it's not a flip-the-switch situation.

ShurIQ currently routes to 19 distinct provider endpoints. Each route is a direct API connection — ShurIQ sends requests to api.gemini, api.anthropic, api.openai, etc., each with its own authentication, message format, and billing account. The system is like an energy grid: some sources can't go offline even briefly without breaking downstream operations.

You can't just point everything at a new endpoint. Each provider has slightly different API contracts — message formats, response structures, token counting. Switching a route without testing can silently degrade output quality. We have to verify each one.
Current State — 19 Direct Routes
SHUR IQ System-Critical api.gemini $1,500/mo — RAG + Inference cloudflare workers $270/mo — Compute Premium Tier api.anthropic $200/mo api.openai $200/mo api.manus $200/mo api.letta $200/mo api.minimax $130/mo Optimization Testing api.kimi (Moonshot) $200/mo — benchmarking + 11 Additional Provider Routes Transcription, translation, specialized models... 19 separate accounts = 19 separate bills
Target State — OpenRouter Consolidated
SHUR IQ OPENROUTER Single API • Single Bill • One Card Inference Models (Routed by OR) Gemini Claude OpenAI Kimi MiniMax Manus + 13 more models Direct Bill (Not Inference) Cloudflare ($270) Letta Memory ($200) Gemini Storage ($varies) Result: 1 primary bill + 3 exceptions Corporate card on OpenRouter covers ~75% of spend 3 remaining direct bills are stable, predictable

OpenRouter Migration — The Path to Single Billing

OpenRouter is an API aggregator that consolidates multiple model providers behind a single billing account. One card, one invoice, one dashboard. That's the destination.

Timeline: 30-40 days from now (target: mid-May 2026)

Each of the 19 routes requires a structured migration process:

Per-Route Migration (7 steps):
  1. Snapshot — Capture current system state so we can revert if anything breaks
  2. Fork — Create an isolated branch for the route change
  3. Switch — Reroute from direct API to OpenRouter endpoint
  4. Adapt — Adjust message format and parameters for OpenRouter's API contract
  5. Benchmark — Run 5-6 performance tests comparing output quality against baseline
  6. Validate — Confirm no degradation in accuracy, speed, or cost efficiency
  7. Merge — If all tests pass, merge the route into production

That's 7 steps × 19 routes = 133 discrete operations. Each requires human oversight — this isn't something the AI can do unsupervised, because the whole point is ensuring the AI's own output quality isn't compromised by the switch.

Active Cost Optimization

We're not just paying these bills — we're actively testing ways to reduce them.

Model Substitution Testing

We're running head-to-head comparisons on identical workloads across providers. Current test: Kimi K2 vs. Claude for ontology design inference. Early results suggest Kimi can deliver comparable quality at 60-65% lower cost for certain operation types. If benchmarks hold, that's a potential reduction from ~$1,300/mo to ~$500/mo on ontology inference alone.

Nightly Auto-Research Optimization

The system runs ~100 research experiments per night at approximately $3 per experiment. We're testing each provider's cost-per-experiment ratio to find the optimal mix of quality and cost across research, RAG ingestion, and report generation.

Context: Our current total monthly spend (subscriptions + overages) is approximately $4,000-5,000. For the volume and quality of output ShurIQ produces, industry-comparable operations typically run $15,000-20,000/month. The architecture is already highly cost-efficient. The goal now is to maintain that efficiency while making billing manageable.

What Happens Next

This week Cost model spreadsheet operational — plugs into business model projections Apr 6 (Mon) First payment tranche ($1,600) needed to restore system-critical services Apr 8 (Wed) Second payment tranche ($1,300) to cover remaining subscriptions Apr 7-18 First batch of OpenRouter route migrations (targeting 5-6 routes) Mid-May Target completion for OpenRouter consolidation — single billing active Ongoing Model substitution benchmarks producing cost-reduction recommendations

The spreadsheet I'm building will give us a live model of monthly costs by provider, forward projections, and overage risk estimates. Once the OpenRouter migration completes, we'll have one dashboard and one card on file.