Skip to main content

Command Palette

Search for a command to run...

Our Azure Bill Spiked Overnight — Here's Exactly How We Cut It 60% (7 Real Fixes)

Updated
2 min read

We run a multi-tenant analytics SaaS on Azure (.NET 9 / ASP.NET Core, Angular 19, ~110k MAU, ~3,200 req/sec). The bill was a boring ~\(4,800/mo for a year, then it climbed to ~\)12,100/mo in three weeks with no big feature shipped. Here are the seven causes and the fixes that cut it 60%.

Where the money went (before to after)

  • Log / telemetry ingestion: $3,100 to $340/mo (a Debug level left on; 95 GB/day to 9 GB/day)

  • Compute / autoscale: $3,400 to $1,500/mo (scaled out but never in; 8 instances 24/7)

  • Orphaned resources: $1,900 to $0 (a forgotten load-test env + unattached disks/IPs)

  • Egress / bandwidth: $1,200 to $280/mo (large files, no CDN)

  • SQL + Redis: $1,600 to $900/mo (premium tiers + no reservations)

  • Storage transactions: $500 to $190/mo (millions of tiny blob ops, hot tier)

  • AI / LLM tokens: $400 to $190/mo (uncached RAG, oversized model)

  • Total: ~$12,100 to ~$4,700/mo (-60%, below the original baseline)

What it covers

  • Turning the lights on first: tagging, a cost dashboard, anomaly alerts

  • The diagnostic queries (az consumption + KQL log-ingestion-by-source)

  • Before/after config for each fix (sampling, autoscale Bicep, blob lifecycle, CDN, LLM caching)

  • A FinOps checklist so it doesn't recur

  • The honest limits of cost optimization

The lesson

Most runaway bills are a visibility problem, not an architecture one. Two config mistakes and some deleted waste did most of the work. Cost is a production signal: tag everything, dashboard it, and alert on spend-per-day the way you alert on p95. ~3 days to fix, paid back in ~2.

Full post with real config, queries, and dollar figures: https://prepstack.co.in/blog/azure-bill-spiked-how-we-cut-it-60-percent