What's the highest-ROI cloud cost reduction?

Right-sizing. Most teams over-provision by 2–5×. A right-sizing exercise + commitment plans (savings plans / reserved instances) typically cuts 30–50% of spend in the first quarter.

Should I move off the cloud to save money?

If your workload is steady-state and you have ops capacity, on-prem can be 3–5× cheaper. The catch is the ops capacity. For most teams, optimizing cloud is the right call before considering migration.

Cloud Cost Optimization in 2026 — The Tactics That Actually Work

Cloud bills compound. A team that started at $5k/month is at $50k/month in 18 months without doing anything wrong. The cost-optimization playbook isn’t exotic — it’s a few boring tactics applied consistently. This post is the working set.

Right-sizing (the biggest lever)

Most teams over-provision. Collect at least 2 weeks of utilization data and look at both the average and the peak. If CPU averages 15% and peaks stay well below capacity, you likely have 2–4× more than you need.

EC2 / Cloud Run / GKE: look at actual CPU + memory.
RDS: connections + CPU + memory.
ElastiCache: memory + CPU.

Tools: AWS Compute Optimizer, GCP Recommender, K8s VPA recommendations.

A right-sizing pass typically cuts 30%+ of compute spend.
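To make the "how much is too much" question concrete, here is a minimal sketch that turns two weeks of CPU samples into an over-provisioning factor. The function names, the 95th-percentile choice, and the 60% target are my assumptions for illustration, not values from any cloud provider's tooling.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def overprovision_factor(cpu_percent_samples, target_peak=60.0, pct=95):
    """How many times larger current capacity is than needed.

    Assumption: we want the chosen percentile of CPU utilization to sit
    at `target_peak` percent after resizing, leaving headroom for spikes.
    """
    observed = percentile(cpu_percent_samples, pct)
    if observed <= 0:
        raise ValueError("need at least one non-zero utilization sample")
    return target_peak / observed

# Hypothetical data: mostly 10% CPU with occasional 20% peaks.
samples = [10.0] * 90 + [20.0] * 10
print(overprovision_factor(samples))  # 3.0
```

A factor of 3.0 here means an instance roughly a third of the current size would still keep the p95 at 60%. Check memory the same way before resizing, since whichever resource is tighter sets the floor.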

Savings plans / reserved instances

For predictable steady-state workloads:

AWS Savings Plans (Compute): 1-year commit → roughly 27% off; 3-year → roughly 54% off. Exact rates vary by instance family, region, and payment option.
GCP Committed Use Discounts: similar savings.
Azure Reserved Instances: similar.

Commit only to your true baseline (the load you’ll have regardless). Leave the variable portion on on-demand.

For most production: 60–80% of compute on commitments, 20–40% on-demand for flexibility.
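One way to find the baseline is to take the lowest hourly usage over a representative period and commit to exactly that. The sketch below does this and reports the resulting coverage and blended savings. The function name, the "units" abstraction (instance-hours, vCPU-hours, whatever you bill on), and the use of the minimum as the baseline are assumptions for illustration.

```python
def commitment_plan(hourly_units, discount):
    """Split usage into a committed baseline and an on-demand remainder.

    hourly_units: usage per hour over the period (must sum to > 0).
    discount: commitment discount as a fraction, e.g. 0.27.
    Costs are expressed relative to an on-demand price of 1.0 per unit.
    """
    baseline = min(hourly_units)            # load present in every hour
    total = sum(hourly_units)
    committed = baseline * len(hourly_units)
    on_demand = total - committed
    blended = committed * (1 - discount) + on_demand
    return {
        "baseline_units": baseline,
        "commit_coverage": committed / total,
        "savings_vs_on_demand": 1 - blended / total,
    }

# Hypothetical four-hour window with a floor of 6 units.
plan = commitment_plan([6, 6, 8, 10], discount=0.27)
print(plan["baseline_units"], round(plan["commit_coverage"], 2))
```

In this toy window the floor covers 80% of usage. With real data, a low percentile instead of the strict minimum tolerates one-off dips at the price of a little commitment risk.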

Spot instances

For interruptible work (batch jobs, CI, async workers, training):

EC2 Spot: up to 90% off.
GCP Spot VMs / Preemptibles: similar.
K8s with Karpenter: provisions spot nodes for pending pods automatically.

Apps must handle SIGTERM (graceful shutdown). Otherwise interruptions cause failures.
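A minimal sketch of what "handle SIGTERM" means for a worker: the handler only sets a flag, and the loop finishes the job in hand, then stops and returns whatever is left so it can be re-queued. `run_worker`, `install_handlers`, and the job/process shapes are hypothetical names for illustration.

```python
import signal
import threading

# Set when the platform asks the process to stop (for example, a node drain
# triggered by a spot interruption).
shutdown_requested = threading.Event()

def _handle_sigterm(signum, frame):
    # Keep the handler tiny: flip a flag and let the main loop do the cleanup.
    shutdown_requested.set()

def install_handlers():
    """Register the SIGTERM handler. Call once, from the main thread."""
    signal.signal(signal.SIGTERM, _handle_sigterm)

def run_worker(jobs, process):
    """Process jobs until done or until shutdown is requested.

    Returns the jobs that were not started, so the caller can hand them
    back to the queue instead of losing them.
    """
    pending = list(jobs)
    while pending and not shutdown_requested.is_set():
        job = pending.pop(0)
        process(job)  # the in-flight job always runs to completion
    return pending
```

The design choice is that no job is abandoned halfway: work is either finished or returned untouched. Jobs longer than the interruption notice window still need checkpointing or idempotent retries.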

Egress reduction

Egress is the silent killer:

AWS: about $0.09/GB out at the first pricing tier. 100 TB/month ≈ $9k at that rate.
Cloudflare: $0/GB. Bunny: $0.005/GB.

Tactics:

CDN in front of everything user-facing (see Cloudflare Workers + D1).
Keep cross-region replication within any free allowances; otherwise minimize cross-region traffic.
VPC endpoints so internal traffic doesn’t egress through NAT.
Compress responses (gzip/br).
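To see how the CDN and compression tactics move the number, here is a rough estimator. It assumes a flat per-GB rate (real pricing is tiered), 1 TB = 1000 GB, and ignores whatever the CDN itself charges. The parameter names are mine.

```python
def monthly_egress_cost(tb_served, usd_per_gb, cdn_offload=0.0, compression_ratio=1.0):
    """Estimated origin egress cost per month in USD.

    tb_served: terabytes delivered to users, measured uncompressed.
    cdn_offload: fraction of traffic answered from CDN cache (0.0 to 1.0).
    compression_ratio: original size / compressed size (1.0 = no compression).
    """
    origin_gb = tb_served * 1000 * (1 - cdn_offload) / compression_ratio
    return origin_gb * usd_per_gb

baseline = monthly_egress_cost(100, 0.09)
with_cdn = monthly_egress_cost(100, 0.09, cdn_offload=0.9)
with_both = monthly_egress_cost(100, 0.09, cdn_offload=0.9, compression_ratio=3.0)
print(round(baseline), round(with_cdn), round(with_both))  # 9000 900 300
```

The 90% cache-hit rate and 3× compression ratio are illustrative. Plug in your own measured values before quoting savings to anyone.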

Idle resource cleanup

Tag everything; sweep:

Unattached EBS volumes.
Snapshots from 2021.
Test EC2s left running.
NAT Gateways for VPCs nobody uses anymore.
Idle ElastiCache / RDS dev clusters at night.

Tools: AWS Trusted Advisor, AWS Cost Explorer, Cloud Custodian, infracost in CI.

A monthly cleanup typically finds 5–15% waste.
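As one example of a sweep, the sketch below picks unattached EBS volumes out of an EC2 `describe_volumes` response and puts a rough price on them. The helper names and the $0.08 per GiB-month figure are assumptions; check the rate for your volume type and region.

```python
def unattached_volumes(describe_volumes_response):
    """Return the volumes whose state is 'available', i.e. not attached."""
    volumes = describe_volumes_response.get("Volumes", [])
    return [v for v in volumes if v.get("State") == "available"]

def monthly_waste_usd(volumes, usd_per_gib_month=0.08):
    """Rough storage cost of the given volumes (assumed flat price)."""
    return sum(v["Size"] for v in volumes) * usd_per_gib_month

# In a real sweep the response would come from boto3, for example:
#   response = boto3.client("ec2").describe_volumes()
# (paginate for large accounts). Here a hand-written sample stands in.
response = {
    "Volumes": [
        {"VolumeId": "vol-aaa", "Size": 100, "State": "available"},
        {"VolumeId": "vol-bbb", "Size": 50, "State": "in-use"},
    ]
}
orphans = unattached_volumes(response)
print([v["VolumeId"] for v in orphans], monthly_waste_usd(orphans))
```

Report first, delete later: snapshot or tag candidates and give owners a grace period before anything is removed.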

Database tier-down

A db.r6g.4xlarge for dev / staging is excessive. Right-size by environment:

Prod: real size.
Staging: 1/2 of prod.
Dev: smallest viable.

Use auto-shutdown for dev/staging during off-hours.
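The scheduling decision behind auto-shutdown is small enough to sketch. This assumes a weekday 08:00 to 20:00 working window in local time; the function names and hours are illustrative, and the start/stop calls themselves are left to whatever scheduler you already run.

```python
from datetime import datetime

def should_be_running(now, start_hour=8, stop_hour=20):
    """True on weekdays (Mon-Fri) between start_hour and stop_hour, local time."""
    is_weekday = now.weekday() < 5
    return is_weekday and start_hour <= now.hour < stop_hour

def weekly_uptime_fraction(start_hour=8, stop_hour=20):
    """Share of the week the environment stays up under this schedule."""
    return (stop_hour - start_hour) * 5 / (24 * 7)

print(should_be_running(datetime(2026, 1, 5, 10, 0)))   # Monday morning: True
print(should_be_running(datetime(2026, 1, 3, 10, 0)))   # Saturday: False
print(round(weekly_uptime_fraction(), 2))               # 0.36
```

Under these assumptions the environment runs about 36% of the week, so compute billed by the hour drops by roughly two thirds. Storage keeps billing while instances are stopped.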

CI/CD costs

Build minutes add up. Caching (see CI/CD Best Practices) is the biggest lever. With per-minute billing, a 10-minute build dropping to 2 minutes cuts CI cost by 80%.
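The arithmetic, spelled out as a sketch. The build count and per-minute rate are made-up inputs; only the ratio matters for the percentage.

```python
def ci_monthly_cost(builds_per_month, minutes_per_build, usd_per_minute):
    """Monthly CI spend under simple per-minute billing."""
    return builds_per_month * minutes_per_build * usd_per_minute

def saving_fraction(minutes_before, minutes_after):
    """Fraction of CI cost removed by shortening the build."""
    return 1 - minutes_after / minutes_before

# Hypothetical: 3,000 builds a month at an assumed $0.008 per minute.
before = ci_monthly_cost(3000, 10, 0.008)
after = ci_monthly_cost(3000, 2, 0.008)
print(round(before), round(after), saving_fraction(10, 2))  # 240 48 0.8
```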

Container right-sizing

Resource requests in K8s drive scheduling. Most teams set them once and forget.

Use VPA (Vertical Pod Autoscaler) recommendations.
Set requests to actual usage; limits higher to allow bursts.
Monitor and adjust.
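If you are not running VPA, the same idea can be approximated by hand: take a high percentile of observed usage as the request and a multiple of it as the limit. This is a sketch under my own assumptions (p95, 2× burst factor), not how VPA computes its recommendations.

```python
import math

def suggest_cpu_resources(usage_millicores, burst_factor=2.0, pct=95):
    """Suggest a CPU request/limit pair from observed usage samples.

    usage_millicores: non-empty list of per-interval CPU usage in millicores.
    Returns a dict shaped like a container's `resources` stanza.
    """
    ordered = sorted(usage_millicores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # nearest-rank percentile
    request = math.ceil(ordered[rank - 1])
    limit = math.ceil(request * burst_factor)
    return {
        "requests": {"cpu": f"{request}m"},
        "limits": {"cpu": f"{limit}m"},
    }

# Hypothetical samples: 10m, 20m, ... 200m.
print(suggest_cpu_resources(list(range(10, 210, 10))))
# {'requests': {'cpu': '190m'}, 'limits': {'cpu': '380m'}}
```

Memory deserves a more conservative percentile than CPU, since exceeding a memory limit kills the container rather than throttling it.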

For Kubernetes specifics.

Observability cost

Datadog / New Relic / Splunk bills can match cloud bills.

Sample traces (OTel). 10% sampling is plenty for most.
Log retention: hot 14 days, cold 90 days, then S3 archive.
High-cardinality metrics are expensive — drop unused dimensions.
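For intuition on what 10% trace sampling does, here is a dependency-free sketch of deterministic head sampling: hash the trace ID into [0, 1) and keep it if it falls under the rate. The OpenTelemetry SDKs ship their own ratio-based samplers; this is not that API, and `keep_trace` is a hypothetical name.

```python
import hashlib

def keep_trace(trace_id: str, sample_rate: float = 0.10) -> bool:
    """Decide whether to keep a trace, consistently for a given trace ID.

    Every service that sees the same trace_id reaches the same decision,
    so sampled traces stay complete across hops.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < sample_rate

kept = sum(keep_trace(f"trace-{i}") for i in range(10_000))
print(kept / 10_000)  # close to 0.10
```

Hashing rather than calling a random number generator is the design choice that matters: the decision is reproducible, so no service keeps spans for a trace that another service dropped.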

For self-hosted alternatives, see Observability 2.0.

LLM costs

For AI-heavy apps:

Prompt caching (up to ~90% off cached input tokens, depending on the provider).
Model routing (see LLM Routing in 2026).
Batching (about 50% off via batch APIs, for work that can wait).
Consider self-hosting once API spend passes ~$30k/month.
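A back-of-envelope model of how caching and batching change the bill. The prices, token volumes, and the assumption that the two discounts simply multiply are all illustrative; providers differ on how (and whether) they stack, and cache writes may carry a premium that this ignores.

```python
def llm_monthly_cost(input_mtok, output_mtok, usd_per_mtok_in, usd_per_mtok_out,
                     cached_fraction=0.0, cache_discount=0.90,
                     batch_fraction=0.0, batch_discount=0.50):
    """Estimated monthly LLM API spend in USD.

    input_mtok / output_mtok: millions of tokens per month.
    cached_fraction: share of input tokens served from the prompt cache.
    batch_fraction: share of all traffic sent through a batch API.
    """
    input_cost = input_mtok * usd_per_mtok_in * (1 - cached_fraction * cache_discount)
    output_cost = output_mtok * usd_per_mtok_out
    return (input_cost + output_cost) * (1 - batch_fraction * batch_discount)

# Hypothetical workload: 1,000M input and 100M output tokens at $3 / $15 per million.
base = llm_monthly_cost(1000, 100, 3.0, 15.0)
cached = llm_monthly_cost(1000, 100, 3.0, 15.0, cached_fraction=0.5)
print(round(base), round(cached))  # 4500 3150
```

Caching only touches the input side, so output-heavy workloads see less benefit from it and more from routing to a cheaper model.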

See LLM Cost Optimization in 2026.

FinOps culture

Tools alone don’t save money. Culture does:

Per-team cost dashboards. Visible. Discussed.
Architectural reviews include cost.
PR template includes “estimated cost impact.”
Quarterly waste-audit week.

Without culture, costs creep back. With it, savings stick.

Read this next

If you want my cost-audit checklist + monthly review template, it’s at rajpoot.dev.

Building something AI-, backend-, or data-heavy and want a second pair of eyes? I do consulting and freelance work; see my projects and ways to reach me at rajpoot.dev.
