Cost Optimization Patterns — Match Resources to Usage

Most cloud waste comes from over-provisioning and from choosing the wrong pricing model. Match instance type and commitment level to actual usage patterns.

When to use

  • Quarterly architecture and cost reviews
  • Before and after traffic changes (launches, sunsets)

Tradeoffs

  • Reserved/committed-use spend is wasted if usage drops before the commitment ends
  • Spot/preemptible instances can be interrupted at short notice, so use them only for fault-tolerant workloads
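
The commitment tradeoff above can be put into numbers. The following is a minimal sketch, assuming the ~40% reserved discount quoted in this document and assuming that a reservation is billed for every hour of the term whether or not the instance runs. The function names are illustrative and not part of any cloud SDK.

```python
def break_even_utilization(reserved_discount: float) -> float:
    """Fraction of the commitment term an instance must actually run
    for a reservation to cost no more than paying on-demand."""
    if not 0 <= reserved_discount < 1:
        raise ValueError("reserved_discount must be in [0, 1)")
    return 1.0 - reserved_discount


def cheaper_option(expected_utilization: float,
                   reserved_discount: float = 0.40) -> str:
    """Compare on-demand (pay only for hours used) against reserved
    (pay the discounted rate for every hour of the term).

    Costs are expressed relative to running on-demand for the full term.
    """
    on_demand_cost = expected_utilization      # pay only while running
    reserved_cost = 1.0 - reserved_discount    # paid whether used or not
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    print(f"Break-even utilization: {break_even_utilization(0.40):.0%}")
    print("Runs 90% of the term ->", cheaper_option(0.90))
    print("Runs 50% of the term ->", cheaper_option(0.50))
```

Under these assumptions, a 40% discount only pays off if the instance is expected to run for more than about 60% of the commitment term. Below that, on-demand is the cheaper choice.
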

Workload type                             | Recommended pricing model        | Why
Baseline stable load (web API, always-on) | 1-year Reserved / Committed use  | ~40% cheaper than on-demand
Variable/spiky load                       | On-demand + Auto-scaling         | Pay only for what's used
Fault-tolerant batch / data processing    | Spot / Preemptible               | ~70–80% cheaper; interruptions are acceptable
Dev/test (off during nights/weekends)     | On-demand + Auto-scaling to zero | No idle cost
Large ML training jobs                    | Spot with checkpointing          | Lowest GPU cost; checkpoints let interrupted jobs resume

Cost example:

  • 10 × m5.xlarge on-demand: $2,160/mo
  • 10 × m5.xlarge reserved (1yr): $1,296/mo (40% savings)
  • 7 × reserved + 3 × spot: ~$1,100/mo (assumes spot at roughly 70% off on-demand; actual savings depend on spot pricing and availability)
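
The arithmetic behind the cost example can be reproduced with a short sketch. The per-instance on-demand price is derived from the figures above ($2,160 / 10 instances) and the 40% reserved discount is the one stated. The 70% spot discount is an assumption taken from the low end of the ~70–80% range, and real spot prices fluctuate.

```python
# Per-instance monthly on-demand price implied by the example above.
ON_DEMAND_MONTHLY = 2160 / 10
RESERVED_DISCOUNT = 0.40   # 1-year reserved, as stated in the example
SPOT_DISCOUNT = 0.70       # assumption: low end of the ~70-80% range


def monthly_cost(on_demand: int = 0, reserved: int = 0, spot: int = 0) -> float:
    """Monthly cost in dollars for a mix of instances by pricing model."""
    reserved_price = ON_DEMAND_MONTHLY * (1 - RESERVED_DISCOUNT)
    spot_price = ON_DEMAND_MONTHLY * (1 - SPOT_DISCOUNT)
    return (on_demand * ON_DEMAND_MONTHLY
            + reserved * reserved_price
            + spot * spot_price)


if __name__ == "__main__":
    for label, mix in [
        ("10 on-demand", dict(on_demand=10)),
        ("10 reserved", dict(reserved=10)),
        ("7 reserved + 3 spot", dict(reserved=7, spot=3)),
    ]:
        print(f"{label}: ${monthly_cost(**mix):,.0f}/mo")
```

With a 70% spot discount the mixed fleet comes to about $1,102/mo, which matches the ~$1,100 figure. At an 80% discount it would be closer to $1,037/mo.
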

Gotcha: The most expensive cloud resource is the one you forgot to turn off. Tag every resource with an owner and an expiry date. Set budget alerts at 50%, 80%, and 100% of the monthly target.
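
Both checks in the gotcha can be sketched in a few lines. This is a minimal illustration, not a provider integration: the resource dictionaries, the `owner` and `expiry` tag keys, and the ISO date format are all assumptions made for the example. In practice the inventory and spend figures would come from the cloud provider's billing and tagging tools.

```python
from datetime import date

# Alert thresholds as fractions of the monthly budget target.
ALERT_THRESHOLDS = (0.50, 0.80, 1.00)


def crossed_thresholds(spend: float, budget: float) -> list[float]:
    """Return every alert threshold that month-to-date spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [t for t in ALERT_THRESHOLDS if ratio >= t]


def untracked_or_expired(resources: list[dict], today: date) -> list[str]:
    """Return IDs of resources missing an owner/expiry tag or past expiry.

    Each resource is a hypothetical dict: {"id": str, "tags": {str: str}},
    with the expiry tag given as an ISO date (YYYY-MM-DD).
    """
    flagged = []
    for res in resources:
        tags = res.get("tags", {})
        expiry = tags.get("expiry")
        if "owner" not in tags or expiry is None:
            flagged.append(res["id"])      # nobody is accountable for it
        elif date.fromisoformat(expiry) < today:
            flagged.append(res["id"])      # past its stated expiry date
    return flagged


if __name__ == "__main__":
    inventory = [  # made-up sample data
        {"id": "vm-1", "tags": {"owner": "team-a", "expiry": "2030-01-01"}},
        {"id": "vm-2", "tags": {"owner": "team-b", "expiry": "2020-01-01"}},
        {"id": "vm-3", "tags": {}},
    ]
    print(untracked_or_expired(inventory, date(2025, 1, 1)))
    print(crossed_thresholds(spend=850, budget=1000))
```

Running a check like this on a schedule and routing the flagged IDs to the tagged owner keeps forgotten resources from accumulating unnoticed.
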