Cost Optimization Patterns — Match Resources to Usage
Most cloud waste comes from over-provisioning and choosing the wrong pricing model. Match instance type and commitment level to actual usage patterns.
When to use
- Quarterly architecture and cost reviews
- Before and after traffic changes (launches, sunsets)
Tradeoffs
- Reserved/committed-use capacity is billed whether or not it is used, so the discount is wasted if usage drops before the commitment ends
- Spot/preemptible instances can be interrupted at short notice; use them only for fault-tolerant workloads
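The commitment tradeoff above can be made concrete with a break-even calculation. This is a minimal sketch, assuming the commitment is billed for the full term regardless of usage and cannot be resold or cancelled. The function names and the $216/month price are illustrative, not taken from any provider's API or price list.

```python
def break_even_months(discount: float, term_months: int = 12) -> float:
    """Months of actual use after which a commitment beats on-demand.

    Assumes the committed price is paid for the whole term, used or not.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return (1 - discount) * term_months


def cheaper_option(on_demand_monthly: float, discount: float,
                   months_used: float, term_months: int = 12) -> str:
    """Compare total spend for one instance over the commitment term."""
    committed_total = on_demand_monthly * (1 - discount) * term_months
    on_demand_total = on_demand_monthly * months_used
    return "committed" if committed_total < on_demand_total else "on-demand"


# With a 40% discount on a 1-year term, break-even is about 7.2 months of use.
print(round(break_even_months(0.40), 1))   # 7.2
print(cheaper_option(216.0, 0.40, 7))      # on-demand
print(cheaper_option(216.0, 0.40, 8))      # committed
```

Under these assumptions, a workload expected to run less than about 60% of the term is better left on on-demand pricing.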
| Workload type | Recommended pricing model | Why |
|---|---|---|
| Baseline stable load (web API, always-on) | 1-year Reserved / Committed use | ~40% cheaper than on-demand |
| Variable/spiky load | On-demand + Auto-scaling | Pay only for what's used |
| Fault-tolerant batch / data processing | Spot / Preemptible | ~70–80% cheaper than on-demand; interruptions are tolerable |
| Dev/test (off during nights/weekends) | On-demand + Auto-scaling to zero | No idle cost |
| Large ML training jobs | Spot with checkpointing | Lowest GPU cost; checkpoints limit the work lost to an interruption |
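The "Spot with checkpointing" row relies on the job being able to resume after an interruption. The sketch below illustrates the idea with a toy loop. The `run_job` function, its `interrupt_at` parameter (which simulates a spot reclaim), and the JSON checkpoint format are all hypothetical; a real training job would save model and optimizer state to durable storage instead.

```python
import json
import os
import tempfile


def run_job(total_steps, checkpoint_path, checkpoint_every=100, interrupt_at=None):
    """Run a toy job, resuming from checkpoint_path if it exists.

    interrupt_at simulates the instance being reclaimed at that step.
    Returns the last step reached in this run.
    """
    state = {"step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last saved progress

    while state["step"] < total_steps:
        if interrupt_at is not None and state["step"] == interrupt_at:
            return state["step"]  # progress since the last checkpoint is lost
        state["step"] += 1
        state["total"] += state["step"]  # stand-in for real work
        if state["step"] % checkpoint_every == 0 or state["step"] == total_steps:
            # Write to a temp file, then rename, so a crash mid-write
            # cannot leave a corrupt checkpoint behind.
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, checkpoint_path)
    return state["step"]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    run_job(1000, path, interrupt_at=250)  # interrupted; last checkpoint is step 200
    run_job(1000, path)                    # resumes from step 200 and finishes
```

The checkpoint interval is a tradeoff: shorter intervals lose less work per interruption but spend more time writing state.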
Cost example (illustrative monthly figures; actual prices vary by region, OS, and over time):
- 10 × m5.xlarge on-demand: $2,160/mo
- 10 × m5.xlarge reserved (1yr): $1,296/mo (40% savings)
- 7 × reserved + 3 × spot: ~$1,100/mo (assumes a ~70% spot discount; actual savings depend on spot price and availability)
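The arithmetic behind these figures can be reproduced with a short script. This is a sketch: the $216 per-instance on-demand price is simply what the $2,160 figure for 10 instances implies, and the 70% spot discount is an assumption taken from the low end of the range quoted in the table. Neither is a published price.

```python
ON_DEMAND_MONTHLY = 216.0   # per instance; implied by $2,160 / 10 (illustrative)
RESERVED_DISCOUNT = 0.40    # 1-year reserved, per the example above


def monthly_cost(on_demand=0, reserved=0, spot=0, spot_discount=0.70):
    """Monthly cost of a fleet mixing the three pricing models."""
    cost = on_demand * ON_DEMAND_MONTHLY
    cost += reserved * ON_DEMAND_MONTHLY * (1 - RESERVED_DISCOUNT)
    cost += spot * ON_DEMAND_MONTHLY * (1 - spot_discount)
    return round(cost, 2)


print(monthly_cost(on_demand=10))                         # 2160.0
print(monthly_cost(reserved=10))                          # 1296.0
print(monthly_cost(reserved=7, spot=3))                   # 1101.6
print(monthly_cost(reserved=7, spot=3, spot_discount=0.80))  # 1036.8
```

The mixed fleet lands between roughly $1,037 and $1,102 per month depending on the spot discount, which is where the ~$1,100 figure comes from.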
Gotcha: The most expensive cloud resource is the one you forgot to turn off. Tag every resource with an owner and an expiry date. Set budget alerts at 50%, 80%, and 100% of the monthly target.
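Both checks are easy to automate. The sketch below is provider-agnostic and assumes resources have already been exported as dictionaries with a `tags` mapping. The `owner` and `expiry` tag names, the ISO date format, and both function names are hypothetical conventions, not part of any cloud API.

```python
from datetime import date

ALERT_THRESHOLDS = (0.50, 0.80, 1.00)  # fractions of the monthly budget


def crossed_thresholds(spend, budget, thresholds=ALERT_THRESHOLDS):
    """Return the alert thresholds that current spend has reached."""
    return [t for t in thresholds if spend >= t * budget]


def flag_resources(resources, today):
    """Return (resource_id, reason) for untagged or expired resources."""
    flagged = []
    for res in resources:
        tags = res.get("tags", {})
        if "owner" not in tags or "expiry" not in tags:
            flagged.append((res["id"], "missing owner/expiry tag"))
        elif date.fromisoformat(tags["expiry"]) < today:
            flagged.append((res["id"], "expired"))
    return flagged


if __name__ == "__main__":
    inventory = [
        {"id": "vm-1", "tags": {"owner": "data-team", "expiry": "2030-01-01"}},
        {"id": "vm-2", "tags": {"owner": "web-team", "expiry": "2020-01-01"}},
        {"id": "vm-3", "tags": {}},
    ]
    print(crossed_thresholds(spend=850, budget=1000))
    print(flag_resources(inventory, today=date(2025, 1, 1)))
```

In practice the thresholds would feed the provider's native budget alerting, and the tag check would run on a schedule against a real inventory export.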