Cost Optimization Patterns — Match Resources to Usage

Most cloud waste comes from over-provisioning and from choosing the wrong pricing model. Match instance type and commitment level to actual usage patterns.

When to use

Quarterly architecture and cost reviews
Before and after traffic changes (launches, sunsets)

Tradeoffs

Reserved/committed-use capacity is billed whether or not it is used, so the discount turns into waste if usage drops before the commitment ends
Spot/preemptible instances can be interrupted on short notice; use them only for fault-tolerant workloads
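
The reserved-capacity tradeoff can be expressed as a break-even utilization. The following is a minimal sketch, assuming a flat discount and a reservation that is billed for every hour regardless of use; the function names and the 40% default are illustrative, not taken from any provider's API.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of hours an instance must run for a reservation to beat on-demand.

    Assumption: the reservation is billed for 100% of hours at (1 - discount)
    of the on-demand rate, while on-demand is billed only for hours used.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def cheaper_option(utilization: float, discount: float = 0.40) -> str:
    """Return 'reserved' or 'on-demand' for an expected utilization in 0..1."""
    if utilization > break_even_utilization(discount):
        return "reserved"
    return "on-demand"
```

Under these assumptions a 40% discount breaks even at 60% utilization: a workload expected to run 90% of the time favors a reservation, while one running 50% of the time is cheaper on-demand.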

Workload type	Recommended pricing model	Why
Baseline stable load (web API, always-on)	1-year Reserved / Committed use	~40% cheaper than on-demand
Variable/spiky load	On-demand + Auto-scaling	Pay only for what's used
Fault-tolerant batch / data processing	Spot / Preemptible	~70–80% cheaper, acceptable interruptions
Dev/test (off during nights/weekends)	On-demand + Auto-scaling to zero	No idle cost
Large ML training jobs	Spot with checkpointing	Lower GPU cost; checkpoints limit the work lost to an interruption
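
The table can be read as a small decision procedure. The sketch below is one possible encoding; the Workload fields, the rule order, and the returned strings are assumptions made for illustration, and real workloads often mix several categories.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    steady: bool                # roughly constant load around the clock
    fault_tolerant: bool        # can be interrupted and retried or resumed
    checkpointed: bool = False  # saves progress so an interrupted job can resume
    dev_test: bool = False      # idle during nights and weekends


def recommend_pricing(w: Workload) -> str:
    """Map workload traits to a pricing model from the table (hypothetical rule order)."""
    if w.dev_test:
        return "on-demand + auto-scaling to zero"
    if w.fault_tolerant:
        return "spot with checkpointing" if w.checkpointed else "spot / preemptible"
    if w.steady:
        return "1-year reserved / committed use"
    return "on-demand + auto-scaling"


print(recommend_pricing(Workload(steady=True, fault_tolerant=False)))
print(recommend_pricing(Workload(steady=False, fault_tolerant=True, checkpointed=True)))
```

Checking fault tolerance before steadiness is a deliberate choice in this sketch: a steady but interruptible batch pipeline is routed to spot, which carries the larger discount in the table.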

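Cost example (illustrative figures; they imply about $0.30 per instance-hour over a 720-hour month, and real rates vary by region):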

10 × m5.xlarge on-demand: $2,160/mo

10 × m5.xlarge reserved (1yr): $1,296/mo (40% savings)

7 × reserved + 3 × spot: ~$1,100/mo at a 70% spot discount (about 49% savings; actual savings depend on spot price and availability)
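
The arithmetic behind these figures can be reproduced with a few lines. This is a sketch under stated assumptions: $0.30 per instance-hour and a 720-hour month (the values that $2,160/mo for 10 instances implies), a 40% reserved discount, and a 70% spot discount. The function and parameter names are hypothetical.

```python
HOURS_PER_MONTH = 720      # assumption: 30-day month
ON_DEMAND_HOURLY = 0.30    # assumption: implied by $2,160 / (10 instances * 720 h)


def monthly_cost(on_demand=0, reserved=0, spot=0,
                 reserved_discount=0.40, spot_discount=0.70):
    """Monthly cost in dollars for a mix of instance counts per pricing model."""
    base = ON_DEMAND_HOURLY * HOURS_PER_MONTH  # one on-demand instance-month
    return (on_demand * base
            + reserved * base * (1 - reserved_discount)
            + spot * base * (1 - spot_discount))


for label, mix in [
    ("10 on-demand", dict(on_demand=10)),
    ("10 reserved", dict(reserved=10)),
    ("7 reserved + 3 spot", dict(reserved=7, spot=3)),
]:
    print(f"{label}: ${monthly_cost(**mix):,.0f}/mo")
```

With these inputs the three mixes come to about $2,160, $1,296, and $1,102 per month. Raising spot_discount to 0.80 brings the mixed fleet down to about $1,037, which shows how sensitive the last figure is to the spot price.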

Gotcha: The most expensive cloud resource is the one you forgot to turn off. Tag every resource with an owner and an expiry date, and set budget alerts at 50%, 80%, and 100% of the monthly target.
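
Both checks are simple to automate once tags and spend are available as plain data. The following is a minimal sketch: the tag keys "owner" and "expiry", the ISO date format, and the function names are assumptions, and fetching tags or spend from a provider is left out.

```python
from datetime import date

REQUIRED_TAGS = ("owner", "expiry")      # assumed tag keys
ALERT_THRESHOLDS = (0.50, 0.80, 1.00)    # fractions of the monthly budget


def audit_tags(resource_tags: dict, today: date) -> list:
    """Return a list of problems found in one resource's tags (empty if none)."""
    problems = [f"missing tag: {t}" for t in REQUIRED_TAGS if not resource_tags.get(t)]
    expiry = resource_tags.get("expiry")
    if expiry:
        try:
            # Assumes expiry is stored as an ISO date such as "2024-01-31".
            if date.fromisoformat(expiry) < today:
                problems.append("expired")
        except ValueError:
            problems.append("invalid expiry date")
    return problems


def crossed_thresholds(spend: float, budget: float) -> list:
    """Return every alert threshold that month-to-date spend has reached."""
    return [t for t in ALERT_THRESHOLDS if spend >= t * budget]
```

For example, a resource tagged with an owner and an expiry of 2024-01-01 is reported as expired when audited on 2024-06-01, and a spend of $850 against a $1,000 budget has crossed the 50% and 80% thresholds but not the 100% one.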