1. Compute

Compute is usually 50-70% of the bill and where the fastest wins live.
List instances below 20% average CPU over 14 days
Use CloudWatch / Cloud Monitoring metrics, not gut feel. Anything consistently under 20% is a rightsizing or termination candidate.

Find dev/staging workloads running nights and weekends
Non-production rarely needs 168 hours a week. Scheduled stop/start alone often cuts those environments 60-70%.

Check for orphaned instances with no traffic or owner
Cross-reference load balancer targets and recent SSH/exec activity against the inventory.

Review instance generations: are you on the current family?
Newer generations are typically 10-20% cheaper per unit of performance. Same workload, smaller bill.

Audit autoscaling minimums
A min of 6 when nightly traffic needs 2 is a standing tax. Check scaling history against the floor.

Identify workloads safe for Spot / Preemptible pricing
Batch jobs, CI runners, and stateless workers tolerate interruption and run 60-90% cheaper.
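
The first two checks above reduce to simple arithmetic once the metrics are exported. The following is a minimal sketch, assuming daily average CPU figures have already been pulled from CloudWatch or Cloud Monitoring into a dictionary. The instance names, the sample numbers, and the function names `rightsizing_candidates` and `schedule_savings` are all hypothetical and only illustrate the idea.

```python
from statistics import mean

# Hypothetical export: instance ID -> daily average CPU (%) for the last 14 days.
# Real values would come from CloudWatch or Cloud Monitoring.
DAILY_CPU = {
    "web-1": [41, 38, 45, 52, 47, 30, 28, 44, 39, 48, 51, 46, 29, 27],
    "batch-7": [6, 4, 5, 7, 3, 2, 2, 5, 6, 4, 5, 6, 2, 3],
    "staging-api": [12, 15, 11, 14, 13, 1, 1, 12, 16, 13, 12, 14, 1, 1],
}


def rightsizing_candidates(daily_cpu, threshold=20.0, min_days=14):
    """Return (instance_id, average_cpu) pairs whose average is under the threshold."""
    candidates = []
    for instance_id, samples in daily_cpu.items():
        if len(samples) < min_days:
            continue  # too little history to judge
        average = mean(samples)
        if average < threshold:
            candidates.append((instance_id, round(average, 1)))
    # Lowest utilization first: those are the strongest candidates.
    return sorted(candidates, key=lambda pair: pair[1])


def schedule_savings(hours_per_day, days_per_week):
    """Fraction of a 168-hour week saved by running only on a schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / 168


if __name__ == "__main__":
    for instance_id, average in rightsizing_candidates(DAILY_CPU):
        print(f"{instance_id}: {average}% average CPU")
    # 12 hours a day, weekdays only: 60 of 168 hours.
    print(f"Scheduled stop/start saves {schedule_savings(12, 5):.0%}")
```

With a 12-hour weekday schedule the environment runs 60 of 168 hours, a saving of roughly 64%, which sits inside the 60-70% range quoted above. An average can hide short bursts, so it is worth checking peak CPU before terminating anything the sketch flags.
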
2. Storage and data transfer

Find unattached volumes and orphaned snapshots
Detached disks and snapshot chains from deleted instances quietly accumulate for years.

Check object storage lifecycle policies exist
Logs and backups older than 30-90 days belong in infrequent-access or archive tiers, or should be deleted on a schedule.

Look for oversized provisioned IOPS
Compare provisioned vs consumed IOPS. Paying for 10,000 while using 400 is common.

Trace your top 3 data transfer line items
Cross-AZ chatter, NAT gateway processing, and egress to the internet are the usual suspects. Each has an architectural fix.

Check old backups against your actual retention policy
If the policy says 35 days and the bucket holds 3 years, that is pure waste.
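
Two of the storage checks above are plain comparisons once an inventory exists. Here is a minimal sketch, assuming the volume and backup inventories have been exported to lists of dictionaries. The field names, sample rows, the 50% `max_ratio` cutoff, and the function names are hypothetical and not taken from any provider's schema.

```python
from datetime import date, timedelta

# Hypothetical inventory rows; field names are illustrative, not a provider schema.
VOLUMES = [
    {"id": "vol-a", "provisioned_iops": 10_000, "peak_consumed_iops": 400},
    {"id": "vol-b", "provisioned_iops": 3_000, "peak_consumed_iops": 2_600},
]
BACKUPS = [
    {"key": "db/2021-03-01.dump", "created": date(2021, 3, 1)},
    {"key": "db/2024-05-20.dump", "created": date(2024, 5, 20)},
]


def oversized_iops(volumes, max_ratio=0.5):
    """Return (id, ratio) for volumes whose peak use is under max_ratio of provisioned.

    The 0.5 default is an assumed cutoff, not a recommendation from the checklist.
    """
    flagged = []
    for volume in volumes:
        ratio = volume["peak_consumed_iops"] / volume["provisioned_iops"]
        if ratio < max_ratio:
            flagged.append((volume["id"], ratio))
    return flagged


def backups_past_retention(backups, today, retention_days=35):
    """Return keys of backups created before the retention cutoff date."""
    cutoff = today - timedelta(days=retention_days)
    return [backup["key"] for backup in backups if backup["created"] < cutoff]


if __name__ == "__main__":
    for volume_id, ratio in oversized_iops(VOLUMES):
        print(f"{volume_id}: peak use is {ratio:.0%} of provisioned IOPS")
    for key in backups_past_retention(BACKUPS, today=date(2024, 6, 1)):
        print(f"past 35-day retention: {key}")
```

The 10,000 provisioned versus 400 consumed case from the checklist comes out at 4% utilization. Comparing against peak rather than average consumption is a deliberate choice here, since provisioned IOPS usually have to absorb the busiest period.
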
3. Commitments and pricing

Measure your steady-state baseline
The load that runs 24/7 regardless of traffic is what you can safely commit.

Compare baseline against current Savings Plans / Reserved coverage
Under-covered baseline = paying on-demand for predictable load. Over-covered = paying for capacity you shed.

Check commitment utilization, not just coverage
A 3-year reservation at 40% utilization is worse than on-demand.

Review licensing-included pricing vs BYOL where relevant
Databases and Windows workloads often have a cheaper licensing path.
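
Coverage and utilization answer different questions, and the gap between them is easy to see with numbers. The sketch below is a simplified model, assuming every figure is expressed in on-demand-equivalent dollars per hour. The function names and the 50% discount used in the example are assumptions for illustration; actual discounts vary by provider, term, and payment option.

```python
def coverage(committed_per_hour, baseline_per_hour):
    """Share of the steady-state baseline that commitments can cover (capped at 1)."""
    return min(committed_per_hour, baseline_per_hour) / baseline_per_hour


def utilization(used_per_hour, committed_per_hour):
    """Share of the purchased commitment that is actually consumed."""
    return used_per_hour / committed_per_hour


def cost_vs_on_demand(discount, utilization_rate):
    """Effective price of each used unit relative to on-demand (1.0 = same price).

    You pay (1 - discount) for every committed unit but only use
    utilization_rate of them, so the break-even point is
    utilization_rate == 1 - discount.
    """
    return (1 - discount) / utilization_rate


if __name__ == "__main__":
    # Under-covered: a 100/hour baseline with only 60/hour committed.
    print(f"coverage: {coverage(60, 100):.0%}")
    # Over-covered: 150/hour committed, only 100/hour ever used.
    print(f"utilization: {utilization(100, 150):.0%}")
    # Assumed 50% discount at 40% utilization.
    print(f"relative cost: {cost_vs_on_demand(0.5, 0.4):.2f}x on-demand")
```

Under the assumed 50% discount, a commitment used at 40% costs about 1.25 times the on-demand price per unit actually consumed, which is why utilization deserves a check alongside coverage. A deeper discount lowers the break-even utilization, so the conclusion depends on the real rate.
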
4. Visibility and governance

Savings evaporate without guardrails. These checks make the win permanent.
Enforce a tagging standard (owner, environment, service)
Untagged spend is unaccountable spend. Target at least 90% of cost tagged.

Set budget alerts at 80% and 100% per account/project
Alerts to a channel humans read, not an inbox nobody owns.

Detect anomalies automatically
Native anomaly detection (AWS Cost Anomaly Detection, GCP budget alerts) is free. Turn it on.

Review the bill monthly with an owner
One named person, 30 minutes a month, with the authority to act. This single habit prevents most regressions.

Delete or downsize the resources this checklist found, then re-baseline
Record the new baseline so next month’s drift is visible.
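
The tagging target and the budget thresholds above can both be checked from a billing export. This is a minimal sketch, assuming each line item carries a cost and a dictionary of tags. The row layout, sample values, and the function names `tagged_cost_share` and `crossed_thresholds` are hypothetical.

```python
REQUIRED_TAGS = ("owner", "environment", "service")

# Hypothetical billing export rows; real exports will use different field names.
LINE_ITEMS = [
    {"cost": 700.0, "tags": {"owner": "payments", "environment": "prod", "service": "api"}},
    {"cost": 200.0, "tags": {"owner": "data", "environment": "prod"}},
    {"cost": 100.0, "tags": {}},
]


def tagged_cost_share(line_items, required_tags=REQUIRED_TAGS):
    """Fraction of total cost whose line items carry every required tag."""
    total = sum(item["cost"] for item in line_items)
    if total == 0:
        return 1.0  # nothing spent, so nothing is untagged
    tagged = sum(
        item["cost"]
        for item in line_items
        if all(item["tags"].get(tag) for tag in required_tags)
    )
    return tagged / total


def crossed_thresholds(spend, budget, thresholds=(0.8, 1.0)):
    """Return the budget thresholds that current spend has reached."""
    return [level for level in thresholds if spend >= budget * level]


if __name__ == "__main__":
    share = tagged_cost_share(LINE_ITEMS)
    status = "meets" if share >= 0.9 else "misses"
    print(f"{share:.0%} of cost is fully tagged ({status} the 90% target)")
    print(f"thresholds crossed: {crossed_thresholds(850, 1000)}")
```

The sketch measures tagged share by cost rather than by resource count, matching the "90% of cost tagged" target: one expensive untagged database matters more than many small tagged buckets.
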