AWS Compute Cost Playbook: Savings Plans, RIs & Spot

Should you buy AWS Savings Plans, Reserved Instances, or Spot? A practical guide and decision framework to slash startup compute costs by 30%.

June 2, 2026 | Updated June 2, 2026 | 8 min read

TL;DR

You are paying full price for AWS compute, and there is no reason to.

Savings Plans: Commit to a $/hour spend for 1 or 3 years. Flexible across instance types. Best for predictable baseline workloads.
Reserved Instances (RIs): Commit to a specific instance type in a specific region. Deeper discounts than Savings Plans but zero flexibility. Best for databases and stateful services.
Spot Instances: Use AWS spare capacity at 60% to 90% off. Can be reclaimed with 2 minutes' notice. Best for stateless, fault-tolerant workloads.
The Verdict: The winning strategy is a layered approach — RIs for your database layer, Savings Plans for your baseline compute, and Spot for everything that can tolerate interruption.

Not sure which pricing model fits your infrastructure? I will analyze your AWS usage patterns and recommend the exact combination of Savings Plans, RIs, and Spot that maximizes your savings.

Book a Free 15-Minute Consultation — no commitments, just a clear recommendation.

If you are running your AWS workloads entirely on On-Demand pricing, you are paying the retail sticker price for every second of compute. AWS sets On-Demand as the default because it is the most profitable option — for them.

For a growing startup running a moderate production infrastructure (a few EKS worker nodes, an RDS cluster, some batch processing), the difference between On-Demand and optimized pricing is typically $3,000 to $15,000 per month.

That is not a rounding error. That is an engineer's salary. Or three months of marketing budget. Or the difference between extending your runway and a down round.

Let's break down every pricing model so you can make an informed decision.

Option 1: Savings Plans

What it is: You commit to spending a minimum $/hour on compute for 1 or 3 years. In exchange, AWS gives you a discount of roughly 20% to 60%, depending on the commitment length and payment option.

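Key advantage: Savings Plans are flexible. If you commit to $10/hour with a Compute Savings Plan, that commitment applies automatically across EC2, Fargate, and Lambda regardless of instance family, size, operating system, or region. (The cheaper EC2 Instance Savings Plan variant is tied to one instance family in one region.)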
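To make the mechanics concrete, here is a minimal sketch of how one hour of billing works under a commitment. The function name, the flat 30% discount, and the dollar figures are illustrative assumptions; real Savings Plan rates vary by instance type, region, and service.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Return the charge for one hour, in dollars.

    on_demand_usage: what the hour's usage would cost at On-Demand rates.
    commitment: the committed $/hour (always billed, used or not).
    discount: assumed flat Savings Plan discount, e.g. 0.30 for 30%.
    """
    # The commitment is spent at discounted rates, so it covers more
    # than its face value in On-Demand terms.
    covered = commitment / (1 - discount)
    # Usage beyond what the commitment covers is billed at On-Demand rates.
    overflow = max(0.0, on_demand_usage - covered)
    return commitment + overflow


print(round(hourly_bill(20.0, 10.0, 0.30), 2))  # busy hour: commitment plus overflow
print(round(hourly_bill(5.0, 10.0, 0.30), 2))   # quiet hour: commitment is still billed
```

In the second call the hour's usage would cost only $5.00 On-Demand, yet the bill is still $10.00. An under-used commitment is paid regardless, which is why sizing it matters.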

When to Use Savings Plans

Your compute usage is predictable and you have at least 6 months of AWS Cost Explorer data showing a consistent baseline.
You run a mix of instance types and expect to change them over the commitment period (e.g., migrating from x86 to Graviton).
You want to keep operational flexibility while still getting meaningful discounts.

When NOT to Use Savings Plans

You are a very early-stage startup (less than 3 months on AWS) and your usage patterns are still volatile.
You are planning a major re-architecture (e.g., moving from EC2 to serverless) within the commitment period.

The Decision Matrix

Commitment	Payment	Discount vs On-Demand
1-year, No Upfront	Monthly	~20-30%
1-year, All Upfront	One-time	~30-38%
3-year, No Upfront	Monthly	~35-48%
3-year, All Upfront	One-time	~50-60%

My recommendation for startups: Start with a 1-year, No Upfront Compute Savings Plan. It gives you meaningful savings with zero upfront cash risk and monthly payments that align with your cash flow.

Option 2: Reserved Instances (RIs)

What it is: You commit to a specific instance type (e.g., db.r6g.xlarge) in a specific region for 1 or 3 years. The discount is deeper than Savings Plans because you are giving AWS more certainty.

Key advantage: RIs offer the deepest commitment-based discounts: up to 72% off On-Demand for a 3-year All Upfront commitment. (Spot can be cheaper still, but comes with no capacity guarantee.)

Key risk: If you change instance types, move regions, or reduce usage during the commitment period, you are stuck paying for capacity you don't use. RIs are not flexible.
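A rough way to size that risk is the break-even point: how much of the term you must actually use the instance before the RI beats On-Demand. The sketch below is a simplification under stated assumptions: a flat discount, no resale of the RI, no time value of money, and On-Demand paid only for the months actually used. The function names are hypothetical.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of the term an RI must actually be used to beat On-Demand."""
    # RI cost over the term is (1 - discount) x the full On-Demand cost;
    # On-Demand cost is utilization x the full On-Demand cost.
    return 1 - discount


def ri_saves_money(discount: float, months_used: int, term_months: int) -> bool:
    """True if the RI was cheaper than paying On-Demand for the months used."""
    return months_used / term_months > break_even_utilization(discount)


print(ri_saves_money(0.72, months_used=6, term_months=36))
print(ri_saves_money(0.72, months_used=12, term_months=36))
```

Under these assumptions, a 72% discount breaks even at 28% of the term, so abandoning a 3-year RI after six months loses money while abandoning it after a year does not. Shallower discounts push the break-even point much later.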

When to Use Reserved Instances

Production databases (RDS/ElastiCache). Your database instance type rarely changes. You know you will run db.r6g.2xlarge for the next year. This is the ideal RI candidate.
Stable, long-running stateful services. If you have an Elasticsearch cluster or a Kafka broker that has been the same instance type for 6+ months, lock in the savings.

When NOT to Use Reserved Instances

Anything stateless. Web servers, API gateways, and worker processes should use Savings Plans or Spot instead.
Anything you might migrate. If there is any chance you will switch database engines (e.g., from RDS PostgreSQL to Aurora), an RI on the old instance type becomes a wasted commitment.

Option 3: Spot Instances

What it is: You use AWS's unused spare compute capacity at a 60% to 90% discount. The catch: AWS can reclaim your instance with a 2-minute warning whenever they need the capacity back.

Key advantage: The savings are staggering. A c6g.xlarge On-Demand costs ~$0.136/hour. The Spot price for the same instance is often $0.04 to $0.05/hour. That is a reduction of roughly 63% to 71%.

When to Use Spot

Kubernetes worker nodes running stateless microservices. Configure node affinity so that pods with graceful shutdown handlers land on Spot nodes.
CI/CD build runners. Your GitHub Actions or GitLab CI runners don't need to be persistent. Spin up a Spot instance, run the build, and terminate.
Batch processing and data pipelines. ETL jobs, ML training runs, and video transcoding are perfect Spot candidates.

When NOT to Use Spot

Production databases. Never put a database on Spot. The 2-minute reclamation window is not enough time to safely shut down a database with open transactions.
Singleton services without failover. If your architecture has a single instance handling critical traffic with no automatic failover, Spot reclamation causes an outage.

Surviving Spot Reclamation

The 2-minute warning is not optional — your application must handle it:

spot_handler.py

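import signal
import sys
import time

import requests

IMDS = "http://169.254.169.254/latest"


def imds_token():
    """Fetch a short-lived IMDSv2 session token."""
    response = requests.put(
        f"{IMDS}/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        timeout=2,
    )
    response.raise_for_status()
    return response.text


def check_spot_interruption():
    """Poll the EC2 metadata endpoint for interruption notices."""
    try:
        response = requests.get(
            f"{IMDS}/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": imds_token()},
            timeout=2,
        )
        # 404 means no interruption is scheduled; 200 carries the notice.
        if response.status_code == 200:
            return response.json()
    except requests.exceptions.RequestException:
        pass
    return None


def graceful_shutdown(signum=None, frame=None):
    """Drain work on SIGTERM or after an interruption notice."""
    print("Shutting down. Draining connections...")
    # Stop accepting new requests
    # Wait for in-flight requests to complete
    # Flush buffers and close database connections
    time.sleep(5)
    print("Graceful shutdown complete.")
    sys.exit(0)


signal.signal(signal.SIGTERM, graceful_shutdown)

if __name__ == "__main__":
    # AWS recommends checking for the notice every 5 seconds.
    while True:
        if check_spot_interruption():
            graceful_shutdown()
        time.sleep(5)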

In Kubernetes, the AWS Node Termination Handler automates this entirely. When a Spot reclamation notice arrives, it automatically cordons the node, drains running pods, and allows the Kubernetes scheduler to place them on another node — all within the 2-minute window.

The Layered Strategy (Putting It All Together)

The optimal approach is not choosing one pricing model. It is layering all three:

┌─────────────────────────────────────────────────────┐
│                 Your AWS Compute Bill                │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Layer 1: Reserved Instances (RIs)                  │
│  ├── RDS Production (db.r6g.2xlarge)    → 72% off   │
│  ├── ElastiCache (cache.r6g.xlarge)     → 65% off   │
│  └── Kafka brokers (m6g.2xlarge)        → 68% off   │
│                                                     │
│  Layer 2: Compute Savings Plan                      │
│  ├── Baseline EKS worker nodes          → 35% off   │
│  ├── Lambda functions                   → ≤17% off  │
│  └── Fargate tasks                      → 30% off   │
│                                                     │
│  Layer 3: Spot Instances                            │
│  ├── Stateless microservices            → 70% off   │
│  ├── CI/CD runners                      → 65% off   │
│  └── Batch processing                   → 80% off   │
│                                                     │
│  Layer 4: On-Demand (Last Resort)                   │
│  └── Only for unpredictable burst traffic           │
│                                                     │
└─────────────────────────────────────────────────────┘

Layer 1 (RIs) covers your fixed, predictable database and stateful layer. Maximum discount, zero flexibility needed.

Layer 2 (Savings Plans) covers your baseline compute that runs 24/7 but might change instance types as you optimize (e.g., migrating to Graviton). I cover the Graviton migration in my AWS Cost Optimization Guide.

Layer 3 (Spot) covers everything that can tolerate interruption. This is where the dramatic savings happen for compute-heavy workloads.

Layer 4 (On-Demand) is your safety net for traffic spikes and unexpected bursts. On a well-optimized stack, On-Demand should account for less than 10% of your total compute spend.
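To see how the layers combine into one number, the sketch below computes a blended discount for a spend mix. The monthly dollar amounts are hypothetical; the per-layer discounts are taken from the diagram above, which shows the upper end of each range, so treat the result as a best case rather than a forecast.

```python
# Hypothetical monthly On-Demand-equivalent spend per layer (assumed figures),
# each paired with a discount taken from the diagram above.
LAYERS = {
    "reserved_instances": (4000.0, 0.72),
    "savings_plan": (3000.0, 0.35),
    "spot": (2700.0, 0.70),
    "on_demand": (300.0, 0.00),
}


def blended_bill(layers):
    """Return (optimized bill, overall discount) for a spend mix."""
    full_price = sum(spend for spend, _ in layers.values())
    optimized = sum(spend * (1 - discount) for spend, discount in layers.values())
    return optimized, 1 - optimized / full_price


bill, discount = blended_bill(LAYERS)
print(f"${bill:,.0f} per month, {discount:.0%} below On-Demand")
```

Swapping in your own Cost Explorer figures and more conservative discounts (for example, 1-year No Upfront rates) gives a more realistic estimate.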

The Decision Framework

Before committing to any pricing model, ask these four questions:

Is the workload stable? (Same instance type for 6+ months → RI)
Is the workload predictable? (Consistent baseline usage → Savings Plan)
Is the workload stateless and fault-tolerant? (Can survive interruption → Spot)
Is the workload unpredictable and critical? (Burst traffic → On-Demand)
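The four questions can be sketched as a small decision function. This is an illustration, not a rule engine: the function and parameter names are hypothetical, and checking the questions in the order listed is an assumption. A stateless service that happens to sit on the same instance type for months may still be better served by Savings Plans or Spot.

```python
def recommend_pricing_model(
    stable_instance_type: bool,
    predictable_baseline: bool,
    interruption_tolerant: bool,
) -> str:
    """Map the four questions to a pricing model, checked in the order listed."""
    if stable_instance_type:
        return "Reserved Instance"
    if predictable_baseline:
        return "Savings Plan"
    if interruption_tolerant:
        return "Spot"
    # Nothing else fits: unpredictable, critical burst traffic.
    return "On-Demand"


print(recommend_pricing_model(True, True, False))    # e.g. a production database
print(recommend_pricing_model(False, False, True))   # e.g. a CI/CD runner
print(recommend_pricing_model(False, False, False))  # e.g. burst traffic
```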

The Operational Reality

Commitment risk. A 3-year All Upfront RI on a db.r5.4xlarge costs ~$30,000 upfront. If you migrate to Aurora Serverless six months in, that money is gone. Start with 1-year No Upfront commitments until your architecture stabilizes.
Savings Plan coverage gaps. AWS bases its Savings Plan recommendations on a lookback window of recent usage (7, 30, or 60 days). If your usage later drops (e.g., you optimize your infrastructure), the unused part of your commitment becomes wasted spend. Commit to about 80% of your baseline, not 100%.
Spot availability varies by region. Some instance types in popular regions (us-east-1) are reclaimed frequently. Diversify across multiple instance types and availability zones to reduce interruption rates.
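The 80% rule can be turned into a starting number with a few lines of arithmetic. The sketch below assumes you have exported hourly On-Demand-equivalent spend (for example, from Cost Explorer) into a list. Treating the quietest hour as the "baseline" and applying a single flat discount are both simplifying assumptions, and the function name is hypothetical.

```python
def suggest_commitment(
    hourly_on_demand_spend: list[float],
    discount: float,
    coverage: float = 0.8,
) -> float:
    """Suggest a $/hour Savings Plan commitment from hourly spend samples."""
    if not hourly_on_demand_spend:
        raise ValueError("need at least one hourly sample")
    # Assumption: "baseline" means the quietest hour in the sample.
    baseline = min(hourly_on_demand_spend)
    # Commitments are denominated in discounted dollars, so scale the
    # On-Demand baseline by (1 - discount) before applying the 80% rule.
    return round(baseline * (1 - discount) * coverage, 2)


spend = [12.0, 10.0, 15.5, 11.2, 10.4]
print(suggest_commitment(spend, discount=0.25))
```

A low percentile of a longer sample may be a better baseline than the single minimum if your data contains outages or maintenance windows.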

The Payoff

Most startups are paying the full On-Demand sticker price for 100% of their compute. By layering RIs, Savings Plans, and Spot, a typical mid-stage startup can reduce their monthly compute bill by 30% to 50%, with little or no change to application code beyond the graceful shutdown handling that Spot requires.

That is not optimization. That is free money sitting on the table, waiting for someone to pick it up.

