The Idle EC2 Reaper: Automating Cloud Resource Shutdowns in Non-Production Environments
Your staging and development environments run 24/7, but your engineers work about 8 hours a day. Learn how to build a Terraform-managed Lambda function that automatically shuts down tagged non-prod resources at night and on weekends, which can cut your non-production compute bill by roughly 60%.
TL;DR
On a typical weekday, your non-production AWS environments run about 16 hours with zero users. That is roughly 67% of each weekday's compute spent on nothing, before weekends are counted.
- The Stack: Terraform (infrastructure as code), AWS Lambda (serverless execution), EventBridge (cron scheduling), Python (Lambda runtime).
- The Verdict: A single afternoon of engineering work can save your company $2,000 to $5,000 per month by shutting down dev/staging resources outside business hours. The ROI is immediate and permanent.
Want this deployed to your AWS account today? I will set up the Reaper, configure the schedules around your team's working hours and time zones, and hand you the Terraform code.
Book a Free Consultation — one afternoon of work, months of savings.
The $3,000 Night Shift
Open your AWS Cost Explorer right now. Filter by environment tag — staging, dev, qa, whatever your team uses. Now look at the 24-hour usage graph.
You will see a flat line. No dips at night. No gaps on weekends. Your staging environment runs exactly the same at 3 AM on a Sunday as it does at 10 AM on a Monday.
Your engineers work roughly 8 hours a day, 5 days a week. That is 40 out of 168 hours per week. Your non-production infrastructure is idle for 76% of every week.
If your non-prod compute bill is $5,000/month, you are burning $3,800 every month on servers that are doing absolutely nothing.
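The arithmetic above is easy to re-run with your own numbers. This is a minimal sketch, assuming a 40-hour work week and a bill that scales linearly with running hours; the function name `idle_waste` is illustrative, not part of the Reaper.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def idle_waste(monthly_bill: float, active_hours_per_week: float = 40) -> tuple[float, float]:
    """Return (idle fraction of the week, dollars per month spent on idle hours)."""
    idle_fraction = (HOURS_PER_WEEK - active_hours_per_week) / HOURS_PER_WEEK
    return idle_fraction, monthly_bill * idle_fraction


fraction, wasted = idle_waste(5000)
print(f"idle: {fraction:.0%}, wasted: ${wasted:,.0f}/month")
```

For the $5,000 example this reports about 76% idle and roughly $3,810 per month, which the text rounds to $3,800.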
The Solution: The EC2 Reaper
The Reaper is a Lambda function, triggered by EventBridge cron rules, that runs on this schedule:
- 7 PM (local time), weekdays: Stops EC2 instances, scales ECS services to zero, and stops RDS instances whose Environment tag is dev, staging, or qa and that carry the opt-in tag Reaper: enabled.
- 8 AM (local time), weekdays: Starts everything back up before engineers arrive.
- Friday 7 PM: The regular evening stop. Because no start rule fires on Saturday or Sunday, everything stays down for the weekend.
- Monday 8 AM: Wakes everything up for the new work week.
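The schedule above can be expressed as a small pure function, which is handy for dry runs and unit tests before any cron rule goes live. This is a sketch under the assumption that `now` is already in the team's local time; `desired_state` and the two constants are hypothetical names, not part of reaper.py.

```python
from datetime import datetime, time

WORKDAY_START = time(8, 0)  # 8 AM local
WORKDAY_END = time(19, 0)   # 7 PM local


def desired_state(now: datetime) -> str:
    """Return "running" or "stopped" for a local-time datetime."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = WORKDAY_START <= now.time() < WORKDAY_END
    return "running" if is_weekday and in_hours else "stopped"


# 2024-01-05 was a Friday, 2024-01-06 a Saturday.
print(desired_state(datetime(2024, 1, 5, 18, 59)))  # running
print(desired_state(datetime(2024, 1, 5, 19, 0)))   # stopped
print(desired_state(datetime(2024, 1, 6, 10, 0)))   # stopped
```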
The Lambda Function
reaper.py
import boto3
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Clients are created once per execution environment and reused across invocations.
ec2 = boto3.client("ec2")
rds = boto3.client("rds")
ecs = boto3.client("ecs")

def get_tagged_instances(action: str) -> list[str]:
    """Find EC2 instances tagged for the Reaper."""
    filters = [
        {"Name": "tag:Environment", "Values": ["dev", "staging", "qa"]},
        {"Name": "tag:Reaper", "Values": ["enabled"]},
    ]
    if action == "stop":
        filters.append(
            {"Name": "instance-state-name", "Values": ["running"]}
        )
    elif action == "start":
        filters.append(
            {"Name": "instance-state-name", "Values": ["stopped"]}
        )

    # describe_instances is paginated; walk every page so large fleets are covered.
    paginator = ec2.get_paginator("describe_instances")
    instance_ids = []
    for page in paginator.paginate(Filters=filters):
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                instance_ids.append(instance["InstanceId"])
    return instance_ids

def stop_rds_instances():
    """Stop all non-prod RDS instances."""
    # Paginate so accounts with many databases are fully covered.
    paginator = rds.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for db in page["DBInstances"]:
            tags = rds.list_tags_for_resource(
                ResourceName=db["DBInstanceArn"]
            )["TagList"]
            tag_map = {t["Key"]: t["Value"] for t in tags}
            if (
                tag_map.get("Environment") in ("dev", "staging", "qa")
                and tag_map.get("Reaper") == "enabled"
                and db["DBInstanceStatus"] == "available"
            ):
                rds.stop_db_instance(
                    DBInstanceIdentifier=db["DBInstanceIdentifier"]
                )
                logger.info(f"Stopped RDS: {db['DBInstanceIdentifier']}")

def start_rds_instances():
    """Start all non-prod RDS instances."""
    # Paginate so accounts with many databases are fully covered.
    paginator = rds.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for db in page["DBInstances"]:
            tags = rds.list_tags_for_resource(
                ResourceName=db["DBInstanceArn"]
            )["TagList"]
            tag_map = {t["Key"]: t["Value"] for t in tags}
            if (
                tag_map.get("Environment") in ("dev", "staging", "qa")
                and tag_map.get("Reaper") == "enabled"
                and db["DBInstanceStatus"] == "stopped"
            ):
                rds.start_db_instance(
                    DBInstanceIdentifier=db["DBInstanceIdentifier"]
                )
                logger.info(f"Started RDS: {db['DBInstanceIdentifier']}")

def scale_ecs_services(desired_count: int):
    """Scale non-prod ECS services to the desired count."""
    clusters = [
        arn
        for page in ecs.get_paginator("list_clusters").paginate()
        for arn in page["clusterArns"]
    ]
    for cluster_arn in clusters:
        services = [
            arn
            for page in ecs.get_paginator("list_services").paginate(
                cluster=cluster_arn
            )
            for arn in page["serviceArns"]
        ]
        for service_arn in services:
            service_detail = ecs.describe_services(
                cluster=cluster_arn,
                services=[service_arn],
                include=["TAGS"],  # tags are omitted unless explicitly requested
            )["services"][0]
            # ECS returns lowercase "key"/"value", unlike EC2 and RDS.
            tags = {
                t["key"]: t["value"]
                for t in service_detail.get("tags", [])
            }
            if (
                tags.get("Environment") in ("dev", "staging", "qa")
                and tags.get("Reaper") == "enabled"
            ):
                ecs.update_service(
                    cluster=cluster_arn,
                    service=service_arn,
                    desiredCount=desired_count,
                )
                logger.info(
                    f"Scaled ECS {service_detail['serviceName']} "
                    f"to {desired_count}"
                )

def handler(event, context):
    action = event.get("action", "stop")

    if action == "stop":
        # Stop EC2
        instance_ids = get_tagged_instances("stop")
        if instance_ids:
            ec2.stop_instances(InstanceIds=instance_ids)
            logger.info(f"Stopped EC2 instances: {instance_ids}")
        # Stop RDS
        stop_rds_instances()
        # Scale ECS to 0
        scale_ecs_services(desired_count=0)
        return {
            "action": "stop",
            "ec2_stopped": len(instance_ids),
            "message": "Non-prod environments shut down for the night.",
        }

    elif action == "start":
        # Start EC2
        instance_ids = get_tagged_instances("start")
        if instance_ids:
            ec2.start_instances(InstanceIds=instance_ids)
            logger.info(f"Started EC2 instances: {instance_ids}")
        # Start RDS
        start_rds_instances()
        # Scale ECS back up (every managed service returns to a single task)
        scale_ecs_services(desired_count=1)
        return {
            "action": "start",
            "ec2_started": len(instance_ids),
            "message": "Non-prod environments are waking up.",
        }

    # Fail loudly on a typo in the EventBridge input instead of silently doing nothing.
    raise ValueError(f"Unknown action: {action!r}")

The Terraform Module
Deploy the Reaper as infrastructure-as-code so it is version controlled, reviewable, and reproducible:
reaper.tf
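# Package reaper.py into a zip for Lambda (assumes reaper.py sits next to this file).
data "archive_file" "reaper_zip" {
  type        = "zip"
  source_file = "${path.module}/reaper.py"
  output_path = "${path.module}/reaper.zip"
}

resource "aws_lambda_function" "reaper" {
  function_name    = "ec2-reaper"
  runtime          = "python3.12"
  handler          = "reaper.handler"
  timeout          = 300
  memory_size      = 256
  filename         = data.archive_file.reaper_zip.output_path
  source_code_hash = data.archive_file.reaper_zip.output_base64sha256
  role             = aws_iam_role.reaper_role.arn

  environment {
    variables = {
      # Not read by reaper.py as shown; var.slack_webhook_url must be declared elsewhere.
      SLACK_WEBHOOK_URL = var.slack_webhook_url
    }
  }
}
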
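# Stop everything at 7 PM UTC on weekdays.
# EventBridge rule cron expressions are evaluated in UTC, so shift the hour
# to match 7 PM in your team's local time.
resource "aws_cloudwatch_event_rule" "stop_weekday" {
  name                = "reaper-stop-weekday"
  schedule_expression = "cron(0 19 ? * MON-FRI *)"
}

resource "aws_cloudwatch_event_target" "stop_weekday" {
  rule  = aws_cloudwatch_event_rule.stop_weekday.name
  arn   = aws_lambda_function.reaper.arn
  input = jsonencode({ action = "stop" })
}

# Start everything at 8 AM UTC on weekdays
resource "aws_cloudwatch_event_rule" "start_weekday" {
  name                = "reaper-start-weekday"
  schedule_expression = "cron(0 8 ? * MON-FRI *)"
}

resource "aws_cloudwatch_event_target" "start_weekday" {
  rule  = aws_cloudwatch_event_rule.start_weekday.name
  arn   = aws_lambda_function.reaper.arn
  input = jsonencode({ action = "start" })
}

# Stop everything Friday night for the entire weekend.
# This overlaps with the MON-FRI stop rule above; the second stop is a no-op.
resource "aws_cloudwatch_event_rule" "stop_weekend" {
  name                = "reaper-stop-weekend"
  schedule_expression = "cron(0 19 ? * FRI *)"
}

resource "aws_cloudwatch_event_target" "stop_weekend" {
  rule  = aws_cloudwatch_event_rule.stop_weekend.name
  arn   = aws_lambda_function.reaper.arn
  input = jsonencode({ action = "stop" })
}

# EventBridge cannot invoke the function without a resource-based permission.
resource "aws_lambda_permission" "allow_eventbridge" {
  for_each = {
    stop_weekday  = aws_cloudwatch_event_rule.stop_weekday.arn
    start_weekday = aws_cloudwatch_event_rule.start_weekday.arn
    stop_weekend  = aws_cloudwatch_event_rule.stop_weekend.arn
  }

  statement_id  = "AllowEventBridge-${each.key}"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.reaper.function_name
  principal     = "events.amazonaws.com"
  source_arn    = each.value
}
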
# IAM role with least-privilege permissions
resource "aws_iam_role" "reaper_role" {
  name = "ec2-reaper-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "reaper_policy" {
  name = "ec2-reaper-policy"
  role = aws_iam_role.reaper_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # DescribeInstances does not support resource-level conditions,
        # so it needs its own statement without the tag condition.
        Effect   = "Allow"
        Action   = ["ec2:DescribeInstances"]
        Resource = "*"
      },
      {
        # Start/stop is only allowed on instances that opted in.
        Effect = "Allow"
        Action = [
          "ec2:StartInstances",
          "ec2:StopInstances",
        ]
        Resource = "*"
        Condition = {
          StringEquals = {
            "ec2:ResourceTag/Reaper" = "enabled"
          }
        }
      },
      {
        Effect = "Allow"
        Action = [
          "rds:DescribeDBInstances",
          "rds:ListTagsForResource",
          "rds:StartDBInstance",
          "rds:StopDBInstance",
        ]
        Resource = "*"
      },
      {
        Effect = "Allow"
        Action = [
          "ecs:ListClusters",
          "ecs:ListServices",
          "ecs:DescribeServices",
          "ecs:UpdateService",
        ]
        Resource = "*"
      },
      {
        # Only what the Lambda runtime needs to write its own logs.
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents",
        ]
        Resource = "arn:aws:logs:*:*:*"
      },
    ]
  })
}

Tagging Your Resources
The Reaper uses two tags to decide what to manage:
| Tag Key | Tag Value | Purpose |
|---|---|---|
| Environment | dev, staging, or qa | Identifies non-production resources |
| Reaper | enabled | Opt-in flag for automatic shutdown |
Any resource without both tags is completely ignored. This gives teams fine-grained control — if a specific staging database must stay running overnight for a nightly integration test, just remove the Reaper: enabled tag.
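The two-tag rule can be factored into one helper so that EC2, RDS, and ECS all apply the same test. The sketch below is an illustration, not the code from reaper.py: `normalize_tags` and `is_reaper_managed` are hypothetical names, and the helper accepts both the capitalized tag keys that EC2 and RDS return and the lowercase ones that ECS returns.

```python
NON_PROD_ENVIRONMENTS = {"dev", "staging", "qa"}


def normalize_tags(tag_list: list[dict]) -> dict[str, str]:
    """Flatten an AWS tag list into a plain dict.

    EC2 and RDS return {"Key": ..., "Value": ...}; ECS returns
    {"key": ..., "value": ...}, so both spellings are accepted.
    """
    tags = {}
    for tag in tag_list:
        key = tag.get("Key", tag.get("key"))
        value = tag.get("Value", tag.get("value"))
        if key is not None:
            tags[key] = value
    return tags


def is_reaper_managed(tag_list: list[dict]) -> bool:
    """True only when BOTH tags are present with the expected values."""
    tags = normalize_tags(tag_list)
    return (
        tags.get("Environment") in NON_PROD_ENVIRONMENTS
        and tags.get("Reaper") == "enabled"
    )


ec2_style = [{"Key": "Environment", "Value": "staging"}, {"Key": "Reaper", "Value": "enabled"}]
ecs_style = [{"key": "Environment", "value": "dev"}]  # opt-in tag removed
print(is_reaper_managed(ec2_style), is_reaper_managed(ecs_style))
```

Keeping the check in one place means that removing the Reaper tag has the same effect on every resource type.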
The Operational Reality
- RDS cold-start delay. Starting a stopped RDS instance takes 5 to 10 minutes. If your engineers arrive at 8 AM sharp and immediately hit the database, they will see connection errors. Set the "start" cron 15 minutes earlier than the earliest engineer's workday.
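- ECS task definition drift. When the Reaper scales an ECS service to 0 and back up, the service launches whatever task definition is current. If someone deployed a broken version at 6:55 PM, the Reaper will dutifully bring up the broken version at 8 AM, so verify that your latest deployment is healthy before end-of-day. Note also that the handler above restores every managed service to a desired count of 1, so services that normally run more tasks need their count raised again.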
- Spot interruption overlap. If your non-prod instances are Spot-backed (which they should be for additional savings), AWS may reclaim them before the Reaper stops them. This is harmless — the Reaper will simply find no running instances to stop.
- Multi-timezone teams. If your engineers span US, Europe, and Asia, set the "start" time to the earliest timezone and the "stop" time to the latest. Or deploy separate Reaper instances per region.
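For the RDS lead time mentioned in the first point, the earlier start can be computed rather than edited by hand. This is a small sketch assuming the hour passed in is already expressed in the rule's own timezone (UTC for EventBridge rules); `start_cron` is a hypothetical helper that only builds the expression string.

```python
def start_cron(hour: int, minute: int = 0, lead_minutes: int = 15, days: str = "MON-FRI") -> str:
    """Build an EventBridge cron expression `lead_minutes` before hour:minute."""
    total = hour * 60 + minute - lead_minutes
    if not 0 <= total < 24 * 60:
        # Crossing midnight would also shift the day-of-week field.
        raise ValueError("lead time crosses midnight; adjust the days by hand")
    # EventBridge field order: minutes hours day-of-month month day-of-week year
    return f"cron({total % 60} {total // 60} ? * {days} *)"


print(start_cron(8))  # cron(45 7 ? * MON-FRI *)
```

The result can be pasted into the schedule_expression of the start rule.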
The Math
| Metric | Before Reaper | After Reaper |
|---|---|---|
| Non-prod EC2 hours/week | 168 | 55 |
| Non-prod RDS hours/week | 168 | 55 |
| Weekly compute utilization | 100% | ~33% |
| Monthly non-prod savings | — | ~$2,000 to $5,000 |
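The table's running hours follow from the schedule itself. A rough sketch, assuming the 8 AM to 7 PM weekday window from the cron rules and a compute bill that scales linearly with running hours; stopped EC2 and RDS instances still incur storage charges, which this estimate ignores. Both function names are illustrative.

```python
HOURS_PER_WEEK = 168


def weekly_running_hours(start_hour: int = 8, stop_hour: int = 19, days: int = 5) -> int:
    """Hours per week that resources are up under a start/stop schedule."""
    return (stop_hour - start_hour) * days


def projected_savings(monthly_compute_bill: float, running_hours: float) -> float:
    """Monthly savings if cost is proportional to running hours."""
    return monthly_compute_bill * (1 - running_hours / HOURS_PER_WEEK)


hours = weekly_running_hours()
utilization = hours / HOURS_PER_WEEK
print(f"{hours} h/week, {utilization:.0%} utilization, "
      f"${projected_savings(5000, hours):,.0f} saved on a $5,000 bill")
```

With the default window this gives 55 hours per week, close to the table's figures, and a saving of about two thirds of the compute portion of the bill.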
The Reaper pays for itself in the first night it runs.
Your dev and staging environments are burning money while your engineers sleep. A single Terraform module can cut your non-production compute bill by 60% starting tonight.
I will deploy the Reaper to your AWS account and configure it for your team's schedule.
Stop paying Amazon to heat empty servers. Book a Free Consultation.
DevOps Engineer & Cloud Consultant | FinOps, GitOps & Kubernetes Expert
I build systems that run reliably, scale efficiently, and deploy intelligently. See how I can help your team.