What is the AWS Cost and Usage Report (CUR)?

The AWS Cost and Usage Report (CUR) is the most granular billing dataset AWS provides, breaking down every line item by service, usage type, resource, and hour. Unlike the summarized Billing console, the CUR is delivered as detailed files to S3, which lets you analyze exactly where your spend goes.

How do I enable the AWS Cost and Usage Report?

In the AWS Billing console, open Cost & Usage Reports, create a new report, and point it at an S3 bucket for delivery. Enabling hourly granularity and resource IDs gives you the detail needed to trace waste down to individual resources; the first files typically arrive within about 24 hours.

How do I parse the AWS CUR with Python?

Download the CUR files from S3 and load them with a Python library like pandas, then group and aggregate the line items by service, usage type, or resource to surface your biggest cost drivers. From there you can script recurring reports that flag idle compute, NAT gateway charges, orphaned snapshots, and data-transfer spikes automatically.

Why is my AWS bill higher than the Billing console shows?

The Billing console only shows summarized totals, so it hides the resource-level detail where waste accumulates. The CUR exposes the full picture, revealing costs like idle EC2 instances, per-gigabyte NAT gateway processing, forgotten EBS snapshots, and inter-AZ data transfer that the summary view rolls up and obscures.

What are the biggest hidden costs in the AWS CUR?

The four recurring culprits are idle or oversized compute, NAT gateway data-processing charges, a graveyard of orphaned EBS snapshots, and unexpected data-transfer fees between availability zones or regions. These rarely show up clearly in the console but stand out immediately once you aggregate the CUR by usage type.

How can I reduce my AWS bill by 30%?

Start by parsing the CUR to rank your largest and most wasteful line items, then eliminate idle compute, consolidate NAT gateways, delete orphaned snapshots, and rearchitect chatty cross-AZ traffic. Automating this analysis so it runs continuously is what turns a one-time cleanup into a durable 20 to 30 percent reduction.

Parse AWS Cost & Usage Reports (CUR) with Python

Learn to parse AWS Cost & Usage Reports programmatically using Python. Find hidden cloud waste and build an actionable cost-reduction playbook.

June 5, 2026 · Updated July 29, 2026 · 6 min read

TL;DR

The AWS billing console gives you totals. The Cost and Usage Report (CUR) gives you truth.

The Stack: Python (pandas + boto3), AWS CUR (Parquet format), S3 (data lake), Athena (optional SQL querying).
The Verdict: A 200-line Python script can surface more actionable cost savings than a $50,000/year FinOps SaaS tool. The data is already there — you just need to read it.

Want me to run this analysis on your AWS account? I will parse your CUR data, identify every dollar of waste, and hand you a prioritized action plan.

Book a Free 15-Minute AWS Bill Review — no commitments, just clarity.

Why the Billing Console Lies to You

Open your AWS Billing Dashboard right now. You will see a bar chart showing total spend per service: EC2, RDS, S3, Lambda. It looks clean. It looks manageable.

It is hiding everything that matters.

The billing console aggregates costs at the service level. It does not tell you:

Which specific EC2 instance is costing you $2,400/month and running at 4% CPU utilization.
Which S3 bucket is racking up $800/month in PUT request charges because a misconfigured application is writing millions of tiny objects.
Which NAT Gateway is silently processing 12 TB of internal traffic that should be routed through a free gateway VPC endpoint.

To find that data, you need the Cost and Usage Report (CUR).

Step 1: Enable CUR (5 Minutes)

If you haven't already, enable CUR delivery to S3 in Parquet format:

Go to AWS Billing Console → Cost & Usage Reports → Create Report.
Name it cur-detailed-report.
Select Include resource IDs (critical for identifying specific resources).
Choose Parquet format (columnar storage, dramatically faster to parse than CSV).
Set the S3 delivery bucket.

AWS typically delivers the first report within 24 hours and then refreshes it at least once a day. Each report is a set of Parquet files partitioned by billing period (year and month).
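If you would rather script the setup than click through the console, the same choices can be expressed as a report definition for the CUR API. This is a minimal sketch, not a drop-in tool: the bucket, prefix, and region are placeholders, the hourly time unit and the overwrite versioning are my assumptions, and the target bucket still needs a policy that lets the billing service write to it (the console wizard normally sets that up for you).

```python
def build_report_definition(bucket: str, prefix: str, region: str = "us-east-1") -> dict:
    """Return a CUR report definition mirroring the console choices above."""
    return {
        "ReportName": "cur-detailed-report",
        "TimeUnit": "HOURLY",                      # assumption: hourly granularity
        "Format": "Parquet",
        "Compression": "Parquet",                  # Parquet format requires Parquet compression
        "AdditionalSchemaElements": ["RESOURCES"],  # include resource IDs
        "S3Bucket": bucket,                        # placeholder bucket name
        "S3Prefix": prefix,                        # placeholder prefix
        "S3Region": region,
        "RefreshClosedReports": True,
        "ReportVersioning": "OVERWRITE_REPORT",    # assumption: keep one copy per period
    }


def create_report(definition: dict) -> None:
    """Submit the definition. Requires credentials with cur:PutReportDefinition."""
    import boto3  # imported lazily so building the definition needs no AWS SDK

    # The Cost and Usage Report API is served from us-east-1.
    cur = boto3.client("cur", region_name="us-east-1")
    cur.put_report_definition(ReportDefinition=definition)


definition = build_report_definition("your-cur-bucket", "cur-detailed-report")
print(definition["ReportName"], definition["Format"])
```

Calling create_report(definition) is left out of the example on purpose, since it changes billing configuration on a real account.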

Step 2: The Python Parser

Here is the core script that reads your CUR data, groups spending by resource, and identifies the top cost offenders:

cur_analyzer.py

import os
from datetime import datetime, timedelta

import boto3
import pandas as pd


def download_cur_files(bucket: str, prefix: str, local_dir: str = "/tmp/cur"):
    """Download the latest CUR Parquet files from S3."""
    # Create the target directory first; download_file fails if it is missing.
    os.makedirs(local_dir, exist_ok=True)
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")

    files = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                local_path = f"{local_dir}/{obj['Key'].split('/')[-1]}"
                s3.download_file(bucket, obj["Key"], local_path)
                files.append(local_path)
    return files
 
 
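def analyze_cur(files: list[str]) -> pd.DataFrame:
    """Parse CUR Parquet files and return top cost offenders."""
    frames = [pd.read_parquet(f) for f in files]
    df = pd.concat(frames, ignore_index=True)

    # Filter to the last 30 days of usage. CUR timestamps are UTC, so both
    # sides of the comparison are made timezone-aware to avoid a TypeError.
    df["line_item_usage_start_date"] = pd.to_datetime(
        df["line_item_usage_start_date"], utc=True
    )
    cutoff = pd.Timestamp.now(tz="UTC") - pd.Timedelta(days=30)
    df = df[df["line_item_usage_start_date"] >= cutoff].copy()

    # Line items without a resource ID (tax, some fees) would be silently
    # dropped by groupby, so give them an explicit label instead.
    df["line_item_resource_id"] = (
        df["line_item_resource_id"].fillna("").replace("", "(no resource ID)")
    )

    # Group by resource and sum the unblended cost
    cost_by_resource = (
        df.groupby(
            [
                "line_item_product_code",
                "line_item_resource_id",
                "line_item_usage_type",
            ]
        )["line_item_unblended_cost"]
        .sum()
        .reset_index()
        .sort_values("line_item_unblended_cost", ascending=False)
    )

    return cost_by_resource.head(50)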
 
 
def generate_report(top_costs: pd.DataFrame) -> str:
    """Generate a Markdown report of the top 50 cost offenders."""
    lines = ["# AWS Cost Offender Report", ""]
    lines.append(f"Generated: {datetime.now().isoformat()}")
    lines.append("")
    lines.append(
        "| Service | Resource ID | Usage Type | 30-Day Cost |"
    )
    lines.append(
        "| :------ | :---------- | :--------- | ----------: |"
    )
 
    for _, row in top_costs.iterrows():
        cost = f"${row['line_item_unblended_cost']:,.2f}"
        lines.append(
            f"| {row['line_item_product_code']} "
            f"| `{row['line_item_resource_id'][:40]}` "
            f"| {row['line_item_usage_type']} "
            f"| {cost} |"
        )
 
    return "\n".join(lines)
 
 
if __name__ == "__main__":
    BUCKET = "your-cur-bucket"
    PREFIX = "cur-detailed-report/cur-detailed-report/year=2026/month=5"
 
    files = download_cur_files(BUCKET, PREFIX)
    top_costs = analyze_cur(files)
    report = generate_report(top_costs)
 
    with open("cost_report.md", "w") as f:
        f.write(report)
 
    print(report)
    print(f"\nTotal 30-day spend in top 50 resources: "
          f"${top_costs['line_item_unblended_cost'].sum():,.2f}")

Run it, and you will get a Markdown table showing exactly which resources are draining your budget. No dashboards. No SaaS subscriptions. Just raw data and truth.

Step 3: What to Look For

Once you have the output, here are the patterns that consistently reveal the biggest savings:

The Idle Compute Monster

If you see an i-0abc123... EC2 instance costing $1,800/month but your CloudWatch metrics show 3% average CPU, that instance is almost certainly over-provisioned. Check memory and network utilization as well (the CUR carries no utilization data), then downsize it or migrate the workload to a Spot-backed Kubernetes pod.
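Because the CUR has cost but no utilization, spotting idle compute means joining the two. Here is a rough sketch, assuming you have already exported average CPU per instance from CloudWatch into a DataFrame. The avg_cpu_percent column name and both thresholds are hypothetical, and the sample rows are made up for illustration.

```python
import pandas as pd


def flag_idle_instances(
    costs: pd.DataFrame,
    cpu: pd.DataFrame,
    min_cost: float = 500.0,   # hypothetical threshold: ignore cheap instances
    max_cpu: float = 10.0,     # hypothetical threshold: "idle" below 10% CPU
) -> pd.DataFrame:
    """Join per-instance cost with average CPU and keep expensive, idle rows."""
    merged = costs.merge(cpu, on="line_item_resource_id", how="inner")
    idle = merged[
        (merged["line_item_unblended_cost"] >= min_cost)
        & (merged["avg_cpu_percent"] <= max_cpu)
    ]
    return idle.sort_values("line_item_unblended_cost", ascending=False)


# Made-up sample data: 30-day cost from the CUR, average CPU from CloudWatch.
costs = pd.DataFrame({
    "line_item_resource_id": ["i-0aaa", "i-0bbb", "i-0ccc"],
    "line_item_unblended_cost": [1800.0, 950.0, 40.0],
})
cpu = pd.DataFrame({
    "line_item_resource_id": ["i-0aaa", "i-0bbb", "i-0ccc"],
    "avg_cpu_percent": [3.0, 62.0, 1.5],
})

print(flag_idle_instances(costs, cpu))
```

Only i-0aaa survives the filter here: i-0bbb is busy, and i-0ccc is idle but too cheap to be worth chasing first.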

The NAT Gateway Tax

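Look for usage types ending in NatGateway-Bytes (they often carry a region prefix such as USW2-). If you see $5,000+ in NAT Gateway data processing, your private subnets are probably sending traffic bound for AWS services (S3, ECR, DynamoDB) through the NAT Gateway, which bills every gigabyte it processes. Deploy VPC endpoints and watch that line item shrink: gateway endpoints for S3 and DynamoDB are free, while ECR needs interface endpoints, which carry their own hourly and per-GB charges but typically at a lower per-GB rate than NAT processing. I cover this in detail in my FinOps Engineering Guide.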

The Snapshot Graveyard

EBS snapshots accumulate silently. Old AMIs that nobody uses anymore still retain their snapshots, costing $0.05/GB/month. Filter your CUR data for EBS:SnapshotUsage and you will likely find hundreds of dollars in zombie storage.

The Data Transfer Surprise

Cross-region and cross-AZ data transfer charges are invisible in the billing console but glaringly obvious in CUR data. If your microservices are chatting across availability zones, you are billed $0.01/GB on both the sending and the receiving side, so every gigabyte that crosses an AZ boundary costs $0.02.
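The usage-type patterns in this step can be rolled into one small classifier that buckets line items before summing them. Treat this as a sketch under assumptions: it matches substrings because usage types often carry a region prefix, DataTransfer-Regional-Bytes is, to my understanding, the usage type for intra-region (cross-AZ) transfer, and cross-region transfer uses different usage types that this example does not cover. The sample rows are invented.

```python
import pandas as pd

# Substrings of line_item_usage_type mapped to the waste patterns above.
WASTE_PATTERNS = {
    "NatGateway-Bytes": "NAT gateway processing",
    "EBS:SnapshotUsage": "EBS snapshots",
    "DataTransfer-Regional-Bytes": "Cross-AZ transfer",
}


def classify_usage_type(usage_type) -> str:
    """Return the waste bucket for a usage type, or 'Other' if none match."""
    text = str(usage_type)  # tolerate missing values
    for needle, label in WASTE_PATTERNS.items():
        if needle in text:
            return label
    return "Other"


def cost_by_pattern(df: pd.DataFrame) -> pd.Series:
    """Sum unblended cost per waste bucket, largest first."""
    labelled = df.assign(
        pattern=df["line_item_usage_type"].map(classify_usage_type)
    )
    return (
        labelled.groupby("pattern")["line_item_unblended_cost"]
        .sum()
        .sort_values(ascending=False)
    )


# Invented sample rows standing in for real CUR line items.
sample = pd.DataFrame({
    "line_item_usage_type": [
        "USW2-NatGateway-Bytes",
        "EBS:SnapshotUsage",
        "USW2-DataTransfer-Regional-Bytes",
        "BoxUsage:m5.large",
        "NatGateway-Bytes",
    ],
    "line_item_unblended_cost": [300.0, 120.0, 80.0, 900.0, 200.0],
})

summary = cost_by_pattern(sample)
print(summary)
```

Running it against the full CUR frame instead of the top-50 table gives a quick read on how much of the bill falls into each bucket.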

Automating It

Don't run this script manually. Set up an EventBridge rule to trigger a Lambda function weekly. The Lambda runs the analysis, writes the report to S3, and posts a summary to your #finops Slack channel:

lambda_handler.py

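import os
from datetime import datetime

import boto3

from cur_analyzer import analyze_cur, download_cur_files, generate_report

# Assumption: bucket and prefix come from Lambda environment variables;
# the variable names are illustrative, not part of the original script.
BUCKET = os.environ["CUR_BUCKET"]
PREFIX = os.environ["CUR_PREFIX"]


def handler(event, context):
    files = download_cur_files(BUCKET, PREFIX)
    top_costs = analyze_cur(files)
    report = generate_report(top_costs)

    # Upload to S3
    boto3.client("s3").put_object(
        Bucket="reports-bucket",
        Key=f"cost-reports/{datetime.now().strftime('%Y-%m-%d')}.md",
        Body=report.encode(),
    )

    # Post to Slack. post_to_slack is a helper you supply (for example,
    # one that POSTs to a Slack incoming webhook); it is not defined here.
    total = top_costs["line_item_unblended_cost"].sum()
    post_to_slack(
        f"Weekly CUR Analysis complete. "
        f"Top 50 resources account for ${total:,.2f} in spend. "
        f"Report: https://reports-bucket.s3.amazonaws.com/..."
    )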
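The handler calls post_to_slack, which this article does not define. One possible implementation, assuming a Slack incoming webhook, uses only the standard library so the Lambda needs no extra dependency for it. The SLACK_WEBHOOK_URL environment variable name and the example URL are hypothetical.

```python
import json
import os
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build the POST request for a Slack incoming webhook without sending it."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_slack(text: str, webhook_url=None) -> None:
    """Send a message; the URL falls back to a hypothetical environment variable."""
    url = webhook_url or os.environ["SLACK_WEBHOOK_URL"]
    request = build_slack_request(url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()  # drain the response; urlopen raises on HTTP errors


# Building the request is safe to run anywhere; nothing is sent here.
example = build_slack_request(
    "https://hooks.example.invalid/services/PLACEHOLDER",
    "Weekly CUR Analysis complete.",
)
print(example.get_method(), example.data.decode("utf-8"))
```

Splitting request construction from sending keeps the network call in one place, which makes the message format easy to check locally before the Lambda ever runs.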

The Operational Reality

CUR files are massive. A medium-sized AWS account generates hundreds of megabytes of CUR data per month. Use Parquet (not CSV) and consider Amazon Athena for SQL-based querying if your account is large.
Resource IDs change. Auto Scaling Groups create and terminate instances constantly. Focus on usage types and patterns, not individual instance IDs, for dynamic workloads.
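Savings Plans distort costs. If you have active Savings Plans or Reserved Instances, line_item_unblended_cost may not reflect your actual spend: Savings Plan-covered usage appears at the on-demand rate and is offset by separate SavingsPlanNegation lines, while Reserved Instance usage can show little or no unblended cost. Use savings_plan_savings_plan_effective_cost for Savings Plan usage and reservation_effective_cost for Reserved Instance usage for a more accurate picture.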
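To make the Savings Plans caveat concrete, here is a simplified sketch that picks a cost column per line item type. It is deliberately not a full amortization: it ignores unused commitment and upfront fee lines, it assumes the Savings Plan and reservation columns exist in your report (they only appear when the account has such commitments), and the sample rows are invented.

```python
import numpy as np
import pandas as pd


def add_effective_cost(df: pd.DataFrame) -> pd.DataFrame:
    """Add an effective_cost column chosen by line item type."""
    item_type = df["line_item_line_item_type"]
    conditions = [
        item_type == "SavingsPlanCoveredUsage",
        item_type == "DiscountedUsage",        # usage covered by a Reserved Instance
        item_type == "SavingsPlanNegation",    # offsets the on-demand rate; zero it out
    ]
    choices = [
        df["savings_plan_savings_plan_effective_cost"],
        df["reservation_effective_cost"],
        0.0,
    ]
    effective = np.select(
        conditions, choices, default=df["line_item_unblended_cost"]
    )
    return df.assign(effective_cost=effective)


# Invented rows covering each branch.
sample = pd.DataFrame({
    "line_item_line_item_type": [
        "Usage", "SavingsPlanCoveredUsage", "SavingsPlanNegation", "DiscountedUsage",
    ],
    "line_item_unblended_cost": [10.0, 100.0, -100.0, 0.0],
    "savings_plan_savings_plan_effective_cost": [0.0, 60.0, 0.0, 0.0],
    "reservation_effective_cost": [0.0, 0.0, 0.0, 25.0],
})

result = add_effective_cost(sample)
print(result[["line_item_line_item_type", "effective_cost"]])
```

Grouping on effective_cost instead of line_item_unblended_cost in the analyzer keeps covered resources from looking either free or full price.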

The Payoff

Your AWS bill is not a black box. The data to slash it by 30% is already sitting in an S3 bucket, waiting for someone to read it.

This script is the diagnostic. The next step is the surgery — re-architecting the resources that are bleeding you dry. If this analysis shows you are wasting more than $1,500/month, you don't need a tool; you need an architect.

Want to pipe those findings directly into your accounting system? See how to automate AWS billing into Odoo ERP so finance gets the numbers without manual exports.

Is your AWS bill hiding $10,000 in waste? Most companies have never looked at their CUR data. When they do, the savings are immediate and dramatic.

I will run this analysis on your account and hand you a prioritized action plan — or show you exactly how to do it yourself.

Stop guessing. Book a Free Infrastructure Audit.

