How to Reduce Redshift Cluster Size for Low CPU Utilization

Overview

When your Amazon Redshift cluster consistently shows CPU utilization below 20%, it's a clear indicator that you're over-provisioned. This tutorial demonstrates how to downsize a Redshift cluster from 4 nodes to 2 nodes using elastic resize, reducing costs by approximately 50% while maintaining adequate performance for your workload.

Cost Impact: Reducing from 4 dc2.large nodes to 2 nodes saves approximately $365/month (50% reduction from ~$730/month to ~$365/month).

Time Required: The elastic resize process typically takes 10-15 minutes.

Prerequisites

  • An Amazon Redshift cluster with sustained low CPU utilization (<20% over 7 days)
  • AWS Console access with permissions to modify Redshift clusters
  • Understanding that the cluster will be read-only briefly during the resize operation

Step 1: Navigate to Redshift Console and Select Cluster

Navigate to the Redshift console and locate the cluster showing low CPU utilization. The cluster should have an "Available" status.

Click on the cluster name to view its details.

Step 2: Review CPU Utilization Metrics

In the cluster details page, click on the Monitoring or Metrics tab to review CPU utilization.

Verify that average CPU has remained below 20% for the past 7 days. This confirms the cluster is significantly over-provisioned and is a good candidate for downsizing.

Key Metrics to Check:

  • CPU Utilization: Should be consistently <20%
  • Query Patterns: Confirm the low CPU reflects your normal workload rather than a temporary lull (for example, a seasonal slowdown)
  • Storage Usage: Verify you have adequate headroom after reducing nodes
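
If you prefer to verify the CPU numbers outside the console, here is a minimal boto3 sketch that pulls the 7-day average from CloudWatch; the cluster identifier is a placeholder you would replace with your own:

# Sketch: pull average CPU utilization for the last 7 days from CloudWatch
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client('cloudwatch')

now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace='AWS/Redshift',
    MetricName='CPUUtilization',
    Dimensions=[{'Name': 'ClusterIdentifier', 'Value': 'your-cluster-id'}],
    StartTime=now - timedelta(days=7),
    EndTime=now,
    Period=3600,             # hourly data points
    Statistics=['Average'],
)

points = [p['Average'] for p in resp['Datapoints']]
print(f"7-day average CPU: {sum(points) / len(points):.1f}%")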

Step 3: Review Current Configuration

Click on the Properties or Configuration tab to view the current cluster setup.

Note the current configuration:

  • Node Type: dc2.large (or your specific node type)
  • Number of Nodes: 4
  • Estimated Monthly Cost: ~$730/month for 4 dc2.large nodes

After resizing to 2 nodes, the cost will be approximately $365/month.
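
These figures follow from the dc2.large on-demand rate, roughly $0.25 per node-hour in us-east-1 (check current pricing for your region and purchase option):

# Rough monthly cost math, assuming ~$0.25 per dc2.large node-hour (us-east-1)
HOURLY_RATE = 0.25
HOURS_PER_MONTH = 730

print(4 * HOURLY_RATE * HOURS_PER_MONTH)  # ~$730/month before the resize
print(2 * HOURLY_RATE * HOURS_PER_MONTH)  # ~$365/month after the resize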

Step 4: Initiate Elastic Resize

Click the Actions dropdown button at the top right of the cluster details page.

Select Resize from the Actions menu.

When prompted, choose Elastic resize for faster completion (typically 10-15 minutes vs several hours for classic resize).

Elastic vs Classic Resize:

  • Elastic Resize: Faster (10-15 min), temporarily read-only, limited node type changes
  • Classic Resize: Slower (hours), supports all configuration changes, creates new cluster

Step 5: Configure New Node Count

In the resize configuration screen:

  1. Change the Number of nodes from 4 to 2
  2. Keep the Node type as dc2.large (or your current type)
  3. Review the estimated new monthly cost shown in the console (~50% reduction)

Important Considerations:

  • Ensure your data will fit on 2 nodes (check storage capacity)
  • With <20% CPU usage, performance impact should be negligible
  • Database will be read-only for a few minutes during resize
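
If you would rather script this step, the console actions above correspond roughly to a single boto3 call; the cluster identifier is a placeholder:

# Sketch: the same elastic resize via boto3 instead of the console
import boto3

redshift = boto3.client('redshift')

redshift.resize_cluster(
    ClusterIdentifier='your-cluster-id',
    NumberOfNodes=2,
    Classic=False,   # False requests an elastic resize
)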

Step 6: Review Resize Impact

Before confirming, review the resize summary panel:

  • Estimated Duration: 10-15 minutes for elastic resize
  • Cluster Availability: Temporarily read-only during operation
  • Expected Cost Reduction: ~$365/month savings
  • Performance Impact: Minimal for workloads with <20% CPU utilization

Double-check that you're comfortable with the brief read-only period for your applications.

Step 7: Confirm and Start Resize

Click Resize cluster to begin the operation.

The cluster status will change to "Resizing". You can navigate away and return later to check progress.

Step 8: Monitor Resize Progress

The cluster details page will show the resize status. During elastic resize:

  1. Cluster enters "Resizing" state
  2. Cluster becomes read-only (queries can read but not write)
  3. Nodes are added/removed
  4. Cluster returns to "Available" state

You can monitor the progress in the console or set up CloudWatch alarms to notify you when complete.
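
For scripted monitoring, a minimal sketch using boto3 (same placeholder cluster identifier as above):

# Sketch: check elastic resize progress from a script
import boto3

redshift = boto3.client('redshift')

status = redshift.describe_resize(ClusterIdentifier='your-cluster-id')
print(status['Status'], status.get('TargetNumberOfNodes'))

# Or block until the cluster is back to "available"
redshift.get_waiter('cluster_available').wait(ClusterIdentifier='your-cluster-id')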

Step 9: Verify Completion

Once the resize completes:

  1. Verify cluster status returns to Available
  2. Confirm the configuration shows 2 nodes
  3. Check that your applications can connect and run queries normally
  4. Monitor CPU utilization over the next few days to ensure it remains acceptable

The cluster is now operational at half the previous cost.
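
The first two checks can also be confirmed programmatically; a minimal sketch, again using a placeholder cluster identifier:

# Sketch: confirm post-resize status and node count
import boto3

redshift = boto3.client('redshift')

cluster = redshift.describe_clusters(ClusterIdentifier='your-cluster-id')['Clusters'][0]
print(cluster['ClusterStatus'])    # expect "available"
print(cluster['NumberOfNodes'])    # expect 2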

Alternative Approaches

If this elastic resize doesn't fully optimize your costs, consider these alternatives:

1. Pause/Resume Scheduling

For clusters used only during business hours, implement automated pause/resume:

# Lambda function example for pause/resume scheduling
import boto3

redshift = boto3.client('redshift')

def lambda_handler(event, context):
    # The triggering schedule passes {"action": "pause"} or {"action": "resume"}
    cluster_id = 'your-cluster-id'

    if event['action'] == 'pause':
        redshift.pause_cluster(ClusterIdentifier=cluster_id)
    else:
        redshift.resume_cluster(ClusterIdentifier=cluster_id)

Savings: 50-70% reduction with no performance impact during active hours
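
One way to drive the Lambda above is an EventBridge schedule. The sketch below assumes the function is already deployed (the ARN shown is hypothetical) and that EventBridge has been granted permission to invoke it:

# Sketch: EventBridge rules that invoke the pause/resume Lambda on a schedule
import boto3
import json

events = boto3.client('events')
LAMBDA_ARN = 'arn:aws:lambda:us-east-1:123456789012:function:pause-resume-redshift'  # hypothetical ARN

def schedule(name, cron, action):
    events.put_rule(Name=name, ScheduleExpression=cron, State='ENABLED')
    events.put_targets(
        Rule=name,
        Targets=[{'Id': '1', 'Arn': LAMBDA_ARN, 'Input': json.dumps({'action': action})}],
    )
    # Note: the Lambda also needs a resource-based policy allowing events.amazonaws.com to invoke it

schedule('pause-redshift-nightly', 'cron(0 1 ? * MON-FRI *)', 'pause')     # 01:00 UTC weekdays
schedule('resume-redshift-morning', 'cron(0 12 ? * MON-FRI *)', 'resume')  # 12:00 UTC weekdays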

2. Migrate to Redshift Serverless

For workloads with unpredictable patterns, Redshift Serverless offers:

  • Pay only for actual compute used (RPU-hours)
  • Auto-scaling based on demand
  • No cluster management overhead

Best For: Development/test environments, sporadic analytics, variable workloads
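
For orientation, a minimal sketch of provisioning Serverless with boto3; the namespace and workgroup names and the 8-RPU base capacity are illustrative assumptions:

# Sketch: create a Redshift Serverless namespace and workgroup
import boto3

serverless = boto3.client('redshift-serverless')

serverless.create_namespace(namespaceName='analytics-ns')
serverless.create_workgroup(
    workgroupName='analytics-wg',
    namespaceName='analytics-ns',
    baseCapacity=8,  # RPUs; compute is billed only while queries run
)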

3. Classic Resize for Node Type Changes

If you want to change node types simultaneously:

  • Use classic resize instead of elastic
  • Takes longer (hours vs minutes)
  • Supports any configuration change
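
A classic resize can be requested with the same API as before, just with Classic=True; the target node type below is only an example:

# Sketch: classic resize that also changes the node type (illustrative values)
import boto3

redshift = boto3.client('redshift')

redshift.resize_cluster(
    ClusterIdentifier='your-cluster-id',
    ClusterType='multi-node',
    NodeType='ra3.xlplus',   # example target node type
    NumberOfNodes=2,
    Classic=True,            # classic resize supports node type changes
)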

4. Further Node Reduction

If CPU remains low after downsizing to 2 nodes:

  • Consider reducing to a single node (this generally requires a classic resize, since elastic resize doesn't support single-node clusters)
  • Note that dc2.large is already the smallest dc2 node type, so there is no smaller node size to step down to
  • For very light workloads, migrate to Redshift Serverless

Cost Summary

Before Optimization:

  • 4 × dc2.large nodes
  • ~$730/month
  • CPU Utilization: <20%

After Optimization:

  • 2 × dc2.large nodes
  • ~$365/month
  • Expected CPU Utilization: <40% (still comfortable headroom)

Savings: $365/month (~$4,380/year)

For clusters with <20% CPU utilization, this resize typically has negligible performance impact while delivering significant cost savings.

Monitoring Post-Resize

After completing the resize, monitor these metrics for 1-2 weeks:

  1. CPU Utilization: Should remain below 60% for comfortable operation
  2. Query Performance: Watch for any degradation in query times
  3. Disk Space: Ensure storage capacity is adequate
  4. Concurrent Queries: Verify your workload handles concurrency well
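
To avoid watching the console, you can set a CloudWatch alarm on sustained high CPU; the alarm name, threshold, and SNS topic below are illustrative assumptions:

# Sketch: CloudWatch alarm on sustained high CPU after the resize
import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='redshift-cpu-high-post-resize',
    Namespace='AWS/Redshift',
    MetricName='CPUUtilization',
    Dimensions=[{'Name': 'ClusterIdentifier', 'Value': 'your-cluster-id'}],
    Statistic='Average',
    Period=300,                 # 5-minute periods
    EvaluationPeriods=6,        # alarm after 30 minutes above threshold
    Threshold=60.0,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:ops-alerts'],  # hypothetical SNS topic
)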

If metrics show stress, you can always resize back up using the same elastic resize process.

Troubleshooting

Resize Fails or Takes Too Long

  • Classic Resize Fallback: If elastic resize fails, try classic resize
  • Check Cluster Health: Ensure cluster is in "Available" state before resizing
  • Snapshot First: For critical clusters, take a manual snapshot before resizing
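
For the snapshot-first suggestion, a minimal sketch (the snapshot identifier is illustrative):

# Sketch: take a manual snapshot before resizing
import boto3

redshift = boto3.client('redshift')

redshift.create_cluster_snapshot(
    SnapshotIdentifier='pre-resize-2-nodes',
    ClusterIdentifier='your-cluster-id',
)

# Wait until the snapshot is available before starting the resize
redshift.get_waiter('snapshot_available').wait(SnapshotIdentifier='pre-resize-2-nodes')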

Performance Degradation After Resize

  • Insufficient Capacity: May need to resize back up or optimize queries
  • Workload Changed: Verify CPU patterns haven't changed since initial analysis
  • Disk I/O Bottleneck: Check if storage is now constrained

Applications Can't Connect During Resize

  • Expected Behavior: Brief read-only period is normal for elastic resize
  • Plan Maintenance Window: Schedule resize during low-traffic periods
  • Configure Retries: Ensure applications have connection retry logic

Best Practices

  1. Monitor First: Collect at least 7 days of CPU metrics before resizing
  2. Start Conservative: Reduce nodes gradually (4→3→2) if uncertain
  3. Test in Non-Prod: Try the process in dev/staging first
  4. Snapshot Before: Always take a manual snapshot before major changes
  5. Schedule Wisely: Perform resize during maintenance windows or low-traffic periods
  6. Document Changes: Keep records of configuration changes and rationale

Conclusion

Downsizing an over-provisioned Redshift cluster is a straightforward way to reduce AWS costs without impacting performance. For clusters with sustained CPU utilization below 20%, reducing node count by 50% typically delivers proportional cost savings with minimal risk.

Monitor your cluster's performance post-resize and adjust as needed. The elastic resize feature makes it easy to scale both up and down based on actual workload requirements.