How to Right-Size Overprovisioned EBS Volumes
Overview
When your EBS volumes show low utilization over extended periods, you're paying for capacity you don't need. This tutorial demonstrates how to safely replace an overprovisioned EBS volume by creating a snapshot, stopping the instance, and attaching a new volume, and it surfaces a critical AWS limitation along the way.
Important AWS Limitation: You cannot create a volume from a snapshot that is smaller than the snapshot's source volume. This means the snapshot-based resize approach is primarily useful for:
- Moving volumes between availability zones
- Changing volume types (e.g., gp2 to gp3)
- Creating backup volumes at the same size
- Testing volume configurations
For actual downsizing, you'll need to use alternative approaches (see "Alternative Approaches" section).
Cost Impact: Due to the AWS limitation, this specific example maintains the 50 GB size with no cost reduction. However, the process demonstrates the complete workflow for volume replacement and highlights the limitations you'll encounter when trying to downsize.
Time Required: The entire process takes approximately 15-20 minutes, including snapshot creation, volume replacement, and verification.
Tutorial Approach: This tutorial uses AWS CLI for precise control and reproducibility. All commands can be adapted for AWS Console use if preferred.
Prerequisites
- An EBS volume that you want to resize or replace
- AWS CLI configured with credentials that have permissions to manage EC2 instances, volumes, and snapshots
- Understanding that the instance will need to be stopped during the volume replacement
- Backup or snapshot of critical data (though we'll create one as part of this process)
Step 1: Identify the Target Volume
First, list the volumes in your account to identify the overprovisioned volume:
aws ec2 describe-volumes \
--region us-east-1 \
--query 'Volumes[*].[VolumeId,Size,VolumeType,State,Attachments[0].InstanceId,Attachments[0].Device]' \
--output table
In this example, we're working with:
- Volume ID: vol-03a19a24976347dd7
- Instance: i-08faee10c7a14133f
- Size: 50 GB
- Type: gp3
Get detailed information about the target volume:
aws ec2 describe-volumes \
--volume-ids vol-03a19a24976347dd7 \
--region us-east-1
Note these details from the output:
- Size: 50 GiB
- Volume Type: gp3
- IOPS: 3000
- Throughput: 125 MiB/s
- Attached Device: /dev/xvda
- Availability Zone: us-east-1a
- State: in-use
Step 2: Review CloudWatch Metrics (Optional)
You can review CloudWatch metrics via the console or CLI to understand volume utilization patterns. Navigate to the EC2 Volumes console and click the volume's Monitoring tab, or use CloudWatch CLI commands.
Look for:
- Read/write operations over time
- Read/write latency
- Throughput patterns
Low activity metrics over an extended period (30+ days) indicate the volume may be oversized for the workload. However, always verify actual filesystem usage before proceeding.
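For example, a daily sum of read operations on the volume can be pulled with the CLI as sketched below; the date range is a placeholder you would adjust to your review window:
aws cloudwatch get-metric-statistics \
--namespace AWS/EBS \
--metric-name VolumeReadOps \
--dimensions Name=VolumeId,Value=vol-03a19a24976347dd7 \
--statistics Sum \
--period 86400 \
--start-time 2024-01-01T00:00:00Z \
--end-time 2024-01-31T00:00:00Z \
--region us-east-1 \
--output table
Repeat with VolumeWriteOps, VolumeReadBytes, and VolumeWriteBytes to cover the write and throughput side.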
Step 3: Connect to Instance and Check Disk Usage
Before making any changes, connect to the EC2 instance to check actual disk usage. You can use AWS Systems Manager Session Manager, EC2 Instance Connect, or SSH.
Using Session Manager (requires SSM agent on instance):
aws ssm start-session --target i-08faee10c7a14133f --region us-east-1
Or use EC2 Instance Connect if available, or traditional SSH.
Step 4: Verify Filesystem Usage
Once connected to the instance, check disk usage:
df -h
Example output:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 50G 2.1G 48G 5% /
This shows you exactly how much of the 50 GB volume is actually being used. In this example, only 2.1 GB (4.2%) of the volume capacity was utilized, confirming the volume is significantly overprovisioned.
Key Decision Point: Based on actual usage, determine your target size. Remember to leave adequate growth buffer (typically 50-100% of current usage). For 2.1 GB of usage, a 10 GB volume would provide roughly 375% headroom above current usage.
Step 5: Create Snapshot for Backup
Before making any changes, create a snapshot of the volume as a backup using AWS CLI:
aws ec2 create-snapshot \
--volume-id vol-03a19a24976347dd7 \
--description "Pre-resize backup - vol-03a19a24976347dd7" \
--tag-specifications 'ResourceType=snapshot,Tags=[{Key=Name,Value=ebs-volume-overprovisioned-backup}]' \
--region us-east-1
Note the snapshot ID from the output (e.g., snap-0fbfa4e29c36b654b).
Wait for Completion: Monitor the snapshot progress:
aws ec2 describe-snapshots \
--snapshot-ids snap-0fbfa4e29c36b654b \
--region us-east-1 \
--query 'Snapshots[0].[SnapshotId,State,Progress]'
Or wait for completion:
aws ec2 wait snapshot-completed \
--snapshot-ids snap-0fbfa4e29c36b654b \
--region us-east-1
This typically takes 5-10 minutes depending on the amount of data on the volume.
Step 6: Stop the Instance
To safely detach and replace the volume, stop the instance first:
aws ec2 stop-instances \
--instance-ids i-08faee10c7a14133f \
--region us-east-1
Wait for Stopped State: Monitor until the instance reaches "stopped" state:
aws ec2 wait instance-stopped \
--instance-ids i-08faee10c7a14133f \
--region us-east-1
This typically takes 1-2 minutes.
Step 7: Attempt to Create Smaller Volume (Demonstrates AWS Limitation)
Now let's try to create a 10 GB volume from our 50 GB snapshot:
aws ec2 create-volume \
--snapshot-id snap-0fbfa4e29c36b654b \
--availability-zone us-east-1a \
--volume-type gp3 \
--iops 3000 \
--throughput 125 \
--size 10 \
--region us-east-1
This command will fail with an error similar to:
An error occurred (InvalidParameterValue) when calling the CreateVolume operation:
The size of a volume can only be increased, not decreased.
Step 8: Create New Volume at Minimum Size
Since we cannot create a smaller volume from the snapshot, we'll create one at the same size (50 GB):
aws ec2 create-volume \
--snapshot-id snap-0fbfa4e29c36b654b \
--availability-zone us-east-1a \
--volume-type gp3 \
--iops 3000 \
--throughput 125 \
--size 50 \
--region us-east-1
Note the VolumeId from the output (e.g., vol-012df1af06cfd48bb).
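If you're scripting the workflow, you can run the same create-volume call with --query to capture the new volume ID into a shell variable instead of copying it by hand (the variable name here is just an illustration):
NEW_VOLUME_ID=$(aws ec2 create-volume \
--snapshot-id snap-0fbfa4e29c36b654b \
--availability-zone us-east-1a \
--volume-type gp3 \
--iops 3000 \
--throughput 125 \
--size 50 \
--region us-east-1 \
--query 'VolumeId' \
--output text)
echo "$NEW_VOLUME_ID"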
Step 9: Wait for Volume Creation
Wait for the new volume to become available:
aws ec2 wait volume-available \
--volume-ids vol-012df1af06cfd48bb \
--region us-east-1
This typically takes 1-2 minutes.
Step 10: Detach Old Volume
With the instance stopped, detach the old volume:
aws ec2 detach-volume \
--volume-id vol-03a19a24976347dd7 \
--region us-east-1
Wait for detachment to complete:
aws ec2 wait volume-available \
--volume-ids vol-03a19a24976347dd7 \
--region us-east-1
This typically takes 30-60 seconds.
Step 11: Attach New Volume
Attach the new volume to the instance at the same device name:
aws ec2 attach-volume \
--volume-id vol-012df1af06cfd48bb \
--instance-id i-08faee10c7a14133f \
--device /dev/xvda \
--region us-east-1
Critical: Use the exact same device name (/dev/xvda in this case) that the original volume used. The instance expects its root volume at this specific device.
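If you're unsure which device name the instance expects, you can confirm its root device mapping with describe-instances (a quick sketch; RootDeviceName reflects what the instance was launched with):
aws ec2 describe-instances \
--instance-ids i-08faee10c7a14133f \
--region us-east-1 \
--query 'Reservations[0].Instances[0].[RootDeviceName,BlockDeviceMappings]'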
Wait for attachment to complete:
aws ec2 wait volume-in-use \
--volume-ids vol-012df1af06cfd48bb \
--region us-east-1
Step 12: Start the Instance
Start the instance with the new volume attached:
aws ec2 start-instances \
--instance-ids i-08faee10c7a14133f \
--region us-east-1
Wait for the instance to reach "running" state:
aws ec2 wait instance-running \
--instance-ids i-08faee10c7a14133f \
--region us-east-1
This typically takes 1-3 minutes.
Step 13: Verify the New Volume Works
Once the instance is running, connect to it and verify the new volume:
# Connect via Session Manager
aws ssm start-session --target i-08faee10c7a14133f --region us-east-1
Then check the filesystem:
# Check mount status
df -h
# Check volume device
lsblk
# Verify disk usage matches what we saw before
df -h /
Expected output should show:
- The root filesystem is mounted at /
- Total size is 50GB (same as before)
- Used space matches previous measurements (~2.1 GB)
- All data is intact from the snapshot
Test that your applications start correctly and can access data normally.
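You can also confirm from outside the instance that both EC2 status checks pass:
aws ec2 describe-instance-status \
--instance-ids i-08faee10c7a14133f \
--region us-east-1 \
--query 'InstanceStatuses[0].[InstanceStatus.Status,SystemStatus.Status]' \
--output table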
Step 14: Delete Old Volume
After thoroughly testing that the new volume works (ideally after several hours or days of normal operation), delete the old volume:
aws ec2 delete-volume \
--volume-id vol-03a19a24976347dd7 \
--region us-east-1
Important: Only delete the old volume after confirming the new volume works perfectly. Once deleted, the volume cannot be recovered (though you still have the snapshot as backup).
Cost Impact
In this specific case, since AWS doesn't allow creating smaller volumes from snapshots, there's no cost reduction:
- Original: 50 GB × $0.08/GB = $4.00/month (gp3)
- New: 50 GB × $0.08/GB = $4.00/month (gp3)
- Net Savings: $0.00/month
However, the snapshot creation adds a small ongoing cost:
- Snapshot: up to 50 GB × $0.05/GB = $2.50/month; snapshots are billed only for the blocks actually stored, so with ~2.1 GB of data the real cost is typically far lower
Recommendation: After confirming the new volume is stable for several days, consider deleting the snapshot if you have other backup mechanisms in place.
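If you decide the snapshot is no longer needed, deleting it is a single call:
aws ec2 delete-snapshot \
--snapshot-id snap-0fbfa4e29c36b654b \
--region us-east-1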
Alternative Approaches for Actual Downsizing
Since snapshot-based resizing cannot reduce volume size, here are alternative methods to actually downsize overprovisioned volumes:
1. Create Fresh Smaller Volume and Copy Data
This is the most reliable method for downsizing:
- Create a brand new volume at your target size (e.g., 10 GB)
- Attach both volumes to the instance
- Format and mount the new volume
- Use rsync or dd to copy data from the old volume to the new one (see the sketch after this list):
sudo rsync -avxHAX /old-mount/ /new-mount/
- Update /etc/fstab and bootloader configuration
- Detach old volume, rename new volume's device, reboot
Pros: Can reduce to any size that fits your data
Cons: More complex, requires careful bootloader configuration for root volumes
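For a non-root data volume, a minimal sketch of this copy workflow might look like the following. The device name /dev/xvdg for the new volume and the mount points /data and /mnt/new-data are assumptions for illustration; root volumes additionally require bootloader work that this sketch does not cover.
# Format and temporarily mount the new, smaller volume (assumed to be /dev/xvdg)
sudo mkfs.ext4 /dev/xvdg
sudo mkdir -p /mnt/new-data
sudo mount /dev/xvdg /mnt/new-data
# Copy everything from the old mount point, preserving permissions and attributes
sudo rsync -avxHAX /data/ /mnt/new-data/
# Note the new volume's UUID and point the /etc/fstab entry for /data at it
sudo blkid /dev/xvdg
# After updating /etc/fstab, unmount both and remount to verify the new entry
sudo umount /mnt/new-data /data
sudo mount -a
df -h /data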
2. Modify Volume (In-Place Resize)
AWS allows in-place volume modifications, but only for increases, not decreases:
# This works - increasing size
aws ec2 modify-volume --volume-id vol-xxx --size 100
# This does NOT work - decreasing size
aws ec2 modify-volume --volume-id vol-xxx --size 25
After increasing, you must extend the filesystem:
sudo growpart /dev/xvda 1
sudo resize2fs /dev/xvda1 # for ext4
# or
sudo xfs_growfs / # for XFS
Pros: No detachment needed, minimal downtime
Cons: Only works for increasing size, not decreasing
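When you do grow a volume in place, you can track the modification before resizing the filesystem; a sketch (vol-xxx is a placeholder, as above):
aws ec2 describe-volumes-modifications \
--volume-ids vol-xxx \
--region us-east-1 \
--query 'VolumesModifications[0].[ModificationState,Progress]'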
3. Filesystem-Level Shrinking (Advanced)
For non-root volumes, you can shrink the filesystem first, then create a smaller volume:
- Unmount the volume
- Run filesystem check:
e2fsck -f /dev/xvdf
- Shrink filesystem:
resize2fs /dev/xvdf 10G
- Create new smaller volume
- Use dd to copy only the used portion
Pros: Can achieve significant size reduction
Cons: Very risky for root volumes, requires precise calculations, data loss risk
4. Application-Level Backup and Restore
For data volumes (non-root), use application-specific backup tools:
- Backup data using application tools (mysqldump, pg_dump, application export, etc.)
- Create new smaller volume
- Restore data to new volume
Pros: Clean migration, no filesystem complexity
Cons: Requires application downtime, only works for data volumes
5. Automated EBS Lifecycle Management
Implement CloudWatch alarms and Lambda functions to:
- Monitor volume utilization metrics
- Alert when volumes are consistently underutilized
- Automatically create tickets or trigger manual review
- Track volume right-sizing opportunities across your infrastructure
Pros: Scalable, proactive cost management
Cons: Doesn't automatically fix the issue, requires organizational process
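As a starting point, the monitoring piece could be a low-activity CloudWatch alarm on a single volume. This is only a sketch; the alarm name, threshold, and SNS topic ARN are assumptions you would tune for your environment:
aws cloudwatch put-metric-alarm \
--alarm-name ebs-low-utilization-vol-03a19a24976347dd7 \
--namespace AWS/EBS \
--metric-name VolumeReadOps \
--dimensions Name=VolumeId,Value=vol-03a19a24976347dd7 \
--statistic Sum \
--period 3600 \
--evaluation-periods 24 \
--threshold 100 \
--comparison-operator LessThanThreshold \
--treat-missing-data breaching \
--alarm-actions arn:aws:sns:us-east-1:123456789012:ebs-rightsizing-review \
--region us-east-1
This fires when hourly read operations stay below the threshold for a full day; you would typically pair it with a matching VolumeWriteOps alarm.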
Best Practices
- Monitor First: Always check CloudWatch metrics AND filesystem usage before resizing
- Leave Growth Buffer: Size volumes with 50-100% headroom above current usage
- Test in Non-Production: Practice the process on dev/test instances first
- Snapshot Everything: Always create snapshots before making changes
- Verify Before Cleanup: Test the new volume thoroughly before deleting the old one
- Document Device Names: Note device names carefully, especially for root volumes
- Consider Volume Types: When replacing volumes, evaluate whether gp3 is still the best choice
- Plan for Downtime: Communicate instance stop time with stakeholders
Troubleshooting
Instance Won't Boot After Volume Replacement
Symptom: Instance shows "running" but is unreachable, status checks failing
Common Causes:
- Wrong device name used during attachment
- Snapshot was taken while filesystem was inconsistent
- Bootloader configuration issue
Solutions:
- Stop instance, detach new volume, reattach old volume (good reason to keep it!)
- Attach new volume as secondary device, boot with old volume, investigate
- Use EC2 Serial Console or recovery instance to check bootloader
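If you go with the first option, the swap can be rolled back with the same commands used earlier in this tutorial, assuming the old volume still exists (Step 14 has not been run); condensed here with this example's IDs:
aws ec2 stop-instances --instance-ids i-08faee10c7a14133f --region us-east-1
aws ec2 wait instance-stopped --instance-ids i-08faee10c7a14133f --region us-east-1
aws ec2 detach-volume --volume-id vol-012df1af06cfd48bb --region us-east-1
aws ec2 wait volume-available --volume-ids vol-012df1af06cfd48bb --region us-east-1
aws ec2 attach-volume --volume-id vol-03a19a24976347dd7 --instance-id i-08faee10c7a14133f --device /dev/xvda --region us-east-1
aws ec2 start-instances --instance-ids i-08faee10c7a14133f --region us-east-1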
Cannot Create Smaller Volume from Snapshot
Symptom: AWS console shows "The size of a volume can only be increased, not decreased"
This is expected behavior, not a bug. See "Alternative Approaches" section for downsizing methods.
Volume Attachment Fails
Symptom: "Volume already attached" or "Volume in use" errors
Solutions:
- Ensure volume is in "available" state before attaching
- Check that instance is fully stopped
- Verify no other instances have the volume attached
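A quick way to confirm the volume's state and current attachment before retrying:
aws ec2 describe-volumes \
--volume-ids vol-012df1af06cfd48bb \
--region us-east-1 \
--query 'Volumes[0].[State,Attachments[0].InstanceId,Attachments[0].State]'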
Summary
This tutorial demonstrated the process of replacing an EBS volume using snapshots, while uncovering an important AWS limitation: volumes created from snapshots cannot be smaller than the source volume. This makes snapshot-based resizing suitable for:
- Volume type changes (gp2 → gp3)
- Availability zone migrations
- Creating backup volumes
- Increasing volume size
For actual downsizing, you must use alternative approaches such as creating fresh volumes and copying data, or shrinking filesystems before migration.
The process covered:
- ✅ Reviewing volume metrics and usage
- ✅ Creating backup snapshots
- ✅ Stopping instances safely
- ✅ Creating new volumes (with size limitation awareness)
- ✅ Detaching and attaching volumes
- ✅ Verifying functionality
- ✅ Cleaning up old resources
Always thoroughly test new volumes before deleting old ones, and maintain snapshots until you're confident in the new configuration.