Storage Gateway Fault Tolerance

Overview

This check identifies AWS Storage Gateway deployments hosted on a single Amazon EC2 instance, which is a single point of failure. A gateway hosted on EC2 lacks the redundancy that on-premises hypervisor platforms or the hardware appliance can provide through clustering and failover.

Risk

When Storage Gateway runs on a single EC2 instance, access to your storage is vulnerable to:

  • EC2 instance failures - Hardware issues or software crashes can interrupt access
  • Availability Zone outages - A disruption in one AZ affects every resource in that AZ, including the gateway instance
  • Maintenance events - Instance reboots or terminations disrupt connectivity
  • Network issues - Temporary network problems can halt file operations

For applications requiring high availability, this architecture increases the risk of service interruptions and potential data integrity issues from interrupted write operations.

Remediation Steps

Prerequisites

You need:

  • AWS Console access with permissions to manage Storage Gateway
  • Access to your current gateway configuration (shares, volumes, or tape settings)
  • Understanding of which applications use the gateway

Required IAM permissions

Your IAM user or role needs permissions including:

  • storagegateway:* - Full Storage Gateway access
  • ec2:DescribeInstances - To view EC2-hosted gateway details
  • iam:PassRole - If the gateway uses an IAM role
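
The permissions listed above can be written out as an IAM policy document. The following is a minimal sketch, not a vetted least-privilege policy: the function name `build_remediation_policy`, the statement IDs, and the example role ARN are illustrative assumptions, and `storagegateway:*` is kept only because the list above names it.

```python
import json


def build_remediation_policy(pass_role_arn=None):
    """Return an IAM policy document covering the permissions listed above.

    pass_role_arn is optional: iam:PassRole is only needed when the
    gateway uses an IAM role, and it is scoped to that single role.
    """
    statements = [
        {
            "Sid": "StorageGatewayAccess",  # hypothetical statement ID
            "Effect": "Allow",
            "Action": "storagegateway:*",
            "Resource": "*",
        },
        {
            "Sid": "DescribeGatewayInstances",  # hypothetical statement ID
            "Effect": "Allow",
            "Action": "ec2:DescribeInstances",
            "Resource": "*",
        },
    ]
    if pass_role_arn:
        statements.append(
            {
                "Sid": "PassGatewayRole",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": pass_role_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Example role ARN is a placeholder, not a real resource.
    example_role = "arn:aws:iam::123456789012:role/example-gateway-role"
    print(json.dumps(build_remediation_policy(example_role), indent=2))
```

Scoping `iam:PassRole` to one role ARN rather than `*` is a design choice in this sketch; consider narrowing `storagegateway:*` to specific actions once you know which operations your migration needs.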

Choose Your Remediation Path

You have two main options for addressing this finding:

Option A: Migrate to a managed multi-AZ service (Recommended for most workloads)

  • Amazon EFS for Linux file shares
  • Amazon FSx for Windows File Server for Windows workloads
  • Amazon FSx for Lustre for high-performance computing

Option B: Deploy Storage Gateway on a fault-tolerant platform

  • VMware vSphere (with vSphere HA)
  • Microsoft Hyper-V (with Failover Clustering)
  • Linux KVM (with appropriate HA configuration)
  • AWS Storage Gateway Hardware Appliance

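Option A: Migrate to Amazon EFS

If your workload uses NFS file shares, an Amazon EFS Regional file system stores data redundantly across multiple Availability Zones and can replace the gateway share.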

  1. Open the Amazon EFS console at https://console.aws.amazon.com/efs
  2. Click Create file system
  3. Enter a name for your file system
  4. Select your VPC
  5. Click Create (uses recommended settings with multi-AZ)
  6. Update your applications to mount the EFS file system instead of the Storage Gateway share
  7. After validating the migration, delete the old Storage Gateway
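
Step 6 requires each client to mount the new file system. The sketch below builds an NFS mount command from a file system ID and Region, assuming the DNS name format `<file-system-id>.efs.<region>.amazonaws.com` and the NFS options that AWS generally recommends for EFS. The helper name `efs_mount_command`, the file system ID, and the mount point `/mnt/efs` are illustrative assumptions.

```python
import shlex

# NFS client options commonly recommended for EFS mounts.
EFS_NFS_OPTIONS = (
    "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
)


def efs_mount_command(file_system_id, region, mount_point="/mnt/efs"):
    """Build the shell command that mounts an EFS file system over NFSv4.1."""
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    args = [
        "sudo", "mount", "-t", "nfs4",
        "-o", EFS_NFS_OPTIONS,
        f"{dns_name}:/",
        mount_point,
    ]
    # Quote each argument so unusual mount points stay shell-safe.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # fs-12345678 is a placeholder ID.
    print(efs_mount_command("fs-12345678", "us-east-1"))
```

The mount point must exist on the client before the command runs, and the client's security group must be allowed to reach the mount target on TCP port 2049.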

AWS CLI (optional)

List your current Storage Gateway gateways:

aws storagegateway list-gateways \
--region us-east-1

Get details about a specific gateway to understand its configuration:

aws storagegateway describe-gateway-information \
--gateway-arn "arn:aws:storagegateway:us-east-1:123456789012:gateway/sgw-12345678" \
--region us-east-1
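
When an account has many gateways, the two commands above can be combined into a script that lists only the EC2-hosted ones. This is a sketch under a few assumptions: boto3 is installed and credentials are configured, each entry returned by `ListGateways` carries a `HostEnvironment` field, and the function names are illustrative rather than part of any AWS tooling.

```python
def ec2_hosted_gateways(gateways):
    """Return only the gateways whose HostEnvironment is EC2.

    `gateways` is a list of dicts shaped like the entries in the
    "Gateways" key of a ListGateways response.
    """
    return [gw for gw in gateways if gw.get("HostEnvironment") == "EC2"]


def list_all_gateways(region="us-east-1"):
    """Fetch every gateway in a Region, following pagination."""
    import boto3  # imported lazily so the filter above works without boto3

    client = boto3.client("storagegateway", region_name=region)
    gateways = []
    for page in client.get_paginator("list_gateways").paginate():
        gateways.extend(page.get("Gateways", []))
    return gateways


if __name__ == "__main__":
    for gw in ec2_hosted_gateways(list_all_gateways()):
        print(gw["GatewayARN"], gw.get("GatewayName"), gw.get("Ec2InstanceId"))
```

Keeping the filter separate from the API call makes it possible to run the same check against saved CLI output, for example JSON captured from `aws storagegateway list-gateways`.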

Create an EFS file system as a replacement:

aws efs create-file-system \
--performance-mode generalPurpose \
--throughput-mode bursting \
--encrypted \
--tags Key=Name,Value=my-replacement-filesystem \
--region us-east-1
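
Note that `create-file-system` on its own does not make the file system reachable: clients connect through mount targets, and EFS allows one mount target per Availability Zone. The sketch below prepares one mount target request per subnet and, optionally, submits them with boto3. The function names, subnet IDs, and security group ID are placeholders, and the file system is assumed to have reached the `available` state before the requests are sent.

```python
def mount_target_requests(file_system_id, subnet_ids, security_group_id):
    """Build one CreateMountTarget request per subnet.

    Each subnet should be in a different Availability Zone, because EFS
    accepts only one mount target per AZ for a given file system.
    """
    return [
        {
            "FileSystemId": file_system_id,
            "SubnetId": subnet_id,
            "SecurityGroups": [security_group_id],
        }
        for subnet_id in subnet_ids
    ]


def create_mount_targets(requests, region="us-east-1"):
    """Submit the prepared requests and return the new mount target IDs."""
    import boto3  # imported lazily so the request builder works without boto3

    efs = boto3.client("efs", region_name=region)
    return [efs.create_mount_target(**req)["MountTargetId"] for req in requests]


if __name__ == "__main__":
    # All IDs below are placeholders.
    reqs = mount_target_requests(
        "fs-12345678", ["subnet-aaaa1111", "subnet-bbbb2222"], "sg-cccc3333"
    )
    print(create_mount_targets(reqs))
```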

CloudFormation (optional)

Deploy a multi-AZ EFS file system:

AWSTemplateFormatVersion: '2010-09-09'
Description: Multi-AZ EFS file system to replace single-instance Storage Gateway

Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
    Description: VPC for EFS mount targets

  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
    Description: Subnets for EFS mount targets (provide at least two, each in a different AZ)

Resources:
  EFSSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for EFS mount targets
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 2049
          ToPort: 2049
          # Narrow this range to your VPC or client CIDR where possible
          CidrIp: 10.0.0.0/8
          Description: NFS access from internal network
      Tags:
        - Key: Name
          Value: efs-mount-target-sg

  FileSystem:
    Type: AWS::EFS::FileSystem
    Properties:
      Encrypted: true
      PerformanceMode: generalPurpose
      ThroughputMode: bursting
      FileSystemTags:
        - Key: Name
          Value: fault-tolerant-file-storage

  MountTarget1:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref FileSystem
      SubnetId: !Select [0, !Ref SubnetIds]
      SecurityGroups:
        - !Ref EFSSecurityGroup

  # Requires a second entry in SubnetIds: !Select fails on a shorter list,
  # so a condition that tests index 1 cannot make this resource optional.
  MountTarget2:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref FileSystem
      SubnetId: !Select [1, !Ref SubnetIds]
      SecurityGroups:
        - !Ref EFSSecurityGroup

Outputs:
  FileSystemId:
    Description: EFS File System ID
    Value: !Ref FileSystem

  FileSystemDnsName:
    Description: DNS name for mounting
    Value: !Sub '${FileSystem}.efs.${AWS::Region}.amazonaws.com'

Terraform (optional)

# Multi-AZ EFS file system to replace single-instance Storage Gateway

variable "vpc_id" {
  description = "VPC ID for EFS mount targets"
  type        = string
}

variable "subnet_ids" {
  description = "Subnet IDs for EFS mount targets (multiple AZs)"
  type        = list(string)
}

variable "allowed_cidr_blocks" {
  description = "CIDR blocks allowed to access EFS"
  type        = list(string)
  default     = ["10.0.0.0/8"]
}

resource "aws_security_group" "efs" {
  name_prefix = "efs-mount-target-"
  vpc_id      = var.vpc_id
  description = "Security group for EFS mount targets"

  ingress {
    from_port   = 2049
    to_port     = 2049
    protocol    = "tcp"
    cidr_blocks = var.allowed_cidr_blocks
    description = "NFS access from internal network"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "efs-mount-target-sg"
  }
}

resource "aws_efs_file_system" "main" {
  encrypted        = true
  performance_mode = "generalPurpose"
  throughput_mode  = "bursting"

  tags = {
    Name = "fault-tolerant-file-storage"
  }
}

# One mount target per subnet; each subnet should be in a different AZ
resource "aws_efs_mount_target" "main" {
  count           = length(var.subnet_ids)
  file_system_id  = aws_efs_file_system.main.id
  subnet_id       = var.subnet_ids[count.index]
  security_groups = [aws_security_group.efs.id]
}

output "file_system_id" {
  description = "EFS File System ID"
  value       = aws_efs_file_system.main.id
}

output "file_system_dns_name" {
  description = "DNS name for mounting"
  value       = aws_efs_file_system.main.dns_name
}

Option B: Deploy Storage Gateway on a Fault-Tolerant Platform

If you must continue using Storage Gateway, deploy it on a platform that supports high availability.

  1. Open the Storage Gateway console at https://console.aws.amazon.com/storagegateway
  2. Click Create gateway
  3. Select your gateway type (File Gateway, Volume Gateway, or Tape Gateway)
  4. For Host platform, choose one of:
    • VMware ESXi - Use with vSphere HA for automatic failover
    • Microsoft Hyper-V - Use with Windows Failover Clustering
    • Linux KVM - Configure with your HA solution
    • Hardware Appliance - Purpose-built device with redundant components
  5. Download the VM image for your chosen platform
  6. Deploy the VM on your hypervisor with HA enabled
  7. Return to the console and click Next to activate your gateway
  8. Recreate your file shares, volumes, or tape configurations
  9. Update client applications to point to the new gateway
  10. Delete the old EC2-hosted gateway after successful migration

Verification

After remediation, verify the finding is resolved:

  1. Open the Storage Gateway console
  2. Select your gateway from the list
  3. Check the Host platform field - it should show something other than "Amazon EC2"

If you migrated to EFS or FSx:

  1. Open the respective service console
  2. Verify your file system shows Available status
  3. Confirm mount targets exist in multiple Availability Zones

CLI verification commands

List gateways and check their host platform:

# List all gateways
aws storagegateway list-gateways --region us-east-1

# Get details for each gateway (replace with your gateway ARN)
aws storagegateway describe-gateway-information \
--gateway-arn "arn:aws:storagegateway:us-east-1:123456789012:gateway/sgw-12345678" \
--region us-east-1 \
--query '{Name: GatewayName, Platform: HostEnvironment, State: GatewayState}'

The HostEnvironment field should show a value other than EC2, such as VMWARE, HYPER-V, or KVM.

For EFS verification:

aws efs describe-file-systems \
--region us-east-1 \
--query 'FileSystems[*].{Id: FileSystemId, Name: Name, State: LifeCycleState, MountTargets: NumberOfMountTargets}'
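
The number of mount targets alone does not show which Availability Zones they cover. The sketch below checks that directly, assuming boto3 is available and that each entry returned by `DescribeMountTargets` includes `AvailabilityZoneName` and `LifeCycleState`. The function names and the file system ID are illustrative.

```python
def mount_target_azs(mount_targets):
    """Return the sorted, distinct AZ names that have an available mount target."""
    return sorted(
        {
            mt["AvailabilityZoneName"]
            for mt in mount_targets
            if mt.get("LifeCycleState") == "available"
        }
    )


def is_multi_az(mount_targets):
    """True when available mount targets span at least two Availability Zones."""
    return len(mount_target_azs(mount_targets)) >= 2


def describe_mount_targets(file_system_id, region="us-east-1"):
    """Fetch the mount targets for one file system (first page of results)."""
    import boto3  # imported lazily so the checks above work without boto3

    efs = boto3.client("efs", region_name=region)
    return efs.describe_mount_targets(FileSystemId=file_system_id)["MountTargets"]


if __name__ == "__main__":
    targets = describe_mount_targets("fs-12345678")  # placeholder ID
    print(mount_target_azs(targets), "multi-AZ:", is_multi_az(targets))
```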

Notes

  • Data migration: Before deleting your EC2-hosted gateway, ensure all data is migrated. For file gateways, this means copying files to the new storage location.

  • Client updates: Applications connecting to the gateway will need configuration changes to point to the new endpoint.

  • Cost considerations: Managed services like EFS and FSx have different pricing models. Review costs before migrating.

  • FSx File Gateway deprecation: Amazon FSx File Gateway is no longer available to new customers. Consider native Amazon FSx for Windows File Server instead.

  • Hybrid scenarios: If you need on-premises caching with cloud storage, the Hardware Appliance or hypervisor-based deployments provide fault tolerance while maintaining the hybrid architecture.