Ensure ElastiCache Redis Cluster Has Multi-AZ Enabled

Overview

This check verifies that your Amazon ElastiCache for Redis replication groups have Multi-AZ with automatic failover enabled. Multi-AZ distributes your primary and replica nodes across separate Availability Zones, providing high availability and automatic recovery if an outage occurs.
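
As an illustration only, the logic of this check can be sketched in Python against data shaped like the output of describe-replication-groups. This is not the check's actual implementation; the function name and the sample group IDs are hypothetical, and the sketch assumes the MultiAZ and AutomaticFailover fields carry the string values "enabled" or "disabled".

```python
def find_noncompliant_groups(response):
    """Return IDs of replication groups lacking Multi-AZ or automatic failover."""
    noncompliant = []
    for group in response.get("ReplicationGroups", []):
        # Treat a missing field as "disabled" so incomplete data is flagged.
        multi_az = group.get("MultiAZ", "disabled")
        failover = group.get("AutomaticFailover", "disabled")
        if multi_az != "enabled" or failover != "enabled":
            noncompliant.append(group["ReplicationGroupId"])
    return noncompliant


# Hypothetical sample shaped like a describe-replication-groups response.
sample = {
    "ReplicationGroups": [
        {"ReplicationGroupId": "cache-a", "MultiAZ": "enabled", "AutomaticFailover": "enabled"},
        {"ReplicationGroupId": "cache-b", "MultiAZ": "disabled", "AutomaticFailover": "enabled"},
    ]
}

print(find_noncompliant_groups(sample))  # ['cache-b']
```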

Risk

Without Multi-AZ enabled, your Redis cluster is vulnerable to:

  • Service outages: If the Availability Zone hosting your Redis nodes fails, your application loses access to cached data
  • Cold-cache rebuilds: After a failure, rebuilding the cache from scratch puts heavy load on your backend databases
  • Cascading failures: Database overload from cache rebuilds can cause timeouts and failures across your entire application
  • Data loss: Recent writes may be lost during failure events before they can be replicated

Remediation Steps

Prerequisites

  • Either access to the AWS Console with permissions to modify ElastiCache resources, or the AWS CLI installed and configured with appropriate credentials
  • Your Redis cluster must have at least one replica (Multi-AZ requires replicas)
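
The replica prerequisite can be checked before attempting the change. The sketch below is a hypothetical helper, not an official tool: it assumes a replication group dictionary shaped like one entry of the describe-replication-groups output, where each shard under NodeGroups lists its primary and replicas under NodeGroupMembers.

```python
def has_replica_in_every_shard(group):
    """Return True if every shard has a primary plus at least one replica."""
    node_groups = group.get("NodeGroups", [])
    if not node_groups:
        return False  # no shard information available, so assume not ready
    return all(len(ng.get("NodeGroupMembers", [])) >= 2 for ng in node_groups)


# Hypothetical sample data: one shard with a single node, one with two nodes.
primary_only = {"NodeGroups": [{"NodeGroupMembers": [{"CacheClusterId": "demo-001"}]}]}
with_replica = {
    "NodeGroups": [
        {"NodeGroupMembers": [{"CacheClusterId": "demo-001"}, {"CacheClusterId": "demo-002"}]}
    ]
}

print(has_replica_in_every_shard(primary_only))  # False
print(has_replica_in_every_shard(with_replica))  # True
```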

AWS Console Method

  1. Open the Amazon ElastiCache console
  2. In the left navigation, click Redis OSS caches
  3. Select the replication group you want to modify
  4. Click the Modify button
  5. In the Availability section, set Multi-AZ to Enabled
  6. Ensure Auto-failover is also set to Enabled (required for Multi-AZ)
  7. Under Schedule modifications, select Apply immediately (or schedule for your maintenance window)
  8. Click Modify

AWS CLI Method

Use the modify-replication-group command to enable Multi-AZ:

aws elasticache modify-replication-group \
--replication-group-id <your-replication-group-id> \
--multi-az-enabled \
--automatic-failover-enabled \
--apply-immediately \
--region us-east-1

Replace <your-replication-group-id> with your actual replication group identifier.
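
If you script this change in Python instead of the CLI, the same request can be sketched with boto3's modify_replication_group call. This is a minimal sketch under assumptions: the helper names are hypothetical, boto3 must be installed and configured with credentials, and the region default simply mirrors the CLI example above.

```python
def build_modify_request(replication_group_id, apply_immediately=True):
    """Build keyword arguments mirroring the CLI flags used above."""
    return {
        "ReplicationGroupId": replication_group_id,
        "MultiAZEnabled": True,
        "AutomaticFailoverEnabled": True,
        "ApplyImmediately": apply_immediately,
    }


def enable_multi_az(replication_group_id, region="us-east-1"):
    """Send the modification request; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("elasticache", region_name=region)
    return client.modify_replication_group(**build_modify_request(replication_group_id))


# Inspect the request without calling AWS ("my-redis" is a placeholder ID).
print(build_modify_request("my-redis"))
```

Keeping the request builder separate from the API call makes it possible to review or log the parameters before anything is changed.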

To find your replication group IDs:

aws elasticache describe-replication-groups \
--query "ReplicationGroups[*].[ReplicationGroupId,MultiAZ,AutomaticFailover]" \
--output table \
--region us-east-1

CloudFormation

To create a new Redis replication group with Multi-AZ enabled:

AWSTemplateFormatVersion: '2010-09-09'
Description: ElastiCache Redis replication group with Multi-AZ enabled

Parameters:
  ReplicationGroupId:
    Type: String
    Description: Identifier for the replication group
    Default: example-redis-cluster

  SubnetGroupName:
    Type: String
    Description: Name of the cache subnet group

  SecurityGroupId:
    Type: String
    Description: Security group ID for the cluster

Resources:
  RedisReplicationGroup:
    Type: AWS::ElastiCache::ReplicationGroup
    Properties:
      ReplicationGroupId: !Ref ReplicationGroupId
      ReplicationGroupDescription: Redis replication group with Multi-AZ enabled
      Engine: redis
      EngineVersion: '7.0'
      CacheNodeType: cache.t3.micro
      NumCacheClusters: 2
      MultiAZEnabled: true
      AutomaticFailoverEnabled: true
      Port: 6379
      CacheSubnetGroupName: !Ref SubnetGroupName
      SecurityGroupIds:
        - !Ref SecurityGroupId
      Tags:
        - Key: Environment
          Value: production

Outputs:
  PrimaryEndpoint:
    Description: Primary endpoint address
    Value: !GetAtt RedisReplicationGroup.PrimaryEndPoint.Address
  ReaderEndpoint:
    Description: Reader endpoint address
    Value: !GetAtt RedisReplicationGroup.ReaderEndPoint.Address

Key properties for Multi-AZ:

  • MultiAZEnabled: true - Enables Multi-AZ deployment
  • AutomaticFailoverEnabled: true - Required for Multi-AZ; enables automatic promotion of replica to primary
  • NumCacheClusters: 2 - Must be at least 2 (one primary + one replica) for Multi-AZ

Terraform

To create a new Redis replication group with Multi-AZ enabled (this example assumes the aws_elasticache_subnet_group.example and aws_security_group.redis resources are defined elsewhere in your configuration):

resource "aws_elasticache_replication_group" "example" {
  replication_group_id = "example-redis-cluster"
  description          = "Redis replication group with Multi-AZ enabled"

  engine         = "redis"
  engine_version = "7.0"
  node_type      = "cache.t3.micro"

  num_cache_clusters = 2

  # Enable Multi-AZ with automatic failover
  multi_az_enabled           = true
  automatic_failover_enabled = true

  port = 6379

  subnet_group_name  = aws_elasticache_subnet_group.example.name
  security_group_ids = [aws_security_group.redis.id]

  tags = {
    Environment = "production"
  }
}

Key arguments for Multi-AZ:

  • multi_az_enabled = true - Enables Multi-AZ deployment
  • automatic_failover_enabled = true - Required for Multi-AZ; enables automatic failover
  • num_cache_clusters = 2 - Must be at least 2 for Multi-AZ (primary + replica)
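
The three key arguments can also be checked before deployment. The following is a rough sketch, not part of Terraform or any official tooling: it assumes you have already extracted the resource's arguments into a plain Python dictionary (for example from a JSON plan), and it only covers configurations that set num_cache_clusters. The function name is hypothetical.

```python
def multi_az_problems(args):
    """Return a list of reasons the arguments do not satisfy Multi-AZ."""
    problems = []
    if args.get("multi_az_enabled") is not True:
        problems.append("multi_az_enabled must be true")
    if args.get("automatic_failover_enabled") is not True:
        problems.append("automatic_failover_enabled must be true")
    # A missing or null node count is treated as zero.
    if (args.get("num_cache_clusters") or 0) < 2:
        problems.append("num_cache_clusters must be at least 2")
    return problems


good = {"multi_az_enabled": True, "automatic_failover_enabled": True, "num_cache_clusters": 2}
bad = {"multi_az_enabled": True, "automatic_failover_enabled": False, "num_cache_clusters": 1}

print(multi_az_problems(good))  # []
print(multi_az_problems(bad))
```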

Verification

After applying the changes, verify Multi-AZ is enabled:

  1. In the ElastiCache console, select your replication group
  2. Check the Multi-AZ column shows Enabled
  3. Verify Auto-failover also shows Enabled

CLI Verification

aws elasticache describe-replication-groups \
--replication-group-id <your-replication-group-id> \
--query "ReplicationGroups[0].[MultiAZ,AutomaticFailover]" \
--output text \
--region us-east-1

Expected output:

enabled	enabled
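
The modification may take some time to complete, and automatic failover can report a transitional value such as enabling before it settles. A polling loop is one way to wait for the expected output. The sketch below is an assumption-laden example: wait_until_enabled and its parameters are hypothetical, and fetch_status stands in for whatever you use to read the two fields (for example, a wrapper around the CLI command above).

```python
import time


def wait_until_enabled(fetch_status, attempts=10, delay_seconds=30.0, sleep=time.sleep):
    """Poll until both Multi-AZ and automatic failover report "enabled".

    fetch_status is a caller-supplied function returning a
    (multi_az, automatic_failover) pair of status strings.
    """
    for attempt in range(attempts):
        multi_az, failover = fetch_status()
        if multi_az == "enabled" and failover == "enabled":
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next poll
    return False


# Demonstration with canned statuses and no real waiting.
statuses = iter([("disabled", "enabling"), ("enabled", "enabled")])
print(wait_until_enabled(lambda: next(statuses), sleep=lambda seconds: None))  # True
```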

Notes

  • Replica requirement: Multi-AZ requires at least one replica node. If your cluster has no replicas, you must add one before enabling Multi-AZ.
  • Automatic failover: Multi-AZ depends on automatic failover, so automatic failover must be enabled whenever Multi-AZ is enabled.
  • Brief interruption: Enabling Multi-AZ on an existing cluster may cause a brief interruption during the modification. Consider scheduling this during a maintenance window.
  • Cost consideration: Multi-AZ requires additional replica nodes in different Availability Zones, which increases costs. However, this is a worthwhile investment for production workloads requiring high availability.
  • Application configuration: Ensure your application uses the primary endpoint for writes and optionally the reader endpoint for read scaling. This allows seamless failover without application changes.
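
As a sketch of the endpoint guidance in the last note, the helper below picks the primary endpoint for writes and the reader endpoint for reads. It assumes a replication group with a single shard (cluster mode disabled) and data shaped like the describe-replication-groups output; the function name and the hostnames are hypothetical.

```python
def pick_endpoint(group, operation):
    """Return (address, port) for a "write" or "read" operation."""
    if operation not in ("read", "write"):
        raise ValueError("operation must be 'read' or 'write'")
    node_group = group["NodeGroups"][0]  # single shard assumed
    key = "PrimaryEndpoint" if operation == "write" else "ReaderEndpoint"
    endpoint = node_group[key]
    return endpoint["Address"], endpoint["Port"]


# Hypothetical sample; real endpoint addresses come from your own replication group.
sample_group = {
    "NodeGroups": [
        {
            "PrimaryEndpoint": {"Address": "primary.example.internal", "Port": 6379},
            "ReaderEndpoint": {"Address": "reader.example.internal", "Port": 6379},
        }
    ]
}

print(pick_endpoint(sample_group, "write"))  # ('primary.example.internal', 6379)
print(pick_endpoint(sample_group, "read"))   # ('reader.example.internal', 6379)
```

Because both endpoints are stable DNS names, the application keeps using the same addresses after a failover promotes a replica.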