MSK Cluster Should Use the Latest Version

Overview

This check verifies that your Amazon Managed Streaming for Apache Kafka (MSK) clusters are running the latest supported Apache Kafka version. Keeping your MSK clusters up to date ensures you benefit from the newest security patches, bug fixes, and performance improvements.

Risk

Running an outdated Kafka version exposes your cluster to several risks:

  • Security vulnerabilities: Older versions may contain known security flaws that attackers can exploit
  • Stability issues: Missing bug fixes can lead to broker crashes or partition instability
  • Forced upgrades: When versions reach end-of-support, AWS may automatically upgrade your cluster, potentially causing unexpected compatibility issues
  • Missing features: Newer versions include performance optimizations and capabilities that improve reliability

Remediation Steps

Prerequisites

You need:

  • Access to the AWS Console with permissions to modify MSK clusters, OR
  • AWS CLI configured with appropriate IAM permissions (kafka:UpdateClusterKafkaVersion, kafka:DescribeCluster)

Required IAM permissions

Your IAM user or role needs these permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kafka:ListClusters",
        "kafka:DescribeCluster",
        "kafka:ListKafkaVersions",
        "kafka:UpdateClusterKafkaVersion"
      ],
      "Resource": "*"
    }
  ]
}

AWS Console Method

  1. Open the Amazon MSK console
  2. Make sure the region selector shows the region that hosts your cluster (the examples in this guide use us-east-1)
  3. In the left navigation, click Clusters
  4. Click on the name of the cluster you want to upgrade
  5. On the cluster details page, note the current Apache Kafka version
  6. Click the Actions dropdown, then select Update Apache Kafka version
  7. In the dialog, select the latest available version from the dropdown
  8. Review the upgrade details and click Update
  9. Wait for the cluster status to return to Active (this can take 15-60 minutes depending on cluster size)

Important: MSK upgrades are performed with rolling restarts to minimize downtime, but brief connection interruptions may occur. Schedule upgrades during maintenance windows.

AWS CLI (optional)

Step 1: List available Kafka versions

aws kafka list-kafka-versions \
  --region us-east-1 \
  --query 'KafkaVersions[?Status==`ACTIVE`].Version' \
  --output table
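If you want to pick the newest entry from that list in a script, the following is a minimal sketch. It assumes a sort that supports version ordering (sort -V, as in GNU coreutils). The function names latest_version and latest_active_kafka_version are hypothetical helpers, not AWS CLI commands. The highest version string is not necessarily a valid upgrade target for every cluster, so treat the result as a candidate only.

```shell
# Print the highest version from a newline-separated list on stdin.
# Assumes sort -V (version sort) is available.
latest_version() {
  sort -V | tail -n 1
}

# Hypothetical wrapper: same query as above, but as plain text.
# Text output puts the versions on one tab-separated line, so split
# them onto separate lines before sorting.
latest_active_kafka_version() {
  aws kafka list-kafka-versions \
    --region "${1:-us-east-1}" \
    --query 'KafkaVersions[?Status==`ACTIVE`].Version' \
    --output text | tr '\t' '\n' | latest_version
}
```

Calling latest_active_kafka_version us-east-1 prints a single version string that you can compare against the cluster's current version.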

Step 2: Get your cluster's current version

aws kafka describe-cluster \
  --cluster-arn arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc123 \
  --region us-east-1 \
  --query 'ClusterInfo.{Name:ClusterName,CurrentVersion:CurrentBrokerSoftwareInfo.KafkaVersion,ClusterVersion:CurrentVersion}'

Step 3: Update the cluster to the latest version

aws kafka update-cluster-kafka-version \
  --cluster-arn arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc123 \
  --current-version K1A2B3C4D5E6F7G8H \
  --target-kafka-version 3.6.0 \
  --region us-east-1

Replace:

  • arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc123 with your cluster ARN
  • K1A2B3C4D5E6F7G8H with the ClusterVersion value from Step 2 (not the Kafka version)
  • 3.6.0 with the latest version from Step 1
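Because the ClusterVersion value is easy to confuse with the Kafka version, you may prefer to read it in the same script that starts the update. The following is a minimal sketch under that assumption. upgrade_cluster is a hypothetical helper name, and the region defaults to us-east-1 only to match the examples in this guide.

```shell
# Sketch: fetch the cluster version token right before the update so a
# stale or mistyped value is never passed to --current-version.
# Usage: upgrade_cluster CLUSTER_ARN TARGET_KAFKA_VERSION [REGION]
upgrade_cluster() {
  cluster_arn=$1
  target_version=$2
  region=${3:-us-east-1}

  # ClusterInfo.CurrentVersion is the cluster version token,
  # not the Apache Kafka version.
  current_version=$(aws kafka describe-cluster \
    --cluster-arn "$cluster_arn" \
    --region "$region" \
    --query 'ClusterInfo.CurrentVersion' \
    --output text) || return 1

  aws kafka update-cluster-kafka-version \
    --cluster-arn "$cluster_arn" \
    --current-version "$current_version" \
    --target-kafka-version "$target_version" \
    --region "$region"
}
```

The helper stops before calling the update if the describe call fails, so a failed lookup never starts an upgrade.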

Step 4: Monitor the upgrade progress

aws kafka describe-cluster \
  --cluster-arn arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc123 \
  --region us-east-1 \
  --query 'ClusterInfo.State'

The cluster will show UPDATING during the upgrade and return to ACTIVE when complete.
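Rather than rerunning the command by hand, you can poll it in a loop. The following is a minimal sketch. wait_for_active is a hypothetical helper, and the 60-second interval and 90-attempt limit are arbitrary defaults chosen for illustration, not values recommended by AWS.

```shell
# Sketch: poll the cluster state until it returns to ACTIVE.
# Usage: wait_for_active CLUSTER_ARN [REGION] [INTERVAL_SECONDS] [MAX_ATTEMPTS]
wait_for_active() {
  cluster_arn=$1
  region=${2:-us-east-1}
  interval=${3:-60}       # seconds between polls (assumed default)
  max_attempts=${4:-90}   # stop polling after this many attempts

  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    state=$(aws kafka describe-cluster \
      --cluster-arn "$cluster_arn" \
      --region "$region" \
      --query 'ClusterInfo.State' \
      --output text) || return 1
    echo "state: $state"
    case "$state" in
      ACTIVE) return 0 ;;   # upgrade finished
      FAILED) return 1 ;;   # no point in waiting further
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1                  # still not ACTIVE after max_attempts polls
}
```

If you call it immediately after starting an update, the state may not have changed yet, so consider a short pause before the first poll.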

CloudFormation (optional)

Update the KafkaVersion property in your CloudFormation template to the latest supported version:

AWSTemplateFormatVersion: '2010-09-09'
Description: MSK Cluster with latest Kafka version

Parameters:
  KafkaVersion:
    Type: String
    Default: '3.6.0'
    Description: Apache Kafka version for the MSK cluster
  ClusterName:
    Type: String
    Default: my-msk-cluster
    Description: Name of the MSK cluster
  VpcId:
    Type: AWS::EC2::VPC::Id
    Description: VPC ID for the MSK cluster
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
    Description: Subnet IDs for the MSK cluster (the broker count must be a multiple of the number of subnets)

Resources:
  MSKCluster:
    Type: AWS::MSK::Cluster
    Properties:
      ClusterName: !Ref ClusterName
      KafkaVersion: !Ref KafkaVersion
      NumberOfBrokerNodes: 3
      BrokerNodeGroupInfo:
        InstanceType: kafka.m5.large
        ClientSubnets: !Ref SubnetIds
        SecurityGroups:
          - !Ref MSKSecurityGroup
        StorageInfo:
          EBSStorageInfo:
            VolumeSize: 100
      EncryptionInfo:
        EncryptionInTransit:
          ClientBroker: TLS
          InCluster: true
        EncryptionAtRest:
          DataVolumeKMSKeyId: alias/aws/kafka
      EnhancedMonitoring: PER_TOPIC_PER_BROKER
      LoggingInfo:
        BrokerLogs:
          CloudWatchLogs:
            Enabled: true
            LogGroup: !Ref MSKLogGroup

  MSKSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for MSK cluster
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 9092
          ToPort: 9098
          CidrIp: 10.0.0.0/8

  MSKLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub /aws/msk/${ClusterName}
      RetentionInDays: 30

Outputs:
  ClusterArn:
    Description: ARN of the MSK cluster
    Value: !Ref MSKCluster
  KafkaVersion:
    Description: Kafka version of the cluster
    Value: !Ref KafkaVersion

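Update the existing stack. Pass UsePreviousValue=true for the parameters you are not changing; otherwise ClusterName reverts to its template default and the update fails because VpcId and SubnetIds have no default:

aws cloudformation update-stack \
  --stack-name my-msk-stack \
  --template-body file://template.yaml \
  --parameters \
    ParameterKey=KafkaVersion,ParameterValue=3.6.0 \
    ParameterKey=ClusterName,UsePreviousValue=true \
    ParameterKey=VpcId,UsePreviousValue=true \
    ParameterKey=SubnetIds,UsePreviousValue=true \
  --region us-east-1

Terraform (optional)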

Update the kafka_version argument in your aws_msk_cluster resource:

variable "kafka_version" {
  description = "Apache Kafka version for the MSK cluster"
  type        = string
  default     = "3.6.0"
}

variable "cluster_name" {
  description = "Name of the MSK cluster"
  type        = string
  default     = "my-msk-cluster"
}

variable "vpc_id" {
  description = "VPC ID for the MSK cluster"
  type        = string
}

variable "subnet_ids" {
  description = "List of subnet IDs for the MSK cluster"
  type        = list(string)
}

resource "aws_msk_cluster" "main" {
  cluster_name           = var.cluster_name
  kafka_version          = var.kafka_version
  number_of_broker_nodes = 3

  broker_node_group_info {
    instance_type   = "kafka.m5.large"
    client_subnets  = var.subnet_ids
    security_groups = [aws_security_group.msk.id]

    storage_info {
      ebs_storage_info {
        volume_size = 100
      }
    }
  }

  encryption_info {
    encryption_in_transit {
      client_broker = "TLS"
      in_cluster    = true
    }
  }

  enhanced_monitoring = "PER_TOPIC_PER_BROKER"

  logging_info {
    broker_logs {
      cloudwatch_logs {
        enabled   = true
        log_group = aws_cloudwatch_log_group.msk.name
      }
    }
  }
}

resource "aws_security_group" "msk" {
  name        = "${var.cluster_name}-sg"
  description = "Security group for MSK cluster"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 9092
    to_port     = 9098
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }
}

resource "aws_cloudwatch_log_group" "msk" {
  name              = "/aws/msk/${var.cluster_name}"
  retention_in_days = 30
}

output "cluster_arn" {
  description = "ARN of the MSK cluster"
  value       = aws_msk_cluster.main.arn
}

output "kafka_version" {
  description = "Kafka version of the cluster"
  value       = aws_msk_cluster.main.kafka_version
}

Apply the changes:

terraform plan -var="kafka_version=3.6.0"
terraform apply -var="kafka_version=3.6.0"

Verification

After the upgrade completes, verify the cluster is running the new version:

  1. In the MSK console, click on your cluster name
  2. Check that the Apache Kafka version shows the latest version
  3. Confirm the cluster Status is Active

CLI verification

aws kafka describe-cluster \
  --cluster-arn arn:aws:kafka:us-east-1:123456789012:cluster/my-cluster/abc123 \
  --region us-east-1 \
  --query 'ClusterInfo.{Name:ClusterName,KafkaVersion:CurrentBrokerSoftwareInfo.KafkaVersion,State:State}'

Expected output:

{
  "Name": "my-cluster",
  "KafkaVersion": "3.6.0",
  "State": "ACTIVE"
}
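To run this check from a script or pipeline instead of reading the output by eye, the comparison can be sketched as follows. check_upgrade and verify_cluster are hypothetical helper names. The query is rewritten as a list so that text output yields the version and state in a fixed order on one line.

```shell
# Sketch: succeed only if the reported Kafka version matches the expected
# one and the cluster state is ACTIVE.
# Usage: check_upgrade EXPECTED_VERSION ACTUAL_VERSION STATE
check_upgrade() {
  expected_version=$1
  actual_version=$2
  state=$3
  if [ "$actual_version" = "$expected_version" ] && [ "$state" = "ACTIVE" ]; then
    echo "OK: $actual_version ($state)"
  else
    echo "MISMATCH: expected $expected_version (ACTIVE), got $actual_version ($state)" >&2
    return 1
  fi
}

# Hypothetical wrapper around the describe-cluster call shown above.
# Usage: verify_cluster CLUSTER_ARN EXPECTED_VERSION [REGION]
verify_cluster() {
  cluster_arn=$1
  expected_version=$2
  region=${3:-us-east-1}
  aws kafka describe-cluster \
    --cluster-arn "$cluster_arn" \
    --region "$region" \
    --query 'ClusterInfo.[CurrentBrokerSoftwareInfo.KafkaVersion,State]' \
    --output text | {
      # Text output is "<kafka version><TAB><state>" on a single line.
      read -r actual_version state
      check_upgrade "$expected_version" "$actual_version" "$state"
    }
}
```

verify_cluster returns a non-zero exit status on any mismatch, so it can gate later steps in a deployment script.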

Notes

  • Upgrade path: MSK only supports upgrading to newer versions; you cannot downgrade. Always test upgrades in a non-production environment first.
  • Client compatibility: Before upgrading, verify that your Kafka clients (producers and consumers) are compatible with the target Kafka version. Check your client library documentation for version compatibility.
  • Rolling upgrade: MSK performs rolling updates, upgrading one broker at a time. This minimizes downtime but may cause brief rebalancing of partitions.
  • Configuration changes: To update the cluster configuration together with the version, add the --configuration-info parameter to the update-cluster-kafka-version command.
  • MSK Serverless: If you use MSK Serverless, version management is automatic and this check does not apply.
  • Maintenance windows: Schedule upgrades during low-traffic periods to minimize impact on your applications.
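
The upgrade path note above can be checked per cluster before you start. The following is a minimal sketch that uses the get-compatible-kafka-versions command to list the versions a given cluster can move to. compatible_targets and is_compatible_target are hypothetical helper names, and the region default is only there to match the examples in this guide.

```shell
# Sketch: list the Kafka versions this cluster can be upgraded to,
# one per line.
# Usage: compatible_targets CLUSTER_ARN [REGION]
compatible_targets() {
  aws kafka get-compatible-kafka-versions \
    --cluster-arn "$1" \
    --region "${2:-us-east-1}" \
    --query 'CompatibleKafkaVersions[].TargetVersions[]' \
    --output text | tr '\t' '\n'
}

# Succeed only if TARGET_VERSION appears in that list as an exact match.
# Usage: is_compatible_target CLUSTER_ARN TARGET_VERSION [REGION]
is_compatible_target() {
  compatible_targets "$1" "${3:-us-east-1}" | grep -Fxq "$2"
}
```

A non-zero exit status from is_compatible_target means the target is not in the list for that cluster, in which case pick a version from compatible_targets instead.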