Step Functions State Machines Should Have Logging Enabled

Overview

This check verifies that AWS Step Functions state machines have execution logging enabled to CloudWatch Logs. The check fails if the state machine has no loggingConfiguration defined, or if its logging level is OFF.

Logging captures workflow execution history, making it possible to troubleshoot failures, monitor performance, and meet compliance requirements.
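
The pass/fail rule above can be sketched in a few lines of Python. This is a minimal illustration, not the check's actual implementation: the function name logging_enabled is hypothetical, and the input is assumed to be the loggingConfiguration object as returned by describe-state-machine.

```python
from typing import Any, Mapping, Optional


def logging_enabled(logging_configuration: Optional[Mapping[str, Any]]) -> bool:
    """Return True when a logging level other than OFF is configured.

    Hypothetical helper: it mirrors the rule described in this section and
    treats a missing configuration or a missing level as OFF.
    """
    if not logging_configuration:
        return False
    level = logging_configuration.get("level") or "OFF"
    return str(level).upper() != "OFF"


print(logging_enabled({"level": "ERROR"}))  # True
print(logging_enabled({"level": "OFF"}))    # False
print(logging_enabled(None))                # False
```

The sketch looks only at the level; it does not confirm that the destination log group exists or that log delivery is working.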

Risk

Without execution logs:

  • Workflow failures go undetected - You cannot see what went wrong or when
  • Troubleshooting becomes difficult - No visibility into execution history increases recovery time
  • Audit trails are missing - Compliance frameworks (PCI DSS, SOC 2, ISO 27001) require logging for accountability
  • Security incidents may go unnoticed - Unauthorized or anomalous executions leave no evidence

Remediation Steps

Prerequisites

  • Access to the AWS Console with permissions to modify Step Functions state machines
  • A CloudWatch Logs log group to receive the logs (you can create one during setup)

AWS Console Method

  1. Open the AWS Step Functions Console
  2. Click State machines in the left navigation
  3. Select the state machine you want to configure
  4. Click Edit
  5. Scroll down to the Logging section
  6. Toggle logging ON
  7. Select an existing CloudWatch Logs log group, or create a new one
  8. Set the Log level to one of:
    • ERROR - Logs only failed executions (recommended for most use cases)
    • ALL - Logs all execution events (useful for debugging)
  9. Optionally enable Include execution data if you need input/output details in logs
  10. Click Save

Note: The state machine's IAM role must have permissions to write to CloudWatch Logs. If you see permission errors, check the role's policies (see the IAM permissions item under Notes).

AWS CLI (optional)

First, create a logging configuration file:

cat > logging-config.json << 'EOF'
{
  "level": "ERROR",
  "includeExecutionData": false,
  "destinations": [
    {
      "cloudWatchLogsLogGroup": {
        "logGroupArn": "arn:aws:logs:us-east-1:<account-id>:log-group:/aws/vendedlogs/states/<state-machine-name>:*"
      }
    }
  ]
}
EOF

Replace <account-id> and <state-machine-name> with your values.
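
If you prefer to generate the file rather than edit placeholders by hand, the same structure can be built programmatically. The following is a sketch under stated assumptions: build_logging_config is a hypothetical helper, the account ID and state machine name are example values, and the log group is assumed to follow the /aws/vendedlogs/states/ naming used in this guide.

```python
import json


def build_logging_config(account_id: str, state_machine_name: str,
                         region: str = "us-east-1", level: str = "ERROR",
                         include_execution_data: bool = False) -> dict:
    """Build a logging configuration shaped like logging-config.json.

    Hypothetical helper; OFF is rejected because it would disable logging.
    """
    if level not in ("ALL", "ERROR", "FATAL"):
        raise ValueError(f"unsupported logging level: {level!r}")
    log_group_arn = (
        f"arn:aws:logs:{region}:{account_id}:log-group:"
        f"/aws/vendedlogs/states/{state_machine_name}:*"
    )
    return {
        "level": level,
        "includeExecutionData": include_execution_data,
        "destinations": [
            {"cloudWatchLogsLogGroup": {"logGroupArn": log_group_arn}}
        ],
    }


# Example values only; substitute your own account ID and state machine name.
print(json.dumps(build_logging_config("123456789012", "MyStateMachine"), indent=2))
```

Redirect the printed JSON to logging-config.json to use it with the update command.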

Then update the state machine:

aws stepfunctions update-state-machine \
  --region us-east-1 \
  --state-machine-arn arn:aws:states:us-east-1:<account-id>:stateMachine:<state-machine-name> \
  --logging-configuration file://logging-config.json

Logging Levels:

  • ALL - Logs everything (Start, Pass, Fail, Succeed, etc.)
  • ERROR - Logs only failed executions
  • FATAL - Logs only fatal errors
  • OFF - Disables logging (the setting this check flags)

If the log group does not exist yet, create it before running update-state-machine:

aws logs create-log-group \
  --region us-east-1 \
  --log-group-name /aws/vendedlogs/states/<state-machine-name>

CloudFormation (optional)

This template creates a Step Functions state machine with logging properly configured:

AWSTemplateFormatVersion: '2010-09-09'
Description: Step Functions State Machine with logging enabled

Parameters:
  StateMachineName:
    Type: String
    Description: Name of the Step Functions state machine
    Default: MyStateMachine
  LogRetentionDays:
    Type: Number
    Description: Number of days to retain logs
    Default: 30
    AllowedValues: [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1096, 1827, 2192, 2557, 2922, 3288, 3653]

Resources:
  StateMachineLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub '/aws/vendedlogs/states/${StateMachineName}'
      RetentionInDays: !Ref LogRetentionDays

  StateMachineRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: states.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: CloudWatchLogsDeliveryPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogDelivery
                  - logs:GetLogDelivery
                  - logs:UpdateLogDelivery
                  - logs:DeleteLogDelivery
                  - logs:ListLogDeliveries
                  - logs:PutResourcePolicy
                  - logs:DescribeResourcePolicies
                  - logs:DescribeLogGroups
                Resource: '*'

  StateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      StateMachineName: !Ref StateMachineName
      RoleArn: !GetAtt StateMachineRole.Arn
      LoggingConfiguration:
        Level: ERROR
        IncludeExecutionData: false
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt StateMachineLogGroup.Arn
      Definition:
        Comment: Sample state machine with logging
        StartAt: HelloWorld
        States:
          HelloWorld:
            Type: Pass
            End: true

Outputs:
  StateMachineArn:
    Description: ARN of the state machine
    Value: !Ref StateMachine
  LogGroupArn:
    Description: ARN of the CloudWatch Logs log group
    Value: !GetAtt StateMachineLogGroup.Arn

Deploy with:

aws cloudformation deploy \
  --region us-east-1 \
  --template-file template.yaml \
  --stack-name stepfunctions-logging-stack \
  --capabilities CAPABILITY_IAM \
  --parameter-overrides StateMachineName=MyStateMachine

Terraform (optional)

terraform {
  required_version = ">= 1.0"
  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = ">= 4.0"
    }
  }
}

variable "state_machine_name" {
  description = "Name of the Step Functions state machine"
  type = string
  default = "my-state-machine"
}

variable "log_retention_days" {
  description = "Number of days to retain CloudWatch Logs"
  type = number
  default = 30
}

variable "logging_level" {
  description = "Logging level for the state machine (ALL, ERROR, FATAL)"
  type = string
  default = "ERROR"
  validation {
    condition = contains(["ALL", "ERROR", "FATAL"], var.logging_level)
    error_message = "Logging level must be ALL, ERROR, or FATAL."
  }
}

# CloudWatch Log Group for Step Functions
resource "aws_cloudwatch_log_group" "sfn_logs" {
  name = "/aws/vendedlogs/states/${var.state_machine_name}"
  retention_in_days = var.log_retention_days
}

# IAM Role for Step Functions
resource "aws_iam_role" "sfn_role" {
  name = "${var.state_machine_name}-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Service = "states.amazonaws.com"
        }
        Action = "sts:AssumeRole"
      }
    ]
  })
}

# IAM Policy for CloudWatch Logs
resource "aws_iam_role_policy" "sfn_logging" {
  name = "${var.state_machine_name}-logging-policy"
  role = aws_iam_role.sfn_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogDelivery",
          "logs:GetLogDelivery",
          "logs:UpdateLogDelivery",
          "logs:DeleteLogDelivery",
          "logs:ListLogDeliveries",
          "logs:PutResourcePolicy",
          "logs:DescribeResourcePolicies",
          "logs:DescribeLogGroups"
        ]
        Resource = "*"
      }
    ]
  })
}

# Step Functions State Machine with logging
resource "aws_sfn_state_machine" "main" {
  name = var.state_machine_name
  role_arn = aws_iam_role.sfn_role.arn

  logging_configuration {
    log_destination = "${aws_cloudwatch_log_group.sfn_logs.arn}:*"
    level = var.logging_level
    include_execution_data = false
  }

  definition = jsonencode({
    Comment = "Sample state machine with logging"
    StartAt = "HelloWorld"
    States = {
      HelloWorld = {
        Type = "Pass"
        End = true
      }
    }
  })
}

output "state_machine_arn" {
  description = "ARN of the Step Functions state machine"
  value = aws_sfn_state_machine.main.arn
}

output "log_group_arn" {
  description = "ARN of the CloudWatch Logs log group"
  value = aws_cloudwatch_log_group.sfn_logs.arn
}

Apply with:

terraform init
terraform apply -var="state_machine_name=my-state-machine"

Verification

After enabling logging:

  1. Go to the Step Functions Console
  2. Select your state machine
  3. Look for the Logging section - it should show the log level and destination
  4. Run a test execution and confirm logs appear in CloudWatch

CLI verification commands

# Describe the state machine and check logging configuration
aws stepfunctions describe-state-machine \
  --region us-east-1 \
  --state-machine-arn arn:aws:states:us-east-1:<account-id>:stateMachine:<state-machine-name> \
  --query 'loggingConfiguration'

Expected output should show a logging level other than OFF:

{
  "level": "ERROR",
  "includeExecutionData": false,
  "destinations": [
    {
      "cloudWatchLogsLogGroup": {
        "logGroupArn": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/vendedlogs/states/MyStateMachine:*"
      }
    }
  ]
}

Run Prowler to verify the fix:

prowler aws --check stepfunctions_statemachine_logging_enabled --region us-east-1
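
To check every state machine in a region rather than one at a time, the describe call can be wrapped in a short script. The sketch below assumes boto3 is installed and credentials are configured; find_unlogged_state_machines and main are hypothetical names, and the client is passed in so the logic can be exercised without calling AWS.

```python
from typing import Any, Iterator, Tuple


def find_unlogged_state_machines(sfn_client: Any) -> Iterator[Tuple[str, str]]:
    """Yield (name, arn) for state machines whose logging level is OFF or unset."""
    paginator = sfn_client.get_paginator("list_state_machines")
    for page in paginator.paginate():
        for summary in page["stateMachines"]:
            arn = summary["stateMachineArn"]
            details = sfn_client.describe_state_machine(stateMachineArn=arn)
            config = details.get("loggingConfiguration") or {}
            if (config.get("level") or "OFF") == "OFF":
                yield summary["name"], arn


def main() -> None:
    # boto3 is a third-party dependency; imported here so the helper above
    # stays importable without it. The region is an example value.
    import boto3

    client = boto3.client("stepfunctions", region_name="us-east-1")
    for name, arn in find_unlogged_state_machines(client):
        print(f"logging disabled: {name} ({arn})")
```

Call main() to run the scan. An empty result means no state machine in that region has logging turned off.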

Notes

  • Logging levels: Choose ERROR for production (captures failures without excessive logs) or ALL for debugging
  • Sensitive data: Set includeExecutionData to false if your workflows process sensitive information to avoid logging it
  • IAM permissions: The state machine's execution role needs CloudWatch Logs permissions. If logging fails silently, check the role's policies
  • Log retention: Set a retention policy on your log group to control storage costs
  • Cost considerations: Logging to CloudWatch incurs charges based on data ingested and stored. ERROR level logging is more cost-effective than ALL
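  • Express workflows: Express state machines do not keep execution history in Step Functions the way Standard workflows do, so CloudWatch Logs is the main record of their executions - see AWS documentation for details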
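
As an illustration of the log retention note, the following sketch sets a retention period on an existing log group. It assumes boto3 and an already-created CloudWatch Logs client; set_log_retention is a hypothetical helper, and the allowed values are the ones listed in the CloudFormation template in this guide.

```python
from typing import Any

# Retention periods (in days) taken from the AllowedValues list in the
# CloudFormation template above.
ALLOWED_RETENTION_DAYS = (
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731,
    1096, 1827, 2192, 2557, 2922, 3288, 3653,
)


def set_log_retention(logs_client: Any, log_group_name: str, days: int = 30) -> None:
    """Validate the retention period locally, then apply it to the log group."""
    if days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"{days} is not an allowed retention period")
    logs_client.put_retention_policy(
        logGroupName=log_group_name,
        retentionInDays=days,
    )
```

For example, with logs = boto3.client("logs", region_name="us-east-1"), calling set_log_retention(logs, "/aws/vendedlogs/states/MyStateMachine", 30) would keep 30 days of logs. Validating locally first avoids a round trip for an unsupported value.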