
Enable CloudWatch Logging for AWS Glue ETL Jobs

Note: This Prowler check is marked as DEPRECATED. The remediation guidance below remains valid, but the check may be removed in future Prowler versions.

Overview

This check verifies that AWS Glue ETL jobs have continuous CloudWatch logging enabled. CloudWatch logging captures detailed job execution information, making it easier to monitor performance, troubleshoot failures, and detect security issues.

Risk

Without logging enabled, you have little visibility into what AWS Glue jobs do or why they fail. This makes it difficult to:

  • Detect unauthorized access or credential abuse
  • Troubleshoot job failures and performance issues
  • Investigate potential data exfiltration in ETL scripts
  • Meet compliance requirements for audit trails

Missing logs can hide tampering with data transformations and leave security incidents undetected.

Remediation Steps

Prerequisites

You need permission to read and modify AWS Glue jobs. This typically requires the glue:GetJob and glue:UpdateJob permissions, and may also require iam:PassRole for the job's IAM role.

AWS Console Method

  1. Open the AWS Glue Console in your AWS account
  2. In the left navigation, click ETL jobs under "Data Integration and ETL"
  3. Select the job you want to update
  4. Click Edit (or click the job name, then Edit job)
  5. Scroll down and expand Advanced properties
  6. Find the Continuous logging section
  7. Check the box to Enable CloudWatch logs
  8. (Optional) Specify a custom log group name
  9. Click Save

AWS CLI Method

Use the update-job command to enable continuous CloudWatch logging:

aws glue update-job \
  --region us-east-1 \
  --job-name <your-job-name> \
  --job-update '{
    "DefaultArguments": {
      "--enable-continuous-cloudwatch-log": "true"
    }
  }'

Replace <your-job-name> with the actual name of your Glue job.

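Important: The update-job command overwrites the entire job definition with the contents of --job-update; it does not merge. Settings you omit, such as Role, Command, worker configuration, and other default arguments, are not preserved. Treat the command above as showing only the logging argument, and retrieve the current job definition first:

# Get the current job definition
aws glue get-job \
  --region us-east-1 \
  --job-name <your-job-name> \
  --query 'Job'

Then build the --job-update payload from that output: keep the existing settings (Role, Command, and all existing default arguments) and add the logging parameter. Leave out fields that get-job returns but the job update does not accept, such as Name, CreatedOn, and LastModifiedOn.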
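Because update-job replaces the whole job definition, the read-modify-write sequence described above can also be scripted. The following is a minimal sketch, assuming a boto3 Glue client is passed in; get_job and update_job are the boto3 operations, while build_job_update, enable_continuous_logging, and READ_ONLY_FIELDS are hypothetical names chosen for illustration. The list of fields to strip is an assumption and should be checked against the current Glue API reference.

```python
import copy

# Assumption: these fields appear in get_job output but should not be sent
# back inside JobUpdate. Verify against the current Glue API reference.
READ_ONLY_FIELDS = {"Name", "CreatedOn", "LastModifiedOn", "AllocatedCapacity"}

LOGGING_ARG = "--enable-continuous-cloudwatch-log"
LOG_GROUP_ARG = "--continuous-log-logGroup"


def build_job_update(job, log_group=None):
    """Return a JobUpdate dict that keeps existing settings and enables logging.

    `job` is the "Job" dict from a get_job response. The input is not mutated.
    """
    update = {
        key: copy.deepcopy(value)
        for key, value in job.items()
        if key not in READ_ONLY_FIELDS
    }
    # MaxCapacity should not be sent together with WorkerType/NumberOfWorkers.
    if "WorkerType" in update or "NumberOfWorkers" in update:
        update.pop("MaxCapacity", None)

    # Merge the logging arguments into whatever default arguments already exist.
    arguments = update.setdefault("DefaultArguments", {})
    arguments[LOGGING_ARG] = "true"
    if log_group:
        arguments[LOG_GROUP_ARG] = log_group
    return update


def enable_continuous_logging(glue_client, job_name, log_group=None):
    """Fetch the job, merge in the logging arguments, and write it back."""
    job = glue_client.get_job(JobName=job_name)["Job"]
    job_update = build_job_update(job, log_group)
    glue_client.update_job(JobName=job_name, JobUpdate=job_update)
    return job_update
```

A possible call would be enable_continuous_logging(boto3.client("glue", region_name="us-east-1"), "<your-job-name>"). Keeping the merge in a separate pure function makes it easy to inspect the payload before anything is written to AWS.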

Optional: Specify a custom log group:

aws glue update-job \
  --region us-east-1 \
  --job-name <your-job-name> \
  --job-update '{
    "DefaultArguments": {
      "--enable-continuous-cloudwatch-log": "true",
      "--continuous-log-logGroup": "/aws-glue/jobs/<your-job-name>"
    }
  }'

CloudFormation

Add the --enable-continuous-cloudwatch-log parameter to your Glue job's DefaultArguments:

AWSTemplateFormatVersion: '2010-09-09'
Description: AWS Glue ETL Job with CloudWatch logging enabled

Parameters:
  JobName:
    Type: String
    Description: Name of the Glue job
  ScriptLocation:
    Type: String
    Description: S3 path to the ETL script
  IAMRoleArn:
    Type: String
    Description: ARN of the IAM role for the Glue job

Resources:
  GlueJob:
    Type: AWS::Glue::Job
    Properties:
      Name: !Ref JobName
      Role: !Ref IAMRoleArn
      Command:
        Name: glueetl
        ScriptLocation: !Ref ScriptLocation
        PythonVersion: "3"
      GlueVersion: "4.0"
      DefaultArguments:
        "--enable-continuous-cloudwatch-log": "true"
        "--continuous-log-logGroup": !Sub "/aws-glue/jobs/${JobName}"
      NumberOfWorkers: 2
      WorkerType: G.1X

Outputs:
  JobName:
    Description: Name of the created Glue job
    Value: !Ref GlueJob

Key configuration:

  • --enable-continuous-cloudwatch-log: Set to "true" to enable logging
  • --continuous-log-logGroup: (Optional) Custom log group name

Terraform

Add the logging arguments to your aws_glue_job resource's default_arguments:

terraform {
  required_version = ">= 1.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

variable "job_name" {
  description = "Name of the Glue job"
  type        = string
}

variable "script_location" {
  description = "S3 path to the ETL script"
  type        = string
}

variable "iam_role_arn" {
  description = "ARN of the IAM role for the Glue job"
  type        = string
}

resource "aws_glue_job" "etl_job" {
  name     = var.job_name
  role_arn = var.iam_role_arn

  command {
    name            = "glueetl"
    script_location = var.script_location
    python_version  = "3"
  }

  glue_version      = "4.0"
  number_of_workers = 2
  worker_type       = "G.1X"

  default_arguments = {
    "--enable-continuous-cloudwatch-log" = "true"
    "--continuous-log-logGroup"          = "/aws-glue/jobs/${var.job_name}"
  }
}

output "job_name" {
  description = "Name of the created Glue job"
  value       = aws_glue_job.etl_job.name
}

Key configuration:

  • --enable-continuous-cloudwatch-log: Set to "true" to enable logging
  • --continuous-log-logGroup: (Optional) Custom log group for job logs

Verification

After enabling logging, verify the configuration:

  1. In the AWS Glue Console, open your job
  2. Check the Advanced properties section
  3. Confirm Continuous logging shows as enabled
  4. Run the job and check CloudWatch Logs for the log group (default: /aws-glue/jobs/logs-v2)

CLI Verification

# Check if logging is enabled for a specific job
aws glue get-job \
  --region us-east-1 \
  --job-name <your-job-name> \
  --query 'Job.DefaultArguments."--enable-continuous-cloudwatch-log"'

Expected output: "true"

To list all jobs and their logging status:

aws glue get-jobs \
  --region us-east-1 \
  --query 'Jobs[*].{Name:Name,Logging:DefaultArguments."--enable-continuous-cloudwatch-log"}'
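For accounts with many jobs, the same check can be scripted so that only non-compliant jobs are reported. The following is a rough sketch, assuming job dicts shaped like the entries in a get-jobs response; the function names are hypothetical, and iter_all_jobs assumes a boto3 Glue client and its get_jobs paginator.

```python
LOGGING_ARG = "--enable-continuous-cloudwatch-log"


def is_logging_enabled(job):
    """True when the job's default arguments set the logging flag to "true"."""
    arguments = job.get("DefaultArguments") or {}
    return arguments.get(LOGGING_ARG) == "true"


def jobs_without_logging(jobs):
    """Return the sorted names of jobs that do not have logging enabled."""
    return sorted(job["Name"] for job in jobs if not is_logging_enabled(job))


def iter_all_jobs(glue_client):
    """Yield every job definition in the client's region, following pagination."""
    paginator = glue_client.get_paginator("get_jobs")
    for page in paginator.paginate():
        yield from page["Jobs"]
```

With a real client this could be used as jobs_without_logging(iter_all_jobs(boto3.client("glue", region_name="us-east-1"))). Note that the sketch inspects only default arguments, matching the CLI queries above; arguments passed to an individual job run are not considered.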

Notes

  • IAM permissions: The Glue job's IAM role must have permissions to write to CloudWatch Logs. Ensure the role has the logs:CreateLogGroup, logs:CreateLogStream, and logs:PutLogEvents permissions.
  • Cost consideration: Continuous logging generates CloudWatch Logs data, which incurs charges. Consider setting log retention policies to manage costs.
  • Log encryption: For sensitive data, configure the CloudWatch log group with KMS encryption.
  • Deprecation: This Prowler check is deprecated. Consider reviewing your Glue job monitoring strategy with AWS-native tools like AWS Security Hub.
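
The IAM and cost notes above can be illustrated with a short sketch. It assumes the standard "aws" partition and a custom log group name; logs_write_policy and set_retention are hypothetical helper names, and set_retention assumes a boto3 CloudWatch Logs client (its put_retention_policy operation). The 30-day default is only an example value.

```python
def logs_write_policy(region, account_id, log_group):
    """Build an IAM policy document granting the three log permissions
    listed above, scoped to a single log group and its streams.

    Assumption: the standard "aws" partition (not GovCloud or China regions).
    """
    group_arn = f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                # The second ARN covers the log streams inside the group.
                "Resource": [group_arn, f"{group_arn}:*"],
            }
        ],
    }


def set_retention(logs_client, log_group, days=30):
    """Apply a retention period so old log events expire automatically.

    CloudWatch Logs accepts only specific day counts (for example 7, 30, 90).
    """
    logs_client.put_retention_policy(logGroupName=log_group, retentionInDays=days)
```

The policy document can be serialized with json.dumps and attached to the job's role. Scoping the resource to one log group is a design choice made for this sketch; a broader resource pattern may be needed if the job writes to the default log group.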