Ensure Glue Data Catalogs Are Not Publicly Accessible
Overview
This check verifies that your AWS Glue Data Catalog does not have a resource policy that allows public access. The Glue Data Catalog stores metadata about your data assets (databases, tables, schemas, and connection information). If this metadata is publicly accessible, unauthorized users can discover sensitive information about your data infrastructure.
Risk
When a Glue Data Catalog is publicly accessible:
- Information disclosure: Attackers can enumerate your database schemas, table structures, and S3 data locations
- Data pipeline compromise: If write permissions are exposed, attackers can modify databases and tables, corrupting data lineage
- Lateral movement: Connection metadata may reveal paths to other data stores in your environment
- Compliance violations: A publicly accessible data catalog conflicts with the access-control and least-privilege requirements found in most security frameworks and regulations
Severity: High
Remediation Steps
Prerequisites
You need permission to view and modify Glue Data Catalog settings. Specifically, you need the glue:GetResourcePolicy, glue:PutResourcePolicy, and glue:DeleteResourcePolicy permissions.
AWS Console Method
- Sign in to the AWS Glue Console
- In the left navigation, click Settings (under "Data Catalog")
- Look for the Permissions section showing the resource policy
- Review the policy for any statements containing:
"Principal": "*"
"Principal": {"AWS": "*"}
- Click Edit resource policy
- Either:
- Remove the problematic statements that grant public access, or
- Delete the entire policy if no resource policy is needed
- Click Save
AWS CLI (optional)
View the Current Policy
aws glue get-resource-policy --region us-east-1
The policy is returned in the PolicyInJson field as an escaped JSON string. If it contains an Allow statement with "Principal": "*" or "Principal": {"AWS": "*"}, the catalog is publicly accessible.
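The manual review of the policy can also be sketched in Python. This is a minimal illustration, not an official check: the function names `is_wildcard_principal` and `public_statements` are hypothetical, and the input is assumed to be the policy document as a JSON string (for example, the value of PolicyInJson). The sketch flags every Allow statement with a wildcard principal, including statements that carry a Condition, so those still need a manual look.

```python
import json
from typing import Any, Dict, List


def is_wildcard_principal(principal: Any) -> bool:
    """True for "*" or for a mapping such as {"AWS": "*"} or {"AWS": ["*"]}."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        for value in principal.values():
            values = value if isinstance(value, list) else [value]
            if "*" in values:
                return True
    return False


def public_statements(policy_json: str) -> List[Dict[str, Any]]:
    """Return every Allow statement whose Principal is a wildcard."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow" and is_wildcard_principal(s.get("Principal"))
    ]
```

An empty list means no wildcard principal was found; any returned statement is a candidate for removal or tightening.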
Option 1: Delete the Resource Policy Entirely
If you do not need cross-account access, simply delete the policy:
aws glue delete-resource-policy --region us-east-1
Option 2: Replace with a Restrictive Policy
If you need cross-account access, replace the policy with one that specifies exact account IDs:
aws glue put-resource-policy \
--region us-east-1 \
--policy-in-json '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<TRUSTED-ACCOUNT-ID>:root"
},
"Action": [
"glue:GetDatabase",
"glue:GetDatabases",
"glue:GetTable",
"glue:GetTables",
"glue:GetPartition",
"glue:GetPartitions"
],
"Resource": [
"arn:aws:glue:us-east-1:<YOUR-ACCOUNT-ID>:catalog",
"arn:aws:glue:us-east-1:<YOUR-ACCOUNT-ID>:database/*",
"arn:aws:glue:us-east-1:<YOUR-ACCOUNT-ID>:table/*"
]
}
]
}'
Replace:
- <TRUSTED-ACCOUNT-ID> with the 12-digit ID of the AWS account you want to grant access to
- <YOUR-ACCOUNT-ID> with your own 12-digit AWS account ID
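To avoid hand-editing placeholders, the same read-only policy could be generated with a small helper. The following is a sketch under stated assumptions: `build_read_only_policy` is a hypothetical name, the standard `aws` partition is assumed in the ARNs, and the action list simply mirrors the one in the CLI example above.

```python
import json
import re
from typing import Any, Dict

# Same read-only actions as in the CLI example above.
READ_ONLY_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions",
]


def build_read_only_policy(
    trusted_account_id: str, owner_account_id: str, region: str = "us-east-1"
) -> str:
    """Return a policy JSON string granting one account read-only catalog access."""
    for account_id in (trusted_account_id, owner_account_id):
        # Reject unreplaced placeholders and malformed IDs early.
        if not re.fullmatch(r"[0-9]{12}", account_id):
            raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    base = f"arn:aws:glue:{region}:{owner_account_id}"  # assumes the "aws" partition
    policy: Dict[str, Any] = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": READ_ONLY_ACTIONS,
                "Resource": [f"{base}:catalog", f"{base}:database/*", f"{base}:table/*"],
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

The returned string can be passed to the --policy-in-json option shown above. The validation step exists mainly to catch a placeholder that was never replaced.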
CloudFormation (optional)
Use the AWS::Glue::ResourcePolicy resource to define a secure policy:
AWSTemplateFormatVersion: '2010-09-09'
Description: Secure Glue Data Catalog resource policy
Parameters:
TrustedAccountId:
Type: String
Description: AWS Account ID to grant Glue Data Catalog access
AllowedPattern: '^\d{12}$'
ConstraintDescription: Must be a valid 12-digit AWS account ID
Resources:
GlueDataCatalogPolicy:
Type: AWS::Glue::ResourcePolicy
Properties:
PolicyInJson: !Sub |
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::${TrustedAccountId}:root"
},
"Action": [
"glue:GetDatabase",
"glue:GetDatabases",
"glue:GetTable",
"glue:GetTables",
"glue:GetPartition",
"glue:GetPartitions"
],
"Resource": [
"arn:aws:glue:${AWS::Region}:${AWS::AccountId}:catalog",
"arn:aws:glue:${AWS::Region}:${AWS::AccountId}:database/*",
"arn:aws:glue:${AWS::Region}:${AWS::AccountId}:table/*"
]
}
]
}
Outputs:
CatalogArn:
Description: The ARN of the Glue Data Catalog
Value: !Sub arn:aws:glue:${AWS::Region}:${AWS::AccountId}:catalog
To remove an existing public policy, you can deploy a stack that defines a restrictive policy (as above) to replace it, or manually delete the policy via the console/CLI.
Terraform (optional)
Define a Secure Resource Policy
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = "us-east-1"
}
variable "trusted_account_id" {
description = "AWS Account ID to grant Glue Data Catalog access"
type = string
}
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}
resource "aws_glue_resource_policy" "secure_catalog" {
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
AWS = "arn:aws:iam::${var.trusted_account_id}:root"
}
Action = [
"glue:GetDatabase",
"glue:GetDatabases",
"glue:GetTable",
"glue:GetTables",
"glue:GetPartition",
"glue:GetPartitions"
]
Resource = [
"arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:catalog",
"arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:database/*",
"arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:table/*"
]
}
]
})
}
Remove the Resource Policy Entirely
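If you do not need a resource policy and Terraform does not yet manage the existing one, you can import it and then remove it:
# First, add a matching aws_glue_resource_policy resource block and import the existing policy
# (the AWS provider documentation uses the account ID as the import ID)
# terraform import aws_glue_resource_policy.existing <ACCOUNT-ID>
# Then remove the resource block from your config and run terraform apply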
Alternatively, use the AWS CLI to delete it directly (see CLI section above).
Verification
After making changes, verify the policy is no longer public:
- Go to the AWS Glue Console > Settings
- Check that the resource policy either:
- Does not exist, or
- Contains only specific account ARNs (no * principals)
CLI verification
# Check if a policy exists
aws glue get-resource-policy --region us-east-1
# If no policy exists, you will see an error like:
# "An error occurred (EntityNotFoundException) when calling the GetResourcePolicy operation"
# If a policy exists, review the output and confirm no Principal contains "*"
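The same verification could be scripted across several regions. This is a hedged sketch, not a definitive implementation: `classify`, `fetch_policy`, and `verify_regions` are hypothetical names, and `fetch_policy` assumes boto3 is installed and credentials are configured. The boto3 import is lazy so that the classification logic can be exercised without AWS access.

```python
import json
from typing import Callable, Dict, Iterable, Optional


def has_wildcard_principal(policy_json: str) -> bool:
    """True if any Allow statement uses "*" as (or inside) its Principal."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict):
            for value in principal.values():
                if "*" in (value if isinstance(value, list) else [value]):
                    return True
    return False


def classify(policy_json: Optional[str]) -> str:
    """Map a policy (or its absence) to "no-policy", "public", or "restricted"."""
    if policy_json is None:
        return "no-policy"
    return "public" if has_wildcard_principal(policy_json) else "restricted"


def fetch_policy(region: str) -> Optional[str]:
    """Fetch the catalog policy with boto3; returns None when no policy exists."""
    import boto3  # lazy import: assumed installed, only needed for live checks

    client = boto3.client("glue", region_name=region)
    try:
        return client.get_resource_policy().get("PolicyInJson")
    except client.exceptions.EntityNotFoundException:
        return None


def verify_regions(
    regions: Iterable[str],
    fetch: Callable[[str], Optional[str]] = fetch_policy,
) -> Dict[str, str]:
    """Classify the catalog policy in each region."""
    return {region: classify(fetch(region)) for region in regions}
```

Calling `verify_regions(["us-east-1", "eu-west-1"])` would query both regions. Any region reported as "public" still needs remediation.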
Additional Resources
- AWS Glue Resource-Based Policies
- AWS Glue Cross-Account Access
- AWS Lake Formation for Fine-Grained Access Control
Notes
- Lake Formation alternative: For fine-grained data access control, consider using AWS Lake Formation instead of Glue resource policies. Lake Formation provides table-level and column-level permissions.
- Cross-account sharing: If you need to share data across accounts, always specify explicit account IDs rather than using wildcards.
- Principle of Least Privilege: Grant only the minimum permissions needed. Read-only actions like glue:GetTable are safer than glue:*.
- Regional scope: The Glue Data Catalog is regional. You may need to check and remediate each region where you use Glue.
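The least-privilege note can be illustrated with a rough check for actions that go beyond read access. This is only a heuristic sketch: the function name `broad_actions` is hypothetical, and treating the prefixes Get, BatchGet, Search, and List as read-only is an assumption made for illustration, not an AWS classification.

```python
from typing import Any, Dict, List

# Assumption for illustration: action names with these prefixes are read-only.
READ_PREFIXES = ("Get", "BatchGet", "Search", "List")


def broad_actions(statement: Dict[str, Any]) -> List[str]:
    """Return actions in an Allow statement that are not read-only by prefix."""
    if statement.get("Effect") != "Allow":
        return []
    actions = statement.get("Action", [])
    if isinstance(actions, str):  # a single action may be given as a string
        actions = [actions]
    flagged = []
    for action in actions:
        # "glue:GetTable" -> name "GetTable"; a bare "*" yields an empty name.
        _service, _sep, name = action.partition(":")
        if not name.startswith(READ_PREFIXES):
            flagged.append(action)
    return flagged
```

Under this heuristic, glue:* and glue:DeleteTable are flagged while glue:GetTable is not. A flagged action is a prompt for review, not proof of a problem.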