© 2026 Stephen Adei. All rights reserved. All content on this site is the intellectual property of Stephen Adei. See License for terms of use and attribution.
CloudTrail, SNS & SQS — Considerations
CloudTrail
- Management events: Enabled for all read/write (who changed Glue, S3, IAM, Step Functions). Sufficient for most audit/compliance; low cost.
- Data events: Selectively enabled for high-risk buckets (ohpen-gold and ohpen-quarantine) to track S3 object-level reads/writes. This provides comprehensive auditability for sensitive financial data while managing cost by limiting to critical buckets.
- Log delivery: Dedicated S3 bucket (ohpen-cloudtrail-logs); lifecycle/retention per policy. Log file validation enabled for integrity.
CloudTrail Selective Data Events Configuration:
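A minimal Terraform sketch of the selective data events setup described above. The resource addresses aws_cloudtrail.main and aws_s3_bucket.cloudtrail_logs and the three bucket names come from this page; the trail name and the multi-region flag are assumptions made for illustration, not the project's confirmed configuration.

```hcl
resource "aws_s3_bucket" "cloudtrail_logs" {
  bucket = "ohpen-cloudtrail-logs"
}

resource "aws_cloudtrail" "main" {
  name                       = "ohpen-audit-trail" # hypothetical trail name
  s3_bucket_name             = aws_s3_bucket.cloudtrail_logs.id
  enable_log_file_validation = true # digest files for integrity checks
  is_multi_region_trail      = true # assumption; not stated on this page

  event_selector {
    read_write_type           = "All"
    include_management_events = true

    # Object-level data events only for the two high-risk buckets.
    # The trailing slash scopes the selector to every object in the bucket.
    data_resource {
      type = "AWS::S3::Object"
      values = [
        "arn:aws:s3:::ohpen-gold/",
        "arn:aws:s3:::ohpen-quarantine/",
      ]
    }
  }
}
```

The sketch leaves out the bucket policy that lets cloudtrail.amazonaws.com call s3:GetBucketAcl and s3:PutObject on the log bucket; without it the trail cannot deliver logs.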
Who gets alerts (analysts vs platform)
- Analysts consume data (Silver/Gold) via Athena and dashboards. They do not need run-level alerts (no email/Slack on every success or failure).
- Platform / Data Quality need alerts:
  - ETL failures → SNS ohpen-etl-failures → Platform team (add email/Slack subscription in Console or Terraform) and/or SQS for Lambda (ticketing, logging).
  - Quarantine spikes → SNS ohpen-quarantine-alerts → Data Quality team (wire when post-ETL check or Lambda publishes to this topic).
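As a rough illustration of the Terraform option for human subscribers, the fragment below attaches one email subscription to each topic. It assumes the topics aws_sns_topic.etl_failures and aws_sns_topic.quarantine_alerts are defined elsewhere in the same configuration; the resource labels and email addresses are hypothetical placeholders.

```hcl
# Hypothetical addresses; replace with the real team distribution lists.
resource "aws_sns_topic_subscription" "platform_email" {
  topic_arn = aws_sns_topic.etl_failures.arn
  protocol  = "email"
  endpoint  = "platform-team@example.com"
}

resource "aws_sns_topic_subscription" "data_quality_email" {
  topic_arn = aws_sns_topic.quarantine_alerts.arn
  protocol  = "email"
  endpoint  = "data-quality@example.com"
}
```

Email subscriptions stay pending until the recipient confirms them from the confirmation message SNS sends, so a successful apply does not by itself mean alerts are being delivered.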
SNS/SQS usage
- SNS: One topic per signal (failures, quarantine). Add email or Slack subscriptions for humans; SQS subscription for automated consumers (Lambda).
- SQS: Used for decoupling and retries. The ohpen-etl-failures queue receives failure events from SNS; Lambda can consume and create tickets or post to Slack. DLQ (ohpen-etl-failures-dlq) holds poison messages after 3 receives for inspection.
- Success: No SNS for successful runs; analysts use Athena/dashboards. Optional: future Lambda could write run metadata to DynamoDB without alerting analysts.
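The SNS-to-SQS fan-out with a dead-letter queue could look roughly like the sketch below. Queue names and the receive count of 3 are taken from this page; the subscription, policy document, and queue policy labels are hypothetical, and aws_sns_topic.etl_failures is assumed to be defined elsewhere.

```hcl
resource "aws_sqs_queue" "etl_failures_dlq" {
  name = "ohpen-etl-failures-dlq"
}

resource "aws_sqs_queue" "etl_failures" {
  name = "ohpen-etl-failures"

  # A message that has been received 3 times without being deleted
  # is moved to the DLQ for inspection.
  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.etl_failures_dlq.arn
    maxReceiveCount     = 3
  })
}

resource "aws_sns_topic_subscription" "etl_failures_to_sqs" {
  topic_arn = aws_sns_topic.etl_failures.arn
  protocol  = "sqs"
  endpoint  = aws_sqs_queue.etl_failures.arn
}

# SNS can only deliver if the queue policy allows it to send messages.
data "aws_iam_policy_document" "sns_to_sqs" {
  statement {
    actions   = ["sqs:SendMessage"]
    resources = [aws_sqs_queue.etl_failures.arn]

    principals {
      type        = "Service"
      identifiers = ["sns.amazonaws.com"]
    }

    condition {
      test     = "ArnEquals"
      variable = "aws:SourceArn"
      values   = [aws_sns_topic.etl_failures.arn]
    }
  }
}

resource "aws_sqs_queue_policy" "etl_failures" {
  queue_url = aws_sqs_queue.etl_failures.id
  policy    = data.aws_iam_policy_document.sns_to_sqs.json
}
```

Keeping the queue between SNS and Lambda means a failing consumer retries from the queue rather than dropping the alert, and repeated failures end up in the DLQ instead of looping indefinitely.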
Terraform
- CloudTrail: aws_cloudtrail.main, aws_s3_bucket.cloudtrail_logs (see CI/CD Workflow infrastructure section).
- SNS: aws_sns_topic.etl_failures, aws_sns_topic.quarantine_alerts (see Tooling & Controls).
- SQS: aws_sqs_queue.etl_failures, aws_sqs_queue.etl_failures_dlq; subscription from SNS to SQS (see System Architecture Overview for event flow).
- EventBridge: Rule ohpen-step-functions-failed (Step Functions execution FAILED → SNS); no change to the Step Function state machine (see CI/CD Workflow for orchestration details).
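For the EventBridge rule, a sketch along these lines would route failed executions to the failures topic without touching the state machine. The rule name comes from this page; the Terraform resource labels are hypothetical, and the pattern as written matches every state machine in the account, which may be broader than the real rule.

```hcl
resource "aws_cloudwatch_event_rule" "step_functions_failed" {
  name = "ohpen-step-functions-failed"

  # Matches the status-change event Step Functions emits when an
  # execution ends in FAILED.
  event_pattern = jsonencode({
    source        = ["aws.states"]
    "detail-type" = ["Step Functions Execution Status Change"]
    detail = {
      status = ["FAILED"]
    }
  })
}

resource "aws_cloudwatch_event_target" "failed_to_sns" {
  rule = aws_cloudwatch_event_rule.step_functions_failed.name
  arn  = aws_sns_topic.etl_failures.arn
}
```

Two things are left out: an SNS topic policy allowing events.amazonaws.com to call sns:Publish, and an optional stateMachineArn filter under detail to limit the rule to the ETL state machine.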
See also
- Tooling & Controls - Complete AWS services analysis including CloudTrail, SNS, SQS, KMS
- Traceability Design - Run identity and execution ARN propagation through SNS/SQS
- CI/CD Workflow - Step Functions orchestration and EventBridge triggers
- ETL Flow - CloudWatch metrics publishing and error handling
- Runtime Scenarios - Failure recovery and alerting scenarios