End-to-End System Architecture Overview

© 2026 Stephen Adei. All rights reserved. All content on this site is the intellectual property of Stephen Adei. See License for terms of use and attribution.

Scope & Assumptions

This solution is an OLAP analytics / data lake platform that delivers the assigned scope: CSV in S3 → validation → Parquet → Athena. Ohpen's OLTP core banking systems are upstream and out of scope; they are treated as systems of record. Data correctness, ACID guarantees, and transactional integrity are assumed at the source. This platform focuses on validation, auditability, analytics performance, and governance. Scope is limited to batch CSV-in-S3; CDC and event-driven ingestion are out of scope for this case.

This document provides a comprehensive view of the case study data lake platform (OLAP analytics layer), showing all components, data flows, integrations, and operational processes. The architecture is designed with AWS-first and managed-service principles in mind.

Complete System Architecture

Component Interactions

For runtime scenarios showing these interactions, see Runtime Scenarios. For ETL component interactions, see ETL Flow - Component Interaction.

1. Data Ingestion Flow

  1. Source: CSV files containing financial transaction data
  2. Ingestion: Files uploaded to Bronze S3 bucket (immutable, append-only)
  3. Trigger: EventBridge detects new objects, or a scheduled rule fires (daily at 02:00 UTC)
  4. Orchestration: Step Functions state machine starts ETL workflow
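The trigger-to-orchestration handoff above can be sketched as a small Lambda handler. This is an illustrative assumption, not the platform's actual code: the event shape follows EventBridge's S3 "Object Created" format, and the execution-name rule (alphanumeric, `-`, `_`, max 80 characters) follows Step Functions' naming constraints. The state machine ARN and the `start_execution` call are shown only as a comment.

```python
import re

def parse_s3_event(event: dict) -> tuple[str, str]:
    """Extract (bucket, key) from an EventBridge S3 object-created event."""
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]

def execution_name(key: str, event_id: str) -> str:
    """Derive a Step Functions-safe execution name (<= 80 chars) from the object key."""
    safe = re.sub(r"[^A-Za-z0-9_-]", "-", key)
    return f"{safe[:43]}-{event_id[:36]}"[:80]

def handler(event, context=None):
    bucket, key = parse_s3_event(event)
    name = execution_name(key, event["id"])
    # In the real Lambda, boto3 would start the ETL state machine here, e.g.:
    # boto3.client("stepfunctions").start_execution(
    #     stateMachineArn=STATE_MACHINE_ARN, name=name,
    #     input=json.dumps({"bucket": bucket, "key": key}))
    return {"bucket": bucket, "key": key, "execution": name}
```

Deriving the execution name from the object key plus the event id makes retriggers of the same file distinguishable in the Step Functions console.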

2. ETL Processing Flow

  1. Metadata Enrichment: Each row enriched with row_hash, source_file_id, attempt_count
  2. Loop Prevention: Duplicate detection, attempt limit enforcement, circuit breaker
  3. Validation: Schema validation, domain validation, business rules
  4. Transformation: Timestamp parsing, partitioning (year/month), data type conversion
  5. Storage: Valid data → Silver/Gold, Invalid data → Quarantine/Condemned
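The enrichment step (item 1) can be sketched as follows. The field names `row_hash`, `source_file_id`, and `attempt_count` come from the flow above; the canonicalisation scheme (sorted-key JSON hashed with SHA-256) is an assumption chosen so that the hash is independent of column order.

```python
import hashlib
import json

def row_hash(row: dict) -> str:
    """Deterministic hash over the row's fields, independent of key order."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def enrich(row: dict, source_file_id: str, attempt_count: int = 0) -> dict:
    """Attach the lineage metadata used for duplicate detection and retries."""
    return {**row,
            "row_hash": row_hash(row),
            "source_file_id": source_file_id,
            "attempt_count": attempt_count}
```

Because the hash is computed before enrichment fields are added, re-ingesting the same source row always yields the same `row_hash`, which is what makes duplicate detection (item 2) possible.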

3. Data Promotion Flow

  1. Silver Promotion: Lambda function updates _LATEST.json and current/ prefix
  2. Catalog Registration: Glue Data Catalog registers new partitions
  3. Query Availability: Data becomes available for Athena queries
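A minimal sketch of the promotion Lambda's pointer update, assuming `_LATEST.json` stores the run id and the partitioned prefix it points at (the exact payload shape is an assumption):

```python
import json
from datetime import datetime, timezone

def latest_pointer(run_id: str, silver_prefix: str) -> str:
    """Serialise the _LATEST.json payload written after a successful run."""
    payload = {
        "run_id": run_id,
        "prefix": silver_prefix,  # e.g. a run-scoped prefix under silver/
        "promoted_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(payload, indent=2)

# The real Lambda would then copy the run's objects under current/ and
# put this payload at the bucket root as _LATEST.json via boto3.
```

Readers that resolve `_LATEST.json` first always see a complete run, since the pointer is only rewritten after promotion finishes.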

4. Query & Consumption Flow

  1. SQL Queries: Analysts query via Athena workgroup
  2. Partition Pruning: Athena scans only the relevant partitions (≈95% cost reduction)
  3. BI Integration: BI tools consume aggregated data from Gold layer
  4. Reports: Finance team generates monthly reports
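Partition pruning (item 2) works because the `year`/`month` partition columns appear in the `WHERE` clause, so Athena skips every other partition's objects. A hedged sketch of building such a query; the table and column names are assumptions for illustration:

```python
def monthly_query(table: str, year: int, month: int) -> str:
    """Build an Athena query restricted to a single year/month partition."""
    return (
        f"SELECT account_id, SUM(amount) AS total\n"
        f"FROM {table}\n"
        f"WHERE year = {year} AND month = {month}\n"  # partition columns -> pruning
        f"GROUP BY account_id"
    )
```

Filtering on a non-partition column (e.g. a raw timestamp) would instead force a full scan, which is why the transformation step materialises `year`/`month` explicitly.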

5. Monitoring & Alerting Flow

  1. Metrics Collection: CloudWatch collects metrics from all services
  2. Alarm Evaluation: CloudWatch alarms evaluate thresholds
  3. Failure Detection: Step Functions failures trigger EventBridge rules
  4. Notification: SNS topics publish alerts, SQS queues decouple consumers
  5. Platform Team: Receives alerts via email/Slack, processes via Lambda consumers
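Because SNS fans out to SQS (items 4-5), the Lambda consumer has to unwrap two layers: the SQS record body is an SNS envelope, whose `Message` is the serialized EventBridge failure event. A sketch of that unwrapping; the payload shape assumed here follows the standard SNS-to-SQS delivery format:

```python
import json

def parse_alert(sqs_record: dict) -> dict:
    """Unwrap an SNS-over-SQS record into the Step Functions failure details."""
    envelope = json.loads(sqs_record["body"])   # SNS envelope
    event = json.loads(envelope["Message"])     # EventBridge event
    detail = event["detail"]
    return {"execution": detail["executionArn"], "status": detail["status"]}
```

Enabling SNS "raw message delivery" would remove the outer envelope, at the cost of losing the SNS metadata fields.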

Observability Design: For detailed patterns on run identity propagation, metric dimensioning, and cross-service correlation, see Traceability Design.

6. Security & Compliance Flow

  1. Encryption: SSE-S3 for standard buckets, SSE-KMS for sensitive buckets
  2. Access Control: IAM roles enforce least-privilege access
  3. Audit Logging: CloudTrail logs all API calls (management + selective data events)
  4. Key Management: KMS CMK with automatic rotation for sensitive data
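The encryption rule in item 1 can be expressed as a small helper that emits the configuration an S3 `put_bucket_encryption` call expects. This is an illustrative sketch; the KMS key alias is a placeholder, not the platform's real key:

```python
def encryption_config(sensitive: bool) -> dict:
    """Return the S3 default-encryption rule for a bucket class:
    SSE-KMS for sensitive buckets, SSE-S3 (AES256) otherwise."""
    if sensitive:
        sse = {"SSEAlgorithm": "aws:kms",
               "KMSMasterKeyID": "alias/datalake-sensitive"}  # placeholder alias
    else:
        sse = {"SSEAlgorithm": "AES256"}
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": sse}]}
```

Keeping the rule in one function makes the sensitive/standard split auditable in code review rather than scattered across bucket definitions.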

7. CI/CD Deployment Flow

  1. Code Changes: Developers commit code to GitHub
  2. CI Validation: Pull requests trigger tests, linting, Terraform plan
  3. CD Deployment: Merge to main triggers OIDC authentication, artifact build
  4. Manual Approval: GitHub environment gate requires manual approval
  5. Infrastructure Update: Terraform apply updates AWS resources
  6. Artifact Deployment: ETL scripts uploaded to S3 artifacts bucket

Data Flow Summary

For detailed ETL data flow, see ETL Flow. For SQL query flow, see SQL Breakdown.

Primary Flow (Success Path):

CSV Files → Bronze → EventBridge → Step Functions → Glue ETL → Silver → Lambda Promotion → Gold → Glue Catalog → Athena → Analysts

Error Handling Flow:

Glue ETL → Validation Errors → Quarantine (quarantine/) → (Retry) → Silver OR (Max Attempts) → Quarantine (quarantine/condemned/)
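The error-handling flow above reduces to a routing rule: clean records go to Silver, records with validation errors go to quarantine, and records that exhaust their retries are condemned. A sketch of that rule; the `MAX_ATTEMPTS` value is an assumption:

```python
MAX_ATTEMPTS = 3  # assumed retry limit, matching the attempt_count metadata

def route(validation_errors: list, attempt_count: int) -> str:
    """Decide where a record lands after an ETL attempt."""
    if not validation_errors:
        return "silver"
    if attempt_count >= MAX_ATTEMPTS:
        return "quarantine/condemned"  # no further retries; manual review only
    return "quarantine"                # eligible for automatic retry
```

The attempt limit is one leg of the loop-prevention design: even a record that fails validation identically on every run can only cycle through the retry path a bounded number of times.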

Failure Notification Flow:

Step Functions FAILED → EventBridge → SNS → SQS → Platform Team (Email/Slack/Lambda)

Monitoring Flow:

All Services → CloudWatch Metrics/Logs → Alarms → SNS → Platform Team

Key Architectural Principles

For complete architectural design, see Data Lake Architecture. For design rationale, see Design Decisions Summary.

  1. Medallion Architecture: Bronze (raw) → Silver (validated) → Gold (curated)
  2. Immutable Audit Trail: Bronze layer is append-only, all runs preserved
  3. Safe Publishing: Write-then-publish pattern with _SUCCESS markers
  4. Loop Prevention: Duplicate detection, attempt limits, circuit breaker
  5. Cost Optimization: Partition pruning, lifecycle policies, serverless model
  6. Security First: Encryption at rest, least-privilege IAM, comprehensive audit logging
  7. Observability: Comprehensive monitoring, alerting, and audit trails
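Principle 3 (safe publishing) hinges on write ordering: data objects are written first and the `_SUCCESS` marker last, so a reader that gates on the marker never observes a partially written partition. A minimal sketch of that ordering rule:

```python
def publish_plan(data_keys: list[str]) -> list[str]:
    """Order S3 writes so the _SUCCESS marker is the final object written."""
    return sorted(data_keys) + ["_SUCCESS"]
```

Consumers (including the promotion Lambda) then treat the presence of `_SUCCESS` under a prefix as the signal that the prefix is complete and safe to read.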

© 2026 Stephen Adei · CC BY 4.0