ADR-003: Serverless Architecture (Glue/Athena)
© 2026 Stephen Adei. All rights reserved. All content on this site is the intellectual property of Stephen Adei. See License for terms of use and attribution.
Status
Accepted
Context
The system needs to process batch analytical workloads (OLAP) with monthly reporting requirements. The following options were considered:
- Serverless (Glue/Athena) (chosen)
- Always-on (EMR clusters/Redshift) (rejected)
- Hybrid (EMR on-demand) (rejected)
The workload is batch-oriented, not real-time. Month-end reports can tolerate 30-second query times.
Decision
Use serverless architecture with AWS Glue (ETL) and Athena (analytics) instead of always-on compute infrastructure.
Rationale
- Batch workload: OLAP workload does not require sub-second latency
- Cost efficiency: Usage-based pricing (Athena bills per data scanned, Glue per DPU-hour) matches the monthly usage pattern ($27/month baseline vs $500+/month for always-on)
- No idle costs: Scale to zero when not in use
- Auto-scaling: AWS provisions capacity for each job and query automatically
- Operational simplicity: No cluster management, patching, or capacity planning
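The cost argument above can be sketched as a small model. This is a minimal illustration, not the calculation behind the $27 and $500+ figures: the unit prices are assumed list prices that vary by region and over time, the usage numbers are hypothetical inputs, and S3 storage and request charges are left out.

```python
from dataclasses import dataclass

# Assumed unit prices (illustrative; check current AWS pricing for your region).
ATHENA_USD_PER_TB_SCANNED = 5.00
GLUE_USD_PER_DPU_HOUR = 0.44
HOURS_PER_MONTH = 730


@dataclass(frozen=True)
class MonthlyUsage:
    """Hypothetical monthly usage, not measurements from this system."""
    tb_scanned: float       # data scanned by Athena queries, in TB
    glue_dpu_hours: float   # DPUs multiplied by hours, summed over Glue job runs


def serverless_cost(usage: MonthlyUsage) -> float:
    """Usage-based compute cost: zero usage means zero charge."""
    athena = usage.tb_scanned * ATHENA_USD_PER_TB_SCANNED
    glue = usage.glue_dpu_hours * GLUE_USD_PER_DPU_HOUR
    return round(athena + glue, 2)


def always_on_cost(hourly_rate_usd: float) -> float:
    """Always-on compute cost: every hour is billed, busy or idle."""
    return round(hourly_rate_usd * HOURS_PER_MONTH, 2)
```

Calling `serverless_cost(MonthlyUsage(tb_scanned=0, glue_dpu_hours=0))` returns `0.0`, which is the "no idle costs" property, while `always_on_cost` grows with wall-clock hours regardless of how much work is done.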
Consequences
Positive:
- Cost efficiency: $27/month baseline vs $500+/month for always-on infrastructure
- No idle costs: Pay only for actual usage
- Auto-scaling: AWS handles capacity automatically
- Operational simplicity: No cluster management required
- Sustainability: Scale to zero reduces power consumption
Negative:
- Sub-second latency not achievable (acceptable for batch workload)
- Cold start delays: First query may take 5-10 seconds
- Less control over compute resources: AWS manages capacity
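One way to keep the latency trade-off visible is to run report queries against an explicit time budget. The sketch below assumes a boto3 Athena client is passed in as `athena`; the function name, the 30-second default (taken from the tolerance stated in the Context section), and the polling interval are illustrative choices, not part of the project's code.

```python
import time


def run_athena_query(athena, sql, database, output_location,
                     timeout_s=30.0, poll_s=1.0):
    """Start a query, poll until it finishes, and return (state, bytes_scanned).

    `athena` is a boto3 Athena client, or any object exposing the same
    start_query_execution / get_query_execution methods.
    Raises TimeoutError if the query is still running after `timeout_s`.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    deadline = time.monotonic() + timeout_s

    while True:
        response = athena.get_query_execution(QueryExecutionId=query_id)
        execution = response["QueryExecution"]
        state = execution["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            # Bytes scanned drive Athena cost, so return them with the state.
            scanned = execution.get("Statistics", {}).get("DataScannedInBytes", 0)
            return state, scanned
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query {query_id} exceeded {timeout_s}s budget")
        time.sleep(poll_s)
```

In practice the client would come from `boto3.client("athena")`. The timeout only stops the caller from waiting; it does not cancel the query on the Athena side.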
Alternatives Considered
EMR Clusters (Always-on)
- Why rejected: High cost ($500+/month), idle costs when not processing, operational overhead (cluster management, patching).
EMR On-Demand (Hybrid)
- Why rejected: Still requires cluster management, and costs more than serverless for a batch workload.
Redshift
- Why rejected: Overkill for a batch OLAP workload, high cost, and requires loading data from S3.
Related Decisions
- Design Decisions Summary - Complete trade-off analysis for this decision
- ADR-002: Year/Month Partitioning - Partition pruning optimizes Athena queries
- ADR-005: Dual Pandas + PySpark - PySpark runs on Glue serverless
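To illustrate how the partitioning decision in ADR-002 interacts with Athena, the sketch below builds a month-end query that filters on the partition columns so only one partition is scanned. The column names `year` and `month`, their integer type, and the table name in the usage note are assumptions; ADR-002 defines the actual layout.

```python
def month_end_query(table: str, year: int, month: int) -> str:
    """Build a row-count query restricted to a single year/month partition.

    Filtering on the partition columns lets Athena skip every other
    partition, which reduces both scan time and bytes billed.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be 1-12, got {month}")
    # Allow only simple identifiers, since the table name is interpolated.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        f"SELECT COUNT(*) AS row_count FROM {table} "
        f"WHERE year = {int(year)} AND month = {int(month)}"
    )
```

For example, `month_end_query("transactions", 2026, 1)` produces a query whose `WHERE` clause touches only the January 2026 partition of a hypothetical `transactions` table.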
Implementation Evidence
- Code: ETL jobs run on AWS Glue (serverless Spark)
- Documentation: Tooling & Controls - Glue - Service selection rationale
- Architecture: Design Decisions Summary - Trade-off analysis