
Case study guide — where to find each deliverable

This page helps reviewers and readers who have the Ohpen case study (Financial Data Pipeline and Data Lake Optimization) find where each required deliverable is addressed in this documentation.

The case has five tasks plus an appendix. Below, each task is listed with pointers to the relevant docs.


1. Data Ingestion and Transformation

Case asks for: Python script (read from S3, validate, write Parquet partitioned by year/month), plus edge cases and assumptions.

| Deliverable | Where to find it |
| --- | --- |
| ETL flow and implementation | Data ingestion & ETL — ETL flow |
| Edge cases and assumptions | Data ingestion — Assumptions and edge cases |
| Reference ETL code and diagrams | Reference — ETL diagrams, ETL pseudocode |
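
For orientation, the validate-then-partition step the case describes could look roughly like the sketch below. It is not the implementation from the linked docs: the column names in `REQUIRED_FIELDS`, the `YYYY-MM-DD` date format, and the `year=YYYY/month=MM` key layout are assumptions made for illustration, and the S3 read and Parquet write are left out so the logic runs with the standard library alone.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical column names; the case's real schema is defined in its Appendix A.
REQUIRED_FIELDS = ("transaction_id", "account_id", "amount", "transaction_date")


def validate(row):
    """Return an error message for an invalid row, or None if the row passes."""
    for field in REQUIRED_FIELDS:
        if row.get(field) in (None, ""):
            return f"missing {field}"
    try:
        float(row["amount"])
    except (TypeError, ValueError):
        return "amount is not numeric"
    try:
        datetime.strptime(row["transaction_date"], "%Y-%m-%d")
    except (TypeError, ValueError):
        return "transaction_date is not YYYY-MM-DD"
    return None


def partition_rows(rows):
    """Group valid rows by year/month partition key and collect rejected rows."""
    partitions = defaultdict(list)
    rejects = []
    for row in rows:
        error = validate(row)
        if error:
            # Keep the rejected row together with the reason it failed.
            rejects.append({**row, "error": error})
            continue
        date = datetime.strptime(row["transaction_date"], "%Y-%m-%d")
        key = f"year={date.year}/month={date.month:02d}"
        partitions[key].append(row)
    return dict(partitions), rejects
```

In a full pipeline each partition key would become a path prefix for the Parquet output, and the rejects list would feed the error count that the stakeholder email in task 5 reports.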

2. Data Lake Architecture Design

Case asks for: Folder structure for raw, processed, and aggregated data; strategy for schema evolution.

| Deliverable | Where to find it |
| --- | --- |
| Folder structure and architecture | Data lake architecture — Architecture |
| Schema evolution and assumptions | Data lake architecture — Assumptions and edge cases |
| Reference architecture and governance | Reference — Data lake architecture, Governance diagrams |
| ADRs (format, partitioning, etc.) | ADR 001 — Parquet format and remaining ADRs in sidebar under Architecture decisions (ADR) |

3. SQL

Case asks for: SQL query for account balance history at end of each month, per account, for the first three months of 2024 (example output given in the case). Appendix A defines the transactions table.

| Deliverable | Where to find it |
| --- | --- |
| SQL breakdown and implementation | Tasks — SQL — SQL breakdown, SQL implementation code |
| Assumptions and testing | Tasks — SQL — Assumptions and edge cases, Isolated testing |
| Reference SQL (complete and examples) | Reference — SQL, SQL examples |
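
As a rough illustration of what a month-end balance query involves, the sketch below runs one against SQLite from Python. It is not the query from the linked SQL docs. The table layout `transactions(account_id, amount, tx_date)` with ISO `YYYY-MM-DD` date strings is an assumption, as is the definition of balance as the running sum of `amount` up to each month end; the case's Appendix A and example output are the authority on both.

```python
import sqlite3

# Assumed schema: transactions(account_id TEXT, amount REAL, tx_date TEXT as YYYY-MM-DD).
QUERY = """
WITH months(month_end) AS (
    VALUES ('2024-01-31'), ('2024-02-29'), ('2024-03-31')
),
accounts AS (
    SELECT DISTINCT account_id FROM transactions
)
SELECT a.account_id,
       m.month_end,
       COALESCE(SUM(t.amount), 0) AS balance
FROM accounts AS a
CROSS JOIN months AS m
LEFT JOIN transactions AS t
       ON t.account_id = a.account_id
      AND t.tx_date <= m.month_end
GROUP BY a.account_id, m.month_end
ORDER BY a.account_id, m.month_end
"""


def month_end_balances(conn):
    """Return (account_id, month_end, balance) rows for January to March 2024."""
    return conn.execute(QUERY).fetchall()
```

The cross join of accounts and month ends yields a row for every account in every month, including months with no activity, where the balance carries over from earlier transactions (or is 0 if there are none yet). Whether the case expects that behaviour is an assumption worth checking against its example output.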

4. DevOps Integration (CI/CD for Data Pipelines)

Case asks for: CI/CD workflow for the ETL pipeline (automated testing, automated infrastructure); list of artifacts.

| Deliverable | Where to find it |
| --- | --- |
| CI/CD workflow (main deliverable) | CI/CD Workflow — design, failure scenarios, orchestration, governance |
| Deployment and infrastructure | Tasks — DevOps & CI/CD — Deployment summary, Lambda implementation |
| Security and artifacts | Business case security, Terraform backend, CI/CD complete (artifacts) |

5. Communication and Documentation

Case asks for: Short email to non-technical stakeholders (metrics: records processed, errors); one-page technical document for the team.

| Deliverable | Where to find it |
| --- | --- |
| Example stakeholder email | Tasks — Communication — Stakeholder email, Stakeholder update (business) |
| Technical summary for the team | Tasks — Communication — Technical reference |
| High-level overview for reviewers | Executive summary (sidebar) |

Appendix A — Transaction table and sample data

The case defines a transactions table and sample rows used for the SQL task (e.g. balance history). The schema and logic are reflected in the SQL docs listed under task 3 above.


Evaluation criteria (case study)

The case states it evaluates: Technical skills (Python, SQL, cloud), Problem-solving (edge cases, scalability, performance), Clarity (explanations, documentation), Practicality (implementable solutions). The docs under Tasks, Reference, ADR, and Runbooks provide the evidence for these criteria.

© 2026 Stephen Adei. CC BY 4.0