Submission Handout — Ohpen Case 2026
Purpose: Reviewer-facing summary for external submission. Use this for the cover email or as the "what to send" brief.
A) Submission Entry Points
| What | Where |
|---|---|
| Docusaurus URL | https://ohpen.stephenadei.nl (production) or local build: /docs/ as docs root |
| Landing | / (site home) → Documentation → /docs/ (doc home) or Overview → Executive Summary |
| Key pages to read first | 1) Documentation Home — "Quick Start for Reviewers"; 2) Executive Summary; 3) System Architecture Overview; 4) Presentation Handout (Interview tab) |
| Code entry points | ETL: tasks/data_ingestion_transformation/src/etl/ingest_transactions.py (and ingest_transactions_spark.py); SQL: tasks/sql/balance_history_2024_q1.sql, tasks/sql/schema.sql; CI/CD: tasks/devops_cicd/.github/workflows/, Terraform in task 04; static copy in site: docs-site/docusaurus/static/scripts/etl/, static/scripts/sql/ |
| Repo | Clone repo; run docs-site/docusaurus/scripts/sync-docs-symlinks.sh then build Docusaurus for local docs |
B) Deliverables Coverage Checklist (Tasks 1–5)
| Task | Deliverable | Answer page / entry | In nav? | Notes |
|---|---|---|---|---|
| §1 Data Ingestion & Transformation | ETL flow + code | ETL Flow, ETL Code | Implementation: ETL Pipeline (§1) | Clear. |
| §2 Data Lake Architecture | Architecture design | Data Lake Architecture | Architecture tab + Implementation | Clear. |
| §3 SQL Analytics | Query + breakdown | SQL Breakdown, SQL Code | Implementation: SQL Analytics (§3) | Some internal links use /docs/tasks/sql/SQL_IMPLEMENTATION_CODE (see C). |
| §4 DevOps & CI/CD | Workflow + artifacts | CI/CD Workflow, CI/CD Artifacts | Implementation: CI/CD (§4) | Clear. |
| §5 Communication & Documentation | Stakeholder comms + one-pager | Task 5 card on home links to /docs/tasks/communication_documentation/README | ⚠ README excluded from build → link 404s | No Task 5 in sidebar; card target missing. Use STAKEHOLDER_EMAIL or add redirect. |
Appendices: Under Reference tab (ETL Deep Dive, SQL Deep Dive, Infrastructure Deep Dive, Architecture Reference). Optional depth; not required for main review.
C) Packaging Issues (Minimal Fixes)
- Task 5 entry 404: `**/tasks/**/README*` in `docusaurus.config.js` excludes `tasks/communication_documentation/README.md`. Homepage DeliverablesShowcase and HomepageFeatures link to `/docs/tasks/communication_documentation/README`.
  Fix: Point those links to `/docs/tasks/communication_documentation/STAKEHOLDER_EMAIL` (or add redirect README → STAKEHOLDER_EMAIL).
- SQL “implementation code” links: Several docs (HANDOUT, ALL_SCRIPTS, SQL_BREAKDOWN, ASSUMPTIONS_AND_EDGE_CASES, EXECUTIVE_SUMMARY in one place) link to `/docs/tasks/sql/SQL_IMPLEMENTATION_CODE`. Sidebar and canonical URL use `SCRIPTS` (`tasks/sql/SCRIPTS`).
  Fix: Add redirect: `/docs/tasks/sql/SQL_IMPLEMENTATION_CODE` → `/docs/tasks/sql/SCRIPTS`.
- HANDOUT.md Task 5 links: Link to `.../communication_documentation/README` (excluded) and `.../TECHNICAL_REFERENCE` (file exists; confirm doc id / build). Update README link to STAKEHOLDER_EMAIL (or same redirect as above).
- Landing “where to start”: React home (`/`) has tagline + “Documentation” / “Overview” but no single paragraph “Start here: read X then Y.” The doc home (`/docs/`) has “Quick Start for Reviewers” and “If you have 5 minutes”; that is the effective “where to start.” Optional: add one sentence on `/`, e.g. “Start with Executive Summary (Overview) then use the tabs for deep dives.”
- ADR index: Sidebar uses `adr/adr-index`; `docs/adr/README.md` has frontmatter `id: adr-index`. With path `adr/README`, doc id may be `adr/README` or `adr/adr-index` depending on theme; confirm build. If 404, add redirect or align sidebar id with actual doc id.
Internal / dev: docs/development/**, docs/guides/**, and various audit/pattern files are excluded in config. ADRs are under Reference and are reviewer-appropriate; not overwhelming.
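The redirect fixes in this section can be sanity-checked before submission. The sketch below is a minimal illustration, not part of the repository: `resolve`, `BUILT_PAGES`, and `REDIRECTS` are hypothetical names, and the set of built pages is an assumed subset typed in by hand (a real check would read it from the build output). It follows redirects from a link until it reaches a built page, and reports `None` when the link would 404.

```python
# Assumed subset of pages that exist in the built site (illustrative only).
BUILT_PAGES = {
    "/docs/tasks/sql/SCRIPTS",
    "/docs/tasks/communication_documentation/STAKEHOLDER_EMAIL",
}

# Redirects proposed in this section (old path -> new path).
REDIRECTS = {
    "/docs/tasks/sql/SQL_IMPLEMENTATION_CODE": "/docs/tasks/sql/SCRIPTS",
    "/docs/tasks/communication_documentation/README":
        "/docs/tasks/communication_documentation/STAKEHOLDER_EMAIL",
}


def resolve(link, built_pages, redirects, max_hops=5):
    """Return the page a link finally lands on, or None if it would 404."""
    seen = set()
    current = link
    while current not in built_pages:
        # Stop on a redirect loop, a missing redirect, or too many hops.
        if current in seen or current not in redirects or len(seen) >= max_hops:
            return None
        seen.add(current)
        current = redirects[current]
    return current


if __name__ == "__main__":
    for link in list(REDIRECTS) + ["/docs/adr/README"]:
        target = resolve(link, BUILT_PAGES, REDIRECTS)
        print(f"{link} -> {target or '404'}")
```

Any link that prints `404` needs either a redirect or an updated target, which is the same decision described for the ADR index above.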
D) Final Handout Outline (for Cover Email or One-Pager)
What to read (5 bullets, order matters)
- Documentation Home (`/docs/`): Quick Start for Reviewers and “If you have 5 minutes” tell you where to start.
- Executive Summary: Business case, solution overview, and links to all deliverables.
- System Architecture Overview: End-to-end design and component roles.
- Implementation tab: ETL Flow (§1), Data Lake Architecture (§2), SQL Breakdown (§3), CI/CD Workflow (§4) as the main “answer” pages per task.
- Interview tab → Presentation Handout: Short narrative, pipeline diagram, and “what the system does” for interviews.
What the system does + key metrics (5 bullets)
- ETL: CSV in S3 → validation (schema, currency, timestamps) → Parquet in Silver (year/month partitions, run_id); invalid rows to Quarantine; loop prevention and circuit breaker.
- Data lake: Medallion (Bronze → Silver → Gold) + Quarantine/Condemned; schema evolution additive-only; governance and ownership boundaries.
- Analytics: Month-end balance history (e.g. Q1 2024) via partition pruning and window functions; 95% scan reduction for time-bounded queries; target <30s for 100M-row scale.
- CI/CD: GitHub Actions (lint, test, build); Terraform (S3, Glue, Step Functions, EventBridge, IAM); OIDC keyless auth; two-role separation; optional manual approval; automated rollback.
- Observability: Run identity (e.g. Step Functions ARN), CloudWatch metrics, CloudTrail audit; QUALITY_REQUIREMENTS and ADRs document evidence (partition pruning, lifecycle, serverless).
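The ETL bullet above (validate, write valid rows to Silver by year/month and run_id, send invalid rows to Quarantine) can be illustrated with a small sketch. This is not the code in `ingest_transactions.py`: the column names, the currency whitelist, the `year=/month=/run_id=` key layout, and the function names are all assumptions made for illustration, and the sketch groups rows in memory instead of writing Parquet to S3.

```python
import csv
import io
from datetime import datetime

# Assumed schema and currency whitelist, for illustration only.
REQUIRED_COLUMNS = ("transaction_id", "account_id", "amount", "currency", "timestamp")
ALLOWED_CURRENCIES = {"EUR", "USD", "GBP"}


def validate_row(row):
    """Return a list of validation errors; an empty list means the row is valid."""
    errors = []
    missing = [c for c in REQUIRED_COLUMNS if not (row.get(c) or "").strip()]
    if missing:
        errors.append("missing: " + ",".join(missing))
    if row.get("currency") and row["currency"] not in ALLOWED_CURRENCIES:
        errors.append("invalid currency")
    try:
        datetime.fromisoformat(row.get("timestamp") or "")
    except ValueError:
        errors.append("invalid timestamp")
    return errors


def route_rows(csv_text, run_id):
    """Split rows into Silver partitions (keyed by year/month/run_id) and quarantine."""
    silver = {}
    quarantine = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        errors = validate_row(row)
        if errors:
            # Keep the original row plus the reasons, so it can be inspected later.
            quarantine.append({**row, "errors": errors, "run_id": run_id})
            continue
        ts = datetime.fromisoformat(row["timestamp"])
        key = f"year={ts.year}/month={ts.month:02d}/run_id={run_id}"
        silver.setdefault(key, []).append(row)
    return silver, quarantine


if __name__ == "__main__":
    sample = (
        "transaction_id,account_id,amount,currency,timestamp\n"
        "t1,a1,10.00,EUR,2024-01-15T10:30:00\n"
        "t2,a1,5.00,XXX,2024-02-01T00:00:00\n"
    )
    silver, quarantine = route_rows(sample, "run-001")
    print(sorted(silver), len(quarantine))
```

Keeping the validation errors alongside each quarantined row is a design choice of this sketch; it makes the quarantine set self-describing for later review.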
E) Final Verdict
Verdict: Submission-ready.
- Applied: (1) Task 5 link from homepage and HomepageFeatures → STAKEHOLDER_EMAIL; (2) Redirects added: SQL_IMPLEMENTATION_CODE → SCRIPTS, Task 5 README → STAKEHOLDER_EMAIL; (3) HANDOUT Task 5 link updated to STAKEHOLDER_EMAIL.
- Strengths: Clear entry points (doc home, Executive Summary, tabs), Tasks 1–5 reachable; appendices in Reference; internal/dev excluded; claims evidenced; terminology consistent.
- Optional: One-sentence “where to start” on React landing (`/`); confirm ADR index builds as `adr/adr-index` (README has `id: adr-index`).