Submission Handout — Ohpen Case 2026

Purpose: Reviewer-facing summary for external submission. Use it for the cover email or as the "what to send" brief.


A) Submission Entry Points

| What | Where |
| --- | --- |
| Docusaurus URL | https://ohpen.stephenadei.nl (production) or local build: /docs/ as docs root |
| Landing | / (site home) → Documentation /docs/ (doc home) or Overview → Executive Summary |
| Key pages to read first | 1) Documentation Home ("Quick Start for Reviewers"); 2) Executive Summary; 3) System Architecture Overview; 4) Presentation Handout (Interview tab) |
| Code entry points | ETL: tasks/data_ingestion_transformation/src/etl/ingest_transactions.py (and ingest_transactions_spark.py); SQL: tasks/sql/balance_history_2024_q1.sql, tasks/sql/schema.sql; CI/CD: tasks/devops_cicd/.github/workflows/, Terraform in task 04; static copy in site: docs-site/docusaurus/static/scripts/etl/, static/scripts/sql/ |
| Repo | Clone repo; run docs-site/docusaurus/scripts/sync-docs-symlinks.sh, then build Docusaurus for local docs |

B) Deliverables Coverage Checklist (Tasks 1–5)

| Task | Deliverable | Answer page / entry | In nav? | Notes |
| --- | --- | --- | --- | --- |
| §1 Data Ingestion & Transformation | ETL flow + code | ETL Flow, ETL Code | Implementation: ETL Pipeline (§1) | Clear. |
| §2 Data Lake Architecture | Architecture design | Data Lake Architecture | Architecture tab + Implementation | Clear. |
| §3 SQL Analytics | Query + breakdown | SQL Breakdown, SQL Code | Implementation: SQL Analytics (§3) | Some internal links use /docs/tasks/sql/SQL_IMPLEMENTATION_CODE (see C). |
| §4 DevOps & CI/CD | Workflow + artifacts | CI/CD Workflow, CI/CD Artifacts | Implementation: CI/CD (§4) | Clear. |
| §5 Communication & Documentation | Stakeholder comms + one-pager | Task 5 card on home links to /docs/tasks/communication_documentation/README | ⚠ README excluded from build → link 404s | No Task 5 in sidebar; card target missing. Use STAKEHOLDER_EMAIL or add redirect. |

Appendices: Under Reference tab (ETL Deep Dive, SQL Deep Dive, Infrastructure Deep Dive, Architecture Reference). Optional depth; not required for main review.


C) Packaging Issues (Minimal Fixes)

  1. Task 5 entry 404: The exclude pattern **/tasks/**/README* in docusaurus.config.js removes tasks/communication_documentation/README.md from the build, but the homepage components DeliverablesShowcase and HomepageFeatures still link to /docs/tasks/communication_documentation/README.
    Fix: Point those links to /docs/tasks/communication_documentation/STAKEHOLDER_EMAIL (or add redirect README → STAKEHOLDER_EMAIL).

  2. SQL “implementation code” links: Several docs (HANDOUT, ALL_SCRIPTS, SQL_BREAKDOWN, ASSUMPTIONS_AND_EDGE_CASES, EXECUTIVE_SUMMARY in one place) link to /docs/tasks/sql/SQL_IMPLEMENTATION_CODE. Sidebar and canonical URL use SCRIPTS (tasks/sql/SCRIPTS).
    Fix: Add redirect: /docs/tasks/sql/SQL_IMPLEMENTATION_CODE → /docs/tasks/sql/SCRIPTS.

  3. HANDOUT.md Task 5 links: HANDOUT.md links to .../communication_documentation/README (excluded from the build) and .../TECHNICAL_REFERENCE (file exists; confirm its doc id and that it builds).
    Fix: Update the README link to STAKEHOLDER_EMAIL (or rely on the same redirect as in item 1).

  4. Landing “where to start”: React home (/) has tagline + “Documentation” / “Overview” but no single paragraph “Start here: read X then Y.” The doc home (/docs/) has “Quick Start for Reviewers” and “If you have 5 minutes” — that is the effective “where to start.” Optional: add one sentence on / e.g. “Start with Executive Summary (Overview) then use the tabs for deep dives.”

  5. ADR index: Sidebar uses adr/adr-index; docs/adr/README.md has frontmatter id: adr-index. With path adr/README, doc id may be adr/README or adr/adr-index depending on theme; confirm build. If 404, add redirect or align sidebar id with actual doc id.
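The link fixes in items 1 to 3 can be checked mechanically. The following is a minimal sketch, not part of the repository: the redirect map mirrors the fixes proposed above, while `BUILT_ROUTES` and the `resolve` helper are hypothetical stand-ins for whatever the real build emits.

```python
from typing import Optional

# Redirects proposed in items 1-3 above (old link -> canonical target).
REDIRECTS = {
    "/docs/tasks/communication_documentation/README":
        "/docs/tasks/communication_documentation/STAKEHOLDER_EMAIL",
    "/docs/tasks/sql/SQL_IMPLEMENTATION_CODE": "/docs/tasks/sql/SCRIPTS",
}

# Assumption: a hand-written sample of routes the build actually emits.
BUILT_ROUTES = {
    "/docs/tasks/communication_documentation/STAKEHOLDER_EMAIL",
    "/docs/tasks/sql/SCRIPTS",
}


def resolve(link: str) -> Optional[str]:
    """Return the route a link ends up on, or None if it would 404."""
    target = REDIRECTS.get(link, link)  # follow at most one redirect
    return target if target in BUILT_ROUTES else None


# Usage: list every link that would still 404 after the redirects.
links = [
    "/docs/tasks/communication_documentation/README",
    "/docs/tasks/sql/SQL_IMPLEMENTATION_CODE",
    "/docs/tasks/sql/MISSING_PAGE",  # deliberately broken example
]
broken = [link for link in links if resolve(link) is None]
print(broken)
```

In a real check, the route set would be collected from the build output rather than typed by hand.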

Internal / dev: docs/development/**, docs/guides/**, and various audit/pattern files are excluded in config. ADRs are under Reference and are reviewer-appropriate; not overwhelming.


D) Final Handout Outline (for Cover Email or One-Pager)

What to read (5 bullets, order matters)

  1. Documentation Home (/docs/) — Quick Start for Reviewers and “If you have 5 minutes” tell you where to start.
  2. Executive Summary — Business case, solution overview, and links to all deliverables.
  3. System Architecture Overview — End-to-end design and component roles.
  4. Implementation tab — ETL Flow (§1), Data Lake Architecture (§2), SQL Breakdown (§3), CI/CD Workflow (§4) as the main “answer” pages per task.
  5. Interview tab → Presentation Handout — Short narrative, pipeline diagram, and “what the system does” for interviews.

What the system does + key metrics (5 bullets)

  1. ETL: CSV in S3 → validation (schema, currency, timestamps) → Parquet in Silver (year/month partitions, run_id); invalid rows to Quarantine; loop prevention and circuit breaker.
  2. Data lake: Medallion (Bronze → Silver → Gold) + Quarantine/Condemned; schema evolution additive-only; governance and ownership boundaries.
  3. Analytics: Month-end balance history (e.g. Q1 2024) via partition pruning and window functions; 95% scan reduction for time-bounded queries; target <30s for 100M-row scale.
  4. CI/CD: GitHub Actions (lint, test, build); Terraform (S3, Glue, Step Functions, EventBridge, IAM); OIDC keyless auth; two-role separation; optional manual approval; automated rollback.
  5. Observability: Run identity (e.g. Step Functions ARN), CloudWatch metrics, CloudTrail audit; QUALITY_REQUIREMENTS and ADRs document evidence (partition pruning, lifecycle, serverless).
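To make the ETL bullet concrete, here is a minimal sketch of the validate-then-route step using only the standard library. It is not the code in ingest_transactions.py: the column names, the allowed-currency set, and the Hive-style partition string are assumptions chosen for illustration, and Parquet writing, loop prevention, and the circuit breaker are left out.

```python
from datetime import datetime

# Assumed column names; the real schema lives in the ETL code.
REQUIRED_COLUMNS = ("transaction_id", "account_id", "amount", "currency", "timestamp")
# Assumed allow-list; the real list may differ.
ALLOWED_CURRENCIES = {"EUR", "USD", "GBP"}


def validate(row):
    """Return a list of validation errors; an empty list means the row is valid."""
    errors = []
    missing = [col for col in REQUIRED_COLUMNS if not row.get(col)]
    if missing:
        errors.append("missing columns: " + ", ".join(missing))
    currency = row.get("currency")
    if currency and currency not in ALLOWED_CURRENCIES:
        errors.append("unsupported currency: " + currency)
    timestamp = row.get("timestamp")
    if timestamp:
        try:
            datetime.fromisoformat(timestamp)
        except ValueError:
            errors.append("unparseable timestamp: " + timestamp)
    return errors


def route_rows(rows, run_id):
    """Split rows into Silver-bound and Quarantine-bound records, tagged with run_id."""
    silver, quarantine = [], []
    for row in rows:
        errors = validate(row)
        if errors:
            quarantine.append({**row, "run_id": run_id, "errors": errors})
            continue
        ts = datetime.fromisoformat(row["timestamp"])
        silver.append({
            **row,
            "run_id": run_id,
            "year": ts.year,
            "month": ts.month,
            # Assumed Hive-style partition layout.
            "partition": f"year={ts.year}/month={ts.month:02d}",
        })
    return silver, quarantine
```

Keeping the error list on each quarantined record means a reviewer can see why a row was rejected without rerunning the job.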

E) Final Verdict

Verdict: Submission-ready.

  • Applied: (1) Task 5 link from homepage and HomepageFeatures → STAKEHOLDER_EMAIL; (2) Redirects added: SQL_IMPLEMENTATION_CODE → SCRIPTS, Task 5 README → STAKEHOLDER_EMAIL; (3) HANDOUT Task 5 link updated to STAKEHOLDER_EMAIL.
  • Strengths: Clear entry points (doc home, Executive Summary, tabs), Tasks 1–5 reachable; appendices in Reference; internal/dev excluded; claims evidenced; terminology consistent.
  • Optional: One-sentence “where to start” on React landing (/); confirm ADR index builds as adr/adr-index (README has id: adr-index).

© 2026 Stephen Adei. CC BY 4.0