Security & CI/CD Strategy

Executive Summary

The Data Lake implements a Cloud-Native, Zero-Trust Deployment Architecture. The design moves beyond legacy "static access keys" to a modern OpenID Connect (OIDC) identity federation model. This ensures that the deployment pipeline (GitHub Actions) authenticates to the infrastructure (AWS) using temporary, strictly scoped credentials that exist only for the duration of a single job.

1. The Distinction: CI vs. CD

Continuous Integration (Validation) and Continuous Delivery (Deployment) are strictly separated to ensure stability and security.

Continuous Integration (CI) - "The Safety Net"

  • Goal: Validate code correctness before it ever touches production.
  • Trigger: Every Pull Request.
  • Security Profile: Read-Only / Mocked Environment.
  • Actions:
    • Unit Tests (PyTest)
    • Code Quality Checks (Ruff, SQLFluff)
    • Infrastructure Validation (terraform plan)
    • No access to production data or keys.

Continuous Delivery (CD) - "The Gatekeeper"

  • Goal: Safely release validated changes to production.
  • Trigger: Merge to main branch + Manual Approval.
  • Security Profile: Elevated Privileges (Scoped via OIDC).
  • Actions:
    • Authenticate via OIDC (Keyless).
    • Wait for Human Approval (Senior Engineer/Tech Lead).
    • Apply Infrastructure Changes (terraform apply).
    • Deploy Glue Jobs & State Machines.

2. Keyless Authentication (OIDC)

In a traditional setup, an IAM User's "Access Key" (e.g., AKIA...) is stored in GitHub Secrets. If this key leaks, an attacker retains access until the key is rotated or revoked.

Solution: OIDC Federation

  1. Trust, Not Shared Secrets: AWS accepts a cryptographically signed token from GitHub only if its claims match a specific organization/repository and branch.
  2. Temporary Credentials: The token is exchanged for temporary AWS credentials valid for only 60 minutes.
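  3. Auditable: The credential exchange is logged in AWS CloudTrail as an AssumeRoleWithWebIdentity event, and every subsequent deployment action is recorded under that federated role session, so each change can be attributed to a GitHub-originated deployment.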

Why this matters: The approach removes leaked static API keys, a common cause of cloud data breaches, as an attack vector.
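The trust condition described in step 1 can be sketched as an IAM role trust policy. The following is a minimal illustration, not the platform's actual policy: the function name, account ID, organization, and repository are hypothetical placeholders, and it assumes the GitHub OIDC provider is already registered in the AWS account.

```python
import json


def github_oidc_trust_policy(account_id: str, org: str, repo: str, branch: str = "main") -> dict:
    """Build a trust policy admitting GitHub OIDC tokens from one repo and branch only."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The federated principal is the OIDC provider registered in IAM.
                "Principal": {"Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience claim expected for tokens exchanged with AWS STS.
                        f"{provider}:aud": "sts.amazonaws.com",
                        # Subject claim pins the token to a single repo and branch.
                        f"{provider}:sub": f"repo:{org}/{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(github_oidc_trust_policy("123456789012", "example-org", "data-lake"), indent=2))
```

If the deployment job runs inside a GitHub environment (a common way to implement manual approval), the subject claim takes the form `repo:ORG/REPO:environment:NAME` instead, so the condition would need to match that value.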

3. Least Privilege & Auditability

The security model extends beyond deployment into the runtime environment:

Layer | Security Control | Business Value
Deployment | OIDC + GitOps | All infrastructure changes are version-controlled, code-reviewed, and deployed without static keys.
Data Lake | Immutable Bronze Layer | Raw data (bronze/) can be read but never overwritten or deleted, ensuring a legally defensible audit trail.
Runtime | Role Segregation | The ETL Job (Glue) cannot access Human Resources data; the Dashboard (QuickSight) cannot delete data.
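One way the delete protection on the bronze layer could be expressed is an S3 bucket policy with an explicit deny. This is a sketch under assumptions: the function name and bucket name are hypothetical, and the source does not state which mechanism the platform actually uses.

```python
def bronze_delete_deny_policy(bucket: str, prefix: str = "bronze/") -> dict:
    """Build a bucket policy that denies object deletion under the given prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBronzeDeletes",
                "Effect": "Deny",
                "Principal": "*",
                # Block removal of both current objects and specific versions.
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


# Hypothetical bucket name, for illustration only.
policy = bronze_delete_deny_policy("example-data-lake")
```

A deny on deletes alone does not stop an object from being overwritten by a new upload to the same key. Overwrite protection would typically rely on S3 Versioning or S3 Object Lock in addition to a policy like this one.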

4. Compliance & Governance

This architecture supports regulatory compliance (GDPR, SOC 2) by enforcing:

  1. No Human Access to Production Data: Developers interact with the pipeline, not the database.
  2. Four-Eyes Principle: Infrastructure changes require a Pull Request review + a Deployment Approval check.
  3. Full Traceability: Every change in AWS can be traced back to a specific Commit SHA and PR in GitHub.
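The traceability point can be illustrated by stamping deployed resources with Git metadata. The sketch below reads the default environment variables that GitHub Actions sets for every job (`GITHUB_SHA`, `GITHUB_REPOSITORY`, `GITHUB_REF`, `GITHUB_RUN_ID`). The function name and tag keys are hypothetical, and whether the platform records this metadata as resource tags is an assumption.

```python
import os
from typing import Mapping


def deployment_tags(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect Git metadata from the CI environment to attach to deployed resources."""
    return {
        # Commit that triggered the workflow run.
        "git_commit_sha": env.get("GITHUB_SHA", "unknown"),
        # Repository in "owner/name" form.
        "git_repository": env.get("GITHUB_REPOSITORY", "unknown"),
        # Full ref, e.g. "refs/heads/main".
        "git_ref": env.get("GITHUB_REF", "unknown"),
        # Unique ID of the workflow run, useful for linking back to logs.
        "github_run_id": env.get("GITHUB_RUN_ID", "unknown"),
    }
```

Tags like these could be passed to the infrastructure tooling as default resource tags, so that a resource seen in AWS points back to the commit and workflow run that produced it.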

5. Compliance & Controls

This platform is aligned to EU fintech regulatory expectations, addressing:

  • GDPR – Privacy & security of processing
  • DORA – Operational resilience (applicable since 17 January 2025)
  • SOC 1 / ISAE 3402 – Financial reporting control assurance
  • ISO/IEC 27001 – Security management system backbone
  • SOC 2 Type II – Customer trust for data platforms
  • EBA outsourcing guidelines – AWS/vendor governance, critical/important classification, exit plans
  • BCBS 239 – Data lineage, reconciliation, accuracy/completeness, reporting under stress

A full control matrix and third-party dependency register mapping these frameworks to architecture evidence is available in Compliance & Controls Framework (see below).

See also

© 2026 Stephen Adei · CC BY 4.0