IAM & Security Design
© 2026 Stephen Adei. All rights reserved. All content on this site is the intellectual property of Stephen Adei. See License for terms of use and attribution.
This section details the Identity and Access Management (IAM) architecture for the Data Lake, focusing on the deployment security gap: how GitHub Actions authenticates to AWS and deploys securely without long-lived access keys.
1. OIDC Identity Architecture ("Keyless Entry")
Traditional long-lived AWS access keys, which can be leaked, are replaced with OpenID Connect (OIDC) federation. This allows GitHub Actions to exchange its short-lived job token for temporary AWS credentials.
Authentication Flow
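The token exchange described above can be sketched in Python. This is a minimal illustration, not the project's actual workflow (a GitHub Actions workflow would more commonly use a ready-made action for this step). The function names, the session name, and the one-hour duration are assumptions. The two `ACTIONS_ID_TOKEN_REQUEST_*` environment variables are the ones GitHub exposes to jobs that have the `id-token: write` permission, and `boto3` is a third-party dependency.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(env, audience="sts.amazonaws.com"):
    """Build the HTTP request that asks GitHub for this job's OIDC token."""
    url = env["ACTIONS_ID_TOKEN_REQUEST_URL"]
    separator = "&" if "?" in url else "?"
    url = f"{url}{separator}audience={urllib.parse.quote(audience)}"
    headers = {"Authorization": f"Bearer {env['ACTIONS_ID_TOKEN_REQUEST_TOKEN']}"}
    return urllib.request.Request(url, headers=headers)


def fetch_oidc_token(env):
    """Step 1: obtain the short-lived job token (a signed JWT) from GitHub."""
    with urllib.request.urlopen(build_token_request(env)) as response:
        return json.load(response)["value"]


def exchange_for_aws_credentials(oidc_token, role_arn):
    """Step 2: trade the job token for temporary AWS credentials via STS."""
    import boto3  # third-party; imported lazily so the helpers above stay stdlib-only

    sts = boto3.client("sts")
    response = sts.assume_role_with_web_identity(
        RoleArn=role_arn,
        RoleSessionName="github-actions-deploy",  # hypothetical session name
        WebIdentityToken=oidc_token,
        DurationSeconds=3600,  # assumed lifetime; the role's maximum may differ
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

Inside a job, the calls would be chained as `exchange_for_aws_credentials(fetch_oidc_token(os.environ), role_arn)`. No secret is stored in the repository at any point; both tokens expire on their own.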
Trust Policy (The "Security Gate")
This policy determines who can assume the role. It is strictly scoped to a specific repository and branch.
File: infra/iam/trust_policies/github_oidc_trust.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:organization/repository:ref:refs/heads/main"
        }
      }
    }
  ]
}
Security Controls in this Policy:
- aud (Audience): Must be sts.amazonaws.com. Prevents using tokens meant for other services.
- sub (Subject): The critical lock.
  - repo:organization/repository: Only this specific repository.
  - ref:refs/heads/main: Only workflows running on the main branch.
  - Pull Requests from forks cannot assume this role.
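To make the effect of the two conditions concrete, the following sketch mimics the StringEquals check against a token's claims. It is a simplified model, not how STS is implemented. The function name `may_assume_role` and the sample claims are illustrative assumptions. The `repo:...:pull_request` subject is the form GitHub issues for workflows triggered by pull requests.

```python
# Conditions copied from the trust policy above, keyed by claim name.
TRUST_CONDITIONS = {
    "aud": "sts.amazonaws.com",
    "sub": "repo:organization/repository:ref:refs/heads/main",
}


def may_assume_role(claims):
    """Return True only if every condition matches exactly (StringEquals)."""
    return all(claims.get(name) == expected
               for name, expected in TRUST_CONDITIONS.items())


main_push = {"aud": "sts.amazonaws.com",
             "sub": "repo:organization/repository:ref:refs/heads/main"}
pull_request = {"aud": "sts.amazonaws.com",
                "sub": "repo:organization/repository:pull_request"}
feature_branch = {"aud": "sts.amazonaws.com",
                  "sub": "repo:organization/repository:ref:refs/heads/feature-x"}

for label, claims in [("main push", main_push),
                      ("pull request", pull_request),
                      ("feature branch", feature_branch)]:
    print(label, "->", "allowed" if may_assume_role(claims) else "denied")
```

Because the match is exact rather than a wildcard, any subject other than the main branch of the named repository fails the check.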
2. Role Separation Strategy (CI vs. CD)
A strict separation is implemented between Continuous Integration (CI) and Continuous Delivery (CD) roles to minimize blast radius.
Role Responsibilities
| Feature | CI Role (CI-Reader) | CD Role (CD-Deployer) |
|---|---|---|
| Purpose | Validation, Testing, Planning | Deployment, Infrastructure Changes |
| Trigger | Pull Requests (Any Branch) | Merge to main (Protected) |
| S3 Access | Read-Only (Prod), Read/Write (Test/Artifacts) | Read/Write (Prod/Artifacts) |
| Terraform | terraform plan (Read-only on State) | terraform apply (Write to State) |
| AWS Services | None (Runs in GitHub/MinIO) | Full Control of ETL Resources (Glue, Step Functions) |
| Risk Profile | Low | High |
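The Trigger row of the table can be read as a small decision rule. The sketch below encodes it. The role ARNs are hypothetical, assembled from the example account ID in the trust policy and the role names in the table, and `select_role` is an illustrative helper rather than part of the project.

```python
# Hypothetical ARNs; only the role names come from the table above.
CI_ROLE_ARN = "arn:aws:iam::123456789012:role/CI-Reader"
CD_ROLE_ARN = "arn:aws:iam::123456789012:role/CD-Deployer"


def select_role(event_name, ref):
    """Map a GitHub event to the role a workflow should request, if any."""
    if event_name == "pull_request":
        return CI_ROLE_ARN  # validation and planning only
    if event_name == "push" and ref == "refs/heads/main":
        return CD_ROLE_ARN  # deployment after merge to main
    return None  # every other trigger gets no AWS role at all
```

A rule like this only decides which role a workflow asks for. The trust policy on each role remains the real enforcement point.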
I. CI Role Policy (Read-Only / Test)
Used for PR validation. Cannot change production infrastructure.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadStateForPlan",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::ohpen-terraform-state",
        "arn:aws:s3:::ohpen-terraform-state/*"
      ]
    },
    {
      "Sid": "PushArtifacts",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::ohpen-artifacts/builds/*"
    }
  ]
}
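A quick way to reason about what this role can and cannot do is to evaluate a few requests against its statements. The sketch below is a deliberately simplified evaluator: it handles only Allow statements, ignores conditions and explicit denies, and uses Python's `fnmatchcase` as a stand-in for IAM wildcard matching. The policy copy is reduced to the object-level permissions, and the object keys in the usage lines are made up for illustration.

```python
from fnmatch import fnmatchcase

# Object-level permissions of the CI role, transcribed from the policy above.
CI_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ReadStateForPlan", "Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::ohpen-terraform-state/*"},
        {"Sid": "PushArtifacts", "Effect": "Allow",
         "Action": ["s3:PutObject"],
         "Resource": "arn:aws:s3:::ohpen-artifacts/builds/*"},
    ],
}


def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def is_allowed(policy, action, resource):
    """Return True if any Allow statement matches both action and resource."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        action_ok = any(fnmatchcase(action, a) for a in as_list(statement["Action"]))
        resource_ok = any(fnmatchcase(resource, r) for r in as_list(statement["Resource"]))
        if action_ok and resource_ok:
            return True
    return False  # nothing matched: implicit deny


state_key = "arn:aws:s3:::ohpen-terraform-state/prod/terraform.tfstate"
print(is_allowed(CI_POLICY, "s3:GetObject", state_key))  # reading state for a plan
print(is_allowed(CI_POLICY, "s3:PutObject", state_key))  # writing state
```

The first request matches ReadStateForPlan, while the second matches no statement, which is why a PR build can run terraform plan but not terraform apply.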
II. CD Role Policy (Deployer)
Used only after a merge to main and after the manual approval gate has been passed.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ManageInfrastructure",
      "Effect": "Allow",
      "Action": ["s3:*", "glue:*", "lambda:*", "states:*", "events:*", "iam:PassRole"],
      "Resource": "*"
    },
    {
      "Sid": "TerraformStateWrite",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::ohpen-terraform-state",
        "arn:aws:s3:::ohpen-terraform-state/*"
      ]
    }
  ]
}
Note: In a stricter enterprise environment, ManageInfrastructure would be scoped down to specific resource ARNs (e.g., arn:aws:glue:eu-west-1:*:job/ohpen-*).
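One way to keep track of statements that still need this scoping is a small lint check. The sketch below flags Allow statements that pair service-wide wildcard actions with Resource "*". The helper name `find_broad_statements` is an assumption, and the check is a heuristic rather than a complete policy analyzer.

```python
CD_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ManageInfrastructure", "Effect": "Allow",
         "Action": ["s3:*", "glue:*", "lambda:*", "states:*", "events:*", "iam:PassRole"],
         "Resource": "*"},
        {"Sid": "TerraformStateWrite", "Effect": "Allow",
         "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
         "Resource": "arn:aws:s3:::ohpen-terraform-state/*"},
    ],
}


def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def find_broad_statements(policy):
    """Return (Sid, wildcard actions) for Allow statements on Resource '*'."""
    findings = []
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        wildcard_actions = [a for a in as_list(statement["Action"])
                            if a == "*" or a.endswith(":*")]
        if "*" in as_list(statement["Resource"]) and wildcard_actions:
            findings.append((statement.get("Sid", "<no Sid>"), wildcard_actions))
    return findings


for sid, actions in find_broad_statements(CD_POLICY):
    print(f"{sid}: consider scoping {', '.join(actions)} to specific ARNs")
```

Run against the CD policy, only ManageInfrastructure is reported. TerraformStateWrite already names a specific bucket.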
IAM Role Relationships & Trust
[Diagram: IAM role trust relationships]
3. Deployment Pipeline Security
The deployment pipeline itself acts as a security policy enforcement point.
Continuous Delivery (CD) - Manual Verification Pattern
The project adheres to Continuous Delivery (auto-preparation, manual release) rather than Continuous Deployment (auto-release) for the initial operational phase.
GitHub Environment Protection
GitHub Environments are used to enforce the "Manual Approval" gate outside of the code itself.
- Create Environment: production
- Required Reviewers: Tech Lead
- Deployment Branch Rule: Limit to main branch only
This ensures that even if a developer modifies the workflow file to remove the "needs approval" step, the GitHub Environment settings will still block the deployment.
4. Least Privilege for Runtime Services
While the Deployment (CD) role needs broad permissions to create resources, the resources themselves (the "Runtime" roles) must be highly restricted.
Glue Service Role (ohpen-glue-service-role)
Principle: The ETL job should only touch specific S3 folders.
- Allowed: s3:PutObject on silver/ and gold/
- Allowed: s3:GetObject on bronze/
- Denied: s3:DeleteObject on bronze/ (Immutable Audit Trail)
- Denied: Access to human-resources/ or other sensitive buckets
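The rules above could translate into a policy document along the following lines. This is a sketch under assumptions: the bucket name passed in, the Sid values, and the helper `glue_runtime_policy` are hypothetical, and the source does not show the real policy for ohpen-glue-service-role.

```python
import json


def glue_runtime_policy(bucket):
    """Build a least-privilege S3 policy for the ETL job on one bucket."""
    base = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "WriteCuratedLayers", "Effect": "Allow",
             "Action": ["s3:PutObject"],
             "Resource": [f"{base}/silver/*", f"{base}/gold/*"]},
            {"Sid": "ReadRawLayer", "Effect": "Allow",
             "Action": ["s3:GetObject"],
             "Resource": [f"{base}/bronze/*"]},
            # Explicit deny: holds even if another attached policy allows deletes.
            {"Sid": "ProtectAuditTrail", "Effect": "Deny",
             "Action": ["s3:DeleteObject"],
             "Resource": [f"{base}/bronze/*"]},
        ],
    }


# "example-data-lake" is a placeholder bucket name.
print(json.dumps(glue_runtime_policy("example-data-lake"), indent=2))
```

Access to human-resources/ and other buckets needs no statement of its own in this sketch: anything not explicitly allowed is implicitly denied.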
Step Functions Role (ohpen-step-functions-role)
Principle: Orchestration only—cannot read data.
- Allowed: glue:StartJobRun
- Allowed: glue:GetJobRun
- Denied: s3:* (The orchestrator does not need to see the data, only the metadata of the job)
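The distinction between an explicit deny and simply not granting access can be illustrated with a toy evaluator. The policy below is an assumed rendering of the three rules above (Resource "*" is a simplification, and the Sid names are invented). The `decide` function models only the action dimension of IAM evaluation, where an explicit Deny always overrides an Allow.

```python
from fnmatch import fnmatchcase

# Assumed policy shape for the orchestrator role; not taken from the source.
STEP_FUNCTIONS_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "RunGlueJobs", "Effect": "Allow",
         "Action": ["glue:StartJobRun", "glue:GetJobRun"], "Resource": "*"},
        {"Sid": "NoDataAccess", "Effect": "Deny",
         "Action": ["s3:*"], "Resource": "*"},
    ],
}


def decide(policy, action):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for an action."""
    decision = "ImplicitDeny"
    for statement in policy["Statement"]:
        if any(fnmatchcase(action, pattern) for pattern in statement["Action"]):
            if statement["Effect"] == "Deny":
                return "ExplicitDeny"  # a deny wins regardless of any allow
            decision = "Allow"
    return decision


for action in ["glue:StartJobRun", "s3:GetObject", "lambda:InvokeFunction"]:
    print(action, "->", decide(STEP_FUNCTIONS_POLICY, action))
```

The explicit deny on s3:* matters if a broader policy is later attached to the same role by mistake: the orchestrator would still be unable to read data.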
5. Summary of Security Controls
| Control Layer | Mechanism | Implementation |
|---|---|---|
| Authentication | OIDC (OpenID Connect) | GitHub Actions → AWS STS (No long-lived keys) |
| Authorization | IAM Policies | Separate CI-Reader vs CD-Deployer roles |
| Access Scope | Trust Policy Conditions | repo:organization/repository:ref:refs/heads/main |
| Process Control | Continuous Delivery | Manual Application Approval Gate for Production |
| Runtime Control | Role Assumption | Services (Glue) assume distinct roles from Deployers |
See also
- CI/CD Workflow - How OIDC authentication and two-role separation are implemented
- Data Lake Architecture - Architecture secured by these IAM policies
- CI/CD Complete Reference - Complete CI/CD workflow details
- CI/CD Workflow Details - Detailed workflow implementation
- Design Decisions Summary - Security decision rationale and trade-offs