© 2026 Stephen Adei. All rights reserved. All content on this site is the intellectual property of Stephen Adei. See License for terms of use and attribution.
CI/CD Complete Reference
This section contains all CI/CD-related reference materials: testing guides and workflow details. For the main workflow documentation, see CI/CD Workflow.
Part 1: CI/CD Testing Guide
Overview
Testing CI/CD workflows before you push is practical and recommended. This guide covers several ways to test your GitHub Actions workflows.
Why Test CI/CD?
Benefits: Catch issues early (find workflow problems before pushing); faster feedback (test locally without GitHub Actions); cost savings (reduce Actions minutes); confidence (workflows validated before deployment).
Testing Approaches
1. Local Testing with act (Recommended)
act is a tool that runs GitHub Actions workflows locally using Docker.
Installation
# Linux
curl https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash
# macOS
brew install act
# Windows (with Chocolatey)
choco install act-cli
Prerequisites: Docker must be installed and running.
Quick Start
# List all workflows and jobs
cd tasks/devops_cicd
act -l
# Run all workflows (push event)
act push
# Run specific job
act push -j python-validation
# Run with verbose output
act push --verbose
Using the Test Script
A convenient test script is provided:
# Make script executable
chmod +x tasks/devops_cicd/scripts/test_ci_workflow.sh
# Run the test script
./tasks/devops_cicd/scripts/test_ci_workflow.sh
The script will:
- Check if act is installed
- Check if Docker is running
- List available jobs
- Let you choose which job to test
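The first two checks can be sketched in Python. This is a hypothetical illustration under stated assumptions, not the contents of test_ci_workflow.sh: the names `docker_is_running` and `preflight`, and the injectable `which` and `docker_check` parameters, are invented here so the logic can be exercised without act or Docker installed.

```python
import shutil
import subprocess


def docker_is_running() -> bool:
    """Return True if `docker info` exits with status 0."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        return False  # the docker binary itself is missing
    return result.returncode == 0


def preflight(which=shutil.which, docker_check=docker_is_running) -> list[str]:
    """Collect readable problems; an empty list means act can be run."""
    problems = []
    if which("act") is None:
        problems.append("act is not installed")
    if not docker_check():
        problems.append("Docker is not running")
    return problems
```

Calling `preflight()` with no arguments checks the real machine; if it returns an empty list, `act -l` can then be used to list the available jobs.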
Limitations
- Secrets: Must be provided manually (use act secrets)
- AWS Services: Cannot test AWS-specific steps (S3, Glue) without credentials
- GitHub API: Some actions require GitHub API access
Example: Testing with Secrets
# Create secrets file
cat > .secrets <<EOF
AWS_ACCESS_KEY_ID=test
AWS_SECRET_ACCESS_KEY=test
EOF
# Run with secrets
act push --secret-file .secrets
2. Workflow Syntax Validation
Validate YAML syntax and workflow structure:
# Using actionlint (recommended)
chmod +x tasks/devops_cicd/scripts/validate_workflow_syntax.sh
./tasks/devops_cicd/scripts/validate_workflow_syntax.sh
Or manually:
# Install actionlint
# Linux: https://github.com/rhysd/actionlint#installation
# macOS: brew install actionlint
# Validate workflow
actionlint tasks/devops_cicd/.github/workflows/ci.yml
3. Manual Testing on GitHub
For full integration testing, push to a test branch:
# Create test branch
git checkout -b test/ci-workflow
# Make a small change (e.g., add a comment)
echo "# Test CI" >> tasks/devops_cicd/.github/workflows/ci.yml
# Commit and push
git add .
git commit -m "test: CI workflow"
git push origin test/ci-workflow
# Check GitHub Actions tab for results
Cleanup:
git checkout main
git branch -D test/ci-workflow
git push origin --delete test/ci-workflow
4. Unit Testing Workflow Steps
Test individual workflow steps in isolation:
Example: Test Python Setup
# Test Python setup step locally
python3 -m venv test-venv
source test-venv/bin/activate
pip install --upgrade pip
pip install -r tasks/data_ingestion_transformation/requirements.txt
pip install -r tasks/data_ingestion_transformation/requirements-dev.txt
# Test linting
cd tasks/data_ingestion_transformation
ruff check src/ tests/
# Test unit tests
pytest tests/test_etl.py tests/test_integration.py -v
Example: Test PySpark Setup
# Test Java setup
java -version # Should be Java 17
# Test PySpark installation
python3 -c "import pyspark; print(pyspark.__version__)"
# Test PySpark tests
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
pytest tests/test_etl_spark.py -v
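The "Should be Java 17" check can also be automated. The sketch below is an assumption about how one might do it, not part of the project's scripts; the function name `java_major_version` is hypothetical. It parses the version string that `java -version` prints and handles the older `1.x` numbering used by Java 8 and earlier.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, e.g. "1.8.0_292"
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major
```

Note that `java -version` writes to stderr, so when capturing it with `subprocess.run(["java", "-version"], capture_output=True, text=True)` the text to parse is in the `stderr` attribute.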
Testing Strategy
What to Test
- Workflow syntax — YAML is valid.
- Job dependencies — Jobs run in correct order.
- Step execution — Each step completes successfully.
- Environment setup — Python, Java, dependencies install correctly.
- Test execution — Unit tests run and pass.
- Linting — Code style checks pass.
- AWS integration — Requires AWS credentials (test separately).
What's Hard to Test Locally
- AWS services (S3, Glue, Step Functions) — Require AWS credentials.
- GitHub API — Some actions need GitHub API access.
- Secrets management — Must be provided manually.
- Matrix builds — Can be slow locally.
Recommended Testing Flow
CI/CD Test Checklist
Before merging CI/CD changes:
- Workflow YAML syntax is valid
- All jobs can run locally (with act)
- Python setup works (Python 3.10)
- Java setup works (Java 17)
- Dependencies install correctly
- Linting passes (ruff, sqlfluff)
- Unit tests pass (pytest)
- Workflow runs on test branch
- No secrets exposed in logs
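The last checklist item can be partly automated by scanning a saved log for known secret values. This is a minimal sketch under assumptions: the function name `find_leaked_secrets` is hypothetical, and it only detects verbatim occurrences, so encoded or truncated secrets would go unnoticed.

```python
def find_leaked_secrets(log_text: str, secrets: dict[str, str]) -> list[str]:
    """Return the names of secrets whose values appear verbatim in the log."""
    leaked = []
    for name, value in secrets.items():
        # Skip empty values: an empty string is contained in every log
        if value and value in log_text:
            leaked.append(name)
    return leaked
```

The function returns secret names rather than values so that its own output is safe to print in CI.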
Troubleshooting
act Issues
Problem: act cannot find Docker
# Solution: Ensure Docker is running
docker info
Problem: act uses wrong image size
# Solution: Select image size on first run
act push
# Choose: micro, medium, or large
Problem: Workflow fails with "secrets not found"
# Solution: Provide secrets manually
act push --secret-file .secrets
Workflow Issues
Problem: Tests fail locally but pass on GitHub
- Check Python version (should be 3.10)
- Check Java version (should be 17)
- Verify dependencies match requirements.txt
Problem: Linting fails
# Run linting manually to see errors
cd tasks/data_ingestion_transformation
ruff check src/ tests/
Continuous Improvement
Add Workflow Tests to CI
You can even test your CI/CD workflows in CI! Add a workflow validation step:
# .github/workflows/validate-workflows.yml
name: Validate Workflows
on:
  pull_request:
    paths:
      - ".github/workflows/**"
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Validate workflow syntax
        uses: schema-tools/actionlint@v1
        with:
          files: ".github/workflows/*.yml"
Resources
- act: https://github.com/nektos/act
- actionlint: https://github.com/rhysd/actionlint
- GitHub Actions Docs: https://docs.github.com/en/actions
Summary
Local CI/CD testing is supported and recommended for pre-push validation.
- Use act for local testing
- Validate syntax with actionlint
- Test individual steps manually
- Push to test branch for full integration
- Test AWS-specific steps separately
Time Investment: ~10 minutes to set up; in return, most workflow errors surface locally instead of after a push.
Related Documentation
- CI/CD Workflow Design - Complete workflow design and architecture
- Test Suite Summary - Test suite implementation details
- Test Suite Documentation (source: tasks/devops_cicd/tests/README) - Detailed test documentation
- Scripts Documentation (source: tasks/devops_cicd/scripts/README) - CI/CD testing scripts
Related Solutions
- ETL Pipeline - Code tested by this CI/CD
- Data Lake Architecture - Infrastructure provisioned by this CI/CD
- SQL Query Implementation - Code validated by this CI/CD
Technical Documentation
- Unified Testing Convention - Testing standards
- Testing Guide - Comprehensive testing documentation
Part 2: Workflow Failure Scenarios & Logic Details
Failure Scenarios
Critical Rule: Failed runs never update _LATEST.json or current/ prefix.
Failure Types:
- ETL Job Failure: Non-zero exit, no _SUCCESS, no data written → Alert triggers, safe rerun
- Partial Write: Job crashes mid-execution → Partial files ignored, new run_id on rerun
- Validation Failure: Quarantine rate > threshold → Data Quality Team reviews, fixes source, reruns
- Circuit Breaker: >100 same errors/hour → Pipeline halts, Platform Team investigates
- Critical errors: Business duplicate or circuit-breaker in quarantine → Promotion blocked; review and rerun (promotion gate does not check schema drift)
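The circuit-breaker rule (more than 100 identical errors per hour halts the pipeline) can be illustrated with a sliding-window counter. This is a sketch under assumptions: the source does not say whether the hour is a sliding or fixed window, the class name `CircuitBreaker` is hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class CircuitBreaker:
    """Trip when one error type occurs more than `threshold` times per window."""

    def __init__(self, threshold: int = 100, window_seconds: float = 3600.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # error type -> recent timestamps

    def record(self, error_type: str, timestamp: float) -> bool:
        """Record one error; return True if the pipeline should halt."""
        events = self._events[error_type]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        return len(events) > self.threshold
```

Counting per error type keeps a burst of one failure mode from being masked by, or blamed on, unrelated errors.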
Safe Rerun: Each rerun uses a new run_id; failed runs are preserved for audit; only successful runs are promoted.
Promotion Workflow: ETL writes to isolated run_id path → _SUCCESS marker → CloudWatch alarm → Human review (Domain Analyst + Platform Team) → Approval → Promote to production.
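Run isolation and the rule that failed runs never update _LATEST.json can be sketched on a local filesystem. Everything below is illustrative and assumed: the real pipeline writes to S3 and includes a human approval step that is omitted here, and the `run_id=<hex>` directory layout, the JSON payload, and the function names `new_run_dir` and `promote` are hypothetical.

```python
import json
import uuid
from pathlib import Path


def new_run_dir(base: Path) -> Path:
    """Create an isolated output directory under a fresh run_id."""
    run_dir = base / f"run_id={uuid.uuid4().hex}"
    run_dir.mkdir(parents=True)
    return run_dir


def promote(base: Path, run_dir: Path) -> bool:
    """Point _LATEST.json at run_dir only if its _SUCCESS marker exists."""
    if not (run_dir / "_SUCCESS").exists():
        return False  # failed or partial run: leave _LATEST.json untouched
    tmp = base / "_LATEST.json.tmp"
    tmp.write_text(json.dumps({"run_id": run_dir.name}))
    tmp.replace(base / "_LATEST.json")  # swap the pointer in one rename
    return True
```

Because each run gets its own directory, a rerun never overwrites the failed run's files, which stay available for audit.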
Infrastructure Details
Step Functions Orchestration:
- RunETL State: Invokes Glue job synchronously, auto-retries (≤3 attempts, exponential backoff)
- ValidateOutput State: Checks _SUCCESS marker, retries on eventual consistency
- Error Handling: Catches failures, publishes CloudWatch metrics, logs execution details
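In Step Functions, retry behavior is normally declared in a state's Retry field rather than coded by hand. To make the semantics concrete, here is a minimal Python sketch of retrying with exponential backoff. It assumes three total attempts and a one-second base delay; neither the function name `run_with_retries` nor these defaults come from the actual state machine definition.

```python
import time


def run_with_retries(task, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); on failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the backoff schedule can be verified in a test without actually waiting.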
IAM Prefix-Scoped Permissions:
- ETL Job: bronze/* (read), silver/* (write), quarantine/* (write)
- Platform Team: bronze/*, silver/*, quarantine/* (read/write)
- Domain Teams: silver/{domain}/* (write), gold/{domain}/* (read)
- Business/Analysts: gold/* (read-only via Athena)
- Compliance: bronze/*, quarantine/* (read-only for audit)
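Prefix scoping can be illustrated with a small lookup that mirrors the permission list above. This is only a model for reasoning about the rules: real enforcement happens in IAM policies, and the role keys (`etl_job`, `platform_team`, `analyst`, `compliance`) and the function `is_allowed` are hypothetical names. Domain Teams are left out because their prefixes depend on a `{domain}` value.

```python
from fnmatch import fnmatchcase

RW = {"read", "write"}

# Role -> list of (key pattern, allowed actions), transcribed from the list above
POLICY = {
    "etl_job": [("bronze/*", {"read"}), ("silver/*", {"write"}), ("quarantine/*", {"write"})],
    "platform_team": [("bronze/*", RW), ("silver/*", RW), ("quarantine/*", RW)],
    "analyst": [("gold/*", {"read"})],
    "compliance": [("bronze/*", {"read"}), ("quarantine/*", {"read"})],
}


def is_allowed(role: str, action: str, key: str) -> bool:
    """Return True if the role may perform the action on the object key."""
    for pattern, actions in POLICY.get(role, []):
        if action in actions and fnmatchcase(key, pattern):
            return True
    return False  # default deny, including for unknown roles
```

For example, `is_allowed("etl_job", "write", "bronze/2026/01/file.csv")` is false, which reflects that the ETL job treats the bronze layer as read-only input.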
Monitoring Details
Volume Metrics: run_id, input_rows, valid_rows_count, quarantined_rows_count, condemned_rows_count
Quality Metrics: quarantine_rate, validation_failure_rate, error_type_distribution
Loop Prevention: avg_attempt_count, duplicate_detection_rate, auto_condemnation_rate, circuit_breaker_triggers
Performance: rows_processed_per_run, duration_seconds, missing_partitions, runtime_anomalies
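A per-run metrics record might look like the sketch below. The field names are taken from the volume metrics listed above, but the class name `RunMetrics` and the formula for quarantine_rate (quarantined rows divided by input rows) are assumptions, since the source does not define how the rate is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    run_id: str
    input_rows: int
    valid_rows_count: int
    quarantined_rows_count: int
    condemned_rows_count: int

    @property
    def quarantine_rate(self) -> float:
        """Quarantined rows as a fraction of input rows (0.0 for empty input)."""
        if self.input_rows == 0:
            return 0.0
        return self.quarantined_rows_count / self.input_rows
```

Deriving the rate from the stored counts, rather than storing it separately, keeps the two from drifting apart.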
Alert Ownership:
- P1 (Immediate): Job failures, infrastructure errors, circuit breaker, SLA breaches → Data Platform Team
- P2 (2-4 hours): Quarantine spikes, validation failures, high attempt counts → Data Quality Team
- P3 (8 hours): Volume anomalies → Domain Teams
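The ownership rules above amount to a lookup table from alert type to priority and owning team. A minimal sketch follows, assuming hypothetical alert-type keys such as `job_failure` and `quarantine_spike`; the actual alert identifiers used by the pipeline are not given in the source.

```python
from typing import NamedTuple


class Route(NamedTuple):
    priority: str
    owner: str


P1 = Route("P1", "Data Platform Team")
P2 = Route("P2", "Data Quality Team")
P3 = Route("P3", "Domain Teams")

# Alert type keys are illustrative; priorities and owners follow the list above
ROUTES = {
    "job_failure": P1,
    "infrastructure_error": P1,
    "circuit_breaker": P1,
    "sla_breach": P1,
    "quarantine_spike": P2,
    "validation_failure": P2,
    "high_attempt_count": P2,
    "volume_anomaly": P3,
}


def route_alert(alert_type: str) -> Route:
    """Return the priority and owning team for an alert type."""
    try:
        return ROUTES[alert_type]
    except KeyError:
        raise ValueError(f"unknown alert type: {alert_type}") from None
```

Raising on unknown alert types, instead of falling back to a default team, makes an unrouted alert visible as a configuration error.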
Governance Details
Ownership Matrix (abbreviated):
- Pipeline/CI/CD/Infrastructure: Data Platform Team
- Validation Rules: Domain Teams (Silver) / Business (Gold)
- Data Quality: Data Quality Team
- Schema: Domain Teams (Silver) / Business (Gold) approve; Platform implements
- Backfill: Platform executes; Domain/Business approves
Governance Workflows:
- Schema Change: Request → Layer-based review (Domain/Business) → Platform feasibility → Approval → Implementation → Versioning → Validation → Promotion
- Quality Issue: Alert → Data Quality triage → Source/Validation/Platform issue → Fix → Backfill approval → Reprocess → Validate → Promote
- Backfill: Request → Layer-based approval → Platform assessment → Schedule → Execute → Validate → Promote
Key Rules:
- Infrastructure changes via Terraform IaC and CI/CD only
- Failed runs never update _LATEST.json or current/
- Run isolation via run_id mandatory
- Human approval required for Silver promotion and condemned data deletion
- Quarantine rate thresholds configurable per dataset (default: 5%)
- Schema changes versioned via schema_v for backward compatibility
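The configurable quarantine threshold can be sketched as a per-dataset override table with a 5% default. The dataset name `transactions`, its 1% override, and the function name `exceeds_quarantine_threshold` are hypothetical; the strict "greater than" comparison follows the "Quarantine rate > threshold" wording used for validation failures.

```python
DEFAULT_QUARANTINE_THRESHOLD = 0.05  # 5% default from the rule above

# Hypothetical per-dataset overrides
THRESHOLDS = {"transactions": 0.01}


def exceeds_quarantine_threshold(dataset: str, quarantined: int, total: int) -> bool:
    """Return True if the dataset's quarantine rate is above its threshold."""
    threshold = THRESHOLDS.get(dataset, DEFAULT_QUARANTINE_THRESHOLD)
    if total == 0:
        return False  # no input rows, so there is no rate to evaluate
    return quarantined / total > threshold
```

A run that sits exactly at the threshold passes under this reading; a team that wants the stricter interpretation would switch the comparison to `>=`.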
Related Documentation
- CI/CD Workflow - Main workflow documentation
- CI/CD Infrastructure Design - Complete CI/CD workflow and infrastructure details
- Architecture Boundaries - CI/CD design constraints
- CI/CD Testing Guide - Testing strategies and local development workflows
Related Solutions
- ETL Pipeline - Ingestion logic deployed by this workflow
- Data Lake Architecture - Infrastructure provisioned by this workflow
Technical Documentation
- Testing Guide - Comprehensive testing documentation