
Full Testing Manual

This is the single reference for running all tests and locating reports in the Ohpen case study repo. It reflects the current root Makefile and task layout (including Task 1, 3, 4 Dockerized tests, scenario tests, script tests, and the aggregated report).


1. Running tests from repo root

All commands below are run from the repository root: /home/stephen/projects/ohpen-case-2026 (or your clone path).

1.1 Full suite

make test

This runs, in order:

  1. Task 1 (ETL) – Data ingestion & transformation tests in Docker (MinIO + pytest).
  2. Task 3 (SQL) – Balance query tests in Docker (DuckDB + pytest).
  3. Task 4 (CI/CD) – Workflow, Terraform, and orchestration tests in Docker.
  4. Aggregator – Writes the combined summary to reports/ (see §3).

If any task fails, the build fails after the aggregator has run, so you still get an updated combined report.
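The fail-late behaviour can be sketched in a few lines of Python (a minimal, hypothetical stand-in for the Makefile wiring — the real targets invoke Docker; `true`/`false` below just simulate passing and failing tasks on a POSIX shell):

```python
import subprocess

# Stand-ins for the per-task test commands; in the real Makefile these
# are `make test-task1`, `make test-task3`, `make test-task4`.
TASK_CMDS = ["true", "true", "false"]  # third command simulates a failing task

codes = [subprocess.run(cmd, shell=True).returncode for cmd in TASK_CMDS]

# The aggregator runs regardless of earlier failures, so the combined
# report under reports/ is refreshed even on a red run.
aggregator_rc = subprocess.run("true", shell=True).returncode  # stand-in for aggregate_all.py

# Only after aggregation does the build reflect the worst task result.
exit_code = 1 if any(c != 0 for c in codes) else aggregator_rc
```

The key point is the ordering: the aggregator is not a prerequisite of success, it is run unconditionally before the final exit status is decided.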

To capture all output (including warnings), run make test-capture instead of make test. It runs the same suite and writes full stdout+stderr to reports/full_test_output.log; inspect warnings with grep -in warn reports/full_test_output.log. The exit code is the same as for make test.

1.2 Individual task tests

Command              What runs
make test-task1      Task 1 (ETL) tests only, in Docker.
make test-task3      Task 3 (SQL) tests only, in Docker.
make test-task4      Task 4 (CI/CD) tests only, in Docker.
make test-task4-aws  Task 4 with AWS creds (e.g. deployment smoke test).

1.3 Scenario and contract tests (Task 1)

make test-scenarios    # Bronze→Silver→Promotion scenario tests (MinIO)
make test-contract     # StoragePort + IngestStrategy contract tests
make test-invariants   # Row-conservation and no-duplicate-ID invariants (MinIO)
make test-golden       # Golden-file comparison for a1/a2 (MinIO)

All run in Docker; no local venv or Java/PySpark install needed. Task 1 is fully dockerized.

1.4 Unit tests only (no MinIO)

make test-unit

Runs Task 1 unit tests (excluding real_s3 and integration), then Task 3 and Task 4. Faster when you don’t need integration/MinIO.

1.5 Script / deliverable tests (local)

make test-scripts

Runs pytest for Task 5 script tests (e.g. tasks/communication_documentation/tests/). Does not use Docker; requires local pytest.

1.6 Help

make help

Lists all test-related targets and short descriptions.


2. Prerequisites

  • Docker (and Docker Compose) for make test, make test-task1, make test-task3, make test-task4, make test-scenarios, and make test-unit.
  • Python 3 and pytest for make test-scripts (and for running the aggregator script after make test).
  • MinIO is started automatically by Task 1 and scenario targets when needed (via docker-compose).

3. Where the reports are

3.1 Combined report (after make test)

Generated by scripts/test_report/aggregate_all.py after all three task test runs:

Artifact            Path (from repo root)
Summary (Markdown)  reports/ALL_TESTS_SUMMARY.md
Summary (JSON)      reports/all_tests_summary.json

These aggregate Task 1, Task 3, and Task 4 results (passed/failed/skipped/total/duration per task and overall). They are produced even if one or more tasks fail.
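The combination step amounts to summing the per-task counters; a minimal Python sketch of that logic (the field names and sample numbers here are illustrative, not the exact schema of all_tests_summary.json):

```python
import json

# Illustrative per-task summaries, shaped loosely like pytest report totals.
task_reports = {
    "task1": {"passed": 40, "failed": 5, "skipped": 2, "duration": 120.5},
    "task3": {"passed": 12, "failed": 0, "skipped": 0, "duration": 3.2},
    "task4": {"passed": 30, "failed": 0, "skipped": 1, "duration": 18.9},
}

# Sum the counters across tasks to get the overall line of the report.
overall = {"passed": 0, "failed": 0, "skipped": 0, "duration": 0.0}
for report in task_reports.values():
    for key in ("passed", "failed", "skipped"):
        overall[key] += report[key]
    overall["duration"] += report["duration"]
overall["total"] = overall["passed"] + overall["failed"] + overall["skipped"]

summary = {"tasks": task_reports, "overall": overall}
print(json.dumps(summary["overall"], indent=2))
```

Because the aggregation only reads whatever per-task JSON files exist, a failing task still contributes its counts — which is why the combined report is produced even on a red run.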

3.2 Per-task reports (HTML + JSON)

Each task that runs pytest writes its own reports when so configured:

Task            HTML report                                                    JSON report
Task 1 (ETL)    tasks/data_ingestion_transformation/reports/test_report.html   .../reports/test_report.json
Task 3 (SQL)    tasks/sql/reports/test_report.html                             tasks/sql/reports/test_report.json
Task 4 (CI/CD)  tasks/devops_cicd/reports/test_report.html                     tasks/devops_cicd/reports/test_report.json

The HTML files are the full per-task reports (open in a browser). The JSON files are consumed by the aggregator and by CI/tooling.

3.3 Task 1 coverage (optional)

Task 1 pytest is configured with coverage; reports are under:

  • tasks/data_ingestion_transformation/reports/coverage/ (HTML)
  • tasks/data_ingestion_transformation/reports/coverage.json

4. What each task tests

  • Task 1: ETL (Pandas + PySpark), validation, loop prevention, S3/MinIO integration, scenario tests (ingestion, promotion, query simulation), idempotency, failure modes, performance, etc.
  • Task 3: SQL balance query (DuckDB), expected output, edge cases.
  • Task 4: GitHub Actions workflow syntax/structure, Terraform resources, orchestration (Step Functions, Lambda, etc.), consistency between workflow and code.
  • Task 5 (test-scripts): Compile/deliverable scripts (e.g. PDF compilation), no Docker.

5. Cleanup

make test-clean   # Remove Docker containers/images used by task tests
make clean-all    # test-clean + remove __pycache__, .pytest_cache, etc.

6. CI

The GitHub Actions workflow (.github/workflows/ci.yml) runs validation (e.g. Task 1 tests, linting). It does not necessarily run the full make test (all three tasks + aggregator). For a full run and combined report, run make test locally.


7. Known failing tests (Task 1 – Spark)

When you run the full Task 1 suite (make test-task1 or make test), five tests can fail. They are all Spark (PySpark) tests:

#  Test
1  test_contract_ingest_strategy.py::TestIngestStrategyContractSpark::test_spark_strategy_same_metrics_shape
2  test_ingestion_scenarios.py::test_scenario_registry_parametrized[a1_spark]
3  test_ingestion_scenarios.py::test_scenario_registry_parametrized[a2_spark]
4  test_ingestion_scenarios.py::test_scenario_registry_parametrized[b1_spark]
5  test_ingestion_scenarios.py::test_scenario_registry_parametrized[quarantine_spark]

(Other *_spark parametrizations, e.g. dup_id_spark, fail for the same reason; the run stops after 5 failures because of --maxfail=5.)

Root cause: Spark tries to read from s3:// URIs. In the test Docker environment, Hadoop/Spark has no FileSystem registered for the s3 scheme, so you get:

org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"

Spark expects s3a:// and Hadoop S3A configuration (fs.s3a.endpoint, fs.s3a.access.key, etc.). The ETL code in src/etl/s3_operations_spark.py uses s3:// and the session from get_spark_session() does not set S3A/MinIO config, so Spark tests fail against MinIO. Pandas-based tests use boto3 and work.

To get a green run without fixing Spark S3, run only the non-Spark tests, e.g. pytest -k "not spark" (or otherwise exclude the Spark contract test). To fix the five failures: configure the Spark session for S3A and MinIO (use s3a:// URIs and set the fs.s3a.* options in the session or Hadoop config when S3_ENDPOINT_URL is set), and ensure the test image ships the required hadoop-aws JARs (see Dockerfile.test and tests/test_spark_integration.py for existing s3a usage).
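A minimal sketch of the configuration side of that fix (the fs.s3a.* keys are standard Hadoop S3A settings; the endpoint and credential defaults below are assumptions, not values taken from this repo):

```python
import os

def s3a_conf(endpoint: str, access_key: str, secret_key: str) -> dict:
    """Spark conf entries that enable s3a:// against a MinIO endpoint."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        "spark.hadoop.fs.s3a.path.style.access": "true",  # MinIO needs path-style addressing
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    }

# Only apply the override when a test endpoint is configured.
conf = s3a_conf(
    os.environ.get("S3_ENDPOINT_URL", "http://minio:9000"),
    os.environ.get("AWS_ACCESS_KEY_ID", "minioadmin"),
    os.environ.get("AWS_SECRET_ACCESS_KEY", "minioadmin"),
)
# Each entry would be passed via SparkSession.builder.config(key, value),
# and the ETL code would read/write s3a:// URIs instead of s3://.
```

This also needs the hadoop-aws (and matching aws-java-sdk) JARs on the Spark classpath inside the test image; without them, s3a:// fails the same way s3:// does.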


8. See also

  • Testing documentation index: TESTING_DOCUMENTATION_INDEX.md – Index of all testing-related docs.
  • Technical testing guide: docs/technical/TESTING.md – Deeper testing strategy and rationale.
  • Task 1 tests: tasks/data_ingestion_transformation/tests/README_TESTING.md (source only).
  • Task 3 tests: tasks/sql/README_TESTING.md (source only).
  • Scenario test report: docs/internal/audits/INGESTION_REALITY_TEST_REPORT.md (source only).
  • AWS testing (MinIO vs real S3): docs/technical/AWS_TESTING_MINIO_VS_REAL.md.
© 2026 Stephen Adei · CC BY 4.0