# Efficient Test Runner Guide

Context: The commands below are for Task 1 (ETL) when run from `tasks/data_ingestion_transformation/`. From the repo root, use `make test-task1` or `make test`, and see TESTING_MANUAL.md for all root test commands and report locations.
## Quick Start - Run All Tests with Metrics

The easiest and most efficient way to run all tests with metrics:

```bash
make test-with-metrics
```

This single command will:

- ✅ Run all tests in Docker (Pandas + PySpark)
- ✅ Automatically collect metrics (memory, CPU, time, Spark)
- ✅ Generate comprehensive metrics reports
- ✅ Display a summary of results
## Alternative Methods

### Method 1: Using Make (Recommended)

```bash
# Run all tests with metrics (one command)
make test-with-metrics

# Or run tests and metrics separately
make test          # Run tests only
make test-metrics  # Generate metrics from existing report

# Run specific test suites with metrics
make test-pandas && make test-metrics
make test-spark && make test-metrics
```
### Method 2: Using the Test Script

```bash
# Run all tests with metrics
./scripts/run_tests_with_metrics.sh

# Run a specific test file
./scripts/run_tests_with_metrics.sh tests/test_etl.py

# Run with additional pytest options
./scripts/run_tests_with_metrics.sh tests/test_etl.py -k "test_validation"
```
### Method 3: Using Docker Compose Directly

```bash
# Build and run tests
docker-compose -f docker-compose.test.yml build
docker-compose -f docker-compose.test.yml run --rm etl-tests pytest tests/ -v

# Then generate metrics
docker-compose -f docker-compose.test.yml run --rm etl-tests python scripts/generate_test_metrics.py
```
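Conceptually, the metrics generation step reads the pytest JSON report and rolls the per-test entries up into summary numbers. The sketch below is illustrative only - the actual logic in `scripts/generate_test_metrics.py` is not shown here, and the inline `report` dict is a hand-made miniature of the `tests` list that pytest JSON reports contain:

```python
import json

# Hand-made miniature of reports/test_report.json (illustrative, not the real file).
report = {
    "tests": [
        {"nodeid": "tests/test_etl.py::test_load", "outcome": "passed", "duration": 1.25},
        {"nodeid": "tests/test_etl.py::test_validate", "outcome": "passed", "duration": 0.40},
        {"nodeid": "tests/test_etl.py::test_transform", "outcome": "failed", "duration": 2.10},
    ]
}

def summarize(report: dict) -> dict:
    """Aggregate per-test entries into the kind of summary a metrics report needs."""
    tests = report["tests"]
    return {
        "total": len(tests),
        "passed": sum(t["outcome"] == "passed" for t in tests),
        "failed": sum(t["outcome"] == "failed" for t in tests),
        "total_duration_s": round(sum(t["duration"] for t in tests), 2),
        "slowest": max(tests, key=lambda t: t["duration"])["nodeid"],
    }

summary = summarize(report)
print(json.dumps(summary, indent=2))
```

In the real pipeline, `report` would be loaded from `reports/test_report.json` and the summary written to `reports/test_metrics.json`.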
## Available Test Commands

### Standard Test Commands

```bash
make test              # All tests (Pandas + PySpark)
make test-pandas       # Pandas tests only
make test-spark        # PySpark tests only
make test-unit         # Unit tests only
make test-integration  # Integration tests only
make test-load         # Load tests (fast, excludes slow)
make test-load-full    # All load tests (includes slow)
make test-edge-cases   # Edge case tests
```

### Metrics Commands

```bash
make test-with-metrics  # Run tests + generate metrics (RECOMMENDED)
make test-metrics       # Generate metrics from existing report
```

### Utility Commands

```bash
make build            # Build Docker test image
make clean            # Clean Docker resources
make archive-reports  # Archive test reports
make help             # Show all available commands
```
## What Metrics Are Collected?

Every test automatically collects:

### System Metrics

- Time: Duration, CPU time, start/end timestamps
- Memory: RSS, peak memory, memory delta, VMS, shared memory
- CPU: CPU time, CPU percentage, user/system time
- System: Load average, thread count, open file descriptors

### Spark Metrics (for PySpark tests)

- Executor memory usage
- Job metrics (tasks, completed, failed)
- Stage metrics
- Active jobs count

### S3 Metrics (when using the `instrumented_s3_client` fixture)

- Read/write/delete/list operations
- Bytes transferred
- Operation latency (avg, min, max, p50, p95, p99)
- Retry counts and errors
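To make the system metrics above concrete, here is a minimal sketch of per-test measurement as a context manager. It is a stand-in, not the suite's actual fixture code: the real collection uses psutil (see Troubleshooting below), while this sketch sticks to the standard library's `resource` and `time` modules so the idea is visible without extra dependencies:

```python
import resource
import time
from contextlib import contextmanager

@contextmanager
def measure(metrics: dict):
    """Record wall time, CPU time, and peak RSS around a test body.

    Illustrative stand-in for a psutil-based fixture; Unix-only,
    since it relies on resource.getrusage.
    """
    start_wall = time.perf_counter()
    start_cpu = time.process_time()
    try:
        yield
    finally:
        usage = resource.getrusage(resource.RUSAGE_SELF)
        metrics["duration_s"] = time.perf_counter() - start_wall
        metrics["cpu_s"] = time.process_time() - start_cpu
        metrics["peak_rss_kb"] = usage.ru_maxrss  # KiB on Linux, bytes on macOS

m = {}
with measure(m):
    _ = sum(i * i for i in range(200_000))  # stand-in for a test body
print(sorted(m))
```

A real fixture would attach the resulting dict to the pytest report so it lands in `reports/test_report.json` alongside the test outcome.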
## Viewing Metrics Reports

After running tests, metrics are available in:

- `reports/test_metrics.json` - Machine-readable metrics
- `reports/TEST_METRICS.md` - Human-readable summary with resource usage
- `reports/test_report.json` - Full pytest JSON report with per-test metrics
- `reports/test_report.html` - Visual HTML report

### Quick View

```bash
# View markdown summary
cat reports/TEST_METRICS.md

# View JSON metrics (requires jq)
jq . reports/test_metrics.json

# Open HTML report (Linux)
xdg-open reports/test_report.html
```
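If jq is not installed, Python's standard library can pretty-print the same file. The snippet below writes a tiny stand-in metrics file first so it runs anywhere; in the repo you would point `path` at `reports/test_metrics.json` instead (the sample keys are made up for illustration):

```python
import json
import pathlib
import tempfile

# Stand-in metrics file so the example is self-contained;
# replace `path` with reports/test_metrics.json in the repo.
path = pathlib.Path(tempfile.mkdtemp()) / "test_metrics.json"
path.write_text(json.dumps({"total": 3, "passed": 3, "duration_s": 4.2}))

pretty = json.dumps(json.loads(path.read_text()), indent=2, sort_keys=True)
print(pretty)
```

The one-liner equivalent from a shell is `python -m json.tool reports/test_metrics.json`.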
## Performance Tips

### 1. Skip Slow Tests During Development

```bash
# Run tests excluding slow markers
docker-compose -f docker-compose.test.yml run --rm etl-tests pytest tests/ -v -m "not slow"
```

### 2. Run Tests in Parallel (if supported)

```bash
# Install pytest-xdist first, then:
docker-compose -f docker-compose.test.yml run --rm etl-tests pytest tests/ -n auto
```

### 3. Run Specific Test Files

```bash
# Run only one suite
make test-pandas  # Only Pandas tests
make test-spark   # Only PySpark tests

# Or use the script with a single test file
./scripts/run_tests_with_metrics.sh tests/test_etl.py
```

### 4. Quick Test Without Rebuilding

```bash
# If the Docker image already exists
make test-quick
```
## Troubleshooting

### Metrics Not Appearing

1. Check that psutil is installed:

   ```bash
   docker-compose -f docker-compose.test.yml run --rm etl-tests pip list | grep psutil
   ```

2. Verify that `reports/test_report.json` exists:

   ```bash
   ls -la reports/test_report.json
   ```

3. Regenerate metrics manually:

   ```bash
   make test-metrics
   ```

### Docker Issues

```bash
# Rebuild Docker image
make build

# Clean and rebuild
make clean
make build
```

### Missing Dependencies

```bash
# Rebuild Docker image to install new dependencies
make build
```
## Example Workflow

```bash
# 1. Run all tests with metrics
make test-with-metrics

# 2. Check results
cat reports/TEST_METRICS.md

# 3. If tests pass, archive reports
make archive-reports

# 4. View detailed HTML report
xdg-open reports/test_report.html
```
## Integration with CI/CD

For CI/CD pipelines, use:

```bash
# Run tests and exit with an error code on failure
make test-with-metrics || exit 1
```

The `test-with-metrics` command will:

- Exit with code 0 if all tests pass
- Exit with code 1 if any tests fail
- Generate metrics regardless of test outcome
## Next Steps

- See the Test Metrics Guide for detailed metrics documentation (excluded from documentation)
- See the Testing Quick Start for more testing options (excluded from documentation)
- See the Test Documentation for test structure details (excluded from documentation)
## Related Documentation

### Technical Documentation

- Testing Quick Start - Quick start guide for running tests
- Testing Guide - Comprehensive testing documentation
- Test Metrics Guide - Understanding test metrics
- Docker Testing - Dockerized test environment
- Unified Testing Convention - Testing standards
- Test Results Overview - Current test execution results

### Task-Specific Documentation

- Task 1 Test Directory (source only) - Detailed test documentation
- Task 1 Test Reports (source only) - Test reporting overview