# Tests

This directory contains the testing utilities for the project. The comprehensive runner exercises all configured notebooks across frameworks, options and iterations, with special handling for the `two_agent` framework.

## Quick start

Run from the project root:

```bash
# Dry run (prints plan, no execution)
./tests/run_tests.sh dry-run

# Automated run (no prompts, quieter output)
./tests/run_tests.sh auto --silent

# Or call the runner directly
python tests/enhanced_test_runner.py -y --silent
```

Resume modes (direct runner):

```bash
python tests/enhanced_test_runner.py -y --silent --resume-mode resume  # resume where left off
python tests/enhanced_test_runner.py -y --silent --resume-mode retry   # retry failed then continue
python tests/enhanced_test_runner.py -y --silent --resume-mode skip    # skip failed and continue
python tests/enhanced_test_runner.py -y --silent --resume-mode fresh   # start fresh
```
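
When the runner is launched from another Python script, the command line above can be assembled programmatically. The sketch below is only an illustration: the helper name `runner_command` and its parameters are hypothetical, and it assumes `-y` is the flag that suppresses prompts. The flags and the four mode names are the ones listed above.

```python
import sys

# Mode names as listed in this README.
RESUME_MODES = ("resume", "retry", "skip", "fresh")


def runner_command(resume_mode=None, silent=True, assume_yes=True):
    """Build the argument list for tests/enhanced_test_runner.py.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    cmd = [sys.executable, "tests/enhanced_test_runner.py"]
    if assume_yes:
        cmd.append("-y")  # assumed to mean "no prompts"
    if silent:
        cmd.append("--silent")
    if resume_mode is not None:
        if resume_mode not in RESUME_MODES:
            raise ValueError(f"unknown resume mode: {resume_mode!r}")
        cmd += ["--resume-mode", resume_mode]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=False)` from the project root.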

## Scripts

- `enhanced_test_runner.py`
  - Comprehensive suite with phased execution (regular first, advanced last).
  - Frameworks: default (no flag), `autogen`, `crewai`, `two_agent`.
  - Options: `whole_code`, `manual_patch`, `agent_applies`.
  - Iterations: 1–3.
  - Special handling for `two_agent`:
    - Runs once per notebook with whole code only.
    - Invoked as `python start.py <notebook> gemini-1.5-flash -fw two_agent -n 1` (no `-opt`).

- `run_tests.sh`
  - Convenience wrapper around `enhanced_test_runner.py` with common commands: `start`, `auto`, `resume`, `retry`, `skip`, `fresh`, `dry-run`.

- `test_runner.py`
  - Simpler, legacy runner kept for reference and basic runs.
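
The test matrix described for `enhanced_test_runner.py` can be sketched as follows. This is not the runner's actual code. The framework, option and iteration values and the `two_agent` command come from the list above. The assumptions are that other runs pass the option via `-opt`, keep the same argument order, and are expanded as a full cross product. The function names are hypothetical, and the regular/advanced phasing is not modelled.

```python
from itertools import product

# Values from this README; None stands for the default framework (no -fw flag).
FRAMEWORKS = [None, "autogen", "crewai", "two_agent"]
OPTIONS = ["whole_code", "manual_patch", "agent_applies"]
ITERATIONS = [1, 2, 3]
MODEL = "gemini-1.5-flash"


def build_command(notebook, framework, option, iterations):
    """Build one start.py invocation as an argument list (hypothetical helper)."""
    cmd = ["python", "start.py", notebook, MODEL]
    if framework is not None:
        cmd += ["-fw", framework]
    if option is not None:
        cmd += ["-opt", option]  # assumption: -opt carries the option name
    cmd += ["-n", str(iterations)]
    return cmd


def build_plan(notebooks):
    """Enumerate every planned run for the given notebooks."""
    plan = []
    for notebook in notebooks:
        for framework in FRAMEWORKS:
            if framework == "two_agent":
                # Special case: one run per notebook, no -opt, -n 1.
                plan.append(build_command(notebook, framework, None, 1))
                continue
            for option, iterations in product(OPTIONS, ITERATIONS):
                plan.append(build_command(notebook, framework, option, iterations))
    return plan
```

Under these assumptions each notebook yields 3 frameworks × 3 options × 3 iteration counts, plus the single `two_agent` run.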

## Outputs

- CSV log: `tests/comprehensive_test_log_*.csv`
- Organised results: `runs/organized_results/`
- Raw run directory: `runs/<timestamp>/`
- Checkpoint for resume: `tests/testing_checkpoint.json`
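
To inspect these outputs from Python, a small helper along the following lines may be handy. It is a sketch that assumes the paths listed above, relative to the project root. The function names are hypothetical, the newest log is chosen by file modification time, and nothing is assumed about the checkpoint's contents beyond it being valid JSON.

```python
import json
from pathlib import Path


def latest_csv_log(tests_dir="tests"):
    """Return the most recently modified CSV log, or None if there is none."""
    logs = sorted(
        Path(tests_dir).glob("comprehensive_test_log_*.csv"),
        key=lambda p: p.stat().st_mtime,
    )
    return logs[-1] if logs else None


def load_checkpoint(path="tests/testing_checkpoint.json"):
    """Return the parsed checkpoint, or None if no checkpoint file exists."""
    checkpoint = Path(path)
    if not checkpoint.is_file():
        return None
    with checkpoint.open(encoding="utf-8") as fh:
        return json.load(fh)
```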

## Requirements and notes

- Run all commands from the project root.
- Ensure the configured notebooks exist under `notebooks/`.
- The runner uses `gemini-1.5-flash` by default.
- Success detection relies on the marker `All plots saved in directory:` and on CSV validation.
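
The success check in the last note could look roughly like the sketch below. Only the marker string comes from this README. Three things are assumptions: that the marker appears in the captured output of a run, that both conditions must hold, and the CSV rule used here (the file exists and has a header plus at least one data row). The runner's real validation may differ, and the function names are hypothetical.

```python
import csv
from pathlib import Path

SUCCESS_MARKER = "All plots saved in directory:"


def output_has_marker(output):
    """True if the captured run output contains the success marker."""
    return SUCCESS_MARKER in output


def csv_is_valid(path):
    """Assumed rule: the file exists and has a header plus one or more data rows."""
    csv_path = Path(path)
    if not csv_path.is_file():
        return False
    with csv_path.open(newline="", encoding="utf-8") as fh:
        rows = list(csv.reader(fh))
    return len(rows) >= 2


def run_succeeded(output, csv_path):
    """Combine both checks; both are required in this sketch."""
    return output_has_marker(output) and csv_is_valid(csv_path)
```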


