FIESTA Scattering Bio — scattering transforms complement a plankton CNN
Description
Headline change: the repository is now built around a stacked CNN + scattering meta-classifier instead of the prior 10-class classification demo.
Results on the held-out test split (`test.txt` from Decrop et al. 2025)

| Method | Top-1 | Top-5 | Rare-class mean recall |
|---|---:|---:|---:|
| CNN alone (Decrop 2025) | 86.34 % | 98.70 % | 47.7 % |
| Scattering + LR alone | 26.93 % | 60.43 % | 43.0 % |
| 50/50 probability ensemble | 86.28 % | 94.82 % | 50.3 % |
| Stacked LR (val-trained) | 85.62 % | 95.39 % | 56.1 % |
| Oracle ceiling | 87.68 % | n/a | 64.6 % |

- 33 of 95 classes improve by > 1 pp under stacking; only 16 degrade.
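- +8.4 pp rare-class recall (47.7 % to 56.1 %), which closes about half of the gap between the CNN baseline and the oracle ceiling (64.6 %).
- −0.72 pp overall top-1 (86.34 % to 85.62 %): the cost of that gain, bounded at under 1 pp.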
Full per-class numbers in `results/stacking_val_trained_results.json`.
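
The metrics in the table can be computed from per-image predictions roughly as follows. This is a minimal sketch on toy labels, not the repository's evaluation code. The function names, the `rare_classes` list, and the oracle definition (an image counts as correct if either predictor is correct) are assumptions, since these notes do not spell them out. The last lines redo the "about half the oracle ceiling" arithmetic from the table values.

```python
import numpy as np

def top1_accuracy(y_true, y_pred):
    """Fraction of images whose predicted class equals the true class."""
    return float(np.mean(y_true == y_pred))

def mean_recall(y_true, y_pred, classes):
    """Unweighted mean of per-class recall over the given classes."""
    recalls = []
    for c in classes:
        mask = y_true == c
        if mask.any():  # skip classes absent from the evaluated split
            recalls.append(np.mean(y_pred[mask] == c))
    return float(np.mean(recalls))

def oracle_predictions(y_true, pred_a, pred_b):
    """Assumed oracle: use pred_b where only pred_b is right, else pred_a."""
    only_b = (pred_b == y_true) & (pred_a != y_true)
    return np.where(only_b, pred_b, pred_a)

# Toy data: class 0 is common, classes 1 and 2 play the role of rare taxa.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
cnn = np.array([0, 0, 0, 0, 1, 0, 0, 2])
scat = np.array([0, 1, 0, 2, 1, 1, 2, 0])
rare_classes = [1, 2]  # hypothetical; the real rare-class set is not given here

oracle = oracle_predictions(y_true, cnn, scat)
unique_scat = float(np.mean((scat == y_true) & (cnn != y_true)))

print("CNN top-1:", top1_accuracy(y_true, cnn))
print("CNN rare-class mean recall:", mean_recall(y_true, cnn, rare_classes))
print("Oracle top-1:", top1_accuracy(y_true, oracle))
print("Scattering uniquely correct:", unique_scat)

# Share of the rare-class recall gap closed by stacking (values from the table).
cnn_rare, stacked_rare, oracle_rare = 47.7, 56.1, 64.6
gap_closed = (stacked_rare - cnn_rare) / (oracle_rare - cnn_rare)
print(f"Gap to oracle closed: {gap_closed:.3f}")
```
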
What changed
- Three-step pipeline (`01_scattering_features.py`, `02_cnn_predictions.py`, `03_stacking.py`) replaces the old single-notebook 10-class experiment.
- The new companion repo `fiesta-decrop-reproduction` produces the CNN predictions on Decrop's val and test splits, which are reused here as the CNN baseline.
- Dockerfile rewritten: apt installs p7zip-full/git/libgl1/libglib2.0-0, pip install uses `--extra-index-url` for the torch CPU wheel (fixes the v0.2.0 build failure caused by `--index-url` overriding PyPI for non-torch packages).
- `environment.yml` expanded; new optional `environment-cnn.yml` for in-repo CNN inference.
- CITATION.cff and codemeta.json now cite Decrop et al. 2025 as the baseline paper (DOI 10.3389/fmars.2025.1699781).
Why this framing
CNNs on class-imbalanced microscopy datasets systematically underperform on rare taxa. Scattering coefficients are deterministic wavelet statistics with no learned parameters and no frequency bias. The two predictors err on different samples: scattering alone is uniquely correct on 1.3 % of test images, and those images are heavily enriched in rare classes. A small linear meta-classifier trained on Decrop's held-out val split recovers about half of this complementarity, measured as the share of the rare-class recall gap to the oracle ceiling that stacking closes.
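
The stacking idea can be sketched as follows. This is an illustration on synthetic probabilities, not the code in `03_stacking.py`: the helper `synthetic_probs`, the signal strengths, the choice of raw probabilities as meta-features, and the use of scikit-learn's `LogisticRegression` are all assumptions. The point it shows is that the meta-classifier is fitted on the val split only and evaluated on the test split, next to the 50/50 probability ensemble.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def synthetic_probs(y, n_classes, signal, rng):
    """Toy stand-in for a base model: noisy softmax scores favouring the true class."""
    logits = rng.normal(size=(len(y), n_classes))
    logits[np.arange(len(y)), y] += signal
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_classes = 3
y_val = rng.integers(0, n_classes, size=300)
y_test = rng.integers(0, n_classes, size=300)

# Hypothetical base-model outputs: a stronger "CNN" and a weaker "scattering + LR".
cnn_val = synthetic_probs(y_val, n_classes, 2.0, rng)
cnn_test = synthetic_probs(y_test, n_classes, 2.0, rng)
scat_val = synthetic_probs(y_val, n_classes, 1.0, rng)
scat_test = synthetic_probs(y_test, n_classes, 1.0, rng)

# Stacking: concatenate both probability vectors per image and fit the
# meta-classifier on the val split only, so the test split stays held out.
X_val = np.hstack([cnn_val, scat_val])
X_test = np.hstack([cnn_test, scat_test])
meta = LogisticRegression(max_iter=1000).fit(X_val, y_val)
stacked_pred = meta.predict(X_test)

# 50/50 probability ensemble for comparison: average, then take the argmax.
ensemble_pred = ((cnn_test + scat_test) / 2).argmax(axis=1)

for name, pred in [
    ("CNN alone", cnn_test.argmax(axis=1)),
    ("50/50 ensemble", ensemble_pred),
    ("Stacked LR", stacked_pred),
]:
    print(f"{name}: top-1 = {np.mean(pred == y_test):.3f}")
```

Unlike the fixed 50/50 average, the fitted meta-classifier can weight each base model's probabilities separately for every class.
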
This is the one FIESTA-OSCARS repository demonstrating scattering as a complementary layer on top of an existing deep model, alongside three gap-filling repos (astro, SST, SST-WGS84) and the CNN reproduction repo.
Breaking changes
- `01_plankton_classification.py`, `02_cnn_baseline.py`, `03_reproduce_decrop.py` removed. Git history preserves them.
- `experiments/` folder removed (scripts consolidated to top level).
- Old `results/plankton_classification_results.json` removed. The canonical artefact is now `results/stacking_val_trained_results.json`.
🤖 Generated with Claude Code
Files

| Name | Size | Checksum |
|---|---|---|
| annefou/fiesta-scattering-bio-v0.3.0.zip | 31.4 kB | md5:177e8e397b1f25777150970970089312 |

Additional details
Related works
- Is supplement to: Software, https://github.com/annefou/fiesta-scattering-bio/tree/v0.3.0 (URL)
Software
- Repository URL: https://github.com/annefou/fiesta-scattering-bio