Published April 22, 2026 | Version v0.3.0
Software Open

FIESTA Scattering Bio — scattering transforms complement a plankton CNN

Authors/Creators

  • 1. LifeWatch ERIC

Description

Headline change: the repository is now built around a stacked CNN + scattering meta-classifier instead of the prior 10-class classification demo.

Results (held-out test split, `test.txt` from Decrop et al. 2025)

| Method | Top-1 | Top-5 | Rare-class mean recall |
|---|---:|---:|---:|
| CNN alone (Decrop 2025) | 86.34 % | 98.70 % | 47.7 % |
| Scattering + LR alone | 26.93 % | 60.43 % | 43.0 % |
| 50/50 probability ensemble | 86.28 % | 94.82 % | 50.3 % |
| Stacked LR (val-trained) | 85.62 % | 95.39 % | 56.1 % |
| Oracle ceiling | 87.68 % | n/a | 64.6 % |

  • 33 of 95 classes improve by > 1 pp under stacking; only 16 degrade.
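  • +8.4 pp rare-class mean recall over the CNN alone (47.7 % to 56.1 %), about half of the 16.9 pp gap to the oracle ceiling (64.6 %).
  • −0.72 pp overall top-1 (86.34 % to 85.62 %), the price paid for the rare-class gain.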

Full per-class numbers in `results/stacking_val_trained_results.json`.
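The three columns of the table can be computed from a matrix of predicted class probabilities. The sketch below is illustrative only: the function names, the toy data, and in particular the rarity threshold (`max_count`, applied to training-set counts) are assumptions, since these notes do not state how "rare" is defined in the repository.

```python
import numpy as np

def top_k_accuracy(probs, labels, k=1):
    """Fraction of samples whose true label is among the k most probable classes."""
    top_k = np.argsort(probs, axis=1)[:, -k:]
    return float(np.mean(np.any(top_k == labels[:, None], axis=1)))

def rare_class_mean_recall(preds, labels, train_counts, max_count):
    """Unweighted mean of per-class recall over classes with <= max_count training images.

    The threshold is a hypothetical definition of "rare", chosen for illustration.
    """
    rare_classes = np.flatnonzero(train_counts <= max_count)
    recalls = []
    for c in rare_classes:
        mask = labels == c
        if mask.any():  # skip rare classes absent from the evaluated split
            recalls.append(np.mean(preds[mask] == c))
    return float(np.mean(recalls))

# Toy data: 4 samples, 3 classes, class 2 is the only rare one.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.1, 0.4],
    [0.2, 0.3, 0.5],
])
labels = np.array([0, 1, 2, 2])
train_counts = np.array([1000, 500, 20])

preds = probs.argmax(axis=1)
top1 = top_k_accuracy(probs, labels, k=1)
top2 = top_k_accuracy(probs, labels, k=2)
rare_recall = rare_class_mean_recall(preds, labels, train_counts, max_count=50)
print(top1, top2, rare_recall)
```

Because the rare-class figure is an unweighted mean over classes, a handful of small classes can move it by several points while overall top-1 barely changes, which is the pattern the table shows.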

What changed

  • Three-step pipeline (`01_scattering_features.py`, `02_cnn_predictions.py`, `03_stacking.py`) replaces the old single-notebook 10-class experiment.
  • New companion repo fiesta-decrop-reproduction produces the CNN predictions on Decrop's val + test splits — reused here as the CNN baseline.
  • Dockerfile rewritten: apt installs p7zip-full/git/libgl1/libglib2.0-0, pip install uses `--extra-index-url` for the torch CPU wheel (fixes the v0.2.0 build failure caused by `--index-url` overriding PyPI for non-torch packages).
  • `environment.yml` expanded; new optional `environment-cnn.yml` for in-repo CNN inference.
  • CITATION.cff and codemeta.json now cite Decrop et al. 2025 as the baseline paper (DOI 10.3389/fmars.2025.1699781).
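To make the stacking step concrete, here is a minimal sketch of a 50/50 probability ensemble next to a linear meta-classifier fitted on validation-split outputs and applied to the test split. It is not the code in `03_stacking.py`: the data are synthetic, the meta-classifier inputs (the two concatenated probability vectors) are an assumption, and the softmax regression is written out in NumPy as a stand-in for a library logistic regression. All function and variable names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax_regression(X, y, n_classes, lr=0.5, n_steps=500, l2=1e-3):
    """Gradient descent on L2-regularised multinomial cross-entropy."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(n_steps):
        P = softmax(X @ W + b)
        W -= lr * (X.T @ (P - Y) / n + l2 * W)
        b -= lr * (P - Y).mean(axis=0)
    return W, b

def synthetic_probs(labels, hit_rate, n_classes, rng):
    """Fake classifier outputs: the true class gets a logit boost on a fraction of samples."""
    n = len(labels)
    logits = rng.normal(size=(n, n_classes))
    idx = np.flatnonzero(rng.random(n) < hit_rate)
    logits[idx, labels[idx]] += 4.0
    return softmax(logits)

rng = np.random.default_rng(0)
n_classes = 5
y_val = rng.integers(0, n_classes, size=400)
y_test = rng.integers(0, n_classes, size=400)

# A strong and a weak predictor, standing in for the CNN and scattering + LR.
cnn_val = synthetic_probs(y_val, 0.85, n_classes, rng)
cnn_test = synthetic_probs(y_test, 0.85, n_classes, rng)
scat_val = synthetic_probs(y_val, 0.40, n_classes, rng)
scat_test = synthetic_probs(y_test, 0.40, n_classes, rng)

# 50/50 probability ensemble: average the two outputs, no training involved.
ensemble_pred = (0.5 * cnn_test + 0.5 * scat_test).argmax(axis=1)

# Stacking: fit the meta-classifier on the validation split only, then score the test split.
X_val = np.hstack([cnn_val, scat_val])
X_test = np.hstack([cnn_test, scat_test])
W, b = fit_softmax_regression(X_val, y_val, n_classes)
stacked_pred = (X_test @ W + b).argmax(axis=1)

ensemble_acc = float(np.mean(ensemble_pred == y_test))
stacked_acc = float(np.mean(stacked_pred == y_test))
print(f"ensemble top-1: {ensemble_acc:.3f}  stacked top-1: {stacked_acc:.3f}")
```

Fitting the meta-classifier on a split that neither base model was trained on keeps the stacked weights from inheriting overfitted confidence, which is presumably why the validation split is used here.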

Why this framing

CNNs on class-imbalanced microscopy datasets systematically underperform on rare taxa. Scattering coefficients are deterministic wavelet statistics with no learned parameters, so the features themselves carry no bias towards frequent classes. The two predictors err on different samples: scattering alone is uniquely correct on 1.3 % of test images, heavily enriched in rare classes. A small linear meta-classifier trained on Decrop's held-out val split captures about half of this complementarity on rare classes, raising rare-class mean recall from 47.7 % to 56.1 % against an oracle ceiling of 64.6 %.
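The complementarity argument can be quantified directly from the two sets of predictions. The sketch below assumes the oracle ceiling means "at least one of the two predictors is correct"; that reading is consistent with the reported figures (86.34 % plus roughly 1.3 % gives 87.68 %) but is not spelled out in these notes. The function name and toy arrays are hypothetical.

```python
import numpy as np

def complementarity(cnn_pred, scat_pred, labels):
    """Summarise how often each predictor is right and how much they overlap."""
    cnn_ok = cnn_pred == labels
    scat_ok = scat_pred == labels
    return {
        "cnn_top1": float(cnn_ok.mean()),
        # Samples only the scattering classifier gets right.
        "scattering_only_correct": float((scat_ok & ~cnn_ok).mean()),
        # Assumed oracle: a perfect selector that always picks the correct predictor.
        "oracle_top1": float((cnn_ok | scat_ok).mean()),
    }

labels = np.array([0, 1, 2, 2, 3, 3, 0, 1])
cnn_pred = np.array([0, 1, 2, 0, 3, 0, 0, 1])
scat_pred = np.array([1, 1, 0, 2, 0, 0, 0, 2])

stats = complementarity(cnn_pred, scat_pred, labels)
print(stats)
```

Under this definition the oracle top-1 equals the CNN top-1 plus the scattering-only fraction, so the scattering-only fraction is the entire headroom available to any combiner of the two models.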

This is the only FIESTA-OSCARS repository that demonstrates scattering as a complementary layer on top of an existing deep model. It sits alongside three gap-filling repos (astro, SST, SST-WGS84) and the CNN reproduction repo.

Breaking changes

  • `01_plankton_classification.py`, `02_cnn_baseline.py`, `03_reproduce_decrop.py` removed. Git history preserves them.
  • `experiments/` folder removed (scripts consolidated to top level).
  • Old `results/plankton_classification_results.json` removed. The canonical artefact is now `results/stacking_val_trained_results.json`.


Notes

If you use this software, please cite this repository together with Decrop et al. 2025 (CNN baseline) and Delouis et al. 2022 (scattering method).

Files

annefou/fiesta-scattering-bio-v0.3.0.zip (31.4 kB, md5:177e8e397b1f25777150970970089312)
