Published June 3, 2026 | Version v1
Dataset Open

HFQA - Hyperspectral Imaging for Fish Quality Assessment: A Spatial-Spectral Benchmark Dataset for Non-Destructive Fish Freshness Analysis

Description

 

This Zenodo record contains a representative subset of the full dataset: Pack No. 5, captured daily across all 16 storage days. The subset is released to let users inspect data quality, file formats, and tooling before working with the complete collection.

1. Overview

HFQA (Hyperspectral Imaging for Fish Quality Assessment) is a hyperspectral imaging (HSI) dataset capturing the visual and spectral degradation of Atlantic salmon (Salmo salar) fillets during cold storage. Each fillet pack was imaged once per day for 16 consecutive days under controlled refrigeration, producing a time series that documents freshness loss across the visible and near-infrared (VNIR) range.

The dataset is intended for research in:

  • Non-destructive food freshness / shelf-life estimation
  • Spectral regression and ordinal classification of storage day
  • Hyperspectral band selection and dimensionality reduction
  • Domain adaptation and few-shot / transfer learning across packs and days
  • General hyperspectral image processing method development

Each acquisition is provided in two interchangeable formats so users can adopt whichever fits their pipeline:

| Format | Contents | Typical use |
|---|---|---|
| ENVI BIP (.bip + .hdr) | Raw band-interleaved-by-pixel cube with full ASCII header | Remote-sensing / MATLAB / ENVI / SPy workflows |
| NumPy (.npy + wavelengths.npy + metadata.json) | Ready-to-load array with sidecar wavelength and metadata files | Python / deep-learning workflows |

Both formats encode identical pixel data and wavelengths — they are provided purely for convenience.

2. What's in this subset

This record contains Pack No. 5, all 16 days, in both formats:

HFQA_pack05/
├── README.md                     <- this file
├── bip/                          <- ENVI BIP format
│   ├── day_01.bip
│   ├── day_01.hdr
│   ├── day_02.bip
│   ├── day_02.hdr
│   ├── ...
│   ├── day_16.bip
│   └── day_16.hdr
├── npy/                          <- NumPy format
│   ├── day_01.npy
│   ├── day_02.npy
│   ├── ...
│   ├── day_16.npy
│   ├── wavelengths.npy           <- shared 1-D array of band centre wavelengths (nm)
│   └── metadata.json             <- acquisition + pack metadata
└── scripts/                      <- visualisation utilities (see Section 6)
    ├── visualize_npy.py
    ├── visualize_bip.py
    └── visualize_bip.m

Note on day ordering: files are named day_01 … day_16 so that simple alphabetical sorting yields correct chronological order. day_01 is the freshest (day of purchase / packing) and day_16 is the most degraded.
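
The naming convention above can be turned into a chronological file list with a few lines of Python. This is a minimal sketch: the helper name list_days is hypothetical and not part of the bundled scripts, and it assumes the two-digit day_XX naming shown in the tree.

```python
import re
from pathlib import Path

def list_days(folder, suffix=".npy"):
    """Return [(day_index, path), ...] in chronological order."""
    pattern = re.compile(r"day_(\d{2})" + re.escape(suffix) + r"$")
    found = []
    # Alphabetical order equals chronological order thanks to zero padding.
    for path in sorted(Path(folder).iterdir()):
        match = pattern.match(path.name)
        if match:  # skips wavelengths.npy and metadata.json
            found.append((int(match.group(1)), path))
    return found
```

Calling list_days("npy") from the dataset root would then give pairs such as (1, npy/day_01.npy), with the integer usable directly as a storage-day label.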

3. Acquisition details

| Property | Value |
|---|---|
| Specimen | Atlantic salmon (Salmo salar) fillet, skin-on/off |
| Number of packs (full dataset) | 50 |
| This subset | Pack No. 5 only |
| Days imaged | 16 (one acquisition per day) |
| Storage condition | Refrigerated at 0–2 °C, ambient light |
| Camera / sensor | Pika XC2 HSI camera manufactured by Resonon Inc. (Bozeman, MT, USA) |
| Spectral range | ~400–1000 nm (VNIR) |
| Number of bands | 462 (see metadata.json / .hdr) |
| Spatial dimensions | 1746 × 1346 pixels (see metadata.json / .hdr) |
| Illumination | Halogen line light, two sources at 45° |
| Reference calibration | White reference (Spectralon) + dark balance |

Calibration: R = (raw − dark) / (white − dark).
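
For readers who want to see the calibration formula in array form, the following is a minimal sketch. The function name, argument names, and the small eps guard against division by zero are assumptions made for illustration; the record does not describe the authors' actual calibration code.

```python
import numpy as np

def calibrate(raw, white, dark, eps=1e-6):
    """Apply R = (raw - dark) / (white - dark) element-wise."""
    raw = np.asarray(raw, dtype=np.float32)
    white = np.asarray(white, dtype=np.float32)
    dark = np.asarray(dark, dtype=np.float32)
    # eps is an illustrative guard for pixels where white == dark.
    denom = np.maximum(white - dark, eps)
    return (raw - dark) / denom
```

The white and dark references broadcast against the raw cube, so a per-sample, per-band reference of shape (1, W, B) can be applied to a full (H, W, B) scan.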

4. Data format specifications

4.1 NumPy format (npy/)

  • day_XX.npy: a 3-D array of shape (lines, samples, bands) = (H, W, B), dtype float32.
  • wavelengths.npy: a 1-D float64 array of length B giving the centre wavelength (in nanometres) of each band, shared across all 16 days.
  • metadata.json: acquisition metadata. Example structure (values are illustrative; the shipped file records the actual dimensions):
{
  "dataset": "HFQA",
  "pack": 5,
  "day": 1,
  "species": "Salmo salar",
  "sensor": "VNIR",
  "lines": 600,
  "samples": 800,
  "bands": 120,
  "wavelength_unit": "nm",
  "wavelength_range": [400.0, 1000.0],
  "reflectance_scale": "[0,1]",
  "storage_temperature_c": 4.0,
  "acquisition_date": "2025-11-25"
}

Minimal load in Python:

import json
import numpy as np

cube = np.load("npy/day_01.npy")          # (H, W, B) float32
wl = np.load("npy/wavelengths.npy")       # (B,) band centres in nm
with open("npy/metadata.json") as f:      # acquisition + pack metadata
    meta = json.load(f)
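
Once the cube and wavelength vector are loaded, a common next step is to pick the band closest to a wavelength of interest or to pull out one pixel's spectrum. The sketch below uses small synthetic stand-ins for cube and wl so it runs without the dataset; nearest_band is a hypothetical helper name.

```python
import numpy as np

def nearest_band(wavelengths, target_nm):
    """Index of the band whose centre wavelength is closest to target_nm."""
    return int(np.argmin(np.abs(np.asarray(wavelengths) - target_nm)))

# Synthetic stand-ins for np.load("npy/wavelengths.npy") and a day cube.
wl = np.linspace(400.0, 1000.0, 121)                      # 5 nm spacing
cube = np.random.default_rng(0).random((4, 5, wl.size), dtype=np.float32)

idx = nearest_band(wl, 650.0)      # band closest to 650 nm
band_image = cube[:, :, idx]       # (H, W) image at that band
spectrum = cube[2, 3, :]           # (B,) spectrum of the pixel at line 2, sample 3
```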

4.2 ENVI BIP format (bip/)

  • day_XX.bip: raw binary cube in Band-Interleaved-by-Pixel layout. On disk, the full spectrum of each pixel is stored contiguously, scanning samples then lines.
  • day_XX.hdr: ENVI ASCII header. Key fields (values shown are illustrative):
ENVI
samples = 800
lines   = 600
bands   = 120
data type = 4            ; 4 = 32-bit float (see ENVI codes)
interleave = bip
byte order = 0           ; 0 = little-endian
wavelength units = Nanometers
wavelength = { 400.0, 405.0, ..., 1000.0 }

ENVI data type codes used here: 1=uint8, 2=int16, 4=float32, 12=uint16 (full mapping handled by the provided scripts).

The cube can be opened directly in ENVI, in MATLAB via the provided script (or hypercube/multibandread), or in Python via the spectral (SPy) package or the bundled visualize_bip.py parser.
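
Because BIP stores each pixel's spectrum contiguously, the file maps directly onto a C-ordered (lines, samples, bands) array. The following sketch reads a cube with NumPy alone, assuming the header values have already been parsed and that the file has no header offset; read_bip is a hypothetical helper, not the parser shipped in visualize_bip.py.

```python
import numpy as np

# ENVI data type codes listed above, mapped to NumPy type strings.
ENVI_DTYPES = {1: "u1", 2: "i2", 4: "f4", 12: "u2"}

def read_bip(path, lines, samples, bands, data_type=4, byte_order=0):
    """Memory-map a BIP cube as an array of shape (lines, samples, bands)."""
    endian = "<" if byte_order == 0 else ">"   # 0 = little-endian, 1 = big-endian
    dtype = np.dtype(endian + ENVI_DTYPES[data_type])
    # Assumes the pixel data starts at byte 0 of the file.
    return np.memmap(path, dtype=dtype, mode="r", shape=(lines, samples, bands))
```

Memory mapping avoids loading a multi-gigabyte cube at once; slicing such as cube[:, :, 100] reads only the bytes needed for that band.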

5. Quick start

Python

pip install numpy matplotlib            # core requirements
# pip install spectral                  # optional, for --use-spectral in the BIP script

# Visualise a NumPy cube
python scripts/visualize_npy.py --input npy/day_01.npy --outdir figures

# Visualise an ENVI BIP cube
python scripts/visualize_bip.py --input bip/day_01.bip --outdir figures

MATLAB

% From the dataset root, with scripts/ on the path:
addpath('scripts');
visualize_bip('bip/day_01.bip', 'figures');     % writes PNGs into figures/

6. Visualisation scripts

The scripts listed below are provided with the record. The four figures they generate are designed for direct inclusion in papers and data descriptors.

| Script | Language | Input format |
|---|---|---|
| visualize_npy.py | Python 3 | .npy (+ wavelengths.npy, metadata.json) |
| visualize_bip.py | Python 3 | ENVI .bip (+ .hdr) |
| visualize_bip.m | MATLAB | ENVI .bip (+ .hdr) |
| band_tour.m | MATLAB | ENVI .bip (+ .hdr) |
| explorer.m | MATLAB | ENVI .bip (+ .hdr) |
| rgb_viewer.m | MATLAB | ENVI .bip (+ .hdr) |

Each produces:

  1. Pseudo-RGB composite — a natural-colour reconstruction using bands nearest 650 / 550 / 450 nm.
  2. Single-band montage — a labelled grid of evenly-spaced bands across the spectral range.
  3. Mean spectral signature — the spatial-mean reflectance curve with a ±1σ shaded band.
  4. Overview dashboard — a single combined figure (pseudo-RGB + sample bands + mean spectrum) suitable as a dataset summary figure.
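
The pseudo-RGB composite and mean spectral signature described above can be approximated in a few lines. This is a sketch of the general idea, not the bundled scripts' implementation: the function names are hypothetical and the 2nd to 98th percentile contrast stretch is an assumption chosen for display purposes.

```python
import numpy as np

def pseudo_rgb(cube, wavelengths, targets=(650.0, 550.0, 450.0)):
    """Stack the bands nearest the R/G/B target wavelengths into (H, W, 3)."""
    wavelengths = np.asarray(wavelengths)
    idx = [int(np.argmin(np.abs(wavelengths - t))) for t in targets]
    rgb = cube[:, :, idx].astype(np.float32)
    lo, hi = np.percentile(rgb, [2, 98])       # illustrative contrast stretch
    return np.clip((rgb - lo) / max(hi - lo, 1e-6), 0.0, 1.0)

def mean_spectrum(cube):
    """Spatial mean and standard deviation per band, each of shape (B,)."""
    flat = cube.reshape(-1, cube.shape[-1])
    return flat.mean(axis=0), flat.std(axis=0)
```

The returned RGB array is scaled to [0, 1] and can be passed to matplotlib's imshow; the mean and standard deviation can be drawn with plot and fill_between to obtain the ±1σ band.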

The Python ENVI reader needs only numpy and matplotlib; pass --use-spectral to read via the spectral package instead. The MATLAB script requires no toolboxes (it uses low-level fread and a built-in header parser).

Reproducibility note: the two Python readers were verified to return byte-identical cubes and wavelengths from the BIP and NumPy versions of the same acquisition.

7. Suggested usage and benchmarks

To support comparable results across studies, we suggest:

  • Pack-level splitting. When using the full dataset, keep all days of a given pack within the same train/validation/test split to avoid leakage between near-identical adjacent-day images of the same fillet.
  • Storage-day targets. Day index (1–16) can serve as an ordinal regression or classification target.
  • Background masking. Cubes include non-fillet background; mask it (e.g. via an NDVI-like band ratio or simple thresholding) before computing fillet-level spectral statistics.
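
The pack-level splitting and background masking suggestions can be sketched as follows. Everything here is illustrative: the function names, the 20% test fraction, the 800 nm band, and the 0.2 reflectance threshold are assumptions rather than recommended settings, and a suitable masking band and threshold should be chosen by inspecting the data.

```python
import numpy as np

def split_by_pack(pack_ids, test_fraction=0.2, seed=0):
    """Boolean train/test masks that keep every day of a pack on one side."""
    pack_ids = np.asarray(pack_ids)
    packs = np.unique(pack_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(packs)
    n_test = max(1, int(round(len(packs) * test_fraction)))
    test_packs = set(packs[:n_test].tolist())
    is_test = np.array([p in test_packs for p in pack_ids.tolist()])
    return ~is_test, is_test

def threshold_mask(cube, wavelengths, band_nm=800.0, threshold=0.2):
    """True where reflectance at the band nearest band_nm exceeds threshold."""
    idx = int(np.argmin(np.abs(np.asarray(wavelengths) - band_nm)))
    return cube[:, :, idx] > threshold
```

With a mask in hand, cube[mask] yields an (N, B) array of fillet spectra, from which fillet-level means or other statistics can be computed without background contamination.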

8. License

This dataset is released under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.

9. Citation

If you use this dataset, please cite both the Zenodo record and the associated publication:

@dataset{hfqa_2026,
  author       = {Alam, Kazi Nabiul and Sheikh-Akbari, Akbar and Bagheri Zadeh, Pooneh and {Leeds Beckett University}},
  title        = {{HFQA - Hyperspectral Imaging for Fish Quality Assessment: A Spatial-Spectral Benchmark Dataset for Non-Destructive Fish Freshness Analysis}},
  year         = {2026},
  publisher    = {Zenodo},
  version      = {1.0 (Pack 05 subset)},
  doi          = {10.5281/zenodo.20508624},
  url          = {https://doi.org/10.5281/zenodo.20508624}
}

10. Contact

For questions, corrections, or access to the full dataset, please contact:

Kazi Nabiul Alam (Researcher) - School of Built Environment, Engineering and Computing, Leeds Beckett University. Email: k.alam3742@student.leedsbeckett.ac.uk · ORCID: 0000-0001-8089-9789

Dr. Akbar Sheikh-Akbari (Data Manager) - School of Built Environment, Engineering and Computing, Leeds Beckett University. Email: a.sheikh-akbari@leedsbeckett.ac.uk · ORCID: 0000-0003-0677-7083

Files

HSI cubes in .bip format.zip

Files (26.1 GB)

| Checksum | Size |
|---|---|
| md5:a0adc37538a3bbe3968b413eae69ab47 | 5.7 kB |
| md5:3cd83a0b0e8044116cf291d0dc9f2156 | 6.3 kB |
| md5:7f27e62959c421d56ce18accf6eeb8dc | 10.6 kB |
| md5:3a5665c38615393ec6365ae154848779 | 8.1 kB |
| md5:53201b52779e0b2bc8b1fb1b3b10004d | 25.7 GB |
| md5:369732b2cb57edf6b9362d716d52724d | 426.3 MB |
| md5:e8c68f65344cf62a27ea2773a14530dc | 4.1 kB |
| md5:dc5c9804befbbe17068d965b073f2434 | 10.0 kB |

Additional details

Dates

Created
2025-11-24

Software

Programming language
Python, MATLAB
Development Status
Active