HFQA - Hyperspectral Imaging for Fish Quality Assessment: A Spatial-Spectral Benchmark Dataset for Non-Destructive Fish Freshness Analysis
This Zenodo record contains a representative subset of the full dataset: Pack No. 5, captured daily across all 16 storage days. The subset is released to let users inspect data quality, file formats, and tooling before working with the complete collection.
1. Overview
HFQA (Hyperspectral Imaging for Fish Quality Assessment) is a hyperspectral imaging (HSI) dataset capturing the visual and spectral degradation of Atlantic salmon (Salmo salar) fillets during cold storage. Each fillet pack was imaged once per day for 16 consecutive days under controlled refrigeration, producing a time series that documents freshness loss across the visible and near-infrared (VNIR) range.
The dataset is intended for research in:
- Non-destructive food freshness / shelf-life estimation
- Spectral regression and ordinal classification of storage day
- Hyperspectral band selection and dimensionality reduction
- Domain adaptation and few-shot / transfer learning across packs and days
- General hyperspectral image processing method development
Each acquisition is provided in two interchangeable formats so users can adopt whichever fits their pipeline:
| Format | Contents | Typical use |
|---|---|---|
| ENVI BIP (.bip + .hdr) | Raw band-interleaved-by-pixel cube with full ASCII header | Remote-sensing / MATLAB / ENVI / SPy workflows |
| NumPy (.npy + wavelengths.npy + metadata.json) | Ready-to-load array with sidecar wavelength and metadata files | Python / deep-learning workflows |
Both formats encode identical pixel data and wavelengths — they are provided purely for convenience.
2. What's in this subset
This record contains Pack No. 5, all 16 days, in both formats:
```
HFQA_pack05/
├── README.md              <- this file
├── bip/                   <- ENVI BIP format
│   ├── day_01.bip
│   ├── day_01.hdr
│   ├── day_02.bip
│   ├── day_02.hdr
│   ├── ...
│   ├── day_16.bip
│   └── day_16.hdr
├── npy/                   <- NumPy format
│   ├── day_01.npy
│   ├── day_02.npy
│   ├── ...
│   ├── day_16.npy
│   ├── wavelengths.npy    <- shared 1-D array of band centre wavelengths (nm)
│   └── metadata.json      <- acquisition + pack metadata
└── scripts/               <- visualisation utilities (see Section 6)
    ├── visualize_npy.py
    ├── visualize_bip.py
    └── visualize_bip.m
```
Note on day ordering: files are named day_01 … day_16 so that simple alphabetical sorting yields correct chronological order. day_01 is the freshest (day of purchase / packing) and day_16 is the most degraded.
3. Acquisition details
The values below are templated placeholders — please replace the bracketed fields with your actual acquisition parameters before publishing.
| Property | Value |
|---|---|
| Specimen | Atlantic salmon (Salmo salar) fillet, skin-on/off |
| Number of packs (full dataset) | 50 |
| This subset | Pack No. 5 only |
| Days imaged | 16 (one acquisition per day) |
| Storage condition | Refrigerated at 0-2 °C, ambient light |
| Camera / sensor | Pika XC2 HSI camera manufactured by Resonon Inc. (Bozeman, MT, USA) |
| Spectral range | ~400–1000 nm (VNIR) |
| Number of bands | 462 (see metadata.json / .hdr) |
| Spatial dimensions | 1746 × 1346 pixels (see metadata.json / .hdr) |
| Illumination | Halogen line light, two sources at 45° |
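| Reference calibration | White reference (Spectralon) + dark balance |
Calibration: reflectance is computed per pixel and band as R = (raw − dark) / (white − dark), where raw is the measured signal, white is the Spectralon white-reference measurement, and dark is the dark measurement.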
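The calibration formula can be illustrated with a minimal NumPy sketch. It is not the authors' processing code: the function name `calibrate_reflectance`, the broadcastable reference shapes, and the `eps` guard against division by zero are assumptions made for illustration, and the input values are synthetic.

```python
import numpy as np

def calibrate_reflectance(raw, white, dark, eps=1e-6):
    """Apply R = (raw - dark) / (white - dark) element-wise.

    raw, white and dark must broadcast to a common shape, e.g. raw of
    shape (H, W, B) and references of shape (1, W, B).
    eps is an assumed guard against division by zero, not a documented value.
    """
    raw = np.asarray(raw, dtype=np.float32)
    white = np.asarray(white, dtype=np.float32)
    dark = np.asarray(dark, dtype=np.float32)
    denom = np.maximum(white - dark, eps)
    return (raw - dark) / denom

# Synthetic example: 2 lines, 3 samples, 4 bands.
dark = np.full((1, 3, 4), 100.0)
white = np.full((1, 3, 4), 4100.0)
raw = np.full((2, 3, 4), 2100.0)
reflectance = calibrate_reflectance(raw, white, dark)
print(reflectance[0, 0])  # every value is 0.5 for this synthetic input
```

The example metadata.json lists a reflectance scale of [0,1], which suggests the released cubes may already be calibrated; in that case this step would only be needed for raw acquisitions.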
4. Data format specifications
4.1 NumPy format (npy/)
- day_XX.npy: a 3-D array of shape (lines, samples, bands) = (H, W, B), dtype float32.
- wavelengths.npy: a 1-D float64 array of length B giving the centre wavelength (in nanometres) of each band, shared across all 16 days.
- metadata.json: acquisition metadata. Example structure:
```json
{
  "dataset": "HFQA",
  "pack": 5,
  "day": 1,
  "species": "Salmo salar",
  "sensor": "VNIR",
  "lines": 600,
  "samples": 800,
  "bands": 120,
  "wavelength_unit": "nm",
  "wavelength_range": [400.0, 1000.0],
  "reflectance_scale": "[0,1]",
  "storage_temperature_c": 4.0,
  "acquisition_date": "2025-11-25"
}
```
Minimal load in Python:
```python
import json
import numpy as np

cube = np.load("npy/day_01.npy")        # (H, W, B) float32
wl = np.load("npy/wavelengths.npy")     # (B,) band centres in nm
with open("npy/metadata.json") as fh:   # file is closed after reading
    meta = json.load(fh)
```
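Full cubes can be large: at the dimensions listed in Section 3, a float32 cube would occupy roughly 4.3 GB. The sketch below shows one way to read a single band or pixel spectrum without loading the whole array, using NumPy's `mmap_mode` option. It writes small stand-in files to a temporary directory so that it runs anywhere; the stand-in shapes and the 550 nm target are assumptions for illustration.

```python
import os
import tempfile
import numpy as np

# Stand-in files so the sketch runs anywhere; with the real data, point
# cube_path and wl_path at npy/day_01.npy and npy/wavelengths.npy instead.
tmp = tempfile.mkdtemp()
cube_path = os.path.join(tmp, "day_01.npy")
wl_path = os.path.join(tmp, "wavelengths.npy")
rng = np.random.default_rng(0)
np.save(cube_path, rng.random((6, 8, 5)).astype(np.float32))
np.save(wl_path, np.linspace(400.0, 1000.0, 5))

# mmap_mode="r" maps the file read-only instead of loading it into RAM.
cube = np.load(cube_path, mmap_mode="r")
wl = np.load(wl_path)

# Index of the band whose centre wavelength is closest to 550 nm.
band = int(np.argmin(np.abs(wl - 550.0)))
band_image = np.asarray(cube[:, :, band])  # (H, W) slice, copied into memory
spectrum = np.asarray(cube[3, 4, :])       # (B,) spectrum of one pixel
print(band, wl[band], band_image.shape, spectrum.shape)
```

Only the slices that are actually indexed are read from disk, which keeps memory use low when iterating over all 16 days.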
4.2 ENVI BIP format (bip/)
- day_XX.bip: raw binary cube in Band-Interleaved-by-Pixel layout. On disk, the full spectrum of each pixel is stored contiguously, scanning samples then lines.
- day_XX.hdr: ENVI ASCII header. Key fields:
```
ENVI
samples = 800
lines = 600
bands = 120
data type = 4 ; 4 = 32-bit float (see ENVI codes)
interleave = bip
byte order = 0
wavelength units = Nanometers
wavelength = { 400.0, 405.0, ..., 1000.0 }
```
ENVI data type codes used here: 1=uint8, 2=int16, 4=float32, 12=uint16 (full mapping handled by the provided scripts).
The cube can be opened directly in ENVI, in MATLAB via the provided script (or hypercube/multibandread), or in Python via the spectral (SPy) package or the bundled visualize_bip.py parser.
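For readers who want to see how the BIP layout maps onto an array, here is a minimal sketch of a header parser and memory-mapped reader. It is not the bundled visualize_bip.py: the names `parse_envi_header` and `read_bip` are hypothetical, the parser handles only simple `key = value` fields, and the demo cube is synthetic. Because BIP stores each pixel's spectrum contiguously, the file corresponds to a C-ordered array of shape (lines, samples, bands).

```python
import os
import re
import tempfile
import numpy as np

# ENVI data type code -> NumPy dtype, limited to the codes listed above.
ENVI_DTYPES = {1: np.uint8, 2: np.int16, 4: np.float32, 12: np.uint16}

def parse_envi_header(hdr_path):
    """Return 'key = value' fields of an ENVI header as a dict (simplified)."""
    with open(hdr_path, "r") as fh:
        text = fh.read()
    fields = {}
    # A value is either a {...} block (possibly multi-line) or the rest of the line.
    pattern = r"^\s*([^=\n]+?)\s*=\s*(\{[^}]*\}|[^\n]*)"
    for key, value in re.findall(pattern, text, re.M):
        fields[key.strip().lower()] = value.split(";")[0].strip()
    return fields

def read_bip(bip_path, hdr_path):
    """Memory-map a BIP cube as (lines, samples, bands) plus its wavelengths."""
    h = parse_envi_header(hdr_path)
    if h.get("interleave", "bip").lower() != "bip":
        raise ValueError("this sketch only handles interleave = bip")
    shape = (int(h["lines"]), int(h["samples"]), int(h["bands"]))
    dtype = np.dtype(ENVI_DTYPES[int(h["data type"])])
    # byte order 0 = little-endian, 1 = big-endian.
    dtype = dtype.newbyteorder("<" if int(h.get("byte order", 0)) == 0 else ">")
    offset = int(h.get("header offset", 0))
    cube = np.memmap(bip_path, dtype=dtype, mode="r", offset=offset, shape=shape)
    wl = np.array([float(v) for v in h["wavelength"].strip("{}").split(",")])
    return cube, wl

# Round trip on a tiny synthetic cube (2 lines, 3 samples, 4 bands).
tmp = tempfile.mkdtemp()
bip_path = os.path.join(tmp, "demo.bip")
hdr_path = os.path.join(tmp, "demo.hdr")
demo = np.arange(2 * 3 * 4, dtype="<f4").reshape(2, 3, 4)
demo.tofile(bip_path)  # C order on disk matches BIP for (lines, samples, bands)
with open(hdr_path, "w") as fh:
    fh.write("ENVI\nsamples = 3\nlines = 2\nbands = 4\ndata type = 4\n"
             "interleave = bip\nbyte order = 0\n"
             "wavelength = { 400.0, 600.0,\n 800.0, 1000.0 }\n")
cube, wl = read_bip(bip_path, hdr_path)
print(cube.shape, cube.dtype, wl)
```

Using `np.memmap` keeps the cube on disk until it is indexed. For production use, the spectral (SPy) package or the bundled scripts are likely the safer choice, since real headers can contain fields this parser does not handle.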
5. Quick start
Python
```bash
pip install numpy matplotlib   # core requirements
# pip install spectral         # optional, for --use-spectral in the BIP script

# Visualise a NumPy cube
python scripts/visualize_npy.py --input npy/day_01.npy --outdir figures

# Visualise an ENVI BIP cube
python scripts/visualize_bip.py --input bip/day_01.bip --outdir figures
```
MATLAB
```matlab
% From the dataset root, with scripts/ on the path:
addpath('scripts');
visualize_bip('bip/day_01.bip', 'figures');  % writes PNGs into figures/
```
6. Visualisation scripts
The scripts listed below are bundled in scripts/. The four figures they generate are designed for direct inclusion in papers and data descriptors.
| Script | Language | Input format |
|---|---|---|
| visualize_npy.py | Python 3 | .npy (+ wavelengths.npy, metadata.json) |
| visualize_bip.py | Python 3 | ENVI .bip (+ .hdr) |
| visualize_bip.m | MATLAB | ENVI .bip (+ .hdr) |
| band_tour.m | MATLAB | ENVI .bip (+ .hdr) |
| explorer.m | MATLAB | ENVI .bip (+ .hdr) |
| rgb_viewer.m | MATLAB | ENVI .bip (+ .hdr) |
Each produces:
- Pseudo-RGB composite — a natural-colour reconstruction using bands nearest 650 / 550 / 450 nm.
- Single-band montage — a labelled grid of evenly-spaced bands across the spectral range.
- Mean spectral signature — the spatial-mean reflectance curve with a ±1σ shaded band.
- Overview dashboard — a single combined figure (pseudo-RGB + sample bands + mean spectrum) suitable as a dataset summary figure.
The Python ENVI reader needs only numpy and matplotlib; pass --use-spectral to read via the spectral package instead. The MATLAB script requires no toolboxes (it uses low-level fread and a built-in header parser).
Reproducibility note: the two Python readers were verified to return byte-identical cubes and wavelengths from the BIP and NumPy versions of the same acquisition.
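The pseudo-RGB composite and mean spectral signature described in this section can be approximated in a few lines of NumPy. The sketch below is not the bundled scripts' implementation: the function names, the 2/98 percentile stretch, and the synthetic 10 nm wavelength grid are assumptions for illustration.

```python
import numpy as np

def pseudo_rgb(cube, wavelengths, targets=(650.0, 550.0, 450.0)):
    """Build an (H, W, 3) composite from the bands nearest the target wavelengths."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    idx = [int(np.argmin(np.abs(wavelengths - t))) for t in targets]
    rgb = np.stack(
        [np.asarray(cube[:, :, i], dtype=np.float32) for i in idx], axis=-1
    )
    # Percentile stretch for display; the 2/98 limits are an assumption.
    lo, hi = np.percentile(rgb, [2, 98])
    rgb = np.clip((rgb - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return rgb, idx

def mean_spectrum(cube):
    """Spatial mean and standard deviation per band, each of shape (B,)."""
    flat = np.asarray(cube, dtype=np.float64).reshape(-1, cube.shape[-1])
    return flat.mean(axis=0), flat.std(axis=0)

# Synthetic cube on a 10 nm grid from 400 to 1000 nm.
rng = np.random.default_rng(0)
wl = np.linspace(400.0, 1000.0, 61)
cube = rng.random((4, 5, 61)).astype(np.float32)
rgb, idx = pseudo_rgb(cube, wl)
mu, sigma = mean_spectrum(cube)
print(idx, rgb.shape, mu.shape)
```

The returned `rgb` array can be passed to `matplotlib.pyplot.imshow`, and `mu ± sigma` gives the shaded band of the mean-spectrum figure.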
7. Suggested usage and benchmarks
To support comparable results across studies, we suggest:
- Pack-level splitting. When using the full dataset, keep all days of a given pack within the same train/validation/test split to avoid leakage between near-identical adjacent-day images of the same fillet.
- Storage-day targets. Day index (1–16) can serve as an ordinal regression or classification target.
- Background masking. Cubes include non-fillet background; mask it (e.g. via an NDVI-like band ratio or simple thresholding) before computing fillet-level spectral statistics.
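The pack-level splitting and background masking suggestions above can be sketched as follows. Everything here is illustrative: the function names, the 800 nm and 550 nm band choice, the 0.1 threshold, and the 70/15/15 split are assumptions rather than values recommended by the dataset authors, and the cube is synthetic.

```python
import numpy as np

def normalized_difference_mask(cube, wavelengths, wl_a=800.0, wl_b=550.0,
                               threshold=0.1):
    """Boolean (H, W) mask where (a - b) / (a + b) exceeds a threshold.

    wl_a, wl_b and threshold are illustrative assumptions, not dataset values.
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    ia = int(np.argmin(np.abs(wavelengths - wl_a)))
    ib = int(np.argmin(np.abs(wavelengths - wl_b)))
    a = np.asarray(cube[:, :, ia], dtype=np.float64)
    b = np.asarray(cube[:, :, ib], dtype=np.float64)
    nd = (a - b) / np.maximum(a + b, 1e-6)
    return nd > threshold

def split_packs(pack_ids, fractions=(0.7, 0.15), seed=0):
    """Assign whole packs to train/val/test so no pack spans two splits."""
    rng = np.random.default_rng(seed)
    packs = rng.permutation(sorted(set(pack_ids))).tolist()
    n_train = int(fractions[0] * len(packs))
    n_val = int(fractions[1] * len(packs))
    return (set(packs[:n_train]),
            set(packs[n_train:n_train + n_val]),
            set(packs[n_train + n_val:]))

# Synthetic cube: flat dark background with a 2 x 2 "fillet" patch.
wl = np.array([450.0, 550.0, 650.0, 800.0])
cube = np.full((4, 4, 4), 0.05, dtype=np.float32)
cube[1:3, 1:3] = [0.15, 0.2, 0.4, 0.6]
mask = normalized_difference_mask(cube, wl)
fillet_spectrum = cube[mask].mean(axis=0)  # mean over masked pixels only

# Split the 50 packs of the full dataset by pack ID, never by image.
train, val, test = split_packs(range(1, 51))
print(int(mask.sum()), fillet_spectrum, len(train), len(val), len(test))
```

On real cubes the band pair and threshold would need tuning by inspecting the normalized-difference image; a plain intensity threshold may work equally well if the background is uniformly dark.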
8. License
This dataset is released under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.
9. Citation
If you use this dataset, please cite both the Zenodo record and the associated publication:
```bibtex
@dataset{hfqa_2026,
  author    = {Alam, Kazi Nabiul and Sheikh-Akbari, Akbar and Bagheri Zadeh, Pooneh and {Leeds Beckett University}},
  title     = {{HFQA - Hyperspectral Imaging for Fish Quality Assessment: A Spatial-Spectral Benchmark Dataset for Non-Destructive Freshness Analysis}},
  year      = {2026},
  publisher = {Zenodo},
  version   = {1.0 (Pack 05 subset)},
  doi       = {10.5281/zenodo.20508624},
  url       = {https://doi.org/10.5281/zenodo.20508624}
}
```
10. Contact
For questions, corrections, or access to the full dataset, please contact:
Kazi Nabiul Alam (Researcher) - School of Built Environment, Engineering and Computing, Leeds Beckett University. Email: k.alam3742@student.leedsbeckett.ac.uk · ORCID: 0000-0001-8089-9789
Dr. Akbar Sheikh-Akbari (Data Manager) - School of Built Environment, Engineering and Computing, Leeds Beckett University. Email: a.sheikh-akbari@leedsbeckett.ac.uk · ORCID: 0000-0003-0677-7083
Files (26.1 GB)
HSI cubes in .bip format.zip
| MD5 checksum | Size |
|---|---|
| a0adc37538a3bbe3968b413eae69ab47 | 5.7 kB |
| 3cd83a0b0e8044116cf291d0dc9f2156 | 6.3 kB |
| 7f27e62959c421d56ce18accf6eeb8dc | 10.6 kB |
| 3a5665c38615393ec6365ae154848779 | 8.1 kB |
| 53201b52779e0b2bc8b1fb1b3b10004d | 25.7 GB |
| 369732b2cb57edf6b9362d716d52724d | 426.3 MB |
| e8c68f65344cf62a27ea2773a14530dc | 4.1 kB |
| dc5c9804befbbe17068d965b073f2434 | 10.0 kB |
Additional details
Dates
- Created: 2025-11-24