Published March 4, 2026 | Version 1.0.0
Technical note | Open

Predicting Five Key Biomass Components from Pasture Images: A Dual-Stream Ensemble Approach for the CSIRO Image2Biomass Challenge

Authors/Creators

Description

This technical report presents an independent entry to the CSIRO Image2Biomass Prediction Kaggle competition (October 2025 – January 2026), which tasked participants with predicting five pasture biomass components — dry green vegetation, dry dead material, dry clover biomass, green dry matter (GDM), and total dry biomass — from high-resolution field images.

We propose a dual-stream ensemble combining: (1) a Vision Transformer backbone (vit_large_patch16_dinov3_qkvb) augmented with a Feature-wise Linear Modulation (FiLM) fusion module for left/right panoramic image fusion, and (2) a frozen SigLIP semantic feature extractor coupled with a gradient boosting ensemble (CatBoost, LightGBM, HistGradientBoosting, GradientBoosting). The two streams are combined via a weighted ensemble (88.5% / 11.5%). A physics-constrained post-processing step enforces biological mass balance constraints (GDM = Dry_Green + Dry_Clover, Dry_Total = GDM + Dry_Dead) and a state-based rule for Western Australia samples.
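The blending and mass-balance steps described above can be illustrated with a minimal sketch. It is not the authors' code: the target ordering, the function names `blend` and `project_mass_balance`, and the assumption that the 88.5% weight belongs to the Vision Transformer stream are all illustrative. The projection uses the standard formula y - Aᵀ(AAᵀ)⁻¹Ay for the two linear constraints stated in the description.

```python
# Target order assumed for this sketch (not specified in the description).
TARGETS = ["Dry_Green", "Dry_Dead", "Dry_Clover", "GDM", "Dry_Total"]

def blend(stream_a, stream_b, w_a=0.885):
    """Weighted average of two per-target prediction vectors.

    w_a is assumed here to be the weight of the ViT stream.
    """
    return [w_a * a + (1.0 - w_a) * b for a, b in zip(stream_a, stream_b)]

# Rows of A encode the constraints A y = 0:
#   Dry_Green + Dry_Clover - GDM = 0
#   GDM + Dry_Dead - Dry_Total   = 0
A = [
    [1.0, 0.0, 1.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0, -1.0],
]

def project_mass_balance(y):
    """Orthogonally project y onto {y : A y = 0}."""
    # Constraint residuals r = A y.
    r = [sum(a * v for a, v in zip(row, y)) for row in A]
    # A A^T = [[3, -1], [-1, 3]], whose inverse is (1/8) [[3, 1], [1, 3]].
    lam0 = (3.0 * r[0] + r[1]) / 8.0
    lam1 = (r[0] + 3.0 * r[1]) / 8.0
    # Subtract A^T lambda from y.
    return [v - (A[0][i] * lam0 + A[1][i] * lam1) for i, v in enumerate(y)]

# Hypothetical predictions from the two streams, in grams.
raw = blend([10.0, 5.0, 2.0, 13.0, 19.0], [12.0, 4.0, 3.0, 14.0, 17.0])
balanced = project_mass_balance(raw)
```

The projection returns the closest vector (in the Euclidean sense) that satisfies both equalities exactly. It does not enforce non-negativity, and the state-based rule for Western Australia samples is not modelled here because the description does not spell it out.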

Key contributions include: a robust preprocessing pipeline with conditional orange timestamp inpainting (HSV masking, Telea algorithm, applied to 26.7% of images) and bottom-crop artifact removal; a FiLM-based dual-stream fusion module enabling cross-view interaction in O(d); text-guided semantic feature extraction via SigLIP cosine similarity probing; and physics-constrained post-processing via orthogonal projection.
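As a rough illustration of cosine similarity probing, the sketch below scores one image embedding against a set of text-prompt embeddings and returns one feature per prompt, which could then feed a gradient boosting model. The prompts, the toy 3-dimensional vectors, and the names `cosine` and `probe_features` are hypothetical; in the actual pipeline the embeddings would come from the frozen SigLIP encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for degenerate inputs
    return dot / (norm_u * norm_v)

def probe_features(image_emb, prompt_embs):
    """Return one similarity score per text prompt, usable as tabular features."""
    return {name: cosine(image_emb, emb) for name, emb in prompt_embs.items()}

# Toy embeddings standing in for SigLIP outputs (hypothetical values).
prompts = {
    "dense green clover": [1.0, 0.0, 0.0],
    "dry dead grass": [0.0, 1.0, 0.0],
}
features = probe_features([2.0, 0.0, 0.0], prompts)
```

Because the similarity is scale-invariant, the image embedding does not need to be normalised beforehand.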

The pipeline achieved a weighted R² of 0.7169 on the public leaderboard and 0.6172 (unofficial, top ~8%, silver medal zone) on the private leaderboard, ranking 167 out of 3802 teams. Out-of-fold (OOF) predictions from 5-fold cross-validation yielded a weighted global R² of 0.8785 on the 357 training images. Dry_Dead_g was the hardest target (R² = 0.698), which we attribute to its visual ambiguity with bare soil.
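One plausible reading of a "weighted global R²" pools every (image, target) pair into a single weighted sum of squares. The sketch below follows that reading; the per-target weights shown are illustrative assumptions and should be checked against the competition's evaluation page before use.

```python
def weighted_r2(y_true, y_pred, weights):
    """Globally weighted R² over all (image, target) pairs.

    y_true, y_pred: lists of rows, one row per image, one value per target.
    weights: one weight per target, applied to every image.
    """
    pairs = [
        (w, t, p)
        for row_t, row_p in zip(y_true, y_pred)
        for w, t, p in zip(weights, row_t, row_p)
    ]
    w_sum = sum(w for w, _, _ in pairs)
    # Weighted mean of all true values, pooled across targets.
    mean = sum(w * t for w, t, _ in pairs) / w_sum
    ss_res = sum(w * (t - p) ** 2 for w, t, p in pairs)
    ss_tot = sum(w * (t - mean) ** 2 for w, t, _ in pairs)
    return 1.0 - ss_res / ss_tot

# Illustrative weights for (Dry_Green, Dry_Dead, Dry_Clover, GDM, Dry_Total);
# these values are an assumption, not taken from the description.
example_weights = [0.1, 0.1, 0.1, 0.2, 0.5]
truth = [[10.0, 5.0, 2.0, 12.0, 17.0], [20.0, 8.0, 1.0, 21.0, 29.0]]
perfect_score = weighted_r2(truth, truth, example_weights)
```

Under this definition a heavily weighted target such as total biomass dominates the score, so a weak per-target R² on a lightly weighted component can coexist with a high global value.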

All code is openly available at https://github.com/gtom-pandas/image2biomass. The dataset is provided by CSIRO, MLA, and FrontierSI under CC BY-SA 4.0.

Files

image2biomass technical paper GRACI.pdf

Files (2.7 MB)

md5:24d8273bfdd3ffcd67b1570ea22bd45b

Additional details

Related works

Is derived from
Event: https://www.kaggle.com/competitions/csiro-biomass (URL)
Is supplemented by
Computational notebook: https://github.com/gtom-pandas/image2biomass (URL)

Software

Repository URL
https://github.com/gtom-pandas/image2biomass
Programming language
Python
Development Status
Inactive