Actinidia chinensis Phenology Balanced Dataset (Portugal, 2025)

Pinheiro, Isabel; Valente, António; Baptista Neves dos Santos, Filipe; Cunha, Mario

doi:10.5281/zenodo.18224815

Published January 12, 2026 | Version v1

Dataset Restricted


1. INESC TEC
2. University of Trás-os-Montes and Alto Douro
3. Faculty of Engineering - University of Porto (FEUP)
4. Universidade do Porto Faculdade de Ciências

The Actinidia chinensis Phenology Balanced Dataset is a stratified and class-balanced subset derived from the Multi-Modal Actinidia chinensis Phenology Dataset, prepared explicitly for training hierarchical detection models that target female flower phenological staging.

The source material comprises the Labelled Images component from the original dataset, which contains smartphone-acquired imagery of kiwifruit reproductive structures annotated in Pascal VOC format according to a 17-class hierarchical taxonomy organised across three levels: structure (bud, flower, fruit), gender (female, male), and BBCH-adapted phenological stage.

The balanced dataset was generated through a two-stage processing pipeline:

Split stage: The source imagery was partitioned into test (99 images) and train+validation (1 556 images) subsets using stratified sampling to ensure proportional class representation in the test set across all hierarchical classification levels.
Balancing stage: Images with a disproportionate annotation density for over-represented classes were removed from the train+validation subset to make balancing feasible (1 311 images retained). Augmentation was then applied iteratively until the target class distributions were reached (1 311 original + 649 augmented = 1 960 images). Each composite operation combined one geometric transformation (flip, scale_rotate or downscale) with one appearance transformation (bright_contrast, grid_distortion, grid_dropout, unsharp or motion_blur). All augmentations were implemented with the Albumentations library, which transforms bounding box coordinates together with the image.
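To make the bounding box coordinate transformation concrete, the following is a minimal sketch of how a Pascal VOC box (xmin, ymin, xmax, ymax, in pixels) moves under a horizontal or vertical flip. It is not the Albumentations implementation; the function names are hypothetical and a continuous coordinate convention (no one-pixel offset) is assumed.

```python
def hflip_box(box, image_width):
    """Mirror a Pascal VOC box (xmin, ymin, xmax, ymax) left to right."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - xmax, ymin, image_width - xmin, ymax)


def vflip_box(box, image_height):
    """Mirror a Pascal VOC box (xmin, ymin, xmax, ymax) top to bottom."""
    xmin, ymin, xmax, ymax = box
    # The old bottom edge becomes the new top edge, and vice versa.
    return (xmin, image_height - ymax, xmax, image_height - ymin)
```

Applying the same flip twice returns the original box, which is a convenient sanity check when verifying augmented annotations.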

Operation	Description
flip	Horizontal and vertical reflection.
scale_rotate	Affine transformation with shift (±6.25%), scale (±10%), and rotation (±15°).
bright_contrast	Brightness and contrast modulation (±40%).
downscale	Resolution reduction (50%) with interpolation.
grid_distortion	Elastic grid-based spatial distortion.
grid_dropout	Grid-based region dropout (20% ratio).
unsharp	Unsharp masking for edge enhancement.
motion_blur	Directional motion blur simulation.
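The pairing of one geometric with one appearance operation can be sketched as a Cartesian product of the two groups, which yields 3 × 5 = 15 composite operations. The operation names below are the labels used in this description, not Albumentations class names. How the authors chose among the pairs is not stated, so the random selection in `pick_composite` is an assumption.

```python
from itertools import product
import random

# Operation labels as listed in the dataset description.
GEOMETRIC = ("flip", "scale_rotate", "downscale")
APPEARANCE = ("bright_contrast", "grid_distortion", "grid_dropout",
              "unsharp", "motion_blur")

# Every composite pairs exactly one geometric with one appearance operation.
COMPOSITES = list(product(GEOMETRIC, APPEARANCE))


def pick_composite(rng):
    """Pick one (geometric, appearance) pair at random (assumed policy)."""
    return rng.choice(COMPOSITES)
```

Passing a seeded `random.Random` instance to `pick_composite` keeps the selection reproducible.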

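Phenological stage labels for bud and male flower classes were consolidated to their parent categories, and the fruit class has no counterpart in the reduced taxonomy (marked "-" in the table below). Together these changes reduce the taxonomy from 17 to 9 object classes while preserving full phenological granularity for the female flower pathway.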

Original label	New label
bud_53	bud
bud_55	bud
bud_56	bud
bud_57	bud
bud	bud
flower_female_60	flower_female_60
flower_female_61	flower_female_61
flower_female_67	flower_female_67
flower_female_68	flower_female_68
flower_female_69	flower_female_69
flower_female	flower_female
flower_male_60	flower_male
flower_male_61	flower_male
flower_male_67	flower_male
flower_male	flower_male
flower	flower
fruit	- (not retained as a class)
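The mapping above can be applied to a Pascal VOC annotation with a short script. The sketch below is not the authors' code. It assumes that each `<object>` is a direct child of the root element, and that objects labelled fruit are dropped (an interpretation of the "-" entry). The key `flowermale_61` is included defensively in case that spelling occurs in the source files.

```python
import xml.etree.ElementTree as ET

# Original label -> new label; None means the object is dropped (assumption).
LABEL_MAP = {
    "bud_53": "bud", "bud_55": "bud", "bud_56": "bud", "bud_57": "bud",
    "bud": "bud",
    "flower_female_60": "flower_female_60",
    "flower_female_61": "flower_female_61",
    "flower_female_67": "flower_female_67",
    "flower_female_68": "flower_female_68",
    "flower_female_69": "flower_female_69",
    "flower_female": "flower_female",
    "flower_male_60": "flower_male",
    "flower_male_61": "flower_male",
    "flowermale_61": "flower_male",  # alternate spelling, handled defensively
    "flower_male_67": "flower_male",
    "flower_male": "flower_male",
    "flower": "flower",
    "fruit": None,
}


def remap_voc(root):
    """Relabel <object><name> entries in place and return the kept count.

    Raises KeyError for a label that is not in LABEL_MAP.
    """
    kept = 0
    for obj in list(root.findall("object")):
        old = obj.findtext("name", default="").strip()
        new = LABEL_MAP[old]
        if new is None:
            root.remove(obj)  # no counterpart in the reduced taxonomy
        else:
            obj.find("name").text = new
            kept += 1
    return kept
```

For a file on disk, `ET.parse(path).getroot()` provides the root element, and `ET.ElementTree(root).write(path)` saves the result.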

The processed dataset is organised into two directories (test and train+validation), each containing an Images folder with JPEG files and an Annotations folder with corresponding Pascal VOC XML files.
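Given this layout, images and annotations can be paired by file stem. The sketch below assumes the folders are named exactly `Images` and `Annotations`, that each XML file shares its stem with its image, and that images use the `.jpg` extension. None of these details are confirmed beyond the description above, and `list_pairs` is a hypothetical helper.

```python
from pathlib import Path


def list_pairs(split_dir, image_ext=".jpg"):
    """Return sorted (image_path, annotation_path) pairs for one split."""
    split_dir = Path(split_dir)
    pairs = []
    for xml_path in sorted((split_dir / "Annotations").glob("*.xml")):
        img_path = split_dir / "Images" / (xml_path.stem + image_ext)
        # Keep only annotations whose image is actually present.
        if img_path.is_file():
            pairs.append((img_path, xml_path))
    return pairs
```

Calling `list_pairs("test")` and `list_pairs("train+validation")` from the dataset root would then list the two splits, provided the directory names match.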

Set	Images	Annotations	Classes
test	99	942	9
train + validation	1 960	16 694	9
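The image counts reported for the pipeline can be cross-checked with simple arithmetic. The four input values below are taken from this description; everything else is derived from them.

```python
# Counts stated in the dataset description.
source_test = 99
source_trainval = 1556
retained = 1311      # train+validation images kept before augmentation
augmented = 649      # augmented images added during balancing

source_total = source_test + source_trainval   # 1655 labelled source images
removed = source_trainval - retained           # 245 images removed
final_trainval = retained + augmented          # 1960, matching the table
test_fraction = source_test / source_total     # about 6% held out for testing
augmented_share = augmented / final_trainval   # about 33% of train+validation
```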

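The hierarchical detection pathway comprises three sequential classification tasks. The following tables present the class distribution for each level, demonstrating the balance achieved through the processing pipeline. Image counts indicate images containing at least one annotation of that class; individual images may contain multiple classes. The image counts listed for a parent class equal the sum of its sub-class image counts at the next level (for example, the 174 test images listed for flower equal 128 + 27 + 19 at the second level), so an image containing several sub-classes is counted more than once and a parent count can exceed the number of images in the set.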

The first level performs structure detection, distinguishing between bud and flower structures:

Class	test	train+validation
bud	67 images with 473 annotations	1 256 images with 7 771 annotations
flower	174 images with 460 annotations	2 892 images with 8 827 annotations
background	11 images	98 images

The second level performs gender classification on detected flowers. Images containing only bud, fruit, or no annotations serve as background for this classification task:

Class	test	train+validation
flower_female	128 images with 228 annotations	1 694 images with 3 497 annotations
flower_male	27 images with 199 annotations	704 images with 4 295 annotations
flower	19 images with 33 annotations	494 images with 1 035 annotations
background	83 images	1 373 images

The third level performs phenological stage classification on detected female flowers. Images containing only male flowers, generic flowers, buds, fruit, or no annotations serve as background for this classification task:

Class	test	train+validation
flower_female	23 images with 32 annotations	332 images with 537 annotations
flower_female_60	23 images with 38 annotations	318 images with 485 annotations
flower_female_61	29 images with 45 annotations	360 images with 683 annotations
flower_female_67	19 images with 35 annotations	297 images with 797 annotations
flower_female_68	14 images with 26 annotations	213 images with 489 annotations
flower_female_69	20 images with 52 annotations	174 images with 506 annotations
background	130 images	1 559 images
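The three levels can be seen as projections of the 9-class labels onto coarser label sets, with images that contain no relevant label treated as background. The sketch below illustrates that idea under this reading of the description. The function names are hypothetical and it is not the authors' counting script. It counts an image once per projected class, so it is not guaranteed to reproduce the published image counts.

```python
from collections import Counter


def project(label, level):
    """Map a 9-class label to its class at the given level, or None."""
    if level == 1:  # structure: bud vs flower
        if label == "bud":
            return "bud"
        return "flower" if label.startswith("flower") else None
    if level == 2:  # gender: female, male, or generic flower
        if label.startswith("flower_female"):
            return "flower_female"
        return label if label in ("flower_male", "flower") else None
    if level == 3:  # stage: female flower labels are kept as they are
        return label if label.startswith("flower_female") else None
    raise ValueError("level must be 1, 2 or 3")


def count_level(images, level):
    """Count images and annotations per class for one level.

    `images` is an iterable of per-image label lists.
    """
    image_counts, annotation_counts = Counter(), Counter()
    for labels in images:
        projected = [p for p in (project(l, level) for l in labels) if p]
        if not projected:
            image_counts["background"] += 1
            continue
        annotation_counts.update(projected)
        image_counts.update(set(projected))  # one count per class per image
    return image_counts, annotation_counts
```

The per-image label lists can be read from the `<name>` elements of each annotation file.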

The train+validation subset should be used with k-fold cross-validation for model development and hyperparameter optimisation. The test subset is reserved for final model evaluation and should not be used during training or validation. Users requiring the complete 17-class taxonomy or male flower phenological staging should refer to the original dataset.
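A minimal sketch of k-fold index generation for the train+validation subset is shown below. The choice of k = 5 and the seed are assumptions, since the description does not prescribe them. The split is a plain shuffled k-fold: it does not stratify by class, and it does not keep an augmented image in the same fold as its original.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Yield (train_indices, val_indices) for each of k shuffled folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val
```

With the 1 960 train+validation images and k = 5, each fold holds 392 validation images and 1 568 training images.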

Files

Restricted

The record is publicly accessible, but files are restricted.

