Published December 9, 2025 | Version v1
Dataset Open

Dataset for orange fruit detection from UAV in citrus orchards

Description

Presentation

Campaneta-Orange-Fruit is a dataset featuring 550 synchronized captures of RGB and four-band multispectral (G, R, RE, NIR) imagery collected over commercial orange trees using a DJI Mavic 3 Multispectral UAV flying at 14 m above ground level. Each capture includes the RGB .JPG frame plus the four band-specific .TIF images; all filenames share the pattern DJI_YYYYMMDDHHMMSS_NNNN, with NNNN running from 0001 to 0550 to ensure one-to-one alignment across modalities.
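The shared naming pattern can be split into its acquisition timestamp and capture index. The following is a minimal sketch, assuming file stems follow the pattern exactly as described above; the function name `parse_stem` and the example stem are hypothetical and are not taken from the dataset.

```python
import re
from datetime import datetime

# Pattern as described in the text: DJI_YYYYMMDDHHMMSS_NNNN
STEM_RE = re.compile(r"DJI_(\d{14})_(\d{4})")
N_CAPTURES = 550  # NNNN runs from 0001 to 0550 according to the description


def parse_stem(stem):
    """Split a file stem into (acquisition datetime, capture index)."""
    match = STEM_RE.fullmatch(stem)
    if match is None:
        raise ValueError(f"stem does not follow DJI_YYYYMMDDHHMMSS_NNNN: {stem!r}")
    timestamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    index = int(match.group(2))
    if not 1 <= index <= N_CAPTURES:
        raise ValueError(f"capture index out of range 1..{N_CAPTURES}: {index}")
    return timestamp, index


# Hypothetical stem, for illustration only.
timestamp, index = parse_stem("DJI_20250101120000_0001")
print(timestamp.isoformat(), index)
```

Because the stem is shared across modalities, the same parsed index identifies the RGB frame and its four band images.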

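In total, the dataset comprises 2,750 individual images (550 captures × 5 images per capture: one RGB frame and four single-band images) and 301,232 annotated fruit instances. This total is the sum over all five modalities (see Table 1), so a fruit visible in several modalities is counted once per modality. The dataset provides a basis for cross-spectral detection, yield estimation, and model generalization studies. The low-altitude UAV acquisition yields sub-centimeter spatial resolution and consistent canopy coverage across captures.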

Table 1 summarizes the number of annotated instances per spectral band, as well as the per-image statistics in terms of bounding-box density and variability.

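| Band | Instances | Mean boxes/image | Minimum boxes/image | Maximum boxes/image |
|------|-----------|------------------|---------------------|---------------------|
| RGB  | 80,446    | 146.27           | 1                   | 776                 |
| G    | 55,384    | 100.70           | 1                   | 714                 |
| R    | 55,332    | 100.60           | 1                   | 714                 |
| RE   | 55,158    | 100.29           | 1                   | 713                 |
| NIR  | 54,912    | 99.84            | 1                   | 714                 |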

Table 1: Distribution of annotated orange fruit instances per spectral band in the Campaneta-Orange-Fruit dataset.
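The totals quoted in the Presentation section can be reproduced from Table 1. The short check below copies the per-band instance counts from the table and assumes each band contributes exactly 550 annotated images; the variable names are illustrative.

```python
# Per-band instance counts copied from Table 1.
INSTANCES = {"RGB": 80446, "G": 55384, "R": 55332, "RE": 55158, "NIR": 54912}
CAPTURES = 550  # captures per band, as stated in the description

total_images = CAPTURES * len(INSTANCES)
total_instances = sum(INSTANCES.values())
# Mean boxes per image, rounded to two decimals as in Table 1.
mean_boxes = {band: round(n / CAPTURES, 2) for band, n in INSTANCES.items()}

print(total_images)     # 2750
print(total_instances)  # 301232
print(mean_boxes)
```

The rounded means agree with the "Mean boxes/image" column, which suggests that column is simply instances divided by 550.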

Differences in instance counts across bands arise from several factors intrinsic to multispectral acquisition: (i) small variations in field of view and optical alignment between the RGB and multispectral sensors; (ii) spectral reflectance differences that cause some fruits to appear less contrasted or partially invisible in certain narrow bands (particularly in NIR and Red-Edge); and (iii) parallax effects due to minor geometric offsets between the multispectral lenses. During the homography-based reprojection process, bounding boxes with excessive distortion, reduced visible area, or partial clipping were automatically filtered, which also contributes to the lower object counts observed in the non-RGB bands.

Study Area

The Campaneta-Orange-Fruit dataset was collected in a commercial citrus orchard located in Corbera, Valencia, Spain (latitude 39.1769 N, longitude 0.3538 W). The study area covers orange trees managed under conventional irrigation and pruning practices. The orchard is situated in the Mediterranean climatic region, characterized by mild winters, hot dry summers, and average annual precipitation around 450 mm. The selected site provides representative canopy geometry and illumination conditions for UAV-based multispectral imaging of fruit-bearing trees.

Figure 1 (/figures/Figure_1.png): Study area map showing the orchard boundary and the UAV flight coverage during the data collection campaign in Corbera (Valencia, Spain).

Repository Structure

The dataset is organized into five main directories corresponding to the spectral bands acquired during the UAV flight: RGB/, G/, R/, RE/, and NIR/. Each folder contains the image files (.JPG for RGB and .TIF for multispectral bands) along with their corresponding YOLO-format annotation files (.txt) and a classes.txt file defining the object categories. Filenames are synchronized across all bands following the pattern DJI_YYYYMMDDHHMMSS_NNNN, where NNNN ranges from 0001 to 0550, ensuring one-to-one correspondence between modalities. The root directory also includes a README.md file describing acquisition details, annotation format, and data usage guidelines. 
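The one-to-one correspondence between modalities can be checked after download. The sketch below assumes the layout described above (five band folders, `.JPG` or `.TIF` images, one `.txt` label file per image, plus `classes.txt`); the helper names `stems_by_band` and `unmatched_stems` are hypothetical and are not part of the repository.

```python
from pathlib import Path

# Image extension per band folder, as described in the text.
BANDS = {"RGB": ".JPG", "G": ".TIF", "R": ".TIF", "RE": ".TIF", "NIR": ".TIF"}


def stems_by_band(root):
    """Map each band folder to the stems that have both an image and a label file."""
    root = Path(root)
    result = {}
    for band, ext in BANDS.items():
        images = {p.stem for p in (root / band).glob(f"*{ext}")}
        # classes.txt sits next to the labels but is not an annotation file.
        labels = {p.stem for p in (root / band).glob("*.txt")} - {"classes"}
        result[band] = images & labels
    return result


def unmatched_stems(root):
    """Return the stems that are missing from at least one band folder."""
    per_band = stems_by_band(root)
    seen_anywhere = set().union(*per_band.values())
    seen_everywhere = set.intersection(*per_band.values())
    return sorted(seen_anywhere - seen_everywhere)
```

Calling `unmatched_stems("path/to/dataset")` on a complete copy should return an empty list if every capture is present, with its label file, in all five folders.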

Figure 2 (/figures/Figure_2.png): Repository structure of the Campaneta-Orange-Fruit dataset, showing the organization of image files, annotations, and metadata across the five spectral band directories (RGB, G, R, RE, and NIR).

Homography-based label transfer

The multispectral bands were annotated through a homography-based reprojection pipeline to ensure geometric consistency with the RGB reference annotations. The alignment between RGB and each spectral band (G, R, RE, NIR) was estimated using ORB keypoints and a RANSAC-based homography solver implemented in the script estimate_band_homographies.py.
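For reference, the mapping that such a homography defines can be written as follows. The notation ($H_b$ for the matrix of band $b$, $(x, y)$ for RGB pixel coordinates, $(x_b, y_b)$ for coordinates in the band image) is introduced here for illustration and is not taken from the scripts.

```latex
\begin{pmatrix} u \\ v \\ w \end{pmatrix}
= H_b \begin{pmatrix} x \\ y \\ 1 \end{pmatrix},
\qquad
H_b \in \mathbb{R}^{3 \times 3},
\qquad
(x_b,\, y_b) = \left( \frac{u}{w},\, \frac{v}{w} \right)
```

Since $H_b$ is defined only up to scale, it has eight degrees of freedom and requires at least four point correspondences; RANSAC selects the matrix supported by the largest set of consistent keypoint matches and discards the remaining matches as outliers.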

Once the per-band homography matrices were computed, the RGB YOLO annotations were projected into each multispectral image using transfer_yolo_labels.py. This step discards boxes that fall partially outside the target frame, become too small, or lose visibility due to perspective distortion, so that the retained boxes correspond reliably across all modalities.
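The projection and filtering step can be sketched in a few lines. This is not the code of transfer_yolo_labels.py: the function names, the `min_side_px` threshold, and the image sizes in the usage example are assumptions chosen for illustration, and the sketch covers only the out-of-frame and minimum-size filters.

```python
def project_point(H, x, y):
    """Apply a 3x3 homography (nested lists) to one pixel coordinate."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def transfer_yolo_box(box, H, src_size, dst_size, min_side_px=4.0):
    """Project a normalized YOLO box (xc, yc, w, h) into the target image.

    Returns the projected normalized box, or None if the box is filtered out.
    The threshold min_side_px is a hypothetical value.
    """
    xc, yc, w, h = box
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Corners of the source box in pixel coordinates.
    x0, x1 = (xc - w / 2) * src_w, (xc + w / 2) * src_w
    y0, y1 = (yc - h / 2) * src_h, (yc + h / 2) * src_h
    corners = [project_point(H, x, y)
               for x, y in ((x0, y0), (x1, y0), (x1, y1), (x0, y1))]
    xs = [c[0] for c in corners]
    ys = [c[1] for c in corners]
    # Axis-aligned box enclosing the projected quadrilateral.
    bx0, bx1, by0, by1 = min(xs), max(xs), min(ys), max(ys)
    # Drop boxes that fall partially outside the target frame.
    if bx0 < 0 or by0 < 0 or bx1 > dst_w or by1 > dst_h:
        return None
    # Drop boxes that become too small after projection.
    if (bx1 - bx0) < min_side_px or (by1 - by0) < min_side_px:
        return None
    return ((bx0 + bx1) / 2 / dst_w, (by0 + by1) / 2 / dst_h,
            (bx1 - bx0) / dst_w, (by1 - by0) / dst_h)


# Toy example: the target image is the source image downscaled by a factor of 2.
H_half = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]
print(transfer_yolo_box((0.5, 0.5, 0.2, 0.2), H_half, (200, 100), (100, 50)))
print(transfer_yolo_box((0.02, 0.5, 0.1, 0.1), H_half, (200, 100), (100, 50)))
```

In the toy example the first box survives unchanged in normalized coordinates, while the second extends past the left image border and is dropped. Filtering of this kind is one reason the non-RGB bands contain fewer boxes than RGB.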

Both scripts are included in the /scripts/ folder of the repository to allow reproducibility and extension of the label-transfer workflow.

Files

Files (20.9 GB)

20.9 GB (md5:a2f54961e3feedf30c2ec93dc4ad2f8e)
2.5 MB (md5:94834c1da9bd7ce1329111975c0b8505)
6.2 kB (md5:58fd818d998b76e880028a17568728fb)
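A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the file path are placeholders, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file with a checksum written as 'md5:<hex>' or as plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("path/to/archive", "md5:a2f54961e3feedf30c2ec93dc4ad2f8e")` should return `True` for an intact copy of the 20.9 GB file. Chunked reading matters here because the archive is far larger than typical available memory.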

Additional details

Funding

Ministerio de Ciencia, Innovación y Universidades
Agriculture 6.0 TED2021-131040B-C33
Conselleria de Cultura, Educación y Ciencia, Generalitat Valenciana
Dronia INREIA/2024/164