Published April 6, 2023 | Version v1
Dataset Open

Tsetse fly wing landmark data for morphometrics (Vol 20, 21)

  • 1. Stellenbosch University

Description

Single-wing images were captured from 14,354 pairs of field-collected tsetse wings of the species Glossina pallidipes and G. m. morsitans and analysed together with relevant biological data. To answer research questions regarding these flies, we needed to locate 11 anatomical landmark coordinates on each wing. Locating landmarks manually is time-consuming, prone to error, and infeasible given the number of images, so automatic landmark detection has been proposed instead. We developed a two-tier method using deep learning architectures to classify images and make accurate landmark predictions. The first tier used a classification convolutional neural network to remove most wings that were missing landmarks. The second tier provided landmark coordinates for the remaining wings. For the second tier, we compared direct coordinate regression using a convolutional neural network with segmentation using a fully convolutional network. For the resulting landmark predictions, we evaluated shape bias using Procrustes analysis. We employed a data-centric approach, paying particular attention to consistent labelling and to data augmentation of the training data to improve model performance. The classification model used for the first tier achieved perfect classification on the test set. For an image size of 1024×1280, data augmentation reduced the mean pixel distance error of the regression model from 8.3 (95% CI [4.4, 10.3]) to 5.34 (95% CI [3, 7]). For the segmentation model, data augmentation did not alter the mean pixel distance error of 3.43 (95% CI [1.9, 4.4]). Segmentation had a higher computational complexity and produced some large outliers. Both models showed minimal shape bias. We chose to deploy the regression model on the complete unannotated dataset because it had a lower computational cost and more stable predictions than the segmentation model. The resulting landmark dataset is provided for future morphometric analysis.
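As a rough illustration of the error metric quoted above, the sketch below computes a mean pixel distance between predicted and annotated landmarks for one wing. It assumes the metric is the average Euclidean distance over matching landmarks, which the description does not state explicitly; the function name `mean_pixel_distance` and the sample coordinates are hypothetical and are not taken from the dataset.

```python
import math


def mean_pixel_distance(predicted, annotated):
    """Average Euclidean distance, in pixels, between matching landmarks.

    Assumption: both arguments are equal-length sequences of (x, y) pairs
    listed in the same landmark order.
    """
    if len(predicted) != len(annotated):
        raise ValueError("landmark lists must have the same length")
    if not predicted:
        raise ValueError("at least one landmark is required")
    distances = [math.dist(p, a) for p, a in zip(predicted, annotated)]
    return sum(distances) / len(distances)


# Hypothetical coordinates for 3 landmarks (a real wing would have 11).
annotated = [(100.0, 200.0), (300.0, 400.0), (500.0, 600.0)]
predicted = [(103.0, 204.0), (300.0, 400.0), (500.0, 606.0)]
error = mean_pixel_distance(predicted, annotated)
print(f"mean pixel distance error: {error:.2f}")
```

Averaging this per-wing value over a test set would give a single summary figure comparable in kind to those reported in the description, under the stated assumption about how the metric is defined.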

Notes

Funding provided by: DST-NRF Centre of Excellence for Invasion Biology
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100014436
Award Number:

Funding provided by: National Research Foundation
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001321
Award Number:

Files

missing_landmarkwings_L.zip

Files (3.7 GB)

Checksum  Size
md5:d4a65e50a425c06763b9d212b4df8b95  2.6 GB
md5:7621fb9e901bde3aec7d1b2795f03619  94.5 MB
md5:a92257987a0c72e51913d38da45231dc  146.7 MB
md5:ceee60f7747c0e4970c9e8a6217c69c7  12.8 MB
md5:d4fa09f876cebdc257a5fe30f18306d0  3.2 kB
md5:f270ce89ba6adcee6780e53da9b1e868  285.9 MB
md5:1364359ec7c338c7d2b83a98ab10d6a7  531.9 MB
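
The md5 values in the listing can be used to check that a download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listing` are hypothetical, and because the listing does not show which file name belongs to which checksum, the path in the usage comment is only a placeholder.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file's md5 against a value such as 'md5:d4a6...' from the listing."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (placeholder path; pair each file with its own listed checksum):
# ok = matches_listing("path/to/downloaded_file.zip",
#                      "md5:<checksum shown next to that file>")
```

Reading in chunks matters here because the largest file in the record is several gigabytes and would not comfortably fit in memory on many machines.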

Additional details