AI validated plant observations from social media: Flickr images from central London 2011-2019
August, Tom A;
Millard, Joseph W;
This dataset is the result of using an AI image classifier to classify images of plants on social media. We believe this is the first AI-validated dataset of biological records taken from social media. This represents the dawn of AI naturalists whose domain of exploration is not the outdoor world but the digital realm. These AI naturalists will trawl streams of data from all over the globe, identifying genuine images of species, and in so doing create valuable datasets that will further our understanding of the distribution of wildlife on our planet.
This dataset contains 31,973 classifications of images taken in central London between May 2011 and September 2019, retrieved using the search term 'flower' on Flickr.com. Some images have very low classification confidence (7,910 below 0.1), while others have very high confidence (3,185 over 0.9). As expected given the spatial extent of the dataset, many of the observations are of planted species in gardens and parks.
August_et_al_2019.csv provides the data while metadata.txt contains a description of the data and its generation.
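As a rough illustration of how the confidence thresholds mentioned above (below 0.1, over 0.9) could be tallied from the CSV, here is a minimal Python sketch. The column name `score` is an assumption made for illustration; the actual column names are given in metadata.txt. The sample rows are synthetic placeholders, not records from the dataset.

```python
import csv
import io


def confidence_summary(rows, column="score", low=0.1, high=0.9):
    """Count rows whose confidence is below `low` or above `high`.

    `column` is a hypothetical column name; check metadata.txt for the
    real one before applying this to August_et_al_2019.csv.
    """
    total = below = above = 0
    for row in rows:
        score = float(row[column])
        total += 1
        if score < low:
            below += 1
        elif score > high:
            above += 1
    return {"total": total, "below_low": below, "above_high": above}


# Synthetic stand-in for the CSV, for illustration only (not real records).
sample = io.StringIO("id,score\na,0.05\nb,0.95\nc,0.50\nd,0.92\n")
summary = confidence_summary(csv.DictReader(sample))
print(summary)
```

To run this against the real file, one would open August_et_al_2019.csv, wrap it in `csv.DictReader`, and pass the confidence column name documented in metadata.txt as `column`.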
TA conceived the idea, undertook the analyses and wrote the plantnet R-package used to interface with the PlantNet API for classification (https://github.com/BiologicalRecordsCentre/plantnet). TA, NF, CM, NB, JM, RS and ES developed a prototype of this idea at the Quantitative Ecology Hackathon 2019 hosted by the British Ecological Society. NF developed the photosearcher R-package, used for retrieving image data from Flickr (https://github.com/nfox29/photosearcher). PB and AA granted free access to the PlantNet API for this project (https://my.plantnet.org/) and assisted TA in the development of the plantnet R-package.
The BES Hackathon was supported by the British Ecological Society and Methods in Ecology and Evolution. TA was supported by COST action CA17122 'Increasing understanding of alien species through citizen science' and Natural Environment Research Council award number NE/R016429/1 as part of the UK-SCAPE programme delivering National Capability.