README

* Reference *

This data set is supplementary material to the following article:

Tarling P, Cantor M, Clapés A, Escalera S. Deep learning with self-supervision and uncertainty regularization to count fish in underwater images. arXiv: http://arxiv.org/abs/2104.14964

If you use any of these videos, images, or annotated data, please cite the article above and the following two repositories:

Cantor M. 2021. Underwater surveys of mullet schools (Mugil liza) with Adaptive Resolution Imaging Sonar. [Data set]. Zenodo. DOI: 10.5281/zenodo.4717411

https://github.com/ptarling/DeepLearningFishCounting

* Description *

This data set is part of a research project that employs deep learning, with a density-based regression approach, to count fish in low-resolution sonar images (Tarling et al., preprint, arXiv: http://arxiv.org/abs/2104.14964).

In this repository, we provide data from sonar-based underwater videos of schools of migratory mullets (Mugil liza) recorded at Tesoura beach (28.495775 S, 48.759996 W), a 100-meter-long beach at the inlet canal connecting the Laguna lagoon system to the Atlantic Ocean in southern Brazil. Because water transparency in the lagoon canal is very low (0.3 to 1.5 m visibility, measured in situ with a Secchi disk), mullet schools were recorded by deploying an Adaptive Resolution Imaging Sonar, ARIS 3000 (Sound Metrics Corp, WA, USA), which uses 128 beams to project a wedge-shaped volume of acoustic energy and converts the returning echoes into a digital overhead view of the mullet schools.

This data set contains 500 fully annotated images that were manually marked for the location and abundance of mullet, and 126 raw sonar video files representing over 100,000 images. The files are organized as follows:

1) "2018-MM-DD_HHMMSS" files are MP4 videos (you may need to add the file extension ".mp4"). There are 126 ARIS files converted into MP4 videos, totalling over 789 MB of underwater footage captured at 3 frames per second. File names indicate the date and time each video was recorded.

2) ".npy" files (in Labelled_data.zip): From the video files, 500 images were selected for labelling. Images (x) were cropped to represent a 4 x 8.5 m area and resized to 320 x 576 pixels. Each mullet was marked with a point annotation. Corresponding ground-truth density maps (y) were generated by convolving a Gaussian kernel (size = 4, standard deviation = 1) over the point-annotation mask; a sketch of this procedure is given at the end of this README. The labelled data set was randomly split into a holdout partition of 350 training, 70 validation, and 80 test images.

3) ".csv" files: log of the frames selected for the labelled subset of the data.

4) ".h5" file: pre-trained weights for our multi-task network with uncertainty regularisation.

To advance the development of these machine learning tools, we also make our code openly available at https://github.com/ptarling/DeepLearningFishCounting.

* Funding *

The data sampling was supported by research grants from the National Geographic Society (Discovery Grant WW210R-17) and post-doctoral fellowships from Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES Brazil; #88881.170254/2018-01) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq Brazil; #153797/2016-9). The research on the machine learning tools has been partially supported by the Spanish project PID2019-105093GB-I00 (MINECO/FEDER, UE) and CERCA Programme/Generalitat de Catalunya, and by ICREA under the ICREA Academia programme awarded to Sergio Escalera.
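
* Example: reconstructing a density map from point annotations (Python) *

The sketch below illustrates, under stated assumptions, how a ground-truth density map of the kind described in item 2 can be built from point annotations by smoothing a point mask with a Gaussian (standard deviation = 1), and how the provided .npy arrays could be loaded. It is not the authors' exact pipeline (see the GitHub repository for that); the file names "x_train.npy" and "y_train.npy" are illustrative guesses, and the Gaussian is applied here with scipy's default truncation rather than an explicit 4-pixel kernel.

    # Minimal sketch, assuming point annotations given as (row, col) pixel
    # coordinates on the 320 x 576 labelled images. Not the authors' code.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    IMG_H, IMG_W = 320, 576  # labelled images are resized to 320 x 576 pixels

    def density_map_from_points(points, height=IMG_H, width=IMG_W, sigma=1.0):
        # Place a unit impulse at each annotated fish location, then smooth
        # with a Gaussian (std = 1). The map sums (approximately) to the
        # number of annotated fish, so a count is recovered by integration.
        mask = np.zeros((height, width), dtype=np.float32)
        for r, c in points:
            r, c = int(round(r)), int(round(c))
            if 0 <= r < height and 0 <= c < width:
                mask[r, c] += 1.0
        return gaussian_filter(mask, sigma=sigma)

    if __name__ == "__main__":
        # Three hypothetical point annotations.
        y = density_map_from_points([(50, 100), (51, 102), (200, 300)])
        print("estimated count:", y.sum())  # approximately 3.0

        # Loading the provided arrays (file names are assumptions; check the
        # contents of Labelled_data.zip for the actual names):
        # x_train = np.load("Labelled_data/x_train.npy")  # images
        # y_train = np.load("Labelled_data/y_train.npy")  # density maps

Summing a predicted density map in the same way is what allows a density-based regression network to output a fish count per image.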