This README_Antarctic_polynoids_2022.txt file was generated on 2022-06-21 by COWART


GENERAL INFORMATION

1. Title of Dataset: Cytochrome c oxidase subunit I (Cox1) and 16S ribosomal RNA (16S) sequence dataset for the biogeography of Antarctic scale worms.

2. Author Information
    A. Principal Investigator Contact Information
        Name: Stephane HOURDEZ
        Institution: UMR 8222 CNRS-Sorbonne Université, Laboratoire d'Ecogéochimie des Environnements Benthiques (LECOB)
        Address: Observatoire Océanologique de Banyuls, Avenue Pierre Fabre, 66650 Banyuls-sur-mer, FRANCE
        Email: stephane.hourdez@obs-banyuls.fr

    B. Associate or Co-investigator Contact Information
        Name: Dominique COWART
        Institution: Company for Open Ocean Observations and Logging (COOOL)
        Address: Saint Leu, La Réunion 97436, FRANCE
        Email: dcowart.cooolresearch@gmail.com

3. Date of data collection: 2010 - 2018

4. Geographic location of data collection: The Southern Ocean, including waters off Adélie Land, the Ross Sea, and the Antarctic Peninsula, as well as the Kerguelen and Tierra del Fuego (Chile, South America) archipelagos.

5. Information about funding sources that supported the collection of the data:
    Programma Nazionale di Ricerche in Antartide (PNRA), PNRA16_00120-A1 (TNB-CODE) and PNRA18_00078 (RossMODE);
    Census of Antarctic Marine Life (CAML);
    Centre National de la Recherche Scientifique (CNRS), EC2CO (ANTARES);
    University of Illinois at Urbana-Champaign (UIUC), Department of Evolution, Ecology, and Behavior.


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: none

2. Links to publications that cite or use the data: none

3. Links to other publicly accessible locations of the data: NCBI GenBank (www.ncbi.nlm.nih.gov/genbank), under accession numbers MT138932 - MT139461 (Cox1) and MT139654 - MT139872 (16S).

4. Links/relationships to ancillary data sets: Previously published Antarctic polynoid sequences were obtained from the following publications:

    Bogantes, VE, Whelan, NV, Webster, K, Mahon, AR, Halanych, KM (2020). Unrecognized diversity of a scale worm, Polyeunoa laevis (Annelida: Polynoidae), that feeds on soft coral. Zoologica Scripta, 49(2), 236-249.

    Brasier, MJ, Harle, J, Wiklund, H, Jeffreys, RM, Linse, K, Ruhl, HA, Glover, AG (2017). Distributional patterns of polychaetes across the West Antarctic based on DNA barcoding and particle tracking analyses. Frontiers in Marine Science, 4(356).

    Gallego, R, Lavery, S, Sewell, M (2014). The meroplankton community of the oceanic Ross Sea during late summer. Antarctic Science, 26(4), 345-360.

    Serpetti, N, Taylor, ML, Brennan, D, Green, DH, Rogers, AD, Paterson, GLJ, Narayanaswamy, BE (2017). Ecological adaptations and commensal evolution of the Polynoidae (Polychaeta) in the Southwest Indian Ocean Ridge: a phylogenetic approach. Deep Sea Research Part II: Topical Studies in Oceanography, 137, 237-281.

5. Were data derived from another source? no
    A. If yes, list source(s): n/a

6. Recommended citation for this dataset: Cowart DA, Schiaparelli S, Alvaro MC, Cecchetto M, Le Port AS, Jollivet D, Hourdez S. (2022) Origin, diversity, and biogeography of Antarctic scale worms (Polychaeta: Polynoidae): a wide-scale barcoding approach.


DATA & FILE OVERVIEW

1. File List:

    Cowart_etal_sequences_Cox1.fasta
        FASTA file that contains sequence data (n = 530). Each sequence has a header with the name (example: Seq1), the species (example: Gorekia crassicirris), the isolate name that includes geographic origin (example: Adelie Land_TA95), and the gene name (example: Cytochrome oxidase subunit I (COI), also known as Cox1).

    Cowart_etal_metadata_Cox1.csv
        CSV file that contains metadata for each sequence found in Cowart_etal_sequences_Cox1.fasta. The information provided includes who collected the sequence, the isolate name (example: TA95), the collection date in YYYY format, the country/origin/geography, the isolation source, and the geographic coordinates in decimal degrees.

    Cowart_etal_sequences_16S.fasta
        FASTA file that contains sequence data (n = 219). Each sequence has a header with the name (example: Seq1), the species (example: Gorekia crassicirris), the isolate name that includes geographic origin (example: Adelie Land_TA124), and the gene name (example: 16S ribosomal RNA (16S)).

    Cowart_etal_metadata_16S.csv
        CSV file that contains metadata for each sequence found in Cowart_etal_sequences_16S.fasta. The information provided includes who collected the sequence, the isolate name (example: TA95), the collection date in YYYY format, the country/origin/geography, the isolation source, and the geographic coordinates in decimal degrees.

2. Relationship between files, if important:
    Cowart_etal_sequences_Cox1.fasta and Cowart_etal_metadata_Cox1.csv are linked. The metadata for each sequence in the FASTA file can be found in the CSV file by looking up the isolate name (example: TA124).

    Cowart_etal_sequences_16S.fasta and Cowart_etal_metadata_16S.csv are linked in the same way: the metadata for each sequence in the FASTA file can be found in the CSV file by looking up the isolate name (example: TA124).

3. Additional related data collected that was not included in the current data package: none

4. Are there multiple versions of the dataset? no
    A. If yes, name of file(s) that was updated: n/a
        i. Why was the file updated? n/a
        ii. When was the file updated? n/a


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data:
    Polynoids were collected over several years aboard multiple research vessels, using an assortment of sampling equipment. Specimens processed for this study were preserved either in 96° ethanol or at -80 °C. Each collected individual was identified to the lowest possible taxonomic level based on morphological characters. For each specimen, DNA was either extracted following a modified CTAB protocol (Doyle & Doyle 1987), or the specimen was shipped to the Canadian Center for DNA Barcoding (CCDB) at the University of Guelph to be processed following the CCDB automated standard protocols. PCR assays were performed on the extracted DNA to amplify fragments of two mitochondrial genes, Cytochrome c oxidase subunit I (Cox1) and 16S ribosomal RNA (16S). The resulting PCR products were either submitted to Eurofins Scientific for purification and Sanger sequencing in both directions, using the ABI BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems), or underwent amplification and sequencing at CCDB.

    Sources:
    Doyle, J, Doyle, J (1987). Genomic plant DNA preparation from fresh tissue-CTAB method. Phytochem Bulletin, 19(11), 11-15.
    Canadian Center for DNA Barcoding (CCDB), University of Guelph, Canada: http://ccdb.ca/resources/

2. Methods for processing the data:
    The resulting sequence chromatograms were visualized, assembled, and edited using CodonCode Aligner 7.1.2 (CodonCode Corporation) and Geneious v.10.0.5 (Kearse et al. 2012). Other sequence visualization and alignment programs may be used if desired.

3. Instrument- or software-specific information needed to interpret the data:
    A sequence visualization program is required to view and align these data. Suitable programs include, but are not limited to, CodonCode Aligner (CodonCode Corporation) and Geneious (Kearse et al. 2012).

    Source:
    Kearse, M, Moir, R, Wilson, A, Stones-Havas, S, Cheung, M, Sturrock, S, Buxton, S, Cooper, A, Markowitz, S, Duran, C, Thierer, T, Ashton, B, Meintjes, P, Drummond, A (2012). Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28(12), 1647-1649.

4. Standards and calibration information, if appropriate: none

5. Environmental/experimental conditions: n/a

6. Describe any quality-assurance procedures performed on the data: n/a
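

EXAMPLE: LINKING SEQUENCES TO METADATA (illustrative sketch, not part of the original data package)

The link between each FASTA file and its metadata CSV (DATA & FILE OVERVIEW, item 2) can be followed programmatically. The Python sketch below is one possible way to do this, using only the standard library. Several details are assumptions and should be checked against the actual files: the CSV column name "isolate" is hypothetical, the exact header layout is not specified in this README (so the code searches each header for a known isolate name rather than splitting on a fixed delimiter), and the toy records at the bottom are invented for demonstration.

```python
import csv
import io
import re


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_isolate(header, isolates):
    """Return the isolate name found in `header` as a whole token, else None.

    Longer names are tried first, and alphanumeric neighbours are rejected,
    so that "TA9" does not match inside "TA95".
    """
    for name in sorted(isolates, key=len, reverse=True):
        pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
        if re.search(pattern, header):
            return name
    return None


def link_sequences(fasta_handle, csv_handle, isolate_column="isolate"):
    """Pair each FASTA record with its metadata row (None if no match).

    `isolate_column` is a hypothetical column name; adjust it to the
    actual header row of the metadata CSV.
    """
    rows = {row[isolate_column]: row for row in csv.DictReader(csv_handle)}
    linked = []
    for header, sequence in read_fasta(fasta_handle):
        name = find_isolate(header, rows)
        linked.append((header, sequence, rows.get(name)))
    return linked


# Toy data, invented for illustration only (not taken from the dataset).
toy_fasta = io.StringIO(
    ">Seq1 Gorekia crassicirris Adelie Land_TA95 Cox1\nACGT\nTTGA\n"
    ">Seq2 Unknown sp. Ross Sea_XX1 Cox1\nGGCC\n"
)
toy_csv = io.StringIO("isolate,collection_date\nTA95,2010\nTA9,2011\n")
linked = link_sequences(toy_fasta, toy_csv)
```

With the real files, the two io.StringIO objects would be replaced by open("Cowart_etal_sequences_Cox1.fasta") and open("Cowart_etal_metadata_Cox1.csv", newline=""). Records whose third element is None have no matching metadata row and are worth inspecting by hand.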