# # # # # # # # # # # # # # # # # #
Home ground advantage: local Atlantic salmon have higher reproductive fitness than dispersers in the wild.
K.B. Mobley, H. Granroth-Wilding, M. Ellmen, J-P. Vähä, T. Aykanat, S.E. Johnston, P. Orell, J. Erkinaro & C.R. Primmer.
# # # # # # # # # # # # # # # # # #

**Description of data files**

# # # # # # # # #

This file accompanies the repository submission of all data used in the above publication in Science Advances. The data consist of six files, all in .csv format. Each file is explained in turn below.

# # # # # # # # #

"Adults.csv"

This file contains all phenotypic and reproductive success data, and population assignment details, for each individual adult salmon sampled at the main study site, lower Utsjoki, over four cohort years (2011-15), and at the secondary site, Akujoki, for the 2011 cohort year.

The columns contain the following information:

ID: unique identifier for each individual. The first 3 letters indicate the sampling location, the first two-digit number the sampling year (followed by an "A" for adults) and the second three-digit number a sequential unique identifier.
Location: sampling site (lower Utsjoki or Akujoki).
RunYear: the year in which adults were sampled; the associated offspring were sampled the following spring.
Sex: male or female, assessed by visual inspection.
Weight: in kg.
Length: in cm.
Condition: the residual from a linear model of weight predicted by length for each sex and spawning cohort; individuals that were heavier than expected based on their length have positive values and those that were lighter than expected have negative values.
Respawner: has this individual spawned before, based on morphological indicators? Logical (1/0 = yes/no).
RecordedAge: the number of winters spent at sea before spawning (sea-winters), as recorded during sampling from scale morphology.
InterpAge: for individuals where age could not be determined based on scale morphology, sea age interpolated from the individual's weight according to the weight distribution of known-age individuals.
SeaAge: the number of winters spent at sea before spawning (sea-winters), as used in analyses. This combines RecordedAge where possible, with missing data replaced by InterpAge.
LocalOrDisperser: based on population assignment using ONCOR, was this individual born in the same spawning location as where it was caught as a spawning adult (local) or elsewhere (disp)?
NatalPop: the genetically assigned population from which the individual likely originated, based on ONCOR simulations. NOTE: to avoid character recognition problems when reading in the files, all diacritics (umlauts, cedillas, carons etc.) are omitted here; see Fig. 1 in the main text for correct spellings.
Assign_prob: probability of assignment to NatalPop, based on ONCOR simulations.
DistanceMoved: distance in km between the individual's natal population assigned by ONCOR simulations and the location where the adult was captured (lower Utsjoki or Akujoki). Local individuals have a value of 0. Please note that all adults in this study have migrated to sea and returned to spawn.
NrOffsp: number of offspring assigned to this individual based on parentage analysis.
NrMates: number of other individuals assigned as the other parent of this individual's offspring, i.e. the minimum number of mates that this individual had.
PropMatesDispersers: the proportion of this individual's mates that came from other spawning locations.
AnnualAdultSampleSize: annual sampling effort of adults, as the number of adults caught in the relevant sampling year.
AnnualOffspSampleSize: annual sampling effort of offspring, as the number of offspring caught in the relevant sampling year.

NA = missing data.
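
The SeaAge rule described above (RecordedAge where available, otherwise InterpAge) can be sketched in Python as follows. This is a minimal illustration only, not the authors' analysis code: the helper names `parse_na` and `sea_age` are hypothetical, the sample rows are invented for the example, and it is an assumption that InterpAge is "NA" whenever RecordedAge is present.

```python
import csv
import io


def parse_na(value):
    """Return None for the missing-data marker "NA" (or an empty cell), else a float."""
    value = value.strip()
    return None if value in ("", "NA") else float(value)


def sea_age(recorded, interpolated):
    """Prefer the scale-read age; fall back to the weight-interpolated age."""
    return recorded if recorded is not None else interpolated


# Hypothetical rows for illustration only; real values come from Adults.csv.
sample = io.StringIO(
    "ID,RecordedAge,InterpAge,SeaAge\n"
    "fish_1,1,NA,1\n"
    "fish_2,NA,3,3\n"
)

for row in csv.DictReader(sample):
    derived = sea_age(parse_na(row["RecordedAge"]), parse_na(row["InterpAge"]))
    # The derived value should match the SeaAge column supplied in the file.
    assert derived == parse_na(row["SeaAge"])
```

Running the same check against the real file (by replacing `sample` with an opened "Adults.csv") would be one way to confirm that the three age columns are read consistently.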

# # # # # # # # #

"Genotypes.csv"

This file contains all microsatellite genotype data for adults and offspring used in parentage assignments, for all years and sampling locations. Each microsatellite locus is listed in two columns, (a) for the first allele and (b) for the second allele.

The columns contain the following information:

ID: unique identifier for each individual, as above. "0y" in the ID indicates the young-of-the-year (<1 year) used in this study (older juveniles were also sampled in parallel but not included in this study).
Location: sampling site (lower Utsjoki or Akujoki).
RunYear: the cohort year in which adults and associated offspring were sampled.
SampledAs: role in pedigree at sampling, offspring or adult.
LociGenotyped: number of microsatellite markers successfully genotyped.
SSsp2215a -- EST19b: microsatellite loci genotypes.

NA = missing data.

# # # # # # # # #

"MateChoice.csv"

This file contains data for each pair of adults identified from offspring parentage assignments as having mated together. These data were used to test for assortative mating and other aspects of mate choice.

The columns contain the following information:

FocalID: unique identifier, as above, for the focal individual.
MateID: unique identifier, as above, for each individual that the focal individual is known to have mated with.
NrOffsp: the number of sampled offspring assigned to each pair.
FocalOrigin: the origin (local or disperser) of the focal individual.
MateOrigin: the origin (local or disperser) of the mate.
FocalWeight: weight in kg of the focal individual.
MateWeight: weight in kg of the mate.
FocalAge: sea age in years of the focal individual.
MateAge: sea age in years of the mate.
FocalSex: sex (male or female) of the focal individual. All identified mates were the opposite sex to the focal individual.
RunYear: the year in which adults were sampled; the associated offspring were sampled the following spring.

NA = missing data.
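
As a rough illustration of how the MateChoice.csv columns fit together, the sketch below tallies mated pairs by the origin of the focal individual and of its mate. It is not the authors' analysis: the function name `origin_pair_counts` is hypothetical, the sample rows are invented, and the value codings ("local"/"disp", "female"/"male") are assumptions that should be checked against the file. Restricting to one focal sex is a precaution against counting a pair twice, in case each pair is listed once from each partner's perspective.

```python
import csv
import io
from collections import Counter


def origin_pair_counts(rows, focal_sex):
    """Count mated pairs by (FocalOrigin, MateOrigin) for one focal sex."""
    counts = Counter()
    for row in rows:
        if row["FocalSex"] != focal_sex:
            continue
        if "NA" in (row["FocalOrigin"], row["MateOrigin"]):
            continue  # skip pairs where either origin is missing
        counts[(row["FocalOrigin"], row["MateOrigin"])] += 1
    return counts


# Hypothetical rows for illustration only; real values come from MateChoice.csv.
sample = io.StringIO(
    "FocalID,MateID,NrOffsp,FocalOrigin,MateOrigin,FocalSex\n"
    "f1,m1,2,local,local,female\n"
    "f1,m2,1,local,disp,female\n"
    "m1,f1,2,local,local,male\n"
    "f2,m2,1,disp,NA,female\n"
)
rows = list(csv.DictReader(sample))
counts = origin_pair_counts(rows, "female")
for pair, n in sorted(counts.items()):
    print(pair, n)
```

The resulting table of counts is only a starting point; a formal test of assortative mating would need to account for the availability of each origin class in each year.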

# # # # # # # # #

"ParentageAssignments.csv"

This file contains the output of the pedigree fit, i.e. parentage assignments, for all sampled offspring. For ease of interpretation, the origin (local or disperser) and the natal population of the mother and father are also provided. "NA" means that the parent assignment was made with insufficient confidence (<90%), and "us" means that the offspring is confidently not assigned to any of the sampled adults, i.e. the parent is confidently an unsampled individual.

The columns contain the following information:

OffspID: unique identifier, as above, for the offspring individual.
Dam: unique identifier (or NA or us), as above, for its mother.
Sire: unique identifier (or NA or us), as above, for its father.
DamOrigin: the origin (local or disperser) of the assigned Dam/mother (NA where the Dam/mother was unsampled or not confidently assigned).
SireOrigin: the origin (local or disperser) of the assigned Sire/father (NA where the Sire/father was unsampled or not confidently assigned).
DamNatalPop: the genetically assigned population from which the mother came (NA as for origin). NOTE: to avoid character recognition problems when reading in the files, all diacritics (umlauts, cedillas, carons etc.) are omitted here; see Fig. 1 in the main text for correct spellings.
SireNatalPop: the genetically assigned population from which the father came (NA as for origin). NOTE: as above, diacritics are omitted.

# # # # # # # # #

"Pop assignment – adult genotypes.csv"

Genotype file for adults from lower Utsjoki and Akujoki used for population assignment in ONCOR. Data are in genepop two-digit format. NOTE: to avoid character recognition problems when reading in the files, all diacritics (umlauts, cedillas, carons etc.) are omitted here; see Fig. 1 in the main text for correct spellings.

The columns contain the following information:

ID: unique identifier for each individual, as above.
Location: sampling site (lower Utsjoki or Akujoki).
RunYear: the cohort year in which adults and associated offspring were sampled.
EST107-Ssosl438: microsatellite loci in genepop two-digit format. One column per microsatellite locus (i.e. first 2 digits = first allele, last 2 digits = second allele).

NA = missing data.

# # # # # # # # #

"Pop assignment – baseline sample genotypes.csv"

Genotype file for adults sampled from baseline populations in Vaha et al. 2017, used for population assignment in ONCOR. NOTE: to avoid character recognition problems when reading in the files, all diacritics (umlauts, cedillas, carons etc.) are omitted here; see Fig. 1 in the main text and Vaha et al. 2017 for correct spellings.

The columns contain the following information:

Location: baseline sampling location names used in this study (Fig. 1 in the main text, Fig. S2).
Abbreviation: baseline sampling location name abbreviations used in this study (Fig. 1 in the main text, Fig. S2).
Location#Vaha_etal_2017: baseline sampling location name used in Vaha et al. 2017.
EST107-Ssosl438: microsatellite loci in genepop two-digit format. One column per microsatellite locus (i.e. first 2 digits = first allele, last 2 digits = second allele).

NA = missing data.

References

J.-P. Vaha, J. Erkinaro, M. Falkegård, P. Orell, E. Niemela, Genetic stock identification of Atlantic salmon and its evaluation in a large population complex. Canadian Journal of Fisheries and Aquatic Sciences 74, 327-338 (2017).

# # # # # # # # #
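
Usage note: the genepop two-digit encoding used in the two population-assignment files (first 2 digits = first allele, last 2 digits = second allele) can be decoded as sketched below. This is a minimal example under stated assumptions: the function name `split_genotype` is hypothetical, the genotype code in the example is invented, and the locus columns are assumed to have been read as text so that any leading zeros are preserved.

```python
def split_genotype(code):
    """Split a four-character two-digit genepop code into its two alleles.

    Returns None for the missing-data marker "NA". Alleles are returned
    as two-character strings so that leading zeros are kept.
    """
    code = code.strip()
    if code == "NA":
        return None
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected a four-digit genotype code, got {code!r}")
    return code[:2], code[2:]


# Hypothetical genotype code for illustration only.
first_allele, second_allele = split_genotype("0307")
print(first_allele, second_allele)
```

If the files are loaded with a tool that infers numeric types, the locus columns should be forced to text first; otherwise a code such as "0307" may be read as the number 307 and fail the length check above.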