Published October 7, 2020 | Version v1
Dataset Open

Close relatives in population samples: Evaluation of the consequences for genetic stock identification

  • 1. Swedish University of Agricultural Sciences
  • 2. Marine Scotland

Description

Determining the origin of individuals in mixed population samples is key in many ecological, conservation and management contexts. Genetic data can be analyzed using Genetic Stock Identification (GSI), where the origin of single individuals is determined using Individual Assignment (IA) and population proportions are estimated with Mixed Stock Analysis (MSA). Such analyses require allele frequencies from a reference baseline. Unknown individuals or mixture proportions are assigned to source populations based on the likelihood that their multilocus genotypes occur in a particular baseline sample. Representative sampling of the populations included in a baseline is therefore important when designing and performing GSI. Here we investigate the effects of family sampling on GSI, using both simulated and empirical genotypes for Atlantic salmon (Salmo salar). We show that non-representative sampling that leads to the inclusion of close relatives in a reference baseline may bias the estimated proportions of contributing populations in a mixed sample and increase the number of incorrectly assigned individual fish. Simulated data further show that the induced bias increases with increasing family structure, but that it can be partly mitigated by larger baseline population sample sizes. Standard accuracy tests of GSI (using only a reference baseline and/or self-assignment) gave a falsely elevated indication of the power and accuracy of the baseline to identify stock proportions and individuals. These findings suggest that family structure in baseline population samples should be quantified, and its consequences evaluated, before carrying out GSI.
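The likelihood-based assignment step described above can be illustrated with a minimal sketch. It is not the authors' method or software: it assumes Hardy-Weinberg proportions within each baseline population, independent loci, and an ad hoc pseudo-count (`eps`) for alleles not seen in a baseline. The function names, population names and toy genotypes are all hypothetical.

```python
import math
from collections import Counter

def allele_counts(baseline):
    """Count alleles per locus. `baseline` is a list of individuals,
    each a list of (allele, allele) tuples with one tuple per locus."""
    n_loci = len(baseline[0])
    return [Counter(a for ind in baseline for a in ind[locus])
            for locus in range(n_loci)]

def log_likelihood(genotype, counts, eps=0.01):
    """Log-likelihood of a diploid genotype under Hardy-Weinberg proportions.
    `eps` is an ad hoc pseudo-count (an assumption) that avoids log(0)."""
    total = 0.0
    for (a, b), c in zip(genotype, counts):
        n = sum(c.values())
        p = (c[a] + eps) / (n + eps)
        q = (c[b] + eps) / (n + eps)
        total += math.log(p * p if a == b else 2 * p * q)
    return total

def assign(genotype, baselines):
    """Return the baseline population with the highest genotype likelihood."""
    counts = {name: allele_counts(inds) for name, inds in baselines.items()}
    return max(counts, key=lambda name: log_likelihood(genotype, counts[name]))

# Toy two-locus baselines (invented values, for illustration only).
baselines = {
    "river_A": [[(100, 100), (200, 202)], [(100, 102), (200, 200)]],
    "river_B": [[(104, 104), (206, 206)], [(102, 104), (206, 208)]],
}
print(assign([(100, 100), (200, 200)], baselines))  # river_A
```

Because the allele counts come directly from the sampled baseline individuals, a baseline dominated by a few families would shift these counts away from the population's true frequencies, which is the kind of non-representative sampling the study evaluates.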

Notes

Among the 1870 individuals in the original empirical baseline population samples, 96.5 % had complete genotypes at all 17 microsatellites; one individual had missing data at three loci, five at two loci and 60 at one locus, giving 0.23 % missing genotypes overall. Repeat genotyping of a subset of individuals gave a repeatability of 100 %, and hence an estimated error rate of zero.
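The percentages in this note follow directly from the counts it reports. As a quick check, the short sketch below recomputes them; the variable names are illustrative, and all numbers are taken from the note itself.

```python
n_individuals = 1870
n_loci = 17
# Number of missing loci -> number of individuals with that many missing.
missing_by_count = {3: 1, 2: 5, 1: 60}

incomplete = sum(missing_by_count.values())                            # individuals with any gap
missing_genotypes = sum(k * v for k, v in missing_by_count.items())   # single-locus genotypes missing

complete_pct = 100 * (n_individuals - incomplete) / n_individuals
missing_pct = 100 * missing_genotypes / (n_individuals * n_loci)

print(f"{complete_pct:.1f} % complete, {missing_pct:.2f} % missing")
# 96.5 % complete, 0.23 % missing
```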

Funding provided by: Swedish Research Council Formas (Svenska Forskningsrådet Formas)
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001862
Award Number: 2013-1288

Funding provided by: Havs- och Vattenmyndigheten
Crossref Funder Registry ID: http://dx.doi.org/10.13039/100009486

Files

Baseline_EB1.txt

Files (2.1 MB)

Checksum Size
md5:09b177110c96a59220097cad2b32875c 271.0 kB
md5:e0a26a17241b254b8e57e0514ee78d9c 106.3 kB
md5:3059ce643e6c8476af5148fbe1f32b93 74.1 kB
md5:234feae0792b77d085645b22adb77191 74.1 kB
md5:81fa8438007d401a163006e1b7efb6e0 75.1 kB
md5:4b57f19dc10a1d67a8cb69889c29520b 75.1 kB
md5:a23add01ef74ce7e33830b71fd522f32 75.1 kB
md5:a8e3229f69caa4d554e8773bb2544d35 149.7 kB
md5:c7294de492295201d2156e78bc921fb2 149.7 kB
md5:1e100fb66ed9ff57991ed73d5cb5c89a 149.7 kB
md5:d44de2e81a2bc3ed4deae080922f1d04 298.8 kB
md5:1738099f2a5f40bcab97136623f17f06 298.8 kB
md5:29e44eadc91378bf82d8212fd0f41c93 298.8 kB
md5:78811175216e2334b0353c7bd7eefc42 13.2 kB
md5:1a1fc32991252467cb2d1c1f7be72664 13.2 kB
md5:fde8a48485c262e7020c93ad303d4dbc 13.0 kB
md5:c7708eaf4b821b1b71df49cbc1f0fe8a 13.3 kB
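
A downloaded file can be checked against the md5 values listed above. The sketch below uses only Python's standard `hashlib` module. The helper names are hypothetical, and because this listing does not show which file name belongs to which checksum, the path and checksum passed in must be paired by the reader.

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":")[-1].strip().lower()
    return md5_of(path) == expected

# Usage (hypothetical path; supply the checksum listed for that file):
# matches("downloads/some_file.txt", "md5:<hex digest from the listing>")
```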

Additional details

Related works

Is cited by
10.1111/1755-0998.13131 (DOI)