Published January 23, 2024 | Version v1
Software Open

Data from: What mandrills leave behind: using fecal samples to characterize the major histocompatibility complex in a threatened primate

  • 1. University of New Orleans
  • 2. University of Exeter
  • 3. University of East Anglia
  • 4. Université des sciences et techniques de Masuku
  • 5. National Agency for National Parks
  • 6. University of Stirling

Description

The major histocompatibility complex (MHC) can be useful in guiding conservation planning because of its influence on immunity, fitness, and reproductive ecology in vertebrates. The mandrill (Mandrillus sphinx) is a threatened primate endemic to central Africa. Considerable research in this species has shown that the MHC is important for disease resistance, mate choice, and reproductive success. However, all previous MHC research in mandrills has focused on an inbred semi-captive population, so their genetic diversity may have been underestimated. Here we expand our current knowledge of mandrill MHC variation by performing next-generation sequencing of non-invasively collected fecal samples from a large wild horde in central Gabon. We observe MHC lineages and alleles shared with other primates, and we uncover 45 putative new class II MHC DRB alleles, including representatives of the DRB9 pseudogene, which has not previously been identified in mandrills. We also document methodological challenges associated with fecal samples in NGS-based MHC research. Even with high read depth, the replicability of alleles from fecal samples was lower than that of tissue samples, and allele assignments are inconsistent between sample types. Further, the common assumption that variants with very high read depth should represent true alleles does not appear to be reliable for fecal samples. Nevertheless, the use of degraded DNA in the present study still enabled significant progress in quantifying immunogenetic diversity and its evolution in wild primates.

Notes

Most files can be opened using an open-source text editor such as jEdit or Notepad++. The .csv files will be easiest to view in a spreadsheet program such as Excel or Google Sheets, but they can also be opened in a text editor if needed. Users may wish to open FASTA files using open-source genetics software such as MEGA-X.
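For users who prefer to read the sequence files programmatically, the following is a minimal sketch of a FASTA reader using only the Python standard library. The function name `read_fasta` and the example records are hypothetical and are not taken from the deposited files; the sketch assumes plain, uncompressed FASTA text.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {header: sequence}.

    `lines` can be an open text file or any iterable of strings.
    Sequences that span several lines are joined into one string.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()  # header text without the ">"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


# Hypothetical example records, not real alleles from this dataset.
example = [">variant_1", "ACGT", "ACGT", ">variant_2", "TTGA"]
alleles = read_fasta(example)
```

To read a file instead of a list, pass an open file handle, for example `with open(path) as handle: alleles = read_fasta(handle)`, where `path` is the location of the downloaded file.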

Funding provided by: University of New Orleans Office of Research and Sponsored Programs*
Crossref Funder Registry ID:
Award Number: CON000000002361

Funding provided by: Freeport-McMoRan (United States)
Crossref Funder Registry ID: https://ror.org/05yfskh02
Award Number:

Methods

Further details of the data analysis can be found in the associated publication, but in brief, samples were collected from a wild population of mandrills (Mandrillus sphinx) in Lopé National Park, Gabon. The majority of the samples are non-invasively collected feces, but samples of blood and plucked hair were also collected from anesthetized animals. Each sample was PCR-amplified for a 157-base fragment of the DRB gene of the class II major histocompatibility complex, then sequenced on an Illumina MiSeq. An initial MiSeq run, containing 192 fecal samples, was performed in 2018. In 2019, a MiSeq Nano run was performed, containing the 24 samples of blood and plucked hair along with 23 replicate fecal samples from the 2018 sequencing run. In 2021, a third sequencing run was performed using standard MiSeq; this run included nine replicates of the blood and hair samples and 183 replicated fecal samples.

To quantify repeatability between runs, each Illumina run was processed separately. Pre-processing was performed using the AmpliSAT pipeline (Sebastian et al. 2016), and alleles were assigned to each sample in each run using the degree of change method (Lighten et al. 2014). For samples that were replicated between runs, a replicability score (RA) was calculated based on the proportion of variants that appeared in both runs. A custom Python script was used to compare allele assignments of samples in each pair of Illumina runs.
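The pairwise comparison described above can be illustrated with a short sketch. This is not the authors' script. It assumes that the score is the number of variants shared by both runs divided by the total number of distinct variants seen in either run; the exact denominator used in the publication is not stated here, so treat that choice as an assumption. The function name and variant labels are hypothetical.

```python
def replicability(variants_run_a, variants_run_b):
    """Proportion of variants that appear in both runs for one sample.

    Assumption: shared variants / distinct variants seen in either run.
    Returns None when neither run yielded any variant.
    """
    a, b = set(variants_run_a), set(variants_run_b)
    union = a | b
    if not union:
        return None
    return len(a & b) / len(union)


# Hypothetical variant labels for one sample sequenced in two runs.
run_2018 = {"v1", "v2", "v3"}
run_2021 = {"v2", "v3", "v4"}
score = replicability(run_2018, run_2021)  # 2 shared of 4 distinct
```

Applying the function to every sample in each pair of runs gives one score per replicated sample and run pair.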

To generate consensus allele assignments for each sample, each Illumina run was reanalyzed in the AmpliSAT pipeline, using more relaxed parameters in order to minimize the chance of allelic dropout. Then, a custom Python script was used to extract sequence variants that replicated between runs for each sample. This set of replicated variants for each sample was then screened for errors such as replicated chimeric sequences. After discarding such sequence artifacts, the set of variants that remained was considered the individual's "true" allele assignment.
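The extraction of replicated variants can be sketched as follows. This is an illustration only, not the authors' script. It assumes a variant counts as replicated when it is called for the same sample in at least two runs, and it does not perform the subsequent screening for chimeras or other artifacts. The function name, the `min_runs` parameter, and the sample and variant labels are hypothetical.

```python
from collections import Counter


def replicated_variants(calls_by_run, min_runs=2):
    """Return, per sample, the variants called in at least `min_runs` runs.

    calls_by_run maps run label -> {sample ID -> iterable of variants}.
    Assumption: two or more runs are enough to call a variant replicated.
    """
    counts = {}
    for samples in calls_by_run.values():
        for sample, variants in samples.items():
            # set() ensures a variant is counted once per run
            counts.setdefault(sample, Counter()).update(set(variants))
    return {
        sample: {v for v, n in counter.items() if n >= min_runs}
        for sample, counter in counts.items()
    }


# Hypothetical calls for two samples across three runs.
calls = {
    "2018": {"S1": {"a", "b"}},
    "2019": {"S1": {"b", "c"}},
    "2021": {"S1": {"a", "b"}, "S2": {"d"}},
}
consensus = replicated_variants(calls)
```

In this example sample S1 keeps variants a and b, while S2 keeps none because it was sequenced in only one run.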

Files

Files (15.5 kB)

Checksum Size
md5:cfad367931da9a2bf2872d5e239b18b6 5.5 kB
md5:e0cf2746009ba8df1bdef81e457a9193 4.5 kB
md5:79f637f195ea3df29aa02d31aee5e227 5.5 kB

Additional details

Related works

Is cited by
10.1007/s10592-023-01587-2 (DOI)
Is source of
10.5061/dryad.w9ghx3fst (DOI)