There is a newer version of the record available.

Published April 5, 2023 | Version 0.1.1
Dataset Open

Data from: Enamel proteins reveal biological sex and genetic variability within southern African Paranthropus

  • 1. Globe Institute, University of Copenhagen
  • 2. Center for Protein Research, University of Copenhagen
  • 3. Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark
  • 4. Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Denmark
  • 5. Human Evolution Research Institute (HERI), University of Cape Town, Cape Town, South Africa

Description

This dataset contains the sequences of Paranthropus robustus, first described in 'Enamel proteins reveal biological sex and genetic variability within southern African Paranthropus', as well as the reference data and all the results from the analysis of those sequences.

Folders and Sub-Folders:

- Paranthropus_Raw_AA_Sequences_Unaligned: Contains 2 fasta files. Paranthropus_Unaligned.fasta contains all the Paranthropus robustus sequences that were used in the analyses. Paranthropus_Unaligned_UNFILTERED.fasta contains all the Paranthropus robustus sequences before filtering for SAP (single amino acid polymorphism) quality/confidence. The unfiltered sequences were not used in any of the analyses, but are provided here for openness.

- Reference_Datasets: Contains 3 fasta files. Each fasta file is a reference dataset used in at least one analysis. The identity and origin of each sample are described in the supplementary document of the publication.

- Phylogenetic_Analysis_Datasets_and_Trees: Contains the following 4 folders:

    - Paranthropus_Alignments_All_Datasets: Contains 3 folders. Each folder contains the aligned and I/L-corrected MSAs (Multiple Sequence Alignments) of Paranthropus robustus and one reference dataset.

    - Paranthropus_Diversity_Dataset_Trees_Results: Contains all analyses performed with the 'diversity' reference dataset. There is one folder for each protein, which includes the protein alignment and the phylogenetic tree of that protein. Additionally, a folder named 'CONCATENATED' contains the concatenated alignments and trees. The BEAST2-STARBEAST3 folder contains the StarBeast3 analysis, including the XML file, the output log file, the output trees, and the input taxon set file.

    - Paranthropus_Representative_Dataset_Trees_Results: Contains all analyses performed with the 'representative' reference dataset. There is one folder for each protein, which includes the protein alignment and the phylogenetic tree of that protein. Additionally, a folder named 'CONCATENATED' contains the concatenated alignments and trees. The BEAST2 folder contains the time-calibrated BEAST2 analysis, including the XML file, the output log file, and the output trees. The Distance_Matrix folder contains the generated distance matrix and the R script used to generate the heatmap from it.

    - Paranthropus_Independent_Dataset_Trees_Results: Contains all NEXUS files and tree figures used in the analysis of the 'independent' reference dataset.
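
The fasta files listed above can be loaded without any third-party library. The following is a minimal sketch, not part of the dataset: the function name read_fasta is hypothetical, and the path in the usage comment assumes the archive was extracted into a folder named Paranthropus_Enamel, which the record does not confirm.

```python
def read_fasta(path):
    """Return a dict mapping each header line (without '>') to its sequence.

    Multi-line sequences are joined into one string; blank lines are skipped.
    If a header occurs twice, the later record overwrites the earlier one.
    """
    records = {}
    header = None
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records[header] = "".join(chunks)
                header = line[1:].strip()
                chunks = []
            elif header is not None:
                chunks.append(line)
    # Store the final record after the loop ends.
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Usage (the extraction path is an assumption, adjust to your local layout):
# seqs = read_fasta(
#     "Paranthropus_Enamel/Paranthropus_Raw_AA_Sequences_Unaligned/"
#     "Paranthropus_Unaligned.fasta"
# )
# for name, seq in seqs.items():
#     print(name, len(seq))
```

Loading Paranthropus_Unaligned.fasta and Paranthropus_Unaligned_UNFILTERED.fasta with the same function makes it straightforward to compare the filtered and unfiltered sequences record by record.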

 

Files

Paranthropus_Enamel.zip

Files (245.9 MB)

md5:51d4b8712d8fa2ae00383aad6f69f0c6
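
The md5 value above can be used to confirm that the archive downloaded intact. The following is a minimal sketch using Python's standard hashlib module; the function names md5_of_file and matches_record are hypothetical, and the file name in the usage comment assumes the archive was saved under its original name in the working directory.

```python
import hashlib

# Checksum as published on this record.
EXPECTED_MD5 = "51d4b8712d8fa2ae00383aad6f69f0c6"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()


# Usage (the local file name is an assumption):
# print("checksum OK" if matches_record("Paranthropus_Enamel.zip") else "checksum MISMATCH")
```

Reading in chunks keeps memory use low for the 245.9 MB archive. A mismatch usually indicates an incomplete or corrupted download, or a file from a different version of the record.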

Additional details

Funding

PUSHH – Palaeoproteomics to Unleash Studies on Human History 861389
European Commission