Published March 9, 2024 | Version 1.0
Dataset Open

Benchmarking imputation methods for categorical biological data

  • 1. University of Fribourg
  • 2. University of Zurich
  • 3. Swansea University
  • 4. University of Gothenburg

Contributors

  • 1. University of Fribourg

Description

This Zenodo repository accompanies the publication "Benchmarking imputation methods for categorical biological data". It collects the datasets and scripts used in the study and is intended for researchers who want to explore its empirical and simulated analyses.

Contents:

  1. empirical_analysis:

    • Trait Dataset of Elasmobranchs: A collection of trait data for elasmobranch species obtained from FishBase, stored as an RDS file.
    • Phylogenetic Tree: A phylogenetic tree stored as a TRE file.
    • Imputation Replicates (Imputation): Replicated imputations of missing data in the trait dataset, stored as RData files.
    • Error Calculation (Results): Error calculation results derived from imputed datasets, stored as RData files.
    • Scripts: Collection of R scripts used to run the empirical analysis.
  2. simulation_analysis:

    • Input Files: Input files used for the simulation analyses, stored as CSV files.
    • Data Distribution PDFs: PDF files displaying the distribution of simulated data and the missingness.
    • Output Files: Simulated trait datasets, trait datasets with missing data, and imputed trait datasets with their calculated imputation errors, stored as RData files.
    • Scripts: Collection of R scripts used for the simulation analysis.
  3. TDIP_package:

    • Scripts of the TDIP Package: All scripts related to the Trait Data Imputation with Phylogeny (TDIP) R package used in the analyses.
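
After downloading the archive, the layout described above can be checked without extracting everything. The following is a minimal sketch, not part of the repository's own scripts: the function name `summarize_archive` is hypothetical, and it assumes the three folders sit at the top level of the zip (if the archive wraps them in a single parent folder, the summary will show that folder instead).

```python
import zipfile
from collections import Counter, defaultdict
from pathlib import Path, PurePosixPath


def summarize_archive(zip_path):
    """Count files per top-level folder and file extension inside a zip archive."""
    summary = defaultdict(Counter)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            path = PurePosixPath(info.filename)
            # Files stored directly at the archive root have no folder part.
            top = path.parts[0] if len(path.parts) > 1 else "(root)"
            ext = path.suffix.lower() or "(none)"
            summary[top][ext] += 1
    return {folder: dict(counts) for folder, counts in summary.items()}


if __name__ == "__main__":
    # Assumed local path: the file name shown in the Files section of this record.
    archive_path = Path("benchmark_imputation_categorical.zip")
    if archive_path.exists():
        for folder, counts in sorted(summarize_archive(archive_path).items()):
            print(folder, counts)
    else:
        print(f"{archive_path} not found; download it first")
```

Extensions are lower-cased, so R scripts appear under `.r` and RData files under `.rdata`.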

Purpose:

This repository supports the transparency and reproducibility of our findings by making the datasets and scripts publicly accessible. Researchers who want to understand our methods, replicate our analyses, or build on our work can use it as a reference.

Citation:

If you use the datasets or scripts from this repository, please cite the publication "Benchmarking imputation methods for categorical biological data" and acknowledge this Zenodo repository.

Thank you for your interest in our research, and we hope this repository serves as a valuable resource in your scholarly pursuits.

Files

benchmark_imputation_categorical.zip

Files (321.3 MB)

Name: benchmark_imputation_categorical.zip
Size: 321.3 MB
md5:ded1613b1ec414dffa9668c1fd4b9de0
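
Because the archive is large, it may be worth confirming that a download is complete by comparing its MD5 checksum with the value listed above. A minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_download` and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as listed for benchmark_imputation_categorical.zip on this record.
EXPECTED_MD5 = "ded1613b1ec414dffa9668c1fd4b9de0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive_path = Path("benchmark_imputation_categorical.zip")  # assumed local path
    if archive_path.exists():
        print("checksum OK" if verify_download(archive_path) else "checksum mismatch")
    else:
        print(f"{archive_path} not found; download it first")
```

Reading in chunks keeps memory use low regardless of the archive's size. A mismatch usually indicates an interrupted or corrupted download.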

Additional details

Dates

Available
2024-03-09

Software

Programming language
R, Python
Development Status
Active