Code for "A first look at sea-lavenders genomics – can genome wide SNP information tip the scales of controversy in the Limonium vulgare species complex?"
Creators
- 1. University of Lisbon
Description
Limonium data analyses
In this repository you will find:
- A Makefile describing the data analysis process
- A Dockerfile describing how to build the Docker image used to analyse the data
- A bin directory, containing some custom scripts as well as symlinks to the software used (tailored for the Docker image)
- A Static_param_files directory containing static configuration files that work as the analysis backbone
- A submodules directory containing clones of other git repositories with scripts used in the analyses
What you won't find here:
- Input data files
- Output files (i.e. the analysis results)
Running the analyses:
- Get the docker image:
docker pull stunts/limonium_data_analyses
- Start a shell in the docker container. The volumes containing the input files and the output directory should be mounted at this stage:
docker run -v /data/Limonium/analyses_infiles:/RRL/infiles -v /data/Limonium/analyses_outfiles:/RRL/outfiles -i -t stunts/limonium_data_analyses:general_002
Of course, the host paths (the part before each colon in the -v options) should be adapted to your own file system!
- Once inside the interactive docker shell, you can run the analyses themselves. To match what was used in the paper:
# Species dataset
make run_name=Limonium_species /RRL/outfiles/RAxML/RAxML_Limonium_species_tree.svg clustering
# Lvu_Lma dataset
make run_name=Limonium_lvu_lma miss_thres_percent=90 /RRL/outfiles/Limonium_lvu_lma/Annotations/annotations.tab clustering
This might take a while, but in the end you will have all the analysis results, 100% reproducible!
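Since these runs can be long, it may help to keep a log of each one. The following is only a minimal sketch, not part of the repository: `run_logged` is a hypothetical helper, and the log file path in the usage comment is an assumed location inside the mounted output directory.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of the repository): runs COMMAND, saves its
# stdout and stderr to LOGFILE, appends the exit status, and returns it.
run_logged() {
    log_file=$1
    shift
    "$@" >"$log_file" 2>&1
    status=$?
    echo "exit status: $status" >>"$log_file"
    return "$status"
}

# Possible use inside the container (the log path is an assumption):
# run_logged /RRL/outfiles/Limonium_species.log \
#     make run_name=Limonium_species \
#     /RRL/outfiles/RAxML/RAxML_Limonium_species_tree.svg clustering
```

Because the log is written under the mounted output directory, it would remain available on the host after the container exits.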
Warning:
Some steps are likely to take a considerable amount of time to run, namely the RAxML analysis and the annotation step. The annotation step downloads the UniParc database. At the time of writing (UniParc release 2022_01), this database was 100 GB, and it will certainly grow over time. Also, I am not aware of a way to obtain specific UniParc releases, which might affect future reproducibility.
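Given the size of that download, checking free disk space before starting the annotation step may save a failed run. This is a rough sketch under assumptions: `has_free_space` is a hypothetical helper, `OUT_DIR` is an assumed variable pointing at the directory that will receive the download, and the 150 GB threshold is an arbitrary margin above the figure quoted above.

```shell
#!/bin/sh
# has_free_space DIR REQUIRED_GB
# Hypothetical helper: succeeds if the file system holding DIR has at least
# REQUIRED_GB gigabytes (counted as 1024 * 1024 KiB) available.
has_free_space() {
    # df -P gives single-line POSIX output; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    required_kb=$(( $2 * 1024 * 1024 ))
    [ "$avail_kb" -ge "$required_kb" ]
}

# OUT_DIR and the 150 GB margin are assumptions; adjust both as needed.
if has_free_space "${OUT_DIR:-.}" 150; then
    echo "Enough free space for the UniParc download"
else
    echo "Warning: less than 150 GB free" >&2
fi
```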
Files
- 04-Limonium_GBS_data_analyses.zip (2.0 MB), md5:8f6a487fd258d0ef30294d143325396f
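The MD5 checksum listed above can be used to confirm that the archive downloaded intact. A minimal sketch, assuming GNU `md5sum` is available (on macOS the equivalent tool is `md5`) and that the archive sits in the current directory; `verify_md5` is a hypothetical helper name.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: succeeds only if the MD5 digest of FILE equals
# EXPECTED_MD5. Assumes GNU md5sum is installed.
verify_md5() {
    actual=$(md5sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

archive="04-Limonium_GBS_data_analyses.zip"
expected="8f6a487fd258d0ef30294d143325396f"

# Only check when the archive is present in the current directory.
if [ -f "$archive" ]; then
    if verify_md5 "$archive" "$expected"; then
        echo "Checksum OK"
    else
        echo "Checksum mismatch" >&2
    fi
fi
```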