# The rapid radiation of *Bomarea* (Alstroemeriaceae: Liliales), driven by the rise of the Andes

This dataset contains the data needed to fully reproduce analyses of the evolutionary history, diversification, and biogeography of *Bomarea* and outgroups in Alstroemeriaceae. These data include intermediate and processed molecular data (raw molecular data are on NCBI SRA; see the link below), the results of phylogenetic analyses, biogeographic data on species' ranges, and the results of diversification and biogeographic analyses.


## Description of the data and file structure

### This repository contains a series of analysis-specific folders and relevant files:


#### <span style="color:orange">[Dir]</span> aln_editing 
Alignments and scripts for editing and curating alignments. All code is in R, which is available from [CRAN](https://cran.r-project.org/).

- <span style="color:orange">[Dir]</span> original_alns # alignments produced by the assembly pipeline described in Endara & Burleigh (2023, in prep)
- <span style="color:orange">[Dir]</span> final_alns # alignments used in the analyses after curating with the code files listed below
- 1_get_loci_for_genetrees.R 
- 2_sort_loci_highcoverage_good_gt.R
- 3_revise_alns.R
- 4_split_loci.R
- 5_trim_alns.R
- 6_concatenate_loci.R
- 7_make_bom_edu_subset.R
- 8_make_bpp_input_aln.R
#### <span style="color:orange">[Dir]</span> astral 
Scripts and data to build gene trees from alignments and run ASTRAL on them. Code is in bash and calls [IQ-TREE](http://www.iqtree.org/), a maximum likelihood phylogenetic reconstruction method, and [ASTRAL](https://github.com/smirarab/ASTRAL), an approximate coalescent phylogenetic reconstruction method.
- <span style="color:orange">[Dir]</span> alignments # input data for building gene trees with iqtree
- <span style="color:orange">[Dir]</span> gene_trees # output from iqtree
- astral.tre # output of astral
- gene_trees.sh # build gene trees 
- in_BS10.trees # input of astral
- run_astral.sh # code for running astral 
#### <span style="color:orange">[Dir]</span> contaiminant_ID 
Script, written in [R](https://cran.r-project.org/), for identifying contaminant sequences in alignments.
- blast_bad_seqs.R
#### <span style="color:orange">[Dir]</span> dec_alstr 
Scripts, data, and output for the Alstroemeriaceae-wide DEC analysis. Code is in Rev, implemented in [RevBayes](https://revbayes.github.io/), for specifying phylogenetic probabilistic graphical models.
- <span style="color:orange">[Dir]</span> data # input data for RevBayes analysis 
- <span style="color:orange">[Dir]</span> data_prep # prepping input data for RevBayes analysis, code is in [R](https://cran.r-project.org/)
- <span style="color:orange">[Dir]</span> output # RevBayes output files
- <span style="color:orange">[Dir]</span> scripts # RevBayes code 
#### <span style="color:orange">[Dir]</span> dec_bom_only 
Scripts, data, and output for the *Bomarea*-only DEC analysis. Code is in Rev, implemented in [RevBayes](https://revbayes.github.io/), for specifying phylogenetic probabilistic graphical models.
- <span style="color:orange">[Dir]</span> data # input data for RevBayes analysis 
- <span style="color:orange">[Dir]</span> data_prep # prepping input data for RevBayes analysis, code is in [R](https://cran.r-project.org/)
- <span style="color:orange">[Dir]</span> output # RevBayes output files
- <span style="color:orange">[Dir]</span> scripts # RevBayes code 
#### <span style="color:orange">[Dir]</span> genesortR_bom 
Files for running the [genesortR](https://github.com/mongiardino/genesortR) script to choose loci for dating.
#### <span style="color:orange">[Dir]</span> iqtree_part 
Files for estimating a maximum likelihood tree using [IQ-TREE](http://www.iqtree.org/), and the output of the analysis.
#### <span style="color:orange">[Dir]</span> lsbdp 
Lineage-specific birth-death process analysis on the *Bomarea* tree. Code is in Rev, implemented in [RevBayes](https://revbayes.github.io/), for specifying phylogenetic probabilistic graphical models.
- <span style="color:orange">[Dir]</span> data # input data for RevBayes analysis 
- <span style="color:orange">[Dir]</span> output # RevBayes output files
- <span style="color:orange">[Dir]</span> scripts # RevBayes code 
#### <span style="color:orange">[Dir]</span> partitionfinder 
Files for identifying partitions for the ML analysis with [PartitionFinder](https://www.robertlanfear.com/partitionfinder/), and the output of that analysis.
#### <span style="color:orange">[Dir]</span> relaxed_dating 
Relaxed-clock dating of the ASTRAL topology. Code is in Rev, implemented in [RevBayes](https://revbayes.github.io/), for specifying phylogenetic probabilistic graphical models.
- <span style="color:orange">[Dir]</span> data # input data for RevBayes analysis 
- <span style="color:orange">[Dir]</span> data_prep # prepping input data for RevBayes analysis, code is in [R](https://cran.r-project.org/)
- <span style="color:orange">[Dir]</span> output # RevBayes output files
- <span style="color:orange">[Dir]</span> scripts # RevBayes code 

## Sharing/Access information

Links to other publicly accessible locations of data:

- Raw molecular data are available on NCBI SRA under BioProject ID PRJNA881339

Biogeographic data were derived from the following source:
 - [Kew's Plants of the World Online](https://powo.science.kew.org/)


## Code/Software

All code is also available on GitHub at <https://github.com/cmt2/bom_phy_analysis>
