Software Open Access

# justincbagley/RAPFX: RAPFX pre-release version 0

Justin C. Bagley

<a href="http://imgur.com/HQ4qD4r"><img src="http://i.imgur.com/HQ4qD4r.png" title="source: Justin C. Bagley" width=40% height=40% align="center" /></a>

Sensitivity analysis of RADseq assembly parameter effects on downstream genetic analyses

This is version zero, the initial development release of RAPFX, a new script pipeline for testing the sensitivity of downstream genetic analyses to varying RADseq assembly parameters in pyRAD (Eaton 2014) or ipyrad (Eaton and Overcast 2016). Molecular ecologists often select an assembly or set of SNP calls from a single run of an assembler with a fixed set of parameter values (e.g. Phred offset, clustering percentage). This is problematic because these parameters can substantially affect (1) the resulting assembly and (2) downstream genetic analyses (e.g. population structure inference, phylogeny reconstruction). RAPFX takes steps towards automating tests of the sensitivity of assemblies and genetic inferences to pyRAD/ipyrad parameters. This can help users probe assembly parameters and select an assembly that produces better output and yields genetic inferences tending towards an average rather than extreme values (for example, in the number of genetic clusters).
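As a rough illustration of the "average rather than extreme" idea, the sketch below picks, from a set of assemblies, the one whose inferred number of clusters (K) lies closest to the median across the parameter sweep. It is not taken from RAPFX itself: the function name `closest_to_median`, the dictionary layout, and the K values are all hypothetical, and the median is only one reasonable choice of "average".

```python
from statistics import median

# Hypothetical results of a parameter sweep (made-up numbers):
# clustering threshold -> number of genetic clusters (K) inferred downstream.
k_by_threshold = {0.80: 2, 0.85: 3, 0.90: 3, 0.95: 5}


def closest_to_median(k_by_param):
    """Return the parameter value whose K is nearest the median K.

    Ties are broken by taking the smallest parameter value, an arbitrary
    but deterministic rule chosen for this sketch.
    """
    mid = median(k_by_param.values())
    return min(k_by_param, key=lambda p: (abs(k_by_param[p] - mid), p))


def k_range(k_by_param):
    """Spread of K across the sweep, a crude indicator of sensitivity."""
    values = list(k_by_param.values())
    return max(values) - min(values)


chosen = closest_to_median(k_by_threshold)
spread = k_range(k_by_threshold)
```

A large spread suggests that the inference is sensitive to the parameter and that the choice of assembly deserves closer inspection.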

This software is timely because pyRAD/ipyrad has become increasingly popular, especially given this assembler's improved handling of indels (insertions and deletions) relative to Stacks. In the future, I hope to extend RAPFX to sensitivity analyses for other assemblers as well, including Stacks and dDocent. First, however, expect more edits and minor versions improving the current pyRAD-related code, which is not yet fully stable but is available for use and testing.

Feel free to e-mail me at jcbagley (at) vcu.edu if you want to use this software or have questions.

What can you do with RAPFX?
• Infer maximum-likelihood gene trees in RAxML for all loci produced by a set of n pyRAD/ipyrad assemblies
• Infer population genetic structure (number of clusters) in fastSTRUCTURE for each assembly in a set of n assemblies
• Qualitatively assess sensitivity of genetic inferences to pyRAD/ipyrad assembly parameters
• Quantify sensitivity of downstream genetic results (number of clusters, gene tree distances) to varying pyRAD/ipyrad assembly parameters
• Use some available code for parameter importance testing
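
One way to picture the "gene tree distances" mentioned above is a symmetric-difference (Robinson-Foulds-style) count of clades found in one tree but not the other. The sketch below is an assumption-laden illustration, not RAPFX code: it treats trees as rooted nested tuples of tip names, whereas RAxML gene trees are unrooted and would normally be compared by bipartitions in a dedicated phylogenetics tool. The function names `clades` and `rooted_rf` are hypothetical.

```python
def clades(tree):
    """Collect non-trivial clades of a rooted tree given as nested tuples.

    Tips are strings; each internal node is a tuple of child nodes.
    Every clade is stored as a frozenset of the tip names below that node.
    """
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    root = tips(tree)
    found.discard(root)  # the root clade is shared by all trees on the same tips
    return found


def rooted_rf(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical gene trees for one locus under three assembly settings.
balanced = (("A", "B"), ("C", "D"))
swapped = (("A", "C"), ("B", "D"))
ladder = ((("A", "B"), "C"), "D")

distances = (
    rooted_rf(balanced, balanced),
    rooted_rf(balanced, swapped),
    rooted_rf(balanced, ladder),
)
```

Averaging such distances over loci and over pairs of assemblies would give a single number summarizing how strongly gene tree topology responds to the assembly parameters.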

Files (52.7 kB)

| Name | Size |
| --- | --- |
| justincbagley/RAPFX-v0.1.0.zip | 52.7 kB |