Let's look at what we have
tree
inseq=1kP_LAR_orthogroup
head data/$inseq.fasta
grep '>' data/$inseq.fasta -c
This is the new route, as opposed to the random subsetting.
grep -o '^>[A-Z]\{4\}-' data/$inseq.fasta | cut -c 2-5 | sort | uniq -c
grep '^>' data/$inseq.fasta | grep -v '[A-Z]\{4\}-' | cut -c 2-6 | sort | uniq -c
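To make the two counting commands above easier to follow, here is a minimal sketch on a toy FASTA. The headers and the temporary file are made up for illustration; they only mimic the "four-letter sample code, dash, number" style of the 1kP headers.

```shell
# Toy FASTA with hypothetical headers in the 1kP "CODE-number-species" style.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
>ABCD-2000001-Species_one
MAGG
>ABCD-2000002-Species_one
MAGA
>WXYZ-2000003-Species_two
MGGA
>Guide_sequence_1
MGVK
EOF

# Headers that start with a four-letter code followed by a dash: count per code.
coded=$(grep -o '^>[A-Z]\{4\}-' "$tmp" | cut -c 2-5 | sort | uniq -c)
# Headers without such a code (for example guide sequences).
uncoded=$(grep '^>' "$tmp" | grep -vc '^>[A-Z]\{4\}-')

echo "$coded"
echo "headers without a sample code: $uncoded"
rm -f "$tmp"
```

The first count tells you how many sequences each 1kP sample contributes; the second catches everything that does not follow the 1kP naming scheme.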
mkdir data/LAR_orthogroup_selection_v1
Using the codes above, and the online sample list of the 1kPlants project, I made an overview table containing extra information about the samples to
the summary looks like this:
And the full table is online here: https://docs.google.com/spreadsheets/d/1v2igxY_nr7ETMoUdbqpY0QKVxJ-KYiRiO2lLoyOABsw/edit?usp=sharing
I made a selection in
./data/LAR_orthogroup_selection_v1/1kP_LAR_RNAselection
and
./data/LAR_orthogroup_selection_v1/1kP_LAR_DNAselection
Then there is Erbil's guide v4:
./data/Erbils_guide_v3.fasta
supplemented with ./data/Erbils_verified_sequences_v2_nonredundantwithv3.fasta
and extra Azolla sequences:
Azfi PIP-likes V1.txt
and Azfi PIP-likes V2.txt
ls data -l
Now let's compose a fasta combining all these subsets.
grep -f ./data/LAR_orthogroup_selection_v1/1kP_LAR_DNAselection data/$inseq.fasta | grep '^>' > ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection.names
grep -f ./data/LAR_orthogroup_selection_v1/1kP_LAR_RNAselection data/$inseq.fasta | grep '^>' >> ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection.names
grep '>' -c ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection.names
grep '>' ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection.names | sort | uniq | wc -l
head ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection.names
rm ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection_v1.fasta 2> /dev/null
/opt/rnaseq/bin/extract_sequence_by_id.pl --ID=./data/LAR_orthogroup_selection_v1/1kP_LAR_selection.names \
--fileformat=fasta \
--sequences=data/$inseq.fasta \
--outfile=./data/LAR_orthogroup_selection_v1/1kP_LAR_selection_v1.fasta
grep '>' ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection_v1.fasta | sort | wc -l
grep '>' ./data/LAR_orthogroup_selection_v1/1kP_LAR_selection_v1.fasta | sort | uniq | wc -l
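Comparing the total and the unique header counts tells you whether duplicates exist, but not which ones. A small sketch of how the duplicated names could be listed, using a made-up names file:

```shell
# Hypothetical names file with one header listed twice.
names=$(mktemp)
printf '%s\n' \
  '>ABCD-2000001-Species_one' \
  '>WXYZ-2000003-Species_two' \
  '>ABCD-2000001-Species_one' > "$names"

total=$(grep -c '^>' "$names")                    # all header lines
unique=$(sort -u "$names" | wc -l | tr -d ' ')    # distinct header lines
dups=$(sort "$names" | uniq -d)                   # only headers seen more than once

echo "total=$total unique=$unique"
echo "duplicated: $dups"
rm -f "$names"
```

If total and unique differ, the `uniq -d` output shows which selection entries were picked up by both the DNA and the RNA list.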
Now that we have our fasta with the selected 1kP sequences, let's add the extra Azolla and guide sequences.
cat ./data/Erbils_guide_v3.fasta \
./data/Erbils_verified_sequences_v2_nonredundantwithv3.fasta \
'./data/LAR_orthogroup_selection_v1/Azfi PIP-likes V1.txt' \
'./data/LAR_orthogroup_selection_v1/Azfi PIP-likes V2.txt' | grep '>' | sort | uniq -c
cat ./data/Erbils_guide_v3.fasta \
./data/Erbils_verified_sequences_v2_nonredundantwithv3.fasta \
'./data/LAR_orthogroup_selection_v1/Azfi PIP-likes V1.txt' \
'./data/LAR_orthogroup_selection_v1/Azfi PIP-likes V2.txt' > ./data/LAR_orthogroup_selection_v1/1kP_LAR_guide_v5.fasta
cat ./data/LAR_orthogroup_selection_v1/1kP_LAR_guide_v5.fasta \
./data/LAR_orthogroup_selection_v1/1kP_LAR_selection_v1.fasta \
> ./data/1kP_LAR_selectionv1_guide_v5.fasta
inseq=1kP_LAR_selectionv1_guide_v5
echo $inseq
grep '>' data/$inseq.fasta -c
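One thing that can silently go wrong when concatenating FASTA files with cat is an input file whose last line has no trailing newline: the first header of the next file then gets glued onto the previous sequence. A minimal sketch of the problem and of one possible workaround (`awk 1`, which re-emits every line with a newline), on two toy files:

```shell
a=$(mktemp); b=$(mktemp)
printf '>seq1\nMAGG' > "$a"      # last line deliberately lacks a newline
printf '>seq2\nMGVK\n' > "$b"

with_cat=$(cat "$a" "$b" | grep -c '^>')     # second header is glued onto MAGG
with_awk=$(awk 1 "$a" "$b" | grep -c '^>')   # awk re-emits every line with a newline

echo "headers seen after cat: $with_cat, after awk 1: $with_awk"
rm -f "$a" "$b"
```

If the header count after concatenation is lower than the sum of the counts of the input files, this is a likely cause.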
Skipping this section for now
n=700
grep '>' data/$inseq.fasta | tr -d '>' | shuf | head -n $n > data/"$inseq"_random"$n".txt
grep '>' ../LAR_tree/alignments_raw/LAR_orthogroup_LARselectionExtraFernsAndLycophytes_guide2_aligned-mafft-linsi.fasta | tr -d '>' | tail -n +49 > data/"$inseq"_random"$n".txt
wc -l ./data/"$inseq"_random700.txt
Let's add all Azolla stuff in there, but first see what is there already
grep -i azolla data/"$inseq"_random"$n".txt
grep -i azolla data/"$inseq".fasta
Remove double guide sequence AZOLLAFILICULOIDES
sed -i '/AZOLLAFILICULOIDES/d' data/"$inseq".fasta
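Note that deleting only the matching header line leaves the sequence lines of that record behind, and they end up attached to the previous record. A sketch of an alternative that drops the whole record, shown on a toy FASTA (file names and sequences are made up):

```shell
fa=$(mktemp); out=$(mktemp)
cat > "$fa" <<'EOF'
>keep_1
MAGG
GATG
>AZOLLAFILICULOIDES_dup
MGVK
SRVL
>keep_2
MIHP
EOF

# At every header decide whether the record is dropped; print only kept records.
awk -v pat='AZOLLAFILICULOIDES' '
  /^>/ { skip = (index($0, pat) > 0) }
  !skip
' "$fa" > "$out"

kept_headers=$(grep -c '^>' "$out")
kept_lines=$(wc -l < "$out" | tr -d ' ')
orphans=$(grep -c 'MGVK' "$out")   # sequence lines of the removed record
echo "headers=$kept_headers lines=$kept_lines orphaned=$orphans"
rm -f "$fa" "$out"
```

Writing to a new file and moving it over the original afterwards keeps the behaviour close to the in-place `sed -i`.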
head data/"$inseq"_random"$n".txt
/opt/rnaseq/bin/extract_sequence_by_id.pl --ID=<(sort data/"$inseq"_random"$n".txt | uniq) \
--fileformat=txt \
--sequences=<(cat data/$inseq.fasta | tr -d '.') \
--outfile=data/"$inseq"_random"$n".fasta
grep '>' -c data/"$inseq"_random"$n".fasta
n=700
grep -i azolla data/"$inseq"_random"$n".fasta
ls data
head data/1kP_LAR_orthogroup_random700.fasta
head data/Erbils_guide_v3.fasta
grep '>' data/Erbils_verified_sequences_v2_nonredundantwithv3.fasta
grep '>' data/Erbils_guide_v3.fasta
cat data/Erbils_guide_v3.fasta > data/1kP_LAR_orthogroup_random700_guidev4.fasta
cat data/Erbils_verified_sequences_v2_nonredundantwithv3.fasta >> data/1kP_LAR_orthogroup_random700_guidev4.fasta
echo "" >> data/1kP_LAR_orthogroup_random700_guidev4.fasta
cat data/1kP_LAR_orthogroup_random700.fasta >> data/1kP_LAR_orthogroup_random700_guidev4.fasta
grep '>' -c ./data/1kP_LAR_orthogroup_random700_guidev4.fasta
inseq="$inseq"_random"$n"_guidev4
echo $inseq
head data/$inseq.fasta
First with MAFFT
mafft --help
This is probably the most accurate mafft setting, which is turned off by default in normal or auto mafft for alignments bigger than 200 sequences. Since we're moving to a publication-quality tree, let's do it this way and see.
conda activate phylogenetics
rm "./data/alignments_raw/$inseq"_aligned-mafft-linsi.fasta
if [ ! -d ./data/alignments_raw/ ]
then mkdir ./data/alignments_raw
fi
if [ ! -f "./data/alignments_raw/$inseq"_aligned-mafft-linsi.fasta ]
then linsi --thread 12 data/$inseq.fasta > "./data/alignments_raw/$inseq"_aligned-mafft-linsi.fasta
fi
ls ./data/alignments_raw
head ./data/alignments_raw/"$inseq"_aligned-mafft-linsi.fasta
Odds are, your alignment is quite gappy, which may confuse tree-building algorithms. Often it is better to remove gappy columns from your alignment. Let's have a look at this with trimAl.
The name is short for 'trim alignment'; there is no artificial intelligence going on here.
trimal -h
I'm trying some manual trimming as an alternative
mkdir data/alignments_trimmed 2> /dev/null
trimappendix='trim-auto'
for a in "data/alignments_raw/$inseq"_aligned*.fasta
do appendix=$(echo $a | cut -d '/' -f 3- | sed "s/$inseq\_//" | sed "s/.fasta//")
if [ ! -f data/alignments_trimmed/"$inseq"_"$appendix"_"$trimappendix".fasta ]
then echo "trimming alignment $a"
sed -i 's/ /_/g' $a
trimal -in $a \
-out data/alignments_trimmed/"$inseq"_"$appendix"_"$trimappendix".fasta \
-automated1 \
-htmlout data/alignments_trimmed/"$inseq"_"$appendix"_"$trimappendix".html &
fi
done
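The `appendix` variable in the loop above is simply the alignment file name without the `$inseq` prefix and the `.fasta` extension. As a sketch, the same string can be derived with shell parameter expansion instead of the cut/sed pipeline; the value of `inseq` and the path below are example values only.

```shell
inseq='1kP_LAR_selectionv1_guide_v5'
a="data/alignments_raw/${inseq}_aligned-mafft-linsi.fasta"

file=${a##*/}                   # strip the directory part
noext=${file%.fasta}            # strip the .fasta extension
appendix=${noext#"${inseq}_"}   # strip the "<inseq>_" prefix

trimappendix='trim-auto'
out="data/alignments_trimmed/${inseq}_${appendix}_${trimappendix}.fasta"

echo "$appendix"
echo "$out"
```

This avoids the unescaped dot in `s/.fasta//` and does not depend on how many directory levels the path has.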
mkdir data/alignments_trimmed 2> /dev/null
# Keep the label in sync with the trimAl parameters below:
# -gt .4, -seqoverlap 95, -resoverlap 0.7 (the setting chosen further down)
trimappendix='trim-gt4-seq95-res7'
for a in "data/alignments_raw/$inseq"_aligned*.fasta
do appendix=$(echo $a | cut -d '/' -f 3- | sed "s/$inseq\_//" | sed "s/.fasta//")
if [ ! -f data/alignments_trimmed/"$inseq"_"$appendix"_"$trimappendix".fasta ]
then echo "trimming alignment $a"
# trimAl dislikes spaces in sequence names, so replace them with underscores
sed -i 's/ /_/g' $a
trimal -in $a \
-out data/alignments_trimmed/"$inseq"_"$appendix"_"$trimappendix".fasta \
-gt .4 \
-seqoverlap 95 \
-resoverlap 0.7 \
-htmlout data/alignments_trimmed/"$inseq"_"$appendix"_"$trimappendix".html
fi
done
ls data/alignments_trimmed
conda deactivate
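To get a feeling for what the gap threshold does: as far as I understand the trimAl help text, `-gt` is the minimum fraction of sequences that must have a residue (not a gap) in a column for that column to be kept. The sketch below is a toy reimplementation of only that rule in awk, on a made-up alignment with one sequence per line; it is not trimAl's code and it ignores `-seqoverlap` and `-resoverlap`.

```shell
aln=$(mktemp)
cat > "$aln" <<'EOF'
>s1
MA-GK
>s2
M--GK
>s3
MA-G-
>s4
M--GR
>s5
MAVGK
EOF

threshold=0.4
kept=$(awk -v gt="$threshold" '
  /^>/ { next }                       # skip header lines
  {
    n++                               # number of sequences seen
    len = length($0)
    for (i = 1; i <= len; i++)
      if (substr($0, i, 1) != "-") nongap[i]++
  }
  END {
    out = ""
    for (i = 1; i <= len; i++)        # keep columns with enough non-gap residues
      if (nongap[i] / n >= gt) out = out (out == "" ? "" : " ") i
    print out
  }
' "$aln")

echo "columns kept at -gt $threshold: $kept"
rm -f "$aln"
```

Here column 3 has a residue in only one of five sequences (0.2), so it falls below the 0.4 threshold and is dropped.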
Let's look at the stats
auto:
Selected Sequences: 808 /Selected Residues: 145
Deleted Sequences: 0 /Deleted Residues: 2346
gt6-seq80-res6
Selected Sequences: 808 /Selected Residues: 303
Deleted Sequences: 0 /Deleted Residues: 2188
First verdict: the auto setting is definitely too strict for my taste. I do like the selection of columns with the manual parameters, but I think the selection of sequences can be stricter to filter out banding like this:
gt6-seq20-res6
Selected Sequences: 808 /Selected Residues: 303
Deleted Sequences: 0 /Deleted Residues: 2188
So that was pointless, let's try 90 instead:
gt6-seq90-res6
Selected Sequences: 800 /Selected Residues: 303
Deleted Sequences: 8 /Deleted Residues: 2188
Marginally better, let's keep trying
trim-gt6-seq95-res7
Selected Sequences: 584 /Selected Residues: 303
Deleted Sequences: 224 /Deleted Residues: 2188
In the end I chose this one: trim-gt4-seq95-res7
It keeps 584 sequences, discarding 224, and keeps 306 columns of information. Horizontal banding looks like the above.
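The trimAl summaries are easier to compare as percentages. A small sketch that parses the two summary lines (here the gt6-seq95-res7 numbers copied from the notes above) and reports how much of the alignment survives; the variable names are just for illustration.

```shell
summary='Selected Sequences: 584 /Selected Residues: 303
Deleted Sequences: 224 /Deleted Residues: 2188'

stats=$(echo "$summary" | awk -F'[:/]' '
  /^Selected/ { sel_seq = $2 + 0; sel_res = $4 + 0 }
  /^Deleted/  { del_seq = $2 + 0; del_res = $4 + 0 }
  END {
    tot_seq = sel_seq + del_seq
    tot_res = sel_res + del_res
    printf "sequences kept: %d of %d (%.1f%%)\n", sel_seq, tot_seq, 100 * sel_seq / tot_seq
    printf "columns kept: %d of %d (%.1f%%)\n", sel_res, tot_res, 100 * sel_res / tot_res
  }')

echo "$stats"
```

The totals (808 sequences, 2491 columns) are the same for every trimming run on this alignment, which is a quick sanity check when copying numbers by hand.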
We'll make fast trees (not so accurate, no bootstraps, but fast) (optional) and we'll make "proper trees" using the amazing iqtree
rm -rf ./data/LAR_orthogroup_fasttrees
ls analyses/1kP_LAR_orthogroup_random700_guidev3_fasttrees
for a in data/alignments_trimmed/"$inseq"_aligned*.fasta
do echo "making a fasttree of file $a"
appendix=$(echo $a | cut -d '/' -f 3- | sed "s/$inseq\_//" | sed "s/.fasta//")
echo $appendix
if [ ! -d analyses/"$inseq"_fasttrees ]
then mkdir analyses/"$inseq"_fasttrees
fi
if [ ! -d analyses/"$inseq"_fasttrees/"$appendix" ]
then mkdir analyses/"$inseq"_fasttrees/"$appendix"
fasttree -log analyses/"$inseq"_fasttrees/"$appendix"/"$inseq"_"$appendix"_fasttree.log \
$a \
> analyses/"$inseq"_fasttrees/"$appendix"/"$inseq"_"$appendix"_fasttree.tree \
2> analyses/"$inseq"_fasttrees/"$appendix"/"$inseq"_"$appendix"_fasttree.stderr &
fi
done
tail analyses/"$inseq"_fasttrees/"$appendix"/"$inseq"_"$appendix"_fasttree.log
ls analyses/"$inseq"_fasttrees/"$appendix"/
Now let's loop over all alignments we made, making a tree for each. Choose your parameters wisely, this can take a long time.
ModelFinder is the best feature of iqtree.
Evolution can happen in a lot of ways, and iqtree takes this into account.
Use -m TEST to use ModelFinder, or -m MFP for extended model finding (for your publication-quality tree).
ModelFinder, especially the extended version, can take a long time to calculate.
So if you have done it once for a specific alignment, don't do it twice. ;)
Now have a look at these examples and the manual:
ls data/alignments_trimmed/"$inseq"_aligned*gt*.fasta
grep 'Best-fit model' ./analyses/1kP_LAR_orthogroup_random700_guidev3_trees/aligned-mafft-linsi_trim-gt6-seq80/1kP_LAR_orthogroup_random700_guidev3_aligned-mafft-linsi_trim-gt6-seq80_iqtree-bb2000-shalrt2000
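To avoid running ModelFinder twice, the best-fit model can be pulled out of an earlier run and passed to `-m` directly. A minimal sketch, assuming the earlier output contains a line of the form `Best-fit model: <model> chosen according to BIC`; that exact wording, the toy log file and the fallback logic are assumptions for illustration.

```shell
# Hypothetical excerpt of an earlier iqtree output file.
log=$(mktemp)
cat > "$log" <<'EOF'
Best-fit model: LG+R7 chosen according to BIC
EOF

# Take the first matching line and keep only the model name.
model=$(grep -m 1 '^Best-fit model' "$log" | sed 's/^Best-fit model: *//' | cut -d ' ' -f 1)

# Fall back to full model finding when no earlier result is available.
if [ -n "$model" ]
then model_arg="-m $model"
else model_arg="-m MFP"
fi

echo "$model_arg"
rm -f "$log"
```

The resulting string can then be appended to the iqtree command in place of a hard-coded `-m` option.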
iqpendix='iqtree-bb2000-alrt2000'
#for a in data/alignments_trimmed/"$inseq"_aligned*trim*.fasta
for a in data/alignments_trimmed/"$inseq"_aligned*trim-gt4-seq95-res7.fasta
do echo "making a tree of file $a"
appendix=$(echo $a | cut -d '/' -f 3- | sed "s/$inseq\_//" | sed "s/.fasta//")
echo $appendix
if [ ! -d analyses/"$inseq"_trees ]
then mkdir analyses/"$inseq"_trees
fi
if [ ! -d analyses/"$inseq"_trees/"$appendix" ]
then mkdir analyses/"$inseq"_trees/"$appendix"
fi
if [ ! -f analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".treefile ]
then nice iqtree -s $a \
-bb 2000 \
-alrt 2000 \
-nt AUTO \
-ntmax 12 \
-pre analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix" \
2> analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".stderr \
> analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".stdout \
-m MFP && \
cat analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".log | mail -s LAR_selectionv1_guidev5 laura.w.dijkhuizen@gmail.com &
fi
done
iqpendix='iqtree-b200'
#for a in data/alignments_trimmed/"$inseq"_aligned*trim*.fasta
for a in data/alignments_trimmed/"$inseq"_aligned*trim-gt4-seq95-res7.fasta
do echo "making a tree of file $a"
appendix=$(echo $a | cut -d '/' -f 3- | sed "s/$inseq\_//" | sed "s/.fasta//")
echo $appendix
if [ ! -d analyses/"$inseq"_trees ]
then mkdir analyses/"$inseq"_trees
fi
if [ ! -d analyses/"$inseq"_trees/"$appendix" ]
then mkdir analyses/"$inseq"_trees/"$appendix"
fi
if [ ! -f analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".treefile ]
then nice iqtree -s $a \
-b 200 \
-nt AUTO \
-ntmax 12 \
-pre analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix" \
2> analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".stderr \
> analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".stdout \
-m LG+R7 && \
cat analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".log | mail -s LAR_selectionv1_guidev5 laura.w.dijkhuizen@gmail.com &
fi
done
ls analyses/"$inseq"_trees/aligned*/
tail -n 20 analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".log
grep 'START' analyses/"$inseq"_trees/"$appendix"/"$inseq"_"$appendix"_"$iqpendix".log
Find the trees in iTOL: https://itol.embl.de/shared/lauradijkhuizen
We have identified roughly three clades of fern LARs and LAR-likes. Here I select sequences from fern species that have representative sequences in each of these three clades.
UGNK-2002432-Marattia attenuata
MEKP-2009607-Dipteris conjugata
VIBO-2075523-Osmunda javanica
VIBO-2074404-Osmunda javanica
MEKP-2006126-Dipteris conjugata
PNZO-2008392-Culcita macrocarpa
PNZO-2022483-Culcita macrocarpa
MEKP-2104936-Dipteris conjugata
CQPW-2095554-Anemia tomentosa
PNZO-2020054-Culcita macrocarpa
CVEG-2005709-Azolla cf. caroliniana
Azolla filiculoides LAR
Azfi s0197.g057377 characterized LAR
YJJY-2005734-Woodsia scopulina
ZXJO-2003107-Hemionitis arifolia
ZXJO-2003105-Hemionitis arifolia
PIVW-2004875-Ceratopteris thalictroides
PIVW-2004244-Ceratopteris thalictroides
PIVW-2004243-Ceratopteris thalictroides
ZXJO-2004441-Hemionitis arifolia
ZXJO-2004443-Hemionitis arifolia
ZXJO-2004442-Hemionitis arifolia
CAPN-2038587-Equisetum diffusum
JVSZ-2129506-Equisetum hymale
JVSZ-2001809-Equisetum hymale
JVSZ-2001808-Equisetum hymale
JVSZ-2001810-Equisetum hymale
JVSZ-2001811-Equisetum hymale
UGNK-2017109-Marattia attenuata
UGNK-2021182-Marattia attenuata
MEKP-2006924-Dipteris conjugata
EWXK-2119988-Thyrsopteris elegans
PNZO-2007161-Culcita macrocarpa
PNZO-2007160-Culcita macrocarpa
PNZO-2013494-Culcita macrocarpa
EWXK-2121103-Thyrsopteris elegans
VIBO-2007835-Osmunda javanica
MEKP-2104845-Dipteris conjugata
EWXK-2121424-Thyrsopteris elegans
PNZO-2150800-Culcita macrocarpa
VIBO-2006262-Osmunda javanica
VIBO-2006264-Osmunda javanica
CQPW-2096458-Anemia tomentosa
MEKP-2001942-Dipteris conjugata
CQPW-2014224-Anemia tomentosa
PNZO-2149017-Culcita macrocarpa
EWXK-2024179-Thyrsopteris elegans
EWXK-2022152-Thyrsopteris elegans
PNZO-2026447-Culcita macrocarpa
CVEG-2016335-Azolla cf. caroliniana
CVEG-2016336-Azolla cf. caroliniana
Afi v2 s1241G000110.6 PCBER
Afi v2 s1241G000080.2 IFR
PIVW-2013241-Ceratopteris thalictroides
PIVW-2091514-Ceratopteris thalictroides
ZXJO-2007558-Hemionitis arifolia
YJJY-2003866-Woodsia scopulina
YJJY-2003867-Woodsia scopulina
MEKP-2105811-Dipteris conjugata
VIBO-2008406-Osmunda javanica
MEKP-2014651-Dipteris conjugata
MEKP-2104929-Dipteris conjugata
CQPW-2024177-Anemia tomentosa
PIVW-2009700-Ceratopteris thalictroides
ZXJO-2008167-Hemionitis arifolia
ZXJO-2008168-Hemionitis arifolia
EWXK-2012291-Thyrsopteris elegans
EWXK-2012289-Thyrsopteris elegans
YJJY-2008533-Woodsia scopulina
UGNK-2006528-Marattia attenuata
ALVQ-2018495-Tmesipteris parva
QVMR-2016059-Psilotum nudum
ALVQ-2000101-Tmesipteris parva
ALVQ-2000100-Tmesipteris parva
ALVQ-2007082-Tmesipteris parva
QVMR-2017404-Psilotum nudum
UGNK-2021731-Marattia attenuata
PNZO-2016614-Culcita macrocarpa
PNZO-2016613-Culcita macrocarpa
QVMR-2013662-Psilotum nudum
ALVQ-2004508-Tmesipteris parva
ALVQ-2004507-Tmesipteris parva
ALVQ-2004506-Tmesipteris parva
EWXK-2000842-Thyrsopteris elegans
EWXK-2000843-Thyrsopteris elegans
EWXK-2000844-Thyrsopteris elegans
UGNK-2020446-Marattia attenuata
MEKP-2101772-Dipteris conjugata
MEKP-2006789-Dipteris conjugata
MEKP-2006790-Dipteris conjugata
mkdir analyses/WLARs
ls analyses/WLARs
wc -l ./analyses/WLARs/fern*LAR*.txt
#sed -i 's/ /_/g' ./analyses/WLARs/fern*LAR*.txt
/opt/rnaseq/bin/extract_sequence_by_id.pl --ID=./analyses/WLARs/fernLARs.txt \
--fileformat=txt \
--sequences=./data/1kP_LAR_selectionv1_guide_v5.fasta \
--outfile=./analyses/WLARs/fernLARs.fasta
/opt/rnaseq/bin/extract_sequence_by_id.pl --ID=./analyses/WLARs/fernWLAR1.txt \
--fileformat=txt \
--sequences=./data/1kP_LAR_selectionv1_guide_v5.fasta \
--outfile=./analyses/WLARs/fernWLAR1.fasta
/opt/rnaseq/bin/extract_sequence_by_id.pl --ID=./analyses/WLARs/fernWLAR2.txt \
--fileformat=txt \
--sequences=./data/1kP_LAR_selectionv1_guide_v5.fasta \
--outfile=./analyses/WLARs/fernWLAR2.fasta
grep '>' -c ./analyses/WLARs/fern*LAR*.fasta
diff <(sort analyses/WLARs/fernLARs.txt) <(grep '>' analyses/WLARs/fernLARs.fasta | tr -d '>' | sort)
diff <(sort analyses/WLARs/fernWLAR1.txt) <(grep '>' analyses/WLARs/fernWLAR1.fasta | tr -d '>' | sort)
diff <(sort analyses/WLARs/fernWLAR2.txt) <(grep '>' analyses/WLARs/fernWLAR2.fasta | tr -d '>' | sort)
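The diff calls above show every difference in both directions. If you only want the IDs that were requested but not extracted, `comm` gives a shorter answer. A sketch with a made-up ID list and FASTA:

```shell
ids=$(mktemp); fa=$(mktemp)
printf '%s\n' 'ID_b' 'ID_a' 'ID_c' > "$ids"              # requested IDs
printf '%s\n' '>ID_a' 'MAGG' '>ID_c' 'MGVK' > "$fa"      # what was extracted

# comm needs both inputs sorted the same way.
sort "$ids" > "$ids.sorted"
grep '^>' "$fa" | sed 's/^>//' | sort > "$fa.sorted"

# Column 1 of comm: lines only in the first file, i.e. requested but missing.
missing=$(comm -23 "$ids.sorted" "$fa.sorted")

echo "missing from fasta: $missing"
rm -f "$ids" "$fa" "$ids.sorted" "$fa.sorted"
```

An empty result means every ID in the list made it into the FASTA file.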
cat analyses/WLARs/fernLARs.fasta | sed 's/^>/>LAR_/g'> analyses/WLARs/fernLARclades.fasta
cat analyses/WLARs/fernWLAR1.fasta | sed 's/^>/>WLAR1_/g'>> analyses/WLARs/fernLARclades.fasta
cat analyses/WLARs/fernWLAR2.fasta | sed 's/^>/>WLAR2_/g'>> analyses/WLARs/fernLARclades.fasta
head ./analyses/WLARs/fernLARclades.fasta
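The three prefixing commands above follow the same pattern, so they can also be written as one loop over file/label pairs. A sketch in a temporary directory with toy FASTA files (the sequences are placeholders, the file names mirror the ones used above):

```shell
dir=$(mktemp -d)
printf '%s\n' '>seq1' 'MAGG' > "$dir/fernLARs.fasta"
printf '%s\n' '>seq2' 'MGVK' > "$dir/fernWLAR1.fasta"
printf '%s\n' '>seq3' 'MIHP' > "$dir/fernWLAR2.fasta"

: > "$dir/fernLARclades.fasta"     # start from an empty output file
for pair in fernLARs:LAR fernWLAR1:WLAR1 fernWLAR2:WLAR2
do file=${pair%%:*}                # part before the colon: input file name
   label=${pair##*:}               # part after the colon: clade label
   sed "s/^>/>${label}_/" "$dir/$file.fasta" >> "$dir/fernLARclades.fasta"
done

headers=$(grep '^>' "$dir/fernLARclades.fasta" | tr '\n' ' ')
echo "$headers"
rm -rf "$dir"
```

Truncating the output file first makes the cell safe to rerun, since a second run would otherwise append the same sequences again.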
Then add these:
>WLAR1_Afi_v2_s1241G000080.2
MAGGEGESKRVLVLGATGYIGKFIALAGPSLGHPTFALIRPSTIASKPDIVQSLQSAGITILQGSLDDHESLVAAFKQVDVVISAVGGAQLKDQLKVLEA
IKEAGTIKRFIPSEFGNDVDRTHSLEPAQSLFKGKIEVRRSIEDAGIPYTYVVSNGFAGYFLSNLLQEGHTSPPRDKVTIYGSGDVKAIAVHEEDIGTYT
IKAAFDPRALNKTLHIRPPANIITLNELVDKWEKKIGKTLEKITVTEEEFVKKIESMSLNPFFFSLDLFSLSFVCFLLVKHSTDVMSARCRYSISREHFS
IYFAWHCLQRGANKFRAWTQ
>WLAR1_Afi_v2_s1241G000110.6
MIHPSIHSIMHLDCDDSFIPSLHVPQSQQTQSSSMNDSADRTWESRERETDRQRGMAGGGEGESKRILILGATGYIGKFIALAGPSLGHPTFALIRPSTI
ASKPDLVQSLRSAGISILQGSLDDHESLVAAFKQVDVVISAVGEAQLKDQLKILDAIKEVGTIKRFIPSEFGSDVNHSQGLGPAQSLFKAKVEIRRSIED
AGIPYMYVVANGFAGYFLSSLLQEGHTSPPRDKVTIYGSGDVKVIAVYEEDVGTYTIKAAFDPRTLNKTLHIRPPANIVTFNELVDKWEKKIGKSLEKIT
VTEEEFVKKIEGTPFPGNLFLSLLHGIVFKGDQTNFELGPNDVEATSLYPDVKYTSVDDYLDRFV
>LAR_Azolla filiculoides LAR
MGVKSRVLIIGATGYIGKHVARASVAEGHPTSILIRPSTLTTKAELVTSFKDLGITLVEGS
LDDHAGLVAAIKEVDVVISTVGGPAIPEQEKIIAAIKEAGNVLRFLPSEFGNDVDHAK
ALEPVNTMYGKKVTIRRKIEEAGIPYTYISSNAFAGYSLSNLVQFGKPSPPRDKVTIYG
SGDAKAIFLKEEDIGLFTIKTIDDPRTLNKIVYLRPPGNILSVNEVVSLWESKIGAKLER
EYVSEEDMIVLIKTSPIPKNIVLATVHNIFVRGDQYNFEIGEKGVEASTLYPDVKYTTAS
EYLDKFV
>LAR_Azfi_s0197.g057377
MGVKSRVLIIGATGYIGKHVARASVAEGHPTSILIRPSTLTTKAELVTSFKDLGITLVEGSLDDHAGLVA
AIKEVDVSSRPSVGLPSPSKRRSLLLLKKQGMFLPSEFGNDVDHAKALEPVNTMYGKKVTIRRKIEEAGI
PYTYISSNAFAGYSLSNLVQFGKPSPPRDKVTIYGSGDAKAIFLKEEDIGLFTIKTIDDPRTLNKIVYLR
PPGNILSVNEVVSLWESKIGAKLEREYVSEEDMIVLIKTSPIPKNIVLATVHNIFVRGDQYNFEIGEKGV
EASTLYPDVKYTTASEYLDKFV
conda activate phylogenetics
linsi --thread 6 analyses/WLARs/fernLARclades.fasta > analyses/WLARs/fernLARclades_aligned-mafft-linsi.fasta
conda deactivate
mkdir ./analyses/WLARs/fernLARclades_tree
nice iqtree -s ./analyses/WLARs/fernLARclades_aligned-mafft-linsi.fasta \
-b 1000 \
-nt AUTO \
-ntmax 6 \
-pre ./analyses/WLARs/fernLARclades_tree/fernLARclades_aligned-mafft-linsi_iqtree-b1000 \
2> ./analyses/WLARs/fernLARclades_tree/fernLARclades_aligned-mafft-linsi_iqtree-b1000.stderr \
> ./analyses/WLARs/fernLARclades_tree/fernLARclades_aligned-mafft-linsi_iqtree-b1000.stdout \
-m 'JTTDCMut+I+G4' && \
cat ./analyses/WLARs/fernLARclades_tree/fernLARclades_aligned-mafft-linsi_iqtree-b1000.log | mail -s WLAR_tree laura.w.dijkhuizen@gmail.com &
tail ./analyses/WLARs/fernLARclades_tree/fernLARclades_aligned-mafft-linsi_iqtree-b1000.stdout -n 40