Musa acuminata genes were sorted into orthogroups using OrthoMCL (Li et al., 2003). Musa acuminata genes were translated using TransPipe (Barker et al., 2010). Putative paralogs from Musa (identified from syntenic blocks) were used to merge otherwise separate gene families. Orthogroups were aligned at the amino acid level, and nucleotide sequences were then aligned onto the amino acid alignments using pal2nal (Suyama et al., 2006). Orthogroup trees were then estimated using RAxML (Stamatakis, 2006) with the following parameters:

-# 500 -o EUDICOT -s $dir/$file.phylip -w $dir -n $file.raxml.out -f a -x 54781 -p 12541 -m GTRGAMMA

The resulting trees were queried using an in-house script that searches for the most recent common ancestor (MRCA) of putative paralogs. The script is available upon request from M.R. McKain.

File: renaming_index.txt
Content: File that gives the true name of the Musa genes (based on genome annotation) used in orthogroup families in the first column and the new names in the second column. New names were generated from "Musa" and a random number. These new names are used in all subsequent files.

Directory Name: Musa_WGD/Alignments
Multiple files named "XX_pairs.cleaned.fasta", where XX is the orthogroup number (randomly assigned). These are the pal2nal alignments.

Directory Name: Musa_WGD/Trees
Multiple files named "RAxML_bipartitions.XX_pairs.cleaned.fasta.raxml.out", where XX is the orthogroup number (randomly assigned). Orthogroup numbers match between alignments and trees.

File: Duplication_pairs.txt.condensed_clusters.txt
Content: File contains the names of the original orthogroups that were collapsed based on shared putative paralogs. Once collapsed, the new orthogroup took the name of the larger of the previous two orthogroups. The new group name is in the first column and ends with ":". All subsequent columns list the clusters (by cluster number) that were collapsed into the current cluster.

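As an illustration of the layout just described, a minimal Python sketch for reading Duplication_pairs.txt.condensed_clusters.txt might look like the following. It assumes the columns are separated by whitespace, which is not stated in this README. The function name parse_condensed_clusters and the sample rows are hypothetical and are not part of the deposited data or scripts.

```python
def parse_condensed_clusters(lines):
    """Map each new group name to the list of clusters collapsed into it.

    Assumes whitespace-separated columns (an assumption, not documented).
    The first column is the new group name and ends with ":".
    """
    merged = {}
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        name = fields[0].rstrip(":")  # drop the trailing ":" from the group name
        merged[name] = fields[1:]     # remaining columns are the collapsed clusters
    return merged


# Hypothetical rows, made up only to show the expected shape of the input.
example_rows = [
    "305: 18 220",
    "",
    "47: 12",
]
clusters = parse_condensed_clusters(example_rows)
print(clusters["305"])  # ['18', '220']
```

To read the real file, the same function could be given an open file handle, for example parse_condensed_clusters(open("Duplication_pairs.txt.condensed_clusters.txt")).
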
File: Duplication_pairs.txt.discovered_pairs.txt
Content: File contains the putative paralogs from Musa in columns 1 and 2. Columns 3 and 4 are the cluster groups the paralogs were found in. Ignore column 5 for the Musa data. Column 6 contains an "*" if the paralogs are in separate orthogroups.

Any questions or comments may be directed to:
Michael R. McKain, Department of Plant Biology, University of Georgia.
E-mail: mrmckain@gmail.com