TRUE indicates that the package was installed and loaded correctly.
## plyr ggplot2 tidyverse plotrix stringr pander vegan
## TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## devtools
## TRUE
This reads the file functions_themes.R and runs it, which makes the analysis functions available and defines the graphic themes.
source(here("functions_themes.R"))
Read in your data. Do not change “Pathotype.Data”; change only the name of the file. The practice data set provided is used here as an example. The function here() finds the .csv file relative to the R project, so there is no need to set the working directory or provide the full file path.
The input should be in .csv format, with any NA values encoded as blanks.
If NA values are encoded differently, change the option na = "" to match the encoding used in your file.
Pathotype.Data <- read_csv(here("Practice data set.csv"), na = "")
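If missing values are written in more than one way, the na argument of read_csv() accepts a character vector. The following is a minimal sketch, assuming a small hypothetical file in which missing values appear both as blanks and as "."; the column names other than Isolate are made up for illustration and are not taken from the practice data set.

```r
library(readr)

# Hypothetical example file: missing values appear as "." and as blanks
tmp <- tempfile(fileext = ".csv")
writeLines(c("Isolate,Rps,perc.susc",
             "1,Rps 1a,100",
             "1,Rps 1b,.",
             "2,Rps 1a,"), tmp)

# Treat both blanks and "." as NA
example_data <- read_csv(tmp, na = c("", "."))

# The second and third rows are now read as missing
is.na(example_data$perc.susc)
```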
The value in “Distribution_of_Susceptibilities(60)” (in this case, 60) sets the cutoff for susceptible reactions. With the current setting, every gene on which 60% or more of the plants were rated susceptible returns a “1” in the previous scripts (see line 30).
The function returns a list whose first element is the graphic and whose second is the table. Access either one by appending $ and the element name, Data or Graphic, to the name of the list.
Suceptibilities <- Distribution_of_Susceptibilities(60)
pander::pander(Suceptibilities$Data)
Rps | N | percent_isolates_pathogenic |
---|---|---|
Rps 1a | 21 | 100 |
Rps 1b | 15 | 71.43 |
Rps 1c | 20 | 95.24 |
Rps 1d | 16 | 76.19 |
Rps 1k | 18 | 85.71 |
Rps 2 | 14 | 66.67 |
Rps 3a | 5 | 23.81 |
Rps 3b | 20 | 95.24 |
Rps 3c | 4 | 19.05 |
Rps 4 | 5 | 23.81 |
Rps 5 | 13 | 61.9 |
Rps 6 | 11 | 52.38 |
Rps 7 | 21 | 100 |
susceptible | 21 | 100 |
Suceptibilities$Graphic
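To make the role of the cutoff concrete, here is a minimal base R sketch of the scoring rule described above. It is not the code inside Distribution_of_Susceptibilities(); the percentages are hypothetical, and the last line only checks that the percent column in the table is consistent with N divided by the 21 isolates in the practice data.

```r
# Hypothetical percentage of susceptible plants for one isolate on four genes
percent_susceptible <- c("Rps 1a" = 100, "Rps 1b" = 60, "Rps 3a" = 59, "Rps 3c" = 0)
cutoff <- 60

# 1 = susceptible reaction (at or above the cutoff), 0 = resistant reaction
reaction <- ifelse(percent_susceptible >= cutoff, 1, 0)
reaction

# Rps 1b in the table above: 15 of 21 isolates scored as pathogenic
percent_pathogenic <- round(15 / 21 * 100, 2)
percent_pathogenic  # 71.43
```

Raising the cutoff makes the scoring stricter, so fewer reactions count as susceptible.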
Again, you can change the susceptible cutoff value here to suit your dataset.
complexities <- Distribution_of_Complexities(60)
Output the frequency data
pander::pander(complexities$FrequencyData)
Frequency_of_Complexities | complexities |
---|---|
0 | 0 |
0 | 1 |
0 | 2 |
0 | 3 |
0 | 4 |
4.762 | 5 |
9.524 | 6 |
9.524 | 7 |
33.33 | 8 |
0 | 9 |
23.81 | 10 |
14.29 | 11 |
0 | 12 |
4.762 | 13 |
Output the distribution data
pander::pander(complexities$DistributionData)
Distribution_of_Complexities | complexities |
---|---|
0 | 0 |
0 | 1 |
0 | 2 |
0 | 3 |
0 | 4 |
1 | 5 |
2 | 6 |
2 | 7 |
7 | 8 |
0 | 9 |
5 | 10 |
3 | 11 |
0 | 12 |
1 | 13 |
Output the mean of the distribution
complexities$Mean
## [1] 8.714286
Output the standard deviation of the distribution
complexities$StandardDev
## [1] 2.003568
Output the standard error of the distribution
complexities$StandardErr
## [1] 0.4372144
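The summary statistics can be reproduced from the distribution table with base R. This is a sketch that assumes the standard deviation is the sample standard deviation (n - 1 denominator) and that the standard error is the standard deviation divided by the square root of the number of isolates; both assumptions agree with the values printed above.

```r
# Distribution table from above: number of isolates at each complexity
complexity <- 0:13
n_isolates <- c(0, 0, 0, 0, 0, 1, 2, 2, 7, 0, 5, 3, 0, 1)

# Expand to one complexity value per isolate (21 values)
per_isolate <- rep(complexity, times = n_isolates)

complexity_mean <- mean(per_isolate)                        # 8.714286
complexity_sd   <- sd(per_isolate)                          # 2.003568
complexity_se   <- complexity_sd / sqrt(length(per_isolate)) # 0.4372144

# Frequencies (percent of isolates), to four significant digits as in the table
signif(n_isolates / sum(n_isolates) * 100, 4)
```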
Output the frequency plot
complexities$FrequencyPlot
Output the distribution plot
complexities$DistributionPlot
path.freq <- Pathotype.frequency.dist(60)
count | Pathotype |
---|---|
1 | 1a, 1b, 1d, 1k, 2, 3a, 3b, 5, 6, 7 |
1 | 1a, 1b, 1c, 1k, 2, 3b, 3c, 4, 6, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 3b, 4, 6, 7 |
2 | 1a, 1c, 1d, 1k, 2, 3b, 5, 7 |
1 | 1a, 1c, 1d, 1k, 2, 3b, 6, 7 |
2 | 1a, 1b, 1c, 1d, 1k, 2, 3b, 7 |
1 | 1a, 1c, 1d, 3b, 5, 7 |
1 | 1a, 1c, 3b, 5, 7 |
1 | 1a, 1c, 3b, 5, 6, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 6, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 3b, 7 |
1 | 1a, 1b, 1c, 1k, 3b, 5, 6, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 3b, 4, 5, 6, 7 |
1 | 1a, 1b, 1c, 1k, 3b, 5, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 3b, 4, 5, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 3a, 3b, 5, 6, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 3a, 3b, 3c, 6, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 3a, 3b, 3c, 5, 7 |
1 | 1a, 1b, 1c, 1d, 1k, 2, 3a, 3b, 3c, 4, 5, 6, 7 |
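As an illustration of what the count column represents, the sketch below tallies identical pathotype strings with base R. It is not the code inside Pathotype.frequency.dist(); the four isolates and their pathotypes are hypothetical.

```r
# Hypothetical pathotype (genes with a susceptible reaction) for four isolates
isolate_pathotype <- c(iso1 = "1a, 1c, 3b, 5, 7",
                       iso2 = "1a, 7",
                       iso3 = "1a, 1c, 3b, 5, 7",
                       iso4 = "1a, 1b, 1k, 7")

# Count how many isolates share each pathotype
pathotype_counts <- as.data.frame(table(Pathotype = isolate_pathotype),
                                  responseName = "count")
pathotype_counts
```

Here three unique pathotypes are found among four isolates, and one of them is shared by two isolates.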
Diversity indices used to investigate pathotype diversity within and between states are shown below. Version 1 of this document shows only code for analyzing a single state at a time. In the future, scripts could be produced so that multiple states can be analyzed at once, independently of each other. Until then, when analyzing pathotype data from multiple states, each state must be analyzed from its own .csv file.
Determine the number of isolates within the data
# unique() gives the same count whether Isolate was read in as a character or a factor column
Number_of_isolates <- length(unique(Pathotype.Data$Isolate))
Number_of_isolates
## [1] 21
Determine the number of unique pathotypes for this analysis
Number_of_pathotypes <- specnumber(path.freq$count)
Number_of_pathotypes
## [1] 19
Simple diversity shows the proportion of unique pathotypes to total isolates. As the value gets closer to 1, there is greater diversity in pathotypes within the population.
Simple <- Number_of_pathotypes/ Number_of_isolates
Simple
## [1] 0.9047619
The Gleason index is an alternate version of the simple diversity index. It is less sensitive to sample size than the simple index.
Gleason <- (Number_of_pathotypes - 1)/log(Number_of_isolates)
Gleason
## [1] 5.912257
The Shannon diversity index is typically between 1.5 and 3.5. As richness and evenness of the population increase, so does the Shannon index value.
Shannon <- diversity(path.freq[-1], index="shannon")
Shannon
## [1] 2.912494
Simpson diversity index values range from 0 to 1, where 1 represents high diversity and 0 represents no diversity.
Simpson <- diversity(path.freq[-1], index="simpson")
Simpson
## [1] 0.9433107
Evenness ranges from 0 to 1. As the evenness value approaches 1, the frequencies of the pathotypes within the population are more evenly distributed.
Evenness <- Shannon/ log(Number_of_pathotypes)
Evenness
## [1] 0.9891509
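For readers who want to see what the index functions compute, the sketch below reproduces the Shannon, Simpson, and evenness values in base R from the pathotype counts listed earlier (17 pathotypes found once and 2 found twice). It assumes the natural logarithm for Shannon and the 1 - sum(p^2) form for Simpson, which are the defaults of vegan's diversity().

```r
# Pathotype counts from the frequency table: 17 singletons and 2 pathotypes seen twice
counts <- c(rep(1, 17), 2, 2)

# Proportion of isolates belonging to each pathotype
p <- counts / sum(counts)

shannon_by_hand  <- -sum(p * log(p))                       # 2.912494
simpson_by_hand  <- 1 - sum(p^2)                           # 0.9433107
evenness_by_hand <- shannon_by_hand / log(length(counts))  # 0.9891509
```

Because almost every isolate has its own pathotype, the proportions are nearly equal, which is why evenness is close to 1.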