This function uses techniques from the Bioconductor Workflow to preprocess phyloseq data for downstream analysis.
preprocess_phyloseq(phyloseq_object, process_list = NULL, ...)
| phyloseq_object | A phyloseq object |
|---|---|
| process_list | Controls how the phyloseq object is processed. It can be one of three values: |
| ... | The dot parameters can be any combination of the default processing keywords. See the "Keywords for Processing" section below for more details |
Returns a phyloseq object that has undergone the specified processing strategies.
DETAILS
master_thresh - DEFAULT: 1e-5. Filters out any taxa that do not meet the mean threshold.
taxon_filter - DEFAULT: list("Phylum"= list("min_a"=5, "r_s_p"=0.5), "Class"=list("min_a"=3, "r_s_p"=0.3)). Filters out OTUs that do not appear more than a certain number of times (min_a) in a certain proportion of samples (r_s_p) at the specified agglomerated rank.
prevalence_filter - DEFAULT: list("min_a"=5, "r_s_p"=0.5). Filters out OTUs that do not appear more than a certain number of times (min_a) in a certain proportion of samples (r_s_p).
glom_rank - DEFAULT: NULL. Agglomerates the data at the specified rank.
ambiguous - DEFAULT: list(amb_ranks = c("Phylum", "Class", "Order", "Family", "Genus"), amb_items = c(NA, "", "uncharacterized", "uncultured", "Unassigned", "Ambiguous", "Ambiguous_taxa")). Removes OTUs that are labeled with any of the specified ambiguous items. This is done for each specified rank.
coeff_of_variation - DEFAULT: 0.55. Standardizes abundances to the median sequencing depth.
trans_function - DEFAULT: function(x)x / sum(x). Transforms the abundance values to relative abundance values.
merge_samp - DEFAULT: NULL.
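As a rough illustration of two of the steps described above, the following base R sketch applies a `min_a` / `r_s_p` style prevalence filter and the default relative-abundance transformation to a toy count matrix. It does not use phyloseq and is not the package's implementation: the `prevalence_keep` helper and the toy data are hypothetical, and the exact comparison rules (strictly more than `min_a`, in at least a fraction `r_s_p` of the samples) are an assumption based on the wording above.

```r
# Toy count matrix: rows are taxa (OTUs), columns are samples.
counts <- matrix(
  c(10, 8,  0, 7,
     1, 0,  2, 0,
     6, 9, 12, 3,
     0, 0, 40, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:4))
)

# Hypothetical helper (not part of the package): flag taxa whose count
# exceeds min_a in at least a fraction r_s_p of the samples.
# Assumption: "more than" is read as a strict inequality.
prevalence_keep <- function(mat, min_a = 5, r_s_p = 0.5) {
  rowSums(mat > min_a) >= r_s_p * ncol(mat)
}

keep <- prevalence_keep(counts, min_a = 5, r_s_p = 0.5)
filtered <- counts[keep, , drop = FALSE]

# Same form as the default transformation function: each sample (column)
# is divided by its total so that its values sum to 1.
rel_abund <- apply(filtered, 2, function(x) x / sum(x))
```

In this toy data, OTU2 never exceeds the count threshold and OTU4 exceeds it in only one of four samples, so both are dropped before the transformation is applied.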
prune_taxa, taxa_sums, filter_taxa, tax_glom, nsamples,
filterfun_sample, genefilter_sample, get_taxa_unique, transform_sample_counts,
merge_samples, subset_taxa
yaml.load
Bioconductor Workflow - https://f1000research.com/articles/5-1492/v2
Phyloseq Website - https://joey711.github.io/phyloseq/index.html
Other Filters: remove_ambiguous_taxa
# NOT RUN {
> phy_obj <- get_phyloseq_object(...)
> phy_obj
phyloseq-class experiment-level object
otu_table() OTU Table: [ 1955 taxa and 48 samples ]
sample_data() Sample Data: [ 48 samples by 8 sample variables ]
tax_table() Taxonomy Table: [ 1955 taxa by 7 taxonomic ranks ]
phy_tree() Phylogenetic Tree: [ 1955 tips and 1954 internal nodes ]
> pp_phy_obj <- preprocess_phyloseq(phy_obj, master_thresh = 1e-5,
    taxon_filter = list("Phylum" = list("min_a" = 5, "r_s_p" = 0.5),
                        "Class" = list("min_a" = 3, "r_s_p" = 0.3)),
    prevalence_filter = list("min_a" = 5, "r_s_p" = 0.5),
    glom_rank = NULL,
    ambiguous = list(amb_ranks = c("Phylum", "Class", "Order", "Family", "Genus"),
                     amb_items = c(NA, "", "uncharacterized", "uncultured", "Unassigned", "Ambiguous", "Ambiguous_taxa")),
    coeff_of_variation = 0.55,
    trans_function = function(x) {x / sum(x)},
    merge_samp = NULL)
> pp_phy_obj
# }