## Command line for QIIME2 ##

To import the fastq files, the recommended format is Casava 1.8 paired-end demultiplexed fastq. In Casava 1.8 demultiplexed (paired-end) format, there are two fastq.gz files for each sample in the study, each containing the forward or reverse reads for that sample. The file name includes the sample identifier. The forward and reverse read file names for a single sample will look like Solveig-1_S2_L001_R1_001.fastq.gz and Solveig-1_S2_L001_R2_001.fastq.gz, respectively.

Fastq files are available from DRYAD: https://datadryad.org/handle/10255/dryad.150698

Unzip COPD_BERGEN_Fastq175_part1 and COPD_BERGEN_Fastq175_part2, and move all subfolders to a new folder.

QIIME2 tutorial for importing data: https://docs.qiime2.org/2019.1/tutorials/importing/

Metadata file giving sampleID, sample type and RUN, used to import and batch samples for DADA2: COPD_BERGEN_stvsex_Metadata.txt (Sample type: Induced only=108 samples. Variable to batch in DADA2: run)

DADA2 denoising, run separately for each sequencing run:

    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs COPDQ2_IlluminafilesR7.qza \
      --p-trunc-len-f 300 \
      --p-trunc-len-r 225 \
      --p-trim-left-f 17 \
      --p-trim-left-r 21 \
      --p-chimera-method pooled \
      --output-dir COPD_DenoisepairedR7_pooled \
      --p-n-threads 7 \
      --verbose

    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs COPDQ2_IlluminafilesR8.qza \
      --p-trunc-len-f 288 \
      --p-trunc-len-r 222 \
      --p-trim-left-f 17 \
      --p-trim-left-r 21 \
      --p-chimera-method pooled \
      --output-dir COPD_DenoisepairedR8_pooled \
      --p-n-threads 7 \
      --verbose

    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs COPDQ2_IlluminafilesR9.qza \
      --p-trunc-len-f 288 \
      --p-trunc-len-r 222 \
      --p-trim-left-f 17 \
      --p-trim-left-r 21 \
      --p-chimera-method pooled \
      --output-dir COPD_DenoisepairedR9_pooled \
      --p-n-threads 7 \
      --verbose

    qiime dada2 denoise-paired \
      --i-demultiplexed-seqs COPDQ2_IlluminafilesR2.qza \
      --p-trunc-len-f 288 \
      --p-trunc-len-r 222 \
      --p-trim-left-f 17 \
      --p-trim-left-r 21 \
      --p-chimera-method pooled \
      --output-dir COPD_DenoisepairedR2_pooled \
      --p-n-threads 7 \
      --verbose

Merging feature tables:

    qiime feature-table merge \
      --i-tables COPD_DenoisepairedR2_pooled/table.qza \
      --i-tables COPD_DenoisepairedR7_pooled/table.qza \
      --i-tables COPD_DenoisepairedR8_pooled/table.qza \
      --i-tables COPD_DenoisepairedR9_pooled/table.qza \
      --o-merged-table COPDpooled_ASVtable108.qza

Merging representative sequences:

qiime feature-table merge-seqs - outputs: COPDpooled_RepSeq108.qza

Chimera removal step 2:

    qiime vsearch uchime-denovo \
      --i-table COPDpooled_ASVtable108.qza \
      --i-sequences COPDpooled_RepSeq108.qza \
      --output-dir COPDpooled_vsearch108

qiime feature-table filter-features - outputs: COPDpooled_NochimeraASV108.qza

qiime feature-table filter-seqs - outputs: COPDpooled_NochimeraRepSeq108.qza

Filtering 4 samples due to low quality and sequence counts. Remove pairs of samples: Solveig61+62 and Solveig 151+152 (this does not reduce the number of sample pairs in the final selection). Output: COPDpooled_ASV104Nochimera.qza

For taxonomic classification: SILVA release 128 https://www.arb-silva.de/documentation/release-128/

Files:

    SILVA_128_QIIME_release/rep_set/rep_set_16S_only/99/99_otus_16S.fasta
    SILVA_128_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt

    qiime tools import \
      --type 'FeatureData[Sequence]' \
      --input-path 99_otus_16S.fasta \
      --output-path Silva_99_otus_16S.qza

    qiime tools import \
      --type 'FeatureData[Taxonomy]' \
      --source-format HeaderlessTSVTaxonomyFormat \
      --input-path consensus_taxonomy_7_levels.txt \
      --output-path Silva_ref-taxonomy.qza

    qiime feature-classifier extract-reads \
      --i-sequences Silva_99_otus_16S.qza \
      --p-f-primer CCTACGGGNGGCWGCAG \
      --p-r-primer GACTACHVGGGTATCTAATCC \
      --o-reads Silva_99_ref-seqs.qza

    qiime feature-classifier fit-classifier-naive-bayes \
      --i-reference-reads Silva_99_ref-seqs.qza \
      --i-reference-taxonomy Silva_ref-taxonomy.qza \
      --o-classifier Silva_99_classifier.qza

qiime feature-classifier classify-sklearn - outputs: TaxonomyPooled.qza

Decontam run in R due to lack of negative controls. Exported from qiime2 for import in R:

    qiime tools export COPDpooled_ASV104Nochimera.qza --output-dir Exported_R

Output: feature-table.biom

    qiime tools export TaxonomyPooled.qza --output-dir Exported_R

Output: taxonomy.tsv

Make a copy, Pooledbiom-taxonomy.tsv, and modify it as explained: https://forum.qiime2.org/t/is-there-any-way-to-summarize-taxa-plot-by-category/446/2?u=jairideout

    biom add-metadata \
      -i feature-table.biom \
      -o COPDpooled_table_wtaxonomy.biom \
      --observation-metadata-fp Pooledbiom-taxonomy.tsv \
      --sc-separated taxonomy

    biom add-metadata \
      -i COPDpooled_table_wtaxonomy.biom \
      -o COPDpooled_ASVwTaxandMet.biom \
      --sample-metadata-fp COPD_BERGEN_stvsex_Metadata.txt

Import to R and run Decontam threshold=0.2 on PicoGreen measurements (Metadatafile: decontambatch_1q_2p)=2.

qiime feature-table filter-features - outputs: PooledASV104_nochimnocont.qza

Filter ASVs lacking taxonomic assignment (Unassigned+D_0__Bacteria): use Pooledbiom-taxonomy.tsv to obtain the ASV ids and store them in Pooled_unidentifiedASVid.txt:

    qiime feature-table filter-features \
      --i-table PooledASV104_nochimnocont.qza \
      --m-metadata-file Pooled_unidentifiedASVid.txt \
      --p-exclude-ids \
      --o-filtered-table PooledASV104_clean.qza

Phylogenetic tree by the following commands:

    qiime alignment mafft --i-sequences {input:COPDpooled_RepSeq108.qza}
    qiime alignment mask
    qiime phylogeny fasttree
    qiime phylogeny midpoint-root

Final output: COPDpooled_RootedTree.qza

Select samples for "Sputum microbiota at stable state and during exacerbations in a cohort of COPD patients".

Filter small and rare ASVs:

    qiime feature-table filter-features \
      --i-table PooledASV104_clean.qza \
      --p-min-frequency 10 \
      --p-min-samples 5 \
      --o-filtered-table PooledASV104_filter5_10.qza

    qiime feature-table filter-samples \
      --i-table PooledASV104_filter5_10.qza \
      --m-metadata-file COPD_BERGEN_stvsex_Metadata.txt \
      --p-where "PickPair='1'" \
      --o-filtered-table COPD_BERGEN_stvsex_ASV.qza

And for the case study of 13 sputum samples from one individual:

    qiime feature-table filter-samples \
      --i-table PooledASV104_filter5_10.qza \
      --m-metadata-file COPD_BERGEN_stvsex_Metadata.txt \
      --p-where "PickCase='1'" \
      --o-filtered-table COPD_BERGEN_Case_ASV.qza

The datasets are now ready for analyses.
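
The step that collects the ids of ASVs lacking taxonomic assignment into Pooled_unidentifiedASVid.txt is not spelled out above. A minimal sketch of one way it could be done with awk is shown here. It is not necessarily how the original file was produced. It assumes a tab-separated taxonomy file with one header line, the ASV id in column 1 and the taxonomy string in column 2, and it treats "lacking assignment" as a taxonomy string that is exactly `Unassigned` or exactly `D_0__Bacteria`. The function name `list_unidentified_asvs`, the demo ids and the demo file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: list ASV ids whose taxonomy is "Unassigned" or just "D_0__Bacteria".
set -eu

# Prints a one-column id list for the taxonomy file given as $1.
# Assumptions: tab-separated input, one header line, id in column 1,
# taxonomy string in column 2. The "feature-id" header is written because
# QIIME 2 metadata files need an id column header.
list_unidentified_asvs() {
  awk -F'\t' '
    BEGIN { print "feature-id" }
    NR > 1 && ($2 == "Unassigned" || $2 == "D_0__Bacteria") { print $1 }
  ' "$1"
}

# Demo on a tiny made-up table (ids and values are illustrative, not real data).
demo_dir=$(mktemp -d)
printf '%s\t%s\t%s\n' \
  '#OTUID' 'taxonomy' 'confidence' \
  'asv1' 'D_0__Bacteria;D_1__Firmicutes' '0.99' \
  'asv2' 'Unassigned' '0.45' \
  'asv3' 'D_0__Bacteria' '0.80' > "$demo_dir/taxonomy.tsv"

list_unidentified_asvs "$demo_dir/taxonomy.tsv" > "$demo_dir/unidentified_ids.txt"
cat "$demo_dir/unidentified_ids.txt"
```

On the real data, the function would be pointed at Pooledbiom-taxonomy.tsv and its output redirected to Pooled_unidentifiedASVid.txt, which is then passed to `--m-metadata-file` together with `--p-exclude-ids` in the filter-features command. If the taxonomy strings carry trailing whitespace or a different rank prefix, the exact-match test would need adjusting.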