        Note that additional data was saved in multiqc_danRer10_data when this report was generated.



        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411


        About MultiQC

        This report was generated using MultiQC, version 1.14

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/ewels/MultiQC


        These samples were run by seq2science v1.1.0, a tool for easy preprocessing of NGS data.

        Take a look at our docs for info about how to use this report to the fullest.

        Workflow: alignment
        Date: September 18, 2023
        Project: alignment
        Contact E-mail: yourmail@here.com

        Report generated on 2023-09-18, 10:22 CEST based on data in:


        General Statistics

        Showing 21/21 rows and 14/29 columns.
        | Sample Name | % Duplication | M Reads After Filtering | GC content | % PF | % Adapter | Insert Size | % Dups | % Mapped | M Total seqs | % Proper Pairs | M Total seqs | Genome coverage | M Genome reads | M MT genome reads |
        |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
        | GSM3934881 | 38.5% | 59.2 | 44.1% | 100.0% | 0.2% | 219 bp | 44.5% | 98.8% | 59.2 | 80.0% | 55.4 | 2.4 X | 55.4 | 3.5 |
        | GSM3934891 | 45.0% | 65.3 | 46.5% | 100.0% | 0.1% | 279 bp | 53.3% | 99.2% | 65.3 | 94.5% | 61.5 | 2.7 X | 62.9 | 2.8 |
        | GSM3934893 | 49.0% | 65.5 | 46.6% | 100.0% | 0.1% | 304 bp | 58.3% | 99.3% | 65.5 | 94.2% | 59.1 | 2.4 X | 55.4 | 10.6 |
        | GSM3934905 | 27.6% | 18.5 | 43.7% | 43.7% | 0.4% | 114 bp | 12.1% | 97.0% | 18.5 | 98.0% | 15.5 | 0.7 X | 17.9 | 0.0 |
        | GSM3934914 | 7.9% | 32.7 | 42.8% | 92.4% | 6.0% | 103 bp | 7.4% | 96.1% | 32.7 | 98.2% | 25.1 | 1.4 X | 31.5 | 0.0 |
        | GSM3934916 | 13.5% | 32.2 | 41.8% | 84.5% | 2.8% | 109 bp | 11.2% | 91.2% | 32.2 | 96.6% | 24.0 | 1.1 X | 29.3 | 0.0 |
        | GSM3934928 | 3.4% | 51.7 | 46.1% | 100.0% | 0.3% | 107 bp | 4.4% | 98.4% | 51.7 | 98.9% | 45.2 | 1.8 X | 50.8 | 0.0 |
        | GSM3934938 | 3.1% | 29.1 | 42.8% | 94.6% | 4.9% | 105 bp | 3.8% | 98.3% | 29.1 | 98.0% | 24.4 | 1.3 X | 28.7 | 0.0 |
        | GSM3934940 | 13.5% | 53.1 | 43.6% | 88.7% | 3.3% | 107 bp | 12.4% | 92.0% | 53.1 | 97.5% | 39.3 | 1.8 X | 48.8 | 0.0 |
        | GSM4661979 | 3.1% | 30.2 | 45.7% | 94.4% | 34.7% | 213 bp | 10.7% | 99.2% | 30.2 | 98.4% | 27.4 | 2.8 X | 29.9 | 0.5 |
        | GSM4661990 | 5.2% | 45.7 | 41.1% | 95.1% | 39.1% | 200 bp | 15.7% | 99.4% | 45.7 | 98.3% | 39.6 | 3.6 X | 40.3 | 5.8 |
        | GSM4661992 | 17.8% | 81.0 | 41.7% | 99.3% | 25.8% | 179 bp | 27.5% | 99.7% | 81.0 | 92.8% | 70.1 | 8.1 X | 80.8 | 0.7 |
        | GSM4662006 | 10.3% | 75.7 | 40.0% | 98.2% | 26.5% | 183 bp | 13.8% | 99.8% | 75.7 | 97.4% | 62.3 | 8.0 X | 77.4 | 0.0 |
        | GSM4662016 | 11.6% | 76.0 | 41.1% | 98.8% | 41.7% | 155 bp | 16.1% | 99.8% | 76.0 | 97.5% | 60.2 | 7.8 X | 77.5 | 0.0 |
        | GSM4662018 | 9.6% | 78.6 | 39.3% | 98.6% | 27.1% | 185 bp | 12.7% | 99.8% | 78.6 | 98.5% | 64.0 | 8.3 X | 80.2 | 0.0 |
        | GSM4662028 | 8.8% | 74.3 | 42.9% | 98.6% | 30.3% | 179 bp | 12.4% | 99.8% | 74.3 | 98.2% | 52.5 | 7.7 X | 75.9 | 0.0 |
        | GSM4662038 | 9.1% | 68.6 | 43.8% | 98.3% | 26.0% | 190 bp | 13.7% | 99.8% | 68.6 | 98.2% | 47.4 | 7.2 X | 70.0 | 0.0 |
        | GSM4662040 | 8.9% | 75.5 | 43.1% | 98.7% | 28.0% | 185 bp | 12.9% | 99.8% | 75.5 | 98.4% | 54.3 | 7.9 X | 77.1 | 0.0 |
        | GSM4662071 | 10.9% | 796.7 | 21.5% | 99.3% | 9.3% | 187 bp | 14.9% | 88.2% | 796.7 | 95.9% | 347.9 | 73.4 X | 716.2 | 0.7 |
        | GSM4662076 | 11.6% | 763.6 | 22.3% | 99.2% | 8.5% | 192 bp | 16.8% | 88.7% | 763.6 | 95.9% | 335.6 | 73.8 X | 690.6 | 1.7 |
        | GSM4662077 | 13.8% | 862.6 | 22.2% | 99.2% | 7.4% | 201 bp | 20.3% | 87.3% | 862.6 | 95.6% | 373.3 | 82.1 X | 767.1 | 2.9 |

        Workflow explanation

        Preprocessing of reads was done automatically by seq2science v1.1.0 using the alignment workflow. Genome assembly danRer10 was downloaded with genomepy 0.16.1. Paired-end reads were trimmed with fastp v0.23.2 with default options. Reads were aligned with bwa-mem2 v2.2.1 with options '-M'. Afterwards, duplicate reads were marked with Picard MarkDuplicates v3.0.0. General alignment statistics were collected by samtools stats v1.16. deepTools v3.5.1 was used for the fingerprint, profile, correlation and dendrogram/heatmap plots, where the heatmap was made with options '--distanceBetweenBins 9000 --binSize 1000'. The UCSC genome browser was used to visualize and inspect the alignments. Quality control metrics were aggregated by MultiQC v1.14.
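The heatmap options quoted above can be pictured with a small sketch. It assumes that '--distanceBetweenBins' is the gap between the end of one bin and the start of the next, so 1000 bp bins separated by 9000 bp sample roughly one tenth of each chromosome. The function name `sampled_bins` and the toy chromosome length are illustrative only and are not part of deepTools or seq2science.

```python
def sampled_bins(chrom_length, bin_size=1000, distance_between_bins=9000):
    """Return (start, end) intervals for bins of `bin_size` bp that are
    separated by `distance_between_bins` bp (assumed gap semantics)."""
    step = bin_size + distance_between_bins  # distance between bin starts
    return [
        (start, min(start + bin_size, chrom_length))  # clip the last bin
        for start in range(0, chrom_length, step)
    ]


# Toy chromosome of 25 kb: three bins, one starting every 10 kb.
bins = sampled_bins(25_000)
covered = sum(end - start for start, end in bins)
print(bins)
print(f"fraction of chromosome sampled: {covered / 25_000:.2f}")
```

Under this assumption, widening the gap reduces the number of bins that enter the correlation, which trades resolution for speed.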

        Assembly stats

        Genome assembly danRer10 consists of 1061 contigs, has a GC-content of 36.64%, and 0.15% of it consists of the letter N. The N50 is 53345113 bp with an L50 of 12, and the N75 is 47771147 bp with an L75 of 18. The genome annotation contains 32762 genes.
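For readers unfamiliar with the N50/L50 figures above, here is a minimal sketch of the usual definition: sort contigs from longest to shortest and walk down the list until the running total reaches the chosen fraction of the assembly. The length of the contig reached is Nx and the number of contigs used is Lx. The function name `nx_lx` and the toy contig lengths are made up for illustration, and this is not the code that produced the numbers in this report.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for contig lengths; fraction=0.5 gives N50/L50."""
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("lengths must be a non-empty list of positive numbers")


# Toy assembly with a total length of 300.
toy = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy, 0.5)
n75, l75 = nx_lx(toy, 0.75)
print(n50, l50)
print(n75, l75)
```

Read this way, the reported values say that the 12 longest danRer10 contigs, each at least 53345113 bp long, together cover half of the assembly.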

        fastp

        fastp: An ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...). DOI: 10.1093/bioinformatics/bty560.

        Filtered Reads

        Filtering statistics of sampled reads.

        [Plot: Fastp: Filtered Reads. One bar per sample; axis: # Reads; categories: Passed Filter, Low Quality, Too Many N, Too Short, Too Long.]

        Insert Sizes

        Insert size estimation of sampled reads.

        [Plot: Fastp: Insert Size Distribution. x-axis: Insert size; y-axis: Read percent.]

        Sequence Quality

        Average sequencing quality over each base of all reads.

        [Plot: Fastp: Sequence Quality (R1 Before filtering). x-axis: Read Position; y-axis: Sequence Quality.]

        GC Content

        Average GC content over each base of all reads.

        [Plot: Fastp: Read GC Content (R1 Before filtering). x-axis: Read Position; y-axis: Base Content Percent.]

        N content

        Average N content over each base of all reads.

        [Plot: Fastp: Read N Content (R1 Before filtering). x-axis: Read Position; y-axis: Base Content Percent.]

        Picard

        Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

        Insert Size

        Plot shows the number of reads at a given insert size. Reads with different orientations are summed.

        [Plot: Picard: Insert Size. x-axis: Insert Size (bp); y-axis: Count.]

        Mark Duplicates

        Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.

        The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.

        To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:

        • READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
        • READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
        • READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
        • READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
        • READS_UNMAPPED = UNMAPPED_READS
        [Plot: Picard: Deduplication Stats. One bar per sample; axis: # Reads; categories: Unique Pairs, Unique Unpaired, Duplicate Pairs Nonoptical, Duplicate Unpaired, Unmapped.]
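The doubling scheme listed above can be written out as a short function. This is a sketch that mirrors the bullet list, not MultiQC's own source code. The function name `picard_read_categories` is hypothetical and the metric values in the example are invented purely to show that the categories add up.

```python
def picard_read_categories(m):
    """Convert Picard MarkDuplicates metrics (dict) into per-read counts,
    doubling every value that Picard reports per read pair."""
    dup_pairs = 2 * m["READ_PAIR_DUPLICATES"]
    optical = 2 * m["READ_PAIR_OPTICAL_DUPLICATES"]
    return {
        "READS_IN_UNIQUE_PAIRS": 2 * (m["READ_PAIRS_EXAMINED"] - m["READ_PAIR_DUPLICATES"]),
        "READS_IN_UNIQUE_UNPAIRED": m["UNPAIRED_READS_EXAMINED"] - m["UNPAIRED_READ_DUPLICATES"],
        "READS_IN_DUPLICATE_PAIRS_OPTICAL": optical,
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL": dup_pairs - optical,
        "READS_IN_DUPLICATE_UNPAIRED": m["UNPAIRED_READ_DUPLICATES"],
        "READS_UNMAPPED": m["UNMAPPED_READS"],
    }


# Invented metrics for illustration only (not taken from this report).
example = {
    "READ_PAIRS_EXAMINED": 1000,
    "READ_PAIR_DUPLICATES": 100,
    "READ_PAIR_OPTICAL_DUPLICATES": 10,
    "UNPAIRED_READS_EXAMINED": 50,
    "UNPAIRED_READ_DUPLICATES": 5,
    "UNMAPPED_READS": 20,
}
categories = picard_read_categories(example)
total_reads = 2 * example["READ_PAIRS_EXAMINED"] + example["UNPAIRED_READS_EXAMINED"] + example["UNMAPPED_READS"]
print(categories)
print(sum(categories.values()) == total_reads)
```

The final check shows why the doubling is needed: only after converting pairs to reads do the stacked categories sum to the total number of reads.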

        SamTools pre-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data. DOI: 10.1093/bioinformatics/btp352.

        The pre-sieve statistics are quality metrics measured before applying (optional) minimum mapping quality, blacklist removal, mitochondrial read removal, read length filtering, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).

        [Plot: Samtools stats: Alignment Scores. One bar per sample; axis: # Reads; categories: Mapped, Unmapped.]

        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.

        [Plots: one panel per metric: Total sequences, Mapped & paired, Properly paired, Duplicated, QC Failed, Reads MQ0, Mapped bases (CIGAR), Bases Trimmed, Duplicated bases, Diff chromosomes, Other orientation, Inward pairs, Outward pairs.]

        SamTools post-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data. DOI: 10.1093/bioinformatics/btp352.

        The post-sieve statistics are quality metrics measured after applying (optional) minimum mapping quality, blacklist removal, mitochondrial read removal, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).

        [Plot: Samtools stats: Alignment Scores. One bar per sample; axis: # Reads; category: Mapped.]

        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.

        [Plots: one panel per metric: Total sequences, Mapped & paired, Properly paired, Duplicated, QC Failed, Reads MQ0, Mapped bases (CIGAR), Bases Trimmed, Duplicated bases, Diff chromosomes, Other orientation, Inward pairs, Outward pairs.]

        deepTools

        deepTools is a suite of tools to process and analyze deep sequencing data. DOI: 10.1093/nar/gkw257.

        PCA plot

        PCA plot with the top two principal components calculated based on genome-wide distribution of sequence reads

        [Plot: deeptools: PCA Plot. x-axis: PC1; y-axis: PC2.]

        Fingerprint plot

        Signal fingerprint according to plotFingerprint

        [Plot: deepTools: Fingerprint plot. x-axis: rank; y-axis: Fraction w.r.t. bin with highest coverage.]

        deepTools - Spearman correlation heatmap of reads in bins across the genome

        Spearman correlation plot generated by deepTools. Spearman correlation is a non-parametric (distribution-free) method that assesses how monotonic the relationship is.


        deepTools - Pearson correlation heatmap of reads in bins across the genome

        Pearson correlation plot generated by deepTools. Pearson correlation is a parametric method (it assumes, for example, normality and homoscedasticity) that assesses how linear the relationship is.
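The difference between the two coefficients is easy to see on a toy example. The sketch below is plain Python and not the deepTools implementation. It computes Pearson correlation directly and Spearman correlation as the Pearson correlation of average ranks. The helper names and the two toy vectors are illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (not constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy bin counts: b grows monotonically with a, but not linearly.
a = [1, 2, 3, 4, 5]
b = [1, 4, 9, 16, 25]
print(f"Pearson:  {pearson(a, b):.3f}")
print(f"Spearman: {spearman(a, b):.3f}")
```

Because the relationship is perfectly monotonic, the Spearman coefficient is 1, while the Pearson coefficient falls slightly below 1 because the relationship is not a straight line.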


        Samples & Config

        The samples file used for this run:

        sample assembly assay tissue descriptive_name colors
        GSM3934905 danRer10 H3K27ac Brain H3K27ac_Brain (0.625,_0.7529411764705882,_1.0)
        GSM3934916 danRer10 H3K27ac Muscle H3K27ac_Muscle (0.625,_0.7529411764705882,_1.0)
        GSM3934914 danRer10 H3K27ac Liver H3K27ac_Liver (0.625,_0.7529411764705882,_1.0)
        GSM3934928 danRer10 H3K4me3 Brain H3K4me3_Brain (0.9742547425474255,_0.9647058823529412,_1.0)
        GSM3934940 danRer10 H3K4me3 Muscle H3K4me3_Muscle (0.9742547425474255,_0.9647058823529412,_1.0)
        GSM3934938 danRer10 H3K4me3 Liver H3K4me3_Liver (0.9742547425474255,_0.9647058823529412,_1.0)
        GSM4661979 danRer10 ATACseq Brain ATACseq_Brain (0.8333333333333334,_0.7490196078431373,_1.0)
        GSM4661992 danRer10 ATACseq Muscle ATACseq_Muscle (0.8333333333333334,_0.7490196078431373,_1.0)
        GSM4661990 danRer10 ATACseq Liver ATACseq_Liver (0.8333333333333334,_0.7490196078431373,_1.0)
        GSM4662028 danRer10 H3K9me3 Brain H3K9me3_Brain (0.09424083769633508,_0.7609561752988049,_0.984313725490196)
        GSM4662040 danRer10 H3K9me3 Muscle H3K9me3_Muscle (0.09424083769633508,_0.7609561752988049,_0.984313725490196)
        GSM4662038 danRer10 H3K9me3 Liver H3K9me3_Liver (0.09424083769633508,_0.7609561752988049,_0.984313725490196)
        GSM4662006 danRer10 H3K9me2 Brain H3K9me2_Brain (0.638888888888889,_0.04511278195488727,_0.5215686274509804)
        GSM4662018 danRer10 H3K9me2 Muscle H3K9me2_Muscle (0.638888888888889,_0.04511278195488727,_0.5215686274509804)
        GSM4662016 danRer10 H3K9me2 Liver H3K9me2_Liver (0.638888888888889,_0.04511278195488727,_0.5215686274509804)
        GSM4662071 danRer10 WGBS Brain WGBS_Brain (0.6510416666666666,_0.2711864406779661,_0.9254901960784314)
        GSM4662077 danRer10 WGBS Muscle WGBS_Muscle (0.6510416666666666,_0.2711864406779661,_0.9254901960784314)
        GSM4662076 danRer10 WGBS Liver WGBS_Liver (0.6510416666666666,_0.2711864406779661,_0.9254901960784314)
        GSM3934881 danRer10 RNA Brain RNA_Brain (0.0,_0.0,_0.0)
        GSM3934893 danRer10 RNA Muscle RNA_Muscle (0.0,_0.0,_0.0)
        GSM3934891 danRer10 RNA Liver RNA_Liver (0.0,_0.0,_0.0)

        The config file used for this run:
        # tab-separated file of the samples
        samples: zebrafish_alignment_samples.tsv
        
        # pipeline file locations
        result_dir: ./results  # where to store results
        genome_dir: ./genomes  # where to look for or download the genomes
        # fastq_dir: ./results/fastq  # where to look for or download the fastqs
        
        
        # contact info for multiqc report and trackhub
        email: yourmail@here.com
        
        # produce a UCSC trackhub?
        create_trackhub: true
        
        # how to handle replicates
        technical_replicates: merge    # change to "keep" to not combine them
        
        # which trimmer to use
        trimmer: fastp
        
        # which aligner to use
        aligner: bwa-mem2
        
        # how to sort bam
        bam_sorter:
          samtools:
            coordinate
        
        # filtering after alignment
        remove_blacklist: true
        only_primary_align: true
        min_mapping_quality: 30
        remove_dups: false
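
The filtering settings at the end of this config can be read as a per-alignment decision. The sketch below expresses that decision as a predicate over the SAM FLAG and MAPQ fields. It is an assumption about what the settings mean, not seq2science's actual implementation: "primary" is taken to mean that neither the secondary (0x100) nor the supplementary (0x800) flag bit is set, following the SAM specification, and blacklist removal is not modelled. The function name `keep_alignment` is hypothetical.

```python
SECONDARY = 0x100      # SAM flag bit: secondary alignment
DUPLICATE = 0x400      # SAM flag bit: PCR or optical duplicate
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def keep_alignment(flag, mapq, min_mapping_quality=30,
                   only_primary_align=True, remove_dups=False):
    """Return True if an alignment passes the (assumed) config filters."""
    if mapq < min_mapping_quality:
        return False
    if only_primary_align and flag & (SECONDARY | SUPPLEMENTARY):
        return False
    if remove_dups and flag & DUPLICATE:
        return False
    return True


# Flag 99 = paired, properly paired, mate reverse strand, first in pair.
print(keep_alignment(99, 60))              # passes all filters
print(keep_alignment(99, 10))              # fails the MAPQ threshold
print(keep_alignment(99 | SECONDARY, 60))  # secondary alignment dropped
print(keep_alignment(99 | DUPLICATE, 60))  # duplicates kept: remove_dups is false
```

With `remove_dups: false`, duplicates are marked by Picard but remain in the filtered output, which is consistent with the duplication percentages still being reported in the General Statistics table.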