        Note that additional data was saved in multiqc_HmiaM1_data when this report was generated.



        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411


        About MultiQC

        This report was generated using MultiQC, version 1.14 (08138c8)

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/ewels/MultiQC


        These samples were run by seq2science v1.0.0, a tool for easy preprocessing of NGS data.

        Take a look at our docs for info about how to use this report to the fullest.

        Workflow: atac-seq
        Date: June 15, 2023
        Project: atac
        Contact E-mail: yourmail@here.com

        Report generated on 2023-06-15, 16:17 CEST based on data in:



        General Statistics

        Showing 24/24 rows and 17/34 columns.
        | Sample Name | % Duplication | M Reads After Filtering | GC content | % PF | % Adapter | Insert Size | % Dups | % Mapped | M Total seqs | % Proper Pairs | M Total seqs | % Assigned | Genome coverage | M Genome reads | M MT genome reads | Number of Peaks | Treatment Redundancy |
        |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
        | SRX5260852 | 8.7% | 42.9 | 32.6% | 97.8% | 27.4% | 147 bp | 16.5% | 96.1% | 42.9 | 98.4% | 12.3 | 34.7% | 3.0 X | 41.4 | 0.0 | 56295 | 0.20 |
        | SRX5260853 | 7.8% | 31.7 | 32.5% | 97.5% | 22.5% | 168 bp | 15.8% | 94.0% | 31.7 | 97.7% | 7.6 | 41.2% | 2.2 X | 29.9 | 0.0 | 46134 | 0.23 |
        | SRX5260854 | 6.6% | 30.2 | 34.6% | 97.4% | 22.7% | 172 bp | 14.6% | 84.7% | 30.2 | 97.3% | 6.2 | 48.7% | 1.9 X | 25.7 | 0.0 | 50360 | 0.26 |
        | SRX5260855 | 5.4% | 16.8 | 35.6% | 97.8% | 32.2% | 167 bp | 12.4% | 77.9% | 16.8 | 97.8% | 3.5 | 37.3% | 1.0 X | 13.1 | 0.0 | 32903 | 0.21 |
        | SRX5260868 | 24.0% | 85.1 | 35.7% | 97.6% | 18.1% | 160 bp | 37.4% | 81.4% | 85.1 | 98.0% | 13.3 | 35.9% | 5.2 X | 69.4 | 0.0 | 61200 | 0.28 |
        | SRX5260869 | 28.7% | 57.7 | 33.5% | 97.6% | 17.6% | 159 bp | 40.2% | 93.5% | 57.7 | 97.8% | 9.9 | 39.0% | 4.1 X | 54.1 | 0.0 | 62501 | 0.23 |
        | SRX5260873 | 7.2% | 13.7 | 32.8% | 97.9% | 23.2% | 165 bp | 14.2% | 91.0% | 13.7 | 97.8% | 3.4 | 39.3% | 0.9 X | 12.5 | 0.0 | 33098 | 0.21 |
        | SRX5260875 | 8.9% | 68.7 | 33.6% | 97.9% | 30.5% | 111 bp | 16.2% | 91.8% | 68.7 | 99.0% | 21.9 | 30.2% | 4.6 X | 63.2 | 0.0 | 58936 | 0.19 |
        | SRX5260876 | 15.8% | 71.6 | 31.4% | 97.3% | 19.7% | 165 bp | 25.9% | 97.1% | 71.6 | 98.1% | 16.5 | 53.7% | 5.2 X | 69.6 | 0.0 | 74841 | 0.33 |
        | SRX5260877 | 13.9% | 27.8 | 32.6% | 97.8% | 29.8% | 113 bp | 23.2% | 92.8% | 27.8 | 99.1% | 8.4 | 39.7% | 1.9 X | 25.9 | 0.0 | 53189 | 0.20 |
        | SRX5260878 | 16.9% | 71.5 | 35.9% | 96.9% | 35.6% | 98 bp | 27.5% | 79.3% | 71.5 | 99.3% | 18.2 | 27.8% | 4.1 X | 56.8 | 0.0 | 49281 | 0.20 |
        | SRX5260900 | 19.1% | 38.8 | 32.2% | 96.3% | 22.2% | 159 bp | 25.4% | 97.6% | 38.8 | 98.3% | 9.2 | 46.7% | 2.9 X | 37.9 | 0.0 | 48680 | 0.26 |
        | SRX5260901 | 20.8% | 63.6 | 33.6% | 96.8% | 24.3% | 137 bp | 27.4% | 98.3% | 63.6 | 98.8% | 17.1 | 46.2% | 4.7 X | 62.6 | 0.0 | 69652 | 0.28 |
        | SRX5260902 | 11.4% | 42.2 | 32.1% | 97.5% | 23.9% | 160 bp | 21.4% | 92.8% | 42.2 | 98.0% | 10.2 | 40.3% | 2.9 X | 39.3 | 0.0 | 55012 | 0.23 |
        | SRX5260904 | 10.5% | 99.7 | 33.1% | 97.9% | 46.3% | 72 bp | 14.2% | 98.4% | 99.7 | 99.3% | 40.0 | 14.1% | 6.8 X | 98.2 | 0.0 | 58494 | 0.15 |
        | SRX5260905 | 10.6% | 36.2 | 32.8% | 97.7% | 25.1% | 162 bp | 20.2% | 91.2% | 36.2 | 97.9% | 8.4 | 35.3% | 2.5 X | 33.1 | 0.0 | 52996 | 0.21 |
        | SRX5260908 | 7.9% | 43.0 | 33.8% | 97.7% | 22.4% | 168 bp | 16.1% | 91.4% | 43.0 | 97.9% | 9.9 | 41.5% | 3.0 X | 39.4 | 0.0 | 55911 | 0.23 |
        | SRX5260909 | 6.6% | 40.2 | 32.9% | 98.2% | 26.7% | 132 bp | 13.5% | 96.0% | 40.2 | 98.6% | 12.7 | 23.7% | 2.8 X | 38.7 | 0.0 | 45322 | 0.14 |
        | SRX5260915 | 7.9% | 63.0 | 34.1% | 97.7% | 24.1% | 160 bp | 16.5% | 93.8% | 63.0 | 98.2% | 15.9 | 38.1% | 4.4 X | 59.2 | 0.0 | 60939 | 0.22 |
        | SRX5260918 | 12.6% | 61.8 | 32.8% | 98.0% | 20.2% | 171 bp | 22.6% | 94.4% | 61.8 | 97.7% | 12.8 | 51.3% | 4.4 X | 58.5 | 0.0 | 66889 | 0.30 |
        | SRX5260919 | 23.1% | 26.0 | 30.5% | 97.0% | 20.0% | 158 bp | 36.7% | 89.1% | 26.0 | 98.0% | 4.8 | 45.2% | 1.8 X | 23.2 | 0.0 | 41756 | 0.27 |
        | SRX5260920 | 7.4% | 39.8 | 35.1% | 97.4% | 29.8% | 169 bp | 16.6% | 79.8% | 39.8 | 97.5% | 7.7 | 42.9% | 2.4 X | 31.9 | 0.0 | 49157 | 0.24 |
        | SRX5260921 | 7.4% | 55.3 | 32.7% | 98.0% | 39.9% | 86 bp | 14.1% | 98.1% | 55.3 | 99.0% | 20.5 | 17.8% | 3.8 X | 54.3 | 0.0 | 47873 | 0.13 |
        | SRX5260922 | 22.0% | 22.9 | 30.2% | 97.3% | 19.3% | 150 bp | 36.9% | 88.5% | 22.9 | 98.2% | 4.6 | 49.1% | 1.5 X | 20.3 | 0.0 | 42231 | 0.29 |

        Workflow explanation

        Preprocessing of reads was done automatically by seq2science v1.0.0 using the atac-seq workflow. Genome assembly danRer10 was downloaded with genomepy 0.15.0. Paired-end reads were trimmed with fastp v0.23.2 with default options. Reads were aligned with bwa-mem2 v2.2.1 with options '-M'. The UCSC genome browser was used to visualize and inspect alignment. Afterwards, duplicate reads were marked with Picard MarkDuplicates v3.0.0. General alignment statistics were collected by samtools stats v1.16. Before peak calling, paired-end info from reads was removed with seq2science so that both mates in a pair get used. Peaks were called with macs2 v2.2.7 with options '--shift -100 --extsize 200 --nomodel --buffer-size 10000' in BAM mode. The effective genome size was estimated by khmer v3.0 by taking the number of unique k-mers in the assembly of the same length as the average read length for each sample. Deeptools v3.5.1 was used for the fingerprint, profile, correlation and dendrogram/heatmap plots, where the heatmap was made with options '--distanceBetweenBins 9000 --binSize 1000'. Narrowpeak files of biological replicates belonging to the same condition were merged with Fisher's method in macs2. The fraction of reads in peaks (FRiP) score was calculated by featureCounts v1.6.4. A consensus set of summits was made with gimmemotifs.combine_peaks v0.18.0. A peak feature distribution plot and peak localization plot relative to TSS were made with chipseeker. All summits were extended with 100 bp to get a consensus peakset. Finally, a count table from the consensus peakset was made with gimmemotifs.coverage_table. Differential accessibility analysis was performed using DESeq2 v1.34. To adjust for multiple testing the (default) Benjamini-Hochberg procedure was performed with an FDR cutoff of 0.1 (default is 0.1). Counts were log transformed using the (default) shrinkage estimator apeglm v1.16. Differential motif analysis on the consensus peakset was performed with gimme maelstrom v0.18.0. Quality control metrics were aggregated by MultiQC v1.14.
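The Benjamini-Hochberg adjustment named above can be illustrated with a minimal sketch. This is not DESeq2's code (DESeq2 also applies steps such as independent filtering that are not shown here); the function name `benjamini_hochberg` and the p-values are hypothetical and serve only to show how the 0.1 FDR cutoff is applied to adjusted p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for six peaks (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.27, 0.6]
padj = benjamini_hochberg(pvals)

# Call a peak differentially accessible when its adjusted p-value is <= 0.1.
fdr_cutoff = 0.1
significant = [p <= fdr_cutoff for p in padj]
print(list(zip(pvals, padj, significant)))
```

With these made-up values, the first four peaks pass the 0.1 cutoff and the last two do not.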

        Assembly stats

        Genome assembly HmiaM1 consists of 18347 contigs, with a GC-content of 31.81%, and 11.85% consists of the letter N. The N50-L50 stats are 1044515-275 and the N75-L75 stats are 501601-598. The genome annotation contains 50 genes.
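For readers unfamiliar with the N50-L50 notation, the sketch below shows the usual definition: sort contigs from longest to shortest and accumulate lengths until the chosen fraction of the assembly is covered; the length of the contig that crosses the threshold is Nx and the number of contigs needed is Lx. The helper `nx_lx` and the toy contig lengths are hypothetical and are not how seq2science computes the stats.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths and a fraction such as 0.5."""
    total = sum(lengths)
    running = 0
    # Longest contigs first; count how many are needed to reach the threshold.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= total * fraction:
            return length, count
    raise ValueError("no contigs given")


# Toy assembly of five contigs (hypothetical lengths, 300 bp in total).
contigs = [100, 80, 60, 40, 20]
n50, l50 = nx_lx(contigs, 0.5)
n75, l75 = nx_lx(contigs, 0.75)
print(f"N50-L50: {n50}-{l50}, N75-L75: {n75}-{l75}")
```

Read this way, the reported "1044515-275" means that the 275 longest contigs, each at least 1044515 bp long, together cover half of the assembly.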

        fastp

        fastp: an ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...). DOI: 10.1093/bioinformatics/bty560.

        Filtered Reads

        Filtering statistics of sampled reads.


        Insert Sizes

        Insert size estimation of sampled reads.


        Sequence Quality

        Average sequencing quality over each base of all reads.


        GC Content

        Average GC content over each base of all reads.


        N content

        Average N content over each base of all reads.


        Picard

        Picard is a set of Java command line tools for manipulating high-throughput sequencing data.

        Insert Size

        Plot shows the number of reads at a given insert size. Reads with different orientations are summed.


        Mark Duplicates

        Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.

        The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.

        To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:

        • READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
        • READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
        • READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
        • READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
        • READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
        • READS_UNMAPPED = UNMAPPED_READS
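The doubling scheme listed above can be written out as a small function. This is a sketch, not MultiQC's own code: the function name `reads_by_duplication_state` and the example metric values are hypothetical, while the dictionary keys are the Picard column names and derived categories from the list.

```python
def reads_by_duplication_state(metrics):
    """Convert Picard MarkDuplicates columns into per-read counts."""
    # Pair-based columns are doubled so that everything is counted in reads.
    dup_pairs = 2 * metrics["READ_PAIR_DUPLICATES"]
    optical = 2 * metrics["READ_PAIR_OPTICAL_DUPLICATES"]
    return {
        "READS_IN_DUPLICATE_PAIRS": dup_pairs,
        "READS_IN_UNIQUE_PAIRS": 2
        * (metrics["READ_PAIRS_EXAMINED"] - metrics["READ_PAIR_DUPLICATES"]),
        "READS_IN_UNIQUE_UNPAIRED": metrics["UNPAIRED_READS_EXAMINED"]
        - metrics["UNPAIRED_READ_DUPLICATES"],
        "READS_IN_DUPLICATE_PAIRS_OPTICAL": optical,
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL": dup_pairs - optical,
        "READS_IN_DUPLICATE_UNPAIRED": metrics["UNPAIRED_READ_DUPLICATES"],
        "READS_UNMAPPED": metrics["UNMAPPED_READS"],
    }


# Hypothetical metrics row (values invented for illustration).
example = {
    "READ_PAIRS_EXAMINED": 1000,
    "READ_PAIR_DUPLICATES": 150,
    "READ_PAIR_OPTICAL_DUPLICATES": 20,
    "UNPAIRED_READS_EXAMINED": 50,
    "UNPAIRED_READ_DUPLICATES": 5,
    "UNMAPPED_READS": 30,
}
counts = reads_by_duplication_state(example)

# The non-overlapping categories add up to the total number of reads.
plotted_total = sum(
    counts[key]
    for key in (
        "READS_IN_UNIQUE_PAIRS",
        "READS_IN_UNIQUE_UNPAIRED",
        "READS_IN_DUPLICATE_PAIRS_OPTICAL",
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL",
        "READS_IN_DUPLICATE_UNPAIRED",
        "READS_UNMAPPED",
    )
)
print(counts, plotted_total)
```

READS_IN_DUPLICATE_PAIRS is left out of the sum because it is already split into its optical and non-optical parts.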

        SamTools pre-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data. DOI: 10.1093/bioinformatics/btp352.

        The pre-sieve statistics are quality metrics measured before applying (optional) minimum mapping quality, blacklist removal, mitochondrial read removal, read length filtering, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).


        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.


        SamTools post-sieve

        Samtools is a suite of programs for interacting with high-throughput sequencing data. DOI: 10.1093/bioinformatics/btp352.

        The post-sieve statistics are quality metrics measured after applying (optional) minimum mapping quality, blacklist removal, mitochondrial read removal, and Tn5 shift.

        Percent Mapped

        Alignment metrics from samtools stats; mapped vs. unmapped reads.

        For a set of samples that have come from the same multiplexed library, similar numbers of reads for each sample are expected. Large differences in numbers might indicate issues during the library preparation process. Whilst large differences in read numbers may be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.

        Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).


        Alignment metrics

        This module parses the output from samtools stats. All numbers in millions.


        deepTools

        deepTools is a suite of tools to process and analyze deep sequencing data. DOI: 10.1093/nar/gkw257.

        PCA plot

        PCA plot with the top two principal components calculated based on genome-wide distribution of sequence reads


        Fingerprint plot

        Signal fingerprint according to plotFingerprint


        Read Distribution Profile after Annotation

        Accumulated view of the distribution of sequence reads related to the closest annotated gene. All annotated genes have been normalized to the same size.

        • Green: -3.0Kb upstream of gene to TSS
        • Yellow: TSS to TES
        • Pink: TES to 3.0Kb downstream of gene

        macs2_frips

        Subread featureCounts is a highly efficient general-purpose read summarization program that counts mapped reads for genomic features such as genes, exons, promoter, gene bodies, genomic bins and chromosomal locations. DOI: 10.1093/bioinformatics/btt656.
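As a rough illustration of what a FRiP (fraction of reads in peaks) value means, the sketch below divides the number of reads assigned to peaks by the total number of reads counted. The function `frip` and the read counts are hypothetical; this assumes the common definition of FRiP and does not reproduce the exact featureCounts summary used for this report.

```python
def frip(reads_in_peaks, total_reads):
    """Fraction of reads in peaks: assigned reads divided by all counted reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


# Hypothetical counts: 347 of 1000 reads overlap a called peak.
score = frip(347, 1000)
print(f"FRiP = {score:.1%}")
```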


        deepTools - Spearman correlation heatmap of reads in bins across the genome

        Spearman correlation plot generated by deepTools. Spearman correlation is a non-parametric (distribution-free) method that assesses how monotonic the relationship is.


        deepTools - Pearson correlation heatmap of reads in bins across the genome

        Pearson correlation plot generated by deepTools. Pearson correlation is a parametric method (it makes several assumptions, e.g. normality and homoscedasticity) that assesses how linear the relationship is.
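The difference between the two correlation measures can be seen on a toy example. The sketch below is a plain-Python illustration and not deepTools code: Spearman is computed as the Pearson correlation of the ranks, and the two vectors are hypothetical per-bin read counts that rise together but not linearly.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient: strength of the linear relationship."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical read counts in five bins for two samples (monotonic, non-linear).
sample_a = [1, 2, 3, 4, 5]
sample_b = [1, 8, 27, 64, 125]
r_pearson = pearson(sample_a, sample_b)
r_spearman = spearman(sample_a, sample_b)
print(f"Pearson: {r_pearson:.3f}, Spearman: {r_spearman:.3f}")
```

Because the second sample increases whenever the first does, Spearman reports a perfect correlation, while Pearson stays below 1 since the relationship is not a straight line.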


        Peak distributions (macs2)

        The distribution of read pileup around 20000 random peaks for each sample. This visualization is a quick and dirty way to check if your peaks look like what you would expect, and what the underlying distribution of different types of peaks is.