Published April 16, 2025 | Version 1.3
Software Open

Illamina - a Python3 pipeline for Linux/Ubuntu to perform bacterial genome assembly from Illumina paired-end reads

Authors/Creators

  • 1. University of Pretoria

Description


    ================================================================================
    ■ Illumina Assembler ILLAMINA v1.3 ■
    ================================================================================
    Created: April 16, 2025
       Author: Oleg Reva (oleg.reva@up.ac.za)
       Centre for Bioinformatics and Computational Biology,
       BGM, University of Pretoria, South Africa

    📌 Purpose:
    Pipeline for assembling Illumina paired-end reads using both de novo and
    reference-based approaches.

    ⚙️ Dependencies:
    ┌───────────────┬──────────────────┐
    │ Tool          │ Minimum Version  │
    ├───────────────┼──────────────────┤
    │ Python        │ 3.12.3           │
    │ BioPython     │ 1.83             │
    │ SPAdes        │ 3.15.0           │
    │ Bowtie2       │ 2.4.1            │
    │ Bcftools      │ 1.19             │
    │ RagTag        │ 2.1.0            │
    │ Trimmomatic   │ 0.36             │
    └───────────────┴──────────────────┘

    💻 Tested Environments:
    - CentOS Linux 7.3.1611
    - Ubuntu 20.04 LTS

    🚀 Usage:
        python3 illamina.py <arguments>

    🔧 Arguments:
        --project_directory    Path to project directory [REQUIRED]
        --input_directory      Input directory name (default: 'input')
        --output_directory     Output directory name (default: 'output')
        --tmp_directory        Temporary files directory (default: 'tmp')
        --reference_directory  Reference sequences directory (default: 'refseq')
        --reference_file       Reference sequence file name (must be in refseq dir)
        --rf_1_ending          R1 file ending (default: '_R1_.fastq.gz')
        --rf_2_ending          R2 file ending (default: '_R2_.fastq.gz')

    📂 File Requirements:
        - Paired-end reads in input/*.fastq.gz
        - Reference genome in refseq/SCPM_chr.fasta

    🏆 Output:
        - Assembled scaffolds and consensus sequences in output

    💡 Help Options:
        -h, --help            Show this help message
        -v, --version         Show version information

    ================================================================================
    Program code decoration by DeepSeek (https://chat.deepseek.com/)
    ================================================================================
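
The argument list in the help text can be mirrored with Python's standard argparse module. The sketch below is not taken from illamina.py; it only reproduces the documented option names and defaults. The function name build_parser and the help strings are illustrative assumptions, and the actual script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser reproducing the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="illamina.py",
        description="Assembly of Illumina paired-end reads (illustrative sketch).",
    )
    # The only option documented as required.
    parser.add_argument("--project_directory", required=True,
                        help="Path to project directory")
    # Directory names with the defaults given in the help text.
    parser.add_argument("--input_directory", default="input")
    parser.add_argument("--output_directory", default="output")
    parser.add_argument("--tmp_directory", default="tmp")
    parser.add_argument("--reference_directory", default="refseq")
    # No default is documented for the reference file, so None is assumed here.
    parser.add_argument("--reference_file", default=None)
    # Endings that identify the forward and reverse read files.
    parser.add_argument("--rf_1_ending", default="_R1_.fastq.gz")
    parser.add_argument("--rf_2_ending", default="_R2_.fastq.gz")
    # -h/--help is added by argparse automatically; -v/--version is explicit.
    parser.add_argument("-v", "--version", action="version",
                        version="ILLAMINA v1.3")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Calling build_parser().parse_args(["--project_directory", "example"]) leaves every other option at its documented default.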

The 'input' folder contains a subfolder 'example' with two example *.fastq.gz files, and the 'refseq' folder contains an example.fasta file for testing the program. Run:
python3 ~/Illamina/illamina.py --project_directory example --reference_file example.fasta
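
Before running the pipeline on your own data, it can help to check that every forward-read file has a matching reverse-read file under the chosen endings. The sketch below shows one way to do that using the default values of --rf_1_ending and --rf_2_ending. The function pair_reads is hypothetical, and how illamina.py itself matches read files is not documented here, so treat this as an assumption about the naming scheme only.

```python
def pair_reads(file_names, r1_ending="_R1_.fastq.gz", r2_ending="_R2_.fastq.gz"):
    """Return {sample: (r1_file, r2_file)} for samples with both mates present.

    Hypothetical helper: the sample name is assumed to be the file name
    with the R1 ending removed.
    """
    names = set(file_names)
    pairs = {}
    for name in sorted(names):
        if name.endswith(r1_ending):
            sample = name[: -len(r1_ending)]
            mate = sample + r2_ending
            # Keep the sample only when the reverse-read file also exists.
            if mate in names:
                pairs[sample] = (name, mate)
    return pairs


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # Usage: python3 pair_check.py <directory containing *.fastq.gz files>
    directory = Path(sys.argv[1])
    found = pair_reads(p.name for p in directory.glob("*.fastq.gz"))
    for sample, (r1, r2) in found.items():
        print(sample, r1, r2)
```

Files whose names end with the R1 ending but have no R2 counterpart are silently skipped in this sketch; a stricter check could report them instead.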

Files (303.7 MB)

md5:a99c49b61fa7393455fbf3d6774e4a54