cblaster: a Python package for identifying clustered sequence homologs in NCBI BLAST databases

Cameron L.M. Gilchrist

doi:10.5281/zenodo.3660769

Published February 10, 2020 | Version 1.0.10

Software Open

Cameron L.M. Gilchrist¹

1. The University of Western Australia

Identifying clusters of co-located, homologous genes is a common procedure in comparative genomics, for example when looking for gene clusters encoding production of secondary metabolites. cblaster is a Python package that facilitates the identification of such gene clusters across publicly available Basic Local Alignment Search Tool (BLAST) databases hosted by the National Center for Biotechnology Information (NCBI). Given either a FASTA file of query sequences or a collection of valid NCBI sequence identifiers, cblaster can run BLAST searches locally (using the DIAMOND protein aligner) or remotely (using the NCBI's public BLAST API), and then retrieve and parse the results. It uses the NCBI's Identical Protein Groups (IPG) resource to retrieve the genomic context of each BLAST hit, and groups hits that are co-located on the same genomic scaffold within user-defined thresholds for intergenic distance and number of conserved sequences. cblaster then produces a human-readable report of its results. It provides a simple command line interface with sensible default options, and also exposes several public methods and classes that can be used directly in Python code. It is installable from PyPI via pip (https://pypi.org/project/cblaster), and the source code is freely available on GitHub (https://www.github.com/gamcil/cblaster) under the MIT license.
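
The grouping step described above (hits on the same scaffold, within a maximum intergenic distance, covering a minimum number of conserved sequences) can be illustrated with a minimal sketch. This is not cblaster's own code or API: the `Hit` record, the `find_clusters` function, and the `max_gap` and `min_conserved` parameter names and default values are all hypothetical, chosen only to show the idea.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST hit placed on a genomic scaffold (hypothetical record)."""
    query: str     # query sequence the hit matched
    scaffold: str  # accession of the scaffold carrying the hit gene
    start: int     # gene start coordinate (bp)
    end: int       # gene end coordinate (bp)


def find_clusters(hits, max_gap=20000, min_conserved=3):
    """Group co-located hits per scaffold.

    Hits are chained while the gap to the current group is at most
    max_gap bp; a group is kept only if it covers at least
    min_conserved distinct query sequences. Both defaults are
    illustrative assumptions, not cblaster's defaults.
    """
    by_scaffold = defaultdict(list)
    for hit in hits:
        by_scaffold[hit.scaffold].append(hit)

    clusters = []
    for scaffold_hits in by_scaffold.values():
        scaffold_hits.sort(key=lambda h: h.start)
        group = [scaffold_hits[0]]
        group_end = scaffold_hits[0].end  # right-most coordinate seen so far
        for hit in scaffold_hits[1:]:
            if hit.start - group_end <= max_gap:
                group.append(hit)
                group_end = max(group_end, hit.end)
            else:
                clusters.append(group)
                group = [hit]
                group_end = hit.end
        clusters.append(group)

    # Keep only groups that cover enough distinct query sequences
    return [c for c in clusters if len({h.query for h in c}) >= min_conserved]


example_hits = [
    Hit("qA", "scaffold_1", 1000, 2000),
    Hit("qB", "scaffold_1", 2500, 3500),
    Hit("qC", "scaffold_1", 4000, 5000),
    Hit("qA", "scaffold_1", 90000, 91000),  # too far from the others
    Hit("qB", "scaffold_2", 100, 900),      # alone on its scaffold
]
example_clusters = find_clusters(example_hits, max_gap=5000, min_conserved=3)
```

With these made-up coordinates, only the first three hits on `scaffold_1` form a reported cluster: the fourth hit lies beyond the gap threshold, and the lone hit on `scaffold_2` does not cover enough distinct queries.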

Files (25.0 kB)

Name	MD5	Size
cblaster-1.0.10.tar.gz	8bd052e315ff7829698dce6608b09111	25.0 kB
