Published March 14, 2026 | Version v1.0.0
Software Open

Seq2Bio: A Colab‑based tool for protein‑centric biological context retrieval

Authors/Creators

Description

Seq2Bio is a Google Colab notebook that takes a protein sequence as input and produces a biological context report. It integrates public databases (NCBI BLAST, NCBI Taxonomy, iNaturalist, GBIF, PubMed) and tools (Clustal Omega, AlphaFold DB, ColabFold) to provide:

  • BLAST homology search (≥70% identity)
  • Taxonomic lineage
  • Organism images and common names
  • Interactive geographic map with hover tooltips (species, common name, image, location)
  • Relevant PubMed publications (toxin/venom focused, but customisable)
  • Multiple sequence alignment and phylogenetic tree (Newick format)
  • Protein structure retrieval from AlphaFold DB or prediction via ColabFold, with confidence plots (pLDDT, PAE) and Ramachandran plots
  • Downloadable ZIP archive of all results
  • Email notification upon completion
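
As an illustration of the ≥70% identity cut-off listed above, the following is a minimal sketch of how BLAST hits could be filtered by percent identity. It is not taken from the Seq2Bio notebook: the `Hit` record, its field names, the accessions, and the `filter_hits` helper are all hypothetical, and percent identity is assumed here to mean identical positions divided by alignment length.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical, simplified BLAST hit record (not Seq2Bio's own type)."""
    accession: str
    identities: int    # number of identical aligned positions
    align_length: int  # alignment length, including gaps


def percent_identity(hit: Hit) -> float:
    """Return identical positions as a percentage of the alignment length."""
    if hit.align_length <= 0:
        return 0.0
    return 100.0 * hit.identities / hit.align_length


def filter_hits(hits, min_identity=70.0):
    """Keep hits at or above the threshold, best match first."""
    kept = [h for h in hits if percent_identity(h) >= min_identity]
    return sorted(kept, key=percent_identity, reverse=True)


# Made-up example hits; the accessions are placeholders.
hits = [
    Hit("HYPO_003", 69, 100),
    Hit("HYPO_001", 95, 100),
    Hit("HYPO_002", 70, 100),
]
kept = filter_hits(hits)
for h in kept:
    print(h.accession, round(percent_identity(h), 1))
```

The threshold is inclusive in this sketch, so a hit at exactly 70% identity is retained; whether Seq2Bio treats the boundary the same way is not stated in the description.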

Notes

If you use Seq2Bio in your research, please cite it as below.

Files

Vidhusv/seq2bio-v1.0.0.zip

Files (1.0 MB)

Vidhusv/seq2bio-v1.0.0.zip (1.0 MB)
md5:35796c0b8c74794ea03336aa9b265f47

Additional details

Related works

Is supplement to
Software: https://github.com/Vidhusv/seq2bio/tree/v1.0.0 (URL)

Software