Published November 26, 2024 | Version 16-747c6
Software Open

soedinglab/MMseqs2: MMseqs2 Release 16-747c6

  • 1. Seoul National University
  • 2. LJK-GINP
  • 3. ELKMO
  • 4. Max Planck Institute
  • 5. @common-workflow-language
  • 6. Max-Planck institute for biophysical chemistry
  • 7. @NeelyxLabs
  • 8. Sunagawa Lab @ ETH Zürich
  • 9. Southern University of Science and Technology
  • 10. University of Wisconsin - Madison
  • 11. GIST

Description

MMseqs2 Release 16 introduces support for GPU-accelerated searches [1]. Additionally, we fixed numerous bugs and relicensed MMseqs2 under the MIT license.

[1] Kallenborn F, Chacon A, Hundt C, Sirelkhatim H, Didi K, Dallago C, Mirdita M, Schmidt B, Steinegger M: GPU-accelerated homology search with MMseqs2. bioRxiv (2024).

Breaking Changes

  • Custom substitution matrices (--seed-sub-mat, --sub-mat) are not supported in this release. Only the built-in matrices will work. We will restore support in the next release. (93b2d94c)

New Features and Enhancements

  • Added GPU support to MMseqs2, allowing faster computation of sensitive alignments on CUDA-compatible GPUs of the Turing generation or newer (a66ad0c2, 81171a53, 1806c0c8)
  • Added full-length six-frame translated search with --translation-mode 1 (#885)
  • Implemented qframe and tframe output fields in convertalis (#615, #803, 417f22f2)
  • Added support for resuming interrupted downloads in databases and createtaxdb (0b27c9d7)
  • MMseqs2 taxonomy now always keeps at least the longest open reading frame within each input sequence after fragment elimination (#832, 5b4c8163)
  • Added option to not compress outputs in tsv2exprofiledb (a1468874)
  • filterdb has learned a new sort mode (--sort-entries 4 --weights file) to sort by priority (54f8983e)
  • Updated tantan (3e53eee8)
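
To illustrate what a six-frame translated search operates on, here is a minimal Python sketch that translates a nucleotide sequence in all six reading frames using the standard genetic code. It is not MMseqs2's implementation. The function names and the signed frame labels (1 to 3 for the forward strand, -1 to -3 for the reverse complement) are assumptions chosen for illustration and may not match the values MMseqs2 reports in qframe and tframe.

```python
# Standard genetic code (NCBI translation table 1) in the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map illustrative frame labels (1, 2, 3, -1, -2, -3) to protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            # Shift the start by 0, 1, or 2 bases to get each reading frame.
            frames[strand * (offset + 1)] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons appear as `*` in the output. A full-length translated search keeps each frame as one sequence, including stops, rather than splitting it into separate open reading frames.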

Bug Fixes

  • Fixed prefilter using excessive memory and crashing on highly redundant databases (950342d9)
  • prefilter was not properly evaluating the last potential hit; fixing this slightly increases the sensitivity of the k-mer prefilter (06f74297)
  • result2msa now works correctly with clustered databases (78ae2c5b)
  • Fixed ppos output field calculation in convertalis (fb38b7d4, 816c5c91)
  • Fixed wrong coverage being passed to realignment (6267ffba)
  • Fixed --taxon-list being broken in multi-threaded prefilter and ungappedprefilter (804bb2af)
  • Fixed segmentation fault in prefilter (#872, a64d60a4, ef2ebe9c)
  • Fixed inconsistent ordering issue in createclusearchdb (b59ad53c)
  • Corrected the backtrace in SAM output for nucleotide-protein alignments and fixed the display of the reverse complement sequence (#845, 5f23f1fd)

Developer Notes

  • Disabled nedmalloc due to an OpenMP crash in Cygwin (c498f510)
  • Breaking changes in how (sub)project command initialization works (1c086858, af2cc52d)
  • Removed gzstream (111d893a)
  • Breaking fix for parameter singleton in subprojects (5c6e32c6)
  • Export MMSEQS_ARCH in CMakeCache for subprojects to use (48f13f92)

Files

soedinglab/MMseqs2-16-747c6.zip (14.4 MB)
md5:b0216726dd4c514dd3d202cc92c2f3b2
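
The MD5 checksum listed for the archive can be used to check a download for corruption. The following is a minimal sketch using only Python's standard library. The helper names and the local file name in the usage comment are assumptions for illustration, not part of the release.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Hypothetical usage, assuming the archive was saved under this local name:
# matches_checksum("MMseqs2-16-747c6.zip", "md5:b0216726dd4c514dd3d202cc92c2f3b2")
```

MD5 detects accidental corruption but is not collision resistant, so it should not be treated as proof that the archive is authentic.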

Additional details

Related works