Published April 15, 2026 | Version v2
Journal article Open

Benchmarking AlphaGenome on NVIDIA GPUs: latency, memory, and feasibility across sequence lengths

  • 1. University of Washington
  • 2. Stanford University

Description

AlphaGenome pushes genomic modeling to sequence lengths up to 1 Mb, but practical adoption still comes down to a simple question: what fits on the GPU you actually have, and how long will it take? This post benchmarks the official JAX implementation and the community PyTorch port across seven NVIDIA GPUs: H200 (141 GB), H100 (80 GB), A100 (80 GB), L40S (48 GB), L40 (48 GB), A40 (48 GB), and RTX 6000 (24 GB). We report inference, heads-only finetuning, and full-weights finetuning on real genomic workloads.

Key takeaways:

  • Inference up to 1 Mb is feasible on every tested GPU from 48 GB upward; 1 Mb typically requires about 35-41 GB of peak memory.
  • Heads-only finetuning fits on every tested GPU up to 1 Mb, making it the accessible adaptation path for most labs.
  • Full-weights finetuning is memory-bound: 1 Mb is comfortable on H200 and borderline on H100, other 80 GB cards reach 524 kb, and 48 GB cards top out at 262 kb.
  • Above about 131 kb, memory scaling is close to linear for all three workloads.
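The near-linear scaling above about 131 kb suggests a simple way to estimate whether an untested sequence length will fit on a given card: fit a line through two measured points and extrapolate. The sketch below is illustrative only. The function names, the 10% headroom factor, and the two sample measurements are assumptions made for this example, not values from the benchmark (only the 38 GB figure is chosen to fall inside the reported 35-41 GB range for 1 Mb inference).

```python
def fit_linear_memory(p1, p2):
    """Fit peak_gb = slope * length_bp + intercept through two measurements.

    Each point is (sequence_length_bp, peak_memory_gb). Both points should lie
    above roughly 131 kb, where scaling is reported to be close to linear.
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("need two distinct sequence lengths")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def predict_peak_gb(length_bp, slope, intercept):
    """Estimate peak memory in GB at a given sequence length."""
    return slope * length_bp + intercept


def fits_on_gpu(length_bp, gpu_gb, slope, intercept, headroom=0.9):
    """Return True if the estimate fits within a fraction of GPU memory.

    The headroom factor (hypothetical, 90% by default) leaves room for
    framework overhead and fragmentation.
    """
    return predict_peak_gb(length_bp, slope, intercept) <= gpu_gb * headroom


# Placeholder measurements, NOT taken from the benchmark tables:
# (524,288 bp, 20 GB) and (1,048,576 bp, 38 GB).
slope, intercept = fit_linear_memory((524_288, 20.0), (1_048_576, 38.0))

estimate = predict_peak_gb(786_432, slope, intercept)
print(f"Estimated peak at 786 kb: {estimate:.1f} GB")
print("Fits on 48 GB card at 1 Mb:", fits_on_gpu(1_048_576, 48, slope, intercept))
print("Fits on 24 GB card at 1 Mb:", fits_on_gpu(1_048_576, 24, slope, intercept))
```

To use this with real data, replace the placeholder points with two peak-memory measurements from your own workload and GPU. The estimate should not be trusted below about 131 kb, where the linear regime is not reported to hold.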

Files

2026-005-v2-borzoi_vs_alphagenome.png

Files (1.5 MB)

  • md5:06f774691787c11c37c398c228d9b5d5 (250.6 kB)
  • md5:5323f6a070aa0e4b43a148a53a699330 (1.2 MB)
  • md5:10b6e2f3ee738c619d0891375876495a (26.4 kB)

Additional details