Published May 5, 2026 | Version v1.7.2

ChASE-library/ChASE: Interior eigenvalue support for pseudo-hermitian problems and built-in multi-GPU HouseholderQR

  • 1. Jülich Supercomputing Centre
  • 2. @tetrascience
  • 3. Forschungszentrum Juelich
  • 4. NTT Data

Description

ChASE Library Release Notes

Overview

This release brings significant advances to the ChASE (Chebyshev Accelerated Subspace Extraction) library: pseudo-Hermitian matrix support for computing interior eigenpairs with eigenvalues around 0, distributed-memory GPU capabilities, new QR factorization algorithms, and improved user interfaces.

🎯 Major Features

1. Pseudo-Hermitian Matrix Support

2. Distributed-Memory GPU Support (pChASE-GPU)

  • GPU-resident Lanczos implementation with fused kernels
  • Optional warm-up procedures in constructor for NCCL initialization

3. Advanced Householder QR Factorization

  • CUDA-aware-MPI and NCCL Householder QR: Two-level padding/cleaning strategy with full-path diagnostics
    • Level-1: Panel pre-clean (coarse)
    • Level-2: Column split-and-pad (fine)
    • Full-height GEMM updates with NCCL allreduce
  • Distributed CPU Householder QR:
    • cpu_distributed_houseQR_panel_factor_block_cyclic_1d
    • cpu_distributed_blocked_houseQR_formQ_block_cyclic_1d
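As a rough illustration of the two phases named above (panel factorization, then forming Q), here is a minimal serial sketch of Householder QR in pure Python for real matrices. It is not ChASE's implementation: the function names `householder_factor` and `form_q` are hypothetical, and it leaves out blocking, padding, the block-cyclic layout and all MPI/NCCL communication.

```python
import math


def householder_factor(a):
    """Factor an m x n real matrix (list of rows, m >= n).

    Returns (reflectors, r): one unit reflector vector per column and the
    n x n upper-triangular factor R.
    """
    m, n = len(a), len(a[0])
    work = [row[:] for row in a]  # work on a copy, leave the input untouched
    reflectors = []
    for k in range(n):
        # Column k from the diagonal downwards.
        x = [work[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            # Nothing to eliminate: a zero vector makes H the identity.
            reflectors.append([0.0] * len(x))
            continue
        v = x[:]
        # Add sign(x[0]) * ||x|| to the first entry to avoid cancellation.
        v[0] += math.copysign(norm_x, x[0])
        norm_v = math.sqrt(sum(t * t for t in v))
        v = [t / norm_v for t in v]
        # Apply H = I - 2 v v^T to the trailing columns.
        for j in range(k, n):
            dot = sum(v[i] * work[k + i][j] for i in range(m - k))
            for i in range(m - k):
                work[k + i][j] -= 2.0 * dot * v[i]
        reflectors.append(v)
    # Keep the upper triangle; entries below the diagonal are rounding noise.
    r = [[work[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return reflectors, r


def form_q(reflectors, m):
    """Build the thin m x n factor Q by applying the reflectors to identity columns."""
    n = len(reflectors)
    q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(m)]
    # Q = H_0 H_1 ... H_{n-1} E, so the last reflector is applied first.
    for k in reversed(range(n)):
        v = reflectors[k]
        for j in range(n):
            dot = sum(v[i] * q[k + i][j] for i in range(m - k))
            for i in range(m - k):
                q[k + i][j] -= 2.0 * dot * v[i]
    return q
```

Calling `householder_factor(a)` and then `form_q(reflectors, len(a))` yields Q with orthonormal columns and an upper-triangular R with Q·R equal to `a` up to rounding. In a distributed variant the dot products inside both loops are the places where a reduction across processes would be needed.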

4. Enhanced Fortran/C Interfaces

  • Added second initializer for Fortran interfaces (internal buffer management)
  • Users no longer need to provide external buffers for eigenvectors/eigenvalues
  • Configuration setter functions added to Fortran/C interfaces
  • Support for both block-block and block-cyclic distributions simultaneously
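To make the difference between the two data distributions concrete, the sketch below maps a global row index to its owning rank and local index in one dimension. It is a simplification under stated assumptions: the block-cyclic formula is the standard one used by ScaLAPACK-style layouts, "block" is taken to mean one contiguous block per rank, and the function names are hypothetical rather than part of the ChASE interface.

```python
def block_cyclic_owner(i, nb, nprocs):
    """Return (rank, local_index) of global index i in a 1D block-cyclic layout.

    Blocks of nb consecutive indices are dealt out to the ranks round-robin.
    """
    block = i // nb                       # which block the index falls into
    rank = block % nprocs                 # blocks cycle over the ranks
    local = (block // nprocs) * nb + i % nb
    return rank, local


def block_owner(i, n, nprocs):
    """Return (rank, local_index) when each rank holds one contiguous block.

    n is the global dimension; the block size is ceil(n / nprocs).
    """
    block_size = -(-n // nprocs)          # ceiling division
    return i // block_size, i % block_size
```

For example, with 10 rows, 2 ranks and a block size of 2, the block-cyclic layout assigns rows 0, 1, 4, 5, 8, 9 to rank 0, while the contiguous layout assigns rows 0 to 4 to rank 0.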

📦 Build & Configuration

CMake Improvements

  • Minimal drivers are split into chase_driver_cpu (links chase_cpu, and cublaspp/cusolverpp when CUDA is enabled so shared headers link cleanly) and chase_driver_gpu (links chase_gpu when CUDA is enabled); the old single target chase_driver is removed.
  • Made ScaLAPACK optional
  • Added pkg-config support
  • Improved CMake configuration with chase_config.h.in
  • Default CUDA architectures: 80;90 (Ampere and Hopper)

Configuration Options

  • Environment variable support for Householder parameters
  • CHASE_PH_LANCZOS_DIAG: Optionally prints damping-factor diagnostics (default: OFF)

🔍 Logging & Diagnostics

  • Centralized ChASE logger introduced
  • Replaced direct stdout with configurable logging
  • Log levels: Trace, Debug, Info, Warning, Error

🧪 Testing

  • Added Fortran interface tests
  • Unit tests now run in full for all builds

🔗 API Changes

New Unified Configuration Setters

All setters are independent of precision and matrix type:

chase_set_tol(tol)
chase_set_deg(deg)
chase_set_max_deg(max_deg)
chase_set_max_iter(max_iter)
chase_set_lanczos_iter(lanczos_iter)
chase_set_num_lanczos(num_lanczos)
chase_set_approx(flag)
chase_set_opt(flag)
chase_set_cholqr(flag)
chase_enable_sym_check(flag)
chase_set_cluster_aware_degrees(flag)

📌 Migration Notes

For Users Upgrading

  1. ScaLAPACK is now optional: Distributed Householder QR is the default
  2. Logger API: Direct printf/cout calls replaced with GetLogger().Log()
  3. Configuration (Fortran): The new unified setters work across all precision and matrix types
  4. Fortran buffers: External buffers are now optional; the library can manage them internally

👥 Contributors

This release includes contributions from Edoardo Di Napoli, Clément Richefort and Xinzhe Wu, with major efforts in:

  • Pseudo-Hermitian extensions
  • Distributed GPU implementations
  • NCCL-based linear algebra
  • Interface improvements

Files

ChASE-library/ChASE-v1.7.2.zip

Files (6.3 MB)

md5:c8ce57de01f23909d2d032d2b967bf72

Additional details

Related works