Software Open Access

lattice/quda: QUDA v1.1.0

maddyscientist; Mathias Wagner; Dean Howarth; Evan Weinberg; Alexei Strelchenko; Jiqun Tu; Buck Babich; Alejandro Vaquero; Balint Joo; Simone Bacchio; Nuno Cardoso; Michael Cheng; Justin Foley; windy510; Frank Winter; Bartosz Kostrzewa; Carleton DeTar; chris-schroeder; Eloy Romero; jcosborn; Robert Maynard; walkloud; Evan Berkowitz; Filippo Spiga; Matthew R Johnson; sunwayihep; Xiao-Yong; Mario Schröck; tsuki


JSON-LD (schema.org) Export

{
  "description": "<p>Version 1.1.0 - October 2021</p>\n<ul>\n<li><p>Add support for NVSHMEM communication for the Dslash operators, for significantly improved strong scaling.  See <a href=\"https://github.com/lattice/quda/wiki/Multi-GPU-with-NVSHMEM\">https://github.com/lattice/quda/wiki/Multi-GPU-with-NVSHMEM</a> for more  details.</p>\n</li>\n<li><p>Addition of the MSPCG preconditioned CG solver for M\u00f6bius fermions. See <a href=\"https://github.com/lattice/quda/wiki/The-Multi-Splitting-Preconditioned-Conjugate-Gradient-(MSPCG),-an-application-of-the-additive-Schwarz-Method\">https://github.com/lattice/quda/wiki/The-Multi-Splitting-Preconditioned-Conjugate-Gradient-(MSPCG),-an-application-of-the-additive-Schwarz-Method</a> for more details.</p>\n</li>\n<li><p>Addition of the Exact One Flavor Algorithm (EOFA) for M\u00f6bius fermions.  See <a href=\"https://github.com/lattice/quda/wiki/The-Exact-One-Flavor-Algorithm-(EOFA\">https://github.com/lattice/quda/wiki/The-Exact-One-Flavor-Algorithm-(EOFA</a>) for more details.</p>\n</li>\n<li><p>Addition of a fully GPU native Implicitly Restarted Arnoldi eigensolver (as opposed to partially relying on ARPACK).  See <a href=\"https://github.com/lattice/quda/wiki/QUDA%27s-eigensolvers#implicitly-restarted-arnoldi-eigensolver\">https://github.com/lattice/quda/wiki/QUDA%27s-eigensolvers#implicitly-restarted-arnoldi-eigensolver</a> for more details.</p>\n</li>\n<li><p>Significantly reduced latency for reduction kernels through the use of heterogeneous atomics.  Requires CUDA 11.0+.</p>\n</li>\n<li><p>Addition of support for a split-grid multi-RHS solver.  See <a href=\"https://github.com/lattice/quda/wiki/Split-Grid\">https://github.com/lattice/quda/wiki/Split-Grid</a> for more details.</p>\n</li>\n<li><p>Continued work on enhancing and refining the staggered multigrid algorithm.  
The MILC interface can now drive the staggered multigrid solver.</p>\n</li>\n<li><p>Multigrid setup can now use tensor cores on Volta, Turing and Ampere GPUs to accelerate the calculation.  Enable with the\n<code>QudaMultigridParam::use_mma</code> parameter.</p>\n</li>\n<li><p>Improved support of managed memory through the addition of a prefetch API.  This can dramatically improve the performance of the multigrid setup when oversubscribing the memory.</p>\n</li>\n<li><p>Improved the performance of using MILC RHMC with QUDA</p>\n</li>\n<li><p>Add support for a new internal data order FLOAT8.  This is the default data order for nSpin=4 half and quarter precision fields,\nthough the prior FLOAT4 order can be enabled with the cmake option QUDA_FLOAT8=OFF.</p>\n</li>\n<li><p>Remove of the singularity from the reconstruct-8 and reconstruct-9 compressed gauge field ordering.  This enables support for free fields with these orderings.</p>\n</li>\n<li><p>The clover parameter convention has been codified: one can either\n1.) pass in QudaInvertParam::kappa and QudaInvertParam::csw separately, and QUDA will infer the necessary clover coefficient, or\n2.) pass an explicit value of QudaInvertParam::clover_coeff (e.g. CHROMA's use case) and that will override the above inference.</p>\n</li>\n<li><p>QUDA now includes fast-compilation options (QUDA_FAST_COMPILE_DSLASH and QUDA_FAST_COMPILE_REUDCE) which enable much faster build times for development at the expense of reduced performance.</p>\n</li>\n<li><p>Add support for compiling QUDA using clang for both the host and device compiler.</p>\n</li>\n<li><p>While the bulk of the work associated with making QUDA portable to different architectures will form the soul of QUDA 2.0, some of the initial refactoring associated with this has been applied.</p>\n</li>\n<li><p>Significant cleanup of the tests directory to reduce boiler plate.</p>\n</li>\n<li><p>General improvements to the cmake build system using modern cmake features.  
We now require cmake 3.15.</p>\n</li>\n<li><p>Extended the ctest list to include some optional benchmarks.</p>\n</li>\n<li><p>Fix a long-standing issue with multi-node Kepler GPU and Intel dual socket systems.</p>\n</li>\n<li><p>Improved ASAN integration: SANITIZE builds now work out of the box with no need to set the ASAN_OPTIONS environment variable.</p>\n</li>\n<li><p>Add support for the extended QIO branch (now required for MILC).</p>\n</li>\n<li><p>Bump QMP version to 2.5.3.</p>\n</li>\n<li><p>Updated to Eigen 3.3.9.</p>\n</li>\n<li><p>Multiple bug fixes and clean up to the library.  Many of these are listed here: <a href=\"https://github.com/lattice/quda/milestone/24?closed=1\">https://github.com/lattice/quda/milestone/24?closed=1</a></p>\n</li>\n</ul>", 
  "license": "", 
  "creator": [
    {
      "@type": "Person", 
      "name": "maddyscientist"
    }, 
    {
      "@type": "Person", 
      "name": "Mathias Wagner"
    }, 
    {
      "affiliation": "LLNL", 
      "@type": "Person", 
      "name": "Dean Howarth"
    }, 
    {
      "@type": "Person", 
      "name": "Evan Weinberg"
    }, 
    {
      "affiliation": "FNAL", 
      "@type": "Person", 
      "name": "Alexei Strelchenko"
    }, 
    {
      "affiliation": "NVIDIA", 
      "@type": "Person", 
      "name": "Jiqun Tu"
    }, 
    {
      "affiliation": "NVIDIA", 
      "@type": "Person", 
      "name": "Buck Babich"
    }, 
    {
      "affiliation": "University Of Utah", 
      "@type": "Person", 
      "name": "Alejandro Vaquero"
    }, 
    {
      "affiliation": "Oak RIdge Leadership Computing Facility, Oak RIdge National Laboratory", 
      "@type": "Person", 
      "name": "Balint Joo"
    }, 
    {
      "affiliation": "The Cyprus Institute", 
      "@type": "Person", 
      "name": "Simone Bacchio"
    }, 
    {
      "affiliation": "CeFEMA, Departamento de F\u00edsica, Instituto Superior T\u00e9cnico, Universidade de Lisboa", 
      "@type": "Person", 
      "name": "Nuno Cardoso"
    }, 
    {
      "@type": "Person", 
      "name": "Michael Cheng"
    }, 
    {
      "@type": "Person", 
      "name": "Justin Foley"
    }, 
    {
      "@type": "Person", 
      "name": "windy510"
    }, 
    {
      "affiliation": "Jefferson Lab", 
      "@type": "Person", 
      "name": "Frank Winter"
    }, 
    {
      "affiliation": "Digital Science Center (DiCe) & High Performance Computing / Analytics Lab (HPC/A), Bonn University", 
      "@type": "Person", 
      "name": "Bartosz Kostrzewa"
    }, 
    {
      "affiliation": "University of Utah", 
      "@type": "Person", 
      "name": "Carleton DeTar"
    }, 
    {
      "@type": "Person", 
      "name": "chris-schroeder"
    }, 
    {
      "affiliation": "Jefferson Lab", 
      "@type": "Person", 
      "name": "Eloy Romero"
    }, 
    {
      "@type": "Person", 
      "name": "jcosborn"
    }, 
    {
      "affiliation": "NVIDIA", 
      "@type": "Person", 
      "name": "Robert Maynard"
    }, 
    {
      "@type": "Person", 
      "name": "walkloud"
    }, 
    {
      "affiliation": "University of Maryland", 
      "@type": "Person", 
      "name": "Evan Berkowitz"
    }, 
    {
      "affiliation": "NVIDIA", 
      "@type": "Person", 
      "name": "Filippo Spiga"
    }, 
    {
      "@type": "Person", 
      "name": "Matthew R Johnson"
    }, 
    {
      "affiliation": "IHEP", 
      "@type": "Person", 
      "name": "sunwayihep"
    }, 
    {
      "@type": "Person", 
      "name": "Xiao-Yong"
    }, 
    {
      "affiliation": "INFN Roma Tre", 
      "@type": "Person", 
      "name": "Mario Schr\u00f6ck"
    }, 
    {
      "affiliation": "TokyoTech", 
      "@type": "Person", 
      "name": "tsuki"
    }
  ], 
  "url": "https://zenodo.org/record/5610079", 
  "codeRepository": "https://github.com/lattice/quda/tree/v1.1.0", 
  "datePublished": "2021-10-28", 
  "version": "v1.1.0", 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.5281/zenodo.5610079", 
  "@id": "https://doi.org/10.5281/zenodo.5610079", 
  "@type": "SoftwareSourceCode", 
  "name": "lattice/quda: QUDA v1.1.0"
}
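
The clover parameter convention described in the release notes above can be illustrated with a short sketch. This is not QUDA code: `InvertParamSketch` and `effective_clover_coeff` are hypothetical names that only mirror the three `QudaInvertParam` fields mentioned in the notes (`kappa`, `csw`, `clover_coeff`). Two further assumptions are made here and are not stated in the notes: that the inferred coefficient is the product `kappa * csw`, and that a `clover_coeff` of zero means "not set".

```python
from dataclasses import dataclass


@dataclass
class InvertParamSketch:
    """Hypothetical stand-in for the QudaInvertParam fields named in the notes."""
    kappa: float
    csw: float = 0.0
    clover_coeff: float = 0.0  # assumption: 0.0 means "not explicitly set"


def effective_clover_coeff(param: InvertParamSketch) -> float:
    """Return the clover coefficient under the two options in the notes.

    Option 2: an explicit clover_coeff overrides any inference.
    Option 1: otherwise infer it from kappa and csw
              (assumed here to be their product).
    """
    if param.clover_coeff != 0.0:
        return param.clover_coeff
    return param.kappa * param.csw


# Option 1: kappa and csw passed separately, coefficient inferred.
inferred = effective_clover_coeff(InvertParamSketch(kappa=0.125, csw=2.0))

# Option 2: explicit coefficient supplied, which takes precedence.
explicit = effective_clover_coeff(
    InvertParamSketch(kappa=0.125, csw=2.0, clover_coeff=0.5)
)
```

The point of the sketch is the precedence order: an explicitly supplied coefficient always wins, and the separate `kappa` and `csw` values are consulted only when no explicit value is given.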