Releases · ChASE-library/ChASE

Introduced a new distributed GPU build of ChASE based entirely on the NVIDIA NCCL library. It avoids explicit data movement between host and device memory and makes collective communication among the involved GPUs much faster. This release achieves a speedup of between 1.5x and 3x over the traditional distributed multi-GPU build. ChASE can now be compiled and executed in the following distinct parallel configurations:

Distributed CPU only
Distributed multi-GPU (traditional build, based on host-device communication standards)
Distributed multi-GPU (using the NVIDIA NCCL library)

05 Apr 13:08

nidode

v1.3.1

2f0babf

ChASE v1.3.1: minor release

Updated the estimate of the condition number of the matrix of filtered vectors V. The estimate is an upper bound on the actual condition number of V, which allows ChASE to select the Communication-Avoiding QR-decomposition (CAQR) variant dynamically at run time.
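
To illustrate how a condition-number estimate could drive such a run-time choice, here is a minimal sketch. The function name, the variant labels, and both thresholds are assumptions made for illustration and are not taken from the ChASE sources. The only idea carried over from the note above is that a cheap variant suffices when V is well conditioned and a more robust one is needed as the estimate grows. Because the estimate is an upper bound, a rule like this errs on the safe side.

```python
import sys

# Unit roundoff for IEEE double precision (2**-53, about 1.1e-16).
UNIT_ROUNDOFF = sys.float_info.epsilon / 2


def select_qr_variant(cond_estimate, cheap_threshold=20.0,
                      safe_threshold=UNIT_ROUNDOFF ** -0.5):
    """Pick a QR variant from an upper bound on cond(V).

    Both thresholds are hypothetical defaults chosen for illustration.
    """
    if cond_estimate < 1.0:
        raise ValueError("a condition number is never below 1")
    if cond_estimate <= cheap_threshold:
        return "CholeskyQR"           # a single pass is assumed to be enough
    if cond_estimate <= safe_threshold:
        return "CholeskyQR2"          # repeat once to restore orthogonality
    return "shifted CholeskyQR2"      # regularize the Gram matrix first


if __name__ == "__main__":
    for kappa in (5.0, 1e4, 1e12):
        print(kappa, "->", select_qr_variant(kappa))
```

The default upper threshold is the inverse square root of the unit roundoff, the usual limit quoted for forming a Gram matrix safely in double precision; the lower threshold is arbitrary.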

10 Mar 14:40

nidode

v1.3.0

16deae1

ChASE v1.3.0. Major release.

This release features a number of changes in the parallel implementation and the algorithm.

The QR factorization, which was previously done redundantly on each MPI process, is now parallelized on a 1D sub-grid of the 2D MPI Cartesian grid.
As a consequence of the additional parallelization, the number and structure of the workspace buffers have changed, greatly reducing the memory footprint of the entire library.
The postApplication function has been replaced; as a result, some of the communication is now hidden behind computation during the execution of the Rayleigh-Ritz kernel and the Residual kernel.
The parallel HouseholderQR algorithm has been replaced with the CholeskyQR algorithm (and its more stable variants). A mechanism to avoid failure of this algorithm, based on numerical analysis results, has been introduced.
A new parallel random number generator has been added to reduce the time spent initializing the computation, especially for large-scale problems.
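
For readers unfamiliar with CholeskyQR, the following is a minimal serial sketch of the textbook algorithm and its two-pass variant CholeskyQR2, written in plain Python for real matrices stored as lists of rows. It is not the ChASE implementation, and all function names are hypothetical. The Gram matrix is the only quantity that would need a global reduction in a distributed setting, which is what makes the method attractive in parallel. The explicit breakdown check shows where a failure-avoidance mechanism would have to step in.

```python
import math


def gram(V):
    """Return G = V^T V for a tall matrix V stored as a list of rows."""
    n = len(V[0])
    return [[sum(row[i] * row[j] for row in V) for j in range(n)]
            for i in range(n)]


def cholesky_upper(G):
    """Return upper-triangular R with R^T R = G; raise on breakdown."""
    n = len(G)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = G[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if d <= 0.0:
            raise ArithmeticError("Cholesky breakdown: V is too ill-conditioned")
        R[i][i] = math.sqrt(d)
        for j in range(i + 1, n):
            s = G[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = s / R[i][i]
    return R


def solve_right_upper(V, R):
    """Return Q = V R^{-1} by substitution on each row of V."""
    n = len(R)
    Q = []
    for row in V:
        q = [0.0] * n
        for j in range(n):
            s = row[j] - sum(q[k] * R[k][j] for k in range(j))
            q[j] = s / R[j][j]
        Q.append(q)
    return Q


def cholesky_qr(V):
    """One CholeskyQR pass: V = Q R with R from the Cholesky factor of V^T V."""
    R = cholesky_upper(gram(V))
    return solve_right_upper(V, R), R


def cholesky_qr2(V):
    """Two passes; the second pass repairs the orthogonality lost in the first."""
    Q1, R1 = cholesky_qr(V)
    Q, R2 = cholesky_qr(Q1)
    n = len(R1)
    R = [[sum(R2[i][k] * R1[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return Q, R


if __name__ == "__main__":
    V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    Q, R = cholesky_qr2(V)
    print(gram(Q))  # close to the 2x2 identity
```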

23 Jan 14:27

nidode

v1.2.1

a307474

ChASE is now integrated into the ELSI library.

In this release:

The C and Fortran interfaces have been improved.
The dependency on the NVTX tool has been removed.
The ELSI interface has been included.

13 Jun 18:16

brunowu

v1.2.0

a307474

ChASE v1.2.0 Release

We release version 1.2.0 of ChASE, with the following new features:

The Fortran interface is now included explicitly in the ChASE code.
Added a new chase-mpi-properties interface for block distribution, in which the distribution is provided by the user rather than taken from the built-in one.
Full compatibility with Quantum Espresso.

17 Mar 16:08

nidode

v1.1.1

9c3bc92

Standardized LICENSE

Fixes

BSD3.0 license standardization

15 Mar 16:07

nidode

v1.1.0

89649df

Algorithmic improvements in the Chebyshev filter

Improvements

Integrated the axpy into the call to HEMM when executing the three-term recurrence relation in the Chebyshev filter.
For the GPU build, moved the shift of the matrix A in the three-term recurrence relation onto the accelerator.
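
The idea behind both items can be sketched with the plain Chebyshev recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x), applied to x = (A - cI)/e. The sketch below is a simplified serial illustration in plain Python, not ChASE code: the helper hemm only mimics the update form C := alpha*A*B + beta*C of the BLAS routine, the names c (center) and e (half-width) are assumed, and the extra scaling factors a production filter uses for numerical stability are left out. With beta = -1, the subtraction of the previous term rides along with the matrix product, so no separate axpy is needed, and the shift is applied to the diagonal of A once rather than in every step.

```python
def hemm(alpha, A, B, beta, C):
    """Return alpha * A @ B + beta * C for dense matrices given as lists of rows.

    Mimics the BLAS update form only; A is assumed real symmetric here.
    """
    inner, cols = len(B), len(B[0])
    return [[alpha * sum(A[i][k] * B[k][j] for k in range(inner)) + beta * C[i][j]
             for j in range(cols)] for i in range(len(A))]


def chebyshev_filter(A, V, degree, c, e):
    """Apply T_degree((A - c I) / e) to the columns of V."""
    n = len(A)
    # Shift the diagonal once, up front, instead of correcting every step.
    A_shift = [[A[i][j] - (c if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    if degree == 0:
        return [row[:] for row in V]              # T_0 is the identity
    zero = [[0.0] * len(V[0]) for _ in range(n)]
    prev = V                                      # T_0 V
    curr = hemm(1.0 / e, A_shift, V, 0.0, zero)   # T_1 V
    for _ in range(degree - 1):
        # T_{k+1} V = (2/e) (A - cI) T_k V - T_{k-1} V in one fused call.
        prev, curr = curr, hemm(2.0 / e, A_shift, curr, -1.0, prev)
    return curr


if __name__ == "__main__":
    A = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Eigenvalues 0 and 1 lie in the damped interval [c - e, c + e] = [0, 1];
    # the eigenvalue 4 lies outside it and is amplified.
    for row in chebyshev_filter(A, identity, 3, 0.5, 0.5):
        print(row)
```

Components whose eigenvalues fall inside [c - e, c + e] stay bounded by 1 in magnitude, while those outside grow rapidly with the degree, which is what makes the polynomial useful as a filter.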

15 Mar 16:04

nidode

v1.0.0

add9217

Release the first stable version of ChASE: v1.0.0

First stable version (v1.0.0) of ChASE, which supports:

shared-memory build;
MPI+Threads build;
MPI+multi-GPUs build.

This version also supports both block and block-cyclic distribution of the matrix across the 2D MPI grid.
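
The difference between the two distributions can be shown with a small index-to-owner mapping. This is a generic sketch of the standard definitions along one grid dimension, not ChASE's own bookkeeping; the function names and the block size nb are assumptions. On a 2D grid the same mapping is applied independently to the row index and the column index.

```python
def block_owner(i, n, p):
    """Owner of global index i when n indices are split into p contiguous blocks."""
    block = -(-n // p)              # ceil(n / p)
    return i // block


def block_cyclic_owner(i, nb, p):
    """Owner of global index i when blocks of size nb are dealt out round-robin."""
    return (i // nb) % p


def owner_2d(i, j, row_owner, col_owner):
    """Grid coordinates of the process that holds matrix entry (i, j)."""
    return row_owner(i), col_owner(j)


if __name__ == "__main__":
    n, p = 8, 2
    print([block_owner(i, n, p) for i in range(n)])
    print([block_cyclic_owner(i, 2, p) for i in range(n)])
    # Entry (5, 1) on a 2 x 2 grid with block-cyclic rows and block columns.
    print(owner_2d(5, 1,
                   lambda i: block_cyclic_owner(i, 2, 2),
                   lambda j: block_owner(j, n, 2)))
```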


Added CI pipeline for automatic building and testing

Bug fix: GPU-timing synchronization

ChASE v1.4.0. Major release.
