ChASE v1.3.0. Major release.
This release features a number of changes in the parallel implementation and the algorithm.
- The QR factorization, which was previously done redundantly on each MPI process, is now parallelized on a 1D sub-grid of the 2D MPI Cartesian grid.
- As a consequence of the additional parallelization, the number and structure of the workspace buffers have changed, greatly diminishing the memory footprint of the entire library.
- The postApplication function has been replaced, with the result that some of the communication is now hidden behind computation during the execution of the Rayleigh-Ritz and Residual kernels.
- The parallel HouseholderQR algorithm has been replaced by the CholeskyQR algorithm (and its more stable variants). Based on results from numerical analysis, a mechanism has been introduced to prevent this algorithm from failing.
- A new parallel random number generator has been added to reduce the time spent initializing the computation, especially for large-scale problems.
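
For readers unfamiliar with CholeskyQR, the idea behind the QR change listed above can be sketched in serial C++ as follows. This is a minimal illustration, not ChASE's implementation: the `Matrix` type and the functions `cholesky_qr`, `cholesky_qr2`, and `orthogonality_error` are hypothetical names introduced here, and the breakdown check shown (returning `false` when the Gram matrix is not numerically positive definite) only detects failure; it is not the avoidance mechanism that ChASE uses. In a distributed setting, the Gram matrix would typically be assembled from local contributions with an all-reduce, which is omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense column-major matrix: entry (i, j) lives at a[i + j * rows].
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> a;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return a[i + j * rows]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i + j * rows]; }
};

// One CholeskyQR pass: overwrites V with Q, where V_old = Q * R.
// Returns false if the Gram matrix is not numerically positive definite.
bool cholesky_qr(Matrix& V) {
    const std::size_t m = V.rows, n = V.cols;
    assert(m >= n);

    // Upper triangle of the Gram matrix G = V^T V.
    // (In parallel, each process would sum its local rows, then all-reduce.)
    Matrix G(n, n);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i <= j; ++i) {
            double s = 0.0;
            for (std::size_t k = 0; k < m; ++k) s += V(k, i) * V(k, j);
            G(i, j) = s;
        }

    // In-place Cholesky factorization G = R^T R, with R upper triangular.
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < j; ++i) {
            double s = G(i, j);
            for (std::size_t k = 0; k < i; ++k) s -= G(k, i) * G(k, j);
            G(i, j) = s / G(i, i);
        }
        double d = G(j, j);
        for (std::size_t k = 0; k < j; ++k) d -= G(k, j) * G(k, j);
        if (!(d > 0.0)) return false;  // breakdown: pivot is not positive
        G(j, j) = std::sqrt(d);
    }

    // Triangular solve Q = V * R^{-1}, column by column, in place.
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t r = 0; r < m; ++r) {
            double s = V(r, j);
            for (std::size_t i = 0; i < j; ++i) s -= V(r, i) * G(i, j);
            V(r, j) = s / G(j, j);
        }
    return true;
}

// CholeskyQR2: a second pass on Q improves its orthogonality.
bool cholesky_qr2(Matrix& V) { return cholesky_qr(V) && cholesky_qr(V); }

// Largest absolute entry of Q^T Q - I, used to check the result.
double orthogonality_error(const Matrix& Q) {
    double err = 0.0;
    for (std::size_t j = 0; j < Q.cols; ++j)
        for (std::size_t i = 0; i < Q.cols; ++i) {
            double s = 0.0;
            for (std::size_t k = 0; k < Q.rows; ++k) s += Q(k, i) * Q(k, j);
            err = std::max(err, std::fabs(s - (i == j ? 1.0 : 0.0)));
        }
    return err;
}
```

Calling `cholesky_qr2(V)` on a tall, full-rank matrix leaves an orthonormal basis in `V`. Because the method works on the Gram matrix, its accuracy degrades with the square of the condition number of `V`, which is why repeated or shifted variants exist and why a failure safeguard is needed at all.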