[00586] Challenges for Attaining High-performance in Numerical Software
Session Time & Room : 3D (Aug.23, 15:30-17:10) @E603
Type : Proposal of Minisymposium
Abstract : The architectures of today's top-performing systems are undeniably complex, building upon multi-core units and proprietary interconnects with very high levels of parallelism. These features pose many challenges to developers of numerical libraries and applications. In addition, the accuracy of numerical computations, which can be an issue for both conventional kernels (e.g., BLAS) and complex algorithms (e.g., eigensolvers), must be considered. This minisymposium will discuss recent work on automatic tuning (AT) using explainable AI (XAI), novel approaches to accuracy verification, and iterative eigensolvers that do not enforce orthogonality on the iterates, thus reducing communication.
[03413] Adaptation of XAI to Numerical Libraries: A Case Study for Automatic Performance Tuning
Format : Talk at Waseda University
Author(s) :
Takahiro Katagiri (Nagoya University)
Abstract : AI is a crucial technology. Meanwhile, we have been applying auto-tuning (AT) to numerical software. AI technology is expected to help establish AT functions for performance tuning of numerical libraries. However, it is difficult to verify the correctness of the resulting AI model. Adopting explainable AI (XAI) is one solution. In this presentation, several scenarios for applying XAI to AT functions will be demonstrated.
[03472] Parallel Eigensolvers Based on Minimization Strategies
Format : Talk at Waseda University
Author(s) :
Doru Thom Popovici (Lawrence Berkeley National Laboratory)
Osni Marques (Lawrence Berkeley National Laboratory)
Mauro Del Ben (Lawrence Berkeley National Laboratory)
Andrew Canning (Lawrence Berkeley National Laboratory)
Abstract : This presentation will show recent developments in unconstrained minimization strategies for the solution of eigenvalue problems in electronic structure calculations. These schemes employ a preconditioned conjugate gradient approach that, in contrast to typical iterative eigensolvers, avoids an explicit reorthogonalization of the trial eigenvectors. This reduces communication and makes the approach attractive for the solution of very large problems on massively parallel computers. The presentation will also discuss the need to rearrange calculations (sometimes counterintuitively) to achieve performance, in particular on GPUs.
[03700] Mixed-precision iterative refinement for real-symmetric eigenvalue decomposition with clustered eigenvalues
Format : Talk at Waseda University
Author(s) :
Yuki Uchino (Shibaura Institute of Technology)
Katsuhisa Ozaki (Shibaura Institute of Technology)
Toshiyuki Imamura (RIKEN)
Abstract : Uchino et al. presented two mixed-precision iterative refinement algorithms (herein called Algorithm 1 and 2) for the real-symmetric eigendecomposition based on the algorithm proposed by Ogita and Aishima.
Algorithm 2 offers the same convergence as Algorithm 1 while being computationally faster, as demonstrated through numerical experiments on the supercomputer Fugaku housed at RIKEN R-CCS.
We will also show that Algorithm 2 is much faster than the eigensolver provided in ScaLAPACK.
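For orientation, the refinement idea that this abstract builds on can be sketched as follows. This is a minimal illustration of one Ogita-Aishima-style refinement step, assuming well-separated eigenvalues; it is not Algorithm 1 or 2 of the talk, and it omits the special handling that clustered eigenvalues require. The function name `refine_eigvecs` and the choice of a float32 `numpy.linalg.eigh` call as the low-precision starting point are assumptions made for this example.

```python
import numpy as np

def refine_eigvecs(A, X):
    """One refinement step for approximate eigenvectors X of symmetric A.

    Assumes well-separated eigenvalues (no clustered-eigenvalue handling).
    Returns the refined eigenvectors and the eigenvalue estimates that
    were computed from the input X.
    """
    n = A.shape[0]
    R = np.eye(n) - X.T @ X            # departure from orthogonality
    S = X.T @ A @ X                    # departure from diagonality
    lam = np.diag(S) / (1.0 - np.diag(R))
    # denom[i, j] = lam[j] - lam[i]; the diagonal is overwritten below
    denom = lam[None, :] - lam[:, None]
    np.fill_diagonal(denom, 1.0)
    E = (S + R * lam[None, :]) / denom  # off-diagonal correction terms
    np.fill_diagonal(E, np.diag(R) / 2.0)
    return X + X @ E, lam

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A = Q @ np.diag(np.arange(1.0, n + 1)) @ Q.T
    A = (A + A.T) / 2                  # enforce exact symmetry
    # Low-precision start: eigenvectors computed in float32
    _, X32 = np.linalg.eigh(A.astype(np.float32))
    X = X32.astype(np.float64)
    for step in range(3):
        X, lam = refine_eigvecs(A, X)  # refinement carried out in float64
        res = np.linalg.norm(A @ X - X * lam) / np.linalg.norm(A)
        print(f"step {step + 1}: relative residual {res:.2e}")
```

Each step costs a few matrix-matrix products, which is the reason such schemes map well onto hardware with fast GEMM.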
[04170] Mixed Precision Iterative Refinement with H-matrices
Format : Talk at Waseda University
Author(s) :
Thomas Spendlhofer (Tokyo Institute of Technology)
Rio Yokota (Tokyo Institute of Technology)
Abstract : It has been shown that the solution to a dense linear system can be accelerated by using mixed precision iterative refinement relying on approximate LU-factorization.
We investigate the use of both mixed precision and low-rank approximations to obtain an approximate factorization. When employing the hierarchical matrix format, we are able to attain results as accurate as those of a double-precision solver at a lower complexity of $\mathcal{O}(n^2)$ for certain matrices.
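The general mechanism of mixed-precision iterative refinement can be sketched as follows. This is only an illustration under stated assumptions: a dense float32 LU factorization stands in for the approximate factorization, and no hierarchical-matrix or low-rank compression is implemented. The helper names (`lu_factor_nopivot`, `lu_solve_nopivot`, `refine_solve`) are hypothetical, and pivoting is omitted on the assumption that the test matrix is strongly diagonally weighted.

```python
import numpy as np

def lu_factor_nopivot(A):
    """LU factorization without pivoting; L and U are packed in one array."""
    LU = A.copy()
    n = LU.shape[0]
    for k in range(n - 1):
        LU[k + 1:, k] /= LU[k, k]
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    return LU

def lu_solve_nopivot(LU, b):
    """Solve L U x = b in the precision of LU (unit lower-triangular L)."""
    n = LU.shape[0]
    y = b.astype(LU.dtype)              # astype returns a copy
    for i in range(n):                  # forward substitution
        y[i] -= LU[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):      # back substitution
        y[i] = (y[i] - LU[i, i + 1:] @ y[i + 1:]) / LU[i, i]
    return y

def refine_solve(A, b, tol=1e-12, max_iter=20):
    """Solve A x = b: factorize once in float32, refine in float64."""
    LU32 = lu_factor_nopivot(A.astype(np.float32))   # cheap, approximate
    x = lu_solve_nopivot(LU32, b).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x                                # residual in float64
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        x += lu_solve_nopivot(LU32, r).astype(np.float64)  # correction
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n)) + n * np.eye(n)  # well conditioned
    x_true = rng.standard_normal(n)
    x = refine_solve(A, A @ x_true)
    print(np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

The expensive factorization runs once in low precision; only the residual and the solution update are kept in double precision, which is what lets the final accuracy match a double-precision solver for sufficiently well-conditioned systems.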