Abstract : Extreme-scale computing efforts have resulted in numerous advances for multicore and accelerator-based scalable systems. In addition, large-scale applications must increasingly treat data management and analysis as a first-class concern. New applications therefore often have to manage distributed and parallel computing as well as workflows of different tasks such as computation, data analytics, machine learning, and visualization. In this minisymposium, we present some of the latest work in scalable algorithms, programming paradigms, and libraries for next-generation computing platforms. Furthermore, we discuss efforts to better incorporate data science concerns as an important component of scientific workflows.
Organizer(s) : Kengo Nakajima, Michael Heroux, Serge Petiton
[05569] Innovative Supercomputing in the Exascale Era by Integration of Simulation/Data/Learning
Format : Talk at Waseda University
Author(s) :
Kengo Nakajima (The University of Tokyo/RIKEN)
Abstract : We propose an innovative method of computational science for the sustainable promotion of scientific discovery by supercomputers in the Exascale Era through the integration of Simulation, Data, and Learning (S+D+L). We are developing a software platform, “h3-Open-BDEC”, for the integration of (S+D+L) and evaluating the effects of this integration on heterogeneous supercomputer systems. h3-Open-BDEC is designed to extract the maximum performance from supercomputers with minimum energy consumption. Related activities and future perspectives are described in the talk.
[03388] Task-based hybrid parallel matrix factorization for distributed memory environment
Format : Talk at Waseda University
Author(s) :
Tomohiro Suzuki (University of Yamanashi)
Abstract : The task parallel approach provided by OpenMP has achieved great success in shared memory environments. In distributed memory environments, an interoperability issue exists between MPI and OpenMP. Several techniques have been proposed to address the issue. However, these proposed methods require a high level of thread support in MPI, so they can only work in limited environments.
In this talk, basic experimental results obtained on the Wisteria-O system will be presented.
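As a minimal, generic illustration of the MPI/OpenMP interoperability issue mentioned above (not the presenter's factorization code), the sketch below shows an OpenMP task that issues MPI calls. This style only works reliably if the MPI library provides the MPI_THREAD_MULTIPLE support level requested at initialization, which many installations do not guarantee.

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
    int provided;
    // Request full thread support so that any OpenMP task may call MPI.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        // Many MPI builds only provide FUNNELED or SERIALIZED support,
        // which is exactly the limitation discussed in the abstract.
        std::fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got %d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double block[256] = {0.0};
    #pragma omp parallel
    #pragma omp single
    {
        // A compute task followed by a dependent communication task; with
        // MPI_THREAD_MULTIPLE, whichever thread runs the task may call MPI.
        #pragma omp task depend(out: block)
        { for (int i = 0; i < 256; ++i) block[i] = rank + i; }

        #pragma omp task depend(in: block)
        {
            double sum[256];
            MPI_Allreduce(block, sum, 256, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
```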
[03651] Accelerating lattice Boltzmann method with GPU and C++ standard parallelization
Format : Talk at Waseda University
Author(s) :
Ziheng Yuan (The University of Tokyo)
Takashi Shimokawabe (The University of Tokyo)
Abstract : In recent years, with the increasing use of GPUs as accelerators on HPC platforms, C++ standard parallelism has also attracted attention as a GPU programming model. Compared with traditional parallel programming models, it offers advantages such as code readability and maintainability. This talk introduces the application of C++ standard parallelism to fluid simulation using the lattice kinetic scheme (LKS), an extended lattice Boltzmann method (LBM), and discusses its performance.
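As a rough sketch of this programming style (not the authors' LKS/LBM code), a grid update written with C++ standard parallelism looks as follows; with a compiler such as nvc++ and -stdpar=gpu, the same source can be offloaded to a GPU, while it runs multithreaded on a CPU otherwise.

```cpp
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

int main() {
    const int nx = 1024, ny = 1024;
    std::vector<double> f(nx * ny, 1.0), f_new(nx * ny, 0.0);
    std::vector<int> idx(nx * ny);
    std::iota(idx.begin(), idx.end(), 0);   // one index per grid point

    // Parallel grid sweep; the kernel is a toy 4-point relaxation,
    // standing in for the collision/streaming step of an LBM-type scheme.
    std::for_each(std::execution::par_unseq, idx.begin(), idx.end(),
                  [in = f.data(), out = f_new.data(), nx, ny](int id) {
                      int i = id % nx, j = id / nx;
                      if (i == 0 || j == 0 || i == nx - 1 || j == ny - 1) return;
                      out[id] = 0.25 * (in[id - 1] + in[id + 1] +
                                        in[id - nx] + in[id + nx]);
                  });
    return 0;
}
```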
[05041] GPU-accelerated viscoelastic crustal deformation analysis with data-driven method
Format : Talk at Waseda University
Author(s) :
Sota Murakami (The University of Tokyo)
Kohei Fujita (The University of Tokyo)
Tsuyoshi Ichimura (The University of Tokyo)
Takane Hori (Japan Agency for Marine-Earth Science and Technology)
Muneo Hori (Japan Agency for Marine-Earth Science and Technology)
Maddegedara Lalith (The University of Tokyo)
Naonori Ueda (RIKEN)
Abstract : We developed a viscoelastic analysis solver with a data-driven method on GPUs for fast computation of highly detailed 3D crustal structure models. Here, the initial solution is obtained with high accuracy using a data-driven predictor based on previous time-step results, which reduces the number of multigrid solver iterations and thus the computational cost. The algorithm is designed to be suitable for GPUs. The developed GPU-based solver attained an 8.6-fold speedup over a state-of-the-art GPU-based multigrid solver.
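As a simplified illustration of the general idea of seeding an iterative solver with a predicted initial solution (the data-driven predictor used in this work is more sophisticated), a quadratic extrapolation from the three previous time steps could look like the following sketch; the function name and extrapolation formula are illustrative assumptions, not the authors' method.

```cpp
#include <vector>

// Build an initial guess for time step n+1 from the three previous
// solutions, so the iterative (e.g., multigrid) solver starts closer
// to the converged answer and needs fewer iterations.
std::vector<double> predict_initial_guess(const std::vector<double>& u_nm2,
                                          const std::vector<double>& u_nm1,
                                          const std::vector<double>& u_n) {
    std::vector<double> guess(u_n.size());
    for (std::size_t i = 0; i < u_n.size(); ++i)
        // u^{n+1} ~ 3 u^n - 3 u^{n-1} + u^{n-2}  (quadratic extrapolation)
        guess[i] = 3.0 * u_n[i] - 3.0 * u_nm1[i] + u_nm2[i];
    return guess;
}
```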
[03035] System-Wide Coupling Communication for Heterogeneous Computing Systems
Format : Talk at Waseda University
Author(s) :
Shinji Sumimoto (The University of Tokyo)
Takashi Arakawa (CliMTech Inc.)
Yoshio Sakaguchi (Fujitsu Ltd.)
Hiroya Matsuba (Hitachi Ltd.)
Satoshi Ohshima (Kyushu University)
Hisashi Yashiro (National Institute for Environmental Studies)
Toshihiro Hanawa (The University of Tokyo)
Kengo Nakajima (The University of Tokyo/RIKEN)
Abstract : This talk presents h3-Open-SYS/WaitIO (WaitIO for short), a system-wide coupling communication library for coupling multiple MPI programs in heterogeneous computing. WaitIO provides an inter-program communication environment among MPI programs and supports different MPI libraries with various interconnects and processor types. We present how WaitIO works and how it performs in such heterogeneous computing environments.
[03261] h3-Open-UTIL/MP: a coupling library for heterogeneous computing
Format : Talk at Waseda University
Author(s) :
Takashi Arakawa (The University of Tokyo)
Shinji Sumimoto (The University of Tokyo)
Hisashi Yashiro (National Institute for Environmental Studies)
Kengo Nakajima (The University of Tokyo/RIKEN)
Abstract : Heterogeneous computing is one of the main topics in recent high-performance computing. The reason is that the role of HPC has expanded beyond simple simulation to large-scale data analysis and machine learning. Against this background, we are developing a heterogeneous coupling library, h3-Open-UTIL/MP. In our presentation, we will describe the structure and functions of h3-Open-UTIL/MP and discuss the results of performance measurements and application examples.
[03652] Modernizing the weather prediction model ICON for extreme-scale computing, a librarization effort
Format : Talk at Waseda University
Author(s) :
Yen-Chen Chen (Karlsruhe Institute of Technology)
Terry Cojean (Karlsruhe Institute of Technology)
Jonas Jucker (CSCS Swiss National Supercomputing Centre)
Sergey Kosukhin (Max-Planck-Institute for Meteorology)
Luis Kornblueh (Max-Planck-Institute for Meteorology)
Will Sawyer (CSCS Swiss National Supercomputing Centre)
Jörg Behrens (German Climate Computing Centre)
Claudia Frauen (German Climate Computing Centre)
Abstract : The weather and climate prediction model ICON has been operated since 1999 and has become the forecasting model of more than 30 national weather services. However, its legacy Fortran code makes porting to GPU clusters difficult and hinders parallel performance on modern exascale clusters. The ICON consolidated (ICON-C) project and several related projects aim to make ICON more modular, portable, and suitable for modern extreme-scale parallel computing. This talk focuses on the librarization effort of ICON-C.
[03776] Performance Modeling Challenges in Extreme Scale Computing
Format : Online Talk on Zoom
Author(s) :
Ayesha Afzal (Erlangen National High Performance Computing Center (NHR@FAU))
Abstract : In extreme-scale computing, analytic performance modeling from first principles is pre-eminent for optimization. However, it is challenging, since the implicit presumption of strict synchronization among all processes is not necessarily accurate. Therefore, for programs with rare synchronization points, simply summing the runtimes predicted by computation and communication performance models is often erroneous.
In my talk, I will highlight the most intriguing insights into the intricate hardware-software interactions that emerge from this model failure. Interestingly, hardware bottlenecks permit a non-intuitive spontaneous asynchronicity that helps to get the most out of the system's capabilities and mitigates communication overhead.
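As a generic example of the kind of model being questioned here (not the specific model of the talk), a first-principles per-process prediction is often assembled by summing an independent compute estimate and a latency-bandwidth communication term; the symbols below (lambda, B_eff) are illustrative.

```latex
% Additive per-process model: compute estimate plus a
% latency--bandwidth (Hockney-type) communication term.
\[
  T_{\mathrm{proc}} \approx T_{\mathrm{comp}} + T_{\mathrm{comm}},
  \qquad
  T_{\mathrm{comm}}(n) \approx \lambda + \frac{n}{B_{\mathrm{eff}}}.
\]
% If spontaneous desynchronization hides communication behind
% computation, the observed time drifts toward the overlapped limit
\[
  T_{\mathrm{proc}} \approx \max\left(T_{\mathrm{comp}},\, T_{\mathrm{comm}}\right),
\]
% which is why naively summing the two predictions can be erroneous.
```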
[05260] Exascale challenges and opportunities for fundamental research
Format : Online Talk on Zoom
Author(s) :
Christophe Calvin (CEA)
France Boillod-Cerneux (CEA)
Valérie Brenner (CEA)
Abstract : With exascale come new challenges: the processing of massive data coupled with numerical simulation becomes intrinsic to science. In addition, the constraints imposed by exascale computing architectures force us to rethink scientific applications as well. We are therefore faced with two major challenges. The first: how new exascale computers, embedded in a digital continuum, can provide solutions for processing complex workflows that combine data processing and simulation. The second: how to design or redesign applications so that they can exploit the architectures of exascale supercomputers. We will illustrate these two challenges through different use cases from the CEA’s Fundamental Research Division.
[03605] An algorithm reducing by 2 the number of operations for the PageRank method, and its generalisation for stochastic matrix-vector products
Format : Online Talk on Zoom
Author(s) :
Serge Georges Petiton (University of Lille, CNRS)
Maxence Vandromme (RATP Smart Systems)
Abstract : We propose an efficient PageRank algorithm that reduces the complexity by a factor of two. We implement the method using row-major and column-major sparse matrix formats. The experiments are carried out on two Intel processors from recent generations. The column-major storage format version of our method shows good scaling and outperforms standard PageRank in a majority of cases. We also propose generalisations of this algorithm to the multiplication of stochastic matrices by a vector.
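For reference, a baseline PageRank power iteration over a sparse, row-major (CSR) stochastic matrix is sketched below; this is the standard algorithm against which such work is compared, not the authors' optimized variant that roughly halves the operation count, and dangling nodes are ignored for brevity.

```cpp
#include <cstddef>
#include <vector>

// Sparse matrix in CSR (row-major) format holding the stochastic
// transition probabilities of the link graph.
struct CsrMatrix {
    std::size_t n;                              // number of vertices
    std::vector<std::size_t> row_ptr, col_idx;  // CSR structure
    std::vector<double> val;                    // transition probabilities
};

// Standard damped power iteration: x <- alpha * A x + (1 - alpha)/n.
std::vector<double> pagerank(const CsrMatrix& A, double alpha = 0.85,
                             int max_iter = 100) {
    std::vector<double> x(A.n, 1.0 / A.n), y(A.n);
    for (int it = 0; it < max_iter; ++it) {
        for (std::size_t i = 0; i < A.n; ++i) {
            double s = 0.0;
            for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
                s += A.val[k] * x[A.col_idx[k]];
            y[i] = alpha * s + (1.0 - alpha) / A.n;
        }
        x.swap(y);
    }
    return x;
}
```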
[03705] Accelerating Cardiac Electrophysiology Simulations using novel AI Hardware
Format : Talk at Waseda University
Author(s) :
Johannes Langguth (Simula Research Laboratory)
Luk Bjarne Burchard (Simula Research Laboratory)
Xing Cai (Simula Research Laboratory)
Abstract : Recent advances in personalized arrhythmia risk prediction show that computational models can provide not only safer but also more accurate results than invasive procedures. However, biophysically accurate simulations require solving linear systems over fine meshes and time resolutions, which demands significant computational resources. By leveraging sophisticated parallelization patterns as well as non-traditional hardware architectures, it is nevertheless possible to meet the computational demands of these simulations.
A major recent development in computer hardware has been the rise of dedicated accelerators for machine learning applications, such as the Graphcore IPU and the Cerebras WSE. These processors have evolved from the experimental stage into market-ready products, and they have the potential to constitute the next major architectural shift after the widespread adoption of GPUs a decade ago.
In this talk, we present ongoing work on the parallelization of finite volume computations over an unstructured mesh using these new accelerators. We compare them to traditional CPUs and GPUs and point out challenges and opportunities of this new hardware for extreme scale computing.
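As a minimal sketch of the kind of kernel involved (a generic placeholder, not the authors' cardiac electrophysiology code), an edge-based finite volume flux accumulation over an unstructured mesh illustrates the indirect, irregular memory accesses that such accelerators must handle; the connectivity arrays and the central flux below are illustrative assumptions.

```cpp
#include <vector>

// Accumulate face fluxes into cell residuals on an unstructured mesh.
// The indirect accesses through face_left/face_right are what make this
// pattern hard to map onto IPU/WSE-style on-chip memory hierarchies.
void fv_update(const std::vector<int>& face_left,
               const std::vector<int>& face_right,
               const std::vector<double>& face_area,
               const std::vector<double>& u,
               std::vector<double>& residual) {
    for (std::size_t f = 0; f < face_left.size(); ++f) {
        int L = face_left[f], R = face_right[f];
        // Simple central flux as a placeholder for the real physics.
        double flux = 0.5 * (u[L] + u[R]) * face_area[f];
        residual[L] -= flux;   // flux leaves the left cell
        residual[R] += flux;   // and enters the right cell
    }
}
```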
[05304] A Medical Data Analytics Framework Transforming Big Data to Better Healthcare
Format : Talk at Waseda University
Author(s) :
Weichung Wang (National Taiwan University)
Abstract : Incorporating data science is crucial for next-generation medical workflows that rely on high-performance computing to analyze large-scale medical data for digital and precision medicine. The "Medical Data Analytics Framework" combines project design, multimodal data, intelligent analytics, medical workflows, regulation, ethics, deployment, and operations to achieve an end-to-end R&D life cycle in medical AI that positively impacts clinical workflows. This interdisciplinary framework reduces physicians' workload and assists diagnosis with advanced algorithms and software.