All times are CST
Tuesday, Nov. 16

10:45 a.m. (Join Virtually via Webex)
Sayan Ghosh, Pacific Northwest National Laboratory
“Characterizing Performance of Graph Neighborhood Communication Patterns”
Abstract
Distributed-memory graph algorithms are fundamental enablers in scientific computing and analytics workflows. Most graph algorithms rely on the graph neighborhood communication pattern, i.e., repeated asynchronous communication between a vertex and its neighbors in the graph. The pattern is adversarial for communication software and hardware due to high message injection rates and input-dependent, many-to-one traffic with variable destinations and volumes. Therefore, carefully designed graph neighborhood communication benchmarks will assist system designers and engineers in optimizing exascale systems for irregular applications. We build benchmarks and conduct performance analysis of graph neighborhood communication on modern large-scale network interconnects from four supercomputers: ALCF Theta, NERSC Cori, OLCF Summit, and R-CCS Fugaku. We characterize communication from the perspectives of latency and throughput and analyze the effects of per-vertex work in our synthetic workloads.
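The neighborhood pattern the abstract describes can be illustrated with a minimal, single-process sketch (pure Python, hypothetical names; not the talk's benchmark code). Each vertex sends its value to every neighbor, and receivers aggregate the incoming messages; in a real distributed run each send would be an asynchronous message to the rank owning the neighbor, producing the irregular, many-to-one traffic the talk characterizes.

```python
from collections import defaultdict

def neighborhood_exchange(adjacency, values):
    """One round of the graph neighborhood pattern: every vertex sends its
    value to each of its neighbors; each receiver aggregates what arrives.
    Here we run it sequentially and just tally the message traffic."""
    inbox = defaultdict(list)
    messages = 0
    for u, nbrs in adjacency.items():
        for v in nbrs:
            inbox[v].append(values[u])   # many-to-one, input-dependent volume
            messages += 1
    aggregated = {v: sum(msgs) for v, msgs in inbox.items()}
    return aggregated, messages

# Toy graph: a star exhibits the adversarial many-to-one pattern,
# since every leaf injects a message toward the hub vertex 0.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
vals = {0: 10, 1: 1, 2: 2, 3: 3}
agg, n_msgs = neighborhood_exchange(adj, vals)
# Hub vertex 0 aggregates all three leaf values: agg[0] == 6; 6 messages total.
```

Skewed degree distributions (as in the star above) are exactly what makes destination counts and message volumes input-dependent, which is why the benchmarks vary graph structure as well as per-vertex work.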

11:30 a.m. (Join Virtually via Webex)
Marcus Noack, Lawrence Berkeley National Laboratory
“Optimal Autonomous Data Acquisition for Large-Scale Experimental Facilities”
Abstract
The execution and analysis of ever more complex experiments are increasingly challenged by the vast dimensionality of the parameter spaces that underlie investigations in the biological, chemical, physical, and materials sciences. While an increase in data-acquisition rates should allow broader querying of the parameter space, the complexity of experiments and the subtle dependence of the model function on input parameters remain daunting due to the sheer number of variables. To meet these challenges, new strategies for autonomous data acquisition are rapidly coming to fruition and are being deployed across a spectrum of scientific experiments. One promising direction is the use of Gaussian process regression (GPR) – a quick, non-parametric, robust approximation and uncertainty quantification method that can directly be applied to autonomous data acquisition. In this talk, I will present our work on GPR-driven autonomous experimentation at large-scale experimental facilities around the globe.
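The core loop behind GPR-driven acquisition can be sketched in a few lines of NumPy (a toy 1-D illustration with made-up data, not the speaker's software): fit a GP posterior to the measurements taken so far, then direct the instrument to the point where the posterior uncertainty is largest.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D points."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6, length=1.0):
    """GP posterior mean and variance at the query points."""
    K = rbf_kernel(x_train, x_train, length) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query, length)
    Kss = rbf_kernel(x_query, x_query, length)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks.T @ alpha
    var = np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks))
    return mean, var

# Autonomous acquisition step: measure where the model is least certain.
x_obs = np.array([0.0, 1.0, 3.0])        # hypothetical past measurements
y_obs = np.sin(x_obs)
x_grid = np.linspace(0.0, 4.0, 81)
mean, var = gp_posterior(x_obs, y_obs, x_grid)
x_next = x_grid[np.argmax(var)]          # next point for the instrument
```

Production deployments use richer acquisition functions and anisotropic, tuned kernels, but the feedback loop (fit, quantify uncertainty, choose the next measurement) is the same.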

1:45 p.m. (Join Virtually via Webex)
Yuhua Duan, National Energy Technology Laboratory
“Modeling of Functional Materials for Energy Applications with NETL High-Performance Supercomputer Joule 2.0”
Abstract
The Joule 2.0 Supercomputer (https://hpc.netl.doe.gov) at the Office of Fossil Energy and Carbon Management’s National Energy Technology Laboratory (NETL) is a 5.7 PFLOPS computer that enables the simulation of phenomena that are difficult or impossible to measure. It is intended to help energy researchers discover new materials, optimize designs, and better predict operational characteristics. Functional materials are found in all classes of materials and are generally characterized as those that possess distinctive native properties and functions of their own (such as energy storage, magnetism, piezoelectricity, sensing, and optics). Exploring the properties of these materials is key to a wide range of applications. Computational simulation can play an important role in screening and designing functional materials for specific applications, since it identifies candidates faster and at lower cost than experiment. Instead of experimentally testing vast numbers of materials, high-throughput computational methods can screen large numbers of candidates from a materials database against the properties each application requires, so that only the most promising predicted candidates are validated by experimental measurements. In addition, computational modeling can design new materials that do not yet exist in the database. First-principles density functional theory (DFT) has been widely used to simulate atomic-scale and nanoscale phenomena for materials engineering, materials optimization, and materials discovery. Executing DFT software on a high-performance computer lets us explore larger systems and obtain results faster, and the simulations give insights that help us understand and improve the performance of materials.
In this presentation, we demonstrate the use of multi-scale modeling on high-performance computers to simulate functional materials for several energy-related applications, including (1) solid oxide fuel cells, (2) high-temperature optical sensing materials, (3) CO2 capture, and (4) tritium-producing burnable absorber rods (TPBARs) in nuclear reactors.
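The screening workflow the abstract describes can be sketched abstractly (pure Python, with a hypothetical four-entry "database" and property; real pipelines query large materials databases and compute properties with DFT): rank candidates by how close a cheaply computed property is to the application's target, and pass only the best matches on to experimental validation.

```python
def screen_candidates(database, predict_property, target, tolerance, top_k=3):
    """High-throughput screening sketch: rank entries by the error between a
    computed property and the target value; keep the closest matches that
    fall within tolerance for follow-up experimental validation."""
    scored = sorted((abs(predict_property(m) - target), name)
                    for name, m in database.items())
    return [name for err, name in scored[:top_k] if err <= tolerance]

# Hypothetical database with one precomputed property per material:
# a band gap in eV (stand-in for any DFT-derived descriptor).
db = {"A": {"gap": 1.1}, "B": {"gap": 3.2}, "C": {"gap": 1.4}, "D": {"gap": 0.2}}
hits = screen_candidates(db, lambda m: m["gap"], target=1.3, tolerance=0.5)
# → ["C", "A"]: the candidates closest to the 1.3 eV target
```

The expensive step (here a dictionary lookup, in practice a DFT calculation per material) dominates the cost, which is why such screens are run at scale on machines like Joule 2.0.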

2:30 p.m. (Join Virtually via Webex)
Al Geist, Oak Ridge National Laboratory
“Frontier: The First U.S. Exascale System”
Abstract
Frontier has now been delivered and installed at Oak Ridge National Laboratory. This HPE/AMD system, with a double-precision peak of over 1.5 exaflops, is the first exascale system in the U.S. This talk will describe the Frontier system and initial experiences. It will also describe the challenges caused by the chip shortages and the heroic efforts by HPE and AMD to meet the Frontier delivery schedule.

3:15 p.m. (Join Virtually via Webex)
Arvind Ramanathan, Argonne National Laboratory
“AI-Driven Adaptive Multiresolution Molecular Simulations on Heterogeneous Computing Platforms”
Abstract
Emerging hardware tailored for artificial intelligence (AI) and machine learning (ML) methods provides novel means to couple these methods with traditional high-performance computing (HPC) workflows involving molecular dynamics (MD) simulations. We propose Stream-AI-MD, a novel instance of applying deep learning methods to drive adaptive MD simulation campaigns in a streaming manner. We leverage the ability to run ensemble MD simulations on GPU clusters, while the data from atomistic MD simulations are streamed continuously to AI/ML approaches on a wafer-scale AI accelerator to guide the conformational search in a biophysically meaningful manner. We demonstrate the efficacy of Stream-AI-MD simulations for two scientific use cases: (1) folding a small prototypical protein, namely BBA FSD-EY, and (2) understanding protein-protein interaction (PPI) within the SARS-CoV-2 proteome between two proteins, nsp16 and nsp10. We show that Stream-AI-MD simulations can improve time-to-solution by ~50X for BBA protein folding. In addition, we also demonstrate the use of Stream-AI-MD in running multiresolution simulations for understanding the SARS-CoV-2 replication transcription complex.
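The adaptive loop at the heart of such campaigns can be sketched in a heavily simplified form (pure Python with hypothetical names; the real workflow streams atomistic frames to a deep-learning model on the accelerator): score the frames arriving from the running ensemble with an ML "novelty" measure, and restart the next batch of simulations from the most novel states.

```python
def select_restarts(frames, novelty, ensemble_size):
    """AI-guided adaptive sampling sketch: rank streamed MD frames by an
    ML-derived novelty score and return the top states as restart points
    for the next batch of ensemble simulations."""
    ranked = sorted(frames, key=novelty, reverse=True)
    return ranked[:ensemble_size]

# Toy frames: (frame_id, score from a hypothetical latent-space model).
frames = [("f1", 0.2), ("f2", 0.9), ("f3", 0.5), ("f4", 0.7)]
restarts = select_restarts(frames, novelty=lambda f: f[1], ensemble_size=2)
# → [("f2", 0.9), ("f4", 0.7)]
```

Because scoring happens on streamed data rather than after the simulations finish, the ensemble is redirected continuously, which is where the reported ~50X time-to-solution improvement comes from.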