2019 Featured Speakers
Tuesday, Nov. 19
Stephane Ethier, Princeton Plasma Physics Laboratory
“High-Fidelity Whole-Device Model of Magnetically Confined Fusion Plasma”
The goal of this project is to develop a high-fidelity whole-device model (WDM) of magnetically confined fusion plasmas, validated on present tokamak experiments, which is urgently needed to understand and predict the performance of ITER and future next-step facilities. Guided by the understanding obtained from several fusion experiments as well as theory and simulation activities in the U.S. and abroad, ITER is expected to attain tenfold energy gain and will realize burning plasmas well beyond the operational regimes accessible in present and past fusion experiments. The science of fusion plasmas is inherently multi-scale in space and time, spanning several orders of magnitude in a geometrically complex configuration, and is an ideal testbed for extreme-scale computing. Our 10-year problem target on exascale computers is the high-fidelity simulation of whole-device burning plasmas in a high-performance advanced tokamak regime (i.e., an ITER steady-state plasma with tenfold energy gain), integrating the effects of turbulence- and collision-induced transport, large-scale magnetohydrodynamic instabilities, energetic particles, plasma-material interactions, and heating and current drive.
Deborah Bard, Lawrence Berkeley National Laboratory
“Cross-Facility Science: The Superfacility Model at Lawrence Berkeley National Laboratory”
As data sets from DOE user facilities grow in both size and complexity, there is an urgent need for new capabilities to transfer, analyze, store and curate the data to facilitate scientific discovery. DOE supercomputing facilities have begun to expand services and provide new capabilities in support of experiment workflows via powerful computing, storage, and networking systems. In this talk, I will introduce the Superfacility concept—a framework for integrating experimental and observational instruments with computational and data facilities at NERSC. I will discuss the science requirements that are driving this work, and how this translates into technical innovations in data management, scheduling, networking, and automation. In particular, I will focus on the new ways experimental scientists are accessing HPC facilities, and the implications for future system design.
Ang Li, Pacific Northwest National Laboratory
“Online Anomalous Running Detection via Recurrent Neural Network for GPU-Accelerated HPC Machines”
We propose a workload classification framework that discriminates illicit computation from authorized workloads on GPU-accelerated HPC systems. As such heterogeneous systems become more powerful, attackers can exploit them to run malicious and for-profit programs that typically require extremely high computing capability to succeed. Our classification framework leverages the distinctive signatures of illicit and authorized workloads and applies machine learning methods to learn and classify them. The framework uses lightweight, non-intrusive workload profiling to collect model input data and explores multiple machine learning methods, particularly recurrent neural networks (RNNs), which are well suited to online anomalous workload detection. Evaluation results on three generations of GPU machines demonstrate that the framework identifies illicit workloads with over 95% accuracy. The collected dataset, detection framework, and neural network models will be released on GitHub.
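A minimal sketch of the kind of RNN classifier such a framework might explore, assuming each workload is profiled as a fixed-length sequence of performance-counter vectors (the shapes, layer sizes, and placeholder data below are illustrative assumptions, not the released models):

```python
# Hypothetical sketch: an RNN classifier over GPU profiling time series,
# assuming each sample is a sequence of per-interval counter vectors.
import numpy as np
import tensorflow as tf

SEQ_LEN, N_FEATURES = 60, 16   # assumed: 60 profiling intervals, 16 counters

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    tf.keras.layers.LSTM(64),                        # summarizes the temporal signature
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(illicit workload)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder data standing in for profiled authorized/illicit workloads.
x = np.random.rand(256, SEQ_LEN, N_FEATURES).astype("float32")
y = np.random.randint(0, 2, size=(256, 1))
model.fit(x, y, epochs=2, batch_size=32)
```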
Prasanna Balaprakash, Argonne National Laboratory
“Scientific Domain-Informed Machine Learning”
Extracting knowledge from scientific data—produced from observation, experiment, and simulation—presents a significant hurdle for scientific discovery. As the U.S. Department of Energy (DOE) has moved toward data-driven scientific discovery, machine learning (ML) has become a critical technology in the modeling of complex phenomena in concert with current computational, experimental, and observational approaches. In the past few years, increased availability of massive data sets and growing computational power have led to breakthroughs in many scientific domains. However, development of ML systems for many scientific domains poses several challenges such as data paucity, domain-knowledge integration, and adaptability. In this talk, we will present Argonne’s work on scientific domain-informed ML approaches that seek to overcome these challenges. We will illustrate these methods using case studies on a range of DOE scientific applications. We will conclude with some exciting avenues for future research.
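One common flavor of domain-knowledge integration can be sketched as a physics-informed loss: fit scarce labeled data while penalizing violations of a known law, here dy/dx = -y (a generic sketch, not Argonne's method; all shapes and hyperparameters are assumptions):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(32, activation="tanh"),
    tf.keras.layers.Dense(1),
])
opt = tf.keras.optimizers.Adam(1e-2)

x_data = tf.constant([[0.0], [1.0]])            # scarce labeled points
y_data = tf.exp(-x_data)                        # consistent with the law below
x_phys = tf.random.uniform([64, 1], 0.0, 2.0)   # unlabeled collocation points

for _ in range(500):
    with tf.GradientTape() as tape:
        with tf.GradientTape() as inner:
            inner.watch(x_phys)
            y_phys = model(x_phys)
        dy_dx = inner.gradient(y_phys, x_phys)
        data_loss = tf.reduce_mean((model(x_data) - y_data) ** 2)
        phys_loss = tf.reduce_mean((dy_dx + y_phys) ** 2)  # residual of dy/dx = -y
        loss = data_loss + phys_loss
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
```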
Dirk VanEssendelft, National Energy Technology Laboratory
“TensorFlow For Scientific and Engineering HPC Computations: Examples in Computational Fluid Dynamics”
The National Energy Technology Laboratory (NETL) has been exploring the use of TensorFlow (TF) for general scientific and engineering computations within HPC environments, which might include machine learning (ML). TF has some unique capabilities in the HPC environment that can reduce effort and development time. Specifically, memory management, communication, data operations, code optimization, and parallelization are handled on a wide variety of hardware in a largely automated fashion. These inherent qualities allow a practitioner to focus largely on algorithm development without requiring deep computational science knowledge (although deep diving into TF code development can improve performance and application efficiency). NETL will present two examples of TF's capabilities for science and engineering applications in the context of computational fluid dynamics. First, NETL recently developed a novel stiff chemistry solver implemented in TF and achieved a ~300× speedup over serial LSODA and a ~35× speedup over parallel LSODA. Second, NETL developed a TF-based single-phase fluid solver and achieved a ~3.1× improvement over 40 ranks of MPI on CPU (much higher accelerations are possible with further parallelization, and better scaling is achieved when more transport equations are solved). NETL will detail early benchmarks on small- to medium-scale problems and discuss how next-generation software can be significantly improved. NETL is also presenting lessons learned in short tutorial form at NVIDIA's Expo theater as a complementary talk (check NVIDIA's schedule for date and time).
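For flavor, a minimal sketch (not NETL's solver) of using TF as a general compute engine: one explicit time step of 2D heat diffusion written as a conv2d stencil, which TF places and parallelizes on CPU or GPU automatically:

```python
import tensorflow as tf

# 5-point Laplacian stencil as a convolution kernel (height, width, in, out).
lap = tf.reshape(tf.constant([[0., 1., 0.],
                              [1., -4., 1.],
                              [0., 1., 0.]]), [3, 3, 1, 1])

@tf.function  # compiled graph; runs unchanged on CPU or GPU
def step(u, alpha=0.1):
    lap_u = tf.nn.conv2d(u, lap, strides=1, padding="SAME")
    return u + alpha * lap_u

u = tf.random.uniform([1, 256, 256, 1])  # batch, height, width, channel
for _ in range(100):
    u = step(u)
```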
David Womble, Oak Ridge National Laboratory
“Opportunities at the Intersection of Artificial Intelligence and Science”
Recent impacts of artificial intelligence (AI) have been enabled by huge increases in data collection and high-performance computing. This presentation will highlight recent successes in the application and potentially disruptive opportunities of AI within the DOE mission space.
Keren Bergman, Columbia University
“Optically Connected Memory for High Performance Computing”
As the computational speed required by the cloud and high-performance computing continues to scale up, the required memory bandwidth is not keeping pace. Conventional electronic interconnects are limited by the inherent power consumption challenges of communicating high data rates over distances beyond the chip scale. Today, applications such as machine learning and deep neural networks require large memory banks to store weights and learning data. This talk will cover the opportunity offered by optically connected memory with silicon photonic links, which have the benefit of low energy per bit, small footprint, and compatibility with the current CMOS processes and ASICs.
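A back-of-envelope calculation shows why energy per bit dominates this design space; the pJ/bit figures below are illustrative assumptions, not numbers from the talk:

```python
# Link power scales linearly with traffic and with the per-bit energy cost.
def link_power_watts(bandwidth_bytes_per_s, picojoules_per_bit):
    return bandwidth_bytes_per_s * 8 * picojoules_per_bit * 1e-12

bw = 1e12  # 1 TB/s of memory traffic
for name, pj in [("electrical (assumed ~10 pJ/bit)", 10.0),
                 ("silicon photonic (assumed ~1 pJ/bit)", 1.0)]:
    print(f"{name}: {link_power_watts(bw, pj):.0f} W at 1 TB/s")
```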
Wednesday, Nov. 20
Brian Spears, Lawrence Livermore National Laboratory
“Cognitive Simulation: Integrating Large-Scale Simulations and Experiments Using Deep Learning”
Lawrence Livermore National Laboratory (LLNL) builds world-class predictive capabilities across a wide variety of national security missions. We continually challenge our theory-driven simulations with precision experimental data. Both simulation and experiment have become very data-rich, with a complex set of observables including scalars, vector-valued data, and various images. Traditional approaches can omit much of this information, making the resulting models less accurate than they otherwise could be. Today, LLNL teams are tackling this problem by developing Cognitive Simulation tools—deep learning technologies that improve predictive capabilities by effectively coupling simulation and experimental data. These CogSim techniques amplify our effective computational power, improve predictive performance, and offer new AI-driven approaches to design. To build CogSim models, we first train deep neural network models on simulation data to capture the theory implemented in advanced simulation codes. Later, we improve, or elevate, the trained models by incorporating experimental data. The training and elevation process both improves our predictive accuracy and provides a quantitative measure of uncertainty in such predictions. We will present an overview of work in this arena with specific examples from testbed research in inertial confinement fusion at the National Ignition Facility. This includes advanced deep learning architectures and methods necessary to handle rich, multimodal data and strong nonlinearities, as well as techniques for reconciling these models with real experimental data. We also cover our work on enormous training sets—billions of both scalar and image observables—and models trained on them using the more than 17,000 GPUs on the Sierra supercomputer. We also describe our ongoing efforts to co-design next-generation platforms that are optimized for both the precision simulation and the machine learning demanded by CogSim and future applications.
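The two-stage train-then-elevate idea can be sketched as ordinary transfer learning; the architecture, placeholder data, and freezing strategy below are assumptions for illustration, not LLNL's CogSim models:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stage 1: learn the theory from plentiful simulation data (placeholders here).
x_sim, y_sim = np.random.rand(10000, 32), np.random.rand(10000, 1)
model.fit(x_sim, y_sim, epochs=2, verbose=0)

# Stage 2: "elevate" on scarce experimental data with a small learning rate,
# freezing early layers so the simulation-learned representation is retained.
for layer in model.layers[:-1]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")
x_exp, y_exp = np.random.rand(100, 32), np.random.rand(100, 1)
model.fit(x_exp, y_exp, epochs=5, verbose=0)
```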
Balint Joo and Graham Heyes, Thomas Jefferson National Accelerator Facility
“HPC at Jefferson Lab for Theory and Experiment”
We will discuss two of the primary computational workloads related to high performance computing at Jefferson Lab: Lattice QCD (LQCD) calculations and experimental data analysis workflows. Lattice QCD calculations are carried out in tandem with allocations at leadership facilities, with Jefferson Lab operating national shared cluster resources to provide mid-range capacity computing to the U.S. LQCD community. Jefferson Lab staff are actively engaged in software development as part of the SciDAC-4 program and the Exascale Computing Project to exploit the most recently available compute architectures, enabling the use of both the large-scale DOE facilities and locally hosted cluster resources. We will detail some recent results in exploiting accelerator technologies and in the area of performance portability. In terms of data analysis, advances in all aspects of computing are beginning to make possible a new model for the analysis workflows of nuclear physics experiments, in which data filtering is minimized and data is streamed in parallel through various stages of online and near-line processing. This contrasts with the slower model of the last 30 years, in which data was read from detectors, subjected to heavy filtering, and stored for post-processing at a later date using thousands of individual jobs on a batch system. The new approach results in richer multi-dimensional datasets that can be made accessible for processing using grid, cloud, or leadership-class computing facilities. It is also a much more responsive workflow, leaving decisions affecting science quality as late as possible. We will provide an update on the progress of work at Jefferson Lab aimed at investigating several aspects of this new computing model. It is expected that, on the five- to ten-year timescale, streaming data readout and processing will become the norm.
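The contrast with batch processing can be sketched as a pipeline through which events flow as they are read out; the stage names below are illustrative, and a production system would use parallel processes and network streams rather than in-process generators:

```python
# Each stage consumes a stream of events and yields enriched events onward,
# so processing begins while data taking is still in progress.
def calibrate(events):
    for e in events:
        yield {**e, "calibrated": True}

def reconstruct(events):
    for e in events:
        yield {**e, "tracks": 3}   # placeholder reconstruction result

detector = ({"event": i} for i in range(5))   # stand-in for streaming readout
for result in reconstruct(calibrate(detector)):
    print(result)
```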
Brian Albright and Brian Settlemyer, Los Alamos National Laboratory
“Co-design at Extreme Scale: Finding New Efficiencies in Simulation, I/O, and Analysis”
Los Alamos National Laboratory’s (LANL’s) Vector Particle in Cell code, VPIC, has for several years been a key driver of scientific discovery in plasma physics. The per-node performance and scalability of VPIC has enabled massive simulations (up to several trillions of computational particles and hundreds of billions of computational cells) using multiple generations of supercomputers across the DOE complex. However, scientific discovery is driven not just by computational power, but also by the ability to find new insights within massive datasets. For calculations of extreme size, this can pose a profound challenge. In this talk, we describe how the ability to efficiently output and analyze data using DeltaFS is critical to the plasma physics workflow and how co-designed I/O capabilities in particular have accelerated data analysis and discovery. By combining efficient simulation and efficient data analysis within VPIC, LANL has expanded the frontiers of plasma physics and made key discoveries in a range of scientific areas, including magnetohydrodynamics, space physics, laser-plasma interaction, and the properties of high energy density matter.
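The co-design idea behind indexed particle output can be sketched as bucketing records by particle ID at write time, so a later trajectory query touches one bucket instead of scanning the whole dump (a toy in-memory stand-in for the concept, not DeltaFS's implementation):

```python
import collections

NBUCKETS = 8
dumps = collections.defaultdict(list)   # bucket -> [(step, pid, x), ...]

def write_dump(step, particles):
    for pid, x in particles:
        dumps[pid % NBUCKETS].append((step, pid, x))

def trajectory(pid):
    # Touches a single bucket, not every record in the dump.
    return sorted(rec for rec in dumps[pid % NBUCKETS] if rec[1] == pid)

for step in range(3):
    write_dump(step, [(pid, 0.1 * pid + step) for pid in range(16)])
print(trajectory(5))
```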
Meifeng Lin, Brookhaven National Laboratory
“High Performance Computing for Large-Scale Experimental Facilities”
This presentation will describe recent work in bringing high-performance computing solutions to large-scale experimental facilities, such as the National Synchrotron Light Source II (NSLS-II) at Brookhaven National Laboratory and the ATLAS experiment at CERN’s Large Hadron Collider particle accelerator. With the unprecedented amount of data continually produced at these large-scale user facilities, the need for incorporating HPC technologies and tools into experimental workflows continues to rise. Compute accelerators, such as graphics processing units (GPUs), can offer a tremendous boost to computational workloads for experiments conducted at these facilities. However, amending software to use accelerators more efficiently can be challenging. In collaboration with NSLS-II and ATLAS, Brookhaven’s Computational Science Initiative has successfully adapted some key software to use GPUs. This presentation will examine the challenges associated with porting C++- and Python-based software to GPUs and how these enhancements will impact experimental workflow approaches employed at scientific user facilities and the ways resulting data are processed.
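For the Python side, the flavor of such a port can be sketched with CuPy, which mirrors the NumPy API so hot array kernels can often be moved to a GPU by swapping the array module (illustrative only, not NSLS-II or ATLAS code):

```python
import numpy as np
try:
    import cupy as xp          # GPU path if CuPy and a CUDA device are available
except ImportError:
    xp = np                    # CPU fallback keeps the code runnable anywhere

frame = xp.asarray(np.random.rand(2048, 2048))   # e.g., one detector image
dark = xp.asarray(np.random.rand(2048, 2048))    # dark-field correction frame
corrected = frame - dark                         # runs on the GPU when xp is cupy
print(float(corrected.mean()))
```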
Andrew Younge, Sandia National Laboratories
“Supercontainers for HPC”
As the code complexity of HPC applications expands, development teams increasingly rely on detailed software operation workflows to automate the building and testing of their applications. These development workflows can become complex and, as a result, difficult to maintain as the target platforms' environments grow in architectural diversity and continually change. Recently, the advent of containers in industry has demonstrated the feasibility of such workflows, and the latest support for containers in HPC environments makes them attainable for application teams. Fundamentally, containers have the potential to provide a mechanism for simplifying development and deployment workflows, which could improve overall build and testing efficiency for many teams. This talk introduces the Exascale Computing Project (ECP) Supercomputing Containers Project, named Supercontainers, which represents a consolidated effort across the DOE and NNSA to use a multi-level approach to accelerate the adoption of container technologies for exascale. A major tenet of the project is to ensure that container runtimes are well poised to take advantage of future HPC systems, including efforts to ensure container images can be scalable, interoperable, and well integrated into exascale supercomputers across the DOE. The project focuses on the foundational system software research needed to ensure containers can be deployed at scale, and it provides enhanced user and developer support to ensure containerized exascale applications and software are both efficient and performant. Furthermore, these activities are conducted in the context of interoperability, generating portable solutions that work for HPC applications across DOE facilities, from laptops to exascale platforms.
Jana Thayer and Chin Fang, SLAC National Accelerator Laboratory
“Big Data at the Linac Coherent Light Source”
The increase in volume and complexity of the data generated by the upcoming LCLS-II upgrade presents a considerable challenge for data acquisition, data processing, and data management. These systems face formidable challenges due to the extremely high data throughput (hundreds of GB/s to multiple TB/s) generated by the detectors at the experimental facilities, and due to the intensive computational demand of data processing and scientific interpretation. The LCLS Data System is a fast, powerful, and flexible architecture that includes a feature-extraction layer designed to reduce data volumes by at least one order of magnitude while preserving the science content of the data. Innovative architectures are required to implement this reduction with a configurable approach that can adapt to the multiple science areas served by LCLS. To increase the likelihood of experiment success and improve the quality of recorded data, a real-time analysis framework provides visualization and graphically configurable analysis of a selectable subset of the data on the timescale of seconds. A fast feedback layer offers dedicated processing resources to the running experiment to provide experimenters with feedback about the quality of acquired data within minutes. We will present an overview of the LCLS Data System architecture with an emphasis on the Data Reduction Pipeline and the online monitoring framework.
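The feature-extraction idea can be sketched as keeping only candidate photon peaks from each frame; the threshold and frame shape below are assumptions for illustration, not LCLS parameters:

```python
import numpy as np

def reduce_frame(frame, threshold=5.0):
    """Keep only above-threshold pixels as (row, col, intensity) records."""
    ys, xs = np.nonzero(frame > threshold)
    return np.stack([ys, xs, frame[ys, xs]], axis=1)

# Placeholder detector frame: mostly background with rare bright pixels.
frame = np.random.poisson(0.5, size=(1024, 1024)).astype(float)
peaks = reduce_frame(frame)
print(f"kept {peaks.shape[0]} of {frame.size} pixels "
      f"({frame.nbytes / max(peaks.nbytes, 1):.0f}x smaller)")
```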
2018 Featured Speakers
Tuesday, Nov. 13
Pete Beckman, Argonne National Laboratory
“The Tortoise and the Hare: Is There Still Time for HPC to Catch Up to the Cloud in the Performance Race?”
Speed and scale define supercomputing. By many metrics, our supercomputers are the fastest, most capable systems on the planet. We have succeeded in deploying extreme-scale systems with high reliability, extended uptime, and large user communities. Computational science at extreme scale is leading to scientific breakthroughs. Over the past twenty years, however, our community has become overconfident in its designs for HPC system software, while the cloud computing community has been steadily adding new software features and intelligent networking. From containers and virtual machines to software-defined networking and FPGAs in the fabric, the hyperscalers have been steadily moving forward, building advanced systems. Has the cloud computing community already won the race? Can HPC regain leadership in the design and architecture of flexible system software and leverage containers, advanced operating systems, reconfigurable fabrics, and software-defined networking? Come learn about Argo, an operating system project for the Exascale Computing Project, how “Fluid HPC” could make large-scale systems more flexible, and how the HPC community might leverage these new technologies.
Panagiotis Spentzouris, Fermi National Accelerator Laboratory
“Fermilab’s Quantum Computing Program”
Fermilab’s Panagiotis Spentzouris will discuss the goals and strategy of the Fermilab Quantum Science Program, which includes simulation of quantum field theories, development of algorithms for high-energy physics computational problems, teleportation experiments, and the application of qubit technologies to quantum sensors in high-energy physics experiments.
Sriram Krishnamoorthy, Pacific Northwest National Laboratory
“Intense National Focus on QIS”
PNNL scientist Sriram Krishnamoorthy invites you to learn how the scientific grand challenge of quantum chemistry will benefit from quantum computers. PNNL, with its depth of experience in computational chemistry, is currently exploring and designing the quantum chemistry problems that can benefit most from quantum computers. In addition, PNNL’s computer scientists and computational chemists are working closely with industry partners to jointly design the first quantum computing-based quantum chemistry calculations that surpass the limits of classical supercomputers. In this talk, Krishnamoorthy will describe these efforts and collaborations as well as other ongoing quantum computing-related activities at PNNL.
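The variational approach underlying many quantum-chemistry-on-quantum-computer algorithms can be illustrated classically: minimize the energy expectation of a parameterized state over its parameter (the 2×2 Hamiltonian below is an arbitrary stand-in, not a molecular one):

```python
import numpy as np

H = np.array([[ 1.0,  0.5],
              [ 0.5, -1.0]])   # toy Hamiltonian, not a molecule

def energy(theta):
    """Energy expectation <psi(theta)|H|psi(theta)> for a real one-parameter ansatz."""
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    return psi @ H @ psi

thetas = np.linspace(0, 2 * np.pi, 721)
best = min(thetas, key=energy)
print(energy(best), np.linalg.eigvalsh(H)[0])   # variational vs. exact minimum
```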
“Introducing NERSC-9, Berkeley Lab’s Next-Generation Pre-Exascale Supercomputer”
The NERSC-9 pre-exascale system, to be deployed in 2020, will support the broad Office of Science user community. The system is designed to support the needs of both simulations and modeling, as well as data analysis from DOE’s experimental facilities. This talk will announce and describe the NERSC-9 system for the SC18 community, including architecture features and plans for transitioning NERSC’s 7,000-member user community.
Inder Monga, Lawrence Berkeley National Laboratory
“ESnet6: Design of the Next-Generation Science Network”
Because of the dramatically increasing size of datasets and the need to make scientific data broadly accessible, ESnet is designing ESnet6, its next-generation network. The network will offer higher bandwidth, more growth capability, advanced features tailored for modern science, and the resilience necessary to support DOE’s core research mission. The talk will discuss the conceptual ESnet6 architecture, which comprises a programmable, scalable, and resilient hollow core coupled with a flexible, dynamic, and programmable services edge. ESnet6 will feature services that monitor and measure the network to make sure it is operating at peak performance. These services will also facilitate advanced cybersecurity capabilities, providing the control and management needed to protect the network.
“The Ristra Project: Preparing for Multi-Physics Simulation at Exascale”
Two key challenges on the path to efficient multi-physics simulation on exascale-class computing platforms are (a) abstracting exascale hardware from multi-physics code development, and (b) solving integral problems at multiple physical scales. Ristra, a four-year-old Los Alamos project under the Advanced Technology Development and Mitigation (ATDM) sub-program of the DOE ASC program, is developing a toolkit for multi-physics code development based around a computer science interface (FleCSI) that limits the impact of disruptive computer technology on physics developers. FleCSI enables the adoption of novel programming models and data management methods to address the challenges and diversity of new technology. Simultaneously, Ristra is exploring the use of multi-scale numerical methods that offer improved physics fidelity and computing efficiency. The Ristra software architecture and progress to date will be presented, together with early results of simulations in solid mechanics and multi-scale radiation hydrodynamics.
“Machine Learning and Predictive Simulation: HPC and the U.S. Cancer Moonshot on Sierra”
The marriage of experimental science with simulation has been a fruitful one—the fusion of HPC-based simulation and experimentation moves science forward faster than either discipline alone, rapidly testing hypotheses and identifying promising directions for future research. The emergence of machine learning at scale promises to bring a new type of thinking into the mix, incorporating data analytics techniques alongside traditional HPC to accompany experiment. I will discuss the convergence of machine learning, predictive simulation, and experiment in the context of one element of the U.S. Cancer Moonshot—a multi-scale investigation of Ras biology in realistic membranes.
“The BigPanDA Project: A Workflow and Workload Management System for High Energy and Nuclear Physics and for Extreme-Scale Scientific Applications”
The PanDA software is used for workload management on distributed grid resources by the ATLAS experiment at the LHC. An effort funded by the U.S. Department of Energy (DOE-ASCR) was launched to extend PanDA, called BigPanDA, to access HPC resources. Through this successful effort, ATLAS today uses over 25 million hours monthly on the Titan supercomputer at Oak Ridge National Laboratory. Many challenges were met and overcome in using HPC systems for ATLAS simulations. ATLAS uses two different operational modes at Titan. The traditional mode uses allocations, which required software innovations to fit the low-latency requirements of experimental science; new techniques were implemented to shape large jobs using allocations on a leadership-class machine. In the second mode, work is constantly sent to Titan to backfill idle capacity between high-priority leadership-class jobs. This has resulted in impressive gains in the overall utilization of Titan while benefiting the physics objectives of ATLAS. For both modes, BigPanDA has integrated traditional grid computing with HPC architecture.
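The backfill mode can be sketched as shaping pilot jobs to fit the idle windows left between leadership-class jobs; the function name, numbers, and task names below are illustrative, not BigPanDA's scheduler:

```python
def pick_pilot(idle_nodes, idle_minutes, tasks):
    """Choose the queued task that best fills the current backfill window."""
    fit = [t for t in tasks
           if t["nodes"] <= idle_nodes and t["minutes"] <= idle_minutes]
    return max(fit, key=lambda t: t["nodes"] * t["minutes"], default=None)

tasks = [{"name": "sim-a", "nodes": 300, "minutes": 110},
         {"name": "sim-b", "nodes": 1200, "minutes": 55},
         {"name": "sim-c", "nodes": 90, "minutes": 20}]
print(pick_pilot(idle_nodes=1000, idle_minutes=60, tasks=tasks))
# -> sim-c (sim-b needs too many nodes; sim-a would run past the window)
```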
Wednesday, Nov. 14
Kerstin Kleese van Dam, Brookhaven National Laboratory
“Real Time Performance Analysis of Applications and Workflows”
As part of the ECP CODAR project, Brookhaven National Laboratory, in collaboration with the University of Oregon's TAU team, has developed unique capabilities to analyze, reduce, and visualize single-application and complete-workflow performance data in situ. The resulting tool enables researchers to examine and explore their workflow performance as it executes.
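The in situ collection idea can be sketched as instrumenting workflow steps so timing summaries are available while the run proceeds (a toy decorator for illustration, not the TAU-based tool):

```python
import functools
import statistics
import time

timings = {}   # step name -> list of elapsed times, queryable mid-run

def monitored(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        out = fn(*args, **kwargs)
        timings.setdefault(fn.__name__, []).append(time.perf_counter() - t0)
        return out
    return wrapper

@monitored
def simulate_step():
    time.sleep(0.01)   # stand-in for real work

for _ in range(10):
    simulate_step()
print({name: statistics.mean(ts) for name, ts in timings.items()})
```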
Arthur “Buddy” Bland, Oak Ridge National Laboratory
“An Overview of ORNL’s Summit Supercomputer”
In June 2018, the U.S. Department of Energy's Oak Ridge National Laboratory unveiled Summit as the world's most powerful and smartest scientific supercomputer. Summit has a peak performance of 200 petaflops and, for certain scientific applications, is capable of more than three billion billion mixed-precision calculations per second, or 3.3 exaops. Summit provides unprecedented computing power for research in energy, advanced materials, and artificial intelligence (AI), among other domains, enabling scientific discoveries that were previously impractical or impossible.
Mike Sprague, National Renewable Energy Laboratory
“ExaWind: Towards Predictive Wind Farm Simulations on Exascale Platforms”
This talk will describe the ExaWind Exascale Computing Project, which is in pursuit of predictive wind turbine and wind plant simulations. Predictive, physics-based high-fidelity computational models, validated with targeted experiments, provide the most efficacious path to understanding wind plant physics and reducing wind plant losses. Predictive simulations will require blade-resolved moving meshes, high-resolution grids to resolve the flow structures, hybrid-RANS/LES turbulence modeling, fluid-structure interaction, and coupling to meso-scale flows. The modeling and algorithmic pathways of ExaWind include unstructured-grid finite volume spatial discretization and pressure-projection methods for incompressible flow. The ExaWind code is Nalu-Wind, which is built on Trilinos/STK and employs the Kokkos abstraction layer for performance portability. Results will be shown for turbine simulations with the Hypre and Trilinos linear-system solver stacks with particular focus on strong scaling performance on NERSC Cori and NREL Peregrine and the underlying algebraic multigrid (AMG) preconditioners. We also describe new Hypre results on SummitDev at OLCF, and recent MW-scale single-turbine simulations under turbulent inflow.
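The pressure-projection idea can be sketched in its simplest setting, a periodic box with an FFT-based Poisson solve; ExaWind's unstructured-grid finite-volume discretization is far more involved, so this is purely conceptual:

```python
import numpy as np

n = 64
k = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / n)   # angular wavenumbers
KX, KY = np.meshgrid(k, k, indexing="ij")
K2 = KX**2 + KY**2
K2[0, 0] = 1.0                                 # avoid 0/0 for the mean mode

def project(u, v):
    """Make (u, v) divergence-free: solve lap(phi) = div(u, v), subtract grad(phi)."""
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    div = 1j * KX * uh + 1j * KY * vh
    ph = -div / K2                             # phi_hat from the Poisson solve
    return (np.fft.ifft2(uh - 1j * KX * ph).real,
            np.fft.ifft2(vh - 1j * KY * ph).real)

u, v = np.random.rand(n, n), np.random.rand(n, n)
u, v = project(u, v)
uh, vh = np.fft.fft2(u), np.fft.fft2(v)
print(np.abs(1j * KX * uh + 1j * KY * vh).max())   # ~1e-12: divergence removed
```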
Yee Ting Li, SLAC National Accelerator Laboratory
“Hyperscale (Petabyte, Exabyte and Beyond) Data Distribution for Delivery of LCLS-II Free Electron Laser Data to Supercomputers”
The next-generation Linac Coherent Light Source (LCLS-II) at SLAC is planned to achieve first light in 2020. Its potential data rates are 1,000 times greater than those of the existing LCLS. By 2025, experimenters will need to stream data from the detectors at SLAC to DOE supercomputers at rates substantially exceeding terabits per second. Since 2014, we have been working to create an effective solution for hyperscale data distribution. Using 5-rack-unit co-located clusters and 80 Gbit/s capacity links over a 5,000-mile path, we recently transferred a petabyte of encrypted data in a world-leading 29 hours. Our next steps are to transport data from SLAC to NERSC over an ESnet 100 Gbps capacity link, compare software solutions, and evaluate Intel Optane SSDs.
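A quick check of the quoted numbers (assuming a decimal petabyte of 10^15 bytes): one petabyte in 29 hours averages roughly 77 Gbit/s, i.e. most of the stated 80 Gbit/s link capacity:

```python
petabyte_bits = 1e15 * 8        # assuming 1 PB = 10^15 bytes
seconds = 29 * 3600
rate_gbps = petabyte_bits / seconds / 1e9
print(f"{rate_gbps:.1f} Gbit/s average")   # ~76.6
```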
Doug Kothe, Oak Ridge National Laboratory
“Exascale Computing Project Update”
An update on the U.S. Department of Energy’s Exascale Computing Project – a multi-lab, 7-year collaborative effort focused on accelerating the delivery of a capable exascale computing ecosystem by 2021. The goal of the ECP is to enable breakthrough solutions that can address our most critical challenges in scientific discovery, energy assurance, economic competitiveness, and national security. The project is a joint effort of two U.S. Department of Energy (DOE) organizations: the Office of Science and the National Nuclear Security Administration (NNSA).
Jim Laros, Sandia National Laboratories
“Vanguard-Astra: NNSA Advanced Architecture Prototype Platform”
Jim Brandt, Sandia National Laboratories
“Platform Independent Run Time HPC Monitoring, Analysis, and Feedback at Any-Scale”
Large-scale HPC simulation applications may execute across thousands to millions of processor threads. Contention for network and/or file system resources, and mismatches in processor, memory, and network resources, can have a significant impact on application performance. Such effects can stem from a variety of sources, ranging from manufacturing variation to resource allocation to power and cooling variation, and more. This talk presents a suite of scalable tools, developed by Sandia, that provide insight into the per-instance causes of application performance degradation. We present background, architectural details, and actual use-case examples of monitoring sources, data, and run-time analyses of that data. We also present how the output can directly inform application users and operations staff about application and system performance characteristics, as well as provide feedback to applications and system software components. The tools are not only useful for the insights they provide but are also fun to use, offering hours of enjoyment for users, operations staff, and researchers trying to identify ways to architect more efficient systems and applications.
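One such run-time analysis can be sketched as flagging nodes whose monitored counters sit far from the job-wide median, a common symptom of manufacturing variation or contention (the metric values below are made up for illustration):

```python
import statistics

# Per-node memory-bandwidth samples (GB/s), one straggler among 64 nodes.
samples = {"node%03d" % i: 140.0 + (i % 5) for i in range(64)}
samples["node017"] = 71.0          # e.g., a slow DIMM or a contended link

median = statistics.median(samples.values())
mad = statistics.median(abs(v - median) for v in samples.values())
outliers = {n: v for n, v in samples.items() if abs(v - median) > 10 * mad}
print(outliers)                     # -> {'node017': 71.0}
```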
Graham Heyes, Thomas Jefferson National Accelerator Facility
“Streaming Data for Nuclear Physics Experiments”
The computing workflow model for most nuclear physics experiments has remained relatively unchanged for over thirty years. Data is read from detectors, heavily filtered to reduce the data rate, and stored. At a later date, the data is retrieved and processed using thousands of individual jobs on a batch system. The final, compute-intensive processing was performed locally, since network bandwidth limited offsite data access. The whole process is slow, with weeks or months between steps, and forces the scientist to make choices in advance of data taking that affect data quality. Advances in all aspects of computing are beginning to make possible a model, new to nuclear physics, in which filtering is relaxed and data is streamed in parallel through various stages of online and near-line processing. This results in rich multi-dimensional datasets that can be made accessible for processing using grid, cloud, or leadership-class computing facilities. It is a much more responsive workflow, with minimal filtering of the raw data, which leaves decisions affecting science quality as late as possible. At Jefferson Lab, several aspects of this computing model are being investigated. It is expected that, on the five- to ten-year timescale, streaming data readout and processing will become the norm.