Center for Computing Research
CCR Researcher Discusses Ceph Storage on Next Platform TV
CCR system software researcher Matthew Curry appeared on the June 22nd episode of “Next Platform TV” to discuss the increased use of the Ceph storage system in high-performance computing (HPC). Matthew’s interview with Nicole Hemsoth of the Next Platform starts at the 18:40 mark of the video. In the interview, Matthew describes the Stria system, an unclassified version of Astra, the first petascale HPC system based on the Arm processor. He also describes how the Ceph storage system is being used on Stria and some of the important aspects being tested and evaluated there. More details and the entire episode are here.
Contact: Curry, Matthew Leon
Key Numerical Computing Algorithm Implemented on Neuromorphic Hardware
Researchers in Sandia’s Center for Computing Research (CCR) have demonstrated, using Intel’s Loihi and IBM’s TrueNorth, that neuromorphic hardware can efficiently implement Monte Carlo solutions for partial differential equations (PDEs). CCR researchers had previously hypothesized that neuromorphic chips were capable of implementing critical Monte Carlo algorithm kernels efficiently at large scales, and this study was the first to demonstrate that this approach could be used to approximate a steady-state PDE solution. The study formalized the mathematical description of PDEs into an algorithmic form suitable for spiking neural hardware and highlighted results from implementing this spiking Monte Carlo algorithm on Sandia’s 8-chip Loihi test board and the IBM TrueNorth chip at Lawrence Livermore National Laboratory. These results confirmed that the computational costs scale highly efficiently with model size, suggesting that spiking architectures such as Loihi and TrueNorth may be highly desirable for particle-based PDE solutions. This work was funded by Sandia’s Laboratory Directed Research and Development (LDRD) program and the DOE Advanced Simulation and Computing (ASC) program. The paper has been accepted to the 2020 International Conference on Neuromorphic Systems (ICONS) and is available at https://arxiv.org/abs/2005.10904
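To make the idea of a Monte Carlo solution to a steady-state PDE concrete, here is a minimal conventional-CPU sketch in Python. It is not the spiking algorithm from the paper; it assumes the simplest possible case, a one-dimensional steady-state diffusion problem (u'' = 0) on a grid with fixed boundary values, and every name in it (`estimate_steady_state`, `n_cells`, `n_walkers`) is hypothetical. Each random walker starts at the point of interest and steps left or right until it reaches a boundary; the average of the boundary values the walkers hit approximates the solution at the starting point.

```python
import random


def estimate_steady_state(start, n_cells, left_value, right_value, n_walkers, rng):
    """Estimate u(start) for u'' = 0 on the grid 0..n_cells with fixed end values.

    Illustrative only: an unbiased random walk stands in for the
    particle-based Monte Carlo kernel described in the text.
    """
    total = 0.0
    for _ in range(n_walkers):
        pos = start
        # Walk until the particle is absorbed at either boundary.
        while 0 < pos < n_cells:
            pos += rng.choice((-1, 1))
        total += left_value if pos == 0 else right_value
    return total / n_walkers


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_cells = 10
estimates = [
    estimate_steady_state(x, n_cells, 0.0, 1.0, 2000, rng)
    for x in range(n_cells + 1)
]
# For these boundary values the exact steady state is the straight line x / n_cells.
exact = [x / n_cells for x in range(n_cells + 1)]
max_error = max(abs(e - t) for e, t in zip(estimates, exact))
print(f"largest deviation from the exact solution: {max_error:.3f}")
```

Because every walker is independent, the work parallelizes naturally across many simple units, which is one plausible reading of why particle-based methods suit spiking architectures.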
Contact: Aimone, James Bradley
Sandia Researchers Collaborate with Red Hat on Container Technology
Sandia researchers in the Center for Computing Research collaborated with engineers from Red Hat, the world’s leading provider of open source solutions for enterprise computing, to enable more robust production container capabilities for high-performance computing (HPC) systems. CCR researchers demonstrated the use of Podman, which allows ordinary users to build and run containers without needing the elevated security privileges of an administrator, on the Stria machine at Sandia. Stria is an unclassified version of Astra, which was the first petascale HPC system based on an Arm processor. While Arm processors have been shown to be very capable for HPC workloads, they are not as prevalent in laptops and workstations as other processors. To address this limitation, Podman provides the ability to build containers directly on machines like Stria and Astra without requiring root-level access. This capability is a critical advancement in container functionality for the HPC application development environment. The CCR team is continuing to work with Red Hat on improving Podman for traditional HPC applications as well as machine learning and deep learning workloads. More details on this collaboration can be found here:
Contact: Younge, Andrew J
Sandia-led Earth System Modeling Project Featured in ECP Podcast
CCR researcher Mark Taylor was interviewed in a recent episode of the “Let’s Talk Exascale” podcast from the Department of Energy’s Exascale Computing Project (ECP). Taylor leads the Energy Exascale Earth System Model – Multiscale Modeling Framework (E3SM-MMF) subproject, which is working to improve the ability to simulate the water cycle and processes around precipitation. The podcast and a transcript of the interview can be found here.
Contact: Taylor, Mark A.
Sandia Covid-19 Medical Resource Modeling
As part of the Department of Energy response to the novel coronavirus pandemic of 2020, Sandia personnel developed a model to predict the medical resources needed, including medical practitioners (e.g., ICU nurses, physicians, respiratory therapists), fixed resources (regular or ICU beds and ventilators), and consumable resources (masks, gowns, gloves, etc.).
Researchers in Center 1400 developed a framework for performing uncertainty analysis on the resource model. The uncertainty analysis involved sampling 26 input parameters using the Dakota software. The sampling was performed conditional on the patient arrival streams, which were derived from epidemiology models and had a significant effect on the projected resource needs.
Using two of Sandia’s high-performance computing (HPC) clusters, the generated patient streams were run through the resource model for each of 3,145 counties in the United States, where each county-level run involved 100 samples per scenario. Three different social distancing scenarios were investigated. This resulted in approximately 900,000 individual runs of the medical resource model, requiring over 500 processor hours on the clusters. The results included mean estimates per resource per county, as well as uncertainty in those estimates (e.g., variance, 5th and 95th percentiles, and exceedance probabilities). Example results are shown in Figures 1-2. As updated patient stream projections become available from the latest epidemiology models, the analysis can be re-run quickly to provide resource projections in rapidly changing environments.
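The run count and the summary statistics named above can be sketched in a few lines of Python. The county, sample, and scenario counts come from the text; everything else is an assumption for illustration only: the `summarize` helper, the synthetic `demand` values, and the `capacity` threshold are hypothetical and do not come from the Sandia model or from Dakota.

```python
import random
import statistics

# Figures stated in the text.
counties = 3145
samples_per_scenario = 100
scenarios = 3
total_runs = counties * samples_per_scenario * scenarios
print(total_runs)  # 943500, consistent with "approximately 900,000"


def summarize(samples, capacity):
    """Summarize one county/resource sample set (hypothetical helper)."""
    # With n=20 the cut points fall every 5%, so the first and last
    # cut points are the 5th and 95th percentiles.
    cuts = statistics.quantiles(samples, n=20, method="inclusive")
    return {
        "mean": statistics.fmean(samples),
        "variance": statistics.variance(samples),
        "q05": cuts[0],
        "q95": cuts[-1],
        # Exceedance probability: fraction of samples above the threshold.
        "p_exceed": sum(s > capacity for s in samples) / len(samples),
    }


# Synthetic stand-in for 100 sampled demand values in one county.
rng = random.Random(1)
demand = [rng.gauss(50.0, 10.0) for _ in range(samples_per_scenario)]
summary = summarize(demand, capacity=60.0)
print(summary)
```

In the study these summaries were produced per resource and per county; the sketch shows only the shape of that calculation for a single sample set.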
For more information on Sandia research related to COVID-19, please visit the COVID-19 Research website.
Contact: Swiler, Laura Painton
Sandia to Receive Fujitsu Supercomputer Processor
This spring, CCR researchers anticipate Sandia becoming one of the first DOE laboratories to receive the newest A64FX Fujitsu processor, a Japanese Arm-based processor optimized for high-performance computing. The 48-core A64FX processor was designed for Japan’s soon-to-be-deployed Fugaku supercomputer, which incorporates high-bandwidth memory. It is also the first processor to fully utilize wide vector lanes that were designed around Arm’s Scalable Vector Extension (SVE). These wide vector lanes make possible a type of data-level parallelism where a single instruction operates on multiple data elements arranged in parallel. Penguin Computing Inc. will deliver the new system, the first Fujitsu PRIMEHPC FX700 with A64FX processors. Sandia will evaluate Fujitsu’s new processor and compiler using DOE mini- and proxy-applications and will share the results with Fujitsu and Penguin. More details are available here.
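The data-level parallelism described above can be modeled, very loosely, in plain Python. This is a conceptual sketch only: it does not use real vector hardware or any Arm intrinsics, the function name `vector_add` is hypothetical, and the default of 8 lanes is an assumption that corresponds to eight 64-bit values in a 512-bit vector. Each pass of the outer loop stands in for one vector instruction, and a per-lane mask switches off lanes that would run past the end of the data.

```python
def vector_add(a, b, lanes=8):
    """Add two equal-length lists, `lanes` elements per modeled instruction.

    Returns the result and the number of modeled vector instructions.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0.0] * len(a)
    instructions = 0
    for base in range(0, len(a), lanes):
        # The mask marks which lanes hold valid data on this pass,
        # so the final partial chunk needs no separate scalar loop.
        active = [base + lane < len(a) for lane in range(lanes)]
        for lane, enabled in enumerate(active):
            if enabled:
                out[base + lane] = a[base + lane] + b[base + lane]
        instructions += 1  # one modeled instruction covered up to `lanes` elements
    return out, instructions


a = [float(i) for i in range(20)]
b = [2.0 * i for i in range(20)]
result, n_instructions = vector_add(a, b)
print(n_instructions)  # 3 modeled instructions instead of 20 scalar additions
```

The point of the model is the instruction count: 20 element-wise additions are covered by 3 modeled vector instructions, where a scalar loop would issue one addition per element.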
Contact: Laros, James H.
Sandia-led Supercontainers Project Featured in ECP Podcast
Since the US Department of Energy’s (DOE) Exascale Computing Project (ECP) began in 2016, container technology and its place in exascale and high-performance computing (HPC) have been an area of ongoing interest within the HPC community.
Container technology has revolutionized software development and deployment for many industries and enterprises because it provides greater software flexibility, reliability, ease of deployment, and portability for users. But several challenges must be addressed to get containers ready for exascale computing.
The Supercontainers project, one of ECP’s newest efforts, aims to deliver containers and virtualization technologies for productivity, portability, and performance on the first exascale computing machines, which are planned for 2021.
A recent episode of ECP’s Let’s Talk Exascale podcast features Supercontainers project team member Andrew Younge of Sandia National Laboratories as a guest. The interview was recorded this past November in Denver at SC19: The International Conference for High Performance Computing, Networking, Storage, and Analysis.
Contact: Younge, Andrew J
Steve Plimpton Awarded the 2020 SIAM Activity Group on Supercomputing Career Prize
Steve Plimpton has been awarded the 2020 Society for Industrial and Applied Mathematics (SIAM) Activity Group on Supercomputing Career Prize. This prestigious award is given every two years to an outstanding researcher who has made broad and distinguished contributions to the field of algorithm development for parallel scientific computing. According to SIAM, the Career Prize recognizes Steve’s “seminal algorithmic and software contributions to parallel molecular dynamics, to parallel crash and impact simulations, and for leadership in modular open-source parallel software.”
Steve is the originator of several successful software projects, most notably the open-source LAMMPS code for molecular dynamics. Since its release in 2004, LAMMPS has been downloaded hundreds of thousands of times and has grown to become a leading particle-based materials modeling code worldwide. Steve’s leadership in parallel scientific computing has led to many opportunities for the Center for Computing Research to collaborate on high-performance computing projects both within and outside Sandia National Laboratories.
Contact: Littlewood, David John