Center for Computing Research
Astra Supercomputer is Fastest Arm-Based Machine on Top500 List
Sandia’s Astra is the world’s fastest Arm-based supercomputer according to the just-released TOP500 list, the supercomputer industry’s standard. With a speed of 1.529 petaflops, Astra placed 203rd on a ranking of top computers announced at SC18, the International Conference for High Performance Computing, Networking, Storage, and Analysis, in Dallas. A petaflop is a unit of computing speed equal to one thousand million million (10^15) floating-point operations per second. Astra achieved this speed on the High-Performance Linpack benchmark. Astra is one of the first supercomputers to use processors based on Arm technology. The machine’s success means the supercomputing industry may have found a new potential supplier of supercomputer processors, since Arm designs are available for licensing. More details are in this article.
Contact: Laros, James H.
Power API and LAMMPS Win R&D100 Awards
Two CCR technologies have won 2018 R&D100 Awards. Each year, R&D Magazine names the 100 most technologically significant products and advancements, recognizing the winners and their organizations. Winners are selected from submissions from universities, corporations, and government labs throughout the world. This year’s winners include the Power API and LAMMPS. The Power API was also recognized with a Special Recognition Award for corporate social responsibility. Sandia garnered a total of five R&D100 Awards. The Power API is a portable programming interface for developing applications and tools that can be used to control and monitor the power use of high-performance computing systems in order to improve energy efficiency. LAMMPS is a molecular dynamics modeling and simulation application designed to run on large-scale high-performance computing systems. Winners were announced at a recent ceremony at the R&D 100 Conference.
Contact: Brightwell, Ronald B. (Ron)
Power API and LAMMPS Named R&D100 Award Finalists
Two CCR technologies have been named as finalists for the 2018 R&D100 Awards. Each year, R&D Magazine names the 100 most technologically significant products and advancements, recognizing the winners and their organizations. Winners are selected from submissions from universities, corporations, and government labs throughout the world. This year’s finalists include the Power API and LAMMPS. The Power API is a portable programming interface for developing applications and tools that can be used to control and monitor the power use of high-performance computing systems in order to improve energy efficiency. LAMMPS is a molecular dynamics modeling and simulation application designed to run on large-scale high-performance computing systems. The final award winners will be announced at a ceremony at the R&D 100 Conference in mid-November.
Contact: Brightwell, Ronald B. (Ron)
Sandia Joins the Linaro HPC Special Interest Group
Sandia National Laboratories has joined Linaro’s High Performance Compute (HPC) Special Interest Group as an advanced end user of mission-critical HPC systems. Linaro Ltd. is the open source collaborative engineering organization developing software for the Arm ecosystem. Sandia recently announced Astra, one of the first supercomputers to use processors based on the Arm architecture in a large-scale high-performance computing platform. This system requires a complete, vertically integrated software stack for Arm, from the operating system through compilers and math libraries. Sandia and Linaro will work together with the other members of the HPC SIG to jointly address hardware and software challenges, expand the HPC ecosystem by developing and proving new technologies, and increase technology and vendor choices for future platforms. More info is available here.
Contact: Younge, Andrew J
Astra - An Arm-Based Large-Scale Advanced Architecture Prototype Platform
Astra, one of the first supercomputers to use processors based on the Arm architecture in a large-scale high-performance computing platform, is being deployed at Sandia National Laboratories. Astra is the first of a potential series of advanced architecture prototype platforms, which will be deployed as part of the Vanguard program to evaluate the feasibility of emerging high-performance computing architectures as production platforms. The machine is based on the recently announced Cavium Inc. ThunderX2 64-bit Armv8 microprocessor. The platform consists of 2,592 dual-socket compute nodes, each socket holding a 28-core processor, and will have a theoretical peak of more than 2.3 petaflops, equivalent to 2.3 quadrillion floating-point operations, or calculations, per second. While being the fastest is not one of the goals of Astra or the Vanguard program in general, a single Astra node is roughly one hundred times faster than a modern Arm-based cellphone. More details are available here.
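The peak figure quoted above can be roughly reproduced from the node and core counts. The following is a minimal Python sketch, not an official calculation: the node, socket, and core counts come from the text, while the 2.0 GHz clock and the 8 double-precision floating-point operations per core per cycle are assumptions that the announcement does not state. The 1.529-petaflop value is the High-Performance Linpack result reported in the TOP500 item.

```python
def peak_petaflops(nodes, sockets_per_node, cores_per_socket,
                   clock_hz, flops_per_cycle):
    """Theoretical peak = total cores x clock rate x flops per core per cycle."""
    total_cores = nodes * sockets_per_node * cores_per_socket
    return total_cores * clock_hz * flops_per_cycle / 1e15


# Values stated in the text.
NODES = 2592
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 28

# Assumptions (not stated in the text): clock rate and per-core throughput.
CLOCK_HZ = 2.0e9
FLOPS_PER_CYCLE = 8

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
peak = peak_petaflops(NODES, SOCKETS_PER_NODE, CORES_PER_SOCKET,
                      CLOCK_HZ, FLOPS_PER_CYCLE)

# Measured High-Performance Linpack speed from the TOP500 announcement.
HPL_PETAFLOPS = 1.529
efficiency = HPL_PETAFLOPS / peak

print(f"cores: {total_cores}")
print(f"theoretical peak: {peak:.3f} petaflops")
print(f"HPL efficiency: {efficiency:.1%}")
```

Under these assumptions the estimate lands just above 2.3 petaflops, consistent with the "more than 2.3 petaflops" in the announcement; a different clock rate or vector width would shift the result proportionally.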
Contact: Laros, James H.
CCR Researcher Kurt Ferreira Co-Authors Best Paper at APDCM Workshop
CCR Researcher Kurt Ferreira and his co-authors have been awarded Best Paper at the upcoming Workshop on Advances in Parallel and Distributed Computational Models (APDCM) at the International Parallel and Distributed Processing Symposium. Their paper, entitled "Optimal Cooperative Checkpointing for Shared High-Performance Computing Platforms," proposes a cooperative checkpoint scheduling policy that combines optimal checkpointing periods with I/O scheduling to minimize overheads in the presence of bursty, competing I/O. This work provides crucial analysis and direct guidance on maximizing throughput on current and future extreme-scale platforms. This year marks the 20th APDCM Workshop, which intends “to provide a timely forum for the exchange and dissemination of new ideas, techniques and research in the field of the parallel and distributed computational models.”
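As background on what a "checkpointing period" trades off, here is a minimal Python sketch of the classic first-order estimate for a single application, commonly known as Young's approximation. It is not the cooperative policy proposed in the paper, which additionally schedules I/O across competing applications. The checkpoint cost and mean time between failures below are hypothetical values chosen only for illustration.

```python
import math


def young_period(checkpoint_cost_s, mtbf_s):
    """First-order optimal interval between checkpoints: sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)


def expected_waste(period_s, checkpoint_cost_s, mtbf_s):
    """Approximate fraction of time lost: checkpoint overhead plus expected rework."""
    return checkpoint_cost_s / period_s + period_s / (2.0 * mtbf_s)


# Hypothetical inputs, not taken from the paper.
CHECKPOINT_COST_S = 600.0    # 10 minutes to write one checkpoint
MTBF_S = 24.0 * 3600.0       # one failure per day on average

period = young_period(CHECKPOINT_COST_S, MTBF_S)
waste = expected_waste(period, CHECKPOINT_COST_S, MTBF_S)

print(f"checkpoint every {period / 3600.0:.2f} hours")
print(f"approximate waste: {waste:.1%}")
```

Checkpointing more often than this period spends too much time writing checkpoints, and less often loses too much work per failure. When several applications share the I/O system, the effective checkpoint cost depends on contention, which is the situation the paper addresses.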
Contact: Ferreira, Kurt Brian
NVIDIA has invited SNL to present results of a GPU performant shock hydrodynamics code at their Super Computing (SC17) booth.
NVIDIA has invited SNL to present results of a GPU-performant coupled hydrodynamics, low magnetic Reynolds number (low Rm) code at their Supercomputing 17 (SC17) booth. Researchers at Sandia are developing a new shock hydrodynamics capability, based on adaptive Lagrangian techniques, targeted at next-generation architectures. The code simulates shock hydrodynamics on GPU architectures using the Kokkos library to provide portability across architectures. Mesh and field data management, as well as adaptive Lagrangian operations, are being developed to run exclusively on the GPU. New algorithms using tetrahedral elements and a predictor-corrector time integrator have been implemented. Low Rm physics is solved using NVIDIA’s AmgX GPU-aware algebraic multigrid solver. Using an exemplar problem provided by our NW partners, we have demonstrated good scaling and performance on next-generation architectures. Notably, the exemplar problem demonstrates the advantages of a device-centric design philosophy, where the hydrodynamics physics solve, including adaptivity and remapping, is hosted on the coprocessor, with exceptional performance on the GPU relative to traditional multi-core architectures. Additionally, the low Rm physics solves with the AmgX software complete in under a second for million-degree-of-freedom problems. Next steps include full-scale testing on Trinity (on both the Haswell and KNL partitions) as well as on Sierra as it becomes available, the addition of robust treatment for material/material interactions, and the inclusion of more comprehensive MHD physics.
Contact: Hansen, Glen
The Next Platform Highlights CCR Work on Memory-Centric Programming
The Next Platform, an online publication that offers in-depth coverage of high-end computing, recently featured an article entitled “New Memory Challenges Legacy Approaches to HPC Code.” The article discusses a paper co-authored by CCR researcher Ron Brightwell that was published last November as part of the Workshop on Memory Centric Programming for HPC at the SC’17 conference. In the article, Brightwell and one of his co-authors, Yonghong Yan from the University of South Carolina, discuss the programming challenges created by recent advances in memory technology and the deepening memory hierarchy. The article examines the notion of memory-centric programming and how programming systems need to evolve to provide better abstractions to help insulate application developers from the complexities associated with current and future advances in memory technology for high-performance computing systems.
Contact: Brightwell, Ronald B. (Ron)
DOE award to develop new quantum algorithms for simulation, optimization, and machine learning
The Department of Energy's Office of Science recently awarded $4.5M over three years to a multi-institutional and multi-disciplinary team led by Dr. Ojas Parekh (1464) to explore the abilities of quantum computers in three interrelated areas: quantum simulation, optimization, and machine learning, each highly relevant to the DOE mission. The QOALAS (Quantum Optimization and Learning and Simulation) project brings together some of the world’s top experts in quantum algorithms, quantum simulation, theoretical physics, applied mathematics, and discrete optimization from Sandia National Laboratories, Los Alamos National Laboratory, California Institute of Technology, and University of Maryland. The QOALAS team will leverage and unearth connections between simulation, optimization, and machine learning to fuel new applications of quantum information processing to science and technology, as well as further investigate the potential of quantum computers to solve certain problems dramatically faster or with better fidelity than possible with classical computers.
Contact: Parekh, Ojas D.
Released VTK-m user's guide, version 1.1
Researchers at Sandia National Laboratories, in collaboration with Kitware Inc., Oak Ridge National Laboratory, Los Alamos National Laboratory, and the University of Oregon, are proud to release VTK-m version 1.1. The VTK-m library provides highly parallel code to execute visualization on many-core processors like GPUs, multi-core CPUs, and other hardware we are likely to see for exascale HPC. This release of VTK-m includes critical core features, including filter structures and key reduction. It also adds several new filters, including external faces, gradients, clipping, and point merging. The release is accompanied by a comprehensive VTK-m User’s Guide providing detailed instruction and reference for using and editing VTK-m.
Contact: Moreland, Kenneth D. (Ken)