Center for Computing Research (CCR)

Scott Larson Nicoll Levy
Scalable System Software
Email: sllevy@sandia.gov
Phone: 505/844-7292

Mailing address:
Sandia National Laboratories
P.O. Box 5800, MS 1319
Albuquerque, NM
87185-1320

I am a Senior Member of Technical Staff in the Scalable System Software department of the Center for Computing Research (CCR). I research system software for next-generation extreme-scale systems. Specifically, I study how system failures and other sources of performance interference affect the execution of scientific simulations. I am also investigating application performance in power-constrained environments. I earned my Ph.D. from the University of New Mexico, where I worked with Prof. Patrick Bridges in the Scalable Systems Lab. At Sandia, I work with Kurt Ferreira, Patrick Widener, and the 9lives research group on improving the resilience and fault tolerance of large-scale parallel systems.

Education/Background

Ph.D., Computer Science, University of New Mexico

B.S., Electrical Engineering, Cornell University

Selected Publications & Presentations

2016
  • Levy, Scott N., Kurt Brian Ferreira, "An Examination of the Impact of Failure Distribution on Coordinated Checkpoint/Restart," Workshop Paper, Workshop on Fault-Tolerance for HPC at Extreme Scale (FTXS), May 2016.
  • Levy, Scott N., Kurt Brian Ferreira, Patrick Widener, Patrick G. Bridges, Oscar H. Mondragon, "How I Learned to Stop Worrying and Love In Situ Analytics: Leveraging Latent Synchronization in MPI Collective Algorithms," Conference Paper, MPI Users and Developers Conference (EuroMPI 2016), September 2016.
  • Levy, Scott N., Kurt Brian Ferreira, Patrick G. Bridges, "Improving Application Resilience to Memory Errors with Lightweight Compression," Conference Paper, International Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016.
  • Mondragon, Oscar H., Patrick G. Bridges, Kurt Brian Ferreira, Scott N. Levy, Patrick Widener, "Understanding Performance Interference in Next-Generation HPC Systems," Conference Paper, International Conference on High Performance Computing, Networking, Storage and Analysis (SC16), November 2016.
  • Widener, Patrick, Kurt Brian Ferreira, Scott N. Levy, "Horseshoes and Hand Grenades: The Case for Approximate Coordination in Local Checkpointing Protocols," Workshop Paper, Workshop on Resiliency in High Performance Computing (Resilience) in Clusters, Clouds, and Grids, August 2016.
  • Widener, Patrick, Scott N. Levy, Kurt Brian Ferreira, Torsten Hoefler, "On noise and the performance benefit of nonblocking collectives," Journal Article, International Journal of High Performance Computing Applications, Vol. 30, No. 1, pp. 121–133, Accepted/Published February 2016.