Welcome to the ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2019), the premier conference at the intersection of high-performance and distributed computing, now in its 28th year. HPDC is fortunate to be co-located with and sponsored by ACM FCRC this year, which is providing exceptional keynotes, the Turing Award Lecture, and a chance to mingle with our colleagues in other disciplines of computer science. Over the years, HPDC has focused on high-performance parallel and distributed computing across platforms spanning clouds, clusters, grids, big data, massively multicore, and extreme-scale computing systems. One of the unique features of HPDC is that it welcomes a blend of ideas, ranging from applied research in the form of experience papers on operational deployments and applications to more fundamental research in parallel and distributed techniques and systems. The conference has always appreciated the heroic work required to deploy real systems and applications and the insights gained from live measurement and experimentation. The HPDC 2019 program is no exception, with topics ranging from hybrid systems, scalable graph processing, and GPU applications to cloud systems, to name a few. In addition to a strong technical program, the HPDC 2019 Achievement Award was given to Professor Geoffrey Fox of Indiana University for his foundational contributions to parallel computing, high-performance software, and the interface between applications and systems, as well as his contributions to education and outreach to underrepresented communities. This year the conference includes five exciting workshops on cutting-edge topics, including the perennial favorites ScienceCloud and ROSS. The program also includes a PhD Forum event for our budding future stars in topics related to HPDC.
Conferences like HPDC require a large commitment by the community. First, I would like to acknowledge and thank our sponsors, ACM SIGARCH and the University of Arizona, and our supporters, IBM, DOE, and the NSF. Next, I would like to thank the various organizing committees that ensure that HPDC remains a top-tier conference at the intersection of high-performance and distributed computing. My deepest thanks go to Evgenia Smirni and Ali Butt for their leadership as co-chairs of the Program Committee, helping to bring together the strong and exciting technical program that is a cornerstone of HPDC. I would also like to acknowledge the HPDC Awards Committee, led by Douglas Thain, for its selection of Geoffrey Fox for the 2019 HPDC Achievement Award.
Additional thanks go to Alex Iosup for chairing the workshop committee, which has a long tradition at HPDC of bringing in cutting-edge topics that complement the main program. Thanks also to Jay F. Lofstead II for leading the effort as Posters Chair and organizing the mentoring session as part of the Ph.D. Forum. I would also like to acknowledge Antonino Tumeo for his work as our Publications Chair, an often thankless job. Conferences cannot succeed without tireless publicity work, and I thank Ioan Raicu, Torsten Hoefler, Shuaiwen Leon Song, and Kenjiro Taura for spreading the word. I am also grateful to Zachary Leidall, our Web Chair, for keeping the HPDC 2019 Web site informative and up to date. Finally, let me thank the entire HPDC community, including the steering committee, organizing committee participants, volunteers, session chairs, attendees, and co-authors, for making HPDC 2019 a great success.
Jon Weissman, University of Minnesota—HPDC 2019 General Chair, Steering Committee Chair
Proceedings Downloads
SpotWeb: Running Latency-sensitive Distributed Web Services on Transient Cloud Servers
Many cloud providers offer servers with transient availability at a reduced cost. These servers can be unilaterally revoked by the provider, usually after a warning period to the user. Until recently, it has been thought that these servers are not ...
LABIOS: A Distributed Label-Based I/O System
In the era of data-intensive computing, large-scale applications, in both scientific and the BigData communities, demonstrate unique I/O requirements leading to a proliferation of different storage devices and software stacks, many of which have ...
Parsl: Pervasive Parallel Programming in Python
- Yadu Babuji,
- Anna Woodard,
- Zhuozhao Li,
- Daniel S. Katz,
- Ben Clifford,
- Rohan Kumar,
- Lukasz Lacinski,
- Ryan Chard,
- Justin M. Wozniak,
- Ian Foster,
- Michael Wilde,
- Kyle Chard
High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than ...
Kleio: A Hybrid Memory Page Scheduler with Machine Intelligence
The increasing demand of big data analytics for more main memory capacity in datacenters and exascale computing environments is driving the integration of heterogeneous memory technologies. The new technologies exhibit vastly greater differences in ...
MANA for MPI: MPI-Agnostic Network-Agnostic Transparent Checkpointing
Transparently checkpointing MPI for fault tolerance and load balancing is a long-standing problem in HPC. The problem has been complicated by the need to provide checkpoint-restart services for all combinations of an MPI implementation over all network ...
Multi-Level Analysis of Compiler-Induced Variability and Performance Tradeoffs
- Michael Bentley,
- Ian Briggs,
- Ganesh Gopalakrishnan,
- Dong H. Ahn,
- Ignacio Laguna,
- Gregory L. Lee,
- Holger E. Jones
Successful HPC software applications are long-lived. When ported across machines and their compilers, these applications often produce different numerical results, many of which are unacceptable. Such variability is also a concern while optimizing the ...
Making Root Cause Analysis Feasible for Large Code Bases: A Solution Approach for a Climate Model
- Daniel J. Milroy,
- Allison H. Baker,
- Dorit M. Hammerling,
- Youngsung Kim,
- Elizabeth R. Jessup,
- Thomas Hauser
Large-scale simulation codes that model complicated science and engineering applications typically have huge and complex code bases. For such simulation codes, where bit-for-bit comparisons are too restrictive, finding the source of statistically ...
HEXO: Offloading HPC Compute-Intensive Workloads on Low-Cost, Low-Power Embedded Systems
- Pierre Olivier,
- A. K. M. Fazla Mehrab,
- Stefan Lankes,
- Mohamed Lamine Karaoui,
- Robert Lyerly,
- Binoy Ravindran
OS-capable embedded systems with very low power consumption are available at an extremely low price point, making them highly compelling in a datacenter context. In this paper we show that sharing long-running, compute-intensive datacenter HPC ...
Scheduling Beyond CPUs for HPC
High performance computing (HPC) is undergoing significant changes. The emerging HPC applications comprise both compute- and data-intensive applications. To meet the intense I/O demand from emerging data-intensive applications, burst buffers are ...
Paths to Fast Barrier Synchronization on the Node
- Conor Hetland,
- Georgios Tziantzioulis,
- Brian Suchy,
- Michael Leonard,
- Jin Han,
- John Albers,
- Nikos Hardavellas,
- Peter Dinda
Synchronization primitives like barriers heavily impact the performance of parallel programs. As core counts increase and granularity decreases, the value of enabling fast barriers increases. Through the evaluation of the performance of a variety of ...
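As a quick illustration of the primitive this paper measures: a barrier blocks each arriving thread until all participants have arrived, only then releasing them together. The Python sketch below is an invented example (unrelated to the paper's node-level implementations) showing the ordering guarantee a barrier provides:

```python
import threading

# Minimal barrier illustration: N threads each record a "before" event,
# wait at the barrier, then record an "after" event. The barrier
# guarantees every "before" is recorded before any "after".
N = 4
barrier = threading.Barrier(N)
events = []
lock = threading.Lock()

def worker(tid):
    with lock:
        events.append(("before", tid))
    barrier.wait()  # blocks until all N threads have arrived
    with lock:
        events.append(("after", tid))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The first N recorded events are all "before" events.
assert all(kind == "before" for kind, _ in events[:N])
```

The cost of that `barrier.wait()` call, multiplied across cores and fine-grained phases, is exactly what makes fast barrier implementations valuable.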
XBFS: eXploring Runtime Optimizations for Breadth-First Search on GPUs
Attracted by the enormous potential of Graphics Processing Units (GPUs), a wave of efforts has emerged to deploy Breadth-First Search (BFS) on GPUs, which, however, often exploit static mechanisms to address challenges that are dynamic in ...
Combining Data Duplication and Graph Reordering to Accelerate Parallel Graph Processing
Performance of single-machine, shared memory graph processing is affected by expensive atomic updates and poor cache locality. Data duplication, a popular approach to eliminate atomic updates by creating thread-local copies of shared data, incurs ...
Perspectives on High-Performance Computing in a Big Data World
High-Performance Computing (HPC) and Cyberinfrastructure have played a leadership role in computational science ever since the start of the NSF computing centers program. Thirty years ago parallel computing was a centerpiece of computer science ...
Preemptive Multi-Queue Fair Queuing
Fair queuing (FQ) algorithms have been widely adopted in computer systems to share resources among multiple users. Modern operating systems and hypervisors use variants of FQ algorithms to implement the critical OS resource management -- the thread ...
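For readers new to fair queuing: FQ schedulers tag each job with a virtual finish time and dispatch jobs in that order, so flows submitting cheaper jobs receive proportionally more turns. The sketch below is a simplified, single-threaded illustration of this idea; the `fair_queue` helper and its interface are invented for this example, not taken from the paper:

```python
import heapq

def fair_queue(flows):
    # Simplified virtual-time fair queuing: each flow maps to a list of
    # (job_id, cost) pairs. Each job's virtual finish time is the running
    # sum of its flow's costs; jobs are dispatched in virtual-finish-time
    # order, interleaving flows in proportion to their job costs.
    heap = []
    vtime = {name: 0.0 for name in flows}
    for name, jobs in flows.items():
        for job_id, cost in jobs:
            vtime[name] += cost
            heapq.heappush(heap, (vtime[name], name, job_id))
    order = []
    while heap:
        _, name, job_id = heapq.heappop(heap)
        order.append((name, job_id))
    return order
```

With flow A submitting unit-cost jobs and flow B submitting cost-2 jobs, A is dispatched roughly twice as often as B, which is the proportional-sharing behavior FQ variants in OSes and hypervisors generalize.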
DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
Today's deep neural networks (DNNs) are becoming deeper and wider because of increasing demands on analysis quality and increasingly complex applications to solve. The wide and deep DNNs, however, require large amounts of resources (such as memory,...
PERQ: Fair and Efficient Power Management of Power-Constrained Large-Scale Computing Systems
Large-scale computing systems are becoming increasingly more power-constrained, but these systems employ hardware over- provisioning to achieve higher system throughput because applications often do not consume the peak power capacity of nodes. ...
Suffix Array Construction on Multi-GPU Systems
Suffix arrays are prevalent data structures that are fundamental to a wide range of applications, including bioinformatics, data compression, and information retrieval. Therefore, various algorithms for (parallel) suffix array construction both on CPUs and ...
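As background for readers unfamiliar with the data structure: a suffix array lists the starting indices of a string's suffixes in lexicographic order. The naive construction below is an illustration only, not one of the parallel or GPU algorithms the paper concerns:

```python
def suffix_array(s):
    # Naive construction: sort all suffix start indices by the suffix
    # text they begin. This costs O(n^2 log n) in the worst case;
    # practical (and GPU) constructions use O(n log n) or O(n) methods.
    return sorted(range(len(s)), key=lambda i: s[i:])

# Suffixes of "banana" in sorted order:
# "a" (5), "ana" (3), "anana" (1), "banana" (0), "na" (4), "nana" (2)
print(suffix_array("banana"))  # [5, 3, 1, 0, 4, 2]
```

The quadratic behavior of this naive sort on long, repetitive inputs (common in bioinformatics) is precisely why specialized multi-core and multi-GPU construction algorithms matter.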
CuLDA: Solving Large-scale LDA Problems on GPUs
Latent Dirichlet Allocation (LDA) is a popular topic model. Given that the input corpus of LDA algorithms consists of millions to billions of tokens, the LDA training process is very time-consuming, which prevents the adoption of LDA in many ...
Better Late Than Never: An n-Variant Framework of Verification for Java Source Code on CPU x GPU Hybrid Platform
A method of detecting malicious intrusions and runtime faults in software is proposed, which replicates untrusted computations onto two diverse but often co-located instruction architectures: CPU and GPU. Divergence between the replicated computations ...
UMR-EC: A Unified and Multi-Rail Erasure Coding Library for High-Performance Distributed Storage Systems
Distributed storage systems typically need data to be stored redundantly to guarantee data durability and reliability. While the conventional approach towards this objective is to store multiple replicas, today's unprecedented data growth rates ...
GAugur: Quantifying Performance Interference of Colocated Games for Improving Resource Utilization in Cloud Gaming
- Yusen Li,
- Chuxu Shan,
- Ruobing Chen,
- Xueyan Tang,
- Wentong Cai,
- Shanjiang Tang,
- Xiaoguang Liu,
- Gang Wang,
- Xiaoli Gong,
- Ying Zhang
Cloud gaming has been very popular recently, but providing satisfactory gaming experiences to players at a modest cost is still challenging. Colocating several games onto one server could improve server utilization. To enable efficient colocations while ...
Adaptive Resource Views for Containers
As OS-level virtualization advances, containers have become a viable alternative to virtual machines in deploying applications in the cloud. Unlike virtual machines, which allow guest OSes to run atop virtual hardware, containers have direct access to ...
Semantic-aware Workflow Construction and Analysis for Distributed Data Analytics Systems
Logging is a universal approach to recording important events in system workflows of distributed systems. Current log analysis tools ignore the semantic knowledge that is key to workflow construction and analysis. In addition, they focus on ...
Index Terms
- Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing