DOI: 10.1145/3208040
HPDC '18: Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing
ACM 2018 Proceedings
Publisher:
  • Association for Computing Machinery, New York, NY, United States
Conference:
HPDC '18: The 27th International Symposium on High-Performance Parallel and Distributed Computing, Tempe, Arizona, June 11-15, 2018
ISBN:
978-1-4503-5785-2
Published:
11 June 2018
Abstract

It is a great pleasure to welcome you to the 27th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2018), and with it to Tempe, Arizona, United States of America. This year's HPDC continues its nearly three-decade tradition as the premier annual conference for presenting the latest research on the design, implementation, evaluation, and use of parallel and distributed systems for high-end computing.

abstract
Reproducibility in computational and data-enabled science

As computation becomes central to scientific research and discovery, new questions arise regarding the implementation, dissemination, and evaluation of methods that underlie scientific claims. I present a framework for conceptualizing the affordances ...

SESSION: Operating systems
research-article
PicoDriver: fast-path device drivers for multi-kernel operating systems

Lightweight kernel (LWK) operating systems (OS) in high-end supercomputing have a proven track record of excellent scalability. However, the lack of full Linux compatibility and limited availability of device drivers in LWKs have prohibited their wide-...

research-article
Hard real-time scheduling for parallel run-time systems

High performance parallel computing demands careful synchronization, timing, performance isolation and control, as well as the avoidance of OS and other types of noise. The employment of soft real-time systems toward these ends has already shown ...

SESSION: Fault tolerance - I
research-article
ABFR: convenient management of latent error resilience using application knowledge

Exascale systems face high error-rates due to increasing scale (10⁹ cores), software complexity and rising memory error rates. Increasingly, errors escape immediate hardware-level detection, silently corrupting application states. Such latent errors can ...

research-article
Public Access
Desh: deep learning for system health prediction of lead times to failure in HPC

Today's large-scale supercomputers encounter faults on a daily basis. Exascale systems are likely to experience even higher fault rates due to increased component count and density. Triggering resilience-mitigating techniques remains a challenge due to ...

research-article
Public Access
Improving performance of iterative methods by lossy checkpointing

Iterative methods are commonly used approaches to solve large, sparse linear systems, which are fundamental operations for many modern scientific simulations. When the large-scale iterative methods are running with a large number of ranks in parallel, ...

SESSION: Massively multicore systems
research-article
Public Access
Efficient sparse-matrix multi-vector product on GPUs

Sparse Matrix-Vector (SpMV) and Sparse Matrix-Multivector (SpMM) products are key kernels for computational science and data science. While GPUs offer significantly higher peak performance and memory bandwidth than multicore CPUs, achieving high ...
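For readers unfamiliar with the kernel being optimized, the following is a minimal pure-Python sketch of SpMV over the common CSR (compressed sparse row) layout; the function name and example matrix are illustrative assumptions, not the paper's GPU implementation.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    values  -- nonzero entries, row by row
    col_idx -- column index of each nonzero
    row_ptr -- row i's nonzeros occupy values[row_ptr[i]:row_ptr[i+1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y

# The 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form:
values  = [1.0, 2.0, 3.0]
col_idx = [0, 2, 1]
row_ptr = [0, 2, 3]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0]
```

SpMM (the multi-vector product) applies the same traversal to several right-hand-side vectors at once, which is where reuse of the loaded matrix entries becomes the key optimization target on GPUs.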

research-article
Public Access
CommAnalyzer: automated estimation of communication cost and scalability on HPC clusters from sequential code

To deliver scalable performance to large-scale scientific and data analytic applications, HPC cluster architectures adopt the distributed-memory model. The performance and scalability of parallel applications on such systems are limited by the ...

research-article
A high-performance connected components implementation for GPUs

Computing connected components is an important graph algorithm that is used, for example, in medicine, image processing, and biochemistry. This paper presents a fast connected-components implementation for GPUs called ECL-CC. It builds upon the best ...
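As context, a standard CPU baseline for this problem is union-find with path compression, sketched below; this is only a reference illustration under that assumption, not the ECL-CC algorithm the paper presents.

```python
def connected_components(n, edges):
    """Label the connected components of an undirected graph on n
    vertices given as an edge list; returns one root label per vertex."""
    parent = list(range(n))

    def find(v):
        # Walk to the root, halving the path as we go (path compression).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        # Union the components containing the two endpoints.
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    return [find(v) for v in range(n)]

labels = connected_components(5, [(0, 1), (1, 2), (3, 4)])
# Vertices 0-2 share one label; vertices 3-4 share another.
```

GPU formulations such as ECL-CC restructure this hooking-and-compressing pattern so that many vertices can be processed concurrently without locks.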

abstract
Cambrian explosion of computing and big data in the post-moore era

The so-called "Moore's Law", by which processor performance increases exponentially by a factor of 4 every 3 years or so, is slated to end within a 10--15 year timeframe as VLSI lithography reaches its limits around that time, ...

SESSION: Runtime systems
research-article
Public Access
PShifter: feedback-based dynamic power shifting within HPC jobs for performance

The US Department of Energy (DOE) has set a power target of 20-30MW on the first exascale machines. To achieve one exaFLOPS under this power constraint, it is necessary to manage power intelligently while maximizing performance. Most production-level ...

research-article
ADAPT: an event-based adaptive collective communication framework

The increase in scale and heterogeneity of high-performance computing (HPC) systems predispose the performance of Message Passing Interface (MPI) collective communications to be susceptible to noise, and to adapt to a complex mix of hardware ...

research-article
Best Paper
Process-in-process: techniques for practical address-space sharing

The two most common parallel execution models for many-core CPUs today are multiprocess (e.g., MPI) and multithread (e.g., OpenMP). The multiprocess model allows each process to own a private address space, although processes can explicitly allocate ...

SESSION: Fault tolerance - II
research-article
Open Access
Thread-local concurrency: a technique to handle data race detection at programming model abstraction

With greater adoption of various high-level parallel programming models to harness on-node parallelism, accurate data race detection has become more crucial than ever. However, existing tools have great difficulty spotting data races through these high-...

research-article
LADR: low-cost application-level detector for reducing silent output corruptions

Applications running on future high performance computing (HPC) systems are more likely to experience transient faults due to technology scaling trends with respect to higher circuit density, smaller transistor size and near-threshold voltage (NTV) ...

research-article
Public Access
Profiling distributed systems in lightweight virtualized environments with logs and resource metrics

Understanding and troubleshooting distributed systems in the cloud is considered a very difficult problem because the execution of a single user request is distributed across multiple machines. Moreover, the multi-tenant nature of cloud environments further ...

SESSION: Performance modeling and analysis
research-article
Tuyere: enabling scalable memory workloads for system exploration

Memory technologies are under active development. Meanwhile, workloads on contemporary computing systems are increasing rapidly in size and diversity. Such dynamics in hardware and software further widen the gap between memory system design and ...

research-article
Public Access
Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up

This paper targets an important class of applications that requires combining HPC simulations with data analysis for online or real-time scientific discovery. We use the state-of-the-art parallel-IO and data-staging libraries to build simulation-time ...

research-article
ForkTail: a black-box fork-join tail latency prediction model for user-facing datacenter workloads

The workflows of the predominant user-facing datacenter services, including web searching and social networking, are underlaid by various Fork-Join structures. Due to a lack of understanding of the performance of Fork-Join structures in general, today's ...

abstract
Public Access
The biology of software

Biological design principles can potentially change the way we study, engineer, maintain, and develop large dynamic software systems. For example, computer programmers like to think of software as the product of intelligent design, carefully crafted to ...

SESSION: Storage and I/O
research-article
Public Access
Hermes: a heterogeneous-aware multi-tiered distributed I/O buffering system

Modern High-Performance Computing (HPC) systems are adding extra layers to the memory and storage hierarchy named deep memory and storage hierarchy (DMSH), to increase I/O performance. New hardware technologies, such as NVMe and SSD, have been ...

research-article
NVStream: accelerating HPC workflows with NVRAM-based transport for streaming objects

Nonvolatile memory technologies (NVRAM) with larger capacity relative to DRAM and faster persistence relative to block-based storage technologies are expected to play a crucial role in accelerating I/O performance for HPC scientific workflows. Typically,...

research-article
Public Access
Parallelizing garbage collection with I/O to improve flash resource utilization

Garbage Collection (GC) has been a critical optimization target for improving the performance of flash-based Solid State Drives (SSDs); the long-lasting GC process occupies the flash resources, thereby blocking normal I/O requests and increasing ...

SESSION: Big data
research-article
Transparent speculation in geo-replicated transactional data stores

This work presents Speculative Transaction Replication (STR), a protocol that exploits transparent speculation techniques to enhance performance of geo-distributed, partially replicated transactional data stores. In addition, we define a new consistency ...

research-article
Cross-geography scientific data transferring trends and behavior

Wide area data transfers play an important role in many science applications but rely on expensive infrastructure that often delivers disappointing performance in practice. In response, we present a systematic examination of a large set of data transfer ...

Contributors
  • University of Minnesota Twin Cities
  • Lawrence Berkeley National Laboratory

Acceptance Rates

HPDC '18 Paper Acceptance Rate 22 of 111 submissions, 20%;
Overall Acceptance Rate 166 of 966 submissions, 17%
Year       Submitted  Accepted  Rate
HPDC '19         106        22   21%
HPDC '18         111        22   20%
HPDC '17         100        19   19%
HPDC '16         129        20   16%
HPDC '15         116        19   16%
HPDC '14         130        21   16%
HPDC '13         131        20   15%
HPDC '12         143        23   16%
Overall          966       166   17%