- poster, February 2013
Scalable statistics counters
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 307–308
https://doi.org/10.1145/2442516.2442558
Naive statistics counters that are commonly used to monitor system events and performance become a scalability bottleneck as systems become larger and more NUMA; furthermore some are so inaccurate that they are not useful. We present a number of ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- poster, February 2013
Runtime elision of transactional barriers for captured memory
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 303–304
https://doi.org/10.1145/2442516.2442556
In this paper, we propose a new technique that can identify transaction-local memory (i.e. captured memory), in managed environments, while having a low runtime overhead. We implemented our proposal in a well known STM framework (Deuce) and we tested it ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- poster, February 2013
Reducing contention through priority updates
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 299–300
https://doi.org/10.1145/2442516.2442554
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- poster, February 2013
RaceFree: an efficient multi-threading model for determinism
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 297–298
https://doi.org/10.1145/2442516.2442553
Current deterministic systems generally incur large overhead due to the difficulty of detecting and eliminating data races. This paper presents RaceFree, a novel multi-threading runtime that adopts a relaxed deterministic model to provide a data-race-...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- poster, February 2013
Programming with hardware lock elision
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 295–296
https://doi.org/10.1145/2442516.2442552
We present a simple yet effective technique for improving performance of lock-based code using the hardware lock elision (HLE) feature in Intel's upcoming Haswell processor. We also describe how to extend Haswell's HLE mechanism to achieve a similar ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- poster, February 2013
Parallel programming with big operators
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 293–294
https://doi.org/10.1145/2442516.2442551
In the sciences, it is common to use the so-called "big operator" notation to express the iteration of a binary operator (the reducer) over a collection of values. Such a notation typically assumes that the reducer is associative and abstracts the ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- poster, February 2013
Expressing graph algorithms using generalized active messages
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 289–290
https://doi.org/10.1145/2442516.2442549
Recently, graph computation has emerged as an important class of high-performance computing application whose characteristics differ markedly from those of traditional, compute-bound, kernels. Libraries such as BLAS, LAPACK, and others have been ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
ZOOMM: a parallel web browser engine for multicore mobile devices
- Calin Cascaval
- Seth Fowler
- Pablo Montesinos-Ortego
- Wayne Piekarski
- Mehrdad Reshadi
- Behnam Robatmili
- Michael Weber
- Vrajesh Bhavsar
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 271–280
https://doi.org/10.1145/2442516.2442543
We explore the challenges in expressing and managing concurrency in browsers on mobile devices. Browsers are complex applications that implement multiple standards, need to support legacy behavior, and are highly dynamic and interactive. We present ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Using hardware transactional memory to correct and simplify a readers-writer lock algorithm
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 261–270
https://doi.org/10.1145/2442516.2442542
Designing correct synchronization algorithms is notoriously difficult, as evidenced by a bug we have identified that has apparently gone unnoticed in a well-known synchronization algorithm for nearly two decades. We use hardware transactional memory (...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
TigerQuoll: parallel event-based JavaScript
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 251–260
https://doi.org/10.1145/2442516.2442541
JavaScript, the most popular language on the Web, is rapidly moving to the server-side, becoming even more pervasive. Still, JavaScript lacks support for shared memory parallelism, making it challenging for developers to exploit multicores present in ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
The tasks with effects model for safe concurrency
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 239–250
https://doi.org/10.1145/2442516.2442540
Today's widely-used concurrent programming models either provide weak safety guarantees, making it easy to write code with subtle errors, or are limited in the class of programs that they can express. We propose a new concurrent programming model based ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
StreamScan: fast scan algorithms for GPUs without global barrier synchronization
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 229–238
https://doi.org/10.1145/2442516.2442539
Scan (also known as prefix sum) is a very useful primitive for various important parallel algorithms, such as sort, BFS, SpMV, compaction and so on. Current state of the art of GPU based scan implementation consists of three consecutive Reduce-Scan-Scan ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Scalable deterministic replay in a parallel full-system emulator
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 207–218
https://doi.org/10.1145/2442516.2442537
Full-system emulation has been an extremely useful tool in developing and debugging systems software like operating systems and hypervisors. However, current full-system emulators lack the support for deterministic replay, which limits the ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Parallel suffix array and least common prefix for the GPU
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 197–206
https://doi.org/10.1145/2442516.2442536
Suffix Array (SA) is a data structure formed by sorting the suffixes of a string into lexicographic order. SAs have been used in a variety of applications, most notably in pattern matching and Burrows-Wheeler Transform (BWT) based lossless data ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Parallel schedule synthesis for attribute grammars
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 187–196
https://doi.org/10.1145/2442516.2442535
We examine how to synthesize a parallel schedule of structured traversals over trees. In our system, programs are declaratively specified as attribute grammars. Our synthesizer automatically, correctly, and quickly schedules the attribute grammar as a ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Ownership passing: efficient distributed memory programming on multi-core systems
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 177–186
https://doi.org/10.1145/2442516.2442534
The number of cores in multi- and many-core high-performance processors is steadily increasing. MPI, the de-facto standard for programming high-performance computing systems offers a distributed memory programming model. MPI's semantics force a copy ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Online-ABFT: an online algorithm based fault tolerance scheme for soft error detection in iterative methods
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 167–176
https://doi.org/10.1145/2442516.2442533
Soft errors are one-time events that corrupt the state of a computing system but not its overall functionality. Large supercomputers are especially susceptible to soft errors because of their large number of components. Soft errors can generally be ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
NUMA-aware reader-writer locks
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 157–166
https://doi.org/10.1145/2442516.2442532
Non-Uniform Memory Access (NUMA) architectures are gaining importance in mainstream computing systems due to the rapid growth of multi-core multi-chip machines. Extracting the best possible performance from these new machines will require us to revisit ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Morph algorithms on GPUs
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 147–156
https://doi.org/10.1145/2442516.2442531
There is growing interest in using GPUs to accelerate graph algorithms such as breadth-first search, computing page-ranks, and finding shortest paths. However, these algorithms do not modify the graph structure, so their implementation is relatively ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8
- research-article, February 2013
Ligra: a lightweight graph processing framework for shared memory
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, Pages 135–146
https://doi.org/10.1145/2442516.2442530
There has been significant recent interest in parallel frameworks for processing graphs due to their applicability in studying social networks, the Web graph, networks in biology, and unstructured meshes in scientific simulation. Due to the desire to ...
Also Published in:
ACM SIGPLAN Notices: Volume 48 Issue 8