iThreads: A Threading Library for Parallel Incremental Computation

Published: 14 March 2015

Abstract

Incremental computation strives for efficient successive runs of applications by re-executing only those parts of the computation that are affected by a given input change instead of recomputing everything from scratch. To realize these benefits automatically, we describe iThreads, a threading library for parallel incremental computation. iThreads supports unmodified shared-memory multithreaded programs: it can be used as a replacement for pthreads by a simple exchange of dynamically linked libraries, without even recompiling the application code. To enable such an interface, we designed algorithms and an implementation to operate at the compiled binary code level by leveraging MMU-assisted memory access tracking and process-based thread isolation. Our evaluation on a multicore platform using applications from the PARSEC and Phoenix benchmarks and two case-studies shows significant performance gains.
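The idea in the first sentence of the abstract, re-executing only the parts of a computation affected by an input change, can be illustrated at a much coarser granularity than iThreads uses. The following is a minimal sketch, not the iThreads mechanism: it assumes the input is split into fixed-size chunks and uses content hashes to detect changes, whereas the abstract describes tracking at the compiled binary level with MMU-assisted memory access tracking. The names `IncrementalRunner`, `work_fn`, and `chunk_size` are hypothetical and chosen only for this illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


class IncrementalRunner:
    """Toy incremental runner: memoizes per-chunk results by input hash.

    Illustration only. This is hash-based memoization over explicit chunks,
    not the MMU-assisted tracking described in the abstract.
    """

    def __init__(self, work_fn, chunk_size):
        self.work_fn = work_fn        # function applied to each chunk (assumed pure)
        self.chunk_size = chunk_size  # bytes per chunk (hypothetical parameter)
        self.memo = {}                # chunk index -> (input digest, result)
        self.recomputed = []          # chunk indices re-executed by the last run

    def run(self, data: bytes):
        size = self.chunk_size
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        digests = [hashlib.sha256(c).hexdigest() for c in chunks]

        # A chunk is stale if it was never seen or its content hash changed.
        stale = [i for i, d in enumerate(digests)
                 if self.memo.get(i, (None, None))[0] != d]

        # Re-execute only the stale chunks, in parallel.
        with ThreadPoolExecutor() as pool:
            fresh = list(pool.map(lambda i: self.work_fn(chunks[i]), stale))
        for i, result in zip(stale, fresh):
            self.memo[i] = (digests[i], result)

        # Forget results for chunks that no longer exist in the input.
        for i in list(self.memo):
            if i >= len(chunks):
                del self.memo[i]

        self.recomputed = stale
        return [self.memo[i][1] for i in range(len(chunks))]


if __name__ == "__main__":
    runner = IncrementalRunner(work_fn=sum, chunk_size=4)
    runner.run(bytes(range(16)))      # initial run executes all four chunks
    changed = bytearray(range(16))
    changed[5] = 99                   # modify one byte, which lies in chunk 1
    runner.run(bytes(changed))
    print(runner.recomputed)          # prints [1]
```

In this sketch the second run reuses three of the four memoized results and re-executes a single chunk. Unlike this toy, the library described in the abstract requires no such restructuring of the application code.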

Published In

ACM SIGPLAN Notices, Volume 50, Issue 4
ASPLOS '15
April 2015
676 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2775054
Editor: Andy Gill

ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems
March 2015
720 pages
ISBN: 9781450328357
DOI: 10.1145/2694344

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 March 2015
Published in SIGPLAN Volume 50, Issue 4

Author Tags

  1. concurrent dynamic dependence graph (CDDG)
  2. incremental computation
  3. memoization
  4. release consistency (RC) memory model
  5. self-adjusting computation
  6. shared-memory multithreading

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months): 12
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 24 Jan 2025

Cited By

  • (2017) Function call interception techniques. Software: Practice and Experience 48:3 (385-401). DOI: 10.1002/spe.2501. Online publication date: 16-May-2017
  • (2024) From Batch to Stream: Automatic Generation of Online Algorithms. Proceedings of the ACM on Programming Languages 8:PLDI (1014-1039). DOI: 10.1145/3656418. Online publication date: 20-Jun-2024
  • (2021) Kard: lightweight data race detection with per-thread memory protection. Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (647-660). DOI: 10.1145/3445814.3446727. Online publication date: 19-Apr-2021
  • (2021) Efficient Parallel Self-Adjusting Computation. Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures (59-70). DOI: 10.1145/3409964.3461799. Online publication date: 6-Jul-2021
  • (2019) Incremental Sliding Window Analytics. Encyclopedia of Big Data Technologies (1007-1015). DOI: 10.1007/978-3-319-77525-8_156. Online publication date: 20-Feb-2019
  • (2019) Approximate Computing for Stream Analytics. Encyclopedia of Big Data Technologies (90-97). DOI: 10.1007/978-3-319-77525-8_153. Online publication date: 20-Feb-2019
  • (2019) Privacy-Preserving Data Analytics. Encyclopedia of Big Data Technologies (1292-1300). DOI: 10.1007/978-3-319-77525-8_152. Online publication date: 20-Feb-2019
  • (2019) Incremental Approximate Computing. Encyclopedia of Big Data Technologies (1000-1007). DOI: 10.1007/978-3-319-77525-8_151. Online publication date: 20-Feb-2019
  • (2018) D4: fast concurrency debugging with parallel differential analysis. ACM SIGPLAN Notices 53:4 (359-373). DOI: 10.1145/3296979.3192390. Online publication date: 11-Jun-2018
  • (2018) Thread-safe reactive programming. Proceedings of the ACM on Programming Languages 2:OOPSLA (1-30). DOI: 10.1145/3276477. Online publication date: 24-Oct-2018
