ProteusTM: Abstraction Meets Performance in Transactional Memory

Published: 25 March 2016
  • Abstract

    The Transactional Memory (TM) paradigm promises to greatly simplify the development of concurrent applications. Over the years, this has led to a plethora of TM implementations whose performance varies widely across workloads. Yet no single implementation fits every workload; in fact, the best TM for one workload can prove disastrous for another. This forces developers to face the complex task of tuning TM implementations, which significantly hampers the wide adoption of TM. In this paper, we address the challenge of automatically identifying the best TM implementation for a given workload. Our proposed system, ProteusTM, hides a large library of implementations behind the TM interface. Underneath, it leverages a novel multi-dimensional online optimization scheme that combines two popular learning techniques: Collaborative Filtering and Bayesian Optimization.
    We integrated ProteusTM in GCC and demonstrate its ability to switch between TMs and adapt several configuration parameters (e.g., number of threads). We extensively evaluated ProteusTM, obtaining average performance within 3% of optimal and gains of up to 100x over static alternatives.
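
    The abstract does not spell out the optimization scheme, so the following is only a generic illustration of the collaborative-filtering idea it names, not ProteusTM's actual algorithm. It treats previously profiled workloads as "users" and configurations as "items", measures a new workload on a few configurations, and borrows the missing values from the most similar known workloads. The configuration names, the performance numbers, and the functions `predict` and `recommend` are all hypothetical. The Bayesian Optimization part (choosing which configurations to sample next) is not covered.

```python
import math

# Hypothetical (TM backend, thread count) configurations; names are illustrative only.
CONFIGS = [("stm_a", 2), ("stm_a", 8), ("stm_b", 8), ("htm", 4)]

# Made-up normalized throughput of already profiled workloads (rows)
# under each configuration (columns). Not data from the paper.
KNOWN = [
    [0.40, 1.00, 0.70, 0.50],
    [0.90, 0.30, 0.20, 1.00],
    [0.50, 0.90, 1.00, 0.40],
]


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict(samples, known=KNOWN, k=1):
    """Fill in unsampled configurations for a new workload.

    samples maps a configuration index to its measured value.
    Similarity to each known workload is computed on the sampled
    columns only; the k most similar workloads vote, weighted by
    their similarity.
    """
    idx = sorted(samples)
    target = [samples[i] for i in idx]
    scored = [(cosine(target, [row[i] for i in idx]), row) for row in known]
    neighbours = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(w for w, _ in neighbours)
    preds = []
    for c in range(len(known[0])):
        if c in samples:
            preds.append(samples[c])  # keep real measurements as they are
        elif total > 0:
            preds.append(sum(w * row[c] for w, row in neighbours) / total)
        else:
            preds.append(0.0)  # no usable neighbour: no information
    return preds


def recommend(samples, k=1):
    """Return the configuration with the highest (measured or predicted) value."""
    preds = predict(samples, k=k)
    best = max(range(len(preds)), key=preds.__getitem__)
    return CONFIGS[best], preds
```

    With the made-up numbers above, `recommend({0: 0.85, 1: 0.35})` samples only the first two configurations, finds that the new workload most resembles the second known workload, and picks `("htm", 4)` even though that configuration was never run for the new workload.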


    Published In

    ACM SIGPLAN Notices  Volume 51, Issue 4
    ASPLOS '16
    April 2016
    774 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2954679
    • Editor: Andy Gill
      ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2016
      824 pages
      ISBN:9781450340915
      DOI:10.1145/2872362
      • General Chair: Tom Conte
      • Program Chair: Yuanyuan Zhou
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 25 March 2016
    Published in SIGPLAN Volume 51, Issue 4

    Author Tags

    1. adaptive system
    2. performance tuning
    3. recommender systems
    4. transactional memory

    Qualifiers

    • Research-article
