research-article
Open access

Efficient parameterized algorithms for data packing

Published: 02 January 2019

    Abstract

    There is a huge gap between the speeds of modern caches and main memories, and therefore cache misses account for a considerable loss of efficiency in programs. The predominant technique to address this issue has been Data Packing: data elements that are frequently accessed within time proximity are packed into the same cache block, thereby minimizing accesses to the main memory. We consider the algorithmic problem of Data Packing on a two-level memory system. Given a reference sequence R of accesses to data elements, the task is to partition the elements into cache blocks such that the number of cache misses on R is minimized. The problem is notoriously difficult: it is NP-hard even when the cache has size 1, and is hard to approximate for any cache size larger than 4. Therefore, all existing techniques for Data Packing are based on heuristics and lack theoretical guarantees. In this work, we present the first positive theoretical results for Data Packing, along with new and stronger negative results. We consider the problem under the lens of the underlying access hypergraphs, which are hypergraphs of affinities between the data elements, where the order of an access hypergraph corresponds to the size of the affinity group. We study the problem parameterized by the treewidth of access hypergraphs, which is a standard notion in graph theory to measure the closeness of a graph to a tree. Our main results are as follows: we show that there is a number q* depending on the cache parameters such that (a) if the access hypergraph of order q* has constant treewidth, then there is a linear-time algorithm for Data Packing; (b) the Data Packing problem remains NP-hard even if the access hypergraph of order q*−1 has constant treewidth. Thus, we establish a fine-grained dichotomy depending on a single parameter, namely, the highest order among access hypergraphs that have constant treewidth; and establish the optimal value q* of this parameter.

    Finally, we present an experimental evaluation of a prototype implementation of our algorithm. Our results demonstrate that, in practice, access hypergraphs of many commonly used algorithms have small treewidth. We compare our approach with several state-of-the-art heuristic-based algorithms and show that our algorithm leads to significantly fewer cache misses.
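
    The objective stated in the abstract (partition the elements into cache blocks so that the number of cache misses on R is minimized) can be illustrated with a minimal sketch. This is not the algorithm of the paper; it only evaluates the cost of a given packing. The LRU replacement policy, the function name count_misses, and the toy sequence are assumptions made for illustration, since the abstract does not fix a replacement policy.

```python
from collections import OrderedDict


def count_misses(reference, packing, cache_blocks):
    """Count cache misses of `reference` under a given packing.

    reference:    the access sequence R (a list of data elements).
    packing:      dict mapping each data element to a block id.
    cache_blocks: number of blocks the cache holds (assumed >= 1).

    Assumption: the cache evicts the least recently used block (LRU).
    """
    cache = OrderedDict()  # block ids, ordered from least to most recently used
    misses = 0
    for element in reference:
        block = packing[element]
        if block in cache:
            cache.move_to_end(block)  # hit: mark the block as most recently used
        else:
            misses += 1  # miss: the block must be loaded from main memory
            if len(cache) >= cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = None
    return misses


# Toy reference sequence: a and b are accessed together, then c and d.
R = ["a", "b", "a", "b", "c", "d", "c", "d"]
good = {"a": 0, "b": 0, "c": 1, "d": 1}  # packs elements with affinity together
bad = {"a": 0, "c": 0, "b": 1, "d": 1}   # separates elements with affinity

print(count_misses(R, good, 1))  # 2
print(count_misses(R, bad, 1))   # 8
```

    With a cache that holds a single block, the packing that groups a with b and c with d incurs 2 misses, while the packing that separates them misses on every access. Data Packing asks for the packing that minimizes this count.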

    Supplementary Material

    WEBM File (a53-goharshady.webm)

    Published In

    Proceedings of the ACM on Programming Languages, Volume 3, Issue POPL
    January 2019
    2275 pages
    EISSN:2475-1421
    DOI:10.1145/3302515
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 02 January 2019
    Published in PACMPL Volume 3, Issue POPL


    Author Tags

    1. cache management
    2. compilers
    3. data locality
    4. data packing

    Qualifiers

    • Research-article

    Cited By

    • (2024) Reordering Functions in Mobiles Apps for Reduced Size and Faster Start-Up. ACM Transactions on Embedded Computing Systems 23:4 (1-54). DOI: 10.1145/3660635. Online publication date: 20-Apr-2024.
    • (2023) Optimizing Function Layout for Mobile Applications. Proceedings of the 24th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (52-63). DOI: 10.1145/3589610.3596277. Online publication date: 13-Jun-2023.
    • (2023) Efficient Interprocedural Data-Flow Analysis Using Treedepth and Treewidth. Verification, Model Checking, and Abstract Interpretation (177-202). DOI: 10.1007/978-3-031-24950-1_9. Online publication date: 17-Jan-2023.
    • (2022) Profile inference revisited. Proceedings of the ACM on Programming Languages 6:POPL (1-24). DOI: 10.1145/3498714. Online publication date: 12-Jan-2022.
    • (2022) Optimal Mining: Maximizing Bitcoin Miners' Revenues from Transaction Fees. 2022 IEEE International Conference on Blockchain (Blockchain) (266-273). DOI: 10.1109/Blockchain55522.2022.00044. Online publication date: Aug-2022.
    • (2020) An efficient algorithm for computing network reliability in small treewidth. Reliability Engineering & System Safety 193 (106665). DOI: 10.1016/j.ress.2019.106665. Online publication date: Jan-2020.
    • (2020) Faster Algorithms for Quantitative Analysis of MCs and MDPs with Small Treewidth. Automated Technology for Verification and Analysis (253-270). DOI: 10.1007/978-3-030-59152-6_14. Online publication date: 12-Oct-2020.
    • (2020) Optimal and Perfectly Parallel Algorithms for On-demand Data-Flow Analysis. Programming Languages and Systems (112-140). DOI: 10.1007/978-3-030-44914-8_5. Online publication date: 18-Apr-2020.
    • (2019) Codestitcher: inter-procedural basic block layout optimization. Proceedings of the 28th International Conference on Compiler Construction (65-75). DOI: 10.1145/3302516.3307358. Online publication date: 16-Feb-2019.
    • (2019) The treewidth of smart contracts. Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing (400-408). DOI: 10.1145/3297280.3297322. Online publication date: 8-Apr-2019.