DOI: 10.1145/3297858.3304074
research-article
Open access

Safer Program Behavior Sharing Through Trace Wringing

Published: 04 April 2019

Abstract

When working towards application-tuned systems, developers often find themselves caught between the need to share information (so that partners can make intelligent design choices) and the need to hide information (to protect proprietary methods or sensitive data). One place where this problem comes to a head is in the release of program traces, for example, a memory address trace. A trace taken from a production server might expose details about who the users are or what they are doing, or it might even expose details of the actual computation itself (e.g., through a side channel). Engineers are often asked to make, by hand, "analogs" of their codes that would be free from such sensitive data, or they may even try to describe behaviors at a high level in words. Both of these approaches lead to missed opportunities, confusion, and frustration. We propose a new problem for study, trace-wringing, that seeks to remove as much information from the trace as possible while still maintaining key characteristics of the original. We formalize this problem and show that, for a specific instance around memory traces, as little as a few thousand bits need to be shared. We demonstrate experimentally that the trace-wrung proxies behave similarly in the context of cache simulation but with bounded leakage, and examine the sensitivity of wrung traces to a class of attacks on AES encryption.
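
To make the idea of a proxy that "behaves similarly in the context of cache simulation" concrete, the sketch below compares the miss rate of an address trace with that of a stand-in trace that keeps only the access pattern and discards the real base address. It is an illustration only and not the paper's trace-wringing algorithm: the direct-mapped cache model, the `miss_rate` and `strided_trace` helpers, and every parameter value are assumptions chosen for this example.

```python
from typing import Iterable, List, Optional


def miss_rate(trace: Iterable[int], num_sets: int = 64, line_bytes: int = 64) -> float:
    """Miss rate of a direct-mapped cache over a trace of byte addresses.

    Assumption: a toy cache model, not the simulator used in the paper.
    """
    # One resident cache-line number per set; None marks an empty set.
    resident: List[Optional[int]] = [None] * num_sets
    misses = 0
    total = 0
    for addr in trace:
        line = addr // line_bytes   # cache-line number of this access
        index = line % num_sets     # set the line maps to
        total += 1
        if resident[index] != line:
            misses += 1
            resident[index] = line  # fill or replace on a miss
    return misses / total if total else 0.0


def strided_trace(base: int, stride: int, count: int, repeats: int) -> List[int]:
    """Hypothetical trace: `repeats` sweeps over `count` addresses spaced `stride` bytes apart."""
    return [base + i * stride for _ in range(repeats) for i in range(count)]


# "Original" trace: contains a real-looking base address that we may not want to share.
original = strided_trace(base=0x7F3A1000, stride=8, count=512, repeats=4)

# Stand-in proxy: same stride, length, and repetition, but the base address is dropped.
proxy = strided_trace(base=0, stride=8, count=512, repeats=4)

original_rate = miss_rate(original)
proxy_rate = miss_rate(proxy)
print(f"original miss rate: {original_rate:.5f}")
print(f"proxy miss rate:    {proxy_rate:.5f}")
```

Under this toy model both traces miss on the same fraction of accesses, even though the proxy no longer reveals where in memory the original data lived. A real evaluation would sweep many cache configurations and use traces with far less regular structure.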

    Published In

    ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
    April 2019
    1126 pages
    ISBN:9781450362405
    DOI:10.1145/3297858
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. privacy of traces
    2. synthetic trace generation
    3. trace compression

    Qualifiers

    • Research-article

    Conference

    ASPLOS '19

    Acceptance Rates

    ASPLOS '19 paper acceptance rate: 74 of 351 submissions (21%)
    Overall acceptance rate: 535 of 2,713 submissions (20%)

    Article Metrics

    • Downloads (last 12 months): 95
    • Downloads (last 6 weeks): 13
    Reflects downloads up to 09 Nov 2024

    Cited By

    • (2024) Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces. ACM Transactions on Architecture and Code Optimization 21:2 (1-23). DOI: 10.1145/3650110. Online publication date: 21-May-2024
    • (2023) Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks. Proceedings of the 50th Annual International Symposium on Computer Architecture (1-13). DOI: 10.1145/3579371.3589072. Online publication date: 17-Jun-2023
    • (2022) Update with care. Journal of Systems and Software 191:C. DOI: 10.1016/j.jss.2022.111381. Online publication date: 1-Sep-2022
    • (2021) Context-Aware Privacy-Optimizing Address Tracing. 2021 International Symposium on Secure and Private Execution Environment Design (SEED) (150-162). DOI: 10.1109/SEED51797.2021.00027. Online publication date: Sep-2021
