DOI: 10.1145/2647508.2647509
Research article

Fast Java profiling with scheduling-aware stack fragment sampling and asynchronous analysis

Published: 23 September 2014

    Abstract

    Sampling is a popular approach to profiling because it typically has only a small impact on performance and does not modify the profiled application. Common sampling profilers collect data about an application by pausing the application threads, walking the stacks to create stack traces, and then adding the traces to their profile. Waiting threads are often sampled as well, even when they have not been active since their last sample. Sampling profilers for Java commonly rely on safepoints, which are locations in the Java code where a thread can pause to be sampled. However, restricting profiling to these locations affects the accuracy of the profile, and the safepoint mechanism itself imposes significant pause times on the application.
    We present stack fragment sampling, a new approach for Java applications that minimizes pause times and eliminates redundant samples altogether. It interrupts an application thread only to copy a fragment of its stack to a buffer and then immediately resumes its execution. Retrieving and decoding the stack fragments happens asynchronously and can run on a separate core or on another processor. Our approach integrates with the operating system to only take samples of threads while they are running in order to avoid redundant samples. We demonstrate that our approach has a very small impact on performance even at high sampling rates. Furthermore, we validate our approach by comparing our profiles to those from a profiler using safepoints as well as to a VM-internal profiler that does not use safepoints. The results show that the profiles agree.
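
    The division of work described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the implementation from the paper: the "stacks" are plain `long[]` arrays of made-up method ids rather than real thread stacks, the `running` flag stands in for the scheduling information obtained from the operating system, and every name (`FragmentSamplingSketch`, `sample`, `analyzePending`, the symbol table) is hypothetical. It only shows the shape of the idea: the sampling side copies a bounded fragment into a buffer and returns, and decoding happens later on another thread.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: "stacks" are long[] arrays of made-up method ids, not real
// thread stacks, and every name here is hypothetical.
final class FragmentSamplingSketch {

    // Raw, undecoded copy of the topmost words of one simulated stack.
    static final class Fragment {
        final long[] words;
        Fragment(long[] words) { this.words = words; }
    }

    private final ArrayBlockingQueue<Fragment> buffer;
    private final int maxWords;
    // Decoded trace -> number of samples; written by the analysis side.
    final Map<String, Integer> profile = new ConcurrentHashMap<>();

    FragmentSamplingSketch(int bufferSlots, int maxWords) {
        this.buffer = new ArrayBlockingQueue<>(bufferSlots);
        this.maxWords = maxWords;
    }

    // Sampling side: skip threads that are not running, otherwise copy at most
    // maxWords from the top of the stack and return at once. offer() never
    // blocks, so a full buffer drops the sample instead of stalling.
    boolean sample(boolean running, long[] stack, int depth) {
        if (!running) {
            return false;
        }
        int n = Math.min(depth, maxWords);
        long[] copy = Arrays.copyOfRange(stack, depth - n, depth);
        return buffer.offer(new Fragment(copy));
    }

    // Analysis side: drain the buffer and decode each fragment into a readable
    // trace (innermost frame first). Returns the number of fragments decoded.
    int analyzePending(Map<Long, String> symbols) {
        int decoded = 0;
        Fragment f;
        while ((f = buffer.poll()) != null) {
            StringBuilder trace = new StringBuilder();
            for (int i = f.words.length - 1; i >= 0; i--) {
                if (trace.length() > 0) {
                    trace.append(" <- ");
                }
                trace.append(symbols.getOrDefault(f.words[i], "?"));
            }
            profile.merge(trace.toString(), 1, Integer::sum);
            decoded++;
        }
        return decoded;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Long, String> symbols = Map.of(1L, "main", 2L, "parse", 3L, "readToken");
        FragmentSamplingSketch sampler = new FragmentSamplingSketch(16, 2);
        long[] stack = {1L, 2L, 3L}; // index 0 is the outermost frame

        sampler.sample(true, stack, 3);   // running: fragment is copied
        sampler.sample(false, stack, 3);  // waiting: no sample is taken

        // Decoding happens on another thread, off the sampled thread's path.
        Thread analyzer = new Thread(() -> sampler.analyzePending(symbols));
        analyzer.start();
        analyzer.join();
        System.out.println(sampler.profile);
    }
}
```

    In this sketch the cost paid on the sampling side is one bounded array copy and a non-blocking enqueue, which mirrors the abstract's claim that the application thread is interrupted only long enough to copy a fragment. Dropping a sample when the buffer is full is a design choice of the sketch, not something the abstract specifies.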


    Cited By

    • (2021) System execution path profiling using hardware performance counters. 2021 IEEE International Systems Conference (SysCon), pp. 1-8. DOI: 10.1109/SysCon48628.2021.9447121. Online publication date: 15-Apr-2021.
    • (2016) UPC: Large-Scale Memory Efficient Java Primitive Collections. Journal of Software 11:3, pp. 251-271. DOI: 10.17706/jsw.11.3.251-271. Online publication date: 2016.

        Published In

        PPPJ '14: Proceedings of the 2014 International Conference on Principles and Practices of Programming on the Java platform: Virtual machines, Languages, and Tools
        September 2014
        214 pages
        ISBN:9781450329262
        DOI:10.1145/2647508
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Java
        2. calling context
        3. monitoring
        4. profiling
        5. sampling

        Qualifiers

        • Research-article

        Conference

        PPPJ '14
        Acceptance Rates

        Overall Acceptance Rate 29 of 58 submissions, 50%
