research-article
Open access

Idiom recognition framework using topological embedding

Published: 16 September 2013

    Abstract

    Modern processors provide hardware-assist instructions (such as the TRT and TROT instructions on IBM System z) that accelerate functions such as delimiter search and character conversion. High-performance libraries often use these special instructions, but optimizing compilers have exploited them only to a limited extent. We devised a new idiom recognition technique, based on a topological embedding algorithm, that detects idiom patterns in input programs more aggressively than previous approaches based on exact pattern matching. Our approach can detect a pattern even if the code segment does not exactly match the idiom; for example, it can detect a code segment that contains additional code within the idiom pattern. We also propose an instruction simplification for idiom recognition. This optimization analyzes all uses of the output of the optimized code for a specific idiom. If the uses need only some value in a subrange rather than the actual output value, we can assign any value in that subrange as the output, which lets the code generator emit faster code. We implemented our approach in the Java Just-In-Time (JIT) compiler that is part of the J9 Java Virtual Machine, and we support several important idioms for the special hardware-assist instructions on IBM System z and on some models of IBM System p. To demonstrate the effectiveness of our technique, we performed two experiments. The first measured how many more patterns we can detect than the previous approach, using the Java Compatibility Kit (JCK) API tests. The second measured the performance improvement over the previous approaches, using the IBM XML parser, SPECjvm98, and SPECjbb2000. In summary, relative to a baseline implementation using exact pattern matching, our algorithm converted 76% more loops in the JCK tests. On a z9, we also observed average performance improvements of 54% for the XML parser, 1.9% for SPECjvm98, and 4.4% for SPECjbb2000. Finally, the JIT compilation time increased by only 0.32% to 0.44%.
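
    The difference between exact pattern matching and an embedding-style match can be illustrated with a toy example. The sketch below is not the algorithm from the article: it is a brute-force check over small labeled graphs, and every name in it (`find_embedding`, the node labels, the example graphs) is a hypothetical chosen for illustration. With `allow_paths=False` each pattern edge must be a direct edge in the target, which models exact matching. With `allow_paths=True` a pattern edge may map to a path that passes only through unmapped target nodes, which models additional code inside the idiom. For brevity the sketch does not check that paths for different pattern edges are disjoint from each other.

```python
def reachable_avoiding(edges, src, dst, blocked):
    """Return True if dst can be reached from src along at least one edge,
    with every intermediate node lying outside `blocked`."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt == dst:
                return True
            if nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def find_embedding(p_labels, p_edges, t_labels, t_edges, allow_paths):
    """Brute-force search for a label-preserving, injective mapping of
    pattern nodes to target nodes. Returns the mapping or None.
    Simplification (assumption): paths are not checked for mutual disjointness."""
    p_nodes = list(p_labels)

    def edges_ok(mapping):
        images = set(mapping.values())
        for u, succs in p_edges.items():
            for v in succs:
                a, b = mapping[u], mapping[v]
                if allow_paths:
                    # A pattern edge may stretch over extra, unmapped nodes.
                    if not reachable_avoiding(t_edges, a, b, images):
                        return False
                elif b not in t_edges.get(a, ()):
                    # Exact matching: the edge must exist as-is.
                    return False
        return True

    def extend(i, mapping):
        if i == len(p_nodes):
            return dict(mapping) if edges_ok(mapping) else None
        p = p_nodes[i]
        for t, label in t_labels.items():
            if label == p_labels[p] and t not in mapping.values():
                mapping[p] = t
                found = extend(i + 1, mapping)
                if found is not None:
                    return found
                del mapping[p]
        return None

    return extend(0, {})


# Hypothetical idiom: the body of a delimiter-search loop as a chain of operations.
PATTERN_LABELS = {"p0": "load", "p1": "lookup", "p2": "branch", "p3": "inc"}
PATTERN_EDGES = {"p0": ["p1"], "p1": ["p2"], "p2": ["p3"]}

# Hypothetical target: the same chain with an extra store between branch and increment.
TARGET_LABELS = {"t0": "load", "t1": "lookup", "t2": "branch", "x": "store", "t3": "inc"}
TARGET_EDGES = {"t0": ["t1"], "t1": ["t2"], "t2": ["x"], "x": ["t3"]}

exact = find_embedding(PATTERN_LABELS, PATTERN_EDGES, TARGET_LABELS, TARGET_EDGES, False)
embedded = find_embedding(PATTERN_LABELS, PATTERN_EDGES, TARGET_LABELS, TARGET_EDGES, True)
```

    In this toy case `exact` is `None` because the extra store breaks the direct branch-to-increment edge, while `embedded` maps each pattern node to its counterpart and leaves the store node unmapped. A real compiler would additionally have to decide whether the unmapped code can be preserved or moved safely, which this sketch does not model.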

    Cited By

    • (2024) Latent Idiom Recognition for a Minimalist Functional Array Language Using Equality Saturation. 2024 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 270-282. DOI: 10.1109/CGO57630.2024.10444879. Online publication date: 2-Mar-2024
    • (2021) KernelFaRer. ACM Transactions on Architecture and Code Optimization 18:3, 1-22. DOI: 10.1145/3459010. Online publication date: 28-Jun-2021

      Published In

      ACM Transactions on Architecture and Code Optimization  Volume 10, Issue 3
      September 2013
      310 pages
      ISSN:1544-3566
      EISSN:1544-3973
      DOI:10.1145/2509420
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 September 2013
      Accepted: 01 March 2013
      Revised: 01 March 2013
      Received: 01 September 2012
      Published in TACO Volume 10, Issue 3

      Author Tags

      1. Idiom recognition
      2. JIT
      3. Java
      4. VMX
      5. hardware-assist instructions
      6. topological embedding

      Qualifiers

      • Research-article
      • Research
      • Refereed
