research-article

Loop-Level Parallelism in Numeric and Symbolic Programs

Published: 01 July 1993

Abstract

    A new technique for estimating and understanding the speed improvement that can result from executing a program on a parallel computer is described. The technique requires no additional programming and minimal effort by a program's author. The analysis begins by tracing a sequential program. A parallelism analyzer uses information from the trace to simulate parallel execution of the program. In addition to predicting parallel performance, the parallelism analyzer measures many aspects of a program's dynamic behavior. Measurements of six substantial programs are presented. These results indicate that the three symbolic programs differ substantially from the numeric programs and, as a consequence, cannot be automatically parallelized with the same compilation techniques.
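
The idea of simulating parallel execution from a sequential trace can be illustrated with a small sketch. This is not the paper's analyzer; it is a minimal illustration under stated assumptions: the trace record `Iteration`, its fields, and the function `estimate_speedup` are hypothetical names, the machine has unlimited processors, and only true (read-after-write) dependences between loop iterations delay execution.

```python
from dataclasses import dataclass


@dataclass
class Iteration:
    """One traced loop iteration (hypothetical trace record, not the paper's format)."""
    cost: int                       # time units the iteration took sequentially
    reads: frozenset = frozenset()  # locations read by the iteration
    writes: frozenset = frozenset() # locations written by the iteration


def estimate_speedup(trace):
    """Estimate loop-level speedup from a sequential trace.

    Assumption: unlimited processors; an iteration may start once every
    earlier iteration that last wrote a location it reads has finished.
    """
    last_write_done = {}  # location -> finish time of its most recent writer
    sequential_time = 0
    parallel_time = 0
    for it in trace:
        # Earliest start: wait for the producers of all values this iteration reads.
        start = max((last_write_done.get(loc, 0) for loc in it.reads), default=0)
        finish = start + it.cost
        for loc in it.writes:
            last_write_done[loc] = finish
        sequential_time += it.cost
        parallel_time = max(parallel_time, finish)
    return sequential_time / parallel_time if parallel_time else 1.0


# Iterations touching disjoint locations can all run at once.
independent = [
    Iteration(10, frozenset({("a", i)}), frozenset({("b", i)})) for i in range(4)
]
# Each iteration reads what the previous one wrote, so the loop stays serial.
chained = [Iteration(10, frozenset({i}), frozenset({i + 1})) for i in range(4)]

print(estimate_speedup(independent))  # 4.0
print(estimate_speedup(chained))      # 1.0
```

In this simplified model, a loop whose iterations form a dependence chain shows no speedup, while a loop with independent iterations shows speedup equal to its iteration count.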



      Published In

      IEEE Transactions on Parallel and Distributed Systems, Volume 4, Issue 7
      July 1993
      120 pages

      Publisher

      IEEE Press

      Author Tags

      1. numeric programs
      2. dynamic behavior
      3. loop level parallelism
      4. parallel computer
      5. parallel execution
      6. parallel programming
      7. parallelism analyzer
      8. parallel performance
      9. performance evaluation
      10. program compilers
      11. sequential program
      12. speed improvement
      13. symbolic programs
