On the combination of hardware and software concurrency extraction methods

Published: 01 June 1988

Abstract

It has been shown that parallelism is a very promising alternative for enhancing computer performance. Parallelism, however, introduces much complexity to the programming effort. This has led to the development of automatic concurrency extraction techniques. Prior work has demonstrated that static program restructuring via compiler-based techniques provides a large degree of parallelism to the target machine. Purely hardware-based extraction techniques (without software preprocessing) have also demonstrated significant (but lesser) degrees of parallelism. This paper considers the performance effects of combining hardware and software techniques. The concurrency extracted from a given set of benchmarks by each technique separately, and together, is determined via simulations and/or analysis. The "common parallelism" extracted by the two methods is thus also considered, using new metrics. The analytic techniques for predicting the performance of specific programs are also described.

Published In

ACM SIGMICRO Newsletter, Volume 19, Issue 1-2
June 1988
66 pages
ISSN:1050-916X
DOI:10.1145/62197

Publisher

Association for Computing Machinery

New York, NY, United States
