DOI: 10.1145/62115.62134
Characterizing the synchronization behavior of parallel programs

Published: 01 January 1988
    Abstract

    Contention for synchronization locks and delays waiting for synchronization events can substantially increase the running time of a parallel program. This makes it important to characterize the synchronization behavior of programs and to provide analysis tools that help both hardware and software designers evaluate design alternatives. This paper describes a tracing facility that is incorporated into a synchronization package. The facility provides a portable means to characterize parallel programs accurately and efficiently. The behavior of several applications has been monitored, uncovering program characteristics that make it difficult to achieve linear speedup. Our monitoring facility allows a programmer to determine the performance implications of the synchronization structure used in a program, and it allows the architect to evaluate various hardware support mechanisms.
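
The idea of building tracing into the synchronization primitives themselves can be illustrated with a minimal sketch. This is not the paper's facility: the `TracedLock` class, the `summarize` helper, and the `run_demo` driver are hypothetical names chosen here, and Python's `threading.Lock` stands in for whatever lock the original synchronization package used. The sketch only shows how a lock wrapper might record, per acquisition, how long a thread waited for the lock and how long it then held it.

```python
import threading
import time


class TracedLock:
    """Hypothetical lock wrapper that logs wait and hold times per acquisition."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self.trace = []          # records: (thread name, wait seconds, hold seconds)
        self._waited = 0.0       # written only by the current lock holder
        self._acquired_at = 0.0  # written only by the current lock holder

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()     # time spent blocked here is contention delay
        now = time.perf_counter()
        self._waited = now - start
        self._acquired_at = now
        return self

    def __exit__(self, exc_type, exc, tb):
        held = time.perf_counter() - self._acquired_at
        # Append before releasing, so trace updates are serialized by the lock.
        self.trace.append((threading.current_thread().name, self._waited, held))
        self._lock.release()
        return False


def summarize(lock):
    """Aggregate the trace into totals a programmer could inspect."""
    return {
        "acquires": len(lock.trace),
        "total_wait": sum(wait for _, wait, _ in lock.trace),
        "total_hold": sum(hold for _, _, hold in lock.trace),
    }


def run_demo(workers=4, iterations=50):
    """Have several threads increment a shared counter under one traced lock."""
    lock = TracedLock("counter")
    counter = [0]

    def work():
        for _ in range(iterations):
            with lock:
                counter[0] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0], summarize(lock)


if __name__ == "__main__":
    count, stats = run_demo()
    print(count, stats)
```

A large `total_wait` relative to `total_hold` would suggest that threads spend more time contending for the lock than doing work inside it, which is the kind of characteristic the abstract says can limit speedup.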


    Published In

    PPEALS '88: Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
    January 1988
    246 pages
    ISBN:0897912764
    DOI:10.1145/62115
    • ACM SIGPLAN Notices, Volume 23, Issue 9
      Proceedings of the ACM/SIGPLAN PPEALS 1988
      Sept. 1988
      246 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/62116

    Publisher

    Association for Computing Machinery

    New York, NY, United States
