DOI: 10.1145/143369.143398

Automatic software cache coherence through vectorization

Published: 01 August 1992

    Abstract

    Access latency in large-scale shared-memory multiprocessors is a concern because most (if not all) memory is one or more hops away through an interconnection network. Providing processors with one or more levels of cache is an accepted way to reduce the average access latency; however, in a multiprocessor, cached values must be kept coherent for the machine to support the abstraction of a shared global memory. There is no generally accepted hardware solution that provides cache coherence for large-scale shared-memory multiprocessors. Software coherence strategies offer scalability with current hardware. In this paper we examine a compiler-based software strategy for maintaining cache coherence that relies on dependence analysis and a vectorization algorithm to insert cache control directives. Experiments on the BBN TC2000 for a pair of numerical problems show that the run-time cost of coherence using our strategy is less than that of previously proposed compiler-based software methods, and suggest that it should compare favorably with proposed hardware schemes.
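
    The idea of software-inserted cache control directives can be illustrated with a toy model. The sketch below is not the paper's algorithm: the `Memory` and `Cache` classes, the `flush` and `invalidate` directive names, and the `run` driver are all hypothetical, and applying each directive once to a whole range of addresses is an assumption about what a section-level directive might look like. It only shows why a processor with a private, non-coherent cache reads stale values unless such directives are placed between the write phase and the read phase.

```python
class Memory:
    """Shared global memory (toy model)."""
    def __init__(self, size):
        self.cells = [0] * size


class Cache:
    """Private write-back cache with no hardware coherence (hypothetical)."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}     # address -> locally cached value
        self.dirty = set()  # addresses written locally but not yet written back

    def read(self, addr):
        # On a miss, fetch from shared memory; on a hit, trust the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory.cells[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def flush(self, addrs):
        """Assumed directive: write dirty copies in addrs back to memory."""
        for a in addrs:
            if a in self.dirty:
                self.memory.cells[a] = self.lines[a]
                self.dirty.discard(a)

    def invalidate(self, addrs):
        """Assumed directive: drop local copies in addrs (flush first if dirty)."""
        for a in addrs:
            self.lines.pop(a, None)
            self.dirty.discard(a)


def run(with_directives):
    mem = Memory(4)
    p0, p1 = Cache(mem), Cache(mem)
    section = range(4)
    # Phase 0: P1 reads the array, so it now holds cached copies.
    for a in section:
        p1.read(a)
    # Phase 1: P0 overwrites the whole array section in its own cache.
    for a in section:
        p0.write(a, a + 10)
    if with_directives:
        p0.flush(section)       # producer writes the section back once
        p1.invalidate(section)  # consumer drops its stale copies once
    # Phase 2: P1 reads the array again.
    return [p1.read(a) for a in section]


print(run(False))  # stale values from P1's cache
print(run(True))   # fresh values fetched from shared memory
```

    In a compiler-based scheme, deciding where such directives are needed is the job of the analysis the abstract mentions; here they are simply placed by hand between the two phases.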

    Published In

    ICS '92: Proceedings of the 6th international conference on Supercomputing
    August 1992
    495 pages
    ISBN:0897914856
    DOI:10.1145/143369
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Conference

    ICS92
    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 31
    • Downloads (Last 6 weeks): 7
    Reflects downloads up to 12 Aug 2024

    Citations

    Cited By

    • (2023) WARDen: Specializing Cache Coherence for High-Level Parallel Languages. Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization, pages 122-135. DOI: 10.1145/3579990.3580013. Online publication date: 17-Feb-2023
    • (2016) Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 555-565. DOI: 10.1109/IPDPS.2016.76. Online publication date: May-2016
    • (2016) Compiler Support for Software Cache Coherence. 2016 IEEE 23rd International Conference on High Performance Computing (HiPC), pages 341-350. DOI: 10.1109/HiPC.2016.047. Online publication date: Dec-2016
    • (2003) Towards general and exact distributed invalidation. Journal of Parallel and Distributed Computing, 63:11, pages 1123-1137. DOI: 10.1016/j.jpdc.2003.07.007. Online publication date: 1-Nov-2003
    • (2000) Exact Distributed Invalidation. Euro-Par 2000 Parallel Processing, pages 395-404. DOI: 10.1007/3-540-44520-X_51. Online publication date: 18-Aug-2000
    • (1996) Eliminating Stale Data References through Array Data-Flow Analysis. Proceedings of the 10th International Parallel Processing Symposium, pages 4-13. DOI: 10.5555/645606.661163. Online publication date: 15-Apr-1996
    • (1996) Eliminating stale data references through array data-flow analysis. Proceedings of International Conference on Parallel Processing, pages 4-13. DOI: 10.1109/IPPS.1996.508032. Online publication date: 1996
    • (1995) Solutions and debugging for data consistency in multiprocessors with noncoherent caches. International Journal of Parallel Programming, 23:1, pages 83-103. DOI: 10.1007/BF02577785. Online publication date: 1-Feb-1995
    • (1995) Combining flow and dependence analyses to expose redundant array accesses. International Journal of Parallel Programming, 23:5, pages 423-470. DOI: 10.1007/BF02577773. Online publication date: 1-Oct-1995
    • (1994) Exploiting cache affinity in software cache coherence. Proceedings of the 8th international conference on Supercomputing, pages 264-273. DOI: 10.1145/181181.181542. Online publication date: 16-Jul-1994
