DOI: 10.1145/1094811.1094852

X10: an object-oriented approach to non-uniform cluster computing

Published: 12 October 2005

    Abstract

    It is now well established that the device scaling predicted by Moore's Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the rate that had been sustained during the last two decades. As a result, future systems are rapidly moving from uniprocessor to multiprocessor configurations, so as to use parallelism instead of frequency scaling as the foundation for increased compute capacity. The dominant emerging multiprocessor structure for the future is a Non-Uniform Cluster Computing (NUCC) system with nodes that are built out of multi-core SMP chips with non-uniform memory hierarchies, and interconnected in horizontally scalable cluster configurations such as blade servers. Unlike previous generations of hardware evolution, this shift will have a major impact on existing software. Current OO language facilities for concurrent and distributed programming are inadequate for addressing the needs of NUCC systems because they do not support the notions of non-uniform data access within a node, or of tight coupling of distributed nodes.

    We have designed a modern object-oriented programming language, X10, for high performance, high productivity programming of NUCC systems. A member of the partitioned global address space family of languages, X10 highlights the explicit reification of locality in the form of places; lightweight activities embodied in async, future, foreach, and ateach constructs; a construct for termination detection (finish); the use of lock-free synchronization (atomic blocks); and the manipulation of cluster-wide global data structures. We present an overview of the X10 programming model and language, experience with our reference implementation, and results from some initial productivity comparisons between the X10 and Java™ languages.
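
    To make the constructs named in the abstract more concrete, the following is a minimal sketch in plain Java, not X10 code and not the authors' implementation. It imitates the shape of finish, async, and atomic on a single JVM: each task plays the role of an activity working on the partition of one "place", waiting on all tasks plays the role of finish, and a synchronized block stands in for an atomic block. Note the substitution: the abstract describes atomic blocks as lock-free, whereas this sketch uses an ordinary Java lock. The class name FinishAsyncSketch and the method sumByPlace are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: imitates the shape of X10's finish/async/atomic in plain Java.
// None of these names come from X10 or its runtime.
public class FinishAsyncSketch {

    // Shared state; the lock below stands in for an X10 atomic block (assumption).
    private long total = 0;
    private final Object lock = new Object();

    // Sums data[] by giving each of `places` tasks its own contiguous partition.
    // Requires places >= 1.
    public long sumByPlace(int[] data, int places) {
        ExecutorService pool = Executors.newFixedThreadPool(places);
        List<Future<?>> activities = new ArrayList<>();
        int chunk = (data.length + places - 1) / places; // ceiling division
        try {
            for (int p = 0; p < places; p++) {
                final int lo = Math.min(p * chunk, data.length);
                final int hi = Math.min(lo + chunk, data.length);
                // Analogue of "async": spawn an activity that reads only its own partition.
                activities.add(pool.submit(() -> {
                    long local = 0;
                    for (int i = lo; i < hi; i++) {
                        local += data[i];
                    }
                    // Analogue of "atomic": update shared state as one indivisible step.
                    synchronized (lock) {
                        total += local;
                    }
                }));
            }
            // Analogue of "finish": do not proceed until every spawned activity has terminated.
            for (Future<?> activity : activities) {
                try {
                    activity.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("activity failed", e);
                }
            }
        } finally {
            pool.shutdown();
        }
        synchronized (lock) {
            return total;
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1; // values 1..1000
        }
        System.out.println(new FinishAsyncSketch().sumByPlace(data, 4)); // prints 500500
    }
}
```

    The sketch runs on one shared heap, so it does not capture what distinguishes a partitioned global address space: in X10 a place owns its data, and locality is explicit in the program rather than a convention followed by the tasks.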

    Reviews

    Henk Sips

    The authors present X10, an object-oriented language designed and implemented to address the requirements of non-uniform cluster computing platforms (NUCC): tile-based architectures having multicore symmetric multiprocessing (SMP) tiles (nodes) with non-uniform memory hierarchies. X10 increases NUCC programming productivity by facilitating the design, implementation, and deployment of safe, analyzable, scalable, and flexible parallel high-performance computing (HPC) applications. X10 is based on a novel combination of design decisions: a new programming model, reflected in a Java-based programming language, using a partitioned global address space (PGAS) model with explicit locality; specific concurrency constructs for task parallelism; and dedicated array constructs for data parallelism.

    The prototype implementation is entirely based on Java: X10 is translated to Java code and executed on a single unmodified Java Virtual Machine (JVM) (current version) or on a multi-JVM environment (under development). The code productivity analysis is done by comparing the parallelizations of eight benchmarks from the Java Grande Benchmark Suite in both X10 and Java. For the code size metrics considered, X10 gives better results. The benchmark performance is not analyzed, probably because the prototype X10 implementation (the full chain is due in 2010) is now mainly targeted to language design evaluation and not to performance.

    The main contribution of the paper is the presentation of X10 as a distinctive option among the HPC languages, due to its attempts at a coherent design (not building on legacy code); object-oriented approach; and novel constructs for locality, synchronization, data and task parallelism. Despite the numerous technical details, which may hinder readability, the authors have succeeded in proving X10 as a productive solution for programming NUCC platforms. Of course, they still have to prove that productivity and performance will go together, because that will determine in the end whether new paradigms and languages will be accepted by the HPC community.

    Online Computing Reviews Service

    Published In

    OOPSLA '05: Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
    October 2005
    562 pages
    ISBN:1595930310
    DOI:10.1145/1094811
    • ACM SIGPLAN Notices, Volume 40, Issue 10
      Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming systems languages and applications
      October 2005
      531 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1103845
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Java
    2. X10
    3. atomic blocks
    4. clocks
    5. data distribution
    6. multithreading
    7. non-uniform cluster computing (NUCC)
    8. partitioned global address space (PGAS)
    9. places
    10. productivity
    11. scalability

    Qualifiers

    • Article

    Conference

    OOPSLA05

    Acceptance Rates

    Overall Acceptance Rate 268 of 1,244 submissions, 22%

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 53
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 11 Aug 2024

    Cited By

    • (2024) Performance of Text-Independent Automatic Speaker Recognition on a Multicore System. Tsinghua Science and Technology 29:2 (447-456). DOI: 10.26599/TST.2023.9010018. Online publication date: Apr-2024
    • (2024) When Is Parallelism Fearless and Zero-Cost with Rust? Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures (27-40). DOI: 10.1145/3626183.3659966. Online publication date: 17-Jun-2024
    • (2024) Teaching Parallel Algorithms Using the Binary-Forking Model. 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (346-351). DOI: 10.1109/IPDPSW63119.2024.00080. Online publication date: 27-May-2024
    • (2024) Read/write fence-free work-stealing with multiplicity. Journal of Parallel and Distributed Computing 186:C. DOI: 10.1016/j.jpdc.2023.104816. Online publication date: 1-Apr-2024
    • (2024) Bridging Between Active Objects: Multitier Programming for Distributed, Concurrent Systems. Active Object Languages: Current Research Trends (92-122). DOI: 10.1007/978-3-031-51060-1_4. Online publication date: 29-Jan-2024
    • (2024) M-DFCPP: A runtime library for multi-machine dataflow computing. Concurrency and Computation: Practice and Experience. DOI: 10.1002/cpe.8248. Online publication date: 7-Aug-2024
    • (2023) Extensible Metatheory Mechanization via Family Polymorphism. Proceedings of the ACM on Programming Languages 7:PLDI (1608-1632). DOI: 10.1145/3591286. Online publication date: 6-Jun-2023
    • (2023) Efficient Parallel Functional Programming with Effects. Proceedings of the ACM on Programming Languages 7:PLDI (1558-1583). DOI: 10.1145/3591284. Online publication date: 6-Jun-2023
    • (2023) Flux: Liquid Types for Rust. Proceedings of the ACM on Programming Languages 7:PLDI (1533-1557). DOI: 10.1145/3591283. Online publication date: 6-Jun-2023
    • (2023) Responsive Parallelism with Synchronization. Proceedings of the ACM on Programming Languages 7:PLDI (712-735). DOI: 10.1145/3591249. Online publication date: 6-Jun-2023
