research-article

Kokkos Array performance-portable manycore programming model

Published: 26 February 2012

    Abstract

    Large, complex scientific and engineering application codes have a significant investment in computational kernels that implement their mathematical models. Porting these computational kernels to multicore-CPU and manycore-accelerator (e.g., NVIDIA® GPU) devices is a major challenge given the diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach for implementing computational kernels that are performance-portable to multicore-CPU and manycore-accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices, each with its own memory space, (2) data-parallel computational kernels, and (3) multidimensional arrays. Performance portability is achieved by decoupling computational kernels from device-specific data access performance requirements (e.g., NVIDIA coalesced memory access) through an intuitive multidimensional array API. The Kokkos Array API uses C++ template meta-programming to transparently insert device-optimal data access maps into computational kernels at compile time. With this programming model, computational kernels can be written once and, without modification, performance-portably compiled to multicore-CPU and manycore-accelerator devices.
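
    The idea of a compile-time data access map can be illustrated with a minimal C++ sketch. This is not the Kokkos Array API: the names RowMajor, ColumnMajor, Array2D, make_iota, and scale_rows are hypothetical and chosen only for illustration, and the kernel loop runs serially where a real device would distribute the outer index across threads. It assumes that a row-major map suits a CPU thread walking one row and that a column-major map suits GPU threads indexed by the first dimension, which is the usual motivation for coalesced access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout policies (illustrative names, not the Kokkos Array API).
// Each one maps a logical index (i, j) to an offset in flat storage.
struct RowMajor {  // consecutive j are adjacent in memory
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t /*n0*/, std::size_t n1) {
        return i * n1 + j;
    }
};

struct ColumnMajor {  // consecutive i are adjacent in memory
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t n0, std::size_t /*n1*/) {
        return i + j * n0;
    }
};

// A 2-D array whose index map is a template parameter, so the choice of
// layout is resolved at compile time and costs nothing at run time.
template <typename T, typename Layout>
class Array2D {
public:
    Array2D(std::size_t n0, std::size_t n1)
        : n0_(n0), n1_(n1), data_(n0 * n1) {}

    T& operator()(std::size_t i, std::size_t j) {
        assert(i < n0_ && j < n1_);  // bounds check in debug builds
        return data_[Layout::offset(i, j, n0_, n1_)];
    }

    std::size_t extent0() const { return n0_; }
    std::size_t extent1() const { return n1_; }
    const T* data() const { return data_.data(); }  // raw storage, for inspection

private:
    std::size_t n0_, n1_;
    std::vector<T> data_;
};

// Helper: fill a(i, j) with i * n1 + j, independent of the layout.
template <typename Layout>
Array2D<double, Layout> make_iota(std::size_t n0, std::size_t n1) {
    Array2D<double, Layout> a(n0, n1);
    for (std::size_t i = 0; i < n0; ++i)
        for (std::size_t j = 0; j < n1; ++j)
            a(i, j) = static_cast<double>(i * n1 + j);
    return a;
}

// Kernel written once against logical (i, j) indexing. The outer loop over i
// stands in for the data-parallel work items; it is serial in this sketch.
template <typename Layout>
void scale_rows(Array2D<double, Layout>& a, const std::vector<double>& factor) {
    assert(factor.size() == a.extent0());
    for (std::size_t i = 0; i < a.extent0(); ++i)
        for (std::size_t j = 0; j < a.extent1(); ++j)
            a(i, j) *= factor[i];
}
```

    Instantiating scale_rows with Array2D<double, RowMajor> or Array2D<double, ColumnMajor> yields the same logical values at every (i, j); only the order of elements in the underlying storage differs. The kernel source is unchanged in both cases, which is the decoupling the abstract describes.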



    Published In

    PMAM '12: Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
    February 2012
    180 pages
    ISBN:9781450312110
    DOI:10.1145/2141702
    Conference Chairs: Minyi Guo, Zhiyi Huang
    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. GPU
    2. manycore
    3. mini-application
    4. multicore
    5. multidimensional array

    Qualifiers

    • Research-article

    Conference

    PPoPP '12
    Acceptance Rates

    Overall Acceptance Rate 53 of 97 submissions, 55%

    Article Metrics

    • Downloads (Last 12 months): 18
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 28 Jul 2024

    Cited By

    • (2024) IRIS Reimagined: Advancements in Intelligent Runtime System for Task-Based Programming. Asynchronous Many-Task Systems and Applications. 10.1007/978-3-031-61763-8_5 (46-58). Online publication date: 14-Feb-2024
    • (2023) A MultiGPU Performance-Portable Solution for Array Programming Based on Kokkos. Proceedings of the 9th ACM SIGPLAN International Workshop on Libraries, Languages and Compilers for Array Programming. 10.1145/3589246.3595369 (1-12). Online publication date: 6-Jun-2023
    • (2023) Systematically Exploring High-Performance Representations of Vector Fields Through Compile-Time Composition. Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering. 10.1145/3578244.3583723 (55-66). Online publication date: 15-Apr-2023
    • (2023) Language Agnostic Approach for Unification of Implementation Variants for Different Computing Devices. Parallel Processing and Applied Mathematics. 10.1007/978-3-031-30442-2_21 (279-290). Online publication date: 28-Apr-2023
    • (2022) EXA2PRO: A Framework for High Development Productivity on Heterogeneous Computing Systems. IEEE Transactions on Parallel and Distributed Systems. 10.1109/TPDS.2021.3104257 33:4 (792-804). Online publication date: 1-Apr-2022
    • (2022) A Block-Based Triangle Counting Algorithm on Heterogeneous Environments. IEEE Transactions on Parallel and Distributed Systems. 10.1109/TPDS.2021.3093240 33:2 (444-458). Online publication date: 1-Feb-2022
    • (2021) Exascale models of stellar explosions: Quintessential multi-physics simulation. The International Journal of High Performance Computing Applications. 10.1177/10943420211027937 (109434202110279). Online publication date: 20-Jul-2021
    • (2021) The digital revolution of Earth-system science. Nature Computational Science. 10.1038/s43588-021-00023-0 1:2 (104-113). Online publication date: 22-Feb-2021
    • (2021) Towards performance portability in the Spark astrophysical magnetohydrodynamics solver in the Flash-X simulation framework. Parallel Computing. 10.1016/j.parco.2021.102830 108 (102830). Online publication date: Dec-2021
    • (2020) Preliminary Experience with OpenMP Memory Management Implementation. OpenMP: Portable Multi-Level Parallelism on Modern Systems. 10.1007/978-3-030-58144-2_20 (313-327). Online publication date: 1-Sep-2020
