research-article

Early experiences with the Intel Many Integrated Cores accelerated computing technology

Published: 18 July 2011

Abstract

We report on early programming experiences with the Intel® Many Integrated Core (Intel® MIC) coprocessor. This new x86-based technology is Intel's answer to GPU-based accelerators from NVIDIA, AMD, and others. Accelerators have sparked interest in the HPC community because they have the potential to significantly increase the compute power of the next generation of supercomputers. The merits of accelerators for general HPC purposes are still very much under debate. Accelerators undoubtedly add complexity to an already very complex cluster, and their programmability will be the key to attracting the diverse HPC user community to this new technology, even if the promised performance gains are large.
The study presented here is part of a much broader activity at the Texas Advanced Computing Center (TACC) that focuses on a wide range of accelerators (GPUs, FPGAs, the Intel MIC coprocessor, etc.). The Intel MIC architecture is x86-based and supports the languages and parallel programming paradigms commonly found on x86 CPUs, including OpenMP, which has been widely accepted in the HPC community for thread-parallel programming. The scope of this initial study is limited to investigating the Intel MIC programming environment, particularly the offload-OpenMP model.
Our initial experience with the Intel MIC platform has been very positive. The code modifications required to handle data transfer and to offload parallel sections onto the Intel MIC coprocessor are small and are conveniently implemented as directives/pragmas added to OpenMP constructs. (We use "accelerators" as a generic reference to Intel MIC coprocessors, GPUs, FPGAs, etc.)
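The abstract does not show any code, so the following is only a minimal sketch of what an offloaded OpenMP loop can look like. It assumes the offload pragma syntax documented for later Intel compilers (`#pragma offload target(mic)` with `in`/`inout` data-transfer clauses), which may differ from the pre-release environment used in the study. The function name `saxpy_offload` is hypothetical. The preprocessor guards are an addition so that the same source also builds, and runs on the host, with compilers that have no offload or OpenMP support.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * The loop is an ordinary OpenMP parallel loop; the offload directive
 * in front of it is the only change needed to move it to the coprocessor. */
void saxpy_offload(int n, float a, float *x, float *y)
{
    int i;

    assert(x != NULL && y != NULL);

#ifdef __INTEL_OFFLOAD
    /* Assumed Intel offload syntax: copy x to the coprocessor,
     * copy y there and back again after the loop finishes. */
    #pragma offload target(mic) in(x : length(n)) inout(y : length(n))
#endif
#ifdef _OPENMP
    /* Standard OpenMP work sharing across the available threads. */
    #pragma omp parallel for
#endif
    for (i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `saxpy_offload(n, a, x, y)` from host code needs no other changes. When neither guard is active the function simply runs the loop serially on the host, which is a convenient way to check results before moving to the coprocessor.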


    Published In

    TG '11: Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery
    July 2011
    256 pages
    ISBN:9781450308885
    DOI:10.1145/2016741
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    • University of Illinois

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. accelerated computing
    2. programming models

    Qualifiers

    • Research-article

    Conference

    TG'11: TeraGrid 2011
    Sponsor: University of Illinois
    July 18 - 21, 2011
    Salt Lake City, Utah

    Cited By

    • (2015) High-Performance and Scalable Design of MPI-3 RMA on Xeon Phi Clusters. Euro-Par 2015: Parallel Processing, pp. 625-637. DOI: 10.1007/978-3-662-48096-0_48. Online publication date: 25-Jul-2015
    • (2015) Performance Characterization and Optimization for Intel Xeon Phi Coprocessor. Algorithms and Architectures for Parallel Processing, pp. 16-33. DOI: 10.1007/978-3-319-27119-4_2. Online publication date: 16-Dec-2015
    • (2014) Leveraging OmpSs to Exploit Hardware Accelerators. Proceedings of the 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing, pp. 112-119. DOI: 10.1109/SBAC-PAD.2014.26. Online publication date: 22-Oct-2014
    • (2014) A Coprocessor Sharing-Aware Scheduler for Xeon Phi-Based Compute Clusters. Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp. 337-346. DOI: 10.1109/IPDPS.2014.44. Online publication date: 19-May-2014
    • (2014) Utilizing Multiple Xeon Phi Coprocessors on One Compute Node. Algorithms and Architectures for Parallel Processing, pp. 68-81. DOI: 10.1007/978-3-319-11194-0_6. Online publication date: 2014
    • (2014) Hybrid Programming Using OpenSHMEM and OpenACC. Proceedings of the First Workshop on OpenSHMEM and Related Technologies. Experiences, Implementations, and Tools - Volume 8356, pp. 74-89. DOI: 10.1007/978-3-319-05215-1_6. Online publication date: 4-Mar-2014
    • (2013) Accelerating IDCT Algorithm on Xeon Phi Coprocessor. Advanced Materials Research, vol. 756-759, pp. 3114-3120. DOI: 10.4028/www.scientific.net/AMR.756-759.3114. Online publication date: Sep-2013
    • (2013) MVAPICH-PRISM. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pp. 1-11. DOI: 10.1145/2503210.2503288. Online publication date: 17-Nov-2013
    • (2013) COSMIC. Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, pp. 215-226. DOI: 10.1145/2493123.2462921. Online publication date: 17-Jun-2013
    • (2013) COSMIC. Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, pp. 215-226. DOI: 10.1145/2462902.2462921. Online publication date: 17-Jun-2013
