Thread Mapping and Parallel Optimization for MIC Heterogeneous Parallel Systems

Ju, Tao; Zhu, Zhengdong; Wang, Yinfeng; Li, Liang; Dong, Xiaoshe

doi:10.1007/978-3-319-11194-0_23

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8631)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Traditional multithreaded programming models provide no dedicated thread mapping method for Many Integrated Core (MIC) heterogeneous systems, and an unsuitable thread mapping leaves the computing power of the MIC coprocessor underused. To exploit the coprocessor's computing potential more fully, this paper discusses effective multithread mapping strategies by comparing the computing performance of various mapping methods and analyzing the differences between them. To further exploit the computing power of MIC heterogeneous systems, it also explores specific program porting and performance optimization strategies, using a k-means application as the case study. Experimental results show that the proposed mapping and parallel optimization strategies are effective and can guide programmers in porting and optimizing applications for MIC heterogeneous parallel systems.
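
The abstract does not name the mapping methods that were compared. As a rough illustration of what a thread mapping decides, the sketch below models three placement policies offered by the Intel OpenMP runtime on MIC coprocessors (compact, scatter, and balanced). It is a simplified model under stated assumptions, not the authors' method: the function names, the core count, and the number of hardware contexts per core are all hypothetical, and the code assumes the thread count does not exceed the number of available hardware contexts.

```python
# Simplified model of three thread-to-core placement policies.
# All names and parameters are illustrative assumptions, not taken from the paper.

def compact(n_threads, n_cores, ctx_per_core):
    """Fill every hardware context of one core before moving to the next core."""
    return [(t // ctx_per_core) % n_cores for t in range(n_threads)]


def scatter(n_threads, n_cores):
    """Deal threads round-robin across cores, so neighbouring IDs land far apart."""
    return [t % n_cores for t in range(n_threads)]


def balanced(n_threads, n_cores):
    """Spread threads evenly across cores while keeping neighbouring IDs together."""
    base, extra = divmod(n_threads, n_cores)
    placement = []
    for core in range(n_cores):
        # The first `extra` cores take one additional thread each.
        count = base + (1 if core < extra else 0)
        placement.extend([core] * count)
    return placement


if __name__ == "__main__":
    # Hypothetical tiny machine: 4 cores with 4 hardware contexts each, 8 threads.
    print("compact :", compact(8, 4, 4))
    print("scatter :", scatter(8, 4))
    print("balanced:", balanced(8, 4))
```

In this model, compact leaves half of the cores idle when only 8 threads run, scatter uses every core but separates neighbouring thread IDs, and balanced uses every core while keeping neighbouring IDs on the same core, which can matter when adjacent threads share data in a core's cache.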

Author information

Authors and Affiliations

School of Electronics and Information Engineering, Xi’an Jiaotong University, 710049, Xi’an, China
Tao Ju, Zhengdong Zhu, Liang Li & Xiaoshe Dong
Shenzhen Institute of Information Technology, 518172, Shenzhen, China
Yinfeng Wang

Editor information

Editors and Affiliations

Department of Computer Science, Illinois Institute of Technology, 60616-3793, Chicago, IL, USA
Xian-he Sun
School of Computer Science and Technology, Dalian Maritime University, 1 Linghai Road, 116026, Dalian, China
Wenyu Qu
SEECS, University of Ottawa, 8, King Edward Ave, K1N 6N5, Ottawa, ON, Canada
Ivan Stojmenovic
Deakin University, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Wanlei Zhou
Dalian Maritime University, NO.1 Linhai Road Dalian, 116026, China
Zhiyang Li
BeiHang University, XueYuan Road No.37, HaiDian District, Beijing, China
Hua Guo
University of Bradford, BD7 1DP, Bradford, West Yorkshire, United Kingdom
Geyong Min
Dalian Maritime University, NO.1 Linhai Road Dalian, China, 116026
Tingting Yang
Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Yulei Wu
Shandong University, 27 Shanda Nanlu, 250100, Jinan City, Shandong Province, China
Lei Liu

Cite this paper

Ju, T., Zhu, Z., Wang, Y., Li, L., Dong, X. (2014). Thread Mapping and Parallel Optimization for MIC Heterogeneous Parallel Systems. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_23

DOI: https://doi.org/10.1007/978-3-319-11194-0_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11193-3
Online ISBN: 978-3-319-11194-0
eBook Packages: Computer Science, Computer Science (R0)
