Abstract
This work investigates how to map loops efficiently onto coarse-grained reconfigurable architectures (CGRAs). The paper examines the properties of CGRAs and builds MapReduce-inspired models for the loop parallelization problem. The proposed model has a more detailed performance metric and a more flexible unrolling scheme that can unroll different loop levels with different factors. A geometric programming based approach is proposed to solve the resulting optimization problem. This approach finds the optimal unrolling factor for each loop level, resulting in better parallelization of loops. Experimental results show that the proposed approach achieves up to a 44% performance gain over the state-of-the-art loop mapping scheme.
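To make the idea of per-level unrolling factors concrete, the following is a minimal sketch and not the paper's model or solver. It assumes a hypothetical toy cost model (the remaining iteration count times a fixed body latency) and a hypothetical resource constraint (each unrolled copy of the loop body occupies a fixed number of processing elements), and it uses exhaustive search in place of geometric programming. All function and parameter names are illustrative.

```python
from itertools import product
from math import ceil


def estimate_cycles(trip_counts, factors, body_latency):
    """Toy estimate (assumed): iterations left after unrolling, times body latency."""
    iterations = 1
    for n, u in zip(trip_counts, factors):
        iterations *= ceil(n / u)
    return iterations * body_latency


def best_unroll_factors(trip_counts, pes_per_body, total_pes, body_latency=1):
    """Search every per-level unroll factor combination that fits the PE budget.

    trip_counts  : trip count of each loop level, outermost first
    pes_per_body : PEs used by one copy of the loop body (assumed constant)
    total_pes    : PEs available on the array
    Returns (estimated_cycles, factors), or None if nothing fits.
    """
    best = None
    ranges = [range(1, n + 1) for n in trip_counts]
    for factors in product(*ranges):
        copies = 1
        for u in factors:
            copies *= u
        if copies * pes_per_body > total_pes:
            continue  # this combination does not fit on the array
        cost = estimate_cycles(trip_counts, factors, body_latency)
        if best is None or cost < best[0]:
            best = (cost, factors)
    return best


if __name__ == "__main__":
    # Hypothetical two-level nest (8 x 6 iterations) on a 64-PE array.
    print(best_unroll_factors((8, 6), pes_per_body=4, total_pes=64))
```

Exhaustive search is only practical for small loop nests. The appeal of a geometric programming formulation, as the abstract describes, is that the unrolling factors for all loop levels can be optimized jointly without enumerating every combination.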
Cite this article
Yin, S., Shao, S., Liu, L. et al. MapReduce inspired loop mapping for coarse-grained reconfigurable architecture. Sci. China Inf. Sci. 57, 1–14 (2014). https://doi.org/10.1007/s11432-014-5198-1
DOI: https://doi.org/10.1007/s11432-014-5198-1