DOI: 10.1145/2380445.2380523
research-article

An exploration methodology for a customizable OpenCL stereo-matching application targeted to an industrial multi-cluster architecture

Published: 07 October 2012

Abstract

Open Computing Language (OpenCL) is emerging as a standard for parallel programming of heterogeneous hardware accelerators. Compared with device-specific languages, OpenCL enables application portability but does not guarantee performance portability, so an implementation may eventually need additional tuning for a specific platform or for unpredictable dynamic workloads. In this paper, we present a methodology to analyze the customization space of an OpenCL application in order to improve performance portability and to support dynamic adaptation. Our case study is an OpenCL image stereo-matching application (which computes the relative depth of objects from a pair of stereo images) customized to the STMicroelectronics Platform 2012 many-core computing fabric. In particular, we use design space exploration techniques to generate a set of operating points, each representing a specific parameter configuration that offers a different trade-off between the performance and the accuracy of the algorithm itself. These points give detailed knowledge about the interaction between the application parameters, the underlying architecture and the performance of the system; they could also be used by a run-time manager software layer to meet dynamic Quality-of-Service (QoS) constraints.
To analyze the customization space, we use cycle-accurate simulations of the target architecture. Since profiling each configuration requires a long simulation, we designed our methodology to reduce the overall number of simulations by exploiting some important features of the application parameters; our analysis also identifies the parameters that could be explored on a high-level simulation model to reduce the simulation time. The resulting methodology is one order of magnitude more efficient than an exhaustive exploration and, given its randomized nature, it increases the probability of avoiding sub-optimal trade-offs.
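The notion of operating points can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter names (`window`, `max_disparity`), the metric values, and the helper functions are hypothetical and chosen only for illustration. The sketch keeps the configurations that are not dominated in both execution time and error (a Pareto filter), then shows how a run-time manager might pick the most accurate point that still meets a deadline.

```python
from typing import List, NamedTuple, Optional


class OperatingPoint(NamedTuple):
    config: dict          # hypothetical application parameters
    exec_time_ms: float   # performance metric, lower is better
    error_pct: float      # accuracy metric (error), lower is better


def dominates(a: OperatingPoint, b: OperatingPoint) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a.exec_time_ms <= b.exec_time_ms and a.error_pct <= b.error_pct
    better = a.exec_time_ms < b.exec_time_ms or a.error_pct < b.error_pct
    return no_worse and better


def pareto_front(points: List[OperatingPoint]) -> List[OperatingPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def select_for_deadline(front: List[OperatingPoint],
                        deadline_ms: float) -> Optional[OperatingPoint]:
    """Pick the most accurate point that meets the deadline, if any."""
    feasible = [p for p in front if p.exec_time_ms <= deadline_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.error_pct)


# Illustrative values only; they are not measurements from the paper.
points = [
    OperatingPoint({"window": 3, "max_disparity": 16}, 20.0, 12.0),
    OperatingPoint({"window": 5, "max_disparity": 32}, 45.0, 7.0),
    OperatingPoint({"window": 9, "max_disparity": 64}, 90.0, 4.0),
    OperatingPoint({"window": 7, "max_disparity": 16}, 60.0, 9.0),  # dominated
]

front = pareto_front(points)
best = select_for_deadline(front, deadline_ms=50.0)
print(len(front), best.config if best else None)
```

In this toy data set the fourth configuration is slower and less accurate than the second, so it is dropped from the front; a QoS constraint then reduces to a lookup over the remaining points.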


      Published In

      CODES+ISSS '12: Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
      October 2012
      596 pages
      ISBN: 9781450314268
      DOI: 10.1145/2380445
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. design space exploration
      2. multi-objective optimization
      3. opencl
      4. runtime monitoring

      Qualifiers

      • Research-article

      Conference

      ESWEEK'12: Eighth Embedded System Week
      October 7 - 12, 2012
      Tampere, Finland

      Acceptance Rates

      CODES+ISSS '12 Paper Acceptance Rate 48 of 163 submissions, 29%;
      Overall Acceptance Rate 280 of 864 submissions, 32%

      Cited By

      • (2024) Integrating Bayesian Optimization and Machine Learning for the Optimal Configuration of Cloud Systems. IEEE Transactions on Cloud Computing, 12:1 (277-294). DOI: 10.1109/TCC.2024.3361070. Online publication date: Jan-2024
      • (2024) d-MALIBOO: a Bayesian Optimization framework for dealing with Discrete Variables. 2024 32nd International Conference on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS) (1-8). DOI: 10.1109/MASCOTS64422.2024.10786339. Online publication date: 21-Oct-2024
      • (2022) MALIBOO: When Machine Learning meets Bayesian Optimization. 2022 IEEE 7th International Conference on Smart Cloud (SmartCloud) (1-9). DOI: 10.1109/SmartCloud55982.2022.00008. Online publication date: Oct-2022
      • (2019) mARGOt: A Dynamic Autotuning Framework for Self-Aware Approximate Computing. IEEE Transactions on Computers, 68:5 (713-728). DOI: 10.1109/TC.2018.2883597. Online publication date: 1-May-2019
      • (2018) Real-Time High-Quality Stereo Matching System on a GPU. 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) (1-8). DOI: 10.1109/ASAP.2018.8445111. Online publication date: Jul-2018
      • (2015) Application autotuning to support runtime adaptivity in multicore architectures. 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS) (173-180). DOI: 10.1109/SAMOS.2015.7363673. Online publication date: Jul-2015
      • (2014) Combining application adaptivity and system-wide Resource Management on multi-core platforms. 2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV) (26-33). DOI: 10.1109/SAMOS.2014.6893191. Online publication date: Jul-2014
