research-article

Construction and exploitation of VLIW ASIPs with heterogeneous vector-widths

Published: 01 November 2014

    Abstract

    Numerous applications in important domains, such as communication and multimedia, exhibit significant data-level parallelism (DLP). A large part of this DLP is usually exploited by vectorizing the application and implementing vector operations in the processors that execute it. Although the amount of DLP varies between applications of the same domain, and even within a single application, processor architectures usually support a single vector width. This may not be optimal and may cause substantial energy inefficiency. A more sophisticated exploitation of DLP is therefore highly relevant. This paper proposes the use of heterogeneous vector widths and a method to explore heterogeneous vector widths for VLIW ASIPs. In our context, heterogeneity means the use of two or more different vector widths in a single ASIP. After a brief explanation of the target ASIP architecture model, the paper describes the vector-width exploration method and the associated design automation tools. Subsequently, experimental results are discussed.

    Cited By

    • (2018) Effective Implementation of Matrix-Vector Multiplication on Intel's AVX multicore Processor, Computer Languages, Systems and Structures, 51:C (158-175), doi:10.1016/j.cl.2017.06.003. Online publication date: 1-Jan-2018

    Published In

    Microprocessors & Microsystems, Volume 38, Issue 8
    November 2014
    342 pages

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands

    Author Tags

    1. ASIPs
    2. DLP
    3. SIMD
    4. VLIW
    5. Vector processing

    Qualifiers

    • Research-article
