Abstract
Multimedia applications are among the dominant computing workloads driving innovation in high-performance, cost-effective systems. Accordingly, modern general-purpose microprocessors have added multimedia extensions (e.g., MMX, SSE, VIS, MAX, AltiVec) to their instruction set architectures to improve multimedia performance at little added cost. Whereas prior studies of multimedia extensions have focused primarily on a single processor, this paper quantitatively evaluates the impact of multimedia extensions on system performance and efficiency for different numbers of processing elements (PEs) within an integrated multiprocessor array. Using architectural and workload simulation, it also identifies the optimal PE granularity for the array system and implementation technology in terms of throughput, area efficiency, and energy efficiency. Experimental results with cycle-accurate simulation and technology modeling show that MMX-type instructions (representative of Intel's multimedia extensions) achieve an average speedup over the baseline ranging from 1.24× (for a 65,536-PE system) to 5.65× (for a 4-PE system). In addition, the MMX-enhanced processor array increases both area efficiency and energy efficiency over the baseline for all configurations and programs. The highest area and energy efficiency are achieved with between 256 and 1,024 PEs. These evaluation techniques, which combine performance simulation with technology modeling, can help address the design challenges of a new class of multiprocessor array systems for multimedia.
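To make concrete what an MMX-type instruction contributes, the following minimal sketch models subword parallelism: a packed, saturating add over eight unsigned 8-bit pixels held in one 64-bit word. It is an illustration only and is not taken from the paper. The function names (`pack_u8`, `unpack_u8`, `packed_add_saturate_u8`) are hypothetical, and the Python loop emulates in software what the hardware performs across all eight lanes in a single instruction.

```python
from typing import List

LANES = 8          # eight 8-bit subwords per 64-bit word (assumed layout)
LANE_MASK = 0xFF   # mask selecting one unsigned 8-bit lane


def pack_u8(values: List[int]) -> int:
    """Pack eight unsigned 8-bit values into one 64-bit word (lane 0 = lowest byte)."""
    word = 0
    for lane, value in enumerate(values):
        word |= (value & LANE_MASK) << (8 * lane)
    return word


def unpack_u8(word: int) -> List[int]:
    """Split a 64-bit word back into its eight unsigned 8-bit lanes."""
    return [(word >> (8 * lane)) & LANE_MASK for lane in range(LANES)]


def packed_add_saturate_u8(a: int, b: int) -> int:
    """Lane-wise unsigned add that clamps each lane at 255 instead of wrapping."""
    result = 0
    for lane in range(LANES):
        shift = 8 * lane
        x = (a >> shift) & LANE_MASK
        y = (b >> shift) & LANE_MASK
        result |= min(x + y, LANE_MASK) << shift
    return result


# Brighten eight pixels at once; lanes that would overflow clamp to 255.
pixels = pack_u8([10, 200, 255, 0, 128, 64, 250, 1])
delta = pack_u8([20, 100, 1, 0, 128, 64, 10, 1])
brightened = unpack_u8(packed_add_saturate_u8(pixels, delta))
```

Saturation matters for imaging workloads because a wrapped sum would turn a bright pixel dark, whereas a clamped sum stays at the maximum intensity.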
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, JM. (2010). Impact of Multimedia Extensions for Different Processing Element Granularities on an Embedded Imaging System. In: Hsu, CH., Yang, L.T., Park, J.H., Yeo, SS. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2010. Lecture Notes in Computer Science, vol 6081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13119-6_42
DOI: https://doi.org/10.1007/978-3-642-13119-6_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13118-9
Online ISBN: 978-3-642-13119-6
eBook Packages: Computer Science, Computer Science (R0)