
A Customized Cross-Bar for Data-Shuffling in Domain-Specific SIMD Processors

  • Conference paper
Architecture of Computing Systems - ARCS 2007 (ARCS 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4415)

Abstract

Shuffle operations are among the most common operations in SIMD-based embedded system architectures. In this paper we study different families of shuffle operations that frequently occur in embedded applications running on SIMD architectures. These shuffle operations are used to drive the design of a custom shuffler for domain-specific SIMD processors. The energy efficiency of various crossbar-based custom shufflers is analyzed and compared with that of the widely used full crossbar. We show that by customizing the crossbar to implement specific shuffle operations required in the target application domain, we can reduce the energy consumption of shuffle operations by up to 80%. We also illustrate the tradeoffs between flexibility and energy efficiency of custom shufflers and show that customization offers reasonable benefits without compromising the flexibility required for the target application domain.
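
As a rough illustration of the idea summarized above (not the authors' design), a shuffle can be modelled as a vector of source-lane indices. A full crossbar lets every output lane select any input lane, while a customized crossbar keeps only the connections needed by a chosen set of shuffle patterns. The lane count, the pattern names, and the use of crosspoint count as a stand-in for cost are all assumptions made for this sketch; the energy figures quoted in the abstract come from the paper's own evaluation and are not reproduced by this model.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Crosspoint = Tuple[int, int]  # (output_lane, input_lane)


def shuffle(data: Sequence, select: Sequence[int]) -> List:
    """Output lane i takes the value of input lane select[i]."""
    return [data[src] for src in select]


def required_crosspoints(patterns: Iterable[Sequence[int]]) -> Set[Crosspoint]:
    """Collect every (output, input) connection used by at least one pattern."""
    return {(out, src) for sel in patterns for out, src in enumerate(sel)}


def is_supported(select: Sequence[int], crosspoints: Set[Crosspoint]) -> bool:
    """A shuffle is realizable only if all of its connections exist."""
    return all((out, src) in crosspoints for out, src in enumerate(select))


# Hypothetical 8-lane datapath and a hypothetical set of domain shuffles.
LANES = 8
PATTERNS: Dict[str, List[int]] = {
    "identity": [0, 1, 2, 3, 4, 5, 6, 7],
    "reverse": [7, 6, 5, 4, 3, 2, 1, 0],
    "rotate_left_1": [1, 2, 3, 4, 5, 6, 7, 0],
    "interleave_halves": [0, 4, 1, 5, 2, 6, 3, 7],
}

full_crosspoints = LANES * LANES                      # every output sees every input
custom_crosspoints = required_crosspoints(PATTERNS.values())

if __name__ == "__main__":
    print("full crossbar crosspoints:  ", full_crosspoints)
    print("custom crossbar crosspoints:", len(custom_crosspoints))
    print(shuffle(list("abcdefgh"), PATTERNS["interleave_halves"]))
```

The flexibility tradeoff shows up in `is_supported`: any shuffle built from the retained connections still works, but an arbitrary pattern such as broadcasting lane 3 to all outputs is rejected by this particular custom crossbar, whereas a full crossbar would accept it.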

Editor information

Paul Lukowicz, Lothar Thiele, Gerhard Tröster

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Raghavan, P. et al. (2007). A Customized Cross-Bar for Data-Shuffling in Domain-Specific SIMD Processors. In: Lukowicz, P., Thiele, L., Tröster, G. (eds) Architecture of Computing Systems - ARCS 2007. ARCS 2007. Lecture Notes in Computer Science, vol 4415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71270-1_5

  • DOI: https://doi.org/10.1007/978-3-540-71270-1_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71267-1

  • Online ISBN: 978-3-540-71270-1

  • eBook Packages: Computer Science, Computer Science (R0)
