Effective exploitation of a zero overhead loop buffer

Published: 01 May 1999

Abstract

A Zero Overhead Loop Buffer (ZOLB) is an architectural feature commonly found in DSP processors. The buffer can be viewed as a compiler-managed cache that holds a sequence of instructions to be executed a specified number of times. Unlike software techniques such as loop unrolling, a loop buffer is a hardware technique that minimizes loop overhead without the penalty of increased code size. A ZOLB also requires relatively little space and power, both of which are important considerations for most DSP applications. This paper describes strategies for generating code that uses a ZOLB effectively. Many common code-improving transformations that optimizing compilers apply on conventional architectures are shown (1) to allow more loops to be placed in a ZOLB and (2) to further reduce the overhead of the loops placed in a ZOLB. The results given in this paper demonstrate that this architectural feature can often be exploited to obtain substantial improvements in execution time and slight reductions in code size.
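
The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not taken from the paper and does not model the DSP16000 or any real processor. The per-iteration overhead of three instructions (counter update, compare, branch), the one-instruction buffer setup cost, and the names `LoopCost`, `conventional`, `unrolled`, and `zolb` are all assumptions chosen for illustration. It also assumes the loop body fits in the buffer.

```python
from dataclasses import dataclass


@dataclass
class LoopCost:
    executed: int   # dynamic count: instructions executed at run time
    code_size: int  # static count: instructions stored in program memory


# Assumed overhead of a conventional loop per iteration:
# one counter update, one compare, one conditional branch.
OVERHEAD_PER_ITER = 3

# Assumed one-time cost of the instruction that sets up the loop buffer.
ZOLB_SETUP = 1


def conventional(body: int, trips: int) -> LoopCost:
    """Ordinary loop: every iteration pays the loop overhead."""
    return LoopCost(
        executed=trips * (body + OVERHEAD_PER_ITER),
        code_size=body + OVERHEAD_PER_ITER,
    )


def unrolled(body: int, trips: int, factor: int) -> LoopCost:
    """Unrolled loop: overhead is paid once per `factor` copies of the body,
    but the body is replicated `factor` times in program memory."""
    if trips % factor != 0:
        raise ValueError("this sketch assumes trips is a multiple of factor")
    return LoopCost(
        executed=(trips // factor) * (factor * body + OVERHEAD_PER_ITER),
        code_size=factor * body + OVERHEAD_PER_ITER,
    )


def zolb(body: int, trips: int) -> LoopCost:
    """Loop buffer: hardware repeats the body, so only the setup cost remains."""
    return LoopCost(
        executed=ZOLB_SETUP + trips * body,
        code_size=ZOLB_SETUP + body,
    )


if __name__ == "__main__":
    for name, cost in [
        ("conventional", conventional(4, 100)),
        ("unrolled x4", unrolled(4, 100, 4)),
        ("zolb", zolb(4, 100)),
    ]:
        print(f"{name:13s} executed={cost.executed:4d} code_size={cost.code_size:3d}")
```

Under these assumed costs, unrolling lowers the dynamic instruction count only by enlarging the static code, while the buffered loop lowers both. That is the qualitative behaviour the abstract reports, though the real figures depend on the target architecture.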

Published In

ACM SIGPLAN Notices, Volume 34, Issue 7
LCTES '99. Languages, compilers, and tools for embedded systems: proceedings of the ACM SIGPLAN 1999 workshop
July 1999
104 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/315253
  • LCTES '99: Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
    May 1999
    120 pages
    ISBN:1581131364
    DOI:10.1145/314403
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 1999
Published in SIGPLAN Volume 34, Issue 7

Qualifiers

  • Article
