DOI: 10.1145/337292.337571

Optimal two level partitioning and loop scheduling for hiding memory latency for DSP applications

Published: 01 June 2000

Abstract

The large latency of memory accesses in modern computers is a key obstacle to achieving high processor utilization. To hide this latency, this paper proposes a new memory management technique that can be applied to computer architectures with three levels of memory. The technique takes advantage of access pattern information that is available at compile time by prefetching certain data elements from the higher level memory. It also keeps certain data in place for a period of time to prevent unnecessary data swapping. Partitioning the iteration space and reducing execution in each partition greatly improves data locality compared with the usual access pattern. Together, these approaches improve average execution time by approximately 35% over the one-level partition algorithm and by more than 80% over list scheduling and hardware prefetching.
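
The locality effect of partitioning the iteration space can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it models no prefetching or scheduling, and the access pattern (iteration (i, j) touches A[i-1][j] and A[i][j]), the LRU-managed local buffer, and all sizes and function names are assumptions chosen for illustration. It only compares how often a small local memory must fetch from a remote level under row-by-row execution versus partition-by-partition execution.

```python
from collections import OrderedDict
from itertools import product


def row_major_order(n, m):
    """Visit the n x m iteration space row by row (the usual pattern)."""
    return list(product(range(n), range(m)))


def partitioned_order(n, m, tile_rows, tile_cols):
    """Visit the iteration space one rectangular partition at a time."""
    order = []
    for i0 in range(0, n, tile_rows):
        for j0 in range(0, m, tile_cols):
            for i in range(i0, min(i0 + tile_rows, n)):
                for j in range(j0, min(j0 + tile_cols, m)):
                    order.append((i, j))
    return order


def count_misses(order, capacity):
    """Count remote fetches for a local buffer of `capacity` elements (LRU).

    Assumed access pattern (hypothetical, not from the paper):
    iteration (i, j) touches A[i-1][j] if it exists, then A[i][j].
    """
    local = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for i, j in order:
        touched = ([(i - 1, j)] if i > 0 else []) + [(i, j)]
        for addr in touched:
            if addr in local:
                local.move_to_end(addr)  # hit: mark as most recently used
            else:
                misses += 1  # miss: fetch from the remote level
                local[addr] = True
                if len(local) > capacity:
                    local.popitem(last=False)  # evict least recently used
    return misses


if __name__ == "__main__":
    n, m, capacity = 8, 16, 8  # assumed sizes for illustration
    print("row-major misses:  ", count_misses(row_major_order(n, m), capacity))
    print("partitioned misses:", count_misses(partitioned_order(n, m, 8, 2), capacity))
```

With these assumed sizes, the row-by-row order evicts each element before the next row reuses it, while the partitioned order misses only on the first touch of each element, because the reuse of A[i-1][j] falls inside the same narrow partition.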


Published In

DAC '00: Proceedings of the 37th Annual Design Automation Conference
June 2000
819 pages
ISBN:1581131879
DOI:10.1145/337292
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

DAC00: ACM/IEEE-CAS/EDAC Design Automation Conference
June 5 - 9, 2000
Los Angeles, California, USA

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Cited By

  • (2015) Minimizing write operation for multi-dimensional DSP applications via a two-level partition technique with complete memory latency hiding. Journal of Systems Architecture: the EUROMICRO Journal 61:2 (112-126). DOI: 10.1016/j.sysarc.2015.02.001. Online publication date: 1-Feb-2015
  • (2011) Bibliography. Real-Time Embedded Systems (187-207). DOI: 10.1201/b10935-12. Online publication date: 7-Jun-2011
  • (2009) Loop scheduling and bank type assignment for heterogeneous multi-bank memory. Journal of Parallel and Distributed Computing 69:6 (546-558). DOI: 10.1016/j.jpdc.2009.02.005. Online publication date: 1-Jun-2009
  • (2005) Improving the memory bandwidth utilization using loop transformations. Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation (117-126). DOI: 10.1007/11556930_13. Online publication date: 21-Sep-2005
  • (2004) Efficient variable partitioning and scheduling for DSP processors with multiple memory modules. IEEE Transactions on Signal Processing 52:4 (1090-1099). DOI: 10.1109/TSP.2004.823506. Online publication date: 1-Apr-2004
  • (2001) Combined partitioning and data padding for scheduling multiple loop nests. Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems (67-75). DOI: 10.1145/502217.502228. Online publication date: 16-Nov-2001
  • (2001) Scheduling and partitioning for multiple loop nests. Proceedings of the 14th international symposium on Systems synthesis (183-188). DOI: 10.1145/500001.500042. Online publication date: 30-Sep-2001
  • (2001) Optimal partitioning and balanced scheduling with the maximal overlap of data footprints. Proceedings of the 11th Great Lakes symposium on VLSI (31-36). DOI: 10.1145/368122.368155. Online publication date: 1-Mar-2001
