Predictability of load/store instruction latencies

Authors:

Rajiv Gupta

MICRO 26: Proceedings of the 26th annual international symposium on Microarchitecture

Pages 139 - 152

Published: 01 December 1993

Index Terms
1. Hardware
  1. Integrated circuits
    1. Semiconductor memory
2. Software and its engineering
  1. Software notations and tools

Published In

MICRO 26: Proceedings of the 26th annual international symposium on Microarchitecture

December 1993

276 pages

ISBN: 0818652802

Chairmen: Andrew Wolfe (Princeton Univ., Princeton, NJ), William Mangione-Smith (Motorola)

Publisher

IEEE Computer Society Press

Washington, DC, United States

Qualifiers

Article

Conference

MICRO93

Sponsor:

SIGMICRO
IEEE-CS\TCMM

MICRO93: 26th Annual International Symposium on Microarchitecture

December 1 - 3, 1993

Austin, Texas, USA

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Bibliometrics

Article Metrics

Total Citations: 48
Total Downloads: 642
Downloads (last 12 months): 125
Downloads (last 6 weeks): 20

Reflects downloads up to 15 Oct 2024

Cited By

González A, Aliagas C, Valero M (2014). A data cache with multiple caching strategies tuned to different types of locality. ACM International Conference on Supercomputing 25th Anniversary Volume, pages 217-226. Online publication date: 10-Jun-2014.
https://dl.acm.org/doi/10.1145/2591635.2667170
Choi H, Ahn J, Sung W (2012). Reducing off-chip memory traffic by selective cache management scheme in GPGPUs. Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, pages 110-119. Online publication date: 3-Mar-2012.
https://dl.acm.org/doi/10.1145/2159430.2159443
Winkel S, Krishnaiyer R, Sampson R, Soffa M, Duesterwald E (2008). Latency-tolerant software pipelining in a production compiler. Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization, pages 104-113. Online publication date: 6-Apr-2008.
https://dl.acm.org/doi/10.1145/1356058.1356073
Yan J, Zhang W (2007). Hybrid multi-core architecture for boosting single-threaded performance. ACM SIGARCH Computer Architecture News, 35:1, pages 141-148. Online publication date: 1-Mar-2007.
https://dl.acm.org/doi/10.1145/1241601.1241603
Agaram K, Keckler S, Lin C, McKinley K, Petrank E, Moss E (2006). Decomposing memory performance. Proceedings of the 5th international symposium on Memory management, pages 95-103. Online publication date: 10-Jun-2006.
https://dl.acm.org/doi/10.1145/1133956.1133970
Jeong J, Stenström P, Dubois M, Alderighi M, Salapura V, McKee S (2006). Simple penalty-sensitive replacement policies for caches. Proceedings of the 3rd conference on Computing frontiers, pages 341-352. Online publication date: 3-May-2006.
https://dl.acm.org/doi/10.1145/1128022.1128068
Dybdahl H, Stenström P (2006). Enhancing last-level cache performance by block bypassing and early miss determination. Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture, pages 52-66. Online publication date: 6-Sep-2006.
https://dl.acm.org/doi/10.1007/11859802_6
Zhou H, Conte T (2005). Enhancing Memory-Level Parallelism via Recovery-Free Value Prediction. IEEE Transactions on Computers, 54:7, pages 897-912. Online publication date: 1-Jul-2005.
https://dl.acm.org/doi/10.1109/TC.2005.117
Das A, Lu J, Chen H, Kim J, Yew P, Hsu W, Chen D (2005). Performance of Runtime Optimization on BLAST. Proceedings of the international symposium on Code generation and optimization, pages 86-96. Online publication date: 20-Mar-2005.
https://dl.acm.org/doi/10.1109/CGO.2005.25
Rabbah R, Sandanagobalane H, Ekpanyapong M, Wong W (2004). Compiler orchestrated prefetching via speculation and predication. ACM SIGOPS Operating Systems Review, 38:5, pages 189-198. Online publication date: 7-Oct-2004.
https://dl.acm.org/doi/10.1145/1037949.1024416
