Computer Architecture, Fifth Edition: A Quantitative Approach

Publisher: Morgan Kaufmann Publishers Inc., 340 Pine Street, Sixth Floor, San Francisco, CA, United States

ISBN: 978-0-12-383872-8

Published: 29 September 2011

Pages: 880

Abstract

The computing world today is in the middle of a revolution: mobile clients and cloud computing have emerged as the dominant paradigms driving programming and hardware innovation today. The Fifth Edition of Computer Architecture focuses on this dramatic shift, exploring the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices. Each chapter includes two real-world examples, one mobile and one datacenter, to illustrate this revolutionary change.

Updated to cover the mobile computing revolution.
Emphasizes the two most important topics in architecture today: memory hierarchy and parallelism in all its forms.
Develops common themes throughout each chapter: power, performance, cost, dependability, protection, programming models, and emerging trends ("What's Next").
Includes three review appendices in the printed text. Additional reference appendices are available online.
Includes updated Case Studies and completely new exercises.

References

Adve, S. V., and K. Gharachorloo [1996]. "Shared memory consistency models: A tutorial," IEEE Computer 29:12 (December), 66-76. Google ScholarDigital Library
Adve, S. V., and M. D. Hill [1990]. "Weak ordering--a new definition," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 2-14. Google Scholar
Agarwal, A. [1987]. "Analysis of Cache Performance for Operating Systems and Multiprogramming," Ph. D. thesis, Tech. Rep. No. CSL-TR-87-332, Stanford University, Palo Alto, Calif. Google Scholar
Agarwal, A. [1991]. "Limits on interconnection network performance," IEEE Trans. on Parallel and Distributed Systems 2:4 (April), 398-412. Google ScholarDigital Library
Agarwal, A., and S. D. Pudar [1993]. "Column-associative caches: A technique for reducing the miss rate of direct-mapped caches," 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif. Also appears in Computer Architecture News 21:2 (May), 179-190, 1993. Google Scholar
Agarwal, A., R. Bianchini, D. Chaiken, K. Johnson, and D. Kranz [1995]. "The MIT Alewife machine: Architecture and performance," Int'l. Symposium on Computer Architecture (Denver, Colo.), June, 2-13. Google Scholar
Agarwal, A., J. L. Hennessy, R. Simoni, and M. A. Horowitz [1988]. "An evaluation of directory schemes for cache coherence," Proc. 15th Int'l. Symposium on Computer Architecture (June), 280-289. Google Scholar
Agarwal, A., J. Kubiatowicz, D. Kranz, B.-H. Lim, D. Yeung, G. D'Souza, and M. Parkin [1993]. "Sparcle: An evolutionary processor design for large-scale multiprocessors," IEEE Micro 13 (June), 48-61. Google ScholarDigital Library
Agerwala, T., and J. Cocke [1987]. High Performance Reduced Instruction Set Processors , IBM Tech. Rep. RC12434, IBM, Armonk, N.Y.Google Scholar
Akeley, K. and T. Jermoluk [1988]. "High-Performance Polygon Rendering," Proc. 15th Annual Conf. on Computer Graphics and Interactive Techniques (SIGGRAPH 1988) , August 1-5, 1988, Atlanta, Ga., 239-246. Google Scholar
Alexander, W. G., and D. B. Wortman [1975]. "Static and dynamic characteristics of XPL programs," IEEE Computer 8:11 (November), 41-46. Google ScholarDigital Library
Alles, A. [1995]. "ATM Internetworking," White Paper (May), Cisco Systems, Inc., San Jose, Calif. (www.cisco.com/warp/public/614/12.html)Google Scholar
Alliant. [1987]. Alliant FX/Series: Product Summary , Alliant Computer Systems Corp., Acton, Mass.Google Scholar
Almasi, G. S., and A. Gottlieb [1989]. Highly Parallel Computing , Benjamin/Cummings, Redwood City, Calif. Google Scholar
Alverson, G., R. Alverson, D. Callahan, B. Koblenz, A. Porterfield, and B. Smith [1992]. "Exploiting heterogeneous parallelism on a multithreaded multiprocessor," Proc. ACM/IEEE Conf. on Supercomputing , November 16-20, 1992, Minneapolis, Minn., 188-197. Google Scholar
Amdahl, G. M. [1967]. "Validity of the single processor approach to achieving large scale computing capabilities," Proc. AFIPS Spring Joint Computer Conf. , April 18-20, 1967, Atlantic City, N. J., 483-485. Google Scholar
Amdahl, G. M., G. A. Blaauw, and F. P. Brooks, Jr. [1964]. "Architecture of the IBM System 360," IBM J. Research and Development 8:2 (April), 87-101. Google ScholarDigital Library
Amza, C., A. L. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel [1996]. "Treadmarks: Shared memory computing on networks of workstations," IEEE Computer 29:2 (February), 18-28. Google ScholarDigital Library
Anderson, D. [2003]. "You don't know jack about disks," Queue , 1:4 (June), 20-30. Google ScholarDigital Library
Anderson, D., J. Dykes, and E. Riedel [2003]. "SCSI vs. ATA--More than an interface," Proc. 2nd USENIX Conf. on File and Storage Technology (FAST '03) , March 31- April 2, 2003, San Francisco. Google Scholar
Anderson, D. W., F. J. Sparacio, and R. M. Tomasulo [1967]. "The IBM 360 Model 91: Processor philosophy and instruction handling," IBM J. Research and Development 11:1 (January), 8-24. Google ScholarDigital Library
Anderson, M. H. [1990]. "Strength (and safety) in numbers (RAID, disk storage technology)," Byte 15:13 (December), 337-339.Google Scholar
Anderson, T. E., D. E. Culler, and D. Patterson [1995]. "A case for NOW (networks of workstations)," IEEE Micro 15:1 (February), 54-64. Google ScholarDigital Library
Ang, B., D. Chiou, D. Rosenband, M. Ehrlich, L. Rudolph, and Arvind [1998]. "StarTVoyager: A flexible platform for exploring scalable SMP issues," Proc. ACM/IEEE Conf. on Supercomputing , November 7-13, 1998, Orlando, FL. Google Scholar
Anjan, K. V., and T. M. Pinkston [1995]. "An efficient, fully-adaptive deadlock recovery scheme: Disha," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google Scholar
Anon. et al. [1985]. A Measure of Transaction Processing Power , Tandem Tech. Rep. TR85.2. Also appears in Datamation 31:7 (April), 112-118, 1985. Google Scholar
Apache Hadoop. [2011]. http://hadoop.apache.org.Google Scholar
Archibald, J., and J.-L. Baer [1986]. "Cache coherence protocols: Evaluation using a multiprocessor simulation model," ACM Trans. on Computer Systems 4:4 (November), 273-298. Google ScholarDigital Library
Armbrust, M., A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia [2009]. Above the Clouds: A Berkeley View of Cloud Computing , Tech. Rep. UCB/EECS-2009-28, University of California, Berkeley (http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html).Google Scholar
Arpaci, R. H., D. E. Culler, A. Krishnamurthy, S. G. Steinberg, and K. Yelick [1995]. "Empirical evaluation of the CRAY-T3D: A compiler perspective," 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google Scholar
Asanovic, K. [1998]. "Vector Microprocessors," Ph. D. thesis, Computer Science Division, University of California, Berkeley. Google Scholar
Associated Press. [2005]. "Gap Inc. shuts down two Internet stores for major overhaul," USATODAY.com , August 8, 2005.Google Scholar
Atanasoff, J. V. [1940]. Computing Machine for the Solution of Large Systems of Linear Equations , Internal Report, Iowa State University, Ames.Google Scholar
Atkins, M. [1991]. Performance and the i860 Microprocessor, IEEE Micro , 11:5 (September), 24-27, 72-78. Google ScholarDigital Library
Austin, T. M., and G. Sohi [1992]. "Dynamic dependency analysis of ordinary programs," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 342-351. Google Scholar
Babbay, F., and A. Mendelson [1998]. "Using value prediction to increase the power of speculative execution hardware," ACM Trans. on Computer Systems 16:3 (August), 234-270. Google Scholar
Baer, J.-L., and W.-H. Wang [1988]. "On the inclusion property for multi-level cache hierarchies," Proc. 15th Annual Int'l. Symposium on Computer Architecture , May 30-June 2, 1988, Honolulu, Hawaii, 73-80. Google Scholar
Bailey, D. H., E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga [1991]. "The NAS parallel benchmarks," Int'l. J. Supercomputing Applications 5, 63-73. Google ScholarDigital Library
Bakoglu, H. B., G. F. Grohoski, L. E. Thatcher, J. A. Kaeli, C. R. Moore, D. P. Tattle, W. E. Male, W. R. Hardell, D. A. Hicks, M. Nguyen Phu, R. K. Montoye, W. T. Glover, and S. Dhawan [1989]. "IBM second-generation RISC processor organization," Proc. IEEE Int'l. Conf. on Computer Design , September 30-October 4, 1989, Rye, N.Y., 138-142.Google Scholar
Balakrishnan, H., V. N. Padmanabhan, S. Seshan, and R. H. Katz [1997]. "A comparison of mechanisms for improving TCP performance over wireless links," IEEE/ACM Trans. on Networking 5:6 (December), 756-769. Google ScholarDigital Library
Ball, T., and J. Larus [1993]. "Branch prediction for free," Proc. ACM SIGPLAN'93 Conference on Programming Language Design and Implementation (PLDI) , June 23-25, 1993, Albuquerque, N. M., 300-313. Google Scholar
Banerjee, U. [1979]. "Speedup of Ordinary Programs," Ph. D. thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign. Google Scholar
Barham, P., B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, and R. Neugebauer [2003]. "Xen and the art of virtualization," Proc. of the 19th ACM Symposium on Operating Systems Principles , October 19-22, 2003, Bolton Landing, N.Y. Google Scholar
Barroso, L. A. [2010]. "Warehouse Scale Computing [keynote address]," Proc. ACM SIGMOD , June 8-10, 2010, Indianapolis, Ind. Google Scholar
Barroso, L. A., and U. Holzle [2007], "The case for energy-proportional computing," IEEE Computer , 40:12 (December), 33-37. Google ScholarDigital Library
Barroso, L. A., and U. Holzle [2009]. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , Morgan & Claypool, San Rafael, Calif. Google Scholar
Barroso, L. A., K. Gharachorloo, and E. Bugnion [1998]. "Memory system characterization of commercial workloads," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 3-14. Google Scholar
Barton, R. S. [1961]. "A new approach to the functional design of a computer," Proc. Western Joint Computer Conf. , May 9-11, 1961, Los Angeles, Calif., 393-396. Google Scholar
Bashe, C. J., W. Buchholz, G. V. Hawkins, J. L. Ingram, and N. Rochester [1981]. "The architecture of IBM's early computers," IBM J. Research and Development 25:5 (September), 363-375. Google ScholarDigital Library
Bashe, C. J., L. R. Johnson, J. H. Palmer, and E. W. Pugh [1986]. IBM's Early Computers , MIT Press, Cambridge, Mass. Google Scholar
Baskett, F., and T. W. Keller [1977]. "An evaluation of the Cray-1 processor," in High Speed Computer and Algorithm Organization , D. J. Kuck, D. H. Lawrie, and A. H. Sameh, eds., Academic Press, San Diego, 71-84.Google Scholar
Baskett, F., T. Jermoluk, and D. Solomon [1988]. "The 4D-MP graphics superworkstation: Computing + graphics = 40 MIPS + 40 MFLOPS and 10,000 lighted polygons per second," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 468-471.Google Scholar
BBN Laboratories. [1986]. Butterfly Parallel Processor Overview , Tech. Rep. 6148, BBN Laboratories, Cambridge, Mass.Google Scholar
Bell, C. G. [1984]. "The mini and micro industries," IEEE Computer 17:10 (October), 14-30. Google ScholarDigital Library
Bell, C. G. [1985]. "Multis: A new class of multiprocessor computers," Science 228 (April 26), 462-467.Google ScholarCross Ref
Bell, C. G. [1989]. "The future of high performance computers in science and engineering," Communications of the ACM 32:9 (September), 1091-1101. Google ScholarDigital Library
Bell, G., and J. Gray [2001]. Crays, Clusters and Centers , Tech. Rep. MSR-TR-2001-76, Microsoft Research, Redmond, Wash.Google Scholar
Bell, C. G., and J. Gray [2002]. "What's next in high performance computing?" CACM 45:2 (February), 91-95. Google ScholarDigital Library
Bell, C. G., and A. Newell [1971]. Computer Structures: Readings and Examples , McGraw-Hill, New York. Google Scholar
Bell, C. G., and W. D. Strecker [1976]. "Computer structures: What have we learned from the PDP-11?," Third Annual Int'l. Symposium on Computer Architecture (ISCA) , January 19-21, 1976, Tampa, Fla., 1-14. Google Scholar
Bell, C. G., and W. D. Strecker [1998]. "Computer structures: What have we learned from the PDP-11?" 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 138-151. Google ScholarDigital Library
Bell, C. G., J. C. Mudge, and J. E. McNamara [1978]. A DEC View of Computer Engineering , Digital Press, Bedford, Mass.Google Scholar
Bell, C. G., R. Cady, H. McFarland, B. DeLagi, J. O'Laughlin, R. Noonan, and W. Wulf [1970]. "A new architecture for mini-computers: The DEC PDP-11," Proc. AFIPS Spring Joint Computer Conf. , May 5-May 7, 1970, Atlantic City, N. J., 657-675. Google ScholarDigital Library
Benes, V. E. [1962]. "Rearrangeable three stage connecting networks," Bell System Technical Journal 41, 1481-1492.Google ScholarCross Ref
Bertozzi, D., A. Jalabert, S. Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli [2005]. "NoC synthesis flow for customized domain specific multiprocessor systems-on-chip," IEEE Trans. on Parallel and Distributed Systems 16:2 (February), 113-130. Google ScholarDigital Library
Bhandarkar, D. P. [1995]. Alpha Architecture and Implementations , Digital Press, Newton, Mass.Google Scholar
Bhandarkar, D. P., and D. W. Clark [1991]. "Performance from architecture: Comparing a RISC and a CISC with similar hardware organizations," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 310-319. Google Scholar
Bhandarkar, D. P., and J. Ding [1997]. "Performance characterization of the Pentium Pro processor," Proc. Third Int'l. Symposium on High-Performance Computer Architecture , February 1-February 5, 1997, San Antonio, Tex., 288-297. Google Scholar
Bhuyan, L. N., and D. P. Agrawal [1984]. "Generalized hypercube and hyperbus structures for a computer network," IEEE Trans. on Computers 32:4 (April), 322-333. Google Scholar
Bienia, C., S. Kumar, P. S. Jaswinder, and K. Li [2008]. The Parsec Benchmark Suite: Characterization and Architectural Implications , Tech. Rep. TR-811-08, Princeton University, Princeton, N. J.Google Scholar
Bier, J. [1997]. "The Evolution of DSP Processors," presentation at Univesity of California, Berkeley, November 14.Google Scholar
Bird, S., A. Phansalkar, L. K. John, A. Mericas, and R. Indukuru [2007]. "Characterization of performance of SPEC CPU benchmarks on Intel's Core Microarchitecture based processor," Proc. 2007 SPEC Benchmark Workshop , January 21, 2007, Austin, Tex.Google Scholar
Birman, M., A. Samuels, G. Chu, T. Chuk, L. Hu, J. McLeod, and J. Barnes [1990]. "Developing the WRL3170/3171 SPARC floating-point coprocessors," IEEE Micro 10:1, 55-64. Google ScholarDigital Library
Blackburn, M., R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanovic, T. VanDrunen, D. von Dincklage, and B. Wiedermann [2006]. "The DaCapo benchmarks: Java benchmarking development and analysis," ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) , October 22-26, 2006, 169-190. Google Scholar
Blaum, M., J. Bruck, and A. Vardy [1996]. "MDS array codes with independent parity symbols," IEEE Trans. on Information Theory , IT-42 (March), 529-42. Google ScholarDigital Library
Blaum, M., J. Brady, J. Bruck, and J. Menon [1994]. "EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 245-254. Google Scholar
Blaum, M., J. Brady, J. Bruck, and J. Menon [1995]. "EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures," IEEE Trans. on Computers 44:2 (February), 192-202. Google ScholarDigital Library
Blaum, M., J. Brady, J., Bruck, J. Menon, and A. Vardy [2001]. "The EVENODD code and its generalization," in H. Jin, T. Cortes, and R. Buyya, eds., High Performance Mass Storage and Parallel I/O: Technologies and Applications , Wiley-IEEE, New York, 187-208.Google Scholar
Bloch, E. [1959]. "The engineering design of the Stretch computer," 1959 Proceedings of the Eastern Joint Computer Conf. , December 1-3, 1959, Boston, Mass., 48-59. Google Scholar
Boddie, J. R. [2000]. "History of DSPs," www.lucent.com/micro/dsp/dsphist.html.Google Scholar
Bolt, K. M. [2005]. "Amazon sees sales rise, profit fall," Seattle Post-Intelligencer , October 25 (http://seattlepi.nwsource.com/business/245943_techearns26.html).Google Scholar
Bordawekar, R., U. Bondhugula, R. Rao [2010]. "Believe It or Not!: Multi-core CPUs can Match GPU Performance for a FLOP-Intensive Application!" 19th International Conference on Parallel Architecture and Compilation Techniques (PACT 2010) . Vienna, Austria, September 11-15, 2010, 537-538. Google Scholar
Borg, A., R. E. Kessler, and D. W. Wall [1990]. "Generation and analysis of very long address traces," 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 270-279. Google Scholar
Bouknight, W. J., S. A. Deneberg, D. E. McIntyre, J. M. Randall, A. H. Sameh, and D. L. Slotnick [1972]. "The Illiac IV system," Proc. IEEE 60:4, 369-379. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 306-316. Google ScholarCross Ref
Brady, J. T. [1986]. "A theory of productivity in the creative process," IEEE CG&A (May), 25-34. Google Scholar
Brain, M. [2000]. "Inside a Digital Cell Phone," www.howstuffworks.com/insidecellphone. htm.Google Scholar
Brandt, M., J. Brooks, M. Cahir, T. Hewitt, E. Lopez-Pineda, and D. Sandness [2000]. The Benchmarker's Guide for Cray SV1 Systems. Cray Inc., Seattle, Wash.Google Scholar
Brent, R. P., and H. T. Kung [1982]. "A regular layout for parallel adders," IEEE Trans. on Computers C-31, 260-264. Google ScholarDigital Library
Brewer, E. A., and B. C. Kuszmaul [1994]. "How to get good performance from the CM-5 data network," Proc. Eighth Int'l. Parallel Processing Symposium , April 26-27, 1994, Cancun, Mexico. Google ScholarCross Ref
Brin, S., and L. Page [1998]. "The anatomy of a large-scale hypertextual Web search engine," Proc. 7th Int'l. World Wide Web Conf. , April 14-18, 1998, Brisbane, Queensland, Australia, 107-117. Google Scholar
Brown, A., and D. A. Patterson [2000]. "Towards maintainability, availability, and growth benchmarks: A case study of software RAID systems." Proc. 2000 USENIX Annual Technical Conf. , June 18-23, 2000, San Diego, Calif. Google Scholar
Bucher, I. V., and A. H. Hayes [1980]. "I/O performance measurement on Cray-1 and CDC 7000 computers," Proc. Computer Performance Evaluation Users Group , 16th Meeting , NBS 500-65, 245-254.Google Scholar
Bucher, I. Y. [1983]. "The computational speed of supercomputers," Proc. Int'l. Conf. on Measuring and Modeling of Computer Systems (SIGMETRICS 1983) , August 29-31, 1983, Minneapolis, Minn., 151-165. Google Scholar
Bucholtz, W. [1962]. Planning a Computer System: Project Stretch , McGraw-Hill, New York. Google Scholar
Burgess, N., and T. Williams [1995]. "Choices of operand truncation in the SRT division algorithm," IEEE Trans. on Computers 44:7, 933-938. Google ScholarDigital Library
Burkhardt III, H., S. Frank, B. Knobe, and J. Rothnie [1992]. Overview of the KSR1 Computer System , Tech. Rep. KSR-TR-9202001, Kendall Square Research, Boston, Mass.Google Scholar
Burks, A. W., H. H. Goldstine, and J. von Neumann [1946]. "Preliminary discussion of the logical design of an electronic computing instrument," Report to the U. S. Army Ordnance Department, p. 1; also appears in Papers of John von Neumann , W. Aspray and A. Burks, eds., MIT Press, Cambridge, Mass., and Tomash Publishers, Los Angeles, Calif., 1987, 97-146.Google Scholar
Calder, B., G. Reinman, and D. M. Tullsen [1999]. "Selective value prediction," Proc. 26th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 2-4, 1999, Atlanta, Ga. Google Scholar
Calder, B., D. Grunwald, M. Jones, D. Lindsay, J. Martin, M. Mozer, and B. Zorn [1997]. "Evidence-based static branch prediction using machine learning," ACM Trans. Program. Lang. Syst. 19:1, 188-222. Google ScholarDigital Library
Callahan, D., J. Dongarra, and D. Levine [1988]. "Vectorizing compilers: A test suite and results," Proc. ACM/IEEE Conf. on Supercomputing , November 12-17, 1988, Orland, Fla., 98-105. Google Scholar
Cantin, J. F., and M. D. Hill [2001]. "Cache Performance for Selected SPEC CPU2000 Benchmarks," www.jfred.org/cache-data.html (June).Google Scholar
Cantin, J. F., and M. D. Hill [2003]. "Cache Performance for SPEC CPU2000 Benchmarks, Version 3.0," www.cs.wisc.edu/multifacet/misc/spec2000cache-data/index.html.Google Scholar
Carles, S. [2005]. "Amazon reports record Xmas season, top game picks," Gamasutra , December 27 (http://www.gamasutra.com/php-bin/news_index.php?story=7630.)Google Scholar
Carter, J., and K. Rajamani [2010]. "Designing energy-efficient servers and data centers," IEEE Computer 43:7 (July), 76-78. Google ScholarDigital Library
Case, R. P., and A. Padegs [1978]. "The architecture of the IBM System/370," Communications of the ACM 21:1, 73-96. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 830-855. Google ScholarDigital Library
Censier, L., and P. Feautrier [1978]. "A new solution to coherence problems in multicache systems," IEEE Trans. on Computers C-27:12 (December), 1112-1118. Google ScholarDigital Library
Chandra, R., S. Devine, B. Verghese, A. Gupta, and M. Rosenblum [1994]. "Scheduling and page migration for multiprocessor compute servers," Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 4-7, 1994, San Jose, Calif., 12-24. Google Scholar
Chang, F., J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber [2006]. "Bigtable: A distributed storage system for structured data," Proc. 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI '06) , November 6-8, 2006, Seattle, Wash. Google Scholar
Chang, J., J. Meza, P. Ranganathan, C. Bash, and A. Shah [2010]. "Green server design: Beyond operational energy to sustainability," Proc. Workshop on Power Aware Computing and Systems (HotPower '10) , October 3, 2010, Vancouver, British Columbia. Google Scholar
Chang, P. P., S. A. Mahlke, W. Y. Chen, N. J. Warter, and W. W. Hwu [1991]. "IMPACT: An architectural framework for multiple-instruction-issue processors," 18th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 27-30, 1991, Toronto, Canada, 266-275. Google Scholar
Charlesworth, A. E. [1981]. "An approach to scientific array processing: The architecture design of the AP-120B/FPS-164 family," Computer 14:9 (September), 18-27. Google ScholarDigital Library
Charlesworth, A. [1998]. "Starfire: Extending the SMP envelope," IEEE Micro 18:1 (January/February), 39-49. Google ScholarDigital Library
Chen, P. M., and E. K. Lee [1995]. "Striping in a RAID level 5 disk array," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 15-19, 1995, Ottawa, Canada, 136-145. Google Scholar
Chen, P. M., G. A. Gibson, R. H. Katz, and D. A. Patterson [1990]. "An evaluation of redundant arrays of inexpensive disks using an Amdahl 5890," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 22-25, 1990, Boulder, Colo. Google Scholar
Chen, P. M., E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson [1994]. "RAID: High-performance, reliable secondary storage," ACM Computing Surveys 26:2 (June), 145-188. Google ScholarDigital Library
Chen, S. [1983]. "Large-scale and high-speed multiprocessor system for scientific applications," Proc. NATO Advanced Research Workshop on High-Speed Computing , June 20-22, 1983, Julich, West Germany. Also appears in K. Hwang, ed., "Superprocessors: Design and applications," IEEE (August), 602-609, 1984.Google Scholar
Chen, T. C. [1980]. "Overlap and parallel processing," in H. Stone, ed., Introduction to Computer Architecture , Science Research Associates, Chicago, 427-486.Google Scholar
Chow, F. C. [1983]. "A Portable Machine-Independent Global Optimizer--Design and Measurements," Ph. D. thesis, Stanford University, Palo Alto, Calif. Google Scholar
Chrysos, G. Z., and J. S. Emer [1998]. "Memory dependence prediction using store sets," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 142-153. Google Scholar
Clark, B., T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. Neefe Matthews [2004]. "Xen and the art of repeated research," Proc. USENIX Annual Technical Conf. , June 27-July 2, 2004, 135-144. Google Scholar
Clark, D. W. [1983]. "Cache performance of the VAX-11/780," ACM Trans. on Computer Systems 1:1, 24-37. Google ScholarDigital Library
Clark, D. W. [1987]. "Pipelining and performance in the VAX 8800 processor," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 173-177. Google Scholar
Clark, D. W., and J. S. Emer [1985]. "Performance of the VAX-11/780 translation buffer: Simulation and measurement," ACM Trans. on Computer Systems 3:1 (February), 31-62. Google ScholarDigital Library
Clark, D., and H. Levy [1982]. "Measurement and analysis of instruction set use in the VAX-11/780," Proc. Ninth Annual Int'l. Symposium on Computer Architecture (ISCA) , April 26-29, 1982, Austin, Tex., 9-17. Google Scholar
Clark, D., and W. D. Strecker [1980]. "Comments on 'the case for the reduced instruction set computer,'" Computer Architecture News 8:6 (October), 34-38. Google ScholarDigital Library
Clark, W. A. [1957]. "The Lincoln TX-2 computer development," Proc. Western Joint Computer Conference , February 26-28, 1957, Los Angeles, 143-145. Google Scholar
Clidaras, J., C. Johnson, and B. Felderman [2010]. Private communication. Climate Savers Computing Initiative. [2007]. "Efficiency Specs," http://www. climatesaverscomputing.org/.Google Scholar
Clos, C. [1953]. "A study of non-blocking switching networks," Bell Systems Technical Journal 32 (March), 406-424.Google ScholarCross Ref
Cody, W. J., J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson [1984]. "A proposed radix- and word-lengthindependent standard for floating-point arithmetic," IEEE Micro 4:4, 86-100. Google ScholarDigital Library
Colwell, R. P., and R. Steck [1995]. "A 0.6 µm BiCMOS processor with dynamic execution." Proc. of IEEE Int'l. Symposium on Solid State Circuits (ISSCC) , February 15-17, 1995, San Francisco, 176-177.Google Scholar
Colwell, R. P., R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman [1987]. "A VLIW architecture for a trace scheduling compiler," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 180-192. Google Scholar
Comer, D. [1993]. Internetworking with TCP/IP , 2nd ed., Prentice Hall, Englewood Cliffs, N. J. Google Scholar
Compaq Computer Corporation. [1999]. Compiler Writer's Guide for the Alpha 21264 , Order Number EC-RJ66A-TE, June, www1.support.compaq.com/alpha-tools/documentation/current/21264_EV67/ec-rj66a-te_comp_writ_gde_for_alpha21264.pdf.Google Scholar
Conti, C., D. H. Gibson, and S. H. Pitkowsky [1968]. "Structural aspects of the System/ 360 Model 85. Part I. General organization," IBM Systems J. 7:1, 2-14. Google ScholarDigital Library
Coonen, J. [1984]. "Contributions to a Proposed Standard for Binary Floating-Point Arithmetic," Ph. D. thesis, University of California, Berkeley. Google Scholar
Corbett, P., B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar [2004]. "Row-diagonal parity for double disk failure correction," Proc. 3rd USENIX Conf. on File and Storage Technology (FAST '04) , March 31-April 2, 2004, San Francisco. Google Scholar
Crawford, J., and P. Gelsinger [1988]. Programming the 80386 , Sybex Books, Alameda, Calif.Google Scholar
Culler, D. E., J. P. Singh, and A. Gupta [1999]. Parallel Computer Architecture: A Hardware/Software Approach , Morgan Kaufmann, San Francisco. Google Scholar
Curnow, H. J., and B. A. Wichmann [1976]. "A synthetic benchmark," The Computer J. 19:1, 43-49.Google ScholarCross Ref
Cvetanovic, Z., and R. E. Kessler [2000]. "Performance analysis of the Alpha 21264- based Compaq ES40 system," Proc. 27th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 10-14, 2000, Vancouver, Canada, 192-202. Google Scholar
Dally, W. J. [1990]. "Performance analysis of k -ary n -cube interconnection networks," IEEE Trans. on Computers 39:6 (June), 775-785. Google ScholarDigital Library
Dally, W. J. [1992]. "Virtual channel flow control," IEEE Trans. on Parallel and Distributed Systems 3:2 (March), 194-205. Google ScholarDigital Library
Dally, W. J. [1999]. "Interconnect limited VLSI architecture," Proc. of the International Interconnect Technology Conference , May 24-26, 1999, San Francisco.Google Scholar
Dally, W. J., and C. I. Seitz [1986]. "The torus routing chip," Distributed Computing 1:4, 187-196.Google ScholarCross Ref
Dally, W. J., and B. Towles [2001]. "Route packets, not wires: On-chip interconnection networks," Proc. 38th Design Automation Conference , June 18-22, 2001, Las Vegas. Google Scholar
Dally, W. J., and B. Towles [2003]. Principles and Practices of Interconnection Networks , Morgan Kaufmann, San Francisco. Google Scholar
Darcy, J. D., and D. Gay [1996]. "FLECKmarks: Measuring floating point performance using a full IEEE compliant arithmetic benchmark," CS 252 class project, University of California, Berkeley (see HTTP.CS.Berkeley.EDU/~darcy/Projects/cs252/).Google Scholar
Darley, H. M. et al. [1989]. "Floating Point/Integer Processor with Divide and Square Root Functions," U. S. Patent 4,878,190, October 31.Google Scholar
Davidson, E. S. [1971]. "The design and control of pipelined function generators," Proc. IEEE Conf. on Systems , Networks , and Computers , January 19-21, 1971, Oaxtepec, Mexico, 19-21.Google Scholar
Davidson, E. S., A. T. Thomas, L. E. Shar, and J. H. Patel [1975]. "Effective control for pipelined processors," Proc. IEEE COMPCON , February 25-27, 1975, San Francisco, 181-184.Google Scholar
Davie, B. S., L. L. Peterson, and D. Clark [1999]. Computer Networks: A Systems Approach , 2nd ed., Morgan Kaufmann, San Francisco. Google Scholar
Dean, J. [2009]. "Designs, lessons and advice from building large distributed systems [keynote address]," Proc. 3rd ACM SIGOPS Int'l. Workshop on Large-Scale Distributed Systems and Middleware , Co-located with the 22nd ACM Symposium on Operating Systems Principles , October 11-14, 2009, Big Sky, Mont.Google Scholar
Dean, J., and S. Ghemawat [2004]. "MapReduce: Simplified data processing on large clusters." In Proc. Operating Systems Design and Implementation (OSDI), December 6-8, 2004, San Francisco, Calif., 137-150.
Dean, J., and S. Ghemawat [2008]. "MapReduce: Simplified data processing on large clusters," Communications of the ACM, 51:1, 107-113.
DeCandia, G., D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels [2007]. "Dynamo: Amazon's highly available key-value store," Proc. 21st ACM Symposium on Operating Systems Principles, October 14-17, 2007, Stevenson, Wash.
Dehnert, J. C., P. Y.-T. Hsu, and J. P. Bratt [1989]. "Overlapped loop support on the Cydra 5," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 3-6, 1989, Boston, Mass., 26-39.
Demmel, J. W., and X. Li [1994]. "Faster numerical algorithms via exception handling," IEEE Trans. on Computers 43:8, 983-992.
Denehy, T. E., J. Bent, F. I. Popovici, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau [2004]. "Deconstructing storage arrays," Proc. 11th Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 7-13, 2004, Boston, Mass., 59-71.
Desurvire, E. [1992]. "Lightwave communications: The fifth generation," Scientific American (International Edition) 266:1 (January), 96-103.
Diep, T. A., C. Nelson, and J. P. Shen [1995]. "Performance evaluation of the PowerPC 620 microarchitecture," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA), June 22-24, 1995, Santa Margherita, Italy.
Digital Semiconductor. [1996]. Alpha Architecture Handbook, Version 3, Digital Press, Maynard, Mass.
Ditzel, D. R., and H. R. McLellan [1987]. "Branch folding in the CRISP microprocessor: Reducing the branch delay to zero," Proc. 14th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-5, 1987, Pittsburgh, Penn., 2-7.
Ditzel, D. R., and D. A. Patterson [1980]. "Retrospective on high-level language computer architecture," Proc. Seventh Annual Int'l. Symposium on Computer Architecture (ISCA), May 6-8, 1980, La Baule, France, 97-104.
Doherty, W. J., and R. P. Kelisky [1979]. "Managing VM/CMS systems for user effectiveness," IBM Systems J. 18:1, 143-166.
Dongarra, J. J. [1986]. "A survey of high performance processors," Proc. IEEE COMPCON, March 3-6, 1986, San Francisco, 8-11.
Dongarra, J., T. Sterling, H. Simon, and E. Strohmaier [2005]. "High-performance computing: Clusters, constellations, MPPs, and future directions," Computing in Science & Engineering, 7:2 (March/April), 51-59.
Douceur, J. R., and W. J. Bolosky [1999]. "A large scale study of file-system contents," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, May 1-9, 1999, Atlanta, Ga., 59-69.
Douglas, J. [2005]. "Intel 8xx series and Paxville Xeon-MP microprocessors," paper presented at Hot Chips 17, August 14-16, 2005, Stanford University, Palo Alto, Calif.
Duato, J. [1993]. "A new theory of deadlock-free adaptive routing in wormhole networks," IEEE Trans. on Parallel and Distributed Systems 4:12 (December), 1320-1331.
Duato, J., and T. M. Pinkston [2001]. "A general theory for deadlock-free adaptive routing using a mixed set of resources," IEEE Trans. on Parallel and Distributed Systems 12:12 (December), 1219-1235.
Duato, J., S. Yalamanchili, and L. Ni [2003]. Interconnection Networks: An Engineering Approach, 2nd printing, Morgan Kaufmann, San Francisco.
Duato, J., I. Johnson, J. Flich, F. Naven, P. Garcia, and T. Nachiondo [2005a]. "A new scalable and cost-effective congestion management strategy for lossless multistage interconnection networks," Proc. 11th Int'l. Symposium on High-Performance Computer Architecture, February 12-16, 2005, San Francisco.
Duato, J., O. Lysne, R. Pang, and T. M. Pinkston [2005b]. "Part I: A theory for deadlock-free dynamic reconfiguration of interconnection networks," IEEE Trans. on Parallel and Distributed Systems 16:5 (May), 412-427.
Dubois, M., C. Scheurich, and F. Briggs [1988]. "Synchronization, coherence, and event ordering," IEEE Computer 21:2 (February), 9-21.
Dunigan, W., K. Vetter, K. White, and P. Worley [2005]. "Performance evaluation of the Cray X1 distributed shared memory architecture," IEEE Micro January/February, 30-40.
Eden, A., and T. Mudge [1998]. "The YAGS branch prediction scheme," Proc. of the 31st Annual ACM/IEEE Int'l. Symposium on Microarchitecture, November 30-December 2, 1998, Dallas, Tex., 69-80.
Edmondson, J. H., P. I. Rubinfield, R. Preston, and V. Rajagopalan [1995]. "Superscalar instruction execution in the 21164 Alpha microprocessor," IEEE Micro 15:2, 33-43.
Eggers, S. [1989]. "Simulation Analysis of Data Sharing in Shared Memory Multiprocessors," Ph. D. thesis, University of California, Berkeley.
Elder, J., A. Gottlieb, C. K. Kruskal, K. P. McAuliffe, L. Randolph, M. Snir, P. Teller, and J. Wilson [1985]. "Issues related to MIMD shared-memory computers: The NYU Ultracomputer approach," Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA), June 17-19, 1985, Boston, Mass., 126-135.
Ellis, J. R. [1986]. Bulldog: A Compiler for VLIW Architectures, MIT Press, Cambridge, Mass.
Emer, J. S., and D. W. Clark [1984]. "A characterization of processor performance in the VAX-11/780," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA), June 5-7, 1984, Ann Arbor, Mich., 301-310.
Enriquez, P. [2001]. "What happened to my dial tone? A study of FCC service disruption reports," poster, Richard Tapia Symposium on the Celebration of Diversity in Computing, October 18-20, Houston, Tex.
Erlichson, A., N. Nuckolls, G. Chesson, and J. L. Hennessy [1996]. "SoftFLASH: Analyzing the performance of clustered distributed virtual shared memory," Proc. Seventh Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 1-5, 1996, Cambridge, Mass., 210-220.
Esmaeilzadeh, H., T. Cao, Y. Xi, S. M. Blackburn, and K. S. McKinley [2011]. "Looking Back on the Language and Hardware Revolution: Measured Power, Performance, and Scaling," Proc. 16th Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 5-11, 2011, Newport Beach, Calif.
Evers, M., S. J. Patel, R. S. Chappell, and Y. N. Patt [1998]. "An analysis of correlation and predictability: What makes two-level branch predictors work," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA), July 3-14, 1998, Barcelona, Spain, 52-61.
Fabry, R. S. [1974]. "Capability based addressing," Communications of the ACM 17:7 (July), 403-412.
Falsafi, B., and D. A. Wood [1997]. "Reactive NUMA: A design for unifying S-COMA and CC-NUMA," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo., 229-240.
Fan, X., W. Weber, and L. A. Barroso [2007]. "Power provisioning for a warehouse-sized computer," Proc. 34th Annual Int'l. Symposium on Computer Architecture (ISCA), June 9-13, 2007, San Diego, Calif.
Farkas, K. I., and N. P. Jouppi [1994]. "Complexity/performance trade-offs with nonblocking loads," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA), April 18-21, 1994, Chicago.
Farkas, K. I., N. P. Jouppi, and P. Chow [1995]. "How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors?," Proc. First IEEE Symposium on High-Performance Computer Architecture, January 22-25, 1995, Raleigh, N.C., 78-89.
Farkas, K. I., P. Chow, N. P. Jouppi, and Z. Vranesic [1997]. "Memory-system design considerations for dynamically-scheduled processors," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo., 133-143.
Fazio, D. [1987]. "It's really much more fun building a supercomputer than it is simply inventing one," Proc. IEEE COMPCON, February 23-27, 1987, San Francisco, 102-105.
Fisher, J. A. [1981]. "Trace scheduling: A technique for global microcode compaction," IEEE Trans. on Computers 30:7 (July), 478-490.
Fisher, J. A. [1983]. "Very long instruction word architectures and ELI-512," 10th Annual Int'l. Symposium on Computer Architecture (ISCA), June 5-7, 1983, Stockholm, Sweden, 140-150.
Fisher, J. A., and S. M. Freudenberger [1992]. "Predicting conditional branches from previous runs of a program," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 12-15, 1992, Boston, Mass., 85-95.
Fisher, J. A., and B. R. Rau [1993]. Journal of Supercomputing, January (special issue).
Fisher, J. A., J. R. Ellis, J. C. Ruttenberg, and A. Nicolau [1984]. "Parallel processing: A smart compiler and a dumb processor," Proc. SIGPLAN Conf. on Compiler Construction, June 17-22, 1984, Montreal, Canada, 11-16.
Fleming, P. J., and J. J. Wallace [1986]. "How not to lie with statistics: The correct way to summarize benchmark results," Communications of the ACM 29:3 (March), 218-221.
Flynn, M. J. [1966]. "Very high-speed computing systems," Proc. IEEE 54:12 (December), 1901-1909.
Forgie, J. W. [1957]. "The Lincoln TX-2 input-output system," Proc. Western Joint Computer Conference (February), Institute of Radio Engineers, Los Angeles, 156-160.
Foster, C. C., and E. M. Riseman [1972]. "Percolation of code to enhance parallel dispatching and execution," IEEE Trans. on Computers C-21:12 (December), 1411-1415.
Frank, S. J. [1984]. "Tightly coupled multiprocessor systems speed memory access time," Electronics 57:1 (January), 164-169.
Freiman, C. V. [1961]. "Statistical analysis of certain binary division algorithms," Proc. IRE 49:1, 91-103.
Friesenborg, S. E., and R. J. Wicks [1985]. DASD Expectations: The 3380, 3380-23, and MVS/XA, Tech. Bulletin GG22-9363-02, IBM Washington Systems Center, Gaithersburg, Md.
Fuller, S. H., and W. E. Burr [1977]. "Measurement and evaluation of alternative computer architectures," Computer 10:10 (October), 24-35.
Furber, S. B. [1996]. ARM System Architecture, Addison-Wesley, Harlow, England (see www.cs.man.ac.uk/amulet/publications/books/ARMsysArch).
Gagliardi, U. O. [1973]. "Report of workshop 4--software-related advances in computer hardware," Proc. Symposium on the High Cost of Software, September 17-19, 1973, Monterey, Calif., 99-120.
Gajski, D., D. Kuck, D. Lawrie, and A. Sameh [1983]. "CEDAR--a large scale multiprocessor," Proc. Int'l. Conf. on Parallel Processing (ICPP), August, Columbus, Ohio, 524-529.
Gallagher, D. M., W. Y. Chen, S. A. Mahlke, J. C. Gyllenhaal, and W. W. Hwu [1994]. "Dynamic memory disambiguation using the memory conflict buffer," Proc. Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 4-7, San Jose, Calif., 183-193.
Galles, M. [1996]. "Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip," Proc. IEEE HOT Interconnects '96, August 15-17, 1996, Stanford University, Palo Alto, Calif.
Game, M., and A. Booker [1999]. "CodePack code compression for PowerPC processors," MicroNews, 5:1, www.chips.ibm.com/micronews/vol5_no1/codepack.html.
Gao, Q. S. [1993]. "The Chinese remainder theorem and the prime memory system," 20th Annual Int'l. Symposium on Computer Architecture (ISCA), May 16-19, 1993, San Diego, Calif. (Computer Architecture News 21:2 (May), 337-340).
Gap. [2005]. "Gap Inc. Reports Third Quarter Earnings," http://gapinc.com/public/documents/PR_Q405EarningsFeb2306.pdf.
Gap. [2006]. "Gap Inc. Reports Fourth Quarter and Full Year Earnings," http://gapinc.com/public/documents/Q32005PressRelease_Final22.pdff.
Garner, R., A. Agarwal, F. Briggs, E. Brown, D. Hough, B. Joy, S. Kleiman, S. Muchnick, M. Namjoo, D. Patterson, J. Pendleton, and R. Tuck [1988]. "Scalable processor architecture (SPARC)," Proc. IEEE COMPCON, February 29-March 4, 1988, San Francisco, 278-283.
Gebis, J., and D. Patterson [2007]. "Embracing and extending 20th-century instruction set architectures," IEEE Computer 40:4 (April), 68-75.
Gee, J. D., M. D. Hill, D. N. Pnevmatikatos, and A. J. Smith [1993]. "Cache performance of the SPEC92 benchmark suite," IEEE Micro 13:4 (August), 17-27.
Gehringer, E. F., D. P. Siewiorek, and Z. Segall [1987]. Parallel Processing: The Cm* Experience, Digital Press, Bedford, Mass.
Gharachorloo, K., A. Gupta, and J. L. Hennessy [1992]. "Hiding memory latency using dynamic scheduling in shared-memory multiprocessors," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA), May 19-21, 1992, Gold Coast, Australia.
Gharachorloo, K., D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. L. Hennessy [1990]. "Memory consistency and event ordering in scalable shared-memory multiprocessors," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA), May 28-31, 1990, Seattle, Wash., 15-26.
Ghemawat, S., H. Gobioff, and S.-T. Leung [2003]. "The Google file system," Proc. 19th ACM Symposium on Operating Systems Principles, October 19-22, 2003, Bolton Landing, N.Y.
Gibson, D. H. [1967]. "Considerations in block-oriented systems design," AFIPS Conf. Proc. 30, 75-80.
Gibson, G. A. [1992]. Redundant Disk Arrays: Reliable, Parallel Secondary Storage, ACM Distinguished Dissertation Series, MIT Press, Cambridge, Mass.
Gibson, J. C. [1970]. "The Gibson mix," Rep. TR. 00.2043, IBM Systems Development Division, Poughkeepsie, N.Y. (research done in 1959).
Gibson, J., R. Kunz, D. Ofelt, M. Horowitz, J. Hennessy, and M. Heinrich [2000]. "FLASH vs. (simulated) FLASH: Closing the simulation loop," Proc. Ninth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), November 12-15, Cambridge, Mass., 49-58.
Glass, C. J., and L. M. Ni [1992]. "The Turn Model for adaptive routing," 19th Annual Int'l. Symposium on Computer Architecture (ISCA), May 19-21, 1992, Gold Coast, Australia.
Goldberg, D. [1991]. "What every computer scientist should know about floating-point arithmetic," Computing Surveys 23:1, 5-48.
Goldberg, I. B. [1967]. "27 bits are not enough for 8-digit accuracy," Communications of the ACM 10:2, 105-106.
Goldstein, S. [1987]. Storage Performance--An Eight Year Outlook, Tech. Rep. TR 03.308-1, IBM Santa Teresa Laboratory, San Jose, Calif.
Goldstine, H. H. [1972]. The Computer: From Pascal to von Neumann, Princeton University Press, Princeton, N. J.
Gonzalez, J., and A. González [1998]. "Limits of instruction level parallelism with data speculation," Proc. Vector and Parallel Processing (VECPAR) Conf., June 21-23, 1998, Porto, Portugal, 585-598.
Goodman, J. R. [1983]. "Using cache memory to reduce processor memory traffic," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA), June 5-7, 1983, Stockholm, Sweden, 124-131.
Goralski, W. [1997]. SONET: A Guide to Synchronous Optical Network, McGraw-Hill, New York.
Gosling, J. B. [1980]. Design of Arithmetic Units for Digital Computers, Springer-Verlag, New York.
Gray, J. [1990]. "A census of Tandem system availability between 1985 and 1990," IEEE Trans. on Reliability, 39:4 (October), 409-418.
Gray, J. (ed.) [1993]. The Benchmark Handbook for Database and Transaction Processing Systems, 2nd ed., Morgan Kaufmann, San Francisco.
Gray, J. [2006]. Sort benchmark home page, http://sortbenchmark.org/.
Gray, J., and A. Reuter [1993]. Transaction Processing: Concepts and Techniques, Morgan Kaufmann, San Francisco.
Gray, J., and D. P. Siewiorek [1991]. "High-availability computer systems," Computer 24:9 (September), 39-48.
Gray, J., and C. van Ingen [2005]. Empirical Measurements of Disk Failure Rates and Error Rates, MSR-TR-2005-166, Microsoft Research, Redmond, Wash.
Greenberg, A., N. Jain, S. Kandula, C. Kim, P. Lahiri, D. Maltz, P. Patel, and S. Sengupta [2009]. "VL2: A Scalable and Flexible Data Center Network," in Proc. ACM SIGCOMM, August 17-21, 2009, Barcelona, Spain.
Grice, C., and M. Kanellos [2000]. "Cell phone industry at crossroads: Go high or low?," CNET News, August 31, technews.netscape.com/news/0-1004-201-2518386-0.html?tag=st.ne.1002.tgif.sf.
Groe, J. B., and L. E. Larson [2000]. CDMA Mobile Radio Design, Artech House, Boston.
Gunther, K. D. [1981]. "Prevention of deadlocks in packet-switched data transport systems," IEEE Trans. on Communications COM-29:4 (April), 512-524.
Hagersten, E., and M. Koster [1998]. "WildFire: A scalable path for SMPs," Proc. Fifth Int'l. Symposium on High-Performance Computer Architecture, January 9-12, 1999, Orlando, Fla.
Hagersten, E., A. Landin, and S. Haridi [1992]. "DDM--a cache-only memory architecture," IEEE Computer 25:9 (September), 44-54.
Hamacher, V. C., Z. G. Vranesic, and S. G. Zaky [1984]. Computer Organization, 2nd ed., McGraw-Hill, New York.
Hamilton, J. [2009]. "Data center networks are in my way," paper presented at the Stanford Clean Slate CTO Summit, October 23, 2009 (http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_CleanSlateCTO2009.pdf).
Hamilton, J. [2010]. "Cloud computing economies of scale," paper presented at the AWS Workshop on Genomics and Cloud Computing, June 8, 2010, Seattle, Wash. (http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_GenomicsCloud20100608.pdf).
Handy, J. [1993]. The Cache Memory Book, Academic Press, Boston.
Hauck, E. A., and B. A. Dent [1968]. "Burroughs' B6500/B7500 stack mechanism," Proc. AFIPS Spring Joint Computer Conf., April 30-May 2, 1968, Atlantic City, N. J., 245-251.
Heald, R., K. Aingaran, C. Amir, M. Ang, M. Boland, A. Das, P. Dixit, G. Gouldsberry, J. Hart, T. Horel, W.-J. Hsu, J. Kaku, C. Kim, S. Kim, F. Klass, H. Kwan, R. Lo, H. McIntyre, A. Mehta, D. Murata, S. Nguyen, Y.-P. Pai, S. Patel, K. Shin, K. Tam, S. Vishwanthaiah, J. Wu, G. Yee, and H. You [2000]. "Implementation of third-generation SPARC V9 64-b microprocessor," ISSCC Digest of Technical Papers, 412-413 and slide supplement.
Heinrich, J. [1993]. MIPS R4000 User's Manual, Prentice Hall, Englewood Cliffs, N. J.
Henly, M., and B. McNutt [1989]. "DASD I/O Characteristics: A Comparison of MVS to VM," Tech. Rep. TR 02.1550 (May), IBM General Products Division, San Jose, Calif.
Hennessy, J. [1984]. "VLSI processor architecture," IEEE Trans. on Computers C-33:11 (December), 1221-1246.
Hennessy, J. [1985]. "VLSI RISC processors," VLSI Systems Design 6:10 (October), 22-32.
Hennessy, J., N. Jouppi, F. Baskett, and J. Gill [1981]. "MIPS: A VLSI processor architecture," in CMU Conference on VLSI Systems and Computations, Computer Science Press, Rockville, Md.
Hewlett-Packard. [1994]. PA-RISC 2.0 Architecture Reference Manual, 3rd ed., Hewlett-Packard, Palo Alto, Calif.
Hewlett-Packard. [1998]. "HP's '5NINES:5MINUTES' Vision Extends Leadership and Redefines High Availability in Mission-Critical Environments," February 10, www.future.enterprisecomputing.hp.com/ia64/news/5nines_vision_pr.html.
Hill, M. D. [1987]. "Aspects of Cache Memory and Instruction Buffer Performance," Ph. D. thesis, Tech. Rep. UCB/CSD 87/381, Computer Science Division, University of California, Berkeley.
Hill, M. D. [1988]. "A case for direct mapped caches," Computer 21:12 (December), 25-40.
Hill, M. D. [1998]. "Multiprocessors should support simple memory consistency models," IEEE Computer 31:8 (August), 28-34.
Hillis, W. D. [1985]. The Connection Machine, MIT Press, Cambridge, Mass.
Hillis, W. D., and G. L. Steele [1986]. "Data parallel algorithms," Communications of the ACM 29:12 (December), 1170-1183 (http://doi.acm.org/10.1145/7902.7903).
Hinton, G., D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel [2001]. "The microarchitecture of the Pentium 4 processor," Intel Technology Journal, February.
Hintz, R. G., and D. P. Tate [1972]. "Control data STAR-100 processor design," Proc. IEEE COMPCON, September 12-14, 1972, San Francisco, 1-4.
Hirata, H., K. Kimura, S. Nagamine, Y. Mochizuki, A. Nishimura, Y. Nakase, and T. Nishizawa [1992]. "An elementary processor architecture with simultaneous instruction issuing from multiple threads," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA), May 19-21, 1992, Gold Coast, Australia, 136-145.
Hitachi. [1997]. SuperH RISC Engine SH7700 Series Programming Manual, Hitachi, Santa Clara, Calif. (see www.halsp.hitachi.com/tech_prod/ and search for title).
Ho, R., K. W. Mai, and M. A. Horowitz [2001]. "The future of wires," Proc. of the IEEE 89:4 (April), 490-504.
Hoagland, A. S. [1963]. Digital Magnetic Recording, Wiley, New York.
Hockney, R. W., and C. R. Jesshope [1988]. Parallel Computers 2: Architectures, Programming and Algorithms, Adam Hilger, Ltd., Bristol, England.
Holland, J. H. [1959]. "A universal computer capable of executing an arbitrary number of subprograms simultaneously," Proc. East Joint Computer Conf. 16, 108-113.
Holt, R. C. [1972]. "Some deadlock properties of computer systems," ACM Computer Surveys 4:3 (September), 179-196.
Hopkins, M. [2000]. "A critical look at IA-64: Massive resources, massive ILP, but can it deliver?" Microprocessor Report, February.
Hord, R. M. [1982]. The Illiac-IV, The First Supercomputer, Computer Science Press, Rockville, Md.
Horel, T., and G. Lauterbach [1999]. "UltraSPARC-III: Designing third-generation 64-bit performance," IEEE Micro 19:3 (May-June), 73-85.
Hospodor, A. D., and A. S. Hoagland [1993]. "The changing nature of disk controllers," Proc. IEEE 81:4 (April), 586-594.
Holzle, U. [2010]. "Brawny cores still beat wimpy cores, most of the time," IEEE Micro 30:4 (July/August).
Hristea, C., D. Lenoski, and J. Keen [1997]. "Measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks," Proc. ACM/IEEE Conf. on Supercomputing, November 16-21, 1997, San Jose, Calif.
Hsu, P. [1994]. "Designing the TFP microprocessor," IEEE Micro 14:2 (April), 23-33.
Huck, J. et al. [2000]. "Introducing the IA-64 Architecture," IEEE Micro, 20:5 (September-October), 12-23.
Hughes, C. J., P. Kaul, S. V. Adve, R. Jain, C. Park, and J. Srinivasan [2001]. "Variability in the execution of multimedia applications and implications for architecture," Proc. 28th Annual Int'l. Symposium on Computer Architecture (ISCA), June 30-July 4, 2001, Goteborg, Sweden, 254-265.
Hwang, K. [1979]. Computer Arithmetic: Principles, Architecture, and Design, Wiley, New York.
Hwang, K. [1993]. Advanced Computer Architecture and Parallel Programming, McGraw-Hill, New York.
Hwu, W.-M., and Y. Patt [1986]. "HPSm, a high performance restricted data flow architecture having minimum functionality," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-5, 1986, Tokyo, 297-307.
Hwu, W. W., S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. O. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery [1993]. "The superblock: An effective technique for VLIW and superscalar compilation," J. Supercomputing 7:1, 2 (March), 229-248.
IBM. [1982]. The Economic Value of Rapid Response Time, GE20-0752-0, IBM, White Plains, N.Y., 11-82.
IBM. [1990]. "The IBM RISC System/6000 processor" (collection of papers), IBM J. Research and Development 34:1 (January).
IBM. [1994]. The PowerPC Architecture, Morgan Kaufmann, San Francisco.
IBM. [2005]. "Blue Gene," IBM J. Research and Development, 49:2/3 (special issue).
IEEE. [1985]. "IEEE standard for binary floating-point arithmetic," SIGPLAN Notices 22:2, 9-25.
IEEE. [2005]. "Intel virtualization technology," IEEE Computer 38:5 (May), 48-56.
IEEE. 754-2008 Working Group. [2006]. "DRAFT Standard for Floating-Point Arithmetic 754-2008," http://dx.doi.org/10.1109/IEEESTD.2008.4610935.
Imprimis Product Specification, 97209 Sabre Disk Drive IPI-2 Interface 1.2 GB, Document No. 64402302, Imprimis, Dallas, Tex.
InfiniBand Trade Association. [2001]. InfiniBand Architecture Specifications Release 1.0.a, www.infinibandta.org.
Intel. [2001]. "Using MMX Instructions to Convert RGB to YUV Color Conversion," cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Legacy::irtm_AP548_9996& cntType=IDS_ EDITORIAL.
Internet Retailer. [2005]. "The Gap launches a new site--after two weeks of downtime," Internet® Retailer, September 28, http://www.internetretailer.com/2005/09/28/thegap-launches-a-new-site-after-two-weeks-of-downtime.
Jain, R. [1991]. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, Wiley, New York.
Jantsch, A., and H. Tenhunen (eds.) [2003]. Networks on Chips, Kluwer Academic Publishers, The Netherlands.
Jimenez, D. A., and C. Lin [2002]. "Neural methods for dynamic branch prediction," ACM Trans. on Computer Systems 20:4 (November), 369-397.
Johnson, M. [1990]. Superscalar Microprocessor Design, Prentice Hall, Englewood Cliffs, N. J.
Jordan, H. F. [1983]. "Performance measurements on HEP--a pipelined MIMD computer," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA), June 5-7, 1983, Stockholm, Sweden, 207-212.
Jordan, K. E. [1987]. "Performance comparison of large-scale scientific processors: Scalar mainframes, mainframes with vector facilities, and supercomputers," Computer 20:3 (March), 10-23.
Jouppi, N. P. [1990]. "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA), May 28-31, 1990, Seattle, Wash., 364-373.
Jouppi, N. P. [1998]. "Retrospective: Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," 25 Years of the International Symposia on Computer Architecture (Selected Papers), ACM, New York, 71-73.
Jouppi, N. P., and D. W. Wall [1989]. "Available instruction-level parallelism for superscalar and superpipelined processors," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 3-6, 1989, Boston, 272-282.
Jouppi, N. P., and S. J. E. Wilton [1994]. "Trade-offs in two-level on-chip caching," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA), April 18-21, 1994, Chicago, 34-45.
Kaeli, D. R., and P. G. Emma [1991]. "Branch history table prediction of moving target branches due to subroutine returns," Proc. 18th Annual Int'l. Symposium on Computer Architecture (ISCA), May 27-30, 1991, Toronto, Canada, 34-42.
Kahan, J. [1990]. "On the advantage of the 8087's stack," unpublished course notes, Computer Science Division, University of California, Berkeley.
Kahan, W. [1968]. "7094-II system support for numerical analysis," SHARE Secretarial Distribution SSD-159, Department of Computer Science, University of Toronto.
Kahaner, D. K. [1988]. "Benchmarks for 'real' programs," SIAM News, November.
Kahn, R. E. [1972]. "Resource-sharing computer communication networks," Proc. IEEE 60:11 (November), 1397-1407.
Kane, G. [1986]. MIPS R2000 RISC Architecture, Prentice Hall, Englewood Cliffs, N. J.
Kane, G. [1996]. PA-RISC 2.0 Architecture, Prentice Hall, Upper Saddle River, N. J.
Kane, G., and J. Heinrich [1992]. MIPS RISC Architecture, Prentice Hall, Englewood Cliffs, N. J.
Katz, R. H., D. A. Patterson, and G. A. Gibson [1989]. "Disk system architectures for high performance computing," Proc. IEEE 77:12 (December), 1842-1858.
Keckler, S. W., and W. J. Dally [1992]. "Processor coupling: Integrating compile time and runtime scheduling for parallelism," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA), May 19-21, 1992, Gold Coast, Australia, 202-213.
Keller, R. M. [1975]. "Look-ahead processors," ACM Computing Surveys 7:4 (December), 177-195.
Keltcher, C. N., K. J. McGrath, A. Ahmed, and P. Conway [2003]. "The AMD Opteron processor for multiprocessor servers," IEEE Micro 23:2 (March-April), 66-76 (dx.doi.org/10.1109. MM.2003.119116).
Kembel, R. [2000]. "Fibre Channel: A comprehensive introduction," Internet Week, April.
Kermani, P., and L. Kleinrock [1979]. "Virtual Cut-Through: A New Computer Communication Switching Technique," Computer Networks 3 (January), 267-286.
Kessler, R. [1999]. "The Alpha 21264 microprocessor," IEEE Micro 19:2 (March/April), 24-36.
Kilburn, T., D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner [1962]. "One-level storage system," IRE Trans. on Electronic Computers EC-11 (April), 223-235. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples, McGraw-Hill, New York, 1982, 135-148.
Killian, E. [1991]. "MIPS R4000 technical overview-64 bits/100 MHz or bust," Hot Chips III Symposium Record , August 26-27, 1991, Stanford University, Palo Alto, Calif., 1.6-1.19.Google Scholar
Kim, M. Y. [1986]. "Synchronized disk interleaving," IEEE Trans. on Computers C-35:11 (November), 978-988. Google ScholarDigital Library
Kissell, K. D. [1997]. "MIPS16: High-density for the embedded market," Proc. Real Time Systems '97 , June 15, 1997, Las Vegas, Nev. (see www.sgi.com/MIPS/arch/MIPS16/MIPS16.whitepaper.pdf).Google Scholar
Kitagawa, K., S. Tagaya, Y. Hagihara, and Y. Kanoh [2003]. "A hardware overview of SX- 6 and SX-7 supercomputer," NEC Research & Development J. 44:1 (January), 2-7.Google Scholar
Knuth, D. [1981]. The Art of Computer Programming , Vol. II, 2nd ed., Addison-Wesley, Reading, Mass.Google Scholar
Kogge, P. M. [1981]. The Architecture of Pipelined Computers , McGraw-Hill, New York.Google Scholar
Kohn, L., and S.-W. Fu [1989]. "A 1,000,000 transistor microprocessor," Proc. of IEEE Int'l. Symposium on Solid State Circuits (ISSCC) , February 15-17, 1989, New York, 54-55.Google Scholar
Kohn, L., and N. Margulis [1989]. "Introducing the Intel i860 64-Bit Microprocessor," IEEE Micro , 9:4 (July), 15-30. Google ScholarDigital Library
Kontothanassis, L., G. Hunt, R. Stets, N. Hardavellas, M. Cierniak, S. Parthasarathy, W. Meira, S. Dwarkadas, and M. Scott [1997]. "VM-based shared memory on low-latency, remote-memory-access networks," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo.
Koren, I. [1989]. Computer Arithmetic Algorithms, Prentice Hall, Englewood Cliffs, N. J.
Kozyrakis, C. [2000]. "Vector IRAM: A media-oriented vector processor with embedded DRAM," paper presented at Hot Chips 12, August 13-15, 2000, Palo Alto, Calif., 13-15.
Kozyrakis, C., and D. Patterson [2002]. "Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks," Proc. 35th Annual Int'l. Symposium on Microarchitecture (MICRO-35), November 18-22, 2002, Istanbul, Turkey.
Kroft, D. [1981]. "Lockup-free instruction fetch/prefetch cache organization," Proc. Eighth Annual Int'l. Symposium on Computer Architecture (ISCA) , May 12-14, 1981, Minneapolis, Minn., 81-87. Google Scholar
Kroft, D. [1998]. "Retrospective: Lockup-free instruction fetch/prefetch cache organization," 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 20-21. Google ScholarDigital Library
Kuck, D., P. P. Budnik, S.-C. Chen, D. H. Lawrie, R. A. Towle, R. E. Strebendt, E. W. Davis, Jr., J. Han, P. W. Kraska, and Y. Muraoka [1974]. "Measurements of parallelism in ordinary FORTRAN programs," Computer 7:1 (January), 37-46.Google ScholarCross Ref
Kuhn, D. R. [1997]. "Sources of failure in the public switched telephone network," IEEE Computer 30:4 (April), 31-36. Google ScholarDigital Library
Kumar, A. [1997]. "The HP PA-8000 RISC CPU," IEEE Micro 17:2 (March/April), 27-32. Google ScholarDigital Library
Kunimatsu, A., N. Ide, T. Sato, Y. Endo, H. Murakami, T. Kamei, M. Hirano, F. Ishihara, H. Tago, M. Oka, A. Ohba, T. Yutaka, T. Okada, and M. Suzuoki [2000]. "Vector unit architecture for emotion synthesis," IEEE Micro 20:2 (March-April), 40-47. Google ScholarDigital Library
Kunkel, S. R., and J. E. Smith [1986]. "Optimal pipelining in supercomputers," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo, 404-414. Google Scholar
Kurose, J. F., and K. W. Ross [2001]. Computer Networking: A Top-Down Approach Featuring the Internet , Addison-Wesley, Boston. Google Scholar
Kuskin, J., D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, and J. L. Hennessy [1994]. "The Stanford FLASH multiprocessor," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago. Google ScholarDigital Library
Lam, M. [1988]. "Software pipelining: An effective scheduling technique for VLIW processors," SIGPLAN Conf. on Programming Language Design and Implementation , June 22-24, 1988, Atlanta, Ga., 318-328. Google Scholar
Lam, M. S., and R. P. Wilson [1992]. "Limits of control flow on parallelism," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 46-57. Google Scholar
Lam, M. S., E. E. Rothberg, and M. E. Wolf [1991]. "The cache performance and optimizations of blocked algorithms," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Santa Clara, Calif. ( SIGPLAN Notices 26:4 (April), 63-74). Google Scholar
Lambright, D. [2000]. "Experiences in measuring the reliability of a cache-based storage system," Proc. of First Workshop on Industrial Experiences with Systems Software (WIESS 2000), Co-Located with the 4th Symposium on Operating Systems Design and Implementation (OSDI) , October 22, 2000, San Diego, Calif. Google Scholar
Lamport, L. [1979]. "How to make a multiprocessor computer that correctly executes multiprocess programs," IEEE Trans. on Computers C-28:9 (September), 241-248. Google Scholar
Lang, W., J. M. Patel, and S. Shankar [2010]. "Wimpy node clusters: What about non-wimpy workloads?" Proc. Sixth International Workshop on Data Management on New Hardware (DaMoN) , June 7, Indianapolis, Ind. Google Scholar
Laprie, J.-C. [1985]. "Dependable computing and fault tolerance: Concepts and terminology," Proc. 15th Annual Int'l. Symposium on Fault-Tolerant Computing , June 19-21, 1985, Ann Arbor, Mich., 2-11.Google Scholar
Larson, E. R. [1973]. "Findings of fact, conclusions of law, and order for judgment," File No. 4-67, Civ. 138, Honeywell v. Sperry-Rand and Illinois Scientific Development , U. S. District Court for the State of Minnesota, Fourth Division (October 19).Google Scholar
Laudon, J., and D. Lenoski [1997]. "The SGI Origin: A ccNUMA highly scalable server," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo., 241-251. Google Scholar
Laudon, J., A. Gupta, and M. Horowitz [1994]. "Interleaving: A multithreading technique targeting multiprocessors and workstations," Proc. Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 4-7, San Jose, Calif., 308-318. Google Scholar
Lauterbach, G., and T. Horel [1999]. "UltraSPARC-III: Designing third generation 64-bit performance," IEEE Micro 19:3 (May/June). Google Scholar
Lazowska, E. D., J. Zahorjan, G. S. Graham, and K. C. Sevcik [1984]. Quantitative System Performance: Computer System Analysis Using Queueing Network Models , Prentice Hall, Englewood Cliffs, N. J. (Although out of print, it is available online at www.cs.washington.edu/homes/lazowska/qsp/.) Google Scholar
Lebeck, A. R., and D. A. Wood [1994]. "Cache profiling and the SPEC benchmarks: A case study," Computer 27:10 (October), 15-26. Google ScholarDigital Library
Lee, R. [1989]. "Precision architecture," Computer 22:1 (January), 78-91. Google ScholarDigital Library
Lee, W. V. et al. [2010]. "Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU," Proc. 37th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 19-23, 2010, Saint-Malo, France. Google Scholar
Leighton, F. T. [1992]. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes, Morgan Kaufmann, San Francisco.
Leiner, A. L. [1954]. "System specifications for the DYSEAC," J. ACM 1:2 (April), 57-81. Google ScholarDigital Library
Leiner, A. L., and S. N. Alexander [1954]. "System organization of the DYSEAC," IRE Trans. of Electronic Computers EC-3:1 (March), 1-10.Google Scholar
Leiserson, C. E. [1985]. "Fat trees: Universal networks for hardware-efficient supercomputing," IEEE Trans. on Computers C-34:10 (October), 892-901. Google ScholarCross Ref
Lenoski, D., J. Laudon, K. Gharachorloo, A. Gupta, and J. L. Hennessy [1990]. "The Stanford DASH multiprocessor," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 148-159.Google ScholarCross Ref
Lenoski, D., J. Laudon, K. Gharachorloo, W.-D. Weber, A. Gupta, J. L. Hennessy, M. A. Horowitz, and M. Lam [1992]. "The Stanford DASH multiprocessor," IEEE Computer 25:3 (March), 63-79. Google ScholarDigital Library
Levy, H., and R. Eckhouse [1989]. Computer Programming and Architecture: The VAX , Digital Press, Boston. Google Scholar
Li, K. [1988]. "IVY: A shared virtual memory system for parallel computing," Proc. 1988 Int'l. Conf. on Parallel Processing , Pennsylvania State University Press, University Park, Penn.Google Scholar
Li, S., K. Chen, J. B. Brockman, and N. Jouppi [2011]. "Performance Impacts of Nonblocking Caches in Out-of-order Processors," HP Labs Tech Report HPL-2011-65 (full text available at http://Library.hp.com/techpubs/2011/Hpl-2011-65.html).Google Scholar
Lim, K., P. Ranganathan, J. Chang, C. Patel, T. Mudge, and S. Reinhardt [2008]. "Understanding and designing new system architectures for emerging warehouse-computing environments," Proc. 35th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 21-25, 2008, Beijing, China. Google Scholar
Lincoln, N. R. [1982]. "Technology and design trade offs in the creation of a modern supercomputer," IEEE Trans. on Computers C-31:5 (May), 363-376. Google Scholar
Lindholm, T., and F. Yellin [1999]. The Java Virtual Machine Specification , 2nd ed., Addison-Wesley, Reading, Mass. (also available online at java.sun.com/docs/books/vmspec/). Google Scholar
Lipasti, M. H., and J. P. Shen [1996]. "Exceeding the dataflow limit via value prediction," Proc. 29th Int'l. Symposium on Microarchitecture , December 2-4, 1996, Paris, France. Google Scholar
Lipasti, M. H., C. B. Wilkerson, and J. P. Shen [1996]. "Value locality and load value prediction," Proc. Seventh Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass., 138-147. Google ScholarDigital Library
Liptay, J. S. [1968]. "Structural aspects of the System/360 Model 85, Part II: The cache," IBM Systems J. 7:1, 15-21. Google ScholarDigital Library
Lo, J., L. Barroso, S. Eggers, K. Gharachorloo, H. Levy, and S. Parekh [1998]. "An analysis of database workload performance on simultaneous multithreaded processors," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 39-50. Google Scholar
Lo, J., S. Eggers, J. Emer, H. Levy, R. Stamm, and D. Tullsen [1997]. "Converting thread-level parallelism into instruction-level parallelism via simultaneous multithreading," ACM Trans. on Computer Systems 15:2 (August), 322-354.
Lovett, T., and S. Thakkar [1988]. "The Symmetry multiprocessor system," Proc. 1988 Int'l. Conf. of Parallel Processing , University Park, Penn., 303-310.Google Scholar
Lubeck, O., J. Moore, and R. Mendez [1985]. "A benchmark comparison of three supercomputers: Fujitsu VP-200, Hitachi S810/20, and Cray X-MP/2," Computer 18:12 (December), 10-24. Google ScholarDigital Library
Luk, C.-K., and T. C. Mowry [1999]. "Automatic compiler-inserted prefetching for pointer-based applications," IEEE Trans. on Computers 48:2 (February), 134-141.
Lunde, A. [1977]. "Empirical evaluation of some features of instruction set processor architecture," Communications of the ACM 20:3 (March), 143-152. Google ScholarDigital Library
Luszczek, P., J. J. Dongarra, D. Koester, R. Rabenseifner, B. Lucas, J. Kepner, J. McCalpin, D. Bailey, and D. Takahashi [2005]. "Introduction to the HPC challenge benchmark suite," Lawrence Berkeley National Laboratory, Paper LBNL-57493 (April 25), repositories.cdlib.org/lbnl/LBNL-57493.Google Scholar
Maberly, N. C. [1966]. Mastering Speed Reading , New American Library, New York.Google Scholar
Magenheimer, D. J., L. Peters, K. W. Pettis, and D. Zuras [1988]. "Integer multiplication and division on the HP precision architecture," IEEE Trans. on Computers 37:8, 980-990. Google ScholarDigital Library
Mahlke, S. A., W. Y. Chen, W.-M. Hwu, B. R. Rau, and M. S. Schlansker [1992]. "Sentinel scheduling for VLIW and superscalar processors," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, 238-247. Google Scholar
Mahlke, S. A., R. E. Hank, J. E. McCormick, D. I. August, and W. W. Hwu [1995]. "A comparison of full and partial predicated execution support for ILP processors," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy, 138-149. Google Scholar
Major, J. B. [1989]. "Are queuing models within the grasp of the unwashed?," Proc. Int'l. Conf. on Management and Performance Evaluation of Computer Systems , December 11-15, 1989, Reno, Nev., 831-839.Google Scholar
Markstein, P. W. [1990]. "Computation of elementary functions on the IBM RISC System/6000 processor," IBM J. Research and Development 34:1, 111-119. Google ScholarDigital Library
Mathis, H. M., A. E. Mericas, J. D. McCalpin, R. J. Eickemeyer, and S. R. Kunkel [2005]. "Characterization of the multithreading (SMT) efficiency in Power5," IBM J. Research and Development, 49:4/5 (July/September), 555-564.
McCalpin, J. [2005]. "STREAM: Sustainable Memory Bandwidth in High Performance Computers," www.cs.virginia.edu/stream/.Google Scholar
McCalpin, J., D. Bailey, and D. Takahashi [2005]. Introduction to the HPC Challenge Benchmark Suite , Paper LBNL-57493 Lawrence Berkeley National Laboratory, University of California, Berkeley, repositories.cdlib.org/lbnl/LBNL-57493.Google Scholar
McCormick, J., and A. Knies [2002]. "A brief analysis of the SPEC CPU2000 benchmarks on the Intel Itanium 2 processor," paper presented at Hot Chips 14, August 18-20, 2002, Stanford University, Palo Alto, Calif.Google Scholar
McFarling, S. [1989]. "Program optimization for instruction caches," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 183-191. Google Scholar
McFarling, S. [1993]. Combining Branch Predictors , WRL Technical Note TN-36, Digital Western Research Laboratory, Palo Alto, Calif.Google Scholar
McFarling, S., and J. Hennessy [1986]. "Reducing the cost of branches," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo, 396-403. Google Scholar
McGhan, H., and M. O'Connor [1998]. "PicoJava: A direct execution engine for Java bytecode," Computer 31:10 (October), 22-30. Google ScholarDigital Library
McKeeman, W. M. [1967]. "Language directed computer design," Proc. AFIPS Fall Joint Computer Conf. , November 14-16, 1967, Washington, D.C., 413-417. Google Scholar
McMahon, F. M. [1986]. "The Livermore FORTRAN Kernels: A Computer Test of Numerical Performance Range," Tech. Rep. UCRL-55745, Lawrence Livermore National Laboratory, University of California, Livermore.
McNairy, C., and D. Soltis [2003]. "Itanium 2 processor microarchitecture," IEEE Micro 23:2 (March-April), 44-55. Google ScholarDigital Library
Mead, C., and L. Conway [1980]. Introduction to VLSI Systems , Addison-Wesley, Reading, Mass. Google Scholar
Mellor-Crummey, J. M., and M. L. Scott [1991]. "Algorithms for scalable synchronization on shared-memory multiprocessors," ACM Trans. on Computer Systems 9:1 (February), 21-65. Google ScholarDigital Library
Menabrea, L. F. [1842]. "Sketch of the analytical engine invented by Charles Babbage," Bibliothèque Universelle de Genève , 82 (October).Google Scholar
Menon, A., J. Renato Santos, Y. Turner, G. Janakiraman, and W. Zwaenepoel [2005]. "Diagnosing performance overheads in the xen virtual machine environment," Proc. First ACM/USENIX Int'l. Conf. on Virtual Execution Environments , June 11-12, 2005, Chicago, 13-23. Google Scholar
Merlin, P. M., and P. J. Schweitzer [1980]. "Deadlock avoidance in store-and-forward networks. Part I. Store-and-forward deadlock," IEEE Trans. on Communications COM-28:3 (March), 345-354.Google ScholarCross Ref
Metcalfe, R. M. [1993]. "Computer/network interface design: Lessons from Arpanet and Ethernet," IEEE J. on Selected Areas in Communications 11:2 (February), 173-180. Google ScholarDigital Library
Metcalfe, R. M., and D. R. Boggs [1976]. "Ethernet: Distributed packet switching for local computer networks," Communications of the ACM 19:7 (July), 395-404. Google ScholarDigital Library
Metropolis, N., J. Howlett, and G. C. Rota (eds.) [1980]. A History of Computing in the Twentieth Century , Academic Press, New York. Google Scholar
Meyer, R. A., and L. H. Seawright [1970]. "A virtual machine time sharing system," IBM Systems J. 9:3, 199-218.
Meyers, G. J. [1978]. "The evaluation of expressions in a storage-to-storage architecture," Computer Architecture News 7:3 (October), 20-23. Google Scholar
Meyers, G. J. [1982]. Advances in Computer Architecture, 2nd ed., Wiley, New York.
Micron. [2004]. "Calculating Memory System Power for DDR2," http://download.micron.com/pdf/pubs/designline/dl1Q04.pdf.
Micron. [2006]. "The Micron® System-Power Calculator," http://www.micron.com/systemcalc.Google Scholar
MIPS. [1997]. "MIPS16 Application Specific Extension Product Description," www.sgi.com/MIPS/arch/MIPS16/mips16.pdf.Google Scholar
Miranker, G. S., J. Rubenstein, and J. Sanguinetti [1988]. "Squeezing a Cray-class supercomputer into a single-user package," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 452-456.Google Scholar
Mitchell, D. [1989]. "The Transputer: The time is now," Computer Design (RISC suppl.), 40-41.Google Scholar
Mitsubishi. [1996]. Mitsubishi 32-Bit Single Chip Microcomputer M32R Family Software Manual , Mitsubishi, Cypress, Calif.Google Scholar
Miura, K., and K. Uchida [1983]. "FACOM vector processing system: VP100/200," Proc. NATO Advanced Research Workshop on High-Speed Computing , June 20-22, 1983, Jülich, West Germany. Also appears in K. Hwang, ed., "Superprocessors: Design and applications," IEEE (August 1984), 59-73.Google Scholar
Miya, E. N. [1985]. "Multiprocessor/distributed processing bibliography," Computer Architecture News 13:1, 27-29. Google ScholarDigital Library
Montoye, R. K., E. Hokenek, and S. L. Runyon [1990]. "Design of the IBM RISC System/6000 floating-point execution," IBM J. Research and Development 34:1, 59-70. Google ScholarDigital Library
Moore, B., A. Padegs, R. Smith, and W. Bucholz [1987]. "Concepts of the System/370 vector architecture," 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 282-292. Google Scholar
Moore, G. E. [1965]. "Cramming more components onto integrated circuits," Electronics , 38:8 (April 19), 114-117.Google Scholar
Morse, S., B. Ravenal, S. Mazor, and W. Pohlman [1980]. "Intel microprocessors--8080 to 8086," Computer 13:10 (October). Google Scholar
Moshovos, A., and G. S. Sohi [1997]. "Streamlining inter-operation memory communication via data dependence prediction," Proc. 30th Annual Int'l. Symposium on Microarchitecture , December 1-3, Research Triangle Park, N.C., 235-245. Google Scholar
Moshovos, A., S. Breach, T. N. Vijaykumar, and G. S. Sohi [1997]. "Dynamic speculation and synchronization of data dependences," 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google Scholar
Moussouris, J., L. Crudele, D. Freitas, C. Hansen, E. Hudson, S. Przybylski, T. Riordan, and C. Rowen [1986]. "A CMOS RISC processor with integrated system functions," Proc. IEEE COMPCON , March 3-6, 1986, San Francisco, 191.Google Scholar
Mowry, T. C., S. Lam, and A. Gupta [1992]. "Design and evaluation of a compiler algorithm for prefetching," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston ( SIGPLAN Notices 27:9 (September), 62-73). Google Scholar
MSN Money. [2005]. "Amazon Shares Tumble after Rally Fizzles," http://moneycentral.msn.com/content/CNBCTV/Articles/Dispatches/P133695.asp.
Muchnick, S. S. [1988]. "Optimizing compilers for SPARC," Sun Technology 1:3 (Summer), 64-77.Google Scholar
Mueller, M., L. C. Alves, W. Fischer, M. L. Fair, and I. Modi [1999]. "RAS strategy for IBM S/390 G5 and G6," IBM J. Research and Development 43:5-6 (September-November), 875-888. Google ScholarDigital Library
Mukherjee, S. S., C. Weaver, J. S. Emer, S. K. Reinhardt, and T. M. Austin [2003]. "Measuring architectural vulnerability factors," IEEE Micro 23:6, 70-75. Google ScholarDigital Library
Murphy, B., and T. Gent [1995]. "Measuring system and software reliability using an automated data collection process," Quality and Reliability Engineering International 11:5 (September-October), 341-353.Google ScholarCross Ref
Myer, T. H., and I. E. Sutherland [1968]. "On the design of display processors," Communications of the ACM 11:6 (June), 410-414. Google ScholarDigital Library
Narayanan, D., E. Thereska, A. Donnelly, S. Elnikety, and A. Rowstron [2009]. "Migrating server storage to SSDs: Analysis of trade-offs," Proc. 4th ACM European Conf. on Computer Systems , April 1-3, 2009, Nuremberg, Germany. Google Scholar
National Research Council. [1997]. The Evolution of Untethered Communications , Computer Science and Telecommunications Board, National Academy Press, Washington, D.C. Google Scholar
National Storage Industry Consortium. [1998]. "Tape Roadmap," www.nsic.org.Google Scholar
Nelson, V. P. [1990]. "Fault-tolerant computing: Fundamental concepts," Computer 23:7 (July), 19-25. Google ScholarDigital Library
Ngai, T.-F., and M. J. Irwin [1985]. "Regular, area-time efficient carry-lookahead adders," Proc. Seventh IEEE Symposium on Computer Arithmetic , June 4-6, 1985, University of Illinois, Urbana, 9-15.Google Scholar
Nicolau, A., and J. A. Fisher [1984]. "Measuring the parallelism available for very long instruction word architectures," IEEE Trans. on Computers C-33:11 (November), 968-976. Google ScholarDigital Library
Nikhil, R. S., G. M. Papadopoulos, and Arvind [1992]. "*T: A multithreaded massively parallel architecture," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 156-167. Google Scholar
Noordergraaf, L., and R. van der Pas [1999]. "Performance experiences on Sun's WildFire prototype," Proc. ACM/IEEE Conf. on Supercomputing , November 13-19, 1999, Portland, Ore. Google Scholar
Nyberg, C. R., T. Barclay, Z. Cvetanovic, J. Gray, and D. Lomet [1994]. "AlphaSort: A RISC machine sort," Proc. ACM SIGMOD , May 24-27, 1994, Minneapolis, Minn. Google Scholar
Oka, M., and M. Suzuoki [1999]. "Designing and programming the emotion engine," IEEE Micro 19:6 (November-December), 20-28. Google ScholarDigital Library
Okada, S., S. Okada, Y. Matsuda, T. Yamada, and A. Kobayashi [1999]. "System on a chip for digital still camera," IEEE Trans. on Consumer Electronics 45:3 (August), 584-590. Google ScholarDigital Library
Oliker, L., A. Canning, J. Carter, J. Shalf, and S. Ethier [2004]. "Scientific computations on modern parallel vector systems," Proc. ACM/IEEE Conf. on Supercomputing , November 6-12, 2004, Pittsburgh, Penn., 10. Google Scholar
Pabst, T. [2000]. "Performance Showdown at 133 MHz FSB--The Best Platform for Coppermine," www6.tomshardware.com/mainboard/00q1/000302/.Google Scholar
Padua, D., and M. Wolfe [1986]. "Advanced compiler optimizations for supercomputers," Communications of the ACM 29:12 (December), 1184-1201. Google ScholarDigital Library
Palacharla, S., and R. E. Kessler [1994]. "Evaluating stream buffers as a secondary cache replacement," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 24-33. Google Scholar
Palmer, J., and S. Morse [1984]. The 8087 Primer , John Wiley & Sons, New York, 93.Google Scholar
Pan, S.-T., K. So, and J. T. Rahmeh [1992]. "Improving the accuracy of dynamic branch prediction using branch correlation," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 12-15, 1992, Boston, 76-84.
Partridge, C. [1994]. Gigabit Networking , Addison-Wesley, Reading, Mass. Google Scholar
Patterson, D. [1985]. "Reduced instruction set computers," Communications of the ACM 28:1 (January), 8-21. Google ScholarDigital Library
Patterson, D. [2004]. "Latency lags bandwidth," Communications of the ACM 47:10 (October), 71-75. Google ScholarDigital Library
Patterson, D. A., and D. R. Ditzel [1980]. "The case for the reduced instruction set computer," Computer Architecture News 8:6 (October), 25-33. Google ScholarDigital Library
Patterson, D. A., and J. L. Hennessy [2004]. Computer Organization and Design: The Hardware/Software Interface , 3rd ed., Morgan Kaufmann, San Francisco. Google Scholar
Patterson, D. A., G. A. Gibson, and R. H. Katz [1987]. A Case for Redundant Arrays of Inexpensive Disks (RAID) , Tech. Rep. UCB/CSD 87/391, University of California, Berkeley. Also appeared in Proc. ACM SIGMOD , June 1-3, 1988, Chicago, 109-116. Google ScholarDigital Library
Patterson, D. A., P. Garrison, M. Hill, D. Lioupis, C. Nyberg, T. Sippel, and K. Van Dyke [1983]. "Architecture of a VLSI instruction cache for a RISC," 10th Annual Int'l. Conf. on Computer Architecture Conf. Proc. , June 13-16, 1983, Stockholm, Sweden, 108-116. Google Scholar
Pavan, P., R. Bez, P. Olivo, and E. Zanoni [1997]. "Flash memory cells--an overview." Proc. IEEE 85:8 (August), 1248-1271.Google ScholarCross Ref
Peh, L. S., and W. J. Dally [2001]. "A delay model and speculative architecture for pipelined routers," Proc. 7th Int'l. Symposium on High-Performance Computer Architecture , January 22-24, 2001, Monterrey, Mexico. Google Scholar
Peng, V., S. Samudrala, and M. Gavrielov [1987]. "On the implementation of shifters, multipliers, and dividers in VLSI floating point units," Proc. 8th IEEE Symposium on Computer Arithmetic , May 19-21, 1987, Como, Italy, 95-102.Google Scholar
Pfister, G. F. [1998]. In Search of Clusters , 2nd ed., Prentice Hall, Upper Saddle River, N. J. Google ScholarDigital Library
Pfister, G. F., W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfelder, K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss [1985]. "The IBM research parallel processor prototype (RP3): Introduction and architecture," Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA), June 17-19, 1985, Boston, Mass., 764-771.
Pinheiro, E., W. D. Weber, and L. A. Barroso [2007]. "Failure trends in a large disk drive population," Proc. 5th USENIX Conference on File and Storage Technologies (FAST '07) , February 13-16, 2007, San Jose, Calif. Google ScholarDigital Library
Pinkston, T. M. [2004]. "Deadlock characterization and resolution in interconnection networks," in M. C. Zhu and M. P. Fanti, eds., Deadlock Resolution in Computer-Integrated Systems , CRC Press, Boca Raton, FL, 445-492.Google Scholar
Pinkston, T. M., and J. Shin [2005]. "Trends toward on-chip networked microsystems," Int'l. J. of High Performance Computing and Networking 3:1, 3-18. Google ScholarDigital Library
Pinkston, T. M., and S. Warnakulasuriya [1997]. "On deadlocks in interconnection networks," 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google Scholar
Pinkston, T. M., A. Benner, M. Krause, I. Robinson, and T. Sterling [2003]. "InfiniBand: The 'de facto' future standard for system and local area networks or just a scalable replacement for PCI buses?" Cluster Computing (special issue on communication architecture for clusters) 6:2 (April), 95-104. Google Scholar
Postiff, M. A., D. A. Greene, G. S. Tyson, and T. N. Mudge [1999]. "The limits of instruction level parallelism in SPEC95 applications," Computer Architecture News 27:1 (March), 31-40. Google ScholarDigital Library
Przybylski, S. A. [1990]. Cache Design: A Performance-Directed Approach , Morgan Kaufmann, San Francisco. Google Scholar
Przybylski, S. A., M. Horowitz, and J. L. Hennessy [1988]. "Performance trade-offs in cache design," 15th Annual Int'l. Symposium on Computer Architecture , May 30-June 2, 1988, Honolulu, Hawaii, 290-298. Google Scholar
Puente, V., R. Beivide, J. A. Gregorio, J. M. Prellezo, J. Duato, and C. Izu [1999]. "Adaptive bubble router: A design to improve performance in torus networks," Proc. 28th Int'l. Conference on Parallel Processing , September 21-24, 1999, Aizu-Wakamatsu, Fukushima, Japan. Google Scholar
Radin, G. [1982]. "The 801 minicomputer," Proc. Symposium Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 39-47. Google Scholar
Rajesh Bordawekar, Uday Bondhugula, Ravi Rao: Believe it or not!: mult-core CPUs can match GPU performance for a FLOP-intensive application! 19th International Conference on Parallel Architecture and Compilation Techniques (PACT 2010), Vienna, Austria, September 11-15, 2010: 537-538. Google Scholar
Ramamoorthy, C. V., and H. F. Li [1977]. "Pipeline architecture," ACM Computing Surveys 9:1 (March), 61-102. Google ScholarDigital Library
Ranganathan, P., P. Leech, D. Irwin, and J. Chase [2006]. "Ensemble-Level Power Management for Dense Blade Servers," Proc. 33rd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-21, 2006, Boston, Mass., 66-77. Google Scholar
Rau, B. R. [1994]. "Iterative modulo scheduling: An algorithm for software pipelining loops," Proc. 27th Annual Int'l. Symposium on Microarchitecture , November 30-December 2, 1994, San Jose, Calif., 63-74. Google Scholar
Rau, B. R., C. D. Glaeser, and R. L. Picard [1982]. "Efficient code generation for horizontal architectures: Compiler techniques and architectural support," Proc. Ninth Annual Int'l. Symposium on Computer Architecture (ISCA) , April 26-29, 1982, Austin, Tex., 131-139. Google Scholar
Rau, B. R., D. W. L. Yen, W. Yen, and R. A. Towle [1989]. "The Cydra 5 departmental supercomputer: Design philosophies, decisions, and trade-offs," IEEE Computer 22:1 (January), 12-34.
Reddi, V. J., B. C. Lee, T. Chilimbi, and K. Vaid [2010]. "Web search using mobile cores: Quantifying and mitigating the price of efficiency," Proc. 37th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 19-23, 2010, Saint-Malo, France. Google Scholar
Redmond, K. C., and T. M. Smith [1980]. Project Whirlwind--The History of a Pioneer Computer , Digital Press, Boston. Google Scholar
Reinhardt, S. K., J. R. Larus, and D. A. Wood [1994]. "Tempest and Typhoon: User-level shared memory," 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 325-336. Google Scholar
Reinman, G., and N. P. Jouppi. [1999]. "Extensions to CACTI," research.compaq.com/wrl/people/jouppi/CACTI.html.Google Scholar
Rettberg, R. D., W. R. Crowther, P. P. Carvey, and R. S. Tomlinson [1990]. "The Monarch parallel processor hardware design," IEEE Computer 23:4 (April), 18-30.
Riemens, A., K. A. Vissers, R. J. Schutten, F. W. Sijstermans, G. J. Hekstra, and G. D. La Hei [1999]. "Trimedia CPU64 application domain and benchmark suite," Proc. IEEE Int'l. Conf. on Computer Design: VLSI in Computers and Processors (ICCD'99) , October 10-13, 1999, Austin, Tex., 580-585. Google Scholar
Riseman, E. M., and C. C. Foster [1972]. "Percolation of code to enhance parallel dispatching and execution," IEEE Trans. on Computers C-21:12 (December), 1411-1415.
Robin, J., and C. Irvine [2000]. "Analysis of the Intel Pentium's ability to support a secure virtual machine monitor." Proc. USENIX Security Symposium , August 14-17, 2000, Denver, Colo. Google ScholarCross Ref
Robinson, B., and L. Blount [1986]. The VM/HPO 3880-23 Performance Results , IBM Tech. Bulletin GG66-0247-00, IBM Washington Systems Center, Gaithersburg, Md.Google Scholar
Ropers, A., H. W. Lollman, and J. Wellhausen [1999]. DSPstone: Texas Instruments TMS320C54x, Tech. Rep. IB 315 1999/9-ISS-Version 0.9, Aachen University of Technology, Aachen, Germany (www.ert.rwth-aachen.de/Projekte/Tools/coal/dspstone_c54x/index.html).
Rosenblum, M., S. A. Herrod, E. Witchel, and A. Gupta [1995]. "Complete computer simulation: The SimOS approach," in IEEE Parallel and Distributed Technology (now called Concurrency ) 4:3, 34-43. Google Scholar
Rowen, C., M. Johnson, and P. Ries [1988]. "The MIPS R3010 floating-point coprocessor," IEEE Micro 8:3 (June), 53-62. Google ScholarDigital Library
Russell, R. M. [1978]. "The Cray-1 processor system," Communications of the ACM 21:1 (January), 63-72. Google ScholarDigital Library
Rymarczyk, J. [1982]. "Coding guidelines for pipelined processors," Proc. Symposium Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 12-19. Google Scholar
Saavedra-Barrera, R. H. [1992]. "CPU Performance Evaluation and Execution Time Prediction Using Narrow Spectrum Benchmarking," Ph. D. dissertation, University of California, Berkeley. Google Scholar
Salem, K., and H. Garcia-Molina [1986]. "Disk striping," Proc. 2nd Int'l. IEEE Conf. on Data Engineering , February 5-7, 1986, Washington, D.C., 249-259. Google Scholar
Saltzer, J. H., D. P. Reed, and D. D. Clark [1984]. "End-to-end arguments in system design," ACM Trans. on Computer Systems 2:4 (November), 277-288. Google ScholarDigital Library
Samples, A. D., and P. N. Hilfinger [1988]. Code Reorganization for Instruction Caches , Tech. Rep. UCB/CSD 88/447, University of California, Berkeley. Google Scholar
Santoro, M. R., G. Bewick, and M. A. Horowitz [1989]. "Rounding algorithms for IEEE multipliers," Proc. Ninth IEEE Symposium on Computer Arithmetic , September 6-8, Santa Monica, Calif., 176-183.Google Scholar
Satran, J., D. Smith, K. Meth, C. Sapuntzakis, M. Wakeley, P. Von Stamwitz, R. Haagens, E. Zeidner, L. Dalle Ore, and Y. Klein [2001]. "iSCSI," IPS Working Group of IETF, Internet draft www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-07.txt.Google Scholar
Saulsbury, A., T. Wilkinson, J. Carter, and A. Landin [1995]. "An argument for Simple COMA," Proc. First IEEE Symposium on High-Performance Computer Architectures , January 22-25, 1995, Raleigh, N.C., 276-285. Google Scholar
Schneck, P. B. [1987]. Superprocessor Architecture , Kluwer Academic Publishers, Norwell, Mass.Google Scholar
Schroeder, B., and G. A. Gibson [2007]. "Understanding failures in petascale computers," J. of Physics Conf. Series 78(1), 188-198.Google Scholar
Schroeder, B., E. Pinheiro, and W.-D. Weber [2009]. "DRAM errors in the wild: a large-scale field study," Proc. Eleventh Int'l. Joint Conf. on Measurement and Modeling of Computer Systems (SIGMETRICS) , June 15-19, 2009, Seattle, Wash.
Schurman, E., and J. Brutlag [2009]. "The user and business impact of server delays," Proc. Velocity: Web Performance and Operations Conf. , June 22-24, 2009, San Jose, Calif.Google Scholar
Schwartz, J. T. [1980]. "Ultracomputers," ACM Trans. on Programming Languages and Systems 4:2, 484-521. Google ScholarDigital Library
Scott, N. R. [1985]. Computer Number Systems and Arithmetic , Prentice Hall, Englewood Cliffs, N. J. Google Scholar
Scott, S. L. [1996]. "Synchronization and communication in the T3E multiprocessor," Seventh Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass. Google Scholar
Scott, S. L., and J. Goodman [1994]. "The impact of pipelined channels on k -ary n -cube networks," IEEE Trans. on Parallel and Distributed Systems 5:1 (January), 1-16. Google ScholarDigital Library
Scott, S. L., and G. M. Thorson [1996]. "The Cray T3E network: Adaptive routing in a high performance 3D torus," Proc. IEEE HOT Interconnects '96 , August 15-17, 1996, Stanford University, Palo Alto, Calif., 14-156.Google Scholar
Scranton, R. A., D. A. Thompson, and D. W. Hunter [1983]. "The Access Time Myth," Tech. Rep. RC 10197 (45223), IBM, Yorktown Heights, N.Y.
Seagate. [2000]. Seagate Cheetah 73 Family: ST173404LW/LWV/LC/LCV Product Manual , Vol. 1, Seagate, Scotts Valley, Calif. (www.seagate.com/support/disc/manuals/scsi/29478b.pdf).Google Scholar
Seitz, C. L. [1985]. "The Cosmic Cube (concurrent computing)," Communications of the ACM 28:1 (January), 22-33. Google ScholarDigital Library
Senior, J. M. [1993]. Optical Fiber Communications: Principles and Practice , 2nd ed., Prentice Hall, Hertfordshire, U. K.
Sharangpani, H., and K. Arora [2000]. "Itanium Processor Microarchitecture," IEEE Micro 20:5 (September-October), 24-43. Google ScholarDigital Library
Shurkin, J. [1984]. Engines of the Mind: A History of the Computer , W. W. Norton, New York. Google Scholar
Shustek, L. J. [1978]. "Analysis and Performance of Computer Instruction Sets," Ph. D. dissertation, Stanford University, Palo Alto, Calif. Google Scholar
Silicon Graphics. [1996]. MIPS V Instruction Set (see http://www.sgi.com/MIPS/arch/ISA5/#MIPSV_indx).Google Scholar
Singh, J. P., J. L. Hennessy, and A. Gupta [1993]. "Scaling parallel programs for multiprocessors: Methodology and examples," Computer 26:7 (July), 22-33. Google ScholarDigital Library
Sinharoy, B., R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, and J. B. Joyner [2005]. "POWER5 system microarchitecture," IBM J. Research and Development , 49:4-5, 505-521.
Sites, R. [1979]. Instruction Ordering for the CRAY-1 Computer , Tech. Rep. 78-CS-023, Dept. of Computer Science, University of California, San Diego.Google Scholar
Sites, R. L. (ed.) [1992]. Alpha Architecture Reference Manual , Digital Press, Burlington, Mass. Google Scholar
Sites, R. L., and R. Witek, (eds.) [1995]. Alpha Architecture Reference Manual , 2nd ed., Digital Press, Newton, Mass. Google Scholar
Skadron, K., and D. W. Clark [1997]. "Design issues and tradeoffs for write buffers," Proc. Third Int'l. Symposium on High-Performance Computer Architecture , February 1-5, 1997, San Antonio, Tex., 144-155. Google Scholar
Skadron, K., P. S. Ahuja, M. Martonosi, and D. W. Clark [1999]. "Branch prediction, instruction-window size, and cache size: Performance tradeoffs and simulation techniques," IEEE Trans. on Computers 48:11 (November). Google ScholarDigital Library
Slater, R. [1987]. Portraits in Silicon , MIT Press, Cambridge, Mass. Google Scholar
Slotnick, D. L., W. C. Borck, and R. C. McReynolds [1962]. "The Solomon computer," Proc. AFIPS Fall Joint Computer Conf. , December 4-6, 1962, Philadelphia, Penn., 97-107. Google Scholar
Smith, A. J. [1982]. "Cache memories," Computing Surveys 14:3 (September), 473-530. Google ScholarDigital Library
Smith, A., and J. Lee [1984]. "Branch prediction strategies and branch-target buffer design," Computer 17:1 (January), 6-22. Google Scholar
Smith, B. J. [1978]. "A pipelined, shared resource MIMD computer," Proc. Int'l. Conf. on Parallel Processing (ICPP) , August, Bellaire, Mich., 6-8.Google Scholar
Smith, B. J. [1981]. "Architecture and applications of the HEP multiprocessor system," Real-Time Signal Processing IV 298 (August), 241-248.Google Scholar
Smith, J. E. [1981]. "A study of branch prediction strategies," Proc. Eighth Annual Int'l. Symposium on Computer Architecture (ISCA) , May 12-14, 1981, Minneapolis, Minn., 135-148. Google Scholar
Smith, J. E. [1984]. "Decoupled access/execute computer architectures," ACM Trans. on Computer Systems 2:4 (November), 289-308. Google ScholarDigital Library
Smith, J. E. [1988]. "Characterizing computer performance with a single number," Communications of the ACM 31:10 (October), 1202-1206. Google ScholarDigital Library
Smith, J. E. [1989]. "Dynamic instruction scheduling and the Astronautics ZS-1," Computer 22:7 (July), 21-35. Google ScholarDigital Library
Smith, J. E., and J. R. Goodman [1983]. "A study of instruction cache organizations and replacement policies," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 132-137. Google Scholar
Smith, J. E., and A. R. Pleszkun [1988]. "Implementing precise interrupts in pipelined processors," IEEE Trans. on Computers 37:5 (May), 562-573. (This paper is based on an earlier paper that appeared in Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-19, 1985, Boston, Mass.) Google Scholar
Smith, J. E., G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, C. M. Rozewski, D. L. Fowler, K. R. Scidmore, and J. P. Laudon [1987]. "The ZS-1 central processor," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 199-204. Google Scholar
Smith, M. D., M. Horowitz, and M. S. Lam [1992]. "Efficient superscalar performance through boosting," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, 248-259. Google Scholar
Smith, M. D., M. Johnson, and M. A. Horowitz [1989]. "Limits on multiple instruction issue," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 290-302. Google Scholar
Smotherman, M. [1989]. "A sequencing-based taxonomy of I/O systems and review of historical machines," Computer Architecture News 17:5 (September), 5-15. Reprinted in Computer Architecture Readings , M. D. Hill, N. P. Jouppi, and G. S. Sohi, eds., Morgan Kaufmann, San Francisco, 1999, 451-461. Google ScholarDigital Library
Sodani, A., and G. Sohi [1997]. "Dynamic instruction reuse," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google Scholar
Sohi, G. S. [1990]. "Instruction issue logic for high-performance, interruptible, multiple functional unit, pipelined computers," IEEE Trans. on Computers 39:3 (March), 349-359. Google ScholarDigital Library
Sohi, G. S., and S. Vajapeyam [1989]. "Tradeoffs in instruction format design for horizontal architectures," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 15-25. Google Scholar
Soundararajan, V., M. Heinrich, B. Verghese, K. Gharachorloo, A. Gupta, and J. L. Hennessy [1998]. "Flexible use of memory for replication/migration in cache-coherent DSM multiprocessors," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 342-355.
SPEC. [1989]. SPEC Benchmark Suite Release 1.0 (October 2).Google Scholar
SPEC. [1994]. SPEC Newsletter (June).Google Scholar
Sporer, M., F. H. Moss, and C. J. Mathais [1988]. "An introduction to the architecture of the Stellar Graphics supercomputer," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 464.Google Scholar
Spurgeon, C. [2001]. "Charles Spurgeon's Ethernet Web Site," wwwhost.ots.utexas.edu/ethernet/ethernet-home.html.Google Scholar
Spurgeon, C. [2006]. "Charles Spurgeon's Ethernet Web Site," www.ethermanage.com/ethernet/ethernet.html.
Stenstrom, P., T. Joe, and A. Gupta [1992]. "Comparative performance evaluation of cache-coherent NUMA and COMA architectures," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 80-91. Google Scholar
Sterling, T. [2001]. Beowulf PC Cluster Computing with Windows and Beowulf PC Cluster Computing with Linux , MIT Press, Cambridge, Mass. Google Scholar
Stern, N. [1980]. "Who invented the first electronic digital computer?" Annals of the History of Computing 2:4 (October), 375-376.Google Scholar
Stevens, W. R. [1994-1996]. TCP/IP Illustrated (three volumes), Addison-Wesley, Reading, Mass.Google Scholar
Stokes, J. [2000]. "Sound and Vision: A Technical Overview of the Emotion Engine," arstechnica.com/reviews/1q00/playstation2/ee-1.html.Google Scholar
Stone, H. [1991]. High Performance Computers , Addison-Wesley, New York.Google Scholar
Strauss, W. [1998]. "DSP Strategies 2002," www.usadata.com/market_research/spr_05/spr_r127-005.htm.Google Scholar
Strecker, W. D. [1976]. "Cache memories for the PDP-11?," Proc. Third Annual Int'l. Symposium on Computer Architecture (ISCA) , January 19-21, 1976, Tampa, Fla., 155-158. Google Scholar
Strecker, W. D. [1978]. "VAX-11/780: A virtual address extension of the PDP-11 family," Proc. AFIPS National Computer Conf. , June 5-8, 1978, Anaheim, Calif., 47, 967-980.Google Scholar
Sugumar, R. A., and S. G. Abraham [1993]. "Efficient simulation of caches under optimal replacement with applications to miss characterization," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 17-21, 1993, Santa Clara, Calif., 24-35. Google Scholar
Sun Microsystems. [1989]. The SPARC Architectural Manual , Version 8, Part No. 8001399-09, Sun Microsystems, Santa Clara, Calif.Google Scholar
Sussenguth, E. [1999]. "IBM's ACS-1 Machine," IEEE Computer 22:11 (November).Google Scholar
Swan, R. J., S. H. Fuller, and D. P. Siewiorek [1977]. "Cm*--a modular, multimicroprocessor," Proc. AFIPS National Computing Conf. , June 13-16, 1977, Dallas, Tex., 637-644. Google Scholar
Swan, R. J., A. Bechtolsheim, K. W. Lai, and J. K. Ousterhout [1977]. "The implementation of the Cm* multi-microprocessor," Proc. AFIPS National Computing Conf. , June 13-16, 1977, Dallas, Tex., 645-654. Google Scholar
Swartzlander, E. (ed.) [1990]. Computer Arithmetic , IEEE Computer Society Press, Los Alamitos, Calif. Google Scholar
Takagi, N., H. Yasuura, and S. Yajima [1985]. "High-speed VLSI multiplication algorithm with a redundant binary addition tree," IEEE Trans. on Computers C-34:9, 789-796.
Talagala, N. [2000]. "Characterizing Large Storage Systems: Error Behavior and Performance Benchmarks," Ph. D. dissertation, Computer Science Division, University of California, Berkeley. Google Scholar
Talagala, N., and D. Patterson [1999]. An Analysis of Error Behavior in a Large Storage System , Tech. Report UCB//CSD-99-1042, Computer Science Division, University of California, Berkeley. Google Scholar
Talagala, N., R. Arpaci-Dusseau, and D. Patterson [2000]. Micro-Benchmark Based Extraction of Local and Global Disk Characteristics , CSD-99-1063, Computer Science Division, University of California, Berkeley. Google Scholar
Talagala, N., S. Asami, D. Patterson, R. Futernick, and D. Hart [2000]. "The art of massive storage: A case study of a Web image archive," Computer (November). Google Scholar
Tamir, Y., and G. Frazier [1992]. "Dynamically-allocated multi-queue buffers for VLSI communication switches," IEEE Trans. on Computers 41:6 (June), 725-734. Google ScholarDigital Library
Tanenbaum, A. S. [1978]. "Implications of structured programming for machine architecture," Communications of the ACM 21:3 (March), 237-246. Google ScholarDigital Library
Tanenbaum, A. S. [1988]. Computer Networks , 2nd ed., Prentice Hall, Englewood Cliffs, N. J. Google Scholar
Tang, C. K. [1976]. "Cache design in the tightly coupled multiprocessor system," Proc. AFIPS National Computer Conf. , June 7-10, 1976, New York, 749-753. Google Scholar
Tanqueray, D. [2002]. "The Cray X1 and supercomputer road map," Proc. 13th Daresbury Machine Evaluation Workshop , December 11-12, 2002, Daresbury Laboratories, Daresbury, Cheshire, U. K.Google Scholar
Tarjan, D., S. Thoziyoor, and N. Jouppi [2005]. "HPL Technical Report on CACTI 4.0," www.hpl.hp.com/techeports/2006/HPL=2006+86.html.Google Scholar
Taylor, G. S. [1981]. "Compatible hardware for division and square root," Proc. 5th IEEE Symposium on Computer Arithmetic , May 18-19, 1981, University of Michigan, Ann Arbor, Mich., 127-134.Google Scholar
Taylor, G. S. [1985]. "Radix 16 SRT dividers with overlapped quotient selection stages," Proc. Seventh IEEE Symposium on Computer Arithmetic , June 4-6, 1985, University of Illinois, Urbana, Ill., 64-71.Google Scholar
Taylor, G., P. Hilfinger, J. Larus, D. Patterson, and B. Zorn [1986]. "Evaluation of the SPUR LISP architecture," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo. Google Scholar
Taylor, M. B., W. Lee, S. P. Amarasinghe, and A. Agarwal [2005]. "Scalar operand networks," IEEE Trans. on Parallel and Distributed Systems 16:2 (February), 145-162. Google ScholarDigital Library
Tendler, J. M., J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy [2002]. "Power4 system microarchitecture," IBM J. Research and Development 46:1, 5-26. Google ScholarDigital Library
Texas Instruments. [2000]. "History of Innovation: 1980s," www.ti.com/corp/docs/company/history/1980s.shtml.Google Scholar
Tezzaron Semiconductor. [2004]. Soft Errors in Electronic Memory , White Paper, Tezzaron Semiconductor, Naperville, Ill. (http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf).
Thacker, C. P., E. M. McCreight, B. W. Lampson, R. F. Sproull, and D. R. Boggs [1982]. "Alto: A personal computer," in D. P. Siewiorek, C. G. Bell, and A. Newell, eds., Computer Structures: Principles and Examples , McGraw-Hill, New York, 549-572.Google Scholar
Thadhani, A. J. [1981]. "Interactive user productivity," IBM Systems J. 20:4, 407-423. Google ScholarDigital Library
Thekkath, R., A. P. Singh, J. P. Singh, S. John, and J. L. Hennessy [1997]. "An evaluation of a commercial CC-NUMA architecture--the CONVEX Exemplar SPP1200," Proc. 11th Int'l. Parallel Processing Symposium (IPPS) , April 1-7, 1997, Geneva, Switzerland. Google ScholarCross Ref
Thorlin, J. F. [1967]. "Code generation for PIE (parallel instruction execution) computers," Proc. Spring Joint Computer Conf. , April 18-20, 1967, Atlantic City, N. J., 27. Google Scholar
Thornton, J. E. [1964]. "Parallel operation in the Control Data 6600," Proc. AFIPS Fall Joint Computer Conf. , Part II , October 27-29, 1964, San Francisco, 26, 33-40. Google Scholar
Thornton, J. E. [1970]. Design of a Computer, the Control Data 6600 , Scott, Foresman, Glenview, Ill. Google Scholar
Tjaden, G. S., and M. J. Flynn [1970]. "Detection and parallel execution of independent instructions," IEEE Trans. on Computers C-19:10 (October), 889-895. Google ScholarDigital Library
Tomasulo, R. M. [1967]. "An efficient algorithm for exploiting multiple arithmetic units," IBM J. Research and Development 11:1 (January), 25-33. Google ScholarDigital Library
Torrellas, J., A. Gupta, and J. Hennessy [1992]. "Characterizing the caching and synchronization performance of a multiprocessor operating system," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston ( SIGPLAN Notices 27:9 (September), 162-174). Google Scholar
Touma, W. R. [1993]. The Dynamics of the Computer Industry: Modeling the Supply of Workstations and Their Components , Kluwer Academic, Boston. Google Scholar
Tuck, N., and D. Tullsen [2003]. "Initial observations of the simultaneous multithreading Pentium 4 processor," Proc. 12th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT'03 ), September 27-October 1, 2003, New Orleans, La., 26-34. Google Scholar
Tullsen, D. M., S. J. Eggers, and H. M. Levy [1995]. "Simultaneous multithreading: Maximizing on-chip parallelism," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy, 392-403. Google Scholar
Tullsen, D. M., S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, and R. L. Stamm [1996]. "Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor," Proc. 23rd Annual Int'l. Symposium on Computer Architecture (ISCA) , May 22-24, 1996, Philadelphia, Penn., 191-202. Google Scholar
Ungar, D., R. Blau, P. Foley, D. Samples, and D. Patterson [1984]. "Architecture of SOAR: Smalltalk on a RISC," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 188-197. Google Scholar
Unger, S. H. [1958]. "A computer oriented towards spatial problems," Proc. Institute of Radio Engineers 46:10 (October), 1744-1750. Google Scholar
Vahdat, A., M. Al-Fares, N. Farrington, R. Niranjan Mysore, G. Porter, and S. Radhakrishnan [2010]. "Scale-Out Networking in the Data Center," IEEE Micro 30:4 (July/August), 29-41. Google ScholarDigital Library
Vaidya, A. S., A. Sivasubramaniam, and C. R. Das [1997]. "Performance benefits of virtual channels and adaptive routing: An application-driven study," Proc. ACM/IEEE Conf. on Supercomputing , November 16-21, 1997, San Jose, Calif.
Vajapeyam, S. [1991]. "Instruction-Level Characterization of the Cray Y-MP Processor," Ph. D. thesis, Computer Sciences Department, University of Wisconsin-Madison. Google Scholar
van Eijndhoven, J. T. J., F. W. Sijstermans, K. A. Vissers, E. J. D. Pol, M. I. A. Tromp, P. Struik, R. H. J. Bloks, P. van der Wolf, A. D. Pimentel, and H. P. E. Vranken [1999]. "Trimedia CPU64 architecture," Proc. IEEE Int'l. Conf. on Computer Design: VLSI in Computers and Processors (ICCD'99) , October 10-13, 1999, Austin, Tex., 586-592. Google Scholar
Van Vleck, T. [2005]. "The IBM 360/67 and CP/CMS," http://www.multicians.org/thvv/360-67.html.Google Scholar
von Eicken, T., D. E. Culler, S. C. Goldstein, and K. E. Schauser [1992]. "Active Messages: A mechanism for integrated communication and computation," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google Scholar
Waingold, E., M. Taylor, D. Srikrishna, V. Sarkar, W. Lee, V. Lee, J. Kim, M. Frank, P. Finch, R. Barua, J. Babb, S. Amarasinghe, and A. Agarwal [1997]. "Baring it all to software: Raw Machines," IEEE Computer 30 (September), 86-93. Google ScholarDigital Library
Wakerly, J. [1989]. Microcomputer Architecture and Programming , Wiley, New York. Google Scholar
Wall, D. W. [1991]. "Limits of instruction-level parallelism," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 248-259. Google Scholar
Wall, D. W. [1993]. Limits of Instruction-Level Parallelism , Research Rep. 93/6, Western Research Laboratory, Digital Equipment Corp., Palo Alto, Calif.Google Scholar
Walrand, J. [1991]. Communication Networks: A First Course , Aksen Associates/Irwin, Homewood, Ill. Google Scholar
Wang, W.-H., J.-L. Baer, and H. M. Levy [1989]. "Organization and performance of a two-level virtual-real cache hierarchy," Proc. 16th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-June 1, 1989, Jerusalem, 140-148. Google ScholarCross Ref
Watanabe, T. [1987]. "Architecture and performance of the NEC supercomputer SX system," Parallel Computing 5, 247-255.Google ScholarCross Ref
Waters, F. (ed.) [1986]. IBM RT Personal Computer Technology , SA 23-1057, IBM, Austin, Tex.Google Scholar
Watson, W. J. [1972]. "The TI ASC--a highly modular and flexible super processor architecture," Proc. AFIPS Fall Joint Computer Conf. , December 5-7, 1972, Anaheim, Calif., 221-228. Google Scholar
Weaver, D. L., and T. Germond [1994]. The SPARC Architectural Manual , Version 9, Prentice Hall, Englewood Cliffs, N. J. Google Scholar
Weicker, R. P. [1984]. "Dhrystone: A synthetic systems programming benchmark," Communications of the ACM 27:10 (October), 1013-1030. Google ScholarDigital Library
Weiss, S., and J. E. Smith [1984]. "Instruction issue logic for pipelined supercomputers," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 110-118. Google Scholar
Weiss, S., and J. E. Smith [1987]. "A study of scalar compilation techniques for pipelined supercomputers," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 105-109. Google Scholar
Weiss, S., and J. E. Smith [1994]. Power and PowerPC , Morgan Kaufmann, San Francisco. Google Scholar
Wendel, D., R. Kalla, J. Friedrich, J. Kahle, J. Leenstra, C. Lichtenau, B. Sinharoy, W. Starke, and V. Zyuban [2010]. "The Power7 processor SoC," Proc. Int'l. Conf. on IC Design and Technology , June 2-4, 2010, Grenoble, France, 71-73.Google Scholar
Weste, N., and K. Eshraghian [1993]. Principles of CMOS VLSI Design: A Systems Perspective , 2nd ed., Addison-Wesley, Reading, Mass.Google Scholar
Wiecek, C. [1982]. "A case study of the VAX 11 instruction set usage for compiler execution," Proc. Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 177-184. Google ScholarDigital Library
Wilkes, M. [1965]. "Slave memories and dynamic storage allocation," IEEE Trans. Electronic Computers EC-14:2 (April), 270-271.Google ScholarCross Ref
Wilkes, M. V. [1982]. "Hardware support for memory protection: Capability implementations," Proc. Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 107-116. Google Scholar
Wilkes, M. V. [1985]. Memoirs of a Computer Pioneer , MIT Press, Cambridge, Mass. Google Scholar
Wilkes, M. V. [1995]. Computing Perspectives , Morgan Kaufmann, San Francisco. Google Scholar
Wilkes, M. V., D. J. Wheeler, and S. Gill [1951]. The Preparation of Programs for an Electronic Digital Computer , Addison-Wesley, Cambridge, Mass.Google Scholar
Williams, S., A. Waterman, and D. Patterson [2009]. "Roofline: An insightful visual performance model for multicore architectures," Communications of the ACM , 52:4 (April), 65-76. Google ScholarDigital Library
Williams, T. E., M. Horowitz, R. L. Alverson, and T. S. Yang [1987]. "A self-timed chip for division," in P. Losleben, ed., 1987 Stanford Conference on Advanced Research in VLSI , MIT Press, Cambridge, Mass.Google Scholar
Wilson, A. W., Jr. [1987]. "Hierarchical cache/bus architecture for shared-memory multiprocessors," Proc. 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 244-252. Google Scholar
Wilson, R. P., and M. S. Lam [1995]. "Efficient context-sensitive pointer analysis for C programs," Proc. ACM SIGPLAN'95 Conf. on Programming Language Design and Implementation , June 18-21, 1995, La Jolla, Calif., 1-12. Google Scholar
Wolfe, A., and J. P. Shen [1991]. "A variable instruction stream extension to the VLIW architecture," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 2-14. Google Scholar
Wood, D. A., and M. D. Hill [1995]. "Cost-effective parallel computing," IEEE Computer 28:2 (February), 69-72. Google ScholarDigital Library
Wulf, W. [1981]. "Compilers and computer architecture," Computer 14:7 (July), 41-47. Google ScholarDigital Library
Wulf, W., and C. G. Bell [1972]. "C.mmp--A multi-mini-processor," Proc. AFIPS Fall Joint Computer Conf. , December 5-7, 1972, Anaheim, Calif., 765-777. Google Scholar
Wulf, W., and S. P. Harbison [1978]. "Reflections in a pool of processors--an experience report on C.mmp/Hydra," Proc. AFIPS National Computing Conf. , June 5-8, 1978, Anaheim, Calif., 939-951.
Wulf, W. A., and S. A. McKee [1995]. "Hitting the memory wall: Implications of the obvious," ACM SIGARCH Computer Architecture News , 23:1 (March), 20-24. Google ScholarDigital Library
Wulf, W. A., R. Levin, and S. P. Harbison [1981]. Hydra/C.mmp: An Experimental Computer System , McGraw-Hill, New York.Google Scholar
Yamamoto, W., M. J. Serrano, A. R. Talcott, R. C. Wood, and M. Nemirovsky [1994]. "Performance estimation of multistreamed, superscalar processors," Proc. 27th Annual Hawaii Int'l. Conf. on System Sciences , January 4-7, 1994, Maui, 195-204.
Yang, Y., and G. Mason [1991]. "Nonblocking broadcast switching networks," IEEE Trans. on Computers 40:9 (September), 1005-1015. Google ScholarDigital Library
Yeager, K. [1996]. "The MIPS R10000 superscalar microprocessor," IEEE Micro 16:2 (April), 28-40. Google ScholarDigital Library
Yeh, T., and Y. N. Patt [1993a]. "Alternative implementations of two-level adaptive branch prediction," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 124-134. Google Scholar
Yeh, T., and Y. N. Patt [1993b]. "A comparison of dynamic branch predictors that use two levels of branch history," Proc. 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif., 257-266. Google Scholar

Contributors

John L. Hennessy
Stanford University
- Publication Years: 1977 - 2024
- Publication counts: 129
- Citation count: 14,355
- Available for Download: 102
- Downloads (cumulative): 135,739
- Downloads (12 months): 21,493
- Downloads (6 weeks): 2,304
- Average Downloads per Article: 1,331
- Average Citation per Article: 111

David A. Patterson
Google LLC
- Publication Years: 1975 - 2024
- Publication counts: 296
- Citation count: 35,356
- Available for Download: 151
- Downloads (cumulative): 1,555,485
- Downloads (12 months): 95,601
- Downloads (6 weeks): 12,847
- Average Downloads per Article: 10,301
- Average Citation per Article: 119

Reviews

Reviewer: Ruay-Shiung Chang

Moore's law states that the number of transistors that can be placed on an integrated circuit (IC) doubles approximately every two years. This exponential growth in IC technology has led to advancements in everything digital, from central processing units (CPUs) and memory to digital cameras. Since computers are made up of CPUs, memory, and input/output (I/O) devices, it is a logical consequence that computers have also experienced tremendous improvements. This drastic change in computers makes it difficult, if not impossible, for a textbook on computer architecture to include every new technology. Often, when a computer architecture textbook hits the counter, it is already out of date. Therefore, it is no wonder that this book is in its fifth edition.

The main part of the book contains six chapters. The focus is on parallelism. Besides a chapter on the fundamentals of quantitative methods and a chapter on memory hierarchy, the other four chapters deal with parallelism at various levels. It is explained as it relates to cloud computing in chapter 6, "Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism." However, not everyone will agree with the authors' decisions regarding which topics to include or exclude. For example, traditional computer architecture textbooks would include designs of CPU, memory, and I/O. In this book, I/O systems are rarely touched on at all.

Moore's law tells us that computer industries and technologies are still quickly evolving. To chase the newest technology in a textbook is unrealistic. Going back to the basics may be the solution. We have to teach computer science students the basic principles that have applied since the computer was invented.

Online Computing Reviews Service
