Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
Skip header Section
Computer Architecture, Fifth Edition: A Quantitative ApproachSeptember 2011
Publisher:
  • Morgan Kaufmann Publishers Inc.
  • 340 Pine Street, Sixth Floor
  • San Francisco
  • CA
  • United States
ISBN:978-0-12-383872-8
Published:29 September 2011
Pages:
880
Skip Bibliometrics Section
Bibliometrics
Skip Abstract Section
Abstract

The computing world today is in the middle of a revolution: mobile clients and cloud computing have emerged as the dominant paradigms driving programming and hardware innovation today. The Fifth Edition of Computer Architecture focuses on this dramatic shift, exploring the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices. Each chapter includes two real-world examples, one mobile and one datacenter, to illustrate this revolutionary change. Updated to cover the mobile computing revolutionEmphasizes the two most important topics in architecture today: memory hierarchy and parallelism in all its forms.Develops common themes throughout each chapter: power, performance, cost, dependability, protection, programming models, and emerging trends ("What's Next")Includes three review appendices in the printed text. Additional reference appendices are available online.Includes updated Case Studies and completely new exercises.

References

  1. Adve, S. V., and K. Gharachorloo [1996]. "Shared memory consistency models: A tutorial," IEEE Computer 29:12 (December), 66-76. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Adve, S. V., and M. D. Hill [1990]. "Weak ordering--a new definition," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 2-14. Google ScholarGoogle Scholar
  3. Agarwal, A. [1987]. "Analysis of Cache Performance for Operating Systems and Multiprogramming," Ph. D. thesis, Tech. Rep. No. CSL-TR-87-332, Stanford University, Palo Alto, Calif. Google ScholarGoogle Scholar
  4. Agarwal, A. [1991]. "Limits on interconnection network performance," IEEE Trans. on Parallel and Distributed Systems 2:4 (April), 398-412. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Agarwal, A., and S. D. Pudar [1993]. "Column-associative caches: A technique for reducing the miss rate of direct-mapped caches," 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif. Also appears in Computer Architecture News 21:2 (May), 179-190, 1993. Google ScholarGoogle Scholar
  6. Agarwal, A., R. Bianchini, D. Chaiken, K. Johnson, and D. Kranz [1995]. "The MIT Alewife machine: Architecture and performance," Int'l. Symposium on Computer Architecture (Denver, Colo.), June, 2-13. Google ScholarGoogle Scholar
  7. Agarwal, A., J. L. Hennessy, R. Simoni, and M. A. Horowitz [1988]. "An evaluation of directory schemes for cache coherence," Proc. 15th Int'l. Symposium on Computer Architecture (June), 280-289. Google ScholarGoogle Scholar
  8. Agarwal, A., J. Kubiatowicz, D. Kranz, B.-H. Lim, D. Yeung, G. D'Souza, and M. Parkin [1993]. "Sparcle: An evolutionary processor design for large-scale multiprocessors," IEEE Micro 13 (June), 48-61. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Agerwala, T., and J. Cocke [1987]. High Performance Reduced Instruction Set Processors , IBM Tech. Rep. RC12434, IBM, Armonk, N.Y.Google ScholarGoogle Scholar
  10. Akeley, K. and T. Jermoluk [1988]. "High-Performance Polygon Rendering," Proc. 15th Annual Conf. on Computer Graphics and Interactive Techniques (SIGGRAPH 1988) , August 1-5, 1988, Atlanta, Ga., 239-246. Google ScholarGoogle Scholar
  11. Alexander, W. G., and D. B. Wortman [1975]. "Static and dynamic characteristics of XPL programs," IEEE Computer 8:11 (November), 41-46. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Alles, A. [1995]. "ATM Internetworking," White Paper (May), Cisco Systems, Inc., San Jose, Calif. (www.cisco.com/warp/public/614/12.html)Google ScholarGoogle Scholar
  13. Alliant. [1987]. Alliant FX/Series: Product Summary , Alliant Computer Systems Corp., Acton, Mass.Google ScholarGoogle Scholar
  14. Almasi, G. S., and A. Gottlieb [1989]. Highly Parallel Computing , Benjamin/Cummings, Redwood City, Calif. Google ScholarGoogle Scholar
  15. Alverson, G., R. Alverson, D. Callahan, B. Koblenz, A. Porterfield, and B. Smith [1992]. "Exploiting heterogeneous parallelism on a multithreaded multiprocessor," Proc. ACM/IEEE Conf. on Supercomputing , November 16-20, 1992, Minneapolis, Minn., 188-197. Google ScholarGoogle Scholar
  16. Amdahl, G. M. [1967]. "Validity of the single processor approach to achieving large scale computing capabilities," Proc. AFIPS Spring Joint Computer Conf. , April 18-20, 1967, Atlantic City, N. J., 483-485. Google ScholarGoogle Scholar
  17. Amdahl, G. M., G. A. Blaauw, and F. P. Brooks, Jr. [1964]. "Architecture of the IBM System 360," IBM J. Research and Development 8:2 (April), 87-101. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Amza, C., A. L. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel [1996]. "Treadmarks: Shared memory computing on networks of workstations," IEEE Computer 29:2 (February), 18-28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Anderson, D. [2003]. "You don't know jack about disks," Queue , 1:4 (June), 20-30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Anderson, D., J. Dykes, and E. Riedel [2003]. "SCSI vs. ATA--More than an interface," Proc. 2nd USENIX Conf. on File and Storage Technology (FAST '03) , March 31- April 2, 2003, San Francisco. Google ScholarGoogle Scholar
  21. Anderson, D. W., F. J. Sparacio, and R. M. Tomasulo [1967]. "The IBM 360 Model 91: Processor philosophy and instruction handling," IBM J. Research and Development 11:1 (January), 8-24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Anderson, M. H. [1990]. "Strength (and safety) in numbers (RAID, disk storage technology)," Byte 15:13 (December), 337-339.Google ScholarGoogle Scholar
  23. Anderson, T. E., D. E. Culler, and D. Patterson [1995]. "A case for NOW (networks of workstations)," IEEE Micro 15:1 (February), 54-64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Ang, B., D. Chiou, D. Rosenband, M. Ehrlich, L. Rudolph, and Arvind [1998]. "StarTVoyager: A flexible platform for exploring scalable SMP issues," Proc. ACM/IEEE Conf. on Supercomputing , November 7-13, 1998, Orlando, FL. Google ScholarGoogle Scholar
  25. Anjan, K. V., and T. M. Pinkston [1995]. "An efficient, fully-adaptive deadlock recovery scheme: Disha," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google ScholarGoogle Scholar
  26. Anon. et al. [1985]. A Measure of Transaction Processing Power , Tandem Tech. Rep. TR85.2. Also appears in Datamation 31:7 (April), 112-118, 1985. Google ScholarGoogle Scholar
  27. Apache Hadoop. [2011]. http://hadoop.apache.org.Google ScholarGoogle Scholar
  28. Archibald, J., and J.-L. Baer [1986]. "Cache coherence protocols: Evaluation using a multiprocessor simulation model," ACM Trans. on Computer Systems 4:4 (November), 273-298. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Armbrust, M., A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia [2009]. Above the Clouds: A Berkeley View of Cloud Computing , Tech. Rep. UCB/EECS-2009-28, University of California, Berkeley (http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html).Google ScholarGoogle Scholar
  30. Arpaci, R. H., D. E. Culler, A. Krishnamurthy, S. G. Steinberg, and K. Yelick [1995]. "Empirical evaluation of the CRAY-T3D: A compiler perspective," 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google ScholarGoogle Scholar
  31. Asanovic, K. [1998]. "Vector Microprocessors," Ph. D. thesis, Computer Science Division, University of California, Berkeley. Google ScholarGoogle Scholar
  32. Associated Press. [2005]. "Gap Inc. shuts down two Internet stores for major overhaul," USATODAY.com , August 8, 2005.Google ScholarGoogle Scholar
  33. Atanasoff, J. V. [1940]. Computing Machine for the Solution of Large Systems of Linear Equations , Internal Report, Iowa State University, Ames.Google ScholarGoogle Scholar
  34. Atkins, M. [1991]. Performance and the i860 Microprocessor, IEEE Micro , 11:5 (September), 24-27, 72-78. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Austin, T. M., and G. Sohi [1992]. "Dynamic dependency analysis of ordinary programs," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 342-351. Google ScholarGoogle Scholar
  36. Babbay, F., and A. Mendelson [1998]. "Using value prediction to increase the power of speculative execution hardware," ACM Trans. on Computer Systems 16:3 (August), 234-270. Google ScholarGoogle Scholar
  37. Baer, J.-L., and W.-H. Wang [1988]. "On the inclusion property for multi-level cache hierarchies," Proc. 15th Annual Int'l. Symposium on Computer Architecture , May 30-June 2, 1988, Honolulu, Hawaii, 73-80. Google ScholarGoogle Scholar
  38. Bailey, D. H., E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga [1991]. "The NAS parallel benchmarks," Int'l. J. Supercomputing Applications 5, 63-73. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Bakoglu, H. B., G. F. Grohoski, L. E. Thatcher, J. A. Kaeli, C. R. Moore, D. P. Tattle, W. E. Male, W. R. Hardell, D. A. Hicks, M. Nguyen Phu, R. K. Montoye, W. T. Glover, and S. Dhawan [1989]. "IBM second-generation RISC processor organization," Proc. IEEE Int'l. Conf. on Computer Design , September 30-October 4, 1989, Rye, N.Y., 138-142.Google ScholarGoogle Scholar
  40. Balakrishnan, H., V. N. Padmanabhan, S. Seshan, and R. H. Katz [1997]. "A comparison of mechanisms for improving TCP performance over wireless links," IEEE/ACM Trans. on Networking 5:6 (December), 756-769. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Ball, T., and J. Larus [1993]. "Branch prediction for free," Proc. ACM SIGPLAN'93 Conference on Programming Language Design and Implementation (PLDI) , June 23-25, 1993, Albuquerque, N. M., 300-313. Google ScholarGoogle Scholar
  42. Banerjee, U. [1979]. "Speedup of Ordinary Programs," Ph. D. thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign. Google ScholarGoogle Scholar
  43. Barham, P., B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, and R. Neugebauer [2003]. "Xen and the art of virtualization," Proc. of the 19th ACM Symposium on Operating Systems Principles , October 19-22, 2003, Bolton Landing, N.Y. Google ScholarGoogle Scholar
  44. Barroso, L. A. [2010]. "Warehouse Scale Computing [keynote address]," Proc. ACM SIGMOD , June 8-10, 2010, Indianapolis, Ind. Google ScholarGoogle Scholar
  45. Barroso, L. A., and U. Holzle [2007], "The case for energy-proportional computing," IEEE Computer , 40:12 (December), 33-37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Barroso, L. A., and U. Holzle [2009]. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , Morgan & Claypool, San Rafael, Calif. Google ScholarGoogle Scholar
  47. Barroso, L. A., K. Gharachorloo, and E. Bugnion [1998]. "Memory system characterization of commercial workloads," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 3-14. Google ScholarGoogle Scholar
  48. Barton, R. S. [1961]. "A new approach to the functional design of a computer," Proc. Western Joint Computer Conf. , May 9-11, 1961, Los Angeles, Calif., 393-396. Google ScholarGoogle Scholar
  49. Bashe, C. J., W. Buchholz, G. V. Hawkins, J. L. Ingram, and N. Rochester [1981]. "The architecture of IBM's early computers," IBM J. Research and Development 25:5 (September), 363-375. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Bashe, C. J., L. R. Johnson, J. H. Palmer, and E. W. Pugh [1986]. IBM's Early Computers , MIT Press, Cambridge, Mass. Google ScholarGoogle Scholar
  51. Baskett, F., and T. W. Keller [1977]. "An evaluation of the Cray-1 processor," in High Speed Computer and Algorithm Organization , D. J. Kuck, D. H. Lawrie, and A. H. Sameh, eds., Academic Press, San Diego, 71-84.Google ScholarGoogle Scholar
  52. Baskett, F., T. Jermoluk, and D. Solomon [1988]. "The 4D-MP graphics superworkstation: Computing + graphics = 40 MIPS + 40 MFLOPS and 10,000 lighted polygons per second," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 468-471.Google ScholarGoogle Scholar
  53. BBN Laboratories. [1986]. Butterfly Parallel Processor Overview , Tech. Rep. 6148, BBN Laboratories, Cambridge, Mass.Google ScholarGoogle Scholar
  54. Bell, C. G. [1984]. "The mini and micro industries," IEEE Computer 17:10 (October), 14-30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Bell, C. G. [1985]. "Multis: A new class of multiprocessor computers," Science 228 (April 26), 462-467.Google ScholarGoogle ScholarCross RefCross Ref
  56. Bell, C. G. [1989]. "The future of high performance computers in science and engineering," Communications of the ACM 32:9 (September), 1091-1101. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Bell, G., and J. Gray [2001]. Crays, Clusters and Centers , Tech. Rep. MSR-TR-2001-76, Microsoft Research, Redmond, Wash.Google ScholarGoogle Scholar
  58. Bell, C. G., and J. Gray [2002]. "What's next in high performance computing?" CACM 45:2 (February), 91-95. Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. Bell, C. G., and A. Newell [1971]. Computer Structures: Readings and Examples , McGraw-Hill, New York. Google ScholarGoogle Scholar
  60. Bell, C. G., and W. D. Strecker [1976]. "Computer structures: What have we learned from the PDP-11?," Third Annual Int'l. Symposium on Computer Architecture (ISCA) , January 19-21, 1976, Tampa, Fla., 1-14. Google ScholarGoogle Scholar
  61. Bell, C. G., and W. D. Strecker [1998]. "Computer structures: What have we learned from the PDP-11?" 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 138-151. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. Bell, C. G., J. C. Mudge, and J. E. McNamara [1978]. A DEC View of Computer Engineering , Digital Press, Bedford, Mass.Google ScholarGoogle Scholar
  63. Bell, C. G., R. Cady, H. McFarland, B. DeLagi, J. O'Laughlin, R. Noonan, and W. Wulf [1970]. "A new architecture for mini-computers: The DEC PDP-11," Proc. AFIPS Spring Joint Computer Conf. , May 5-May 7, 1970, Atlantic City, N. J., 657-675. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Benes, V. E. [1962]. "Rearrangeable three stage connecting networks," Bell System Technical Journal 41, 1481-1492.Google ScholarGoogle ScholarCross RefCross Ref
  65. Bertozzi, D., A. Jalabert, S. Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli [2005]. "NoC synthesis flow for customized domain specific multiprocessor systems-on-chip," IEEE Trans. on Parallel and Distributed Systems 16:2 (February), 113-130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Bhandarkar, D. P. [1995]. Alpha Architecture and Implementations , Digital Press, Newton, Mass.Google ScholarGoogle Scholar
  67. Bhandarkar, D. P., and D. W. Clark [1991]. "Performance from architecture: Comparing a RISC and a CISC with similar hardware organizations," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 310-319. Google ScholarGoogle Scholar
  68. Bhandarkar, D. P., and J. Ding [1997]. "Performance characterization of the Pentium Pro processor," Proc. Third Int'l. Symposium on High-Performance Computer Architecture , February 1-February 5, 1997, San Antonio, Tex., 288-297. Google ScholarGoogle Scholar
  69. Bhuyan, L. N., and D. P. Agrawal [1984]. "Generalized hypercube and hyperbus structures for a computer network," IEEE Trans. on Computers 32:4 (April), 322-333. Google ScholarGoogle Scholar
  70. Bienia, C., S. Kumar, P. S. Jaswinder, and K. Li [2008]. The Parsec Benchmark Suite: Characterization and Architectural Implications , Tech. Rep. TR-811-08, Princeton University, Princeton, N. J.Google ScholarGoogle Scholar
  71. Bier, J. [1997]. "The Evolution of DSP Processors," presentation at Univesity of California, Berkeley, November 14.Google ScholarGoogle Scholar
  72. Bird, S., A. Phansalkar, L. K. John, A. Mericas, and R. Indukuru [2007]. "Characterization of performance of SPEC CPU benchmarks on Intel's Core Microarchitecture based processor," Proc. 2007 SPEC Benchmark Workshop , January 21, 2007, Austin, Tex.Google ScholarGoogle Scholar
  73. Birman, M., A. Samuels, G. Chu, T. Chuk, L. Hu, J. McLeod, and J. Barnes [1990]. "Developing the WRL3170/3171 SPARC floating-point coprocessors," IEEE Micro 10:1, 55-64. Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. Blackburn, M., R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanovic, T. VanDrunen, D. von Dincklage, and B. Wiedermann [2006]. "The DaCapo benchmarks: Java benchmarking development and analysis," ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) , October 22-26, 2006, 169-190. Google ScholarGoogle Scholar
  75. Blaum, M., J. Bruck, and A. Vardy [1996]. "MDS array codes with independent parity symbols," IEEE Trans. on Information Theory , IT-42 (March), 529-42. Google ScholarGoogle ScholarDigital LibraryDigital Library
  76. Blaum, M., J. Brady, J. Bruck, and J. Menon [1994]. "EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 245-254. Google ScholarGoogle Scholar
  77. Blaum, M., J. Brady, J. Bruck, and J. Menon [1995]. "EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures," IEEE Trans. on Computers 44:2 (February), 192-202. Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. Blaum, M., J. Brady, J., Bruck, J. Menon, and A. Vardy [2001]. "The EVENODD code and its generalization," in H. Jin, T. Cortes, and R. Buyya, eds., High Performance Mass Storage and Parallel I/O: Technologies and Applications , Wiley-IEEE, New York, 187-208.Google ScholarGoogle Scholar
  79. Bloch, E. [1959]. "The engineering design of the Stretch computer," 1959 Proceedings of the Eastern Joint Computer Conf. , December 1-3, 1959, Boston, Mass., 48-59. Google ScholarGoogle Scholar
  80. Boddie, J. R. [2000]. "History of DSPs," www.lucent.com/micro/dsp/dsphist.html.Google ScholarGoogle Scholar
  81. Bolt, K. M. [2005]. "Amazon sees sales rise, profit fall," Seattle Post-Intelligencer , October 25 (http://seattlepi.nwsource.com/business/245943_techearns26.html).Google ScholarGoogle Scholar
  82. Bordawekar, R., U. Bondhugula, R. Rao [2010]. "Believe It or Not!: Multi-core CPUs can Match GPU Performance for a FLOP-Intensive Application!" 19th International Conference on Parallel Architecture and Compilation Techniques (PACT 2010) . Vienna, Austria, September 11-15, 2010, 537-538. Google ScholarGoogle Scholar
  83. Borg, A., R. E. Kessler, and D. W. Wall [1990]. "Generation and analysis of very long address traces," 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 270-279. Google ScholarGoogle Scholar
  84. Bouknight, W. J., S. A. Deneberg, D. E. McIntyre, J. M. Randall, A. H. Sameh, and D. L. Slotnick [1972]. "The Illiac IV system," Proc. IEEE 60:4, 369-379. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 306-316. Google ScholarGoogle ScholarCross RefCross Ref
  85. Brady, J. T. [1986]. "A theory of productivity in the creative process," IEEE CG&A (May), 25-34. Google ScholarGoogle Scholar
  86. Brain, M. [2000]. "Inside a Digital Cell Phone," www.howstuffworks.com/insidecellphone. htm.Google ScholarGoogle Scholar
  87. Brandt, M., J. Brooks, M. Cahir, T. Hewitt, E. Lopez-Pineda, and D. Sandness [2000]. The Benchmarker's Guide for Cray SV1 Systems. Cray Inc., Seattle, Wash.Google ScholarGoogle Scholar
  88. Brent, R. P., and H. T. Kung [1982]. "A regular layout for parallel adders," IEEE Trans. on Computers C-31, 260-264. Google ScholarGoogle ScholarDigital LibraryDigital Library
  89. Brewer, E. A., and B. C. Kuszmaul [1994]. "How to get good performance from the CM-5 data network," Proc. Eighth Int'l. Parallel Processing Symposium , April 26-27, 1994, Cancun, Mexico. Google ScholarGoogle ScholarCross RefCross Ref
  90. Brin, S., and L. Page [1998]. "The anatomy of a large-scale hypertextual Web search engine," Proc. 7th Int'l. World Wide Web Conf. , April 14-18, 1998, Brisbane, Queensland, Australia, 107-117. Google ScholarGoogle Scholar
  91. Brown, A., and D. A. Patterson [2000]. "Towards maintainability, availability, and growth benchmarks: A case study of software RAID systems." Proc. 2000 USENIX Annual Technical Conf. , June 18-23, 2000, San Diego, Calif. Google ScholarGoogle Scholar
  92. Bucher, I. V., and A. H. Hayes [1980]. "I/O performance measurement on Cray-1 and CDC 7000 computers," Proc. Computer Performance Evaluation Users Group , 16th Meeting , NBS 500-65, 245-254.Google ScholarGoogle Scholar
  93. Bucher, I. Y. [1983]. "The computational speed of supercomputers," Proc. Int'l. Conf. on Measuring and Modeling of Computer Systems (SIGMETRICS 1983) , August 29-31, 1983, Minneapolis, Minn., 151-165. Google ScholarGoogle Scholar
  94. Bucholtz, W. [1962]. Planning a Computer System: Project Stretch , McGraw-Hill, New York. Google ScholarGoogle Scholar
  95. Burgess, N., and T. Williams [1995]. "Choices of operand truncation in the SRT division algorithm," IEEE Trans. on Computers 44:7, 933-938. Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. Burkhardt III, H., S. Frank, B. Knobe, and J. Rothnie [1992]. Overview of the KSR1 Computer System , Tech. Rep. KSR-TR-9202001, Kendall Square Research, Boston, Mass.Google ScholarGoogle Scholar
  97. Burks, A. W., H. H. Goldstine, and J. von Neumann [1946]. "Preliminary discussion of the logical design of an electronic computing instrument," Report to the U. S. Army Ordnance Department, p. 1; also appears in Papers of John von Neumann , W. Aspray and A. Burks, eds., MIT Press, Cambridge, Mass., and Tomash Publishers, Los Angeles, Calif., 1987, 97-146.Google ScholarGoogle Scholar
  98. Calder, B., G. Reinman, and D. M. Tullsen [1999]. "Selective value prediction," Proc. 26th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 2-4, 1999, Atlanta, Ga. Google ScholarGoogle Scholar
  99. Calder, B., D. Grunwald, M. Jones, D. Lindsay, J. Martin, M. Mozer, and B. Zorn [1997]. "Evidence-based static branch prediction using machine learning," ACM Trans. Program. Lang. Syst. 19:1, 188-222. Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. Callahan, D., J. Dongarra, and D. Levine [1988]. "Vectorizing compilers: A test suite and results," Proc. ACM/IEEE Conf. on Supercomputing , November 12-17, 1988, Orland, Fla., 98-105. Google ScholarGoogle Scholar
  101. Cantin, J. F., and M. D. Hill [2001]. "Cache Performance for Selected SPEC CPU2000 Benchmarks," www.jfred.org/cache-data.html (June).Google ScholarGoogle Scholar
  102. Cantin, J. F., and M. D. Hill [2003]. "Cache Performance for SPEC CPU2000 Benchmarks, Version 3.0," www.cs.wisc.edu/multifacet/misc/spec2000cache-data/index.html.Google ScholarGoogle Scholar
  103. Carles, S. [2005]. "Amazon reports record Xmas season, top game picks," Gamasutra , December 27 (http://www.gamasutra.com/php-bin/news_index.php?story=7630.)Google ScholarGoogle Scholar
  104. Carter, J., and K. Rajamani [2010]. "Designing energy-efficient servers and data centers," IEEE Computer 43:7 (July), 76-78. Google ScholarGoogle ScholarDigital LibraryDigital Library
  105. Case, R. P., and A. Padegs [1978]. "The architecture of the IBM System/370," Communications of the ACM 21:1, 73-96. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 830-855. Google ScholarGoogle ScholarDigital LibraryDigital Library
  106. Censier, L., and P. Feautrier [1978]. "A new solution to coherence problems in multicache systems," IEEE Trans. on Computers C-27:12 (December), 1112-1118. Google ScholarGoogle ScholarDigital LibraryDigital Library
  107. Chandra, R., S. Devine, B. Verghese, A. Gupta, and M. Rosenblum [1994]. "Scheduling and page migration for multiprocessor compute servers," Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 4-7, 1994, San Jose, Calif., 12-24. Google ScholarGoogle Scholar
  108. Chang, F., J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber [2006]. "Bigtable: A distributed storage system for structured data," Proc. 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI '06) , November 6-8, 2006, Seattle, Wash. Google ScholarGoogle Scholar
  109. Chang, J., J. Meza, P. Ranganathan, C. Bash, and A. Shah [2010]. "Green server design: Beyond operational energy to sustainability," Proc. Workshop on Power Aware Computing and Systems (HotPower '10) , October 3, 2010, Vancouver, British Columbia. Google ScholarGoogle Scholar
  110. Chang, P. P., S. A. Mahlke, W. Y. Chen, N. J. Warter, and W. W. Hwu [1991]. "IMPACT: An architectural framework for multiple-instruction-issue processors," 18th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 27-30, 1991, Toronto, Canada, 266-275. Google ScholarGoogle Scholar
  111. Charlesworth, A. E. [1981]. "An approach to scientific array processing: The architecture design of the AP-120B/FPS-164 family," Computer 14:9 (September), 18-27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  112. Charlesworth, A. [1998]. "Starfire: Extending the SMP envelope," IEEE Micro 18:1 (January/February), 39-49. Google ScholarGoogle ScholarDigital LibraryDigital Library
  113. Chen, P. M., and E. K. Lee [1995]. "Striping in a RAID level 5 disk array," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 15-19, 1995, Ottawa, Canada, 136-145. Google ScholarGoogle Scholar
  114. Chen, P. M., G. A. Gibson, R. H. Katz, and D. A. Patterson [1990]. "An evaluation of redundant arrays of inexpensive disks using an Amdahl 5890," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 22-25, 1990, Boulder, Colo. Google ScholarGoogle Scholar
  115. Chen, P. M., E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson [1994]. "RAID: High-performance, reliable secondary storage," ACM Computing Surveys 26:2 (June), 145-188. Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. Chen, S. [1983]. "Large-scale and high-speed multiprocessor system for scientific applications," Proc. NATO Advanced Research Workshop on High-Speed Computing , June 20-22, 1983, Julich, West Germany. Also appears in K. Hwang, ed., "Superprocessors: Design and applications," IEEE (August), 602-609, 1984.Google ScholarGoogle Scholar
  117. Chen, T. C. [1980]. "Overlap and parallel processing," in H. Stone, ed., Introduction to Computer Architecture , Science Research Associates, Chicago, 427-486.Google ScholarGoogle Scholar
  118. Chow, F. C. [1983]. "A Portable Machine-Independent Global Optimizer--Design and Measurements," Ph. D. thesis, Stanford University, Palo Alto, Calif. Google ScholarGoogle Scholar
  119. Chrysos, G. Z., and J. S. Emer [1998]. "Memory dependence prediction using store sets," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 142-153. Google ScholarGoogle Scholar
  120. Clark, B., T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. Neefe Matthews [2004]. "Xen and the art of repeated research," Proc. USENIX Annual Technical Conf. , June 27-July 2, 2004, 135-144. Google ScholarGoogle Scholar
  121. Clark, D. W. [1983]. "Cache performance of the VAX-11/780," ACM Trans. on Computer Systems 1:1, 24-37. Google ScholarGoogle ScholarDigital LibraryDigital Library
  122. Clark, D. W. [1987]. "Pipelining and performance in the VAX 8800 processor," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 173-177. Google ScholarGoogle Scholar
  123. Clark, D. W., and J. S. Emer [1985]. "Performance of the VAX-11/780 translation buffer: Simulation and measurement," ACM Trans. on Computer Systems 3:1 (February), 31-62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  124. Clark, D., and H. Levy [1982]. "Measurement and analysis of instruction set use in the VAX-11/780," Proc. Ninth Annual Int'l. Symposium on Computer Architecture (ISCA) , April 26-29, 1982, Austin, Tex., 9-17. Google ScholarGoogle Scholar
  125. Clark, D., and W. D. Strecker [1980]. "Comments on 'the case for the reduced instruction set computer,'" Computer Architecture News 8:6 (October), 34-38. Google ScholarGoogle ScholarDigital LibraryDigital Library
  126. Clark, W. A. [1957]. "The Lincoln TX-2 computer development," Proc. Western Joint Computer Conference , February 26-28, 1957, Los Angeles, 143-145. Google ScholarGoogle Scholar
  127. Clidaras, J., C. Johnson, and B. Felderman [2010]. Private communication. Climate Savers Computing Initiative. [2007]. "Efficiency Specs," http://www. climatesaverscomputing.org/.Google ScholarGoogle Scholar
  128. Clos, C. [1953]. "A study of non-blocking switching networks," Bell Systems Technical Journal 32 (March), 406-424.Google ScholarGoogle ScholarCross RefCross Ref
  129. Cody, W. J., J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson [1984]. "A proposed radix- and word-lengthindependent standard for floating-point arithmetic," IEEE Micro 4:4, 86-100. Google ScholarGoogle ScholarDigital LibraryDigital Library
  130. Colwell, R. P., and R. Steck [1995]. "A 0.6 µm BiCMOS processor with dynamic execution." Proc. of IEEE Int'l. Symposium on Solid State Circuits (ISSCC) , February 15-17, 1995, San Francisco, 176-177.Google ScholarGoogle Scholar
  131. Colwell, R. P., R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman [1987]. "A VLIW architecture for a trace scheduling compiler," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 180-192. Google ScholarGoogle Scholar
  132. Comer, D. [1993]. Internetworking with TCP/IP , 2nd ed., Prentice Hall, Englewood Cliffs, N. J. Google ScholarGoogle Scholar
  133. Compaq Computer Corporation. [1999]. Compiler Writer's Guide for the Alpha 21264 , Order Number EC-RJ66A-TE, June, www1.support.compaq.com/alpha-tools/documentation/current/21264_EV67/ec-rj66a-te_comp_writ_gde_for_alpha21264.pdf.Google ScholarGoogle Scholar
  134. Conti, C., D. H. Gibson, and S. H. Pitkowsky [1968]. "Structural aspects of the System/ 360 Model 85. Part I. General organization," IBM Systems J. 7:1, 2-14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  135. Coonen, J. [1984]. "Contributions to a Proposed Standard for Binary Floating-Point Arithmetic," Ph. D. thesis, University of California, Berkeley. Google ScholarGoogle Scholar
  136. Corbett, P., B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar [2004]. "Row-diagonal parity for double disk failure correction," Proc. 3rd USENIX Conf. on File and Storage Technology (FAST '04) , March 31-April 2, 2004, San Francisco. Google ScholarGoogle Scholar
  137. Crawford, J., and P. Gelsinger [1988]. Programming the 80386 , Sybex Books, Alameda, Calif.Google ScholarGoogle Scholar
  138. Culler, D. E., J. P. Singh, and A. Gupta [1999]. Parallel Computer Architecture: A Hardware/Software Approach , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  139. Curnow, H. J., and B. A. Wichmann [1976]. "A synthetic benchmark," The Computer J. 19:1, 43-49.Google ScholarGoogle ScholarCross RefCross Ref
  140. Cvetanovic, Z., and R. E. Kessler [2000]. "Performance analysis of the Alpha 21264- based Compaq ES40 system," Proc. 27th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 10-14, 2000, Vancouver, Canada, 192-202. Google ScholarGoogle Scholar
  141. Dally, W. J. [1990]. "Performance analysis of k -ary n -cube interconnection networks," IEEE Trans. on Computers 39:6 (June), 775-785. Google ScholarGoogle ScholarDigital LibraryDigital Library
  142. Dally, W. J. [1992]. "Virtual channel flow control," IEEE Trans. on Parallel and Distributed Systems 3:2 (March), 194-205. Google ScholarGoogle ScholarDigital LibraryDigital Library
  143. Dally, W. J. [1999]. "Interconnect limited VLSI architecture," Proc. of the International Interconnect Technology Conference , May 24-26, 1999, San Francisco.Google ScholarGoogle Scholar
  144. Dally, W. J., and C. I. Seitz [1986]. "The torus routing chip," Distributed Computing 1:4, 187-196.Google ScholarGoogle ScholarCross RefCross Ref
  145. Dally, W. J., and B. Towles [2001]. "Route packets, not wires: On-chip interconnection networks," Proc. 38th Design Automation Conference , June 18-22, 2001, Las Vegas. Google ScholarGoogle Scholar
  146. Dally, W. J., and B. Towles [2003]. Principles and Practices of Interconnection Networks , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  147. Darcy, J. D., and D. Gay [1996]. "FLECKmarks: Measuring floating point performance using a full IEEE compliant arithmetic benchmark," CS 252 class project, University of California, Berkeley (see HTTP.CS.Berkeley.EDU/~darcy/Projects/cs252/).Google ScholarGoogle Scholar
  148. Darley, H. M. et al. [1989]. "Floating Point/Integer Processor with Divide and Square Root Functions," U. S. Patent 4,878,190, October 31.Google ScholarGoogle Scholar
  149. Davidson, E. S. [1971]. "The design and control of pipelined function generators," Proc. IEEE Conf. on Systems , Networks , and Computers , January 19-21, 1971, Oaxtepec, Mexico, 19-21.Google ScholarGoogle Scholar
  150. Davidson, E. S., A. T. Thomas, L. E. Shar, and J. H. Patel [1975]. "Effective control for pipelined processors," Proc. IEEE COMPCON , February 25-27, 1975, San Francisco, 181-184.Google ScholarGoogle Scholar
  151. Davie, B. S., L. L. Peterson, and D. Clark [1999]. Computer Networks: A Systems Approach , 2nd ed., Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  152. Dean, J. [2009]. "Designs, lessons and advice from building large distributed systems [keynote address]," Proc. 3rd ACM SIGOPS Int'l. Workshop on Large-Scale Distributed Systems and Middleware , Co-located with the 22nd ACM Symposium on Operating Systems Principles , October 11-14, 2009, Big Sky, Mont.Google ScholarGoogle Scholar
  153. Dean, J., and S. Ghemawat [2004]. "MapReduce: Simplified data processing on large clusters." In Proc. Operating Systems Design and Implementation (OSDI) , December 6-8, 2004, San Francisco, Calif., 137-150. Google ScholarGoogle Scholar
  154. Dean, J., and S. Ghemawat [2008]. "MapReduce: Simplified data processing on large clusters," Communications of the ACM , 51:1, 107-113. Google ScholarGoogle ScholarDigital LibraryDigital Library
  155. DeCandia, G., D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels [2007]. "Dynamo: Amazon's highly available key-value store," Proc. 21st ACM Symposium on Operating Systems Principles , October 14-17, 2007, Stevenson, Wash. Google ScholarGoogle Scholar
  156. Dehnert, J. C., P. Y.-T. Hsu, and J. P. Bratt [1989]. "Overlapped loop support on the Cydra 5," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, Mass., 26-39. Google ScholarGoogle Scholar
  157. Demmel, J. W., and X. Li [1994]. "Faster numerical algorithms via exception handling," IEEE Trans. on Computers 43:8, 983-992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  158. Denehy, T. E., J. Bent, F. I. Popovici, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau [2004]. "Deconstructing storage arrays," Proc. 11th Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 7-13, 2004, Boston, Mass., 59-71. Google ScholarGoogle Scholar
  159. Desurvire, E. [1992]. "Lightwave communications: The fifth generation," Scientific American (International Edition) 266:1 (January), 96-103.Google ScholarGoogle Scholar
  160. Diep, T. A., C. Nelson, and J. P. Shen [1995]. "Performance evaluation of the PowerPC 620 microarchitecture," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google ScholarGoogle Scholar
  161. Digital Semiconductor. [1996]. Alpha Architecture Handbook , Version 3 , Digital Press, Maynard, Mass.Google ScholarGoogle Scholar
  162. Ditzel, D. R., and H. R. McLellan [1987]. "Branch folding in the CRISP microprocessor: Reducing the branch delay to zero," Proc. 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 2-7. Google ScholarGoogle Scholar
  163. Ditzel, D. R., and D. A. Patterson [1980]. "Retrospective on high-level language computer architecture," Proc. Seventh Annual Int'l. Symposium on Computer Architecture (ISCA) , May 6-8, 1980, La Baule, France, 97-104. Google ScholarGoogle Scholar
  164. Doherty, W. J., and R. P. Kelisky [1979]. "Managing VM/CMS systems for user effectiveness," IBM Systems J. 18:1, 143-166. Google ScholarGoogle ScholarDigital LibraryDigital Library
  165. Dongarra, J. J. [1986]. "A survey of high performance processors," Proc. IEEE COMPCON , March 3-6, 1986, San Francisco, 8-11.Google ScholarGoogle Scholar
  166. Dongarra, J., T. Sterling, H. Simon, and E. Strohmaier [2005]. "High-performance computing: Clusters, constellations, MPPs, and future directions," Computing in Science & Engineering , 7:2 (March/April), 51-59. Google ScholarGoogle Scholar
  167. Douceur, J. R., and W. J. Bolosky [1999]. "A large scale study of file-system contents," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 1-9, 1999, Atlanta, Ga., 59-69. Google ScholarGoogle Scholar
  168. Douglas, J. [2005]. "Intel 8xx series and Paxville Xeon-MP microprocessors," paper presented at Hot Chips 17, August 14-16, 2005, Stanford University, Palo Alto, Calif.Google ScholarGoogle Scholar
  169. Duato, J. [1993]. "A new theory of deadlock-free adaptive routing in wormhole networks," IEEE Trans. on Parallel and Distributed Systems 4:12 (December) 1320-1331. Google ScholarGoogle ScholarDigital LibraryDigital Library
  170. Duato, J., and T. M. Pinkston [2001]. "A general theory for deadlock-free adaptive routing using a mixed set of resources," IEEE Trans. on Parallel and Distributed Systems 12:12 (December), 1219-1235. Google ScholarGoogle ScholarDigital LibraryDigital Library
  171. Duato, J., S. Yalamanchili, and L. Ni [2003]. Interconnection Networks: An Engineering Approach , 2nd printing, Morgan Kaufmann, San Francisco.Google ScholarGoogle Scholar
  172. Duato, J., I. Johnson, J. Flich, F. Naven, P. Garcia, and T. Nachiondo [2005a]. "A new scalable and cost-effective congestion management strategy for lossless multistage interconnection networks," Proc. 11th Int'l. Symposium on High-Performance Computer Architecture , February 12-16, 2005, San Francisco. Google ScholarGoogle ScholarDigital LibraryDigital Library
  173. Duato, J., O. Lysne, R. Pang, and T. M. Pinkston [2005b]. "Part I: A theory for deadlockfree dynamic reconfiguration of interconnection networks," IEEE Trans. on Parallel and Distributed Systems 16:5 (May), 412-427. Google ScholarGoogle ScholarDigital LibraryDigital Library
  174. Dubois, M., C. Scheurich, and F. Briggs [1988]. "Synchronization, coherence, and event ordering," IEEE Computer 21:2 (February), 9-21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  175. Dunigan, W., K. Vetter, K. White, and P. Worley [2005]. "Performance evaluation of the Cray X1 distributed shared memory architecture," IEEE Micro January/February, 30-40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  176. Eden, A., and T. Mudge [1998]. "The YAGS branch prediction scheme," Proc. of the 31st Annual ACM/IEEE Int'l. Symposium on Microarchitecture , November 30-December 2, 1998, Dallas, Tex., 69-80. Google ScholarGoogle Scholar
  177. Edmondson, J. H., P. I. Rubinfield, R. Preston, and V. Rajagopalan [1995]. "Superscalar instruction execution in the 21164 Alpha microprocessor," IEEE Micro 15:2, 33-43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  178. Eggers, S. [1989]. "Simulation Analysis of Data Sharing in Shared Memory Multiprocessors," Ph. D. thesis, University of California, Berkeley. Google ScholarGoogle Scholar
  179. Elder, J., A. Gottlieb, C. K. Kruskal, K. P. McAuliffe, L. Randolph, M. Snir, P. Teller, and J. Wilson [1985]. "Issues related to MIMD shared-memory computers: The NYU Ultracomputer approach," Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-19, 1985, Boston, Mass., 126-135. Google ScholarGoogle Scholar
  180. Ellis, J. R. [1986]. Bulldog: A Compiler for VLIW Architectures , MIT Press, Cambridge, Mass. Google ScholarGoogle Scholar
  181. Emer, J. S., and D. W. Clark [1984]. "A characterization of processor performance in the VAX-11/780," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 301-310. Google ScholarGoogle Scholar
  182. Enriquez, P. [2001]. "What happened to my dial tone? A study of FCC service disruption reports," poster, Richard Tapia Symposium on the Celebration of Diversity in Computing , October 18-20, Houston, Tex.Google ScholarGoogle Scholar
  183. Erlichson, A., N. Nuckolls, G. Chesson, and J. L. Hennessy [1996]. "SoftFLASH: Analyzing the performance of clustered distributed virtual shared memory," Proc. Seventh Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass., 210-220. Google ScholarGoogle Scholar
  184. Esmaeilzadeh, H., T. Cao, Y. Xi, S. M. Blackburn, and K. S. McKinley [2011]. "Looking Back on the Language and Hardware Revolution: Measured Power, Performance, and Scaling," Proc. 16th Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 5-11, 2011, Newport Beach, Calif. Google ScholarGoogle Scholar
  185. Evers, M., S. J. Patel, R. S. Chappell, and Y. N. Patt [1998]. "An analysis of correlation and predictability: What makes two-level branch predictors work," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 52-61. Google ScholarGoogle Scholar
  186. Fabry, R. S. [1974]. "Capability based addressing," Communications of the ACM 17:7 (July), 403-412. Google ScholarGoogle ScholarDigital LibraryDigital Library
  187. Falsafi, B., and D. A. Wood [1997]. "Reactive NUMA: A design for unifying S-COMA and CC-NUMA," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo., 229-240. Google ScholarGoogle Scholar
  188. Fan, X., W. Weber, and L. A. Barroso [2007]. "Power provisioning for a warehouse-sized computer," Proc. 34th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 9-13, 2007, San Diego, Calif. Google ScholarGoogle Scholar
  189. Farkas, K. I., and N. P. Jouppi [1994]. "Complexity/performance trade-offs with nonblocking loads," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago. Google ScholarGoogle Scholar
  190. Farkas, K. I., N. P. Jouppi, and P. Chow [1995]. "How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors?," Proc. First IEEE Symposium on High-Performance Computer Architecture , January 22-25, 1995, Raleigh, N.C., 78-89. Google ScholarGoogle ScholarCross RefCross Ref
  191. Farkas, K. I., P. Chow, N. P. Jouppi, and Z. Vranesic [1997]. "Memory-system design considerations for dynamically-scheduled processors," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo., 133-143. Google ScholarGoogle Scholar
  192. Fazio, D. [1987]. "It's really much more fun building a supercomputer than it is simply inventing one," Proc. IEEE COMPCON , February 23-27, 1987, San Francisco, 102-105.Google ScholarGoogle Scholar
  193. Fisher, J. A. [1981]. "Trace scheduling: A technique for global microcode compaction," IEEE Trans. on Computers 30:7 (July), 478-490. Google ScholarGoogle Scholar
  194. Fisher, J. A. [1983]. "Very long instruction word architectures and ELI-512," 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 140-150. Google ScholarGoogle Scholar
  195. Fisher, J. A., and S. M. Freudenberger [1992]. "Predicting conditional branches from previous runs of a program," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, Mass., 85-95. Google ScholarGoogle Scholar
  196. Fisher, J. A., and B. R. Rau [1993]. Journal of Supercomputing , January (special issue).Google ScholarGoogle Scholar
  197. Fisher, J. A., J. R. Ellis, J. C. Ruttenberg, and A. Nicolau [1984]. "Parallel processing: A smart compiler and a dumb processor," Proc. SIGPLAN Conf. on Compiler Construction , June 17-22, 1984, Montreal, Canada, 11-16. Google ScholarGoogle Scholar
  198. Flemming, P. J., and J. J. Wallace [1986]. "How not to lie with statistics: The correct way to summarize benchmarks results," Communications of the ACM 29:3 (March), 218-221. Google ScholarGoogle ScholarDigital LibraryDigital Library
  199. Flynn, M. J. [1966]. "Very high-speed computing systems," Proc. IEEE 54:12 (December), 1901-1909.Google ScholarGoogle ScholarCross RefCross Ref
  200. Forgie, J. W. [1957]. "The Lincoln TX-2 input-output system," Proc. Western Joint Computer Conference (February), Institute of Radio Engineers, Los Angeles, 156-160. Google ScholarGoogle Scholar
  201. Foster, C. C., and E. M. Riseman [1972]. "Percolation of code to enhance parallel dispatching and execution," IEEE Trans. on Computers C-21:12 (December), 1411- 1415. Google ScholarGoogle ScholarDigital LibraryDigital Library
  202. Frank, S. J. [1984]. "Tightly coupled multiprocessor systems speed memory access time," Electronics 57:1 (January), 164-169.Google ScholarGoogle ScholarCross RefCross Ref
  203. Freiman, C. V. [1961]. "Statistical analysis of certain binary division algorithms," Proc. IRE 49:1, 91-103.Google ScholarGoogle ScholarCross RefCross Ref
  204. Friesenborg, S. E., and R. J. Wicks [1985]. DASD Expectations: The 3380, 3380-23, and MVS/XA , Tech. Bulletin GG22-9363-02, IBM Washington Systems Center, Gaithersburg, Md.Google ScholarGoogle Scholar
  205. Fuller, S. H., and W. E. Burr [1977]. "Measurement and evaluation of alternative computer architectures," Computer 10:10 (October), 24-35. Google ScholarGoogle ScholarDigital LibraryDigital Library
  206. Furber, S. B. [1996]. ARM System Architecture , Addison-Wesley, Harlow, England (see www.cs.man.ac.uk/amulet/publications/books/ARMsysArch). Google ScholarGoogle Scholar
  207. Gagliardi, U. O. [1973]. "Report of workshop 4--software-related advances in computer hardware," Proc. Symposium on the High Cost of Software , September 17-19, 1973, Monterey, Calif., 99-120.Google ScholarGoogle Scholar
  208. Gajski, D., D. Kuck, D. Lawrie, and A. Sameh [1983]. "CEDAR--a large scale multiprocessor," Proc. Int'l. Conf. on Parallel Processing (ICPP) , August, Columbus, Ohio, 524-529.Google ScholarGoogle Scholar
  209. Gallagher, D. M., W. Y. Chen, S. A. Mahlke, J. C. Gyllenhaal, and W. W. Hwu [1994]. "Dynamic memory disambiguation using the memory conflict buffer," Proc. Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 4-7, Santa Jose, Calif., 183-193. Google ScholarGoogle Scholar
  210. Galles, M. [1996]. "Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip," Proc. IEEE HOT Interconnects '96 , August 15-17, 1996, Stanford University, Palo Alto, Calif.Google ScholarGoogle Scholar
  211. Game, M., and A. Booker [1999]. "CodePack code compression for PowerPC processors," MicroNews , 5:1, www.chips.ibm.com/micronews/vol5_no1/codepack.html.Google ScholarGoogle Scholar
  212. Gao, Q. S. [1993]. "The Chinese remainder theorem and the prime memory system," 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif. ( Computer Architecture News 21:2 (May), 337-340). Google ScholarGoogle Scholar
  213. Gap. [2005]. "Gap Inc. Reports Third Quarter Earnings," http://gapinc.com/public/documents/PR_Q405EarningsFeb2306.pdf.Google ScholarGoogle Scholar
  214. Gap. [2006]. "Gap Inc. Reports Fourth Quarter and Full Year Earnings," http://gapinc.com/public/documents/Q32005PressRelease_Final22.pdff.Google ScholarGoogle Scholar
  215. Garner, R., A. Agarwal, F. Briggs, E. Brown, D. Hough, B. Joy, S. Kleiman, S. Muchnick, M. Namjoo, D. Patterson, J. Pendleton, and R. Tuck [1988]. "Scalable processor architecture (SPARC)," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 278-283.Google ScholarGoogle Scholar
  216. Gebis, J., and D. Patterson [2007]. "Embracing and extending 20th-century instruction set architectures," IEEE Computer 40:4 (April), 68-75. Google ScholarGoogle ScholarDigital LibraryDigital Library
  217. Gee, J. D., M. D. Hill, D. N. Pnevmatikatos, and A. J. Smith [1993]. "Cache performance of the SPEC92 benchmark suite," IEEE Micro 13:4 (August), 17-27. Google ScholarGoogle ScholarDigital LibraryDigital Library
  218. Gehringer, E. F., D. P. Siewiorek, and Z. Segall [1987]. Parallel Processing: The Cm* Experience , Digital Press, Bedford, Mass. Google ScholarGoogle Scholar
  219. Gharachorloo, K., A. Gupta, and J. L. Hennessy [1992]. "Hiding memory latency using dynamic scheduling in shared-memory multiprocessors," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google ScholarGoogle Scholar
  220. Gharachorloo, K., D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. L. Hennessy [1990]. "Memory consistency and event ordering in scalable shared-memory multiprocessors," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 15-26. Google ScholarGoogle Scholar
  221. Ghemawat, S., H. Gobioff, and S.-T. Leung [2003]. "The Google file system," Proc. 19th ACM Symposium on Operating Systems Principles , October 19-22, 2003, Bolton Landing, N.Y. Google ScholarGoogle ScholarDigital LibraryDigital Library
  222. Gibson, D. H. [1967]. "Considerations in block-oriented systems design," AFIPS Conf. Proc. 30, 75-80. Google ScholarGoogle Scholar
  223. Gibson, G. A. [1992]. Redundant Disk Arrays: Reliable , Parallel Secondary Storage , ACM Distinguished Dissertation Series, MIT Press, Cambridge, Mass. Google ScholarGoogle Scholar
  224. Gibson, J. C. [1970]. "The Gibson mix," Rep. TR. 00.2043, IBM Systems Development Division, Poughkeepsie, N.Y. (research done in 1959).Google ScholarGoogle Scholar
  225. Gibson, J., R. Kunz, D. Ofelt, M. Horowitz, J. Hennessy, and M. Heinrich [2000]. "FLASH vs. (simulated) FLASH: Closing the simulation loop," Proc. Ninth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , November 12-15, Cambridge, Mass., 49-58. Google ScholarGoogle Scholar
  226. Glass, C. J., and L. M. Ni [1992]. "The Turn Model for adaptive routing," 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google ScholarGoogle Scholar
  227. Goldberg, D. [1991]. "What every computer scientist should know about floating-point arithmetic," Computing Surveys 23:1, 5-48. Google ScholarGoogle ScholarDigital LibraryDigital Library
  228. Goldberg, I. B. [1967]. "27 bits are not enough for 8-digit accuracy," Communications of the ACM 10:2, 105-106. Google ScholarGoogle ScholarDigital LibraryDigital Library
  229. Goldstein, S. [1987]. Storage Performance--An Eight Year Outlook , Tech. Rep. TR 03.308-1, Santa Teresa Laboratory, IBM Santa Teresa Laboratory, San Jose, Calif.Google ScholarGoogle Scholar
  230. Goldstine, H. H. [1972]. The Computer: From Pascal to von Neumann , Princeton University Press, Princeton, N. J. Google ScholarGoogle Scholar
  231. Gonzalez, J., and A. González [1998]. "Limits of instruction level parallelism with data speculation," Proc. Vector and Parallel Processing (VECPAR) Conf. , June 21-23, 1998, Porto, Portugal, 585-598. Google ScholarGoogle Scholar
  232. Goodman, J. R. [1983]. "Using cache memory to reduce processor memory traffic," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 124-131. Google ScholarGoogle Scholar
  233. Goralski, W. [1997]. SONET: A Guide to Synchronous Optical Network , McGraw-Hill, New York. Google ScholarGoogle Scholar
  234. Gosling, J. B. [1980]. Design of Arithmetic Units for Digital Computers , Springer-Verlag, New York.Google ScholarGoogle Scholar
  235. Gray, J. [1990]. "A census of Tandem system availability between 1985 and 1990," IEEE Trans. on Reliability , 39:4 (October), 409-418.Google ScholarGoogle ScholarCross RefCross Ref
  236. Gray, J. (ed.) [1993]. The Benchmark Handbook for Database and Transaction Processing Systems , 2nd ed., Morgan Kaufmann, San Francisco.Google ScholarGoogle Scholar
  237. Gray, J. [2006]. Sort benchmark home page, http://sortbenchmark.org/.Google ScholarGoogle Scholar
  238. Gray, J., and A. Reuter [1993]. Transaction Processing: Concepts and Techniques , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  239. Gray, J., and D. P. Siewiorek [1991]. "High-availability computer systems," Computer 24:9 (September), 39-48. Google ScholarGoogle ScholarDigital LibraryDigital Library
  240. Gray, J., and C. van Ingen [2005]. Empirical Measurements of Disk Failure Rates and Error Rates , MSR-TR-2005-166, Microsoft Research, Redmond, Wash.Google ScholarGoogle Scholar
  241. Greenberg, A., N. Jain, S. Kandula, C. Kim, P. Lahiri, D. Maltz, P. Patel, and S. Sengupta [2009]. "VL2: A Scalable and Flexible Data Center Network," in Proc. ACM SIGCOMM , August 17-21, 2009, Barcelona, Spain. Google ScholarGoogle ScholarDigital LibraryDigital Library
  242. Grice, C., and M. Kanellos [2000]. "Cell phone industry at crossroads: Go high or low?," CNET News , August 31, technews.netscape.com/news/0-1004-201-2518386- 0.html?tag=st.ne.1002.tgif.sf.Google ScholarGoogle Scholar
  243. Groe, J. B., and L. E. Larson [2000]. CDMA Mobile Radio Design , Artech House, Boston. Google ScholarGoogle Scholar
  244. Gunther, K. D. [1981]. "Prevention of deadlocks in packet-switched data transport systems," IEEE Trans. on Communications COM-29:4 (April), 512-524.Google ScholarGoogle ScholarCross RefCross Ref
  245. Hagersten, E., and M. Koster [1998]. "WildFire: A scalable path for SMPs," Proc. Fifth Int'l. Symposium on High-Performance Computer Architecture , January 9-12, 1999, Orlando, Fla. Google ScholarGoogle Scholar
  246. Hagersten, E., A. Landin, and S. Haridi [1992]. "DDM--a cache-only memory architecture," IEEE Computer 25:9 (September), 44-54. Google ScholarGoogle ScholarDigital LibraryDigital Library
  247. Hamacher, V. C., Z. G. Vranesic, and S. G. Zaky [1984]. Computer Organization , 2nd ed., McGraw-Hill, New York. Google ScholarGoogle Scholar
  248. Hamilton, J. [2009]. "Data center networks are in my way," paper presented at the Stanford Clean Slate CTO Summit, October 23, 2009 (http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_CleanSlateCTO2009.pdf).Google ScholarGoogle Scholar
  249. Hamilton, J. [2010]. "Cloud computing economies of scale," paper presented at the AWS Workshop on Genomics and Cloud Computing , June 8, 2010, Seattle, Wash. (http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_GenomicsCloud20100608.pdf).Google ScholarGoogle Scholar
  250. Handy, J. [1993]. The Cache Memory Book , Academic Press, Boston. Google ScholarGoogle Scholar
  251. Hauck, E. A., and B. A. Dent [1968]. "Burroughs' B6500/B7500 stack mechanism," Proc. AFIPS Spring Joint Computer Conf. , April 30-May 2, 1968, Atlantic City, N. J., 245-251. Google ScholarGoogle Scholar
  252. Heald, R., K. Aingaran, C. Amir, M. Ang, M. Boland, A. Das, P. Dixit, G. Gouldsberry, J. Hart, T. Horel, W.-J. Hsu, J. Kaku, C. Kim, S. Kim, F. Klass, H. Kwan, R. Lo, H. McIntyre, A. Mehta, D. Murata, S. Nguyen, Y.-P. Pai, S. Patel, K. Shin, K. Tam, S. Vishwanthaiah, J. Wu, G. Yee, and H. You [2000]. "Implementation of thirdgeneration SPARC V9 64-b microprocessor," ISSCC Digest of Technical Papers , 412-413 and slide supplement.Google ScholarGoogle Scholar
  253. Heinrich, J. [1993]. MIPS R4000 User's Manual , Prentice Hall, Englewood Cliffs, N. J. Henly, M., and B. McNutt [1989]. DASD I/O Characteristics: A Comparison of MVS to VM ," Tech. Rep. TR 02.1550 (May), IBM General Products Division, San Jose, Calif. Google ScholarGoogle Scholar
  254. Hennessy, J. [1984]. "VLSI processor architecture," IEEE Trans. on Computers C-33:11 (December), 1221-1246. Google ScholarGoogle ScholarDigital LibraryDigital Library
  255. Hennessy, J. [1985]. "VLSI RISC processors," VLSI Systems Design 6:10 (October), 22-32.Google ScholarGoogle Scholar
  256. Hennessy, J., N. Jouppi, F. Baskett, and J. Gill [1981]. "MIPS: A VLSI processor architecture," in CMU Conference on VLSI Systems and Computations , Computer Science Press, Rockville, Md.Google ScholarGoogle Scholar
  257. Hewlett-Packard. [1994]. PA-RISC 2.0 Architecture Reference Manual , 3rd ed., Hewlett-Packard, Palo Alto, Calif.Google ScholarGoogle Scholar
  258. Hewlett-Packard. [1998]. "HP's '5NINES:5MINUTES' Vision Extends Leadership and Redefines High Availability in Mission-Critical Environments," February 10, www.future.enterprisecomputing.hp.com/ia64/news/5nines_vision_pr.html.Google ScholarGoogle Scholar
  259. Hill, M. D. [1987]. "Aspects of Cache Memory and Instruction Buffer Performance," Ph. D. thesis, Tech. Rep. UCB/CSD 87/381, Computer Science Division, University of California, Berkeley. Google ScholarGoogle Scholar
  260. Hill, M. D. [1988]. "A case for direct mapped caches," Computer 21:12 (December), 25-40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  261. Hill, M. D. [1998]. "Multiprocessors should support simple memory consistency models," IEEE Computer 31:8 (August), 28-34. Google ScholarGoogle ScholarDigital LibraryDigital Library
  262. Hillis, W. D. [1985]. The Connection Multiprocessor , MIT Press, Cambridge, Mass.Google ScholarGoogle Scholar
  263. Hillis, W. D. and G. L. Steele [1986]. "Data parallel algorithms," Communications of the ACM 29:12 (December), 1170-1183. (http://doi.acm.org/10.1145/7902.7903). Google ScholarGoogle ScholarDigital LibraryDigital Library
  264. Hinton, G., D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel [2001]. "The microarchitecture of the Pentium 4 processor," Intel Technology Journal , February.Google ScholarGoogle Scholar
  265. Hintz, R. G., and D. P. Tate [1972]. "Control data STAR-100 processor design," Proc. IEEE COMPCON , September 12-14, 1972, San Francisco, 1-4.Google ScholarGoogle Scholar
  266. Hirata, H., K. Kimura, S. Nagamine, Y. Mochizuki, A. Nishimura, Y. Nakase, and T. Nishizawa [1992]. "An elementary processor architecture with simultaneous instruction issuing from multiple threads," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 136-145. Google ScholarGoogle Scholar
  267. Hitachi. [1997]. SuperH RISC Engine SH7700 Series Programming Manual , Hitachi, Santa Clara, Calif. (see www.halsp.hitachi.com/tech_prod/and search for title).Google ScholarGoogle Scholar
  268. Ho, R., K. W. Mai, and M. A. Horowitz [2001]. "The future of wires," Proc. of the IEEE 89:4 (April), 490-504.Google ScholarGoogle ScholarCross RefCross Ref
  269. Hoagland, A. S. [1963]. Digital Magnetic Recording , Wiley, New York.Google ScholarGoogle Scholar
  270. Hockney, R. W., and C. R. Jesshope [1988]. Parallel Computers 2: Architectures , Programming and Algorithms , Adam Hilger, Ltd., Bristol, England. Google ScholarGoogle Scholar
  271. Holland, J. H. [1959]. "A universal computer capable of executing an arbitrary number of subprograms simultaneously," Proc. East Joint Computer Conf. 16, 108-113. Google ScholarGoogle Scholar
  272. Holt, R. C. [1972]. "Some deadlock properties of computer systems," ACM Computer Surveys 4:3 (September), 179-196. Google ScholarGoogle ScholarDigital LibraryDigital Library
  273. Hopkins, M. [2000]. "A critical look at IA-64: Massive resources, massive ILP, but can it deliver?" Microprocessor Report , February.Google ScholarGoogle Scholar
  274. Hord, R. M. [1982]. The Illiac-IV , The First Supercomputer , Computer Science Press, Rockville, Md.Google ScholarGoogle Scholar
  275. Horel, T., and G. Lauterbach [1999]. "UltraSPARC-III: Designing third-generation 64-bit performance," IEEE Micro 19:3 (May-June), 73-85. Google ScholarGoogle ScholarDigital LibraryDigital Library
  276. Hospodor, A. D., and A. S. Hoagland [1993]. "The changing nature of disk controllers." Proc. IEEE 81:4 (April), 586-594.Google ScholarGoogle ScholarCross RefCross Ref
  277. Holzle, U. [2010]. "Brawny cores still beat wimpy cores, most of the time," IEEE Micro 30:4 (July/August).Google ScholarGoogle Scholar
  278. Hristea, C., D. Lenoski, and J. Keen [1997]. "Measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks," Proc. ACM/IEEE Conf. on Supercomputing , November 16-21, 1997, San Jose, Calif. Google ScholarGoogle Scholar
  279. Hsu, P. [1994]. "Designing the TFP microprocessor," IEEE Micro 18:2 (April), 2333. Google ScholarGoogle Scholar
  280. Huck, J. et al. [2000]. "Introducing the IA-64 Architecture" IEEE Micro , 20:5 (September-October), 12-23. Google ScholarGoogle ScholarDigital LibraryDigital Library
  281. Hughes, C. J., P. Kaul, S. V. Adve, R. Jain, C. Park, and J. Srinivasan [2001]. "Variability in the execution of multimedia applications and implications for architecture," Proc. 28th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 30-July 4, 2001, Goteborg, Sweden, 254-265. Google ScholarGoogle Scholar
  282. Hwang, K. [1979]. Computer Arithmetic: Principles , Architecture , and Design , Wiley, New York. Google ScholarGoogle Scholar
  283. Hwang, K. [1993]. Advanced Computer Architecture and Parallel Programming , McGraw-Hill, New York.Google ScholarGoogle Scholar
  284. Hwu, W.-M., and Y. Patt [1986]. "HPSm, a high performance restricted data flow architecture having minimum functionality," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo, 297-307. Google ScholarGoogle Scholar
  285. Hwu, W. W., S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. O. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery [1993]. "The superblock: An effective technique for VLIW and superscalar compilation," J. Supercomputing 7:1, 2 (March), 229-248. Google ScholarGoogle Scholar
  286. IBM. [1982]. The Economic Value of Rapid Response Time , GE20-0752-0, IBM, White Plains, N.Y., 11-82.Google ScholarGoogle Scholar
  287. IBM. [1990]. "The IBM RISC System/6000 processor" (collection of papers), IBM J. Research and Development 34:1 (January).Google ScholarGoogle Scholar
  288. IBM. [1994]. The PowerPC Architecture , Morgan Kaufmann, San Francisco.Google ScholarGoogle Scholar
  289. IBM. [2005]. "Blue Gene," IBM J. Research and Development , 49:2/3 (special issue).Google ScholarGoogle ScholarDigital LibraryDigital Library
  290. IEEE. [1985]. "IEEE standard for binary floating-point arithmetic," SIGPLAN Notices 22:2, 9-25.Google ScholarGoogle Scholar
  291. IEEE. [2005]. "Intel virtualization technology, computer," IEEE Computer Society 38:5 (May), 48-56. Google ScholarGoogle Scholar
  292. IEEE. 754-2008 Working Group. [2006]. "DRAFT Standard for Floating-Point Arithmetic 754-2008," http://dx.doi.org/10.1109/IEEESTD.2008.4610935.Google ScholarGoogle Scholar
  293. Imprimis Product Specification , 97209 Sabre Disk Drive IPI-2 Interface 1.2 GB , Document No. 64402302, Imprimis, Dallas, Tex.Google ScholarGoogle Scholar
  294. InfiniBand Trade Association. [2001]. InfiniBand Architecture Specifications Release 1.0.a , www.infinibandta.org.Google ScholarGoogle Scholar
  295. Intel. [2001]. "Using MMX Instructions to Convert RGB to YUV Color Conversion," cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Legacy::irtm_AP548_9996& cntType=IDS_ EDITORIAL.Google ScholarGoogle Scholar
  296. Internet Retailer. [2005]. "The Gap launches a new site--after two weeks of downtime," Internet® Retailer , September 28, http://www.internetretailer.com/2005/09/28/thegap-launches-a-new-site-after-two-weeks-of-downtime.Google ScholarGoogle Scholar
  297. Jain, R. [1991]. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design , Measurement , Simulation , and Modeling , Wiley, New York.Google ScholarGoogle Scholar
  298. Jantsch, A., and H. Tenhunen (eds.) [2003]. Networks on Chips , Kluwer Academic Publishers, The Netherlands. Google ScholarGoogle Scholar
  299. Jimenez, D. A., and C. Lin [2002]. "Neural methods for dynamic branch prediction," ACM Trans. on Computer Systems 20:4 (November), 369-397. Google ScholarGoogle ScholarDigital LibraryDigital Library
  300. Johnson, M. [1990]. Superscalar Microprocessor Design , Prentice Hall, Englewood Cliffs, N. J.Google ScholarGoogle Scholar
  301. Jordan, H. F. [1983]. "Performance measurements on HEP--a pipelined MIMD computer," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 207-212. Google ScholarGoogle Scholar
  302. Jordan, K. E. [1987]. "Performance comparison of large-scale scientific processors: Scalar mainframes, mainframes with vector facilities, and supercomputers," Computer 20:3 (March), 10-23. Google ScholarGoogle ScholarDigital LibraryDigital Library
  303. Jouppi, N. P. [1990]. "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 364-373. Google ScholarGoogle Scholar
  304. Jouppi, N. P. [1998]. "Retrospective: Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 71-73. Google ScholarGoogle ScholarDigital LibraryDigital Library
  305. Jouppi, N. P., and D. W. Wall [1989]. "Available instruction-level parallelism for superscalar and superpipelined processors," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 272-282. Google ScholarGoogle Scholar
  306. Jouppi, N. P., and S. J. E. Wilton [1994]. "Trade-offs in two-level on-chip caching," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 34-45. Google ScholarGoogle Scholar
  307. Kaeli, D. R., and P. G. Emma [1991]. "Branch history table prediction of moving target branches due to subroutine returns," Proc. 18th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 27-30, 1991, Toronto, Canada, 34-42. Google ScholarGoogle Scholar
  308. Kahan, J. [1990]. "On the advantage of the 8087's stack," unpublished course notes, Computer Science Division, University of California, Berkeley.Google ScholarGoogle Scholar
  309. Kahan, W. [1968]. "7094-II system support for numerical analysis," SHARE Secretarial Distribution SSD-159, Department of Computer Science, University of Toronto.Google ScholarGoogle Scholar
  310. Kahaner, D. K. [1988]. "Benchmarks for 'real' programs," SIAM News , November.Google ScholarGoogle Scholar
  311. Kahn, R. E. [1972]. "Resource-sharing computer communication networks," Proc. IEEE 60:11 (November), 1397-1407.Google ScholarGoogle ScholarCross RefCross Ref
  312. Kane, G. [1986]. MIPS R2000 RISC Architecture , Prentice Hall, Englewood Cliffs, N. J.Google ScholarGoogle Scholar
  313. Kane, G. [1996]. PA-RISC 2.0 Architecture , Prentice Hall, Upper Saddle River, N. J. Google ScholarGoogle Scholar
  314. Kane, G., and J. Heinrich [1992]. MIPS RISC Architecture , Prentice Hall, Englewood Cliffs, N. J. Google ScholarGoogle Scholar
  315. Katz, R. H., D. A. Patterson, and G. A. Gibson [1989]. "Disk system architectures for high performance computing," Proc. IEEE 77:12 (December), 1842-1858.Google ScholarGoogle ScholarCross RefCross Ref
  316. Keckler, S. W., and W. J. Dally [1992]. "Processor coupling: Integrating compile time and runtime scheduling for parallelism," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 202-213. Google ScholarGoogle Scholar
  317. Keller, R. M. [1975]. "Look-ahead processors," ACM Computing Surveys 7:4 (December), 177-195. Google ScholarGoogle ScholarDigital LibraryDigital Library
  318. Keltcher, C. N., K. J. McGrath, A. Ahmed, and P. Conway [2003]. "The AMD Opteron processor for multiprocessor servers," IEEE Micro 23:2 (March-April), 66-76 (dx.doi.org/10.1109. MM.2003.119116). Google ScholarGoogle ScholarDigital LibraryDigital Library
  319. Kembel, R. [2000]. "Fibre Channel: A comprehensive introduction," Internet Week , April. Google ScholarGoogle Scholar
  320. Kermani, P., and L. Kleinrock [1979]. "Virtual Cut-Through: A New Computer Communication Switching Technique," Computer Networks 3 (January), 267-286.Google ScholarGoogle Scholar
  321. Kessler, R. [1999]. "The Alpha 21264 microprocessor," IEEE Micro 19:2 (March/April) 24-36. Google ScholarGoogle ScholarDigital LibraryDigital Library
  322. Kilburn, T., D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner [1962]. "One-level storage system," IRE Trans. on Electronic Computers EC-11 (April) 223-235. AlsoGoogle ScholarGoogle ScholarCross RefCross Ref
  323. appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 135-148. Google ScholarGoogle Scholar
  324. Killian, E. [1991]. "MIPS R4000 technical overview-64 bits/100 MHz or bust," Hot Chips III Symposium Record , August 26-27, 1991, Stanford University, Palo Alto, Calif., 1.6-1.19.Google ScholarGoogle Scholar
  325. Kim, M. Y. [1986]. "Synchronized disk interleaving," IEEE Trans. on Computers C-35:11 (November), 978-988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  326. Kissell, K. D. [1997]. "MIPS16: High-density for the embedded market," Proc. Real Time Systems '97 , June 15, 1997, Las Vegas, Nev. (see www.sgi.com/MIPS/arch/MIPS16/MIPS16.whitepaper.pdf).Google ScholarGoogle Scholar
  327. Kitagawa, K., S. Tagaya, Y. Hagihara, and Y. Kanoh [2003]. "A hardware overview of SX- 6 and SX-7 supercomputer," NEC Research & Development J. 44:1 (January), 2-7.Google ScholarGoogle Scholar
  328. Knuth, D. [1981]. The Art of Computer Programming , Vol. II, 2nd ed., Addison-Wesley, Reading, Mass.Google ScholarGoogle Scholar
  329. Kogge, P. M. [1981]. The Architecture of Pipelined Computers , McGraw-Hill, New York.Google ScholarGoogle Scholar
  330. Kohn, L., and S.-W. Fu [1989]. "A 1,000,000 transistor microprocessor," Proc. of IEEE Int'l. Symposium on Solid State Circuits (ISSCC) , February 15-17, 1989, New York, 54-55.Google ScholarGoogle Scholar
  331. Kohn, L., and N. Margulis [1989]. "Introducing the Intel i860 64-Bit Microprocessor," IEEE Micro , 9:4 (July), 15-30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  332. Kontothanassis, L., G. Hunt, R. Stets, N. Hardavellas, M. Cierniak, S. Parthasarathy, W. Meira, S. Dwarkadas, and M. Scott [1997]. "VM-based shared memory on lowlatency, remote-memory-access networks," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google ScholarGoogle Scholar
  333. Koren, I. [1989]. Computer Arithmetic Algorithms , Prentice Hall, Englewood Cliffs, N. J. Kozyrakis, C. [2000]. "Vector IRAM: A media-oriented vector processor with embedded DRAM," paper presented at Hot Chips 12, August 13-15, 2000, Palo Alto, Calif, 13-15. Google ScholarGoogle Scholar
  334. Kozyrakis, C., and D. Patterson, [2002]. "Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks," Proc. 35th Annual Int'l. Symposium on Microarchitecture (MICRO-35) , November 18-22, 2002, Istanbul, Turkey. Google ScholarGoogle Scholar
  335. Kroft, D. [1981]. "Lockup-free instruction fetch/prefetch cache organization," Proc. Eighth Annual Int'l. Symposium on Computer Architecture (ISCA) , May 12-14, 1981, Minneapolis, Minn., 81-87. Google ScholarGoogle Scholar
  336. Kroft, D. [1998]. "Retrospective: Lockup-free instruction fetch/prefetch cache organization," 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 20-21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  337. Kuck, D., P. P. Budnik, S.-C. Chen, D. H. Lawrie, R. A. Towle, R. E. Strebendt, E. W. Davis, Jr., J. Han, P. W. Kraska, and Y. Muraoka [1974]. "Measurements of parallelism in ordinary FORTRAN programs," Computer 7:1 (January), 37-46.Google ScholarGoogle ScholarCross RefCross Ref
  338. Kuhn, D. R. [1997]. "Sources of failure in the public switched telephone network," IEEE Computer 30:4 (April), 31-36. Google ScholarGoogle ScholarDigital LibraryDigital Library
  339. Kumar, A. [1997]. "The HP PA-8000 RISC CPU," IEEE Micro 17:2 (March/April), 27-32. Google ScholarGoogle ScholarDigital LibraryDigital Library
  340. Kunimatsu, A., N. Ide, T. Sato, Y. Endo, H. Murakami, T. Kamei, M. Hirano, F. Ishihara, H. Tago, M. Oka, A. Ohba, T. Yutaka, T. Okada, and M. Suzuoki [2000]. "Vector unit architecture for emotion synthesis," IEEE Micro 20:2 (March-April), 40-47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  341. Kunkel, S. R., and J. E. Smith [1986]. "Optimal pipelining in supercomputers," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo, 404-414. Google ScholarGoogle Scholar
  342. Kurose, J. F., and K. W. Ross [2001]. Computer Networking: A Top-Down Approach Featuring the Internet , Addison-Wesley, Boston. Google ScholarGoogle Scholar
  343. Kuskin, J., D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, and J. L. Hennessy [1994]. "The Stanford FLASH multiprocessor," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago. Google ScholarGoogle ScholarDigital LibraryDigital Library
  344. Lam, M. [1988]. "Software pipelining: An effective scheduling technique for VLIW processors," SIGPLAN Conf. on Programming Language Design and Implementation , June 22-24, 1988, Atlanta, Ga., 318-328. Google ScholarGoogle Scholar
  345. Lam, M. S., and R. P. Wilson [1992]. "Limits of control flow on parallelism," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 46-57. Google ScholarGoogle Scholar
  346. Lam, M. S., E. E. Rothberg, and M. E. Wolf [1991]. "The cache performance and optimizations of blocked algorithms," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Santa Clara, Calif. ( SIGPLAN Notices 26:4 (April), 63-74). Google ScholarGoogle Scholar
  347. Lambright, D. [2000]. "Experiences in measuring the reliability of a cache-based storage system," Proc. of First Workshop on Industrial Experiences with Systems Software (WIESS 2000), Co-Located with the 4th Symposium on Operating Systems Design and Implementation (OSDI) , October 22, 2000, San Diego, Calif. Google ScholarGoogle Scholar
  348. Lamport, L. [1979]. "How to make a multiprocessor computer that correctly executes multiprocess programs," IEEE Trans. on Computers C-28:9 (September), 241-248. Google ScholarGoogle Scholar
  349. Lang, W., J. M. Patel, and S. Shankar [2010]. "Wimpy node clusters: What about non-wimpy workloads?" Proc. Sixth International Workshop on Data Management on New Hardware (DaMoN) , June 7, Indianapolis, Ind. Google ScholarGoogle Scholar
  350. Laprie, J.-C. [1985]. "Dependable computing and fault tolerance: Concepts and terminology," Proc. 15th Annual Int'l. Symposium on Fault-Tolerant Computing , June 19-21, 1985, Ann Arbor, Mich., 2-11.Google ScholarGoogle Scholar
  351. Larson, E. R. [1973]. "Findings of fact, conclusions of law, and order for judgment," File No. 4-67, Civ. 138, Honeywell v. Sperry-Rand and Illinois Scientific Development , U. S. District Court for the State of Minnesota, Fourth Division (October 19).Google ScholarGoogle Scholar
  352. Laudon, J., and D. Lenoski [1997]. "The SGI Origin: A ccNUMA highly scalable server," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo., 241-251. Google ScholarGoogle Scholar
  353. Laudon, J., A. Gupta, and M. Horowitz [1994]. "Interleaving: A multithreading technique targeting multiprocessors and workstations," Proc. Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 4-7, San Jose, Calif., 308-318. Google ScholarGoogle Scholar
  354. Lauterbach, G., and T. Horel [1999]. "UltraSPARC-III: Designing third generation 64-bit performance," IEEE Micro 19:3 (May/June). Google ScholarGoogle Scholar
  355. Lazowska, E. D., J. Zahorjan, G. S. Graham, and K. C. Sevcik [1984]. Quantitative System Performance: Computer System Analysis Using Queueing Network Models , Prentice Hall, Englewood Cliffs, N. J. (Although out of print, it is available online at www.cs.washington.edu/homes/lazowska/qsp/.) Google ScholarGoogle Scholar
  356. Lebeck, A. R., and D. A. Wood [1994]. "Cache profiling and the SPEC benchmarks: A case study," Computer 27:10 (October), 15-26. Google ScholarGoogle ScholarDigital LibraryDigital Library
  357. Lee, R. [1989]. "Precision architecture," Computer 22:1 (January), 78-91. Google ScholarGoogle ScholarDigital LibraryDigital Library
  358. Lee, W. V. et al. [2010]. "Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU," Proc. 37th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 19-23, 2010, Saint-Malo, France. Google ScholarGoogle Scholar
  359. Leighton, F. T. [1992]. Introduction to Parallel Algorithms and Architectures: Arrays , Trees , Hypercubes , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  360. Leiner, A. L. [1954]. "System specifications for the DYSEAC," J. ACM 1:2 (April), 57-81. Google ScholarGoogle ScholarDigital LibraryDigital Library
  361. Leiner, A. L., and S. N. Alexander [1954]. "System organization of the DYSEAC," IRE Trans. of Electronic Computers EC-3:1 (March), 1-10.Google ScholarGoogle Scholar
  362. Leiserson, C. E. [1985]. "Fat trees: Universal networks for hardware-efficient supercomputing," IEEE Trans. on Computers C-34:10 (October), 892-901. Google ScholarGoogle ScholarCross RefCross Ref
  363. Lenoski, D., J. Laudon, K. Gharachorloo, A. Gupta, and J. L. Hennessy [1990]. "The Stanford DASH multiprocessor," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 148-159.Google ScholarGoogle ScholarCross RefCross Ref
  364. Lenoski, D., J. Laudon, K. Gharachorloo, W.-D. Weber, A. Gupta, J. L. Hennessy, M. A. Horowitz, and M. Lam [1992]. "The Stanford DASH multiprocessor," IEEE Computer 25:3 (March), 63-79. Google ScholarGoogle ScholarDigital LibraryDigital Library
  365. Levy, H., and R. Eckhouse [1989]. Computer Programming and Architecture: The VAX , Digital Press, Boston. Google ScholarGoogle Scholar
  366. Li, K. [1988]. "IVY: A shared virtual memory system for parallel computing," Proc. 1988 Int'l. Conf. on Parallel Processing , Pennsylvania State University Press, University Park, Penn.Google ScholarGoogle Scholar
  367. Li, S., K. Chen, J. B. Brockman, and N. Jouppi [2011]. "Performance Impacts of Nonblocking Caches in Out-of-order Processors," HP Labs Tech Report HPL-2011-65 (full text available at http://Library.hp.com/techpubs/2011/Hpl-2011-65.html).Google ScholarGoogle Scholar
  368. Lim, K., P. Ranganathan, J. Chang, C. Patel, T. Mudge, and S. Reinhardt [2008]. "Understanding and designing new system architectures for emerging warehouse-computing environments," Proc. 35th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 21-25, 2008, Beijing, China. Google ScholarGoogle Scholar
  369. Lincoln, N. R. [1982]. "Technology and design trade offs in the creation of a modern supercomputer," IEEE Trans. on Computers C-31:5 (May), 363-376. Google ScholarGoogle Scholar
  370. Lindholm, T., and F. Yellin [1999]. The Java Virtual Machine Specification , 2nd ed., Addison-Wesley, Reading, Mass. (also available online at java.sun.com/docs/books/vmspec/). Google ScholarGoogle Scholar
  371. Lipasti, M. H., and J. P. Shen [1996]. "Exceeding the dataflow limit via value prediction," Proc. 29th Int'l. Symposium on Microarchitecture , December 2-4, 1996, Paris, France. Google ScholarGoogle Scholar
  372. Lipasti, M. H., C. B. Wilkerson, and J. P. Shen [1996]. "Value locality and load value prediction," Proc. Seventh Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass., 138-147. Google ScholarGoogle ScholarDigital LibraryDigital Library
  373. Liptay, J. S. [1968]. "Structural aspects of the System/360 Model 85, Part II: The cache," IBM Systems J. 7:1, 15-21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  374. Lo, J., L. Barroso, S. Eggers, K. Gharachorloo, H. Levy, and S. Parekh [1998]. "An analysis of database workload performance on simultaneous multithreaded processors," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 39-50. Google ScholarGoogle Scholar
  375. Lo, J., S. Eggers, J. Emer, H. Levy, R. Stamm, and D. Tullsen [1997]. "Converting threadlevel parallelism into instruction-level parallelism via simultaneous multithreading," ACM Trans. on Computer Systems 15:2 (August), 322-354. Google ScholarGoogle ScholarDigital LibraryDigital Library
  376. Lovett, T., and S. Thakkar [1988]. "The Symmetry multiprocessor system," Proc. 1988 Int'l. Conf. of Parallel Processing , University Park, Penn., 303-310.Google ScholarGoogle Scholar
  377. Lubeck, O., J. Moore, and R. Mendez [1985]. "A benchmark comparison of three supercomputers: Fujitsu VP-200, Hitachi S810/20, and Cray X-MP/2," Computer 18:12 (December), 10-24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  378. Luk, C.-K., and T. C Mowry [1999]. "Automatic compiler-inserted prefetching for pointer-based applications," IEEE Trans. on Computers 48:2 (February), 134-141. Google ScholarGoogle Scholar
  379. Lunde, A. [1977]. "Empirical evaluation of some features of instruction set processor architecture," Communications of the ACM 20:3 (March), 143-152. Google ScholarGoogle ScholarDigital LibraryDigital Library
  380. Luszczek, P., J. J. Dongarra, D. Koester, R. Rabenseifner, B. Lucas, J. Kepner, J. McCalpin, D. Bailey, and D. Takahashi [2005]. "Introduction to the HPC challenge benchmark suite," Lawrence Berkeley National Laboratory, Paper LBNL-57493 (April 25), repositories.cdlib.org/lbnl/LBNL-57493.Google ScholarGoogle Scholar
  381. Maberly, N. C. [1966]. Mastering Speed Reading , New American Library, New York.Google ScholarGoogle Scholar
  382. Magenheimer, D. J., L. Peters, K. W. Pettis, and D. Zuras [1988]. "Integer multiplication and division on the HP precision architecture," IEEE Trans. on Computers 37:8, 980-990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  383. Mahlke, S. A., W. Y. Chen, W.-M. Hwu, B. R. Rau, and M. S. Schlansker [1992]. "Sentinel scheduling for VLIW and superscalar processors," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, 238-247. Google ScholarGoogle Scholar
  384. Mahlke, S. A., R. E. Hank, J. E. McCormick, D. I. August, and W. W. Hwu [1995]. "A comparison of full and partial predicated execution support for ILP processors," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy, 138-149. Google ScholarGoogle Scholar
  385. Major, J. B. [1989]. "Are queuing models within the grasp of the unwashed?," Proc. Int'l. Conf. on Management and Performance Evaluation of Computer Systems , December 11-15, 1989, Reno, Nev., 831-839.Google ScholarGoogle Scholar
  386. Markstein, P. W. [1990]. "Computation of elementary functions on the IBM RISC System/6000 processor," IBM J. Research and Development 34:1, 111-119. Google ScholarGoogle ScholarDigital LibraryDigital Library
  387. Mathis, H. M., A. E. Mercias, J. D. McCalpin, R. J. Eickemeyer, and S. R. Kunkel [2005]. "Characterization of the multithreading (SMT) efficiency in Power5," IBM J. Research and Development , 49:4/5 (July/September), 555-564. Google ScholarGoogle ScholarCross RefCross Ref
  388. McCalpin, J. [2005]. "STREAM: Sustainable Memory Bandwidth in High Performance Computers," www.cs.virginia.edu/stream/.Google ScholarGoogle Scholar
  389. McCalpin, J., D. Bailey, and D. Takahashi [2005]. Introduction to the HPC Challenge Benchmark Suite , Paper LBNL-57493 Lawrence Berkeley National Laboratory, University of California, Berkeley, repositories.cdlib.org/lbnl/LBNL-57493.Google ScholarGoogle Scholar
  390. McCormick, J., and A. Knies [2002]. "A brief analysis of the SPEC CPU2000 benchmarks on the Intel Itanium 2 processor," paper presented at Hot Chips 14, August 18-20, 2002, Stanford University, Palo Alto, Calif.Google ScholarGoogle Scholar
  391. McFarling, S. [1989]. "Program optimization for instruction caches," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 183-191. Google ScholarGoogle Scholar
  392. McFarling, S. [1993]. Combining Branch Predictors , WRL Technical Note TN-36, Digital Western Research Laboratory, Palo Alto, Calif.Google ScholarGoogle Scholar
  393. McFarling, S., and J. Hennessy [1986]. "Reducing the cost of branches," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo, 396-403. Google ScholarGoogle Scholar
  394. McGhan, H., and M. O'Connor [1998]. "PicoJava: A direct execution engine for Java bytecode," Computer 31:10 (October), 22-30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  395. McKeeman, W. M. [1967]. "Language directed computer design," Proc. AFIPS Fall Joint Computer Conf. , November 14-16, 1967, Washington, D.C., 413-417. Google ScholarGoogle Scholar
  396. McMahon, F. M. [1986]. " The Livermore FORTRAN Kernels: A Computer Test of Numerical Performance Range ," Tech. Rep. UCRL-55745, Lawrence Livermore National Laboratory, University of California, Livermore.Google ScholarGoogle Scholar
  397. McNairy, C., and D. Soltis [2003]. "Itanium 2 processor microarchitecture," IEEE Micro 23:2 (March-April), 44-55. Google ScholarGoogle ScholarDigital LibraryDigital Library
  398. Mead, C., and L. Conway [1980]. Introduction to VLSI Systems , Addison-Wesley, Reading, Mass. Google ScholarGoogle Scholar
  399. Mellor-Crummey, J. M., and M. L. Scott [1991]. "Algorithms for scalable synchronization on shared-memory multiprocessors," ACM Trans. on Computer Systems 9:1 (February), 21-65. Google ScholarGoogle ScholarDigital LibraryDigital Library
  400. Menabrea, L. F. [1842]. "Sketch of the analytical engine invented by Charles Babbage," Bibliothèque Universelle de Genève , 82 (October).Google ScholarGoogle Scholar
  401. Menon, A., J. Renato Santos, Y. Turner, G. Janakiraman, and W. Zwaenepoel [2005]. "Diagnosing performance overheads in the xen virtual machine environment," Proc. First ACM/USENIX Int'l. Conf. on Virtual Execution Environments , June 11-12, 2005, Chicago, 13-23. Google ScholarGoogle Scholar
  402. Merlin, P. M., and P. J. Schweitzer [1980]. "Deadlock avoidance in store-and-forward networks. Part I. Store-and-forward deadlock," IEEE Trans. on Communications COM-28:3 (March), 345-354.Google ScholarGoogle ScholarCross RefCross Ref
  403. Metcalfe, R. M. [1993]. "Computer/network interface design: Lessons from Arpanet and Ethernet," IEEE J. on Selected Areas in Communications 11:2 (February), 173-180. Google ScholarGoogle ScholarDigital LibraryDigital Library
  404. Metcalfe, R. M., and D. R. Boggs [1976]. "Ethernet: Distributed packet switching for local computer networks," Communications of the ACM 19:7 (July), 395-404. Google ScholarGoogle ScholarDigital LibraryDigital Library
  405. Metropolis, N., J. Howlett, and G. C. Rota (eds.) [1980]. A History of Computing in the Twentieth Century , Academic Press, New York. Google ScholarGoogle Scholar
  406. Meyer, R. A., and L. H. Seawright [1970]. A virtual machine time sharing system, IBM Systems J. 9:3, 199-218. Google ScholarGoogle ScholarDigital LibraryDigital Library
  407. Meyers, G. J. [1978]. "The evaluation of expressions in a storage-to-storage architecture," Computer Architecture News 7:3 (October), 20-23. Google ScholarGoogle Scholar
  408. Meyers, G. J. [1982]. Advances in Computer Architecture , 2nd ed., Wiley, New York. Micron. [2004]. "Calculating Memory System Power for DDR2," http://download. micron.com/pdf/pubs/designline/dl1Q04.pdf. Google ScholarGoogle Scholar
  409. Micron. [2006]. "The Micron® System-Power Calculator," http://www.micron.com/systemcalc.Google ScholarGoogle Scholar
  410. MIPS. [1997]. "MIPS16 Application Specific Extension Product Description," www.sgi.com/MIPS/arch/MIPS16/mips16.pdf.Google ScholarGoogle Scholar
  411. Miranker, G. S., J. Rubenstein, and J. Sanguinetti [1988]. "Squeezing a Cray-class supercomputer into a single-user package," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 452-456.Google ScholarGoogle Scholar
  412. Mitchell, D. [1989]. "The Transputer: The time is now," Computer Design (RISC suppl.), 40-41.Google ScholarGoogle Scholar
  413. Mitsubishi. [1996]. Mitsubishi 32-Bit Single Chip Microcomputer M32R Family Software Manual , Mitsubishi, Cypress, Calif.Google ScholarGoogle Scholar
  414. Miura, K., and K. Uchida [1983]. "FACOM vector processing system: VP100/200," Proc. NATO Advanced Research Workshop on High-Speed Computing , June 20-22, 1983, Jülich, West Germany. Also appears in K. Hwang, ed., "Superprocessors: Design and applications," IEEE (August 1984), 59-73.Google ScholarGoogle Scholar
  415. Miya, E. N. [1985]. "Multiprocessor/distributed processing bibliography," Computer Architecture News 13:1, 27-29. Google ScholarGoogle ScholarDigital LibraryDigital Library
  416. Montoye, R. K., E. Hokenek, and S. L. Runyon [1990]. "Design of the IBM RISC System/6000 floating-point execution," IBM J. Research and Development 34:1, 59-70. Google ScholarGoogle ScholarDigital LibraryDigital Library
  417. Moore, B., A. Padegs, R. Smith, and W. Bucholz [1987]. "Concepts of the System/370 vector architecture," 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 282-292. Google ScholarGoogle Scholar
  418. Moore, G. E. [1965]. "Cramming more components onto integrated circuits," Electronics , 38:8 (April 19), 114-117.Google ScholarGoogle Scholar
  419. Morse, S., B. Ravenal, S. Mazor, and W. Pohlman [1980]. "Intel microprocessors--8080 to 8086," Computer 13:10 (October). Google ScholarGoogle Scholar
  420. Moshovos, A., and G. S. Sohi [1997]. "Streamlining inter-operation memory communication via data dependence prediction," Proc. 30th Annual Int'l. Symposium on Microarchitecture , December 1-3, Research Triangle Park, N.C., 235-245. Google ScholarGoogle Scholar
  421. Moshovos, A., S. Breach, T. N. Vijaykumar, and G. S. Sohi [1997]. "Dynamic speculation and synchronization of data dependences," 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google ScholarGoogle Scholar
  422. Moussouris, J., L. Crudele, D. Freitas, C. Hansen, E. Hudson, S. Przybylski, T. Riordan, and C. Rowen [1986]. "A CMOS RISC processor with integrated system functions," Proc. IEEE COMPCON , March 3-6, 1986, San Francisco, 191.Google ScholarGoogle Scholar
  423. Mowry, T. C., S. Lam, and A. Gupta [1992]. "Design and evaluation of a compiler algorithm for prefetching," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston ( SIGPLAN Notices 27:9 (September), 62-73). Google ScholarGoogle Scholar
  424. MSN Money. [2005]. "Amazon Shares Tumble after Rally Fizzles," http://moneycentral .msn.com/content/CNBCTV/Articles/Dispatches/P133695.asp.Google ScholarGoogle Scholar
  425. Muchnick, S. S. [1988]. "Optimizing compilers for SPARC," Sun Technology 1:3 (Summer), 64-77.Google ScholarGoogle Scholar
  426. Mueller, M., L. C. Alves, W. Fischer, M. L. Fair, and I. Modi [1999]. "RAS strategy for IBM S/390 G5 and G6," IBM J. Research and Development 43:5-6 (September-November), 875-888. Google ScholarGoogle ScholarDigital LibraryDigital Library
  427. Mukherjee, S. S., C. Weaver, J. S. Emer, S. K. Reinhardt, and T. M. Austin [2003]. "Measuring architectural vulnerability factors," IEEE Micro 23:6, 70-75. Google ScholarGoogle ScholarDigital LibraryDigital Library
  428. Murphy, B., and T. Gent [1995]. "Measuring system and software reliability using an automated data collection process," Quality and Reliability Engineering International 11:5 (September-October), 341-353.Google ScholarGoogle ScholarCross RefCross Ref
  429. Myer, T. H., and I. E. Sutherland [1968]. "On the design of display processors," Communications of the ACM 11:6 (June), 410-414. Google ScholarGoogle ScholarDigital LibraryDigital Library
  430. Narayanan, D., E. Thereska, A. Donnelly, S. Elnikety, and A. Rowstron [2009]. "Migrating server storage to SSDs: Analysis of trade-offs," Proc. 4th ACM European Conf. on Computer Systems , April 1-3, 2009, Nuremberg, Germany. Google ScholarGoogle Scholar
  431. National Research Council. [1997]. The Evolution of Untethered Communications , Computer Science and Telecommunications Board, National Academy Press, Washington, D.C. Google ScholarGoogle Scholar
  432. National Storage Industry Consortium. [1998]. "Tape Roadmap," www.nsic.org.Google ScholarGoogle Scholar
  433. Nelson, V. P. [1990]. "Fault-tolerant computing: Fundamental concepts," Computer 23:7 (July), 19-25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  434. Ngai, T.-F., and M. J. Irwin [1985]. "Regular, area-time efficient carry-lookahead adders," Proc. Seventh IEEE Symposium on Computer Arithmetic , June 4-6, 1985, University of Illinois, Urbana, 9-15.Google ScholarGoogle Scholar
  435. Nicolau, A., and J. A. Fisher [1984]. "Measuring the parallelism available for very long instruction word architectures," IEEE Trans. on Computers C-33:11 (November), 968-976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  436. Nikhil, R. S., G. M. Papadopoulos, and Arvind [1992]. "*T: A multithreaded massively parallel architecture," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 156-167. Google ScholarGoogle Scholar
  437. Noordergraaf, L., and R. van der Pas [1999]. "Performance experiences on Sun's WildFire prototype," Proc. ACM/IEEE Conf. on Supercomputing , November 13-19, 1999, Portland, Ore. Google ScholarGoogle Scholar
  438. Nyberg, C. R., T. Barclay, Z. Cvetanovic, J. Gray, and D. Lomet [1994]. "AlphaSort: A RISC machine sort," Proc. ACM SIGMOD , May 24-27, 1994, Minneapolis, Minn. Google ScholarGoogle Scholar
  439. Oka, M., and M. Suzuoki [1999]. "Designing and programming the emotion engine," IEEE Micro 19:6 (November-December), 20-28. Google ScholarGoogle ScholarDigital LibraryDigital Library
  440. Okada, S., S. Okada, Y. Matsuda, T. Yamada, and A. Kobayashi [1999]. "System on a chip for digital still camera," IEEE Trans. on Consumer Electronics 45:3 (August), 584-590. Google ScholarGoogle ScholarDigital LibraryDigital Library
  441. Oliker, L., A. Canning, J. Carter, J. Shalf, and S. Ethier [2004]. "Scientific computations on modern parallel vector systems," Proc. ACM/IEEE Conf. on Supercomputing , November 6-12, 2004, Pittsburgh, Penn., 10. Google ScholarGoogle Scholar
  442. Pabst, T. [2000]. "Performance Showdown at 133 MHz FSB--The Best Platform for Coppermine," www6.tomshardware.com/mainboard/00q1/000302/.Google ScholarGoogle Scholar
  443. Padua, D., and M. Wolfe [1986]. "Advanced compiler optimizations for supercomputers," Communications of the ACM 29:12 (December), 1184-1201. Google ScholarGoogle ScholarDigital LibraryDigital Library
  444. Palacharla, S., and R. E. Kessler [1994]. "Evaluating stream buffers as a secondary cache replacement," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 24-33. Google ScholarGoogle Scholar
  445. Palmer, J., and S. Morse [1984]. The 8087 Primer , John Wiley & Sons, New York, 93.Google ScholarGoogle Scholar
  446. Pan, S.-T., K. So, and J. T. Rameh [1992]. "Improving the accuracy of dynamic branch prediction using branch correlation," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, 76-84. Google ScholarGoogle Scholar
  447. Partridge, C. [1994]. Gigabit Networking , Addison-Wesley, Reading, Mass. Google ScholarGoogle Scholar
  448. Patterson, D. [1985]. "Reduced instruction set computers," Communications of the ACM 28:1 (January), 8-21. Google ScholarGoogle ScholarDigital LibraryDigital Library
  449. Patterson, D. [2004]. "Latency lags bandwidth," Communications of the ACM 47:10 (October), 71-75. Google ScholarGoogle ScholarDigital LibraryDigital Library
  450. Patterson, D. A., and D. R. Ditzel [1980]. "The case for the reduced instruction set computer," Computer Architecture News 8:6 (October), 25-33. Google ScholarGoogle ScholarDigital LibraryDigital Library
  451. Patterson, D. A., and J. L. Hennessy [2004]. Computer Organization and Design: The Hardware/Software Interface , 3rd ed., Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  452. Patterson, D. A., G. A. Gibson, and R. H. Katz [1987]. A Case for Redundant Arrays of Inexpensive Disks (RAID) , Tech. Rep. UCB/CSD 87/391, University of California, Berkeley. Also appeared in Proc. ACM SIGMOD , June 1-3, 1988, Chicago, 109-116. Google ScholarGoogle ScholarDigital LibraryDigital Library
  453. Patterson, D. A., P. Garrison, M. Hill, D. Lioupis, C. Nyberg, T. Sippel, and K. Van Dyke [1983]. "Architecture of a VLSI instruction cache for a RISC," 10th Annual Int'l. Conf. on Computer Architecture Conf. Proc. , June 13-16, 1983, Stockholm, Sweden, 108-116. Google ScholarGoogle Scholar
  454. Pavan, P., R. Bez, P. Olivo, and E. Zanoni [1997]. "Flash memory cells--an overview." Proc. IEEE 85:8 (August), 1248-1271.Google ScholarGoogle ScholarCross RefCross Ref
  455. Peh, L. S., and W. J. Dally [2001]. "A delay model and speculative architecture for pipelined routers," Proc. 7th Int'l. Symposium on High-Performance Computer Architecture , January 22-24, 2001, Monterrey, Mexico. Google ScholarGoogle Scholar
  456. Peng, V., S. Samudrala, and M. Gavrielov [1987]. "On the implementation of shifters, multipliers, and dividers in VLSI floating point units," Proc. 8th IEEE Symposium on Computer Arithmetic , May 19-21, 1987, Como, Italy, 95-102.Google ScholarGoogle Scholar
  457. Pfister, G. F. [1998]. In Search of Clusters , 2nd ed., Prentice Hall, Upper Saddle River, N. J. Google ScholarGoogle ScholarDigital LibraryDigital Library
  458. Pfister, G. F., W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfekder, K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss [1985]. "The IBM research parallel processor prototype (RP3): Introduction and architecture," Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-19, 1985, Boston, Mass., 764-771.Google ScholarGoogle Scholar
  459. Pinheiro, E., W. D. Weber, and L. A. Barroso [2007]. "Failure trends in a large disk drive population," Proc. 5th USENIX Conference on File and Storage Technologies (FAST '07) , February 13-16, 2007, San Jose, Calif. Google ScholarGoogle ScholarDigital LibraryDigital Library
  460. Pinkston, T. M. [2004]. "Deadlock characterization and resolution in interconnection networks," in M. C. Zhu and M. P. Fanti, eds., Deadlock Resolution in Computer-Integrated Systems , CRC Press, Boca Raton, FL, 445-492.Google ScholarGoogle Scholar
  461. Pinkston, T. M., and J. Shin [2005]. "Trends toward on-chip networked microsystems," Int'l. J. of High Performance Computing and Networking 3:1, 3-18. Google ScholarGoogle ScholarDigital LibraryDigital Library
  462. Pinkston, T. M., and S. Warnakulasuriya [1997]. "On deadlocks in interconnection networks," 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google ScholarGoogle Scholar
  463. Pinkston, T. M., A. Benner, M. Krause, I. Robinson, and T. Sterling [2003]. "InfiniBand: The 'de facto' future standard for system and local area networks or just a scalable replacement for PCI buses?" Cluster Computing (special issue on communication architecture for clusters) 6:2 (April), 95-104. Google ScholarGoogle Scholar
  464. Postiff, M. A., D. A. Greene, G. S. Tyson, and T. N. Mudge [1999]. "The limits of instruction level parallelism in SPEC95 applications," Computer Architecture News 27:1 (March), 31-40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  465. Przybylski, S. A. [1990]. Cache Design: A Performance-Directed Approach , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  466. Przybylski, S. A., M. Horowitz, and J. L. Hennessy [1988]. "Performance trade-offs in cache design," 15th Annual Int'l. Symposium on Computer Architecture , May 30-June 2, 1988, Honolulu, Hawaii, 290-298. Google ScholarGoogle Scholar
  467. Puente, V., R. Beivide, J. A. Gregorio, J. M. Prellezo, J. Duato, and C. Izu [1999]. "Adaptive bubble router: A design to improve performance in torus networks," Proc. 28th Int'l. Conference on Parallel Processing , September 21-24, 1999, Aizu-Wakamatsu, Fukushima, Japan. Google ScholarGoogle Scholar
  468. Radin, G. [1982]. "The 801 minicomputer," Proc. Symposium Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 39-47. Google ScholarGoogle Scholar
  469. Rajesh Bordawekar, Uday Bondhugula, Ravi Rao: Believe it or not!: mult-core CPUs can match GPU performance for a FLOP-intensive application! 19th International Conference on Parallel Architecture and Compilation Techniques (PACT 2010), Vienna, Austria, September 11-15, 2010: 537-538. Google ScholarGoogle Scholar
  470. Ramamoorthy, C. V., and H. F. Li [1977]. "Pipeline architecture," ACM Computing Surveys 9:1 (March), 61-102. Google ScholarGoogle ScholarDigital LibraryDigital Library
  471. Ranganathan, P., P. Leech, D. Irwin, and J. Chase [2006]. "Ensemble-Level Power Management for Dense Blade Servers," Proc. 33rd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-21, 2006, Boston, Mass., 66-77. Google ScholarGoogle Scholar
  472. Rau, B. R. [1994]. "Iterative modulo scheduling: An algorithm for software pipelining loops," Proc. 27th Annual Int'l. Symposium on Microarchitecture , November 30-December 2, 1994, San Jose, Calif., 63-74. Google ScholarGoogle Scholar
  473. Rau, B. R., C. D. Glaeser, and R. L. Picard [1982]. "Efficient code generation for horizontal architectures: Compiler techniques and architectural support," Proc. Ninth Annual Int'l. Symposium on Computer Architecture (ISCA) , April 26-29, 1982, Austin, Tex., 131-139. Google ScholarGoogle Scholar
  474. Rau, B. R., D. W. L. Yen, W. Yen, and R. A. Towle [1989]. "The Cydra 5 departmental supercomputer: Design philosophies, decisions, and trade-offs," IEEE Computers 22:1 (January), 12-34. Google ScholarGoogle ScholarDigital LibraryDigital Library
  475. Reddi, V. J., B. C. Lee, T. Chilimbi, and K. Vaid [2010]. "Web search using mobile cores: Quantifying and mitigating the price of efficiency," Proc. 37th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 19-23, 2010, Saint-Malo, France. Google ScholarGoogle Scholar
  476. Redmond, K. C., and T. M. Smith [1980]. Project Whirlwind--The History of a Pioneer Computer , Digital Press, Boston. Google ScholarGoogle Scholar
  477. Reinhardt, S. K., J. R. Larus, and D. A. Wood [1994]. "Tempest and Typhoon: User-level shared memory," 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 325-336. Google ScholarGoogle Scholar
  478. Reinman, G., and N. P. Jouppi. [1999]. "Extensions to CACTI," research.compaq.com/wrl/people/jouppi/CACTI.html.Google ScholarGoogle Scholar
  479. Rettberg, R. D., W. R. Crowther, P. P. Carvey, and R. S. Towlinson [1990]. "The Monarch parallel processor hardware design," IEEE Computer 23:4 (April), 18-30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  480. Riemens, A., K. A. Vissers, R. J. Schutten, F. W. Sijstermans, G. J. Hekstra, and G. D. La Hei [1999]. "Trimedia CPU64 application domain and benchmark suite," Proc. IEEE Int'l. Conf. on Computer Design: VLSI in Computers and Processors (ICCD'99) , October 10-13, 1999, Austin, Tex., 580-585. Google ScholarGoogle Scholar
  481. Riseman, E. M., and C. C. Foster [1972]. "Percolation of code to enhance paralled dispatching and execution," IEEE Trans. on Computers C-21:12 (December), 1411-1415. Google ScholarGoogle Scholar
  482. Robin, J., and C. Irvine [2000]. "Analysis of the Intel Pentium's ability to support a secure virtual machine monitor." Proc. USENIX Security Symposium , August 14-17, 2000, Denver, Colo. Google ScholarGoogle ScholarCross RefCross Ref
  483. Robinson, B., and L. Blount [1986]. The VM/HPO 3880-23 Performance Results , IBM Tech. Bulletin GG66-0247-00, IBM Washington Systems Center, Gaithersburg, Md.Google ScholarGoogle Scholar
  484. Ropers, A., H. W. Lollman, and J. Wellhausen [1999]. DSPstone: Texas Instruments TMS320C54x , Tech. Rep. IB 315 1999/9-ISS-Version 0.9, Aachen University of Technology, Aaachen, Germany (www.ert.rwth-aachen.de/Projekte/Tools/coal/dspstone_c54x/index.html).Google ScholarGoogle Scholar
  485. Rosenblum, M., S. A. Herrod, E. Witchel, and A. Gupta [1995]. "Complete computer simulation: The SimOS approach," in IEEE Parallel and Distributed Technology (now called Concurrency ) 4:3, 34-43. Google ScholarGoogle Scholar
  486. Rowen, C., M. Johnson, and P. Ries [1988]. "The MIPS R3010 floating-point coprocessor," IEEE Micro 8:3 (June), 53-62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  487. Russell, R. M. [1978]. "The Cray-1 processor system," Communications of the ACM 21:1 (January), 63-72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  488. Rymarczyk, J. [1982]. "Coding guidelines for pipelined processors," Proc. Symposium Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 12-19. Google ScholarGoogle Scholar
  489. Saavedra-Barrera, R. H. [1992]. "CPU Performance Evaluation and Execution Time Prediction Using Narrow Spectrum Benchmarking," Ph. D. dissertation, University of California, Berkeley. Google ScholarGoogle Scholar
  490. Salem, K., and H. Garcia-Molina [1986]. "Disk striping," Proc. 2nd Int'l. IEEE Conf. on Data Engineering , February 5-7, 1986, Washington, D.C., 249-259. Google ScholarGoogle Scholar
  491. Saltzer, J. H., D. P. Reed, and D. D. Clark [1984]. "End-to-end arguments in system design," ACM Trans. on Computer Systems 2:4 (November), 277-288. Google ScholarGoogle ScholarDigital LibraryDigital Library
  492. Samples, A. D., and P. N. Hilfinger [1988]. Code Reorganization for Instruction Caches , Tech. Rep. UCB/CSD 88/447, University of California, Berkeley. Google ScholarGoogle Scholar
  493. Santoro, M. R., G. Bewick, and M. A. Horowitz [1989]. "Rounding algorithms for IEEE multipliers," Proc. Ninth IEEE Symposium on Computer Arithmetic , September 6-8, Santa Monica, Calif., 176-183.Google ScholarGoogle Scholar
  494. Satran, J., D. Smith, K. Meth, C. Sapuntzakis, M. Wakeley, P. Von Stamwitz, R. Haagens, E. Zeidner, L. Dalle Ore, and Y. Klein [2001]. "iSCSI," IPS Working Group of IETF, Internet draft www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-07.txt.Google ScholarGoogle Scholar
  495. Saulsbury, A., T. Wilkinson, J. Carter, and A. Landin [1995]. "An argument for Simple COMA," Proc. First IEEE Symposium on High-Performance Computer Architectures , January 22-25, 1995, Raleigh, N.C., 276-285. Google ScholarGoogle Scholar
  496. Schneck, P. B. [1987]. Superprocessor Architecture , Kluwer Academic Publishers, Norwell, Mass.Google ScholarGoogle Scholar
  497. Schroeder, B., and G. A. Gibson [2007]. "Understanding failures in petascale computers," J. of Physics Conf. Series 78(1), 188-198.Google ScholarGoogle Scholar
  498. Schroeder, B., E. Pinheiro, and W.-D. Weber [2009]. "DRAM errors in the wild: a largescale field study," Proc. Eleventh Int'l. Joint Conf. on Measurement and Modeling of Computer Systems (SIGMETRICS) , June 15-19, 2009, Seattle, Wash. Google ScholarGoogle Scholar
  499. Schurman, E., and J. Brutlag [2009]. "The user and business impact of server delays," Proc. Velocity: Web Performance and Operations Conf. , June 22-24, 2009, San Jose, Calif.Google ScholarGoogle Scholar
  500. Schwartz, J. T. [1980]. "Ultracomputers," ACM Trans. on Programming Languages and Systems 4:2, 484-521. Google ScholarGoogle ScholarDigital LibraryDigital Library
  501. Scott, N. R. [1985]. Computer Number Systems and Arithmetic , Prentice Hall, Englewood Cliffs, N. J. Google ScholarGoogle Scholar
  502. Scott, S. L. [1996]. "Synchronization and communication in the T3E multiprocessor," Seventh Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass. Google ScholarGoogle Scholar
  503. Scott, S. L., and J. Goodman [1994]. "The impact of pipelined channels on k -ary n -cube networks," IEEE Trans. on Parallel and Distributed Systems 5:1 (January), 1-16. Google ScholarGoogle ScholarDigital LibraryDigital Library
  504. Scott, S. L., and G. M. Thorson [1996]. "The Cray T3E network: Adaptive routing in a high performance 3D torus," Proc. IEEE HOT Interconnects '96 , August 15-17, 1996, Stanford University, Palo Alto, Calif., 14-156.Google ScholarGoogle Scholar
  505. Scranton, R. A., D. A. Thompson, and D. W. Hunter [1983]. The Access Time Myth ," Tech. Rep. RC 10197 (45223), IBM, Yorktown Heights, N.Y.Google ScholarGoogle Scholar
  506. Seagate. [2000]. Seagate Cheetah 73 Family: ST173404LW/LWV/LC/LCV Product Manual , Vol. 1, Seagate, Scotts Valley, Calif. (www.seagate.com/support/disc/manuals/scsi/29478b.pdf).Google ScholarGoogle Scholar
  507. Seitz, C. L. [1985]. "The Cosmic Cube (concurrent computing)," Communications of the ACM 28:1 (January), 22-33. Google ScholarGoogle ScholarDigital LibraryDigital Library
  508. Senior, J. M. [1993]. Optical Fiber Commmunications: Principles and Practice , 2nd ed., Prentice Hall, Hertfordshire, U. K. Google ScholarGoogle Scholar
  509. Sharangpani, H., and K. Arora [2000]. "Itanium Processor Microarchitecture," IEEE Micro 20:5 (September-October), 24-43. Google ScholarGoogle ScholarDigital LibraryDigital Library
  510. Shurkin, J. [1984]. Engines of the Mind: A History of the Computer , W. W. Norton, New York. Google ScholarGoogle Scholar
  511. Shustek, L. J. [1978]. "Analysis and Performance of Computer Instruction Sets," Ph. D. dissertation, Stanford University, Palo Alto, Calif. Google ScholarGoogle Scholar
  512. Silicon Graphics. [1996]. MIPS V Instruction Set (see http://www.sgi.com/MIPS/arch/ISA5/#MIPSV_indx).Google ScholarGoogle Scholar
  513. Singh, J. P., J. L. Hennessy, and A. Gupta [1993]. "Scaling parallel programs for multiprocessors: Methodology and examples," Computer 26:7 (July), 22-33. Google ScholarGoogle ScholarDigital LibraryDigital Library
  514. Sinharoy, B., R. N. Koala, J. M. Tendler, R. J. Eickemeyer, and J. B. Joyner [2005]. "POWER5 system microarchitecture," IBM J. Research and Development , 49:4-5, 505-521. Google ScholarGoogle ScholarCross RefCross Ref
  515. Sites, R. [1979]. Instruction Ordering for the CRAY-1 Computer , Tech. Rep. 78-CS-023, Dept. of Computer Science, University of California, San Diego.Google ScholarGoogle Scholar
  516. Sites, R. L. (ed.) [1992]. Alpha Architecture Reference Manual , Digital Press, Burlington, Mass. Google ScholarGoogle Scholar
  517. Sites, R. L., and R. Witek, (eds.) [1995]. Alpha Architecture Reference Manual , 2nd ed., Digital Press, Newton, Mass. Google ScholarGoogle Scholar
  518. Skadron, K., and D. W. Clark [1997]. "Design issues and tradeoffs for write buffers," Proc. Third Int'l. Symposium on High-Performance Computer Architecture , February 1-5, 1997, San Antonio, Tex., 144-155. Google ScholarGoogle Scholar
  519. Skadron, K., P. S. Ahuja, M. Martonosi, and D. W. Clark [1999]. "Branch prediction, instruction-window size, and cache size: Performance tradeoffs and simulation techniques," IEEE Trans. on Computers 48:11 (November). Google ScholarGoogle ScholarDigital LibraryDigital Library
  520. Slater, R. [1987]. Portraits in Silicon , MIT Press, Cambridge, Mass. Google ScholarGoogle Scholar
  521. Slotnick, D. L., W. C. Borck, and R. C. McReynolds [1962]. "The Solomon computer," Proc. AFIPS Fall Joint Computer Conf. , December 4-6, 1962, Philadelphia, Penn., 97-107. Google ScholarGoogle Scholar
  522. Smith, A. J. [1982]. "Cache memories," Computing Surveys 14:3 (September), 473-530. Google ScholarGoogle ScholarDigital LibraryDigital Library
  523. Smith, A., and J. Lee [1984]. "Branch prediction strategies and branch-target buffer design," Computer 17:1 (January), 6-22. Google ScholarGoogle Scholar
  524. Smith, B. J. [1978]. "A pipelined, shared resource MIMD computer," Proc. Int'l. Conf. on Parallel Processing (ICPP) , August, Bellaire, Mich., 6-8.Google ScholarGoogle Scholar
  525. Smith, B. J. [1981]. "Architecture and applications of the HEP multiprocessor system," Real-Time Signal Processing IV 298 (August), 241-248.Google ScholarGoogle Scholar
  526. Smith, J. E. [1981]. "A study of branch prediction strategies," Proc. Eighth Annual Int'l. Symposium on Computer Architecture (ISCA) , May 12-14, 1981, Minneapolis, Minn., 135-148. Google ScholarGoogle Scholar
  527. Smith, J. E. [1984]. "Decoupled access/execute computer architectures," ACM Trans. on Computer Systems 2:4 (November), 289-308. Google ScholarGoogle ScholarDigital LibraryDigital Library
  528. Smith, J. E. [1988]. "Characterizing computer performance with a single number," Communications of the ACM 31:10 (October), 1202-1206. Google ScholarGoogle ScholarDigital LibraryDigital Library
  529. Smith, J. E. [1989]. "Dynamic instruction scheduling and the Astronautics ZS-1," Computer 22:7 (July), 21-35. Google ScholarGoogle ScholarDigital LibraryDigital Library
  530. Smith, J. E., and J. R. Goodman [1983]. "A study of instruction cache organizations and replacement policies," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 132-137. Google ScholarGoogle Scholar
  531. Smith, J. E., and A. R. Pleszkun [1988]. "Implementing precise interrupts in pipelined processors," IEEE Trans. on Computers 37:5 (May), 562-573. (This paper is based on an earlier paper that appeared in Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-19, 1985, Boston, Mass.) Google ScholarGoogle Scholar
  532. Smith, J. E., G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, C. M. Rozewski, D. L. Fowler, K. R. Scidmore, and J. P. Laudon [1987]. "The ZS-1 central processor," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 199-204. Google ScholarGoogle Scholar
  533. Smith, M. D., M. Horowitz, and M. S. Lam [1992]. "Efficient superscalar performance through boosting," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, 248-259. Google ScholarGoogle Scholar
  534. Smith, M. D., M. Johnson, and M. A. Horowitz [1989]. "Limits on multiple instruction issue," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 290-302. Google ScholarGoogle Scholar
  535. Smotherman, M. [1989]. "A sequencing-based taxonomy of I/O systems and review of historical machines," Computer Architecture News 17:5 (September), 5-15. Reprinted in Computer Architecture Readings , M. D. Hill, N. P. Jouppi, and G. S. Sohi, eds., Morgan Kaufmann, San Francisco, 1999, 451-461. Google ScholarGoogle ScholarDigital LibraryDigital Library
  536. Sodani, A., and G. Sohi [1997]. "Dynamic instruction reuse," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google ScholarGoogle Scholar
  537. Sohi, G. S. [1990]. "Instruction issue logic for high-performance, interruptible, multiple functional unit, pipelined computers," IEEE Trans. on Computers 39:3 (March), 349-359. Google ScholarGoogle ScholarDigital LibraryDigital Library
  538. Sohi, G. S., and S. Vajapeyam [1989]. "Tradeoffs in instruction format design for horizontal architectures," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 15-25. Google ScholarGoogle Scholar
  539. Soundararajan, V., M. Heinrich, B. Verghese, K. Gharachorloo, A. Gupta, and J. L. Hennessy [1998]. "Flexible use of memory for replication/migration in cachecoherent DSM multiprocessors," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 342-355. Google ScholarGoogle Scholar
  540. SPEC. [1989]. SPEC Benchmark Suite Release 1.0 (October 2).Google ScholarGoogle Scholar
  541. SPEC. [1994]. SPEC Newsletter (June).Google ScholarGoogle Scholar
  542. Sporer, M., F. H. Moss, and C. J. Mathais [1988]. "An introduction to the architecture of the Stellar Graphics supercomputer," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 464.Google ScholarGoogle Scholar
  543. Spurgeon, C. [2001]. "Charles Spurgeon's Ethernet Web Site," wwwhost.ots.utexas.edu/ethernet/ethernet-home.html.Google ScholarGoogle Scholar
  544. Spurgeon, C. [2006]. "Charles Spurgeon's Ethernet Web SITE," www.ethermanage.com/ethernet/ethernet.html.Google ScholarGoogle Scholar
  545. Stenstrom, P., T. Joe, and A. Gupta [1992]. "Comparative performance evaluation of cache-coherent NUMA and COMA architectures," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 80-91. Google ScholarGoogle Scholar
  546. Sterling, T. [2001]. Beowulf PC Cluster Computing with Windows and Beowulf PC Cluster Computing with Linux , MIT Press, Cambridge, Mass. Google ScholarGoogle Scholar
  547. Stern, N. [1980]. "Who invented the first electronic digital computer?" Annals of the History of Computing 2:4 (October), 375-376.Google ScholarGoogle Scholar
  548. Stevens, W. R. [1994-1996]. TCP/IP Illustrated (three volumes), Addison-Wesley, Reading, Mass.Google ScholarGoogle Scholar
  549. Stokes, J. [2000]. "Sound and Vision: A Technical Overview of the Emotion Engine," arstechnica.com/reviews/1q00/playstation2/ee-1.html.Google ScholarGoogle Scholar
  550. Stone, H. [1991]. High Performance Computers , Addison-Wesley, New York.Google ScholarGoogle Scholar
  551. Strauss, W. [1998]. "DSP Strategies 2002," www.usadata.com/market_research/spr_05/spr_r127-005.htm.Google ScholarGoogle Scholar
  552. Strecker, W. D. [1976]. "Cache memories for the PDP-11?," Proc. Third Annual Int'l. Symposium on Computer Architecture (ISCA) , January 19-21, 1976, Tampa, Fla., 155-158. Google ScholarGoogle Scholar
  553. Strecker, W. D. [1978]. "VAX-11/780: A virtual address extension of the PDP-11 family," Proc. AFIPS National Computer Conf. , June 5-8, 1978, Anaheim, Calif., 47, 967-980.Google ScholarGoogle Scholar
  554. Sugumar, R. A., and S. G. Abraham [1993]. "Efficient simulation of caches under optimal replacement with applications to miss characterization," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 17-21, 1993, Santa Clara, Calif., 24-35. Google ScholarGoogle Scholar
  555. Sun Microsystems. [1989]. The SPARC Architectural Manual , Version 8, Part No. 8001399-09, Sun Microsystems, Santa Clara, Calif.Google ScholarGoogle Scholar
  556. Sussenguth, E. [1999]. "IBM's ACS-1 Machine," IEEE Computer 22:11 (November).Google ScholarGoogle Scholar
  557. Swan, R. J., S. H. Fuller, and D. P. Siewiorek [1977]. "Cm*--a modular, multimicroprocessor," Proc. AFIPS National Computing Conf. , June 13-16, 1977, Dallas, Tex., 637-644. Google ScholarGoogle Scholar
  558. Swan, R. J., A. Bechtolsheim, K. W. Lai, and J. K. Ousterhout [1977]. "The implementation of the Cm* multi-microprocessor," Proc. AFIPS National Computing Conf. , June 13-16, 1977, Dallas, Tex., 645-654. Google ScholarGoogle Scholar
  559. Swartzlander, E. (ed.) [1990]. Computer Arithmetic , IEEE Computer Society Press, Los Alamitos, Calif. Google ScholarGoogle Scholar
  560. Takagi, N., H. Yasuura, and S. Yajima [1985]."High-speed VLSI multiplication algorithm with a redundant binary addition tree," IEEE Trans. on Computers C-34:9, 789-796. Google ScholarGoogle Scholar
  561. Talagala, N. [2000]. "Characterizing Large Storage Systems: Error Behavior and Performance Benchmarks," Ph. D. dissertation, Computer Science Division, University of California, Berkeley. Google ScholarGoogle Scholar
  562. Talagala, N., and D. Patterson [1999]. An Analysis of Error Behavior in a Large Storage System , Tech. Report UCB//CSD-99-1042, Computer Science Division, University of California, Berkeley. Google ScholarGoogle Scholar
  563. Talagala, N., R. Arpaci-Dusseau, and D. Patterson [2000]. Micro-Benchmark Based Extraction of Local and Global Disk Characteristics , CSD-99-1063, Computer Science Division, University of California, Berkeley. Google ScholarGoogle Scholar
  564. Talagala, N., S. Asami, D. Patterson, R. Futernick, and D. Hart [2000]. "The art of massive storage: A case study of a Web image archive," Computer (November). Google ScholarGoogle Scholar
  565. Tamir, Y., and G. Frazier [1992]. "Dynamically-allocated multi-queue buffers for VLSI communication switches," IEEE Trans. on Computers 41:6 (June), 725-734. Google ScholarGoogle ScholarDigital LibraryDigital Library
  566. Tanenbaum, A. S. [1978]. "Implications of structured programming for machine architecture," Communications of the ACM 21:3 (March), 237-246. Google ScholarGoogle ScholarDigital LibraryDigital Library
  567. Tanenbaum, A. S. [1988]. Computer Networks , 2nd ed., Prentice Hall, Englewood Cliffs, N. J. Google ScholarGoogle Scholar
  568. Tang, C. K. [1976]. "Cache design in the tightly coupled multiprocessor system," Proc. AFIPS National Computer Conf. , June 7-10, 1976, New York, 749-753. Google ScholarGoogle Scholar
  569. Tanqueray, D. [2002]. "The Cray X1 and supercomputer road map," Proc. 13th Daresbury Machine Evaluation Workshop , December 11-12, 2002, Daresbury Laboratories, Daresbury, Cheshire, U. K.Google ScholarGoogle Scholar
  570. Tarjan, D., S. Thoziyoor, and N. Jouppi [2005]. "HPL Technical Report on CACTI 4.0," www.hpl.hp.com/techeports/2006/HPL=2006+86.html.Google ScholarGoogle Scholar
  571. Taylor, G. S. [1981]. "Compatible hardware for division and square root," Proc. 5th IEEE Symposium on Computer Arithmetic , May 18-19, 1981, University of Michigan, Ann Arbor, Mich., 127-134.Google ScholarGoogle Scholar
  572. Taylor, G. S. [1985]. "Radix 16 SRT dividers with overlapped quotient selection stages," Proc. Seventh IEEE Symposium on Computer Arithmetic , June 4-6, 1985, University of Illinois, Urbana, Ill., 64-71.Google ScholarGoogle Scholar
  573. Taylor, G., P. Hilfinger, J. Larus, D. Patterson, and B. Zorn [1986]. "Evaluation of the SPUR LISP architecture," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo. Google ScholarGoogle Scholar
  574. Taylor, M. B., W. Lee, S. P. Amarasinghe, and A. Agarwal [2005]. "Scalar operand networks," IEEE Trans. on Parallel and Distributed Systems 16:2 (February), 145-162. Google ScholarGoogle ScholarDigital LibraryDigital Library
  575. Tendler, J. M., J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy [2002]. "Power4 system microarchitecture," IBM J. Research and Development 46:1, 5-26. Google ScholarGoogle ScholarDigital LibraryDigital Library
  576. Texas Instruments. [2000]. "History of Innovation: 1980s," www.ti.com/corp/docs/company/history/1980s.shtml.Google ScholarGoogle Scholar
  577. Tezzaron Semiconductor. [2004]. Soft Errors in Electronic Memory , White Paper,Google ScholarGoogle Scholar
  578. Tezzaron Semiconductor, Naperville, Ill. (http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf).Google ScholarGoogle Scholar
  579. Thacker, C. P., E. M. McCreight, B. W. Lampson, R. F. Sproull, and D. R. Boggs [1982]. "Alto: A personal computer," in D. P. Siewiorek, C. G. Bell, and A. Newell, eds., Computer Structures: Principles and Examples , McGraw-Hill, New York, 549-572.Google ScholarGoogle Scholar
  580. Thadhani, A. J. [1981]. "Interactive user productivity," IBM Systems J. 20:4, 407-423. Google ScholarGoogle ScholarDigital LibraryDigital Library
  581. Thekkath, R., A. P. Singh, J. P. Singh, S. John, and J. L. Hennessy [1997]. "An evaluation of a commercial CC-NUMA architecture--the CONVEX Exemplar SPP1200," Proc. 11th Int'l. Parallel Processing Symposium (IPPS) , April 1-7, 1997, Geneva, Switzerland. Google ScholarGoogle ScholarCross RefCross Ref
  582. Thorlin, J. F. [1967]. "Code generation for PIE (parallel instruction execution) computers," Proc. Spring Joint Computer Conf. , April 18-20, 1967, Atlantic City, N. J., 27. Google ScholarGoogle Scholar
  583. Thornton, J. E. [1964]. "Parallel operation in the Control Data 6600," Proc. AFIPS Fall Joint Computer Conf. , Part II , October 27-29, 1964, San Francisco, 26, 33-40. Google ScholarGoogle Scholar
  584. Thornton, J. E. [1970]. Design of a Computer, the Control Data 6600 , Scott, Foresman, Glenview, Ill. Google ScholarGoogle Scholar
  585. Tjaden, G. S., and M. J. Flynn [1970]. "Detection and parallel execution of independent instructions," IEEE Trans. on Computers C-19:10 (October), 889-895. Google ScholarGoogle ScholarDigital LibraryDigital Library
  586. Tomasulo, R. M. [1967]. "An efficient algorithm for exploiting multiple arithmetic units," IBM J. Research and Development 11:1 (January), 25-33. Google ScholarGoogle ScholarDigital LibraryDigital Library
  587. Torrellas, J., A. Gupta, and J. Hennessy [1992]. "Characterizing the caching and synchronization performance of a multiprocessor operating system," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston ( SIGPLAN Notices 27:9 (September), 162-174). Google ScholarGoogle Scholar
  588. Touma, W. R. [1993]. The Dynamics of the Computer Industry: Modeling the Supply of Workstations and Their Components , Kluwer Academic, Boston. Google ScholarGoogle Scholar
  589. Tuck, N., and D. Tullsen [2003]. "Initial observations of the simultaneous multithreading Pentium 4 processor," Proc. 12th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT'03 ), September 27-October 1, 2003, New Orleans, La., 26-34. Google ScholarGoogle Scholar
  590. Tullsen, D. M., S. J. Eggers, and H. M. Levy [1995]. "Simultaneous multithreading: Maximizing on-chip parallelism," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy, 392-403. Google ScholarGoogle Scholar
  591. Tullsen, D. M., S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, and R. L. Stamm [1996]. "Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor," Proc. 23rd Annual Int'l. Symposium on Computer Architecture (ISCA) , May 22-24, 1996, Philadelphia, Penn., 191-202. Google ScholarGoogle Scholar
  592. Ungar, D., R. Blau, P. Foley, D. Samples, and D. Patterson [1984]. "Architecture of SOAR: Smalltalk on a RISC," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 188-197. Google ScholarGoogle Scholar
  593. Unger, S. H. [1958]. "A computer oriented towards spatial problems," Proc. Institute of Radio Engineers 46:10 (October), 1744-1750. Google ScholarGoogle Scholar
  594. Vahdat, A., M. Al-Fares, N. Farrington, R. Niranjan Mysore, G. Porter, and S. Radhakrishnan [2010]. "Scale-Out Networking in the Data Center," IEEE Micro 30:4 (July/August), 29-41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  595. Vaidya, A. S., A Sivasubramaniam, and C. R. Das [1997]. "Performance benefits of virtual channels and adaptive routing: An application-driven study," Proc. ACM/IEEE Conf. on Supercomputing , November 16-21, 1997, San Jose, Calif. Google ScholarGoogle ScholarDigital LibraryDigital Library
  596. Vajapeyam, S. [1991]. "Instruction-Level Characterization of the Cray Y-MP Processor," Ph. D. thesis, Computer Sciences Department, University of Wisconsin-Madison. Google ScholarGoogle Scholar
  597. van Eijndhoven, J. T. J., F. W. Sijstermans, K. A. Vissers, E. J. D. Pol, M. I. A. Tromp, P. Struik, R. H. J. Bloks, P. van der Wolf, A. D. Pimentel, and H. P. E. Vranken [1999]. "Trimedia CPU64 architecture," Proc. IEEE Int'l. Conf. on Computer Design: VLSI in Computers and Processors (ICCD'99) , October 10-13, 1999, Austin, Tex., 586-592. Google ScholarGoogle Scholar
  598. Van Vleck, T. [2005]. "The IBM 360/67 and CP/CMS," http://www.multicians.org/thvv/360-67.html.Google ScholarGoogle Scholar
  599. von Eicken, T., D. E. Culler, S. C. Goldstein, and K. E. Schauser [1992]. "Active Messages: A mechanism for integrated communication and computation," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google ScholarGoogle Scholar
  600. Waingold, E., M. Taylor, D. Srikrishna, V. Sarkar, W. Lee, V. Lee, J. Kim, M. Frank, P. Finch, R. Barua, J. Babb, S. Amarasinghe, and A. Agarwal [1997]. "Baring it all to software: Raw Machines," IEEE Computer 30 (September), 86-93. Google ScholarGoogle ScholarDigital LibraryDigital Library
  601. Wakerly, J. [1989]. Microcomputer Architecture and Programming , Wiley, New York. Google ScholarGoogle Scholar
  602. Wall, D. W. [1991]. "Limits of instruction-level parallelism," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 248-259. Google ScholarGoogle Scholar
  603. Wall, D. W. [1993]. Limits of Instruction-Level Parallelism , Research Rep. 93/6, Western Research Laboratory, Digital Equipment Corp., Palo Alto, Calif.Google ScholarGoogle Scholar
  604. Walrand, J. [1991]. Communication Networks: A First Course , Aksen Associates/Irwin, Homewood, Ill. Google ScholarGoogle Scholar
  605. Wang, W.-H., J.-L. Baer, and H. M. Levy [1989]. "Organization and performance of a two-level virtual-real cache hierarchy," Proc. 16th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-June 1, 1989, Jerusalem, 140-148. Google ScholarGoogle ScholarCross RefCross Ref
  606. Watanabe, T. [1987]. "Architecture and performance of the NEC supercomputer SX system," Parallel Computing 5, 247-255.Google ScholarGoogle ScholarCross RefCross Ref
  607. Waters, F. (ed.) [1986]. IBM RT Personal Computer Technology , SA 23-1057, IBM, Austin, Tex.Google ScholarGoogle Scholar
  608. Watson, W. J. [1972]. "The TI ASC--a highly modular and flexible super processor architecture," Proc. AFIPS Fall Joint Computer Conf. , December 5-7, 1972, Anaheim, Calif., 221-228. Google ScholarGoogle Scholar
  609. Weaver, D. L., and T. Germond [1994]. The SPARC Architectural Manual , Version 9, Prentice Hall, Englewood Cliffs, N. J. Google ScholarGoogle Scholar
  610. Weicker, R. P. [1984]. "Dhrystone: A synthetic systems programming benchmark," Communications of the ACM 27:10 (October), 1013-1030. Google ScholarGoogle ScholarDigital LibraryDigital Library
  611. Weiss, S., and J. E. Smith [1984]. "Instruction issue logic for pipelined supercomputers," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 110-118. Google ScholarGoogle Scholar
  612. Weiss, S., and J. E. Smith [1987]. "A study of scalar compilation techniques for pipelined supercomputers," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 105-109. Google ScholarGoogle Scholar
  613. Weiss, S., and J. E. Smith [1994]. Power and PowerPC , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  614. Wendel, D., R. Kalla, J. Friedrich, J. Kahle, J. Leenstra, C. Lichtenau, B. Sinharoy, W. Starke, and V. Zyuban [2010]. "The Power7 processor SoC," Proc. Int'l. Conf. on IC Design and Technology , June 2-4, 2010, Grenoble, France, 71-73.Google ScholarGoogle Scholar
  615. Weste, N., and K. Eshraghian [1993]. Principles of CMOS VLSI Design: A Systems Perspective , 2nd ed., Addison-Wesley, Reading, Mass.Google ScholarGoogle Scholar
  616. Wiecek, C. [1982]. "A case study of the VAX 11 instruction set usage for compiler execution," Proc. Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 177-184. Google ScholarGoogle ScholarDigital LibraryDigital Library
  617. Wilkes, M. [1965]. "Slave memories and dynamic storage allocation," IEEE Trans. Electronic Computers EC-14:2 (April), 270-271.Google ScholarGoogle ScholarCross RefCross Ref
  618. Wilkes, M. V. [1982]. "Hardware support for memory protection: Capability implementations," Proc. Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 107-116. Google ScholarGoogle Scholar
  619. Wilkes, M. V. [1985]. Memoirs of a Computer Pioneer , MIT Press, Cambridge, Mass. Google ScholarGoogle Scholar
  620. Wilkes, M. V. [1995]. Computing Perspectives , Morgan Kaufmann, San Francisco. Google ScholarGoogle Scholar
  621. Wilkes, M. V., D. J. Wheeler, and S. Gill [1951]. The Preparation of Programs for an Electronic Digital Computer , Addison-Wesley, Cambridge, Mass.Google ScholarGoogle Scholar
  622. Williams, S., A. Waterman, and D. Patterson [2009]. "Roofline: An insightful visual performance model for multicore architectures," Communications of the ACM , 52:4 (April), 65-76. Google ScholarGoogle ScholarDigital LibraryDigital Library
  623. Williams, T. E., M. Horowitz, R. L. Alverson, and T. S. Yang [1987]. "A self-timed chip for division," in P. Losleben, ed., 1987 Stanford Conference on Advanced Research in VLSI , MIT Press, Cambridge, Mass.Google ScholarGoogle Scholar
  624. Wilson, A. W., Jr. [1987]. "Hierarchical cache/bus architecture for shared-memory multiprocessors," Proc. 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 244-252. Google ScholarGoogle Scholar
  625. Wilson, R. P., and M. S. Lam [1995]. "Efficient context-sensitive pointer analysis for C programs," Proc. ACM SIGPLAN'95 Conf. on Programming Language Design and Implementation , June 18-21, 1995, La Jolla, Calif., 1-12. Google ScholarGoogle Scholar
  626. Wolfe, A., and J. P. Shen [1991]. "A variable instruction stream extension to the VLIW architecture," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 2-14. Google ScholarGoogle Scholar
  627. Wood, D. A., and M. D. Hill [1995]. "Cost-effective parallel computing," IEEE Computer 28:2 (February), 69-72. Google ScholarGoogle ScholarDigital LibraryDigital Library
  628. Wulf, W. [1981]. "Compilers and computer architecture," Computer 14:7 (July), 41-47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  629. Wulf, W., and C. G. Bell [1972]. "C.mmp--A multi-mini-processor," Proc. AFIPS Fall Joint Computer Conf. , December 5-7, 1972, Anaheim, Calif., 765-777. Google ScholarGoogle Scholar
  630. Wulf, W., and S. P. Harbison [1978]. "Reflections in a pool of processors--an experience report on C.mmp/Hydra," Proc. AFIPS National Computing Conf. June 5-8, 1978, Anaheim, Calif., 939-951.Google ScholarGoogle Scholar
  631. Wulf, W. A., and S. A. McKee [1995]. "Hitting the memory wall: Implications of the obvious," ACM SIGARCH Computer Architecture News , 23:1 (March), 20-24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  632. Wulf, W. A., R. Levin, and S. P. Harbison [1981]. Hydra/C.mmp: An Experimental Computer System , McGraw-Hill, New York.Google ScholarGoogle Scholar
  633. Yamamoto, W., M. J. Serrano, A. R. Talcott, R. C. Wood, and M. Nemirosky [1994]. "Performance estimation of multistreamed, superscalar processors," Proc. 27th Annual Hawaii Int'l. Conf. on System Sciences , January 4-7, 1994, Maui, 195-204.Google ScholarGoogle Scholar
  634. Yang, Y., and G. Mason [1991]. "Nonblocking broadcast switching networks," IEEE Trans. on Computers 40:9 (September), 1005-1015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  635. Yeager, K. [1996]. "The MIPS R10000 superscalar microprocessor," IEEE Micro 16:2 (April), 28-40. Google ScholarGoogle ScholarDigital LibraryDigital Library
  636. Yeh, T., and Y. N. Patt [1993a]. "Alternative implementations of two-level adaptive branch prediction," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 124-134. Google ScholarGoogle Scholar
  637. Yeh, T., and Y. N. Patt [1993b]. "A comparison of dynamic branch predictors that use two levels of branch history," Proc. 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif., 257-266. Google ScholarGoogle Scholar

Cited By

  1. ACM
    Venkatesha S and Parthasarathi R (2024). Survey on Redundancy Based-Fault tolerance methods for Processors and Hardware accelerators - Trends in Quantum Computing, Heterogeneous Systems and Reliability, ACM Computing Surveys, 56:11, (1-76), Online publication date: 30-Nov-2024.
  2. Elderhalli Y, Hasan O and Tahar S (2024). Dynamic dependability analysis of shuffle-exchange networks, Formal Methods in System Design, 62:1-3, (285-325), Online publication date: 1-Jun-2024.
  3. ACM
    Fu X, Yang W, Dong D and Su X Optimizing Attention by Exploiting Data Reuse on ARM Multi-core CPUs Proceedings of the 38th ACM International Conference on Supercomputing, (137-149)
  4. Mosquera F, Ekanayake A, Hua W, Kavi K, Mehta G and John L (2024). SecurityCloak, Journal of Systems Architecture: the EUROMICRO Journal, 150:C, Online publication date: 1-May-2024.
  5. ACM
    Miao X, Oliaro G, Zhang Z, Cheng X, Wang Z, Zhang Z, Wong R, Zhu A, Yang L, Shi X, Shi C, Chen Z, Arfeen D, Abhyankar R and Jia Z SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, (932-949)
  6. Ottaviano A, Balas R, Bambini G, Del Vecchio A, Ciani M, Rossi D, Benini L and Bartolini A (2024). ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation, International Journal of Parallel Programming, 52:1-2, (93-123), Online publication date: 1-Apr-2024.
  7. ACM
    Wang Z, Liu L and Xiao L (2024). iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments, ACM Transactions on Architecture and Code Optimization, 0:0
  8. ACM
    Zhou C, Hassman Z, Shah D, Richard V and Li Y YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, (212-226)
  9. Chang L, Zhao X, Yue T, Yang X, Li C, Lin S and Zhou J (2024). IPOCIM: Artificial Intelligent Architecture Design Space Exploration With Scalable Ping-Pong Computing-in-Memory Macro, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 32:2, (256-268), Online publication date: 1-Feb-2024.
  10. Nicolás-Conesa V, Titos-Gil R, Fernández-Pascual R, Ros A and Acacio M (2024). On the interactions between ILP and TLP with hardware transactional memory, Microprocessors & Microsystems, 104:C, Online publication date: 1-Feb-2024.
  11. ACM
    Yang S, Dong C, Xiao Y, Cheng Y, Shi Z, Li Z and Sun L (2023). Asteria-Pro: Enhancing Deep Learning-based Binary Code Similarity Detection by Incorporating Domain Knowledge, ACM Transactions on Software Engineering and Methodology, 33:1, (1-40), Online publication date: 31-Jan-2024.
  12. Jiang Z, Yang K, Fisher N, Guan N, Audsley N and Dong Z (2024). Hopscotch: A Hardware-Software Co-Design for Efficient Cache Resizing on Multi-Core SoCs, IEEE Transactions on Parallel and Distributed Systems, 35:1, (89-104), Online publication date: 1-Jan-2024.
  13. Yang J, Yang Z, Casas J and Ray S (2024). Correct-by-Construction Design of Custom Accelerator Microarchitectures, IEEE Transactions on Computers, 73:1, (278-291), Online publication date: 1-Jan-2024.
  14. Dong P, Kong Z, Meng X, Yu P, Gong Y, Yuan G, Tang H and Wang Y HotBEV Proceedings of the 37th International Conference on Neural Information Processing Systems, (2824-2836)
  15. Li Y, Li N, Zhang Y, Guo J, Huang B, Xing M and Huang W Hmem: A Holistic Memory Performance Metric for Cloud Computing Benchmarking, Measuring, and Optimizing, (171-187)
  16. Jiang Z, Dai X, Wei R, Gray I, Gu Z, Zhao Q and Zhao S (2023). NPRC-I/O: An NoC-Based Real-Time I/O System With Reduced Contention and Enhanced Predictability, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:12, (4629-4642), Online publication date: 1-Dec-2023.
  17. ACM
    Lee J, Min D, Byun I, Jang H and Kim J Fast, Light-weight, and Accurate Performance Evaluation using Representative Datacenter Behaviors Proceedings of the 24th International Middleware Conference, (220-233)
  18. ACM
    Cheshmi K, Strout M and Mehri Dehnavi M Runtime Composition of Iterations for Fusing Loop-carried Sparse Dependence Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-15)
  19. ACM
    Peng J, Fang J, Liu J, Xie M, Dai Y, Yang B, Li S and Wang Z Optimizing MPI Collectives on Shared Memory Multi-Cores Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-15)
  20. Lin Z, Liang T, Zhao J, Sinha S and Zhang W (2023). HL-Pow: Learning-Assisted Pre-RTL Power Modeling and Optimization for FPGA HLS, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:11, (3925-3938), Online publication date: 1-Nov-2023.
  21. ACM
    Moghimi A, Hattori J, Li A, Ben Chikha M and Shahrad M Parrotfish Proceedings of the 2023 ACM Symposium on Cloud Computing, (177-192)
  22. ACM
    Zeng J, Jeong J and Jung C Persistent Processor Architecture Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, (1075-1091)
  23. Villon L, Susskind Z, Bacellar A, Miranda I, de Araújo L, Lima P, Breternitz M, John L, França F and Dutra D (2023). A conditional branch predictor based on weightless neural networks, Neurocomputing, 555:C, Online publication date: 28-Oct-2023.
  24. Kong X, Zheng X, Zhu Y, Duan G and Chen Z (2023). I/O-efficient GPU-based acceleration of coherent dedispersion for pulsar observation, Journal of Systems Architecture: the EUROMICRO Journal, 142:C, Online publication date: 1-Sep-2023.
  25. ACM
    Kong L, Tan J, Huang J, Chen G, Wang S, Jin X, Zeng P, Khan M and Das S (2022). Edge-computing-driven Internet of Things: A Survey, ACM Computing Surveys, 55:8, (1-41), Online publication date: 31-Aug-2023.
  26. Chen C, Kande R, Nguyen N, Andersen F, Tyagi A, Sadeghi A and Rajendran J HyPFuzz Proceedings of the 32nd USENIX Conference on Security Symposium, (1361-1378)
  27. ACM
    Min D, Kim K, Moon C, Khan A, Lee S, Yun C, Chung W and Kim Y (2023). A Multi-tenant Key-value SSD with Secondary Index for Search Query Processing and Analysis, ACM Transactions on Embedded Computing Systems, 22:4, (1-27), Online publication date: 31-Jul-2023.
  28. ACM
    Naghibijouybari H, Koruyeh E and Abu-Ghazaleh N (2022). Microarchitectural Attacks in Heterogeneous Systems: A Survey, ACM Computing Surveys, 55:7, (1-40), Online publication date: 31-Jul-2023.
  29. Orts F, Ortega G, Combarro E, Rúa I, Puertas A and Garzón E (2023). Efficient design of a quantum absolute-value circuit using Clifford+T gates, The Journal of Supercomputing, 79:11, (12656-12670), Online publication date: 1-Jul-2023.
  30. Khanna G, Chaturvedi S and Othman M (2023). On design and performance analysis of improved shuffle exchange gamma interconnection network layouts, The Journal of Supercomputing, 79:11, (11611-11640), Online publication date: 1-Jul-2023.
  31. Li X, Parazeres M, Oberman A, Ghaffari A, Asgharian M and Nia V (2023). EuclidNets: An Alternative Operation for Efficient Inference of Deep Learning Models, SN Computer Science, 4:5, Online publication date: 30-Jun-2023.
  32. ACM
    Resch S, Cilasun H, Chowdhury Z, Zabihi M, Zhao Z, Wang J, Sapatnekar S and Karpuzcu U On Endurance of Processing in (Nonvolatile) Memory Proceedings of the 50th Annual International Symposium on Computer Architecture, (1-13)
  33. ACM
    Friedman R, Goaz O and Hovav D PKache: A Generic Framework for Data Plane Caching Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, (1268-1276)
  34. ACM
    Mhatre S and Chandran P On the Measurement of Performance Metrics for Virtualization-Enhanced Architectures Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, (49-56)
  35. ACM
    Araújo De Medeiros D, Markidis S and Bo Peng I LibCOS: Enabling Converged HPC and Cloud Data Stores with MPI Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, (106-116)
  36. ACM
    Hessien S and Hassan M (2022). PISCOT: A Pipelined Split-Transaction COTS-Coherent Bus for Multi-Core Real-Time Systems, ACM Transactions on Embedded Computing Systems, 22:1, (1-27), Online publication date: 31-Jan-2023.
  37. Sahabandu D, Mertoguno J and Poovendran R (2023). A Natural Language Processing Approach for Instruction Set Architecture Identification, IEEE Transactions on Information Forensics and Security, 18, (4086-4099), Online publication date: 1-Jan-2023.
  38. Xu M, Ng W, Lim W, Kang J, Xiong Z, Niyato D, Yang Q, Shen X and Miao C (2023). A Full Dive Into Realizing the Edge-Enabled Metaverse: Visions, Enabling Technologies, and Challenges, IEEE Communications Surveys & Tutorials, 25:1, (656-700), Online publication date: 1-Jan-2023.
  39. Neto A, Neto J and Moreno E (2022). The development of a low-cost big data cluster using Apache Hadoop and Raspberry Pi. A complete guide, Computers and Electrical Engineering, 104:PA, Online publication date: 1-Dec-2022.
  40. Kopper P, Copplestone S, Pfeiffer M, Koch C, Fasoulas S and Beck A (2022). Hybrid parallelization of Euler–Lagrange simulations based on MPI-3 shared memory, Advances in Engineering Software, 174:C, Online publication date: 1-Dec-2022.
  41. Han Y, Yuan Z, Pu Y, Xue C, Song S, Sun G and Huang G Latency-aware spatial-wise dynamic networks Proceedings of the 36th International Conference on Neural Information Processing Systems, (36845-36857)
  42. Song C, Wright S, Lin C and Diakonikolas J Coordinate linear variance reduction for generalized linear programming Proceedings of the 36th International Conference on Neural Information Processing Systems, (22049-22063)
  43. Du X, Chen A, He B, Chen H, Zhang F and Chen Y (2022). AflIot, Computers and Security, 122:C, Online publication date: 1-Nov-2022.
  44. Bang T, May N, Petrov I and Binnig C (2022). The full story of 1000 cores, The VLDB Journal — The International Journal on Very Large Data Bases, 31:6, (1185-1213), Online publication date: 1-Nov-2022.
  45. ACM
    Gebregiorgis A, Du Nguyen H, Yu J, Bishnoi R, Taouil M, Catthoor F and Hamdioui S (2022). A Survey on Memory-centric Computer Architectures, ACM Journal on Emerging Technologies in Computing Systems, 18:4, (1-50), Online publication date: 31-Oct-2022.
  46. Zhang J, Cheng Y, Deng X, Wang B, Xie J, Yang Y and Zhang M (2022). A Reputation-Based Mechanism for Transaction Processing in Blockchain Systems, IEEE Transactions on Computers, 71:10, (2423-2434), Online publication date: 1-Oct-2022.
  47. Jeong I, Lee J, Yoon M and Ro W Reconstructing Out-of-Order Issue Queue Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, (144-161)
  48. ACM
    Resch S, Khatamifard S, Chowdhury Z, Zabihi M, Zhao Z, Cilasun H, Wang J, Sapatnekar S and Karpuzcu U (2022). Energy-efficient and Reliable Inference in Nonvolatile Memory under Extreme Operating Conditions, ACM Transactions on Embedded Computing Systems, 21:5, (1-36), Online publication date: 30-Sep-2022.
  49. ACM
    Baldassin A, Barreto J, Castro D and Romano P (2021). Persistent Memory, ACM Computing Surveys, 54:7, (1-37), Online publication date: 30-Sep-2022.
  50. ACM
    Resch S and Karpuzcu U (2021). Benchmarking Quantum Computers and the Impact of Quantum Noise, ACM Computing Surveys, 54:7, (1-35), Online publication date: 30-Sep-2022.
  51. Jiang Z, Yang K, Fisher N, Audsley N and Dong Z (2022). Towards an energy-efficient quarter-clairvoyant mixed-criticality system, Journal of Systems Architecture: the EUROMICRO Journal, 130:C, Online publication date: 1-Sep-2022.
  52. Rosenbloom P Thoughts on Architecture Artificial General Intelligence, (364-373)
  53. Mahafzah B, Al-Adwan A and Zaghloul R (2022). Topological properties assessment of optoelectronic architectures, Telecommunications Systems, 80:4, (599-627), Online publication date: 1-Aug-2022.
  54. ACM
    Beckmann N, Gibbons P and McGuffey C Brief Announcement: Spatial Locality and Granularity Change in Caching Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures, (173-175)
  55. ACM
    Wu N, Yang H, Xie Y, Li P and Hao C High-level synthesis performance prediction using GNNs Proceedings of the 59th ACM/IEEE Design Automation Conference, (49-54)
  56. Orts F, Ortega G, Filatovas E and M. Garzón E (2022). Implementation of three efficient 4-digit fault-tolerant quantum carry lookahead adders, The Journal of Supercomputing, 78:11, (13323-13341), Online publication date: 1-Jul-2022.
  57. ACM
    Mbongue J, Kwadjo D, Shuping A and Bobda C (2022). Deploying Multi-tenant FPGAs within Linux-based Cloud Infrastructure, ACM Transactions on Reconfigurable Technology and Systems, 15:2, (1-31), Online publication date: 30-Jun-2022.
  58. ACM
    Paul A, Choi J, Karimi A and Wang F Machine Learning Assisted HPC Workload Trace Generation for Leadership Scale Storage Systems Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, (199-212)
  59. ACM
    Shukla S, Bandishte S, Gaur J and Subramoney S Register file prefetching Proceedings of the 49th Annual International Symposium on Computer Architecture, (410-423)
  60. Gudaparthi S and Shrestha R (2022). Selective register-file cache: an energy saving technique for embedded processor architecture, Design Automation for Embedded Systems, 26:2, (105-124), Online publication date: 1-Jun-2022.
  61. ACM
    Xiong W and Szefer J (2021). Survey of Transient Execution Attacks and Their Mitigations, ACM Computing Surveys, 54:3, (1-36), Online publication date: 30-Apr-2022.
  62. Li Y, Yu X, Yang Y, Zhou Y, Yang T, Ma Z and Chen S (2021). Pyramid Family: Generic Frameworks for Accurate and Fast Flow Size Measurement, IEEE/ACM Transactions on Networking, 30:2, (586-600), Online publication date: 1-Apr-2022.
  63. Arras P, Andronidis A, Pina L, Mituzas K, Shu Q, Grumberg D and Cadar C (2022). SaBRe: load-time selective binary rewriting, International Journal on Software Tools for Technology Transfer (STTT), 24:2, (205-223), Online publication date: 1-Apr-2022.
  64. Guerrero-Balaguera J, Condia J and Reorda M A compaction method for STLs for GPU in-field test Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, (454-459)
  65. Bura A, Rengarajan D, Kalathil D, Shakkottai S and Chamberland J (2021). Learning to Cache and Caching to Learn: Regret Analysis of Caching Algorithms, IEEE/ACM Transactions on Networking, 30:1, (18-31), Online publication date: 1-Feb-2022.
  66. Jiang Z, Dong P, Wei R, Zhao Q, Wang Y, Zhu D, Zhuang Y and Audsley N (2022). PSpSys, Journal of Systems Architecture: the EUROMICRO Journal, 123:C, Online publication date: 1-Feb-2022.
  67. Berg B, Whitehouse J, Moseley B, Wang W and Harchol-Balter M (2021). The case for phase-aware scheduling of parallelizable jobs, Performance Evaluation, 153:C, Online publication date: 1-Feb-2022.
  68. ACM
    Wang M, Wen C and Chao H (2021). Roadrunner+: An Autonomous Intersection Management Cooperating with Connected Autonomous Vehicles and Pedestrians with Spillback Considered, ACM Transactions on Cyber-Physical Systems, 6:1, (1-29), Online publication date: 31-Jan-2022.
  69. ACM
    Gade S and Deb S (2021). A Novel Hybrid Cache Coherence with Global Snooping for Many-core Architectures, ACM Transactions on Design Automation of Electronic Systems, 27:1, (1-31), Online publication date: 31-Jan-2022.
  70. ACM
    Moti N, Schimmelpfennig F, Salkhordeh R, Klopp D, Cortes T, Rückert U and Brinkmann A Simurgh Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-14)
  71. Chowdhury S, Yang K and Nuzzo P ReIGNN: State Register Identification Using Graph Neural Networks for Circuit Reverse Engineering 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), (1-9)
  72. ACM
    Zeitak A and Morrison A Cuckoo Trie Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, (147-162)
  73. ACM
    Moreira A, Ottoni G and Quintão Pereira F (2021). VESPA: static profiling for binary optimization, Proceedings of the ACM on Programming Languages, 5:OOPSLA, (1-28), Online publication date: 20-Oct-2021.
  74. ACM
    Zhang M, Xie L, Zhang Z, Yu Q, Xi G, Zhang H, Liu F, Zheng Y, Zheng Y and Zhang S Exploiting Different Levels of Parallelism in the Quantum Control Microarchitecture for Superconducting Qubits MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, (898-911)
  75. ACM
    LeMay M, Rakshit J, Deutsch S, Durham D, Ghosh S, Nori A, Gaur J, Weiler A, Sultana S, Grewal K and Subramoney S Cryptographic Capability Computing MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, (253-267)
  76. ACM
    Carvalho D and Seznec A (2021). Understanding Cache Compression, ACM Transactions on Architecture and Code Optimization, 18:3, (1-27), Online publication date: 30-Sep-2021.
  77. Wu Y, Li J, Dai H, Yi X, Wang Y and Yang X micROS.BT: An Event-Driven Behavior Tree Framework for Swarm Robots 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (9146-9153)
  78. Nair A, Pai A, Raveendran B and Patil G MOESI Proceedings of the 2021 IEEE/ACM 25th International Symposium on Distributed Simulation and Real Time Applications, (1-8)
  79. Das A, Jose J and Mishra P (2021). Data Criticality in Multithreaded Applications: An Insight for Many-Core Systems, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 29:9, (1675-1679), Online publication date: 1-Sep-2021.
  80. Zhang J, Zhou X, Ge T, Wang X and Hwang T (2021). Joint Task Scheduling and Containerizing for Efficient Edge Computing, IEEE Transactions on Parallel and Distributed Systems, 32:8, (2086-2100), Online publication date: 1-Aug-2021.
  81. ACM
    Kim H, Amarnath A, Bagherzadeh J, Talati N and Dreslinski R (2021). A Survey Describing Beyond Si Transistors and Exploring Their Implications for Future Processors, ACM Journal on Emerging Technologies in Computing Systems, 17:3, (1-44), Online publication date: 31-Jul-2021.
  82. ACM
    Min D and Kim Y Isolating namespace and performance in key-value SSDs for multi-tenant environments Proceedings of the 13th ACM Workshop on Hot Topics in Storage and File Systems, (8-13)
  83. Chen J, Lu C, Ni J, Guo X, Girard P and Cheng Y (2021). DOVA PRO: A Dynamic Overwriting Voltage Adjustment Technique for STT-MRAM L1 Cache Considering Dielectric Breakdown Effect, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 29:7, (1325-1334), Online publication date: 1-Jul-2021.
  84. ACM
    Mustard C, Goswami S, Gharavi N, Nider J, Beschastnikh I and Fedorova A Jumpgate Proceedings of the 14th ACM International Conference on Systems and Storage, (1-12)
  85. ACM
    Skarlatos D, Zhao Z, Paccagnella R, Fletcher C and Torrellas J Jamais vu: thwarting microarchitectural replay attacks Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, (1061-1076)
  86. ACM
    C. A, Lee W and Lin W Branchboozle Proceedings of the 36th Annual ACM Symposium on Applied Computing, (1617-1625)
  87. Bazzaz M, Hoseinghorban A and Ejlali A (2021). Fast and Predictable Non-Volatile Data Memory for Real-Time Embedded Systems, IEEE Transactions on Computers, 70:3, (359-371), Online publication date: 1-Mar-2021.
  88. Zhou C, Wu W, He H, Yang P, Lyu F, Cheng N and Shen X (2021). Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in SAGIN, IEEE Transactions on Wireless Communications, 20:2, (911-925), Online publication date: 1-Feb-2021.
  89. Schuiki F, Zaruba F, Hoefler T and Benini L (2021). Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores, IEEE Transactions on Computers, 70:2, (212-227), Online publication date: 1-Feb-2021.
  90. Parra P, Guzmán D, Polo Ó, da Silva A, Martínez A, Sánchez S and Prieto M (2021). Improving performance and determinism of multitasking systems on the LEON architecture, Microprocessors & Microsystems, 80:C, Online publication date: 1-Feb-2021.
  91. ACM
    Salehnamadi N, Alshayban A, Ahmed I and Malek S ER catcher Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, (324-335)
  92. Park H, Ahn H and Jung S (2020). A Novel Matchline Scheduling Method for Low-Power and Reliable Search Operation in Cross-Point-Array Nonvolatile Ternary CAM, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 28:12, (2650-2657), Online publication date: 1-Dec-2020.
  93. Mozafari S and Meyer B (2020). Hot sparing for lifetime-chip-performance and cost improvement in application specific SIMT processors, Design Automation for Embedded Systems, 24:4, (249-266), Online publication date: 1-Dec-2020.
  94. Coffin E, Young S, Kaur H, Brown J, Pirvu M and Kent K MicroJIT Proceedings of the 30th Annual International Conference on Computer Science and Software Engineering, (179-188)
  95. Jeon Y, Park B, Kwon S, Kim B, Yun J and Lee D BiQGEMM Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-16)
  96. Berg B, Berger D, McAllister S, Grosof I, Gunasekar S, Lu J, Uhlar M, Carrig J, Beckmann N, Harchol-Balter M and Ganger G The CacheLib caching engine Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, (769-786)
  97. Orts F, Ortega G, Puertas A, García I and Garzón E (2020). On solving the unrelated parallel machine scheduling problem: active microrheology as a case study, The Journal of Supercomputing, 76:11, (8494-8509), Online publication date: 1-Nov-2020.
  98. Salazar C and Bobby Birrer M Instrumentation and Extension of reduced, simulated Single Cycle MIPS architecture to improve Student Comprehension 2020 IEEE Frontiers in Education Conference (FIE), (1-5)
  99. Wang M, Wang J, Wen C and Chao H Roadrunner: Autonomous Intersection Management with Dynamic Lane Assignment 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), (1-7)
  100. Fichte J, Hecher M and Szeider S A Time Leap Challenge for SAT-Solving Principles and Practice of Constraint Programming, (267-285)
  101. ACM
    Alam M, Nahiyan A, Sadi M, Forte D and Tehranipoor M (2020). Soft-HaT, ACM Transactions on Design Automation of Electronic Systems, 25:4, (1-22), Online publication date: 2-Sep-2020.
  102. Zhang Z, Henderson T, Karaman S and Sze V (2020). FSMI, International Journal of Robotics Research, 39:9, (1155-1177), Online publication date: 1-Aug-2020.
  103. ACM
    Sheikh S and Pasha M (2020). Energy-efficient Real-time Scheduling on Multicores, ACM Transactions on Embedded Computing Systems, 19:4, (1-25), Online publication date: 31-Jul-2020.
  104. ACM
    Atre N, Sherry J, Wang W and Berger D Caching with Delayed Hits Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, (495-513)
  105. Manjith B.C. and Ramasubramanian N. (2020). Securing AES Accelerator from Key-Leaking Trojans on FPGA, International Journal of Embedded and Real-Time Communication Systems, 11:3, (84-105), Online publication date: 1-Jul-2020.
  106. ACM
    Ritter F and Hack S PMEvo: portable inference of port mappings for out-of-order processors by evolutionary optimization Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, (608-622)
  107. ACM
    Lozano R and Schulte C (2019). Survey on Combinatorial Register Allocation and Instruction Scheduling, ACM Computing Surveys, 52:3, (1-50), Online publication date: 31-May-2020.
  108. ACM
    Lipp M, Schwarz M, Gruss D, Prescher T, Haas W, Horn J, Mangard S, Kocher P, Genkin D, Yarom Y, Hamburg M and Strackx R (2020). Meltdown, Communications of the ACM, 63:6, (46-56), Online publication date: 21-May-2020.
  109. Lanuza J, Trabes G and Wainer G Parallel execution of DEVS in shared-memory multicore architectures Proceedings of the 2020 Spring Simulation Conference, (1-11)
  110. El-Moursy A, Sibai F, El-Moursy M and Mohamed A (2020). PMSMC, Journal of Parallel and Distributed Computing, 139:C, (135-147), Online publication date: 1-May-2020.
  111. ACM
    Nguyen H, Yu J, Lebdeh M, Taouil M, Hamdioui S and Catthoor F (2020). A Classification of Memory-Centric Computing, ACM Journal on Emerging Technologies in Computing Systems, 16:2, (1-26), Online publication date: 30-Apr-2020.
  112. Jošilo S and Dán G (2020). Computation Offloading Scheduling for Periodic Tasks in Mobile Edge Computing, IEEE/ACM Transactions on Networking, 28:2, (667-680), Online publication date: 1-Apr-2020.
  113. Hahn S and Reineke J (2019). Design and analysis of SIC: a provably timing-predictable pipelined processor core, Real-Time Systems, 56:2, (207-245), Online publication date: 1-Apr-2020.
  114. ACM
    Vineyard C, Plagge M and Green S Comparing Neural Accelerators & Neuromorphic Architectures The False Idol of Operations Proceedings of the 2020 Annual Neuro-Inspired Computational Elements Workshop, (1-6)
  115. ACM
    Zhang R, Biswas S, Balaji V, Bond M and Lucia B Peacenik Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, (317-333)
  116. Szymczyk M and Szymczyk P (2020). Automatic processing of Z-transform artificial neural networks using parallel programming, Neurocomputing, 379:C, (74-88), Online publication date: 28-Feb-2020.
  117. ACM
    Liu B, Cheshmi K, Soori S, Strout M and Dehnavi M MatRox Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, (389-402)
  118. Damaj I, Elshafei M, El-Abd M and Aydin M (2022). An analytical framework for high-speed hardware particle swarm optimization, Microprocessors & Microsystems, 72:C, Online publication date: 1-Feb-2020.
  119. ACM
    Edelkamp S and Weiß A (2019). BlockQuicksort, ACM Journal of Experimental Algorithmics, 24, (1-22), Online publication date: 17-Dec-2019.
  120. García-Martín E, Rodrigues C, Riley G and Grahn H (2019). Estimation of energy consumption in machine learning, Journal of Parallel and Distributed Computing, 134:C, (75-88), Online publication date: 1-Dec-2019.
  121. Wang L, Gao W, Yang K and Jiang Z BOPS, A New Computation-Centric Metric for Datacenter Computing Benchmarking, Measuring, and Optimizing, (262-277)
  122. Coffin E, Young S, Kent K and Pirvu M A roadmap for extending MicroJIT Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering, (293-298)
  123. Zaruba F and Benini L (2019). The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 27:11, (2629-2640), Online publication date: 1-Nov-2019.
  124. ACM
    Park S, Wu Y, Lee J, Aupov A and Mahlke S (2019). Multi-objective Exploration for Practical Optimization Decisions in Binary Translation, ACM Transactions on Embedded Computing Systems, 18:5s, (1-19), Online publication date: 31-Oct-2019.
  125. ACM
    Castro-Godínez J, Shafique M and Henkel J (2019). ECAx, ACM Transactions on Embedded Computing Systems, 18:5s, (1-20), Online publication date: 31-Oct-2019.
  126. ACM
    Nongpoh B, Ray R and Banerjee A Approximate computing for multithreaded programs in shared memory architectures Proceedings of the 17th ACM-IEEE International Conference on Formal Methods and Models for System Design, (1-9)
  127. Nair A, Colaco L, Patil G, Raveendran B and Punnekkatt S MEDIATOR Proceedings of the 23rd IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, (146-153)
  128. Hou Y, He H, Shamsi K, Jin Y, Wu D and Wu H (2019). On-Chip Analog Trojan Detection Framework for Microprocessor Trustworthiness, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 38:10, (1820-1830), Online publication date: 1-Oct-2019.
  129. ACM
    Lozano R, Carlsson M, Blindell G and Schulte C (2019). Combinatorial Register Allocation and Instruction Scheduling, ACM Transactions on Programming Languages and Systems, 41:3, (1-53), Online publication date: 30-Sep-2019.
  130. Sperl P and Böttinger K Side-Channel Aware Fuzzing Computer Security – ESORICS 2019, (259-278)
  131. Nadeem M, Li Z, Malik A, Biglari-Abhari M and Salcic Z (2019). Allocation and scheduling of SystemJ programs on chip multiprocessors with weighted TDMA scheduling, Journal of Systems Architecture: the EUROMICRO Journal, 98:C, (63-78), Online publication date: 1-Sep-2019.
  132. Nadeau D, Ezzati-Jivan N and Dagenais M (2019). Efficient large-scale heterogeneous debugging using dynamic tracing, Journal of Systems Architecture: the EUROMICRO Journal, 98:C, (346-360), Online publication date: 1-Sep-2019.
  133. ACM
    Ponugoti M and Milenkovic A (2019). Enabling On-the-Fly Hardware Tracing of Data Reads in Multicores, ACM Transactions on Embedded Computing Systems, 18:4, (1-27), Online publication date: 31-Jul-2019.
  134. Liu Z, Nath A, Ding X, Fu H, Muhib Khan M and Yu W (2022). Multivariate modeling and two-level scheduling of analytic queries, Parallel Computing, 85:C, (66-78), Online publication date: 1-Jul-2019.
  135. Reichenbach M, Holzinger P, Häublein K, Lieske T, Blinzer P and Fey D (2019). Heterogeneous Computing Utilizing FPGAs, Journal of Signal Processing Systems, 91:7, (745-757), Online publication date: 1-Jul-2019.
  136. ACM
    Geng T, Wang T, Wu C, Yang C, Wu W, Li A and Herbordt M O3BNN Proceedings of the ACM International Conference on Supercomputing, (461-472)
  137. ACM
    Chen Y and Louri A An online quality management framework for approximate communication in network-on-chips Proceedings of the ACM International Conference on Supercomputing, (217-226)
  138. Real P, Molina-Abril H, Díaz-del-Río F, Blanco-Trejo S and Onchis D Enhanced Parallel Generation of Tree Structures for the Recognition of 3D Images Pattern Recognition, (292-301)
  139. ACM
    Van Sandt P, Chronis Y and Patel J Efficiently Searching In-Memory Sorted Arrays Proceedings of the 2019 International Conference on Management of Data, (36-53)
  140. ACM
    Ayers G, Nagendra N, August D, Cho H, Kanev S, Kozyrakis C, Krishnamurthy T, Litz H, Moseley T and Ranganathan P AsmDB Proceedings of the 46th International Symposium on Computer Architecture, (462-473)
  141. ACM
    Pittino F, Bonfà P, Bartolini A, Affinito F, Benini L and Cavazzoni C Prediction of Time-to-Solution in Material Science Simulations Using Deep Learning Proceedings of the Platform for Advanced Scientific Computing Conference, (1-9)
  142. ACM
    Hanford N, Ahuja V, Farrens M, Tierney B and Ghosal D (2018). A Survey of End-System Optimizations for High-Speed Networks, ACM Computing Surveys, 51:3, (1-36), Online publication date: 31-May-2019.
  143. ACM
    Calciu I, Puddu I, Kolli A, Nowatzyk A, Gandhi J, Mutlu O and Subrahmanyam P Project PBerry Proceedings of the Workshop on Hot Topics in Operating Systems, (127-135)
  144. ACM
    Moreira F, Oliveira D and Navaux P SPADA Proceedings of the 16th ACM International Conference on Computing Frontiers, (50-58)
  145. Li G, Yang Y, Le F, Lim Y and Wang J Update Algebra: Toward Continuous, Non-Blocking Composition of Network Updates in SDN IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (1081-1089)
  146. ACM
    Gurung A and Ray R Simultaneous Solving of Batched Linear Programs on a GPU Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, (59-66)
  147. ACM
    Gebai M and Dagenais M (2018). Survey and Analysis of Kernel and Userspace Tracers on Linux, ACM Computing Surveys, 51:2, (1-33), Online publication date: 31-Mar-2019.
  148. ACM
    Nongpoh B, Ray R, Das M and Banerjee A (2019). Enhancing Speculative Execution With Selective Approximate Computing, ACM Transactions on Design Automation of Electronic Systems, 24:2, (1-29), Online publication date: 21-Mar-2019.
  149. Li F, Xu L, Duan S, Wu W, Zhao H and Ling Q (2019). Improving hierarchical mobile video caching through distributed cross-layer coordination, Multimedia Tools and Applications, 78:5, (6049-6071), Online publication date: 1-Mar-2019.
  150. ACM
    Jordan H, Subotić P, Zhao D and Scholz B A specialized B-tree for concurrent datalog evaluation Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, (327-339)
  151. Ying B, Yuan K and Sayed A (2019). Supervised Learning Under Distributed Features, IEEE Transactions on Signal Processing, 67:4, (977-992), Online publication date: 1-Feb-2019.
  152. Al-Adwan A, Sharieh A and Mahafzah B (2019). Parallel heuristic local search algorithm on OTIS hyper hexa-cell and OTIS mesh of trees optoelectronic architectures, Applied Intelligence, 49:2, (661-688), Online publication date: 1-Feb-2019.
  153. ACM
    Rhisheekesan A, Jeyapaul R and Shrivastava A (2019). Control Flow Checking or Not? (for Soft Errors), ACM Transactions on Embedded Computing Systems, 18:1, (1-25), Online publication date: 31-Jan-2019.
  154. ACM
    Guo X, Wang H, Zhang C, Tang H and Yuan Y Leakage-aware thermal management for multi-core systems using piecewise linear model based predictive control Proceedings of the 24th Asia and South Pacific Design Automation Conference, (64-69)
  155. Pontarelli S, Bonola M and Bianchi G (2018). Smashing OpenFlow's “atomic” actions, International Journal of Network Management, 29:1, Online publication date: 11-Jan-2019.
  156. ACM
    Shelor C and Kavi K Reconfigurable dataflow graphs for processing-in-memory Proceedings of the 20th International Conference on Distributed Computing and Networking, (110-119)
  157. Jošilo S and Dán G (2018). Selfish Decentralized Computation Offloading for Mobile Cloud Computing in Dense Wireless Networks, IEEE Transactions on Mobile Computing, 18:1, (207-220), Online publication date: 1-Jan-2019.
  158. Chen Y (2019). Reshaping Future Computing Systems With Emerging Nonvolatile Memory Technologies, IEEE Micro, 39:1, (54-57), Online publication date: 1-Jan-2019.
  159. Jiang Z, Gao W, Wang L, Xiong X, Zhang Y, Wen X, Luo C, Ye H, Lu X, Zhang Y, Feng S, Li K, Xu W and Zhan J HPC AI500: A Benchmark Suite for HPC AI Systems Benchmarking, Measuring, and Optimizing, (10-22)
  160. ACM
    Tran K, Jimborean A, Carlson T, Koukos K, Själander M and Kaxiras S (2018). SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores, ACM SIGPLAN Notices, 53:4, (328-343), Online publication date: 2-Dec-2018.
  161. Zhang J, Wu C, Yang D, Chen Y, Meng X, Xu L and Guo M (2018). HSCS, Frontiers of Computer Science: Selected Publications from Chinese Universities, 12:6, (1090-1104), Online publication date: 1-Dec-2018.
  162. Breβ S, Köcher B, Funke H, Zeuch S, Rabl T and Markl V (2018). Generating custom code for efficient query execution on heterogeneous processors, The VLDB Journal — The International Journal on Very Large Data Bases, 27:6, (797-822), Online publication date: 1-Dec-2018.
  163. ACM
    Lee J, Kim C, Lin K, Cheng L, Govindaraju R and Kim J (2018). WSMeter, ACM SIGPLAN Notices, 53:2, (549-563), Online publication date: 30-Nov-2018.
  164. ACM
    Einziger G, Eytan O, Friedman R and Manes B Adaptive Software Cache Management Proceedings of the 19th International Middleware Conference, (94-106)
  165. Asă?Voae I, Asă?Voae M and Riesco A (2018). Slicing from formal semantics, International Journal on Software Tools for Technology Transfer (STTT), 20:6, (739-769), Online publication date: 1-Nov-2018.
  166. Dey M, Nazari A, Zajic A and Prvulovic M TEMProf Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, (881-893)
  167. Yan M, Choi J, Skarlatos D, Morrison A, Fletcher C and Torrellas J InvisiSpec Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, (428-441)
  168. ACM
    Khattab O, Hammoud M and Shekfeh O PolyHJ Proceedings of the 27th ACM International Conference on Information and Knowledge Management, (1323-1332)
  169. ACM
    Rashid S, Nelissen G and Tovar E Trading Between Intra- and Inter-Task Cache Interference to Improve Schedulability Proceedings of the 26th International Conference on Real-Time Networks and Systems, (125-136)
  170. ACM
    Zoni D, Barenghi A, Pelosi G and Fornaciari W (2018). A Comprehensive Side-Channel Information Leakage Analysis of an In-Order RISC CPU Microarchitecture, ACM Transactions on Design Automation of Electronic Systems, 23:5, (1-30), Online publication date: 30-Sep-2018.
  171. ACM
    Jimenez L and Agyeman M A Study of Techniques to Increase Instruction Level Parallelisms Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control, (1-5)
  172. García-Martín E, Lavesson N, Grahn H, Casalicchio E and Boeva V How to Measure Energy Consumption in Machine Learning Algorithms ECML PKDD 2018 Workshops, (243-255)
  173. ACM
    Ognawala S, Amato R, Pretschner A and Kulkarni P Automatically assessing vulnerabilities discovered by compositional analysis Proceedings of the 1st International Workshop on Machine Learning and Software Engineering in Symbiosis, (16-25)
  174. Gu J, Yin S, liu L and Wei S (2018). Stress-Aware Loops Mapping on CGRAs with Dynamic Multi-Map Reconfiguration, IEEE Transactions on Parallel and Distributed Systems, 29:9, (2105-2120), Online publication date: 1-Sep-2018.
  175. ACM
    Ji K, Ling M, Shi L and Pan J (2018). An Analytical Cache Performance Evaluation Framework for Embedded Out-of-Order Processors Using Software Characteristics, ACM Transactions on Embedded Computing Systems, 17:4, (1-25), Online publication date: 29-Aug-2018.
  176. ACM
    Tan W, Chang S, Fong L, Li C, Wang Z and Cao L Matrix Factorization on GPUs with Memory Optimization and Approximate Computing Proceedings of the 47th International Conference on Parallel Processing, (1-10)
  177. Catalán S, Herrero J, Quintana-Ortí E and Rodríguez-Sánchez R (2018). Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors, Parallel Computing, 76:C, (18-27), Online publication date: 1-Aug-2018.
  178. Jakovljević R, Berić A, Van Dalen E and Milićev D (2018). New access modes of parallel memory subsystem for sub-pixel motion estimation, Journal of Real-Time Image Processing, 15:2, (279-296), Online publication date: 1-Aug-2018.
  179. ACM
    Psychou G, Rodopoulos D, Sabry M, Gemmeke T, Atienza D, Noll T and Catthoor F (2017). Classification of Resilience Techniques Against Functional Errors at Higher Abstraction Layers of Digital Systems, ACM Computing Surveys, 50:4, (1-38), Online publication date: 31-Jul-2018.
  180. Schulz L, Broneske D and Saake G (2018). An eight-dimensional systematic evaluation of optimized search algorithms on modern processors, Proceedings of the VLDB Endowment, 11:11, (1550-1562), Online publication date: 1-Jul-2018.
  181. ACM
    Kwon K, Amid A, Gholami A, Wu B, Asanovic K and Keutzer K Co-design of deep neural nets and neural net accelerators for embedded vision applications Proceedings of the 55th Annual Design Automation Conference, (1-6)
  182. Kwon K, Amid A, Gholami A, Wu B, Asanovic K and Keutzer K Invited: Co-Design of Deep Neural Nets and Neural Net Accelerators for Embedded Vision Applications 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), (1-6)
  183. ACM
    Kougkas A, Devarajan H and Sun X IRIS Proceedings of the 2018 International Conference on Supercomputing, (33-42)
  184. ACM
    Tran K, Jimborean A, Carlson T, Koukos K, Själander M and Kaxiras S SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, (328-343)
  185. ACM
    Zhang J and Gruenwald L Regularizing irregularity Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), (1-8)
  186. Bae D, Jo I, Choi Y, Hwang J, Cho S, Lee D and Jeong J 2B-SSD Proceedings of the 45th Annual International Symposium on Computer Architecture, (425-438)
  187. Parasar M, Bhattacharjee A and Krishna T SEESAW Proceedings of the 45th Annual International Symposium on Computer Architecture, (193-206)
  188. ACM
    Morse J, Kerrison S and Eder K (2018). On the Limitations of Analyzing Worst-Case Dynamic Energy of Processing, ACM Transactions on Embedded Computing Systems, 17:3, (1-22), Online publication date: 31-May-2018.
  189. ACM
    Crawford P, Barnes Jr. P, Eidenbenz S and Wilsey P Sampling Simulation Model Profile Data for Analysis Proceedings of the 2018 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (17-28)
  190. ACM
    Kelefouras V and Djemame K A methodology for efficient code optimizations and memory management Proceedings of the 15th ACM International Conference on Computing Frontiers, (105-112)
  191. ACM
    Malas T, Hager G, Ltaief H and Keyes D (2017). Multidimensional Intratile Parallelization for Memory-Starved Stencil Computations, ACM Transactions on Parallel Computing, 4:3, (1-32), Online publication date: 27-Apr-2018.
  192. Liao C, Lee S, Chiou Y, Lee C and Lee C (2018). Power consumption minimization by distributive particle swarm optimization for luminance control and its parallel implementations, Expert Systems with Applications: An International Journal, 96:C, (479-491), Online publication date: 15-Apr-2018.
  193. ACM
    Chen K and Chen C (2018). Enabling SIMT Execution Model on Homogeneous Multi-Core System, ACM Transactions on Architecture and Code Optimization, 15:1, (1-26), Online publication date: 31-Mar-2018.
  194. ACM
    Lee J, Kim C, Lin K, Cheng L, Govindaraju R and Kim J WSMeter Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, (549-563)
  195. Prakash A, Clarke C, Lam S and Srikanthan T (2018). Rapid Memory-Aware Selection of Hardware Accelerators in Programmable SoC Design, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 26:3, (445-456), Online publication date: 1-Mar-2018.
  196. Dolbeau R (2018). Theoretical peak FLOPS per instruction set: a tutorial, The Journal of Supercomputing, 74:3, (1341-1377), Online publication date: 1-Mar-2018.
  197. ACM
    Baba T, Watanabe S, Jackin B, Ohkawa T, Ootsu K, Yokota T, Hayasaki Y and Yatagai T Overcoming the difficulty of large-scale CGH generation on multi-GPU cluster Proceedings of the 11th Workshop on General Purpose GPUs, (13-21)
  198. ACM
    Josipović L, Ghosal R and Ienne P Dynamically Scheduled High-level Synthesis Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, (127-136)
  199. Siddique N, Grubel P, Badawy A and Cook J (2018). A performance study of the time-varying cache behavior, The Journal of Supercomputing, 74:2, (665-695), Online publication date: 1-Feb-2018.
  200. Al-Adwan A, Mahafzah B and Sharieh A (2018). Solving traveling salesman problem using parallel repetitive nearest neighbor algorithm on OTIS-Hypercube and OTIS-Mesh optoelectronic architectures, The Journal of Supercomputing, 74:1, (1-36), Online publication date: 1-Jan-2018.
  201. Chen X, Wardi Y and Yalamanchili S Power regulation in high performance multicore processors 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2674-2679)
  202. ACM
    Sieber C, Durner R, Ehm M, Kellerer W and Sharma P Towards optimal adaptation of NFV packet processing to modern CPU memory architectures Proceedings of the 2nd Workshop on Cloud-Assisted Networking, (7-12)
  203. Crawford P, Eidenbenz S, Barnes P and Wilsey P Some properties of communication behaviors in discrete-event simulation models Proceedings of the 2017 Winter Simulation Conference, (1-12)
  204. Zhang Y, Hou J, Cao Y, Gu J and Huang C (2017). OpenMP parallelization of a gridded SWAT (SWATG), Computers & Geosciences, 109:C, (228-237), Online publication date: 1-Dec-2017.
  205. He H, Cui L, Zhou F and Wang D (2017). Distributed proxy cache technology based on autonomic computing in smart cities, Future Generation Computer Systems, 76:C, (370-383), Online publication date: 1-Nov-2017.
  206. Ortega G, Filatovas E, Garzón E and Casado L (2017). Non-dominated sorting procedure for Pareto dominance ranking on multicore CPU and/or GPU, Journal of Global Optimization, 69:3, (607-627), Online publication date: 1-Nov-2017.
  207. Wan H, Gao X, Long X and Jiang B Introducing parallel computing concepts in computer system related courses 2017 IEEE Frontiers in Education Conference (FIE), (1-7)
  208. ACM
    Kulkarni C, Kesavan A, Zhang T, Ricci R and Stutsman R Rocksteady Proceedings of the 26th Symposium on Operating Systems Principles, (390-405)
  209. ACM
    Huang Y, Guo N, Seok M, Tsividis Y, Mandli K and Sethumadhavan S Hybrid analog-digital solution of nonlinear partial differential equations Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, (665-678)
  210. ACM
    Milic U, Villa O, Bolotin E, Arunkumar A, Ebrahimi E, Jaleel A, Ramirez A and Nellans D Beyond the socket Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, (123-135)
  211. ACM
    Fu X, Rol M, Bultink C, van Someren J, Khammassi N, Ashraf I, Vermeulen R, de Sterke J, Vlothuizen W, Schouten R, Almudever C, DiCarlo L and Bertels K An experimental microarchitecture for a superconducting quantum processor Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, (813-825)
  212. ACM
    Tsai P, Beckmann N and Sanchez D (2017). Jenga, ACM SIGARCH Computer Architecture News, 45:2, (652-665), Online publication date: 14-Sep-2017.
  213. ACM
    Wang K and Lin C (2017). Decoupled Affine Computation for SIMT GPUs, ACM SIGARCH Computer Architecture News, 45:2, (295-306), Online publication date: 14-Sep-2017.
  214. Aghaei Khouzani H, Hosseini F and Yang C (2017). Segment and Conflict Aware Page Allocation and Migration in DRAM-PCM Hybrid Main Memory, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36:9, (1458-1470), Online publication date: 1-Sep-2017.
  215. ACM
    Stanic M, Palomar O, Hayes T, Ratkovic I, Cristal A, Unsal O and Valero M (2017). An Integrated Vector-Scalar Design on an In-Order ARM Core, ACM Transactions on Architecture and Code Optimization, 14:2, (1-26), Online publication date: 21-Jul-2017.
  216. Blohoubek J, Fier P and Schmidt J (2017). Error masking method based on the short-duration offline test, Microprocessors & Microsystems, 52:C, (236-250), Online publication date: 1-Jul-2017.
  217. Mai V and Khalil I (2017). Design and implementation of a secure cloud-based billing model for smart meters as an Internet of things using homomorphic cryptography, Future Generation Computer Systems, 72:C, (327-338), Online publication date: 1-Jul-2017.
  218. ACM
    Tsai P, Beckmann N and Sanchez D Jenga Proceedings of the 44th Annual International Symposium on Computer Architecture, (652-665)
  219. ACM
    Wang K and Lin C Decoupled Affine Computation for SIMT GPUs Proceedings of the 44th Annual International Symposium on Computer Architecture, (295-306)
  220. Gutierrez-Alcoba A, Ortega G, Hendrix E and Garca I (2017). Accelerating an algorithm for perishable inventory control on heterogeneous platforms, Journal of Parallel and Distributed Computing, 104:C, (12-18), Online publication date: 1-Jun-2017.
  221. Khan A, Al-Mouhamed M, Al-Mulhem M and Ahmed A (2017). RT-CUDA, International Journal of Parallel Programming, 45:3, (551-594), Online publication date: 1-Jun-2017.
  222. ACM
    Gupta S and Wilsey P Quantitative Driven Optimization of a Time Warp Kernel Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (27-38)
  223. ACM
    Paredes M, Riley G and Luján M Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi Proceedings of the Computing Frontiers Conference, (127-135)
  224. ACM
    Wickerson J, Batty M, Sorensen T and Constantinides G (2017). Automatically comparing memory consistency models, ACM SIGPLAN Notices, 52:1, (190-204), Online publication date: 11-May-2017.
  225. ACM
    Liu Y and Sun X (2017). Evaluating the Combined Effect of Memory Capacity and Concurrency for Many-Core Chip Design, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2:2, (1-25), Online publication date: 5-May-2017.
  226. Deng S and Suresh K (2017). Topology optimization under thermo-elastic buckling, Structural and Multidisciplinary Optimization, 55:5, (1759-1772), Online publication date: 1-May-2017.
  227. ACM
    Chow K and Zhu W Software Performance Analytics in the Cloud Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, (419-421)
  228. ACM
    Palangappa P and Mohanram K (2017). CompEx++, ACM Transactions on Architecture and Code Optimization, 14:1, (1-30), Online publication date: 14-Apr-2017.
  229. ACM
    Zhang Y, Anwer B, Gopalakrishnan V, Han B, Reich J, Shaikh A and Zhang Z ParaBox Proceedings of the Symposium on SDN Research, (143-149)
  230. Melani A, Bertogna M, Davis R, Bonifaci V, Marchetti-Spaccamela A and Buttazzo G (2017). Exact Response Time Analysis for Fixed Priority Memory-Processor Co-Scheduling, IEEE Transactions on Computers, 66:4, (631-646), Online publication date: 1-Apr-2017.
  231. Qin H, Liu Z, Liu Y and Zhong H (2017). An object-oriented MATLAB toolbox for automotive body conceptual design using distributed parallel optimization, Advances in Engineering Software, 106:C, (19-32), Online publication date: 1-Apr-2017.
  232. Brandalero M and Beck A A mechanism for energy-efficient reuse of decoding and scheduling of x86 instruction streams Proceedings of the Conference on Design, Automation & Test in Europe, (1472-1477)
  233. Alioto M Energy-quality scalable adaptive VLSI circuits and systems beyond approximate computing Proceedings of the Conference on Design, Automation & Test in Europe, (127-132)
  234. Tang Q, Basten T, Geilen M, Stuijk S and Wei J (2017). Mapping of synchronous dataflow graphs on MPSoCs based on parallelism enhancement, Journal of Parallel and Distributed Computing, 101:C, (79-91), Online publication date: 1-Mar-2017.
  235. Tran K, Carlson T, Koukos K, Själander M, Spiliopoulos V, Kaxiras S and Jimborean A Clairvoyance: look-ahead compile-time scheduling Proceedings of the 2017 International Symposium on Code Generation and Optimization, (171-184)
  236. Chen Q, Wang X, Wan H and Yang R (2017). A Logic Circuit Design for Perfecting Memristor-Based Material Implication, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36:2, (279-284), Online publication date: 1-Feb-2017.
  237. ACM
    Wickerson J, Batty M, Sorensen T and Constantinides G Automatically comparing memory consistency models Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, (190-204)
  238. Ortega G, Puertas A and Garzón E (2017). Accelerating the problem of microrheology in colloidal systems on a GPU, The Journal of Supercomputing, 73:1, (370-383), Online publication date: 1-Jan-2017.
  239. ACM
    Fernandes F, Weigel L, Jung C, Navaux P, Carro L and Rech P (2016). Evaluation of Histogram of Oriented Gradients Soft Errors Criticality for Automotive Applications, ACM Transactions on Architecture and Code Optimization, 13:4, (1-25), Online publication date: 28-Dec-2016.
  240. Brock J and Bruce R (2016). Power labs, Journal of Computing Sciences in Colleges, 32:2, (104-110), Online publication date: 1-Dec-2016.
  241. Sewall J, Pennycook S, Duran A, Tian X and Narayanaswamy R A modern memory management system for OpenMP Proceedings of the Third International Workshop on Accelerator Programming Using Directives, (25-35)
  242. Bederián C and Wolovick N A project-based HPC course for single-box computers Proceedings of the Workshop on Education for High Performance Computing, (1-6)
  243. Qu P, Yan J and Gao G Toward a Parallel Turing Machine Model Network and Parallel Computing, (191-204)
  244. ACM
    Hahn S, Jacobs M and Reineke J Enabling Compositionality for Multicore Timing Analysis Proceedings of the 24th International Conference on Real-Time Networks and Systems, (299-308)
  245. ACM
    Siegl P, Buchty R and Berekovic M Data-Centric Computing Frontiers Proceedings of the Second International Symposium on Memory Systems, (295-308)
  246. ACM
    Tran K Student Research Poster Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, (458-458)
  247. Catalán S, Malossi A, Bekas C and Quintana-Ortí E The Impact of Voltage-Frequency Scaling for the Matrix-Vector Product on the IBM POWER8 Proceedings of the 22nd International Conference on Euro-Par 2016: Parallel Processing - Volume 9833, (103-116)
  248. Masliah I, Abdelfattah A, Haidar A, Tomov S, Baboulin M, Falcou J and Dongarra J High-Performance Matrix-Matrix Multiplications of Very Small Matrices Proceedings of the 22nd International Conference on Euro-Par 2016: Parallel Processing - Volume 9833, (659-671)
  249. ACM
    Joshi A, Vollala S, Begum B and Ramasubramanian N Performance Analysis of Cache Coherence Protocols for Multi-core Architectures Proceedings of the International Conference on Advances in Information Communication Technology & Computing, (1-7)
  250. ACM
    Darav N, Kennings A, Tabrizi A, Westwick D and Behjat L (2016). Eh?Placer, ACM Transactions on Design Automation of Electronic Systems, 21:3, (1-27), Online publication date: 26-Jul-2016.
  251. ACM
    Shahvarani A and Jacobsen H A Hybrid B+-tree as Solution for In-Memory Indexing on CPU-GPU Heterogeneous Computing Platforms Proceedings of the 2016 International Conference on Management of Data, (1523-1538)
  252. ACM
    Banerjee K, Banerjee S and Sarkar S Data-race detection: the missing piece for an end-to-end semantic equivalence checker for parallelizing transformations of array-intensive programs Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, (1-8)
  253. Hong J and Kim S (2016). Flexible ECC Management for Low-Cost Transient Error Protection of Last-Level Caches, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 24:6, (2152-2164), Online publication date: 1-Jun-2016.
  254. Ünal E and Savaş E (2016). On Acceleration and Scalability of Number Theoretic Private Information Retrieval, IEEE Transactions on Parallel and Distributed Systems, 27:6, (1727-1741), Online publication date: 1-Jun-2016.
  255. Dai Y, Fang Y, Yang L and Jeon G (2016). Graphics processing unit-accelerated joint-bitplane belief propagation algorithm in DSC, The Journal of Supercomputing, 72:6, (2351-2375), Online publication date: 1-Jun-2016.
  256. ACM
    Luppold A, Kittsteiner C and Falk H Cache-Aware Instruction SPM Allocation for Hard Real-Time Systems Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems, (77-85)
  257. ACM
    Wilsey P Some Properties of Events Executed in Discrete-Event Simulation Models Proceedings of the 2016 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (165-176)
  258. ACM
    Bijo S, Johnsen E, Pun K and Tarifa S An operational semantics of cache coherent multicore architectures Proceedings of the 31st Annual ACM Symposium on Applied Computing, (1219-1224)
  259. Elkhouly R, El-Mahdy A and Elmasry A Optimality analysis of if-conversion transformation Proceedings of the 24th High Performance Computing Symposium, (1-8)
  260. Savidis I, Ciftcioglu B, Xu J, Hu J, Jain M, Berman R, Xue J, Liu P, Moore D, Wicks G, Huang M, Wu H and Friedman E (2016). Heterogeneous 3-D circuits, Microelectronics Journal, 50:C, (66-75), Online publication date: 1-Apr-2016.
  261. Souza J, Carro L, Rutzig M and Beck A A reconfigurable heterogeneous multicore with a homogeneous ISA Proceedings of the 2016 Conference on Design, Automation & Test in Europe, (1598-1603)
  262. Yao Y and Lu Z Memory-access aware DVFS for network-on-chip in CMPs Proceedings of the 2016 Conference on Design, Automation & Test in Europe, (1433-1436)
  263. ACM
    Goossens B, Parello D, Porada K and Rahmoune D Parallel Locality and Parallelization Quality Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores, (59-68)
  264. ACM
    Fadolalkarim D, Sallam A and Bertino E PANDDE Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, (267-276)
  265. Johnson P and Ekstedt M (2016). The Tarpit - A general theory of software engineering, Information and Software Technology, 70:C, (181-203), Online publication date: 1-Feb-2016.
  266. ACM
    Madarbux M, Van Laer A, Watts P and Jones T Energy Efficient And Low Latency Interconnection Network For Multicast Invalidates In Shared Memory Systems Proceedings of the 1st International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems, (1-6)
  267. ACM
    Kanev S, Darago J, Hazelwood K, Ranganathan P, Moseley T, Wei G and Brooks D (2015). Profiling a warehouse-scale computer, ACM SIGARCH Computer Architecture News, 43:3S, (158-169), Online publication date: 4-Jan-2016.
  268. ACM
    Lee Y, Kim J, Jang H, Yang H, Kim J, Jeong J and Lee J (2015). A fully associative, tagless DRAM cache, ACM SIGARCH Computer Architecture News, 43:3S, (211-222), Online publication date: 4-Jan-2016.
  269. ACM
    Quéva C, Couroussé D and Charles H Self-optimisation using runtime code generation for wireless sensor networks Proceedings of the 17th International Conference on Distributed Computing and Networking, (1-6)
  270. Kleanthous M, Sazeides Y, Ozer E, Nicopoulos C, Nikolaou P and Hadjilambrou Z (2016). Toward Multi-Layer Holistic Evaluation of System Designs, IEEE Computer Architecture Letters, 15:1, (58-61), Online publication date: 1-Jan-2016.
  271. ACM
    Fang Y, Hoang T, Becchi M and Chien A Fast support for unstructured data processing Proceedings of the 48th International Symposium on Microarchitecture, (533-545)
  272. Beyer J, Hadwiger M and Pfister H (2015). State-of-the-Art in GPU-Based Large-Scale Volume Visualization, Computer Graphics Forum, 34:8, (13-37), Online publication date: 1-Dec-2015.
  273. Ben Youssef B (2015). A parallel cellular automata algorithm for the deterministic simulation of 3-D multicellular tissue growth, Cluster Computing, 18:4, (1561-1579), Online publication date: 1-Dec-2015.
  274. ACM
    Eslami H, Kougkas A, Kotsifakou M, Kasampalis T, Feng K, Lu Y, Gropp W, Sun X, Chen Y and Thakur R Efficient disk-to-disk sorting Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, (1-8)
  275. ACM
    Liu Y and Sun X C2-bound Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-11)
  276. ACM
    Jacobs M, Hahn S and Hack S WCET analysis for multi-core processors with shared buses and event-driven bus arbitration Proceedings of the 23rd International Conference on Real Time and Networks Systems, (193-202)
  277. ACM
    Zhang J, You S and Gruenwald L Efficient Parallel Zonal Statistics on Large-Scale Global Biodiversity Data on GPUs Proceedings of the 4th International ACM SIGSPATIAL Workshop on Analytics for Big Geospatial Data, (35-44)
  278. ACM
    Zhang J, You S and Xia Y Prototyping A Web-based High-Performance Visual Analytics Platform for Origin-Destination Data Proceedings of the 1st International ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, (16-23)
  279. ACM
    Altamimi M and Naik K A Computing Profiling Procedure for Mobile Developers to Estimate Energy Cost Proceedings of the 18th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, (301-305)
  280. Lai B, Kuan-Ting Chen and Ping-Ru Wu (2015). A High-Performance Double-Layer Counting Bloom Filter for Multicore Systems, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:11, (2473-2486), Online publication date: 1-Nov-2015.
  281. Diaz I, Zhang C, Hollevoet L, Svensson J, Rodrigues J, Wilhelmsson L, Olsson T, Van der Perre L and Öwall V (2015). A new digital front-end for flexible reception in software defined radio, Microprocessors & Microsystems, 39:8, (889-900), Online publication date: 1-Nov-2015.
  282. ACM
    Gottscho M, BanaiyanMofrad A, Dutt N, Nicolau A and Gupta P (2015). DPCS, ACM Transactions on Architecture and Code Optimization, 12:3, (1-26), Online publication date: 6-Oct-2015.
  283. Oxley M, Pasricha S, Maciejewski A, Siegel H, Apodaca J, Young D, Briceno L, Smith J, Bahirat S, Khemka B, Ramirez A and Zou Y (2015). Makespan and Energy Robust Stochastic Static Resource Allocation of a Bag-of-Tasks to a Heterogeneous Computing System, IEEE Transactions on Parallel and Distributed Systems, 26:10, (2791-2805), Online publication date: 1-Oct-2015.
  284. Zhu F, Yao Y, Tang W and Chen D (2015). A high performance framework for modeling and simulation of large-scale complex systems, Future Generation Computer Systems, 51:C, (132-141), Online publication date: 1-Oct-2015.
  285. ACM
    Abadal S, Nemirovsky M, Alarcón E and Cabellos-Aparicio A Networking Challenges and Prospective Impact of Broadcast-Oriented Wireless Networks-on-Chip Proceedings of the 9th International Symposium on Networks-on-Chip, (1-8)
  286. Sanchez E and Reorda M (2015). On the Functional Test of Branch Prediction Units, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:9, (1675-1688), Online publication date: 1-Sep-2015.
  287. Hao Zhang , Gang Chen , Beng Chin Ooi , Kian-Lee Tan and Meihui Zhang (2015). In-Memory Big Data Management and Processing: A Survey, IEEE Transactions on Knowledge and Data Engineering, 27:7, (1920-1948), Online publication date: 1-Jul-2015.
  288. ACM
    Kandemir M, Zhao H, Tang X and Karakoy M (2015). Memory Row Reuse Distance and its Role in Optimizing Application Performance, ACM SIGMETRICS Performance Evaluation Review, 43:1, (137-149), Online publication date: 24-Jun-2015.
  289. ACM
    Li A, Tay Y, Kumar A and Corporaal H Transit Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, (101-106)
  290. ACM
    Kandemir M, Zhao H, Tang X and Karakoy M Memory Row Reuse Distance and its Role in Optimizing Application Performance Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, (137-149)
  291. ACM
    Ul-Abdin Z and Svensson B Towards teaching embedded parallel computing Proceedings of the Workshop on Computer Architecture Education, (1-6)
  292. ACM
    Kanev S, Darago J, Hazelwood K, Ranganathan P, Moseley T, Wei G and Brooks D Profiling a warehouse-scale computer Proceedings of the 42nd Annual International Symposium on Computer Architecture, (158-169)
  293. ACM
    Tan Z, Qian Z, Chen X, Asanovic K and Patterson D (2015). DIABLO, ACM SIGARCH Computer Architecture News, 43:1, (207-221), Online publication date: 29-May-2015.
  294. ACM
    Cilku B and Puschner P (2015). Designing a time predictable memory hierarchy for single-path code, ACM SIGBED Review, 12:2, (16-21), Online publication date: 20-May-2015.
  295. ACM
    Mozafari S, Meyer B and Skadron K Yield-aware Performance-Cost Characterization for Multi-Core SIMT Proceedings of the 25th edition on Great Lakes Symposium on VLSI, (237-240)
  296. ACM
    Tan Z, Qian Z, Chen X, Asanovic K and Patterson D (2015). DIABLO, ACM SIGPLAN Notices, 50:4, (207-221), Online publication date: 12-May-2015.
  297. Gallenmüller S, Emmerich P, Wohlfart F, Raumer D and Carle G Comparison of Frameworks for High-Performance Packet IO Proceedings of the Eleventh ACM/IEEE Symposium on Architectures for networking and communications systems, (29-38)
  298. Li W, Jin G, Cui X and See S An evaluation of unified memory technology on NVIDIA GPUs Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (1092-1098)
  299. Subedi T, Nguyen K and Cheriet M (2015). OpenFlow-based in-network Layer-2 adaptive multipath aggregation in data centers, Computer Communications, 61:C, (58-69), Online publication date: 1-May-2015.
  300. ACM
    Zhang J, You S and Gruenwald L (2015). Large-scale spatial data processing on GPUs and GPU-accelerated clusters, SIGSPATIAL Special, 6:3, (27-34), Online publication date: 22-Apr-2015.
  301. Damodaran P, Zaib A, Wallentowitz S, Wild T and Herkersdorf A Sharer status-based caching in tiled multiprocessor systems-on-chip Proceedings of the Symposium on High Performance Computing, (67-74)
  302. Carretero J, Distefano S, Petcu D, Pop D, Rauber T, Runger G and Singh D (2015). Energy-efficient Algorithms for Ultrascale Systems, Supercomputing Frontiers and Innovations: an International Journal, 2:2, (77-104), Online publication date: 6-Apr-2015.
  303. Li Wang , Minqi Zhou , Zhenjie Zhang , Ming-Chien Shan and Aoying Zhou (2015). NUMA-Aware Scalable and Efficient In-Memory Aggregation on Large Domains, IEEE Transactions on Knowledge and Data Engineering, 27:4, (1071-1084), Online publication date: 1-Apr-2015.
  304. ACM
    Cilku B, Kammerer R and Puschner P (2015). Aligning single path loops to reduce the number of capacity cache misses, ACM SIGBED Review, 12:1, (13-18), Online publication date: 27-Mar-2015.
  305. ACM
    Tan Z, Qian Z, Chen X, Asanovic K and Patterson D DIABLO Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, (207-221)
  306. ACM
    Chaker H, Cudennec L, Dahmani S, Gogniat G and Sepúlveda M Cycle-based Model to Evaluate Consistency Protocols within a Multi-protocol Compilation Tool-chain Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, (1-10)
  307. ACM
    Fox A and Patterson D (2015). Do-it-yourself textbook publishing, Communications of the ACM, 58:2, (40-43), Online publication date: 28-Jan-2015.
  308. ACM
    Lazarescu M and Lavagno L (2015). Interactive Trace-Based Analysis Toolset for Manual Parallelization of C Programs, ACM Transactions on Embedded Computing Systems, 14:1, (1-20), Online publication date: 21-Jan-2015.
  309. Gadouleau M and Riis S (2015). Memoryless computation, Theoretical Computer Science, 562:C, (129-145), Online publication date: 11-Jan-2015.
  310. Kiran D, Gurunarayanan S, Misra J and Nawal A (2015). Global scheduling heuristics for multicore architecture, Scientific Programming, 2015, (18-18), Online publication date: 1-Jan-2015.
  311. ACM
    Riemens D, Gaydadjiev G, Zeeuw C and Strydis C (2014). Towards scalable arithmetic units with graceful degradation, ACM Transactions on Embedded Computing Systems, 13:4, (1-26), Online publication date: 5-Dec-2014.
  312. Son Y, Seongil O, Yang H, Jung D, Ahn J, Kim J, Kim J and Lee J Microbank Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1059-1070)
  313. ACM
    Kaligirwa N, Leal E, Gruenwald L, Zhang J and You S Parallel QuadTree encoding of large-scale raster geospatial data on multicore CPUs and GPGPUs Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, (30-39)
  314. ACM
    Yalcin G, Ergin O, Islek E, Unsal O and Cristal A (2014). Exploiting Existing Comparators for Fine-Grained Low-Cost Error Detection, ACM Transactions on Architecture and Code Optimization, 11:3, (1-24), Online publication date: 27-Oct-2014.
  315. ACM
    Aziz A, Cireno M, Barros E and Prado B Balanced Prefetching Aggressiveness Controller for NoC-based Multiprocessor Proceedings of the 27th Symposium on Integrated Circuits and Systems Design, (1-7)
  316. ACM
    Segulja C and Abdelrahman T What is the cost of weak determinism? Proceedings of the 23rd international conference on Parallel architectures and compilation, (99-112)
  317. ACM
    Hrbacek R and Sekanina L Towards highly optimized cartesian genetic programming Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, (1015-1022)
  318. Shoukourian H, Wilde T, Auweter A and Bode A (2014). Predicting the Energy and Power Consumption of Strong and Weak Scaling HPC Applications, Supercomputing Frontiers and Innovations: an International Journal, 1:2, (20-41), Online publication date: 9-Jul-2014.
  319. ACM
    Pirk H, Petraki E, Idreos S, Manegold S and Kersten M Database cracking Proceedings of the Tenth International Workshop on Data Management on New Hardware, (1-8)
  320. ACM
    Mühlbauer T, Rödiger W, Seilbeck R, Kemper A and Neumann T Heterogeneity-conscious parallel query execution Proceedings of the Tenth International Workshop on Data Management on New Hardware, (1-10)
  321. ACM
    Lazarescu M, Cohen A, Guatto A, Lê N, Lavagno L, Pop A, Prieto M, Terechko A and Sutii A Energy-aware parallelization flow and toolset for C code Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems, (79-88)
  322. ACM
    Raghavendra K, Warrier T and Mutyam M SAMO Proceedings of the 11th ACM Conference on Computing Frontiers, (1-10)
  323. ACM
    Piro G, Abadal S, Mestres A, Alarcón E, Solé-Pareta J, Grieco L and Boggia G Initial MAC Exploration for Graphene-enabled Wireless Networks-on-Chip Proceedings of ACM The First Annual International Conference on Nanoscale Computing and Communication, (1-9)
  324. Valero M, Moreto M, Casas M, Ayguade E and Labarta J (2014). Runtime-Aware Architectures, Supercomputing Frontiers and Innovations: an International Journal, 1:1, (29-44), Online publication date: 6-Apr-2014.
  325. Titmus M, Gurtowski J and Schatz M (2014). Answering the demands of digital genomics, Concurrency and Computation: Practice & Experience, 26:4, (917-928), Online publication date: 25-Mar-2014.
  326. Liu J, Bouganis C and Cheung P Image progressive acquisition for hardware systems Proceedings of the conference on Design, Automation & Test in Europe, (1-6)
  327. Tsoutsos N and Maniatakos M HEROIC Proceedings of the conference on Design, Automation & Test in Europe, (1-6)
  328. ACM
    Sahu A and Ramakrishna S Creating heterogeneity at run time by dynamic cache and bandwidth partitioning schemes Proceedings of the 29th Annual ACM Symposium on Applied Computing, (872-879)
  329. ACM
    Fang J, Sips H, Zhang L, Xu C, Che Y and Varbanescu A Test-driving Intel Xeon Phi Proceedings of the 5th ACM/SPEC international conference on Performance engineering, (137-148)
  330. ACM
    Patterson D (2014). How to build a bad research center, Communications of the ACM, 57:3, (33-36), Online publication date: 1-Mar-2014.
  331. Bhattacharya A, Banerjee A and Sur-Kolay S Energy-Aware H.264 Decoding Proceedings of the 10th International Conference on Distributed Computing and Internet Technology - Volume 8337, (200-211)
  332. Benner P, Ezzatti P, Quintana-Ortí E and Remón A On the Impact of Optimization on the Time-Power-Energy Balance of Dense Linear Algebra Factorizations Algorithms and Architectures for Parallel Processing, (3-10)
  333. ACM
    Bardizbanyan A, Själander M, Whalley D and Larsson-Edefors P (2013). Designing a practical data filter cache to improve both energy efficiency and performance, ACM Transactions on Architecture and Code Optimization, 10:4, (1-25), Online publication date: 1-Dec-2013.
  334. ACM
    Fauzia N, Elango V, Ravishankar M, Ramanujam J, Rastello F, Rountev A, Pouchet L and Sadayappan P (2013). Beyond reuse distance analysis, ACM Transactions on Architecture and Code Optimization, 10:4, (1-29), Online publication date: 1-Dec-2013.
  335. ACM
    Cicotti P, Carrington L and Chien A Toward application-specific memory reconfiguration for energy efficiency Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, (1-8)
  336. Seo S, Lee J, Jo G and Lee J Automatic OpenCL work-group size selection for multicore CPUs Proceedings of the 22nd international conference on Parallel architectures and compilation techniques, (387-398)
  337. ACM
    Choi J, Kwak J, Jhang S and Jhon C Data filter cache with word selection cache for low power embedded processor Proceedings of the 2013 Research in Adaptive and Convergent Systems, (422-427)
  338. ACM
    Martínez H, Tárraga J, Medina I, Barrachina S, Castillo M, Dopazo J and Quintana-Ortí E A dynamic pipeline for RNA sequencing on multicore processors Proceedings of the 20th European MPI Users' Group Meeting, (235-240)
  339. Hossain S and Steihaug T (2013). Sparse matrix computations with application to solve system of nonlinear equations, WIREs Computational Statistics, 5:5, (372-386), Online publication date: 1-Sep-2013.
  340. Schindewolf M, Rocker B, Karl W and Heuveline V Evaluation of two formulations of the conjugate gradients method with transactional memory Proceedings of the 19th international conference on Parallel Processing, (508-520)
  341. ACM
    Bhatia M, Kiran D, Misra J and Gurunarayanan S Fine grain thread scheduling on multicore processors Proceedings of the 6th ACM India Computing Convention, (1-6)
  342. ACM
    Altinigneli M, Plant C and Böhm C Massively parallel expectation maximization using graphics processing units Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, (838-846)
  343. ACM
    Song X, Shi J, Chen H and Zang B Schedule processes, not VCPUs Proceedings of the 4th Asia-Pacific Workshop on Systems, (1-7)
  344. ACM
    Xu T, Liljeberg P, Plosila J and Tenhunen H MMSoC Proceedings of the 14th International Conference on Computer Systems and Technologies, (67-74)
  345. ACM
    Son Y, Seongil O, Ro Y, Lee J and Ahn J (2013). Reducing memory access latency with asymmetric DRAM bank organizations, ACM SIGARCH Computer Architecture News, 41:3, (380-391), Online publication date: 26-Jun-2013.
  346. ACM
    Cook H, Moreto M, Bird S, Dao K, Patterson D and Asanovic K (2013). A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness, ACM SIGARCH Computer Architecture News, 41:3, (308-319), Online publication date: 26-Jun-2013.
  347. ACM
    Son Y, Seongil O, Ro Y, Lee J and Ahn J Reducing memory access latency with asymmetric DRAM bank organizations Proceedings of the 40th Annual International Symposium on Computer Architecture, (380-391)
  348. ACM
    Cook H, Moreto M, Bird S, Dao K, Patterson D and Asanovic K A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness Proceedings of the 40th Annual International Symposium on Computer Architecture, (308-319)
  349. ACM
    Szymanski T Low latency energy efficient communications in global-scale cloud computing systems Proceedings of the 2013 workshop on Energy efficient high performance parallel and distributed computing, (13-22)
  350. Soliman M (2013). Design, implementation, and evaluation of a low-complexity vector-core for executing scalar/vector instructions, Journal of Parallel and Distributed Computing, 73:6, (836-850), Online publication date: 1-Jun-2013.
  351. ACM
    Neela G and Draper J An asymmetric adaptive-precision energy-efficient 3DIC multiplier Proceedings of the 23rd ACM international conference on Great lakes symposium on VLSI, (269-274)
  352. ACM
    Nanavati M, Spear M, Taylor N, Rajagopalan S, Meyer D, Aiello W and Warfield A Whose cache line is it anyway? Proceedings of the 8th ACM European Conference on Computer Systems, (141-154)
  353. ACM
    Ltaief H, Luszczek P and Dongarra J (2013). High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures, ACM Transactions on Mathematical Software, 39:3, (1-22), Online publication date: 1-Apr-2013.
  354. ACM
    Li S, Ahn J, Strong R, Brockman J, Tullsen D and Jouppi N (2013). The McPAT Framework for Multicore and Manycore Architectures, ACM Transactions on Architecture and Code Optimization, 10:1, (1-29), Online publication date: 1-Apr-2013.
  355. Hong S and Kim S AVICA Proceedings of the Conference on Design, Automation and Test in Europe, (65-70)
  356. ACM
    Huang Y, Ienne P, Temam O, Chen Y and Wu C Elastic CGRAs Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays, (171-180)
  357. ACM
    Park H and Choi K Position-based weighted round-robin arbitration for equality of service in many-core network-on-chips Proceedings of the Fifth International Workshop on Network on Chip Architectures, (51-56)
  358. ACM
    Zhang J and You S CudaGIS Proceedings of the 3rd ACM SIGSPATIAL International Workshop on GeoStreaming, (101-108)
  359. ACM
    Zhang J, You S and Gruenwald L High-performance online spatial and temporal aggregations on multi-core CPUs and many-core GPUs Proceedings of the fifteenth international workshop on Data warehousing and OLAP, (89-96)
  360. ACM
    Zhang J, You S and Gruenwald L U2STRA Proceedings of the 2012 ACM workshop on City data management workshop, (5-12)
  361. ACM
    Haque M, Ragel R, Ambrose A, Radhakrishnan S and Parameswaran S DIMSim Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, (151-160)
  362. ACM
    Bournoutian G and Orailoglu A Dynamic transient fault detection and recovery for embedded processor datapaths Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, (43-52)
  363. ACM
    Tu C, Hung S and Tsai T (2012). MCEmu, ACM Transactions on Design Automation of Electronic Systems, 17:4, (1-25), Online publication date: 1-Oct-2012.
  364. ACM
    Menon J, De Kruijf M and Sankaralingam K (2012). iGPU, ACM SIGARCH Computer Architecture News, 40:3, (72-83), Online publication date: 5-Sep-2012.
  365. ACM
    Zhang J, Kamga C, Gong H and Gruenwald L U2SOD-DB Proceedings of the ACM SIGKDD International Workshop on Urban Computing, (163-171)
  366. ACM
    Wang Y, Zhang C, Yu H and Zhang W Design of low power 3D hybrid memory by non-volatile CBRAM-crossbar with block-level data-retention Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, (197-202)
  367. ACM
    Edwards J and Vishkin U Brief announcement Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures, (190-192)
  368. Menon J, De Kruijf M and Sankaralingam K iGPU Proceedings of the 39th Annual International Symposium on Computer Architecture, (72-83)
  369. Hart S, Frachtenberg E and Berezecki M Predicting memcached throughput using simulation and modeling Proceedings of the 2012 Symposium on Theory of Modeling and Simulation - DEVS Integrative M&S Symposium, (1-8)
  370. Habermaier A and Knapp A On the correctness of the SIMT execution model of GPUs Proceedings of the 21st European conference on Programming Languages and Systems, (316-335)
  371. ACM
    Ahn J, Jouppi N, Kozyrakis C, Leverich J and Schreiber R (2012). Improving System Energy Efficiency with Memory Rank Subsetting, ACM Transactions on Architecture and Code Optimization, 9:1, (1-28), Online publication date: 1-Mar-2012.
  372. Nie P and Duan Z (2012). Efficient and scalable scheduling for performance heterogeneous multicore systems, Journal of Parallel and Distributed Computing, 72:3, (353-361), Online publication date: 1-Mar-2012.
Contributors
  • Stanford University
  • Google LLC

Reviews

Ruay-Shiung Chang

Moore's law states that the number of transistors that can be placed on an integrated circuit (IC) doubles approximately every two years. This exponential growth in IC technology has led to advancements in everything digital, from central processing units (CPUs) and memory to digital cameras. Since computers are made up of CPUs, memory, and input/output (I/O) devices, it is a logical consequence that computers have also experienced tremendous improvements. This drastic change in computers makes it difficult-if not impossible-for a textbook on computer architecture to include every new technology. Often, when a computer architecture textbook hits the counter, it is already out of date. Therefore, it is no wonder that this book is in its fifth edition. The main part of the book contains six chapters. The focus is on parallelism. Besides a chapter on the fundamentals of quantitative methods and a chapter on memory hierarchy, the other four chapters deal with parallelism at various levels. It is explained as it relates to cloud computing in chapter 6, "Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism." However, not everyone will agree with the authors' decisions regarding which topics to include or exclude. For example, traditional computer architecture textbooks would include designs of CPU, memory, and I/O. In this book, I/O systems are rarely touched on at all. Moore's law tells us that computer industries and technologies are still quickly evolving. To chase the newest technology in a textbook is unrealistic. Going back to the basics may be the solution. We have to teach computer science students the basic principles that have applied since the computer was invented. Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Recommendations