The computing world is in the middle of a revolution: mobile clients and cloud computing have emerged as the dominant paradigms driving programming and hardware innovation today. The Fifth Edition of Computer Architecture focuses on this dramatic shift, exploring the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices. Each chapter includes two real-world examples, one mobile and one datacenter, to illustrate this revolutionary change.

- Updated to cover the mobile computing revolution.
- Emphasizes the two most important topics in architecture today: memory hierarchy and parallelism in all its forms.
- Develops common themes throughout each chapter: power, performance, cost, dependability, protection, programming models, and emerging trends ("What's Next").
- Includes three review appendices in the printed text; additional reference appendices are available online.
- Includes updated Case Studies and completely new exercises.
- Adve, S. V., and K. Gharachorloo [1996]. "Shared memory consistency models: A tutorial," IEEE Computer 29:12 (December), 66-76. Google Scholar
Digital Library
- Adve, S. V., and M. D. Hill [1990]. "Weak ordering--a new definition," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 2-14. Google Scholar
- Agarwal, A. [1987]. "Analysis of Cache Performance for Operating Systems and Multiprogramming," Ph. D. thesis, Tech. Rep. No. CSL-TR-87-332, Stanford University, Palo Alto, Calif. Google Scholar
- Agarwal, A. [1991]. "Limits on interconnection network performance," IEEE Trans. on Parallel and Distributed Systems 2:4 (April), 398-412. Google Scholar
Digital Library
- Agarwal, A., and S. D. Pudar [1993]. "Column-associative caches: A technique for reducing the miss rate of direct-mapped caches," 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif. Also appears in Computer Architecture News 21:2 (May), 179-190, 1993. Google Scholar
- Agarwal, A., R. Bianchini, D. Chaiken, K. Johnson, and D. Kranz [1995]. "The MIT Alewife machine: Architecture and performance," Int'l. Symposium on Computer Architecture (Denver, Colo.), June, 2-13. Google Scholar
- Agarwal, A., J. L. Hennessy, R. Simoni, and M. A. Horowitz [1988]. "An evaluation of directory schemes for cache coherence," Proc. 15th Int'l. Symposium on Computer Architecture (June), 280-289. Google Scholar
- Agarwal, A., J. Kubiatowicz, D. Kranz, B.-H. Lim, D. Yeung, G. D'Souza, and M. Parkin [1993]. "Sparcle: An evolutionary processor design for large-scale multiprocessors," IEEE Micro 13 (June), 48-61. Google Scholar
Digital Library
- Agerwala, T., and J. Cocke [1987]. High Performance Reduced Instruction Set Processors , IBM Tech. Rep. RC12434, IBM, Armonk, N.Y.Google Scholar
- Akeley, K. and T. Jermoluk [1988]. "High-Performance Polygon Rendering," Proc. 15th Annual Conf. on Computer Graphics and Interactive Techniques (SIGGRAPH 1988) , August 1-5, 1988, Atlanta, Ga., 239-246. Google Scholar
- Alexander, W. G., and D. B. Wortman [1975]. "Static and dynamic characteristics of XPL programs," IEEE Computer 8:11 (November), 41-46. Google Scholar
Digital Library
- Alles, A. [1995]. "ATM Internetworking," White Paper (May), Cisco Systems, Inc., San Jose, Calif. (www.cisco.com/warp/public/614/12.html)Google Scholar
- Alliant. [1987]. Alliant FX/Series: Product Summary , Alliant Computer Systems Corp., Acton, Mass.Google Scholar
- Almasi, G. S., and A. Gottlieb [1989]. Highly Parallel Computing , Benjamin/Cummings, Redwood City, Calif. Google Scholar
- Alverson, G., R. Alverson, D. Callahan, B. Koblenz, A. Porterfield, and B. Smith [1992]. "Exploiting heterogeneous parallelism on a multithreaded multiprocessor," Proc. ACM/IEEE Conf. on Supercomputing , November 16-20, 1992, Minneapolis, Minn., 188-197. Google Scholar
- Amdahl, G. M. [1967]. "Validity of the single processor approach to achieving large scale computing capabilities," Proc. AFIPS Spring Joint Computer Conf. , April 18-20, 1967, Atlantic City, N. J., 483-485. Google Scholar
- Amdahl, G. M., G. A. Blaauw, and F. P. Brooks, Jr. [1964]. "Architecture of the IBM System 360," IBM J. Research and Development 8:2 (April), 87-101. Google Scholar
Digital Library
- Amza, C., A. L. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel [1996]. "Treadmarks: Shared memory computing on networks of workstations," IEEE Computer 29:2 (February), 18-28. Google Scholar
Digital Library
- Anderson, D. [2003]. "You don't know jack about disks," Queue , 1:4 (June), 20-30. Google Scholar
Digital Library
- Anderson, D., J. Dykes, and E. Riedel [2003]. "SCSI vs. ATA--More than an interface," Proc. 2nd USENIX Conf. on File and Storage Technology (FAST '03) , March 31- April 2, 2003, San Francisco. Google Scholar
- Anderson, D. W., F. J. Sparacio, and R. M. Tomasulo [1967]. "The IBM 360 Model 91: Processor philosophy and instruction handling," IBM J. Research and Development 11:1 (January), 8-24. Google Scholar
Digital Library
- Anderson, M. H. [1990]. "Strength (and safety) in numbers (RAID, disk storage technology)," Byte 15:13 (December), 337-339.Google Scholar
- Anderson, T. E., D. E. Culler, and D. Patterson [1995]. "A case for NOW (networks of workstations)," IEEE Micro 15:1 (February), 54-64. Google Scholar
Digital Library
- Ang, B., D. Chiou, D. Rosenband, M. Ehrlich, L. Rudolph, and Arvind [1998]. "StarTVoyager: A flexible platform for exploring scalable SMP issues," Proc. ACM/IEEE Conf. on Supercomputing , November 7-13, 1998, Orlando, FL. Google Scholar
- Anjan, K. V., and T. M. Pinkston [1995]. "An efficient, fully-adaptive deadlock recovery scheme: Disha," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google Scholar
- Anon. et al. [1985]. A Measure of Transaction Processing Power , Tandem Tech. Rep. TR85.2. Also appears in Datamation 31:7 (April), 112-118, 1985. Google Scholar
- Apache Hadoop. [2011]. http://hadoop.apache.org.Google Scholar
- Archibald, J., and J.-L. Baer [1986]. "Cache coherence protocols: Evaluation using a multiprocessor simulation model," ACM Trans. on Computer Systems 4:4 (November), 273-298. Google Scholar
Digital Library
- Armbrust, M., A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia [2009]. Above the Clouds: A Berkeley View of Cloud Computing , Tech. Rep. UCB/EECS-2009-28, University of California, Berkeley (http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html).Google Scholar
- Arpaci, R. H., D. E. Culler, A. Krishnamurthy, S. G. Steinberg, and K. Yelick [1995]. "Empirical evaluation of the CRAY-T3D: A compiler perspective," 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google Scholar
- Asanovic, K. [1998]. "Vector Microprocessors," Ph. D. thesis, Computer Science Division, University of California, Berkeley. Google Scholar
- Associated Press. [2005]. "Gap Inc. shuts down two Internet stores for major overhaul," USATODAY.com , August 8, 2005.Google Scholar
- Atanasoff, J. V. [1940]. Computing Machine for the Solution of Large Systems of Linear Equations , Internal Report, Iowa State University, Ames.Google Scholar
- Atkins, M. [1991]. Performance and the i860 Microprocessor, IEEE Micro , 11:5 (September), 24-27, 72-78. Google Scholar
Digital Library
- Austin, T. M., and G. Sohi [1992]. "Dynamic dependency analysis of ordinary programs," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 342-351. Google Scholar
- Babbay, F., and A. Mendelson [1998]. "Using value prediction to increase the power of speculative execution hardware," ACM Trans. on Computer Systems 16:3 (August), 234-270. Google Scholar
- Baer, J.-L., and W.-H. Wang [1988]. "On the inclusion property for multi-level cache hierarchies," Proc. 15th Annual Int'l. Symposium on Computer Architecture , May 30-June 2, 1988, Honolulu, Hawaii, 73-80. Google Scholar
- Bailey, D. H., E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga [1991]. "The NAS parallel benchmarks," Int'l. J. Supercomputing Applications 5, 63-73. Google Scholar
Digital Library
- Bakoglu, H. B., G. F. Grohoski, L. E. Thatcher, J. A. Kaeli, C. R. Moore, D. P. Tattle, W. E. Male, W. R. Hardell, D. A. Hicks, M. Nguyen Phu, R. K. Montoye, W. T. Glover, and S. Dhawan [1989]. "IBM second-generation RISC processor organization," Proc. IEEE Int'l. Conf. on Computer Design , September 30-October 4, 1989, Rye, N.Y., 138-142.Google Scholar
- Balakrishnan, H., V. N. Padmanabhan, S. Seshan, and R. H. Katz [1997]. "A comparison of mechanisms for improving TCP performance over wireless links," IEEE/ACM Trans. on Networking 5:6 (December), 756-769. Google Scholar
Digital Library
- Ball, T., and J. Larus [1993]. "Branch prediction for free," Proc. ACM SIGPLAN'93 Conference on Programming Language Design and Implementation (PLDI) , June 23-25, 1993, Albuquerque, N. M., 300-313. Google Scholar
- Banerjee, U. [1979]. "Speedup of Ordinary Programs," Ph. D. thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign. Google Scholar
- Barham, P., B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, and R. Neugebauer [2003]. "Xen and the art of virtualization," Proc. of the 19th ACM Symposium on Operating Systems Principles , October 19-22, 2003, Bolton Landing, N.Y. Google Scholar
- Barroso, L. A. [2010]. "Warehouse Scale Computing [keynote address]," Proc. ACM SIGMOD , June 8-10, 2010, Indianapolis, Ind. Google Scholar
- Barroso, L. A., and U. Holzle [2007], "The case for energy-proportional computing," IEEE Computer , 40:12 (December), 33-37. Google Scholar
Digital Library
- Barroso, L. A., and U. Holzle [2009]. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , Morgan & Claypool, San Rafael, Calif. Google Scholar
- Barroso, L. A., K. Gharachorloo, and E. Bugnion [1998]. "Memory system characterization of commercial workloads," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 3-14. Google Scholar
- Barton, R. S. [1961]. "A new approach to the functional design of a computer," Proc. Western Joint Computer Conf. , May 9-11, 1961, Los Angeles, Calif., 393-396. Google Scholar
- Bashe, C. J., W. Buchholz, G. V. Hawkins, J. L. Ingram, and N. Rochester [1981]. "The architecture of IBM's early computers," IBM J. Research and Development 25:5 (September), 363-375. Google Scholar
Digital Library
- Bashe, C. J., L. R. Johnson, J. H. Palmer, and E. W. Pugh [1986]. IBM's Early Computers , MIT Press, Cambridge, Mass. Google Scholar
- Baskett, F., and T. W. Keller [1977]. "An evaluation of the Cray-1 processor," in High Speed Computer and Algorithm Organization , D. J. Kuck, D. H. Lawrie, and A. H. Sameh, eds., Academic Press, San Diego, 71-84.Google Scholar
- Baskett, F., T. Jermoluk, and D. Solomon [1988]. "The 4D-MP graphics superworkstation: Computing + graphics = 40 MIPS + 40 MFLOPS and 10,000 lighted polygons per second," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 468-471.Google Scholar
- BBN Laboratories. [1986]. Butterfly Parallel Processor Overview , Tech. Rep. 6148, BBN Laboratories, Cambridge, Mass.Google Scholar
- Bell, C. G. [1984]. "The mini and micro industries," IEEE Computer 17:10 (October), 14-30. Google Scholar
Digital Library
- Bell, C. G. [1985]. "Multis: A new class of multiprocessor computers," Science 228 (April 26), 462-467.Google Scholar
Cross Ref
- Bell, C. G. [1989]. "The future of high performance computers in science and engineering," Communications of the ACM 32:9 (September), 1091-1101. Google Scholar
Digital Library
- Bell, G., and J. Gray [2001]. Crays, Clusters and Centers , Tech. Rep. MSR-TR-2001-76, Microsoft Research, Redmond, Wash.Google Scholar
- Bell, C. G., and J. Gray [2002]. "What's next in high performance computing?" CACM 45:2 (February), 91-95. Google Scholar
Digital Library
- Bell, C. G., and A. Newell [1971]. Computer Structures: Readings and Examples , McGraw-Hill, New York. Google Scholar
- Bell, C. G., and W. D. Strecker [1976]. "Computer structures: What have we learned from the PDP-11?," Third Annual Int'l. Symposium on Computer Architecture (ISCA) , January 19-21, 1976, Tampa, Fla., 1-14. Google Scholar
- Bell, C. G., and W. D. Strecker [1998]. "Computer structures: What have we learned from the PDP-11?" 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 138-151. Google Scholar
Digital Library
- Bell, C. G., J. C. Mudge, and J. E. McNamara [1978]. A DEC View of Computer Engineering , Digital Press, Bedford, Mass.Google Scholar
- Bell, C. G., R. Cady, H. McFarland, B. DeLagi, J. O'Laughlin, R. Noonan, and W. Wulf [1970]. "A new architecture for mini-computers: The DEC PDP-11," Proc. AFIPS Spring Joint Computer Conf. , May 5-May 7, 1970, Atlantic City, N. J., 657-675. Google Scholar
Digital Library
- Benes, V. E. [1962]. "Rearrangeable three stage connecting networks," Bell System Technical Journal 41, 1481-1492.Google Scholar
Cross Ref
- Bertozzi, D., A. Jalabert, S. Murali, R. Tamhankar, S. Stergiou, L. Benini, and G. De Micheli [2005]. "NoC synthesis flow for customized domain specific multiprocessor systems-on-chip," IEEE Trans. on Parallel and Distributed Systems 16:2 (February), 113-130. Google Scholar
Digital Library
- Bhandarkar, D. P. [1995]. Alpha Architecture and Implementations , Digital Press, Newton, Mass.Google Scholar
- Bhandarkar, D. P., and D. W. Clark [1991]. "Performance from architecture: Comparing a RISC and a CISC with similar hardware organizations," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 310-319. Google Scholar
- Bhandarkar, D. P., and J. Ding [1997]. "Performance characterization of the Pentium Pro processor," Proc. Third Int'l. Symposium on High-Performance Computer Architecture , February 1-February 5, 1997, San Antonio, Tex., 288-297. Google Scholar
- Bhuyan, L. N., and D. P. Agrawal [1984]. "Generalized hypercube and hyperbus structures for a computer network," IEEE Trans. on Computers 32:4 (April), 322-333. Google Scholar
- Bienia, C., S. Kumar, P. S. Jaswinder, and K. Li [2008]. The Parsec Benchmark Suite: Characterization and Architectural Implications , Tech. Rep. TR-811-08, Princeton University, Princeton, N. J.Google Scholar
- Bier, J. [1997]. "The Evolution of DSP Processors," presentation at Univesity of California, Berkeley, November 14.Google Scholar
- Bird, S., A. Phansalkar, L. K. John, A. Mericas, and R. Indukuru [2007]. "Characterization of performance of SPEC CPU benchmarks on Intel's Core Microarchitecture based processor," Proc. 2007 SPEC Benchmark Workshop , January 21, 2007, Austin, Tex.Google Scholar
- Birman, M., A. Samuels, G. Chu, T. Chuk, L. Hu, J. McLeod, and J. Barnes [1990]. "Developing the WRL3170/3171 SPARC floating-point coprocessors," IEEE Micro 10:1, 55-64. Google Scholar
Digital Library
- Blackburn, M., R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanovic, T. VanDrunen, D. von Dincklage, and B. Wiedermann [2006]. "The DaCapo benchmarks: Java benchmarking development and analysis," ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) , October 22-26, 2006, 169-190. Google Scholar
- Blaum, M., J. Bruck, and A. Vardy [1996]. "MDS array codes with independent parity symbols," IEEE Trans. on Information Theory , IT-42 (March), 529-42. Google Scholar
Digital Library
- Blaum, M., J. Brady, J. Bruck, and J. Menon [1994]. "EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 245-254. Google Scholar
- Blaum, M., J. Brady, J. Bruck, and J. Menon [1995]. "EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures," IEEE Trans. on Computers 44:2 (February), 192-202. Google Scholar
Digital Library
- Blaum, M., J. Brady, J., Bruck, J. Menon, and A. Vardy [2001]. "The EVENODD code and its generalization," in H. Jin, T. Cortes, and R. Buyya, eds., High Performance Mass Storage and Parallel I/O: Technologies and Applications , Wiley-IEEE, New York, 187-208.Google Scholar
- Bloch, E. [1959]. "The engineering design of the Stretch computer," 1959 Proceedings of the Eastern Joint Computer Conf. , December 1-3, 1959, Boston, Mass., 48-59. Google Scholar
- Boddie, J. R. [2000]. "History of DSPs," www.lucent.com/micro/dsp/dsphist.html.Google Scholar
- Bolt, K. M. [2005]. "Amazon sees sales rise, profit fall," Seattle Post-Intelligencer , October 25 (http://seattlepi.nwsource.com/business/245943_techearns26.html).Google Scholar
- Bordawekar, R., U. Bondhugula, R. Rao [2010]. "Believe It or Not!: Multi-core CPUs can Match GPU Performance for a FLOP-Intensive Application!" 19th International Conference on Parallel Architecture and Compilation Techniques (PACT 2010) . Vienna, Austria, September 11-15, 2010, 537-538. Google Scholar
- Borg, A., R. E. Kessler, and D. W. Wall [1990]. "Generation and analysis of very long address traces," 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 270-279. Google Scholar
- Bouknight, W. J., S. A. Deneberg, D. E. McIntyre, J. M. Randall, A. H. Sameh, and D. L. Slotnick [1972]. "The Illiac IV system," Proc. IEEE 60:4, 369-379. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 306-316. Google Scholar
Cross Ref
- Brady, J. T. [1986]. "A theory of productivity in the creative process," IEEE CG&A (May), 25-34. Google Scholar
- Brain, M. [2000]. "Inside a Digital Cell Phone," www.howstuffworks.com/insidecellphone. htm.Google Scholar
- Brandt, M., J. Brooks, M. Cahir, T. Hewitt, E. Lopez-Pineda, and D. Sandness [2000]. The Benchmarker's Guide for Cray SV1 Systems. Cray Inc., Seattle, Wash.Google Scholar
- Brent, R. P., and H. T. Kung [1982]. "A regular layout for parallel adders," IEEE Trans. on Computers C-31, 260-264. Google Scholar
Digital Library
- Brewer, E. A., and B. C. Kuszmaul [1994]. "How to get good performance from the CM-5 data network," Proc. Eighth Int'l. Parallel Processing Symposium , April 26-27, 1994, Cancun, Mexico. Google Scholar
Cross Ref
- Brin, S., and L. Page [1998]. "The anatomy of a large-scale hypertextual Web search engine," Proc. 7th Int'l. World Wide Web Conf. , April 14-18, 1998, Brisbane, Queensland, Australia, 107-117. Google Scholar
- Brown, A., and D. A. Patterson [2000]. "Towards maintainability, availability, and growth benchmarks: A case study of software RAID systems." Proc. 2000 USENIX Annual Technical Conf. , June 18-23, 2000, San Diego, Calif. Google Scholar
- Bucher, I. V., and A. H. Hayes [1980]. "I/O performance measurement on Cray-1 and CDC 7000 computers," Proc. Computer Performance Evaluation Users Group , 16th Meeting , NBS 500-65, 245-254.Google Scholar
- Bucher, I. Y. [1983]. "The computational speed of supercomputers," Proc. Int'l. Conf. on Measuring and Modeling of Computer Systems (SIGMETRICS 1983) , August 29-31, 1983, Minneapolis, Minn., 151-165. Google Scholar
- Bucholtz, W. [1962]. Planning a Computer System: Project Stretch , McGraw-Hill, New York. Google Scholar
- Burgess, N., and T. Williams [1995]. "Choices of operand truncation in the SRT division algorithm," IEEE Trans. on Computers 44:7, 933-938. Google Scholar
Digital Library
- Burkhardt III, H., S. Frank, B. Knobe, and J. Rothnie [1992]. Overview of the KSR1 Computer System , Tech. Rep. KSR-TR-9202001, Kendall Square Research, Boston, Mass.Google Scholar
- Burks, A. W., H. H. Goldstine, and J. von Neumann [1946]. "Preliminary discussion of the logical design of an electronic computing instrument," Report to the U. S. Army Ordnance Department, p. 1; also appears in Papers of John von Neumann , W. Aspray and A. Burks, eds., MIT Press, Cambridge, Mass., and Tomash Publishers, Los Angeles, Calif., 1987, 97-146.Google Scholar
- Calder, B., G. Reinman, and D. M. Tullsen [1999]. "Selective value prediction," Proc. 26th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 2-4, 1999, Atlanta, Ga. Google Scholar
- Calder, B., D. Grunwald, M. Jones, D. Lindsay, J. Martin, M. Mozer, and B. Zorn [1997]. "Evidence-based static branch prediction using machine learning," ACM Trans. Program. Lang. Syst. 19:1, 188-222. Google Scholar
Digital Library
- Callahan, D., J. Dongarra, and D. Levine [1988]. "Vectorizing compilers: A test suite and results," Proc. ACM/IEEE Conf. on Supercomputing , November 12-17, 1988, Orland, Fla., 98-105. Google Scholar
- Cantin, J. F., and M. D. Hill [2001]. "Cache Performance for Selected SPEC CPU2000 Benchmarks," www.jfred.org/cache-data.html (June).Google Scholar
- Cantin, J. F., and M. D. Hill [2003]. "Cache Performance for SPEC CPU2000 Benchmarks, Version 3.0," www.cs.wisc.edu/multifacet/misc/spec2000cache-data/index.html.Google Scholar
- Carles, S. [2005]. "Amazon reports record Xmas season, top game picks," Gamasutra , December 27 (http://www.gamasutra.com/php-bin/news_index.php?story=7630.)Google Scholar
- Carter, J., and K. Rajamani [2010]. "Designing energy-efficient servers and data centers," IEEE Computer 43:7 (July), 76-78. Google Scholar
Digital Library
- Case, R. P., and A. Padegs [1978]. "The architecture of the IBM System/370," Communications of the ACM 21:1, 73-96. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples , McGraw-Hill, New York, 1982, 830-855. Google Scholar
Digital Library
- Censier, L., and P. Feautrier [1978]. "A new solution to coherence problems in multicache systems," IEEE Trans. on Computers C-27:12 (December), 1112-1118. Google Scholar
Digital Library
- Chandra, R., S. Devine, B. Verghese, A. Gupta, and M. Rosenblum [1994]. "Scheduling and page migration for multiprocessor compute servers," Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 4-7, 1994, San Jose, Calif., 12-24. Google Scholar
- Chang, F., J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber [2006]. "Bigtable: A distributed storage system for structured data," Proc. 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI '06) , November 6-8, 2006, Seattle, Wash. Google Scholar
- Chang, J., J. Meza, P. Ranganathan, C. Bash, and A. Shah [2010]. "Green server design: Beyond operational energy to sustainability," Proc. Workshop on Power Aware Computing and Systems (HotPower '10) , October 3, 2010, Vancouver, British Columbia. Google Scholar
- Chang, P. P., S. A. Mahlke, W. Y. Chen, N. J. Warter, and W. W. Hwu [1991]. "IMPACT: An architectural framework for multiple-instruction-issue processors," 18th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 27-30, 1991, Toronto, Canada, 266-275. Google Scholar
- Charlesworth, A. E. [1981]. "An approach to scientific array processing: The architecture design of the AP-120B/FPS-164 family," Computer 14:9 (September), 18-27. Google Scholar
Digital Library
- Charlesworth, A. [1998]. "Starfire: Extending the SMP envelope," IEEE Micro 18:1 (January/February), 39-49. Google Scholar
Digital Library
- Chen, P. M., and E. K. Lee [1995]. "Striping in a RAID level 5 disk array," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 15-19, 1995, Ottawa, Canada, 136-145. Google Scholar
- Chen, P. M., G. A. Gibson, R. H. Katz, and D. A. Patterson [1990]. "An evaluation of redundant arrays of inexpensive disks using an Amdahl 5890," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 22-25, 1990, Boulder, Colo. Google Scholar
- Chen, P. M., E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson [1994]. "RAID: High-performance, reliable secondary storage," ACM Computing Surveys 26:2 (June), 145-188. Google Scholar
Digital Library
- Chen, S. [1983]. "Large-scale and high-speed multiprocessor system for scientific applications," Proc. NATO Advanced Research Workshop on High-Speed Computing , June 20-22, 1983, Julich, West Germany. Also appears in K. Hwang, ed., "Superprocessors: Design and applications," IEEE (August), 602-609, 1984.Google Scholar
- Chen, T. C. [1980]. "Overlap and parallel processing," in H. Stone, ed., Introduction to Computer Architecture , Science Research Associates, Chicago, 427-486.Google Scholar
- Chow, F. C. [1983]. "A Portable Machine-Independent Global Optimizer--Design and Measurements," Ph. D. thesis, Stanford University, Palo Alto, Calif. Google Scholar
- Chrysos, G. Z., and J. S. Emer [1998]. "Memory dependence prediction using store sets," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 142-153. Google Scholar
- Clark, B., T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. Neefe Matthews [2004]. "Xen and the art of repeated research," Proc. USENIX Annual Technical Conf. , June 27-July 2, 2004, 135-144. Google Scholar
- Clark, D. W. [1983]. "Cache performance of the VAX-11/780," ACM Trans. on Computer Systems 1:1, 24-37. Google Scholar
Digital Library
- Clark, D. W. [1987]. "Pipelining and performance in the VAX 8800 processor," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 173-177. Google Scholar
- Clark, D. W., and J. S. Emer [1985]. "Performance of the VAX-11/780 translation buffer: Simulation and measurement," ACM Trans. on Computer Systems 3:1 (February), 31-62. Google Scholar
Digital Library
- Clark, D., and H. Levy [1982]. "Measurement and analysis of instruction set use in the VAX-11/780," Proc. Ninth Annual Int'l. Symposium on Computer Architecture (ISCA) , April 26-29, 1982, Austin, Tex., 9-17. Google Scholar
- Clark, D., and W. D. Strecker [1980]. "Comments on 'the case for the reduced instruction set computer,'" Computer Architecture News 8:6 (October), 34-38. Google Scholar
Digital Library
- Clark, W. A. [1957]. "The Lincoln TX-2 computer development," Proc. Western Joint Computer Conference , February 26-28, 1957, Los Angeles, 143-145. Google Scholar
- Clidaras, J., C. Johnson, and B. Felderman [2010]. Private communication. Climate Savers Computing Initiative. [2007]. "Efficiency Specs," http://www. climatesaverscomputing.org/.Google Scholar
- Clos, C. [1953]. "A study of non-blocking switching networks," Bell Systems Technical Journal 32 (March), 406-424.Google Scholar
Cross Ref
- Cody, W. J., J. T. Coonen, D. M. Gay, K. Hanson, D. Hough, W. Kahan, R. Karpinski, J. Palmer, F. N. Ris, and D. Stevenson [1984]. "A proposed radix- and word-lengthindependent standard for floating-point arithmetic," IEEE Micro 4:4, 86-100. Google Scholar
Digital Library
- Colwell, R. P., and R. Steck [1995]. "A 0.6 µm BiCMOS processor with dynamic execution." Proc. of IEEE Int'l. Symposium on Solid State Circuits (ISSCC) , February 15-17, 1995, San Francisco, 176-177.Google Scholar
- Colwell, R. P., R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman [1987]. "A VLIW architecture for a trace scheduling compiler," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 180-192. Google Scholar
- Comer, D. [1993]. Internetworking with TCP/IP , 2nd ed., Prentice Hall, Englewood Cliffs, N. J. Google Scholar
- Compaq Computer Corporation. [1999]. Compiler Writer's Guide for the Alpha 21264 , Order Number EC-RJ66A-TE, June, www1.support.compaq.com/alpha-tools/documentation/current/21264_EV67/ec-rj66a-te_comp_writ_gde_for_alpha21264.pdf.Google Scholar
- Conti, C., D. H. Gibson, and S. H. Pitkowsky [1968]. "Structural aspects of the System/ 360 Model 85. Part I. General organization," IBM Systems J. 7:1, 2-14. Google Scholar
Digital Library
- Coonen, J. [1984]. "Contributions to a Proposed Standard for Binary Floating-Point Arithmetic," Ph. D. thesis, University of California, Berkeley. Google Scholar
- Corbett, P., B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar [2004]. "Row-diagonal parity for double disk failure correction," Proc. 3rd USENIX Conf. on File and Storage Technology (FAST '04) , March 31-April 2, 2004, San Francisco. Google Scholar
- Crawford, J., and P. Gelsinger [1988]. Programming the 80386 , Sybex Books, Alameda, Calif.Google Scholar
- Culler, D. E., J. P. Singh, and A. Gupta [1999]. Parallel Computer Architecture: A Hardware/Software Approach , Morgan Kaufmann, San Francisco. Google Scholar
- Curnow, H. J., and B. A. Wichmann [1976]. "A synthetic benchmark," The Computer J. 19:1, 43-49.Google Scholar
Cross Ref
- Cvetanovic, Z., and R. E. Kessler [2000]. "Performance analysis of the Alpha 21264- based Compaq ES40 system," Proc. 27th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 10-14, 2000, Vancouver, Canada, 192-202. Google Scholar
- Dally, W. J. [1990]. "Performance analysis of k -ary n -cube interconnection networks," IEEE Trans. on Computers 39:6 (June), 775-785. Google Scholar
Digital Library
- Dally, W. J. [1992]. "Virtual channel flow control," IEEE Trans. on Parallel and Distributed Systems 3:2 (March), 194-205. Google Scholar
Digital Library
- Dally, W. J. [1999]. "Interconnect limited VLSI architecture," Proc. of the International Interconnect Technology Conference , May 24-26, 1999, San Francisco.Google Scholar
- Dally, W. J., and C. I. Seitz [1986]. "The torus routing chip," Distributed Computing 1:4, 187-196.Google Scholar
Cross Ref
- Dally, W. J., and B. Towles [2001]. "Route packets, not wires: On-chip interconnection networks," Proc. 38th Design Automation Conference , June 18-22, 2001, Las Vegas. Google Scholar
- Dally, W. J., and B. Towles [2003]. Principles and Practices of Interconnection Networks , Morgan Kaufmann, San Francisco. Google Scholar
- Darcy, J. D., and D. Gay [1996]. "FLECKmarks: Measuring floating point performance using a full IEEE compliant arithmetic benchmark," CS 252 class project, University of California, Berkeley (see HTTP.CS.Berkeley.EDU/~darcy/Projects/cs252/).Google Scholar
- Darley, H. M. et al. [1989]. "Floating Point/Integer Processor with Divide and Square Root Functions," U. S. Patent 4,878,190, October 31.Google Scholar
- Davidson, E. S. [1971]. "The design and control of pipelined function generators," Proc. IEEE Conf. on Systems , Networks , and Computers , January 19-21, 1971, Oaxtepec, Mexico, 19-21.Google Scholar
- Davidson, E. S., A. T. Thomas, L. E. Shar, and J. H. Patel [1975]. "Effective control for pipelined processors," Proc. IEEE COMPCON , February 25-27, 1975, San Francisco, 181-184.Google Scholar
- Davie, B. S., L. L. Peterson, and D. Clark [1999]. Computer Networks: A Systems Approach , 2nd ed., Morgan Kaufmann, San Francisco. Google Scholar
- Dean, J. [2009]. "Designs, lessons and advice from building large distributed systems [keynote address]," Proc. 3rd ACM SIGOPS Int'l. Workshop on Large-Scale Distributed Systems and Middleware , Co-located with the 22nd ACM Symposium on Operating Systems Principles , October 11-14, 2009, Big Sky, Mont.Google Scholar
- Dean, J., and S. Ghemawat [2004]. "MapReduce: Simplified data processing on large clusters." In Proc. Operating Systems Design and Implementation (OSDI) , December 6-8, 2004, San Francisco, Calif., 137-150. Google Scholar
- Dean, J., and S. Ghemawat [2008]. "MapReduce: Simplified data processing on large clusters," Communications of the ACM , 51:1, 107-113. Google Scholar
- DeCandia, G., D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels [2007]. "Dynamo: Amazon's highly available key-value store," Proc. 21st ACM Symposium on Operating Systems Principles , October 14-17, 2007, Stevenson, Wash. Google Scholar
- Dehnert, J. C., P. Y.-T. Hsu, and J. P. Bratt [1989]. "Overlapped loop support on the Cydra 5," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, Mass., 26-39. Google Scholar
- Demmel, J. W., and X. Li [1994]. "Faster numerical algorithms via exception handling," IEEE Trans. on Computers 43:8, 983-992. Google Scholar
- Denehy, T. E., J. Bent, F. I. Popovici, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau [2004]. "Deconstructing storage arrays," Proc. 11th Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 7-13, 2004, Boston, Mass., 59-71. Google Scholar
- Desurvire, E. [1992]. "Lightwave communications: The fifth generation," Scientific American (International Edition) 266:1 (January), 96-103.Google Scholar
- Diep, T. A., C. Nelson, and J. P. Shen [1995]. "Performance evaluation of the PowerPC 620 microarchitecture," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy. Google Scholar
- Digital Semiconductor. [1996]. Alpha Architecture Handbook , Version 3 , Digital Press, Maynard, Mass.Google Scholar
- Ditzel, D. R., and H. R. McLellan [1987]. "Branch folding in the CRISP microprocessor: Reducing the branch delay to zero," Proc. 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 2-7. Google Scholar
- Ditzel, D. R., and D. A. Patterson [1980]. "Retrospective on high-level language computer architecture," Proc. Seventh Annual Int'l. Symposium on Computer Architecture (ISCA) , May 6-8, 1980, La Baule, France, 97-104. Google Scholar
- Doherty, W. J., and R. P. Kelisky [1979]. "Managing VM/CMS systems for user effectiveness," IBM Systems J. 18:1, 143-166. Google Scholar
- Dongarra, J. J. [1986]. "A survey of high performance processors," Proc. IEEE COMPCON , March 3-6, 1986, San Francisco, 8-11.Google Scholar
- Dongarra, J., T. Sterling, H. Simon, and E. Strohmaier [2005]. "High-performance computing: Clusters, constellations, MPPs, and future directions," Computing in Science & Engineering , 7:2 (March/April), 51-59. Google Scholar
- Douceur, J. R., and W. J. Bolosky [1999]. "A large scale study of file-system contents," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 1-9, 1999, Atlanta, Ga., 59-69. Google Scholar
- Douglas, J. [2005]. "Intel 8xx series and Paxville Xeon-MP microprocessors," paper presented at Hot Chips 17, August 14-16, 2005, Stanford University, Palo Alto, Calif.Google Scholar
- Duato, J. [1993]. "A new theory of deadlock-free adaptive routing in wormhole networks," IEEE Trans. on Parallel and Distributed Systems 4:12 (December), 1320-1331.
- Duato, J., and T. M. Pinkston [2001]. "A general theory for deadlock-free adaptive routing using a mixed set of resources," IEEE Trans. on Parallel and Distributed Systems 12:12 (December), 1219-1235. Google Scholar
- Duato, J., S. Yalamanchili, and L. Ni [2003]. Interconnection Networks: An Engineering Approach , 2nd printing, Morgan Kaufmann, San Francisco.Google Scholar
- Duato, J., I. Johnson, J. Flich, F. Naven, P. Garcia, and T. Nachiondo [2005a]. "A new scalable and cost-effective congestion management strategy for lossless multistage interconnection networks," Proc. 11th Int'l. Symposium on High-Performance Computer Architecture , February 12-16, 2005, San Francisco. Google Scholar
- Duato, J., O. Lysne, R. Pang, and T. M. Pinkston [2005b]. "Part I: A theory for deadlock-free dynamic reconfiguration of interconnection networks," IEEE Trans. on Parallel and Distributed Systems 16:5 (May), 412-427.
- Dubois, M., C. Scheurich, and F. Briggs [1988]. "Synchronization, coherence, and event ordering," IEEE Computer 21:2 (February), 9-21. Google Scholar
- Dunigan, W., K. Vetter, K. White, and P. Worley [2005]. "Performance evaluation of the Cray X1 distributed shared memory architecture," IEEE Micro January/February, 30-40. Google Scholar
- Eden, A., and T. Mudge [1998]. "The YAGS branch prediction scheme," Proc. of the 31st Annual ACM/IEEE Int'l. Symposium on Microarchitecture , November 30-December 2, 1998, Dallas, Tex., 69-80. Google Scholar
- Edmondson, J. H., P. I. Rubinfield, R. Preston, and V. Rajagopalan [1995]. "Superscalar instruction execution in the 21164 Alpha microprocessor," IEEE Micro 15:2, 33-43. Google Scholar
- Eggers, S. [1989]. "Simulation Analysis of Data Sharing in Shared Memory Multiprocessors," Ph. D. thesis, University of California, Berkeley. Google Scholar
- Elder, J., A. Gottlieb, C. K. Kruskal, K. P. McAuliffe, L. Randolph, M. Snir, P. Teller, and J. Wilson [1985]. "Issues related to MIMD shared-memory computers: The NYU Ultracomputer approach," Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-19, 1985, Boston, Mass., 126-135. Google Scholar
- Ellis, J. R. [1986]. Bulldog: A Compiler for VLIW Architectures , MIT Press, Cambridge, Mass. Google Scholar
- Emer, J. S., and D. W. Clark [1984]. "A characterization of processor performance in the VAX-11/780," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 301-310. Google Scholar
- Enriquez, P. [2001]. "What happened to my dial tone? A study of FCC service disruption reports," poster, Richard Tapia Symposium on the Celebration of Diversity in Computing , October 18-20, Houston, Tex.Google Scholar
- Erlichson, A., N. Nuckolls, G. Chesson, and J. L. Hennessy [1996]. "SoftFLASH: Analyzing the performance of clustered distributed virtual shared memory," Proc. Seventh Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass., 210-220. Google Scholar
- Esmaeilzadeh, H., T. Cao, Y. Xi, S. M. Blackburn, and K. S. McKinley [2011]. "Looking Back on the Language and Hardware Revolution: Measured Power, Performance, and Scaling," Proc. 16th Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 5-11, 2011, Newport Beach, Calif. Google Scholar
- Evers, M., S. J. Patel, R. S. Chappell, and Y. N. Patt [1998]. "An analysis of correlation and predictability: What makes two-level branch predictors work," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA) , July 3-14, 1998, Barcelona, Spain, 52-61. Google Scholar
- Fabry, R. S. [1974]. "Capability based addressing," Communications of the ACM 17:7 (July), 403-412. Google Scholar
- Falsafi, B., and D. A. Wood [1997]. "Reactive NUMA: A design for unifying S-COMA and CC-NUMA," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo., 229-240. Google Scholar
- Fan, X., W. Weber, and L. A. Barroso [2007]. "Power provisioning for a warehouse-sized computer," Proc. 34th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 9-13, 2007, San Diego, Calif. Google Scholar
- Farkas, K. I., and N. P. Jouppi [1994]. "Complexity/performance trade-offs with nonblocking loads," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago. Google Scholar
- Farkas, K. I., N. P. Jouppi, and P. Chow [1995]. "How useful are non-blocking loads, stream buffers and speculative execution in multiple issue processors?," Proc. First IEEE Symposium on High-Performance Computer Architecture , January 22-25, 1995, Raleigh, N.C., 78-89. Google Scholar
- Farkas, K. I., P. Chow, N. P. Jouppi, and Z. Vranesic [1997]. "Memory-system design considerations for dynamically-scheduled processors," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo., 133-143. Google Scholar
- Fazio, D. [1987]. "It's really much more fun building a supercomputer than it is simply inventing one," Proc. IEEE COMPCON , February 23-27, 1987, San Francisco, 102-105.Google Scholar
- Fisher, J. A. [1981]. "Trace scheduling: A technique for global microcode compaction," IEEE Trans. on Computers 30:7 (July), 478-490. Google Scholar
- Fisher, J. A. [1983]. "Very long instruction word architectures and ELI-512," 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 140-150. Google Scholar
- Fisher, J. A., and S. M. Freudenberger [1992]. "Predicting conditional branches from previous runs of a program," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, Mass., 85-95. Google Scholar
- Fisher, J. A., and B. R. Rau [1993]. Journal of Supercomputing , January (special issue).Google Scholar
- Fisher, J. A., J. R. Ellis, J. C. Ruttenberg, and A. Nicolau [1984]. "Parallel processing: A smart compiler and a dumb processor," Proc. SIGPLAN Conf. on Compiler Construction , June 17-22, 1984, Montreal, Canada, 11-16. Google Scholar
- Flemming, P. J., and J. J. Wallace [1986]. "How not to lie with statistics: The correct way to summarize benchmarks results," Communications of the ACM 29:3 (March), 218-221. Google Scholar
- Flynn, M. J. [1966]. "Very high-speed computing systems," Proc. IEEE 54:12 (December), 1901-1909.Google Scholar
- Forgie, J. W. [1957]. "The Lincoln TX-2 input-output system," Proc. Western Joint Computer Conference (February), Institute of Radio Engineers, Los Angeles, 156-160. Google Scholar
- Foster, C. C., and E. M. Riseman [1972]. "Percolation of code to enhance parallel dispatching and execution," IEEE Trans. on Computers C-21:12 (December), 1411-1415.
- Frank, S. J. [1984]. "Tightly coupled multiprocessor systems speed memory access time," Electronics 57:1 (January), 164-169.Google Scholar
- Freiman, C. V. [1961]. "Statistical analysis of certain binary division algorithms," Proc. IRE 49:1, 91-103.Google Scholar
- Friesenborg, S. E., and R. J. Wicks [1985]. DASD Expectations: The 3380, 3380-23, and MVS/XA , Tech. Bulletin GG22-9363-02, IBM Washington Systems Center, Gaithersburg, Md.Google Scholar
- Fuller, S. H., and W. E. Burr [1977]. "Measurement and evaluation of alternative computer architectures," Computer 10:10 (October), 24-35. Google Scholar
- Furber, S. B. [1996]. ARM System Architecture , Addison-Wesley, Harlow, England (see www.cs.man.ac.uk/amulet/publications/books/ARMsysArch). Google Scholar
- Gagliardi, U. O. [1973]. "Report of workshop 4--software-related advances in computer hardware," Proc. Symposium on the High Cost of Software , September 17-19, 1973, Monterey, Calif., 99-120.Google Scholar
- Gajski, D., D. Kuck, D. Lawrie, and A. Sameh [1983]. "CEDAR--a large scale multiprocessor," Proc. Int'l. Conf. on Parallel Processing (ICPP) , August, Columbus, Ohio, 524-529.Google Scholar
- Gallagher, D. M., W. Y. Chen, S. A. Mahlke, J. C. Gyllenhaal, and W. W. Hwu [1994]. "Dynamic memory disambiguation using the memory conflict buffer," Proc. Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 4-7, San Jose, Calif., 183-193.
- Galles, M. [1996]. "Scalable pipelined interconnect for distributed endpoint routing: The SGI SPIDER chip," Proc. IEEE HOT Interconnects '96 , August 15-17, 1996, Stanford University, Palo Alto, Calif.Google Scholar
- Game, M., and A. Booker [1999]. "CodePack code compression for PowerPC processors," MicroNews , 5:1, www.chips.ibm.com/micronews/vol5_no1/codepack.html.Google Scholar
- Gao, Q. S. [1993]. "The Chinese remainder theorem and the prime memory system," 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif. ( Computer Architecture News 21:2 (May), 337-340). Google Scholar
- Gap. [2005]. "Gap Inc. Reports Third Quarter Earnings," http://gapinc.com/public/documents/PR_Q405EarningsFeb2306.pdf.Google Scholar
- Gap. [2006]. "Gap Inc. Reports Fourth Quarter and Full Year Earnings," http://gapinc.com/public/documents/Q32005PressRelease_Final22.pdff.Google Scholar
- Garner, R., A. Agarwal, F. Briggs, E. Brown, D. Hough, B. Joy, S. Kleiman, S. Muchnick, M. Namjoo, D. Patterson, J. Pendleton, and R. Tuck [1988]. "Scalable processor architecture (SPARC)," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 278-283.Google Scholar
- Gebis, J., and D. Patterson [2007]. "Embracing and extending 20th-century instruction set architectures," IEEE Computer 40:4 (April), 68-75. Google Scholar
- Gee, J. D., M. D. Hill, D. N. Pnevmatikatos, and A. J. Smith [1993]. "Cache performance of the SPEC92 benchmark suite," IEEE Micro 13:4 (August), 17-27. Google Scholar
- Gehringer, E. F., D. P. Siewiorek, and Z. Segall [1987]. Parallel Processing: The Cm* Experience , Digital Press, Bedford, Mass. Google Scholar
- Gharachorloo, K., A. Gupta, and J. L. Hennessy [1992]. "Hiding memory latency using dynamic scheduling in shared-memory multiprocessors," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google Scholar
- Gharachorloo, K., D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. L. Hennessy [1990]. "Memory consistency and event ordering in scalable shared-memory multiprocessors," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 15-26. Google Scholar
- Ghemawat, S., H. Gobioff, and S.-T. Leung [2003]. "The Google file system," Proc. 19th ACM Symposium on Operating Systems Principles , October 19-22, 2003, Bolton Landing, N.Y. Google Scholar
- Gibson, D. H. [1967]. "Considerations in block-oriented systems design," AFIPS Conf. Proc. 30, 75-80. Google Scholar
- Gibson, G. A. [1992]. Redundant Disk Arrays: Reliable , Parallel Secondary Storage , ACM Distinguished Dissertation Series, MIT Press, Cambridge, Mass. Google Scholar
- Gibson, J. C. [1970]. "The Gibson mix," Rep. TR. 00.2043, IBM Systems Development Division, Poughkeepsie, N.Y. (research done in 1959).Google Scholar
- Gibson, J., R. Kunz, D. Ofelt, M. Horowitz, J. Hennessy, and M. Heinrich [2000]. "FLASH vs. (simulated) FLASH: Closing the simulation loop," Proc. Ninth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , November 12-15, Cambridge, Mass., 49-58. Google Scholar
- Glass, C. J., and L. M. Ni [1992]. "The Turn Model for adaptive routing," 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google Scholar
- Goldberg, D. [1991]. "What every computer scientist should know about floating-point arithmetic," Computing Surveys 23:1, 5-48. Google Scholar
- Goldberg, I. B. [1967]. "27 bits are not enough for 8-digit accuracy," Communications of the ACM 10:2, 105-106. Google Scholar
- Goldstein, S. [1987]. Storage Performance--An Eight Year Outlook, Tech. Rep. TR 03.308-1, IBM Santa Teresa Laboratory, San Jose, Calif.
- Goldstine, H. H. [1972]. The Computer: From Pascal to von Neumann , Princeton University Press, Princeton, N. J. Google Scholar
- Gonzalez, J., and A. González [1998]. "Limits of instruction level parallelism with data speculation," Proc. Vector and Parallel Processing (VECPAR) Conf. , June 21-23, 1998, Porto, Portugal, 585-598. Google Scholar
- Goodman, J. R. [1983]. "Using cache memory to reduce processor memory traffic," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 124-131. Google Scholar
- Goralski, W. [1997]. SONET: A Guide to Synchronous Optical Network , McGraw-Hill, New York. Google Scholar
- Gosling, J. B. [1980]. Design of Arithmetic Units for Digital Computers , Springer-Verlag, New York.Google Scholar
- Gray, J. [1990]. "A census of Tandem system availability between 1985 and 1990," IEEE Trans. on Reliability , 39:4 (October), 409-418.Google Scholar
- Gray, J. (ed.) [1993]. The Benchmark Handbook for Database and Transaction Processing Systems , 2nd ed., Morgan Kaufmann, San Francisco.Google Scholar
- Gray, J. [2006]. Sort benchmark home page, http://sortbenchmark.org/.Google Scholar
- Gray, J., and A. Reuter [1993]. Transaction Processing: Concepts and Techniques , Morgan Kaufmann, San Francisco. Google Scholar
- Gray, J., and D. P. Siewiorek [1991]. "High-availability computer systems," Computer 24:9 (September), 39-48. Google Scholar
- Gray, J., and C. van Ingen [2005]. Empirical Measurements of Disk Failure Rates and Error Rates , MSR-TR-2005-166, Microsoft Research, Redmond, Wash.Google Scholar
- Greenberg, A., N. Jain, S. Kandula, C. Kim, P. Lahiri, D. Maltz, P. Patel, and S. Sengupta [2009]. "VL2: A Scalable and Flexible Data Center Network," in Proc. ACM SIGCOMM , August 17-21, 2009, Barcelona, Spain. Google Scholar
- Grice, C., and M. Kanellos [2000]. "Cell phone industry at crossroads: Go high or low?," CNET News , August 31, technews.netscape.com/news/0-1004-201-2518386- 0.html?tag=st.ne.1002.tgif.sf.Google Scholar
- Groe, J. B., and L. E. Larson [2000]. CDMA Mobile Radio Design , Artech House, Boston. Google Scholar
- Gunther, K. D. [1981]. "Prevention of deadlocks in packet-switched data transport systems," IEEE Trans. on Communications COM-29:4 (April), 512-524.Google Scholar
- Hagersten, E., and M. Koster [1998]. "WildFire: A scalable path for SMPs," Proc. Fifth Int'l. Symposium on High-Performance Computer Architecture , January 9-12, 1999, Orlando, Fla. Google Scholar
- Hagersten, E., A. Landin, and S. Haridi [1992]. "DDM--a cache-only memory architecture," IEEE Computer 25:9 (September), 44-54. Google Scholar
- Hamacher, V. C., Z. G. Vranesic, and S. G. Zaky [1984]. Computer Organization , 2nd ed., McGraw-Hill, New York. Google Scholar
- Hamilton, J. [2009]. "Data center networks are in my way," paper presented at the Stanford Clean Slate CTO Summit, October 23, 2009 (http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_CleanSlateCTO2009.pdf).Google Scholar
- Hamilton, J. [2010]. "Cloud computing economies of scale," paper presented at the AWS Workshop on Genomics and Cloud Computing , June 8, 2010, Seattle, Wash. (http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_GenomicsCloud20100608.pdf).Google Scholar
- Handy, J. [1993]. The Cache Memory Book , Academic Press, Boston. Google Scholar
- Hauck, E. A., and B. A. Dent [1968]. "Burroughs' B6500/B7500 stack mechanism," Proc. AFIPS Spring Joint Computer Conf. , April 30-May 2, 1968, Atlantic City, N. J., 245-251. Google Scholar
- Heald, R., K. Aingaran, C. Amir, M. Ang, M. Boland, A. Das, P. Dixit, G. Gouldsberry, J. Hart, T. Horel, W.-J. Hsu, J. Kaku, C. Kim, S. Kim, F. Klass, H. Kwan, R. Lo, H. McIntyre, A. Mehta, D. Murata, S. Nguyen, Y.-P. Pai, S. Patel, K. Shin, K. Tam, S. Vishwanthaiah, J. Wu, G. Yee, and H. You [2000]. "Implementation of third-generation SPARC V9 64-b microprocessor," ISSCC Digest of Technical Papers, 412-413 and slide supplement.
- Heinrich, J. [1993]. MIPS R4000 User's Manual, Prentice Hall, Englewood Cliffs, N. J.
- Henly, M., and B. McNutt [1989]. DASD I/O Characteristics: A Comparison of MVS to VM, Tech. Rep. TR 02.1550 (May), IBM General Products Division, San Jose, Calif.
- Hennessy, J. [1984]. "VLSI processor architecture," IEEE Trans. on Computers C-33:11 (December), 1221-1246. Google Scholar
- Hennessy, J. [1985]. "VLSI RISC processors," VLSI Systems Design 6:10 (October), 22-32.Google Scholar
- Hennessy, J., N. Jouppi, F. Baskett, and J. Gill [1981]. "MIPS: A VLSI processor architecture," in CMU Conference on VLSI Systems and Computations , Computer Science Press, Rockville, Md.Google Scholar
- Hewlett-Packard. [1994]. PA-RISC 2.0 Architecture Reference Manual , 3rd ed., Hewlett-Packard, Palo Alto, Calif.Google Scholar
- Hewlett-Packard. [1998]. "HP's '5NINES:5MINUTES' Vision Extends Leadership and Redefines High Availability in Mission-Critical Environments," February 10, www.future.enterprisecomputing.hp.com/ia64/news/5nines_vision_pr.html.Google Scholar
- Hill, M. D. [1987]. "Aspects of Cache Memory and Instruction Buffer Performance," Ph. D. thesis, Tech. Rep. UCB/CSD 87/381, Computer Science Division, University of California, Berkeley. Google Scholar
- Hill, M. D. [1988]. "A case for direct mapped caches," Computer 21:12 (December), 25-40. Google Scholar
- Hill, M. D. [1998]. "Multiprocessors should support simple memory consistency models," IEEE Computer 31:8 (August), 28-34. Google Scholar
- Hillis, W. D. [1985]. The Connection Machine, MIT Press, Cambridge, Mass.
- Hillis, W. D. and G. L. Steele [1986]. "Data parallel algorithms," Communications of the ACM 29:12 (December), 1170-1183. (http://doi.acm.org/10.1145/7902.7903). Google Scholar
- Hinton, G., D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel [2001]. "The microarchitecture of the Pentium 4 processor," Intel Technology Journal , February.Google Scholar
- Hintz, R. G., and D. P. Tate [1972]. "Control data STAR-100 processor design," Proc. IEEE COMPCON , September 12-14, 1972, San Francisco, 1-4.Google Scholar
- Hirata, H., K. Kimura, S. Nagamine, Y. Mochizuki, A. Nishimura, Y. Nakase, and T. Nishizawa [1992]. "An elementary processor architecture with simultaneous instruction issuing from multiple threads," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 136-145. Google Scholar
- Hitachi. [1997]. SuperH RISC Engine SH7700 Series Programming Manual , Hitachi, Santa Clara, Calif. (see www.halsp.hitachi.com/tech_prod/and search for title).Google Scholar
- Ho, R., K. W. Mai, and M. A. Horowitz [2001]. "The future of wires," Proc. of the IEEE 89:4 (April), 490-504.Google Scholar
- Hoagland, A. S. [1963]. Digital Magnetic Recording , Wiley, New York.Google Scholar
- Hockney, R. W., and C. R. Jesshope [1988]. Parallel Computers 2: Architectures , Programming and Algorithms , Adam Hilger, Ltd., Bristol, England. Google Scholar
- Holland, J. H. [1959]. "A universal computer capable of executing an arbitrary number of subprograms simultaneously," Proc. East Joint Computer Conf. 16, 108-113. Google Scholar
- Holt, R. C. [1972]. "Some deadlock properties of computer systems," ACM Computer Surveys 4:3 (September), 179-196. Google Scholar
- Hopkins, M. [2000]. "A critical look at IA-64: Massive resources, massive ILP, but can it deliver?" Microprocessor Report , February.Google Scholar
- Hord, R. M. [1982]. The Illiac-IV , The First Supercomputer , Computer Science Press, Rockville, Md.Google Scholar
- Horel, T., and G. Lauterbach [1999]. "UltraSPARC-III: Designing third-generation 64-bit performance," IEEE Micro 19:3 (May-June), 73-85. Google Scholar
- Hospodor, A. D., and A. S. Hoagland [1993]. "The changing nature of disk controllers," Proc. IEEE 81:4 (April), 586-594.
- Holzle, U. [2010]. "Brawny cores still beat wimpy cores, most of the time," IEEE Micro 30:4 (July/August).Google Scholar
- Hristea, C., D. Lenoski, and J. Keen [1997]. "Measuring memory hierarchy performance of cache-coherent multiprocessors using micro benchmarks," Proc. ACM/IEEE Conf. on Supercomputing , November 16-21, 1997, San Jose, Calif. Google Scholar
- Hsu, P. [1994]. "Designing the TFP microprocessor," IEEE Micro 14:2 (April), 23-33.
- Huck, J. et al. [2000]. "Introducing the IA-64 Architecture," IEEE Micro, 20:5 (September-October), 12-23.
- Hughes, C. J., P. Kaul, S. V. Adve, R. Jain, C. Park, and J. Srinivasan [2001]. "Variability in the execution of multimedia applications and implications for architecture," Proc. 28th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 30-July 4, 2001, Goteborg, Sweden, 254-265. Google Scholar
- Hwang, K. [1979]. Computer Arithmetic: Principles , Architecture , and Design , Wiley, New York. Google Scholar
- Hwang, K. [1993]. Advanced Computer Architecture and Parallel Programming , McGraw-Hill, New York.Google Scholar
- Hwu, W.-M., and Y. Patt [1986]. "HPSm, a high performance restricted data flow architecture having minimum functionality," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo, 297-307. Google Scholar
- Hwu, W. W., S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. O. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery [1993]. "The superblock: An effective technique for VLIW and superscalar compilation," J. Supercomputing 7:1, 2 (March), 229-248. Google Scholar
- IBM. [1982]. The Economic Value of Rapid Response Time , GE20-0752-0, IBM, White Plains, N.Y., 11-82.Google Scholar
- IBM. [1990]. "The IBM RISC System/6000 processor" (collection of papers), IBM J. Research and Development 34:1 (January).Google Scholar
- IBM. [1994]. The PowerPC Architecture , Morgan Kaufmann, San Francisco.Google Scholar
- IBM. [2005]. "Blue Gene," IBM J. Research and Development , 49:2/3 (special issue).Google Scholar
- IEEE. [1985]. "IEEE standard for binary floating-point arithmetic," SIGPLAN Notices 22:2, 9-25.Google Scholar
- IEEE. [2005]. "Intel virtualization technology," Computer, IEEE Computer Society 38:5 (May), 48-56.
- IEEE 754-2008 Working Group. [2006]. "DRAFT Standard for Floating-Point Arithmetic 754-2008," http://dx.doi.org/10.1109/IEEESTD.2008.4610935.
- Imprimis Product Specification , 97209 Sabre Disk Drive IPI-2 Interface 1.2 GB , Document No. 64402302, Imprimis, Dallas, Tex.Google Scholar
- InfiniBand Trade Association. [2001]. InfiniBand Architecture Specifications Release 1.0.a , www.infinibandta.org.Google Scholar
- Intel. [2001]. "Using MMX Instructions to Convert RGB to YUV Color Conversion," cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Legacy::irtm_AP548_9996& cntType=IDS_ EDITORIAL.Google Scholar
- Internet Retailer. [2005]. "The Gap launches a new site--after two weeks of downtime," Internet® Retailer , September 28, http://www.internetretailer.com/2005/09/28/thegap-launches-a-new-site-after-two-weeks-of-downtime.Google Scholar
- Jain, R. [1991]. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design , Measurement , Simulation , and Modeling , Wiley, New York.Google Scholar
- Jantsch, A., and H. Tenhunen (eds.) [2003]. Networks on Chips , Kluwer Academic Publishers, The Netherlands. Google Scholar
- Jimenez, D. A., and C. Lin [2002]. "Neural methods for dynamic branch prediction," ACM Trans. on Computer Systems 20:4 (November), 369-397. Google Scholar
- Johnson, M. [1990]. Superscalar Microprocessor Design , Prentice Hall, Englewood Cliffs, N. J.Google Scholar
- Jordan, H. F. [1983]. "Performance measurements on HEP--a pipelined MIMD computer," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 207-212. Google Scholar
- Jordan, K. E. [1987]. "Performance comparison of large-scale scientific processors: Scalar mainframes, mainframes with vector facilities, and supercomputers," Computer 20:3 (March), 10-23. Google Scholar
- Jouppi, N. P. [1990]. "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-31, 1990, Seattle, Wash., 364-373. Google Scholar
- Jouppi, N. P. [1998]. "Retrospective: Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," 25 Years of the International Symposia on Computer Architecture (Selected Papers) , ACM, New York, 71-73. Google Scholar
- Jouppi, N. P., and D. W. Wall [1989]. "Available instruction-level parallelism for superscalar and superpipelined processors," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 272-282. Google Scholar
- Jouppi, N. P., and S. J. E. Wilton [1994]. "Trade-offs in two-level on-chip caching," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA) , April 18-21, 1994, Chicago, 34-45. Google Scholar
- Kaeli, D. R., and P. G. Emma [1991]. "Branch history table prediction of moving target branches due to subroutine returns," Proc. 18th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 27-30, 1991, Toronto, Canada, 34-42. Google Scholar
- Kahan, W. [1990]. "On the advantage of the 8087's stack," unpublished course notes, Computer Science Division, University of California, Berkeley.
- Kahan, W. [1968]. "7094-II system support for numerical analysis," SHARE Secretarial Distribution SSD-159, Department of Computer Science, University of Toronto.Google Scholar
- Kahaner, D. K. [1988]. "Benchmarks for 'real' programs," SIAM News , November.Google Scholar
- Kahn, R. E. [1972]. "Resource-sharing computer communication networks," Proc. IEEE 60:11 (November), 1397-1407.Google Scholar
- Kane, G. [1986]. MIPS R2000 RISC Architecture , Prentice Hall, Englewood Cliffs, N. J.Google Scholar
- Kane, G. [1996]. PA-RISC 2.0 Architecture , Prentice Hall, Upper Saddle River, N. J. Google Scholar
- Kane, G., and J. Heinrich [1992]. MIPS RISC Architecture , Prentice Hall, Englewood Cliffs, N. J. Google Scholar
- Katz, R. H., D. A. Patterson, and G. A. Gibson [1989]. "Disk system architectures for high performance computing," Proc. IEEE 77:12 (December), 1842-1858.Google Scholar
- Keckler, S. W., and W. J. Dally [1992]. "Processor coupling: Integrating compile time and runtime scheduling for parallelism," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 202-213. Google Scholar
- Keller, R. M. [1975]. "Look-ahead processors," ACM Computing Surveys 7:4 (December), 177-195. Google Scholar
- Keltcher, C. N., K. J. McGrath, A. Ahmed, and P. Conway [2003]. "The AMD Opteron processor for multiprocessor servers," IEEE Micro 23:2 (March-April), 66-76 (dx.doi.org/10.1109. MM.2003.119116). Google Scholar
- Kembel, R. [2000]. "Fibre Channel: A comprehensive introduction," Internet Week , April. Google Scholar
- Kermani, P., and L. Kleinrock [1979]. "Virtual Cut-Through: A New Computer Communication Switching Technique," Computer Networks 3 (January), 267-286.Google Scholar
- Kessler, R. [1999]. "The Alpha 21264 microprocessor," IEEE Micro 19:2 (March/April), 24-36.
- Kilburn, T., D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner [1962]. "One-level storage system," IRE Trans. on Electronic Computers EC-11 (April), 223-235. Also appears in D. P. Siewiorek, C. G. Bell, and A. Newell, Computer Structures: Principles and Examples, McGraw-Hill, New York, 1982, 135-148.
- Killian, E. [1991]. "MIPS R4000 technical overview-64 bits/100 MHz or bust," Hot Chips III Symposium Record , August 26-27, 1991, Stanford University, Palo Alto, Calif., 1.6-1.19.Google Scholar
- Kim, M. Y. [1986]. "Synchronized disk interleaving," IEEE Trans. on Computers C-35:11 (November), 978-988.
- Kissell, K. D. [1997]. "MIPS16: High-density for the embedded market," Proc. Real Time Systems '97, June 15, 1997, Las Vegas, Nev. (see www.sgi.com/MIPS/arch/MIPS16/MIPS16.whitepaper.pdf).
- Kitagawa, K., S. Tagaya, Y. Hagihara, and Y. Kanoh [2003]. "A hardware overview of SX-6 and SX-7 supercomputer," NEC Research & Development J. 44:1 (January), 2-7.
- Knuth, D. [1981]. The Art of Computer Programming, Vol. II, 2nd ed., Addison-Wesley, Reading, Mass.
- Kogge, P. M. [1981]. The Architecture of Pipelined Computers, McGraw-Hill, New York.
- Kohn, L., and S.-W. Fu [1989]. "A 1,000,000 transistor microprocessor," Proc. of IEEE Int'l. Symposium on Solid State Circuits (ISSCC), February 15-17, 1989, New York, 54-55.
- Kohn, L., and N. Margulis [1989]. "Introducing the Intel i860 64-Bit Microprocessor," IEEE Micro 9:4 (July), 15-30.
- Kontothanassis, L., G. Hunt, R. Stets, N. Hardavellas, M. Cierniak, S. Parthasarathy, W. Meira, S. Dwarkadas, and M. Scott [1997]. "VM-based shared memory on low-latency, remote-memory-access networks," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo.
- Koren, I. [1989]. Computer Arithmetic Algorithms, Prentice Hall, Englewood Cliffs, N. J.
- Kozyrakis, C. [2000]. "Vector IRAM: A media-oriented vector processor with embedded DRAM," paper presented at Hot Chips 12, August 13-15, 2000, Palo Alto, Calif., 13-15.
- Kozyrakis, C., and D. Patterson [2002]. "Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks," Proc. 35th Annual Int'l. Symposium on Microarchitecture (MICRO-35), November 18-22, 2002, Istanbul, Turkey.
- Kroft, D. [1981]. "Lockup-free instruction fetch/prefetch cache organization," Proc. Eighth Annual Int'l. Symposium on Computer Architecture (ISCA), May 12-14, 1981, Minneapolis, Minn., 81-87.
- Kroft, D. [1998]. "Retrospective: Lockup-free instruction fetch/prefetch cache organization," 25 Years of the International Symposia on Computer Architecture (Selected Papers), ACM, New York, 20-21.
- Kuck, D., P. P. Budnik, S.-C. Chen, D. H. Lawrie, R. A. Towle, R. E. Strebendt, E. W. Davis, Jr., J. Han, P. W. Kraska, and Y. Muraoka [1974]. "Measurements of parallelism in ordinary FORTRAN programs," Computer 7:1 (January), 37-46.
- Kuhn, D. R. [1997]. "Sources of failure in the public switched telephone network," IEEE Computer 30:4 (April), 31-36.
- Kumar, A. [1997]. "The HP PA-8000 RISC CPU," IEEE Micro 17:2 (March/April), 27-32.
- Kunimatsu, A., N. Ide, T. Sato, Y. Endo, H. Murakami, T. Kamei, M. Hirano, F. Ishihara, H. Tago, M. Oka, A. Ohba, T. Yutaka, T. Okada, and M. Suzuoki [2000]. "Vector unit architecture for emotion synthesis," IEEE Micro 20:2 (March-April), 40-47.
- Kunkel, S. R., and J. E. Smith [1986]. "Optimal pipelining in supercomputers," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-5, 1986, Tokyo, 404-414.
- Kurose, J. F., and K. W. Ross [2001]. Computer Networking: A Top-Down Approach Featuring the Internet, Addison-Wesley, Boston.
- Kuskin, J., D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. Horowitz, A. Gupta, M. Rosenblum, and J. L. Hennessy [1994]. "The Stanford FLASH multiprocessor," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA), April 18-21, 1994, Chicago.
- Lam, M. [1988]. "Software pipelining: An effective scheduling technique for VLIW processors," SIGPLAN Conf. on Programming Language Design and Implementation, June 22-24, 1988, Atlanta, Ga., 318-328.
- Lam, M. S., and R. P. Wilson [1992]. "Limits of control flow on parallelism," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA), May 19-21, 1992, Gold Coast, Australia, 46-57.
- Lam, M. S., E. E. Rothberg, and M. E. Wolf [1991]. "The cache performance and optimizations of blocked algorithms," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 8-11, 1991, Santa Clara, Calif. (SIGPLAN Notices 26:4 (April), 63-74).
- Lambright, D. [2000]. "Experiences in measuring the reliability of a cache-based storage system," Proc. of First Workshop on Industrial Experiences with Systems Software (WIESS 2000), Co-Located with the 4th Symposium on Operating Systems Design and Implementation (OSDI), October 22, 2000, San Diego, Calif.
- Lamport, L. [1979]. "How to make a multiprocessor computer that correctly executes multiprocess programs," IEEE Trans. on Computers C-28:9 (September), 241-248.
- Lang, W., J. M. Patel, and S. Shankar [2010]. "Wimpy node clusters: What about non-wimpy workloads?" Proc. Sixth International Workshop on Data Management on New Hardware (DaMoN), June 7, Indianapolis, Ind.
- Laprie, J.-C. [1985]. "Dependable computing and fault tolerance: Concepts and terminology," Proc. 15th Annual Int'l. Symposium on Fault-Tolerant Computing, June 19-21, 1985, Ann Arbor, Mich., 2-11.
- Larson, E. R. [1973]. "Findings of fact, conclusions of law, and order for judgment," File No. 4-67, Civ. 138, Honeywell v. Sperry-Rand and Illinois Scientific Development, U. S. District Court for the State of Minnesota, Fourth Division (October 19).
- Laudon, J., and D. Lenoski [1997]. "The SGI Origin: A ccNUMA highly scalable server," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo., 241-251.
- Laudon, J., A. Gupta, and M. Horowitz [1994]. "Interleaving: A multithreading technique targeting multiprocessors and workstations," Proc. Sixth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 4-7, San Jose, Calif., 308-318.
- Lauterbach, G., and T. Horel [1999]. "UltraSPARC-III: Designing third generation 64-bit performance," IEEE Micro 19:3 (May/June).
- Lazowska, E. D., J. Zahorjan, G. S. Graham, and K. C. Sevcik [1984]. Quantitative System Performance: Computer System Analysis Using Queueing Network Models, Prentice Hall, Englewood Cliffs, N. J. (Although out of print, it is available online at www.cs.washington.edu/homes/lazowska/qsp/.)
- Lebeck, A. R., and D. A. Wood [1994]. "Cache profiling and the SPEC benchmarks: A case study," Computer 27:10 (October), 15-26.
- Lee, R. [1989]. "Precision architecture," Computer 22:1 (January), 78-91.
- Lee, W. V. et al. [2010]. "Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU," Proc. 37th Annual Int'l. Symposium on Computer Architecture (ISCA), June 19-23, 2010, Saint-Malo, France.
- Leighton, F. T. [1992]. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes, Morgan Kaufmann, San Francisco.
- Leiner, A. L. [1954]. "System specifications for the DYSEAC," J. ACM 1:2 (April), 57-81.
- Leiner, A. L., and S. N. Alexander [1954]. "System organization of the DYSEAC," IRE Trans. of Electronic Computers EC-3:1 (March), 1-10.
- Leiserson, C. E. [1985]. "Fat trees: Universal networks for hardware-efficient supercomputing," IEEE Trans. on Computers C-34:10 (October), 892-901.
- Lenoski, D., J. Laudon, K. Gharachorloo, A. Gupta, and J. L. Hennessy [1990]. "The Stanford DASH multiprocessor," Proc. 17th Annual Int'l. Symposium on Computer Architecture (ISCA), May 28-31, 1990, Seattle, Wash., 148-159.
- Lenoski, D., J. Laudon, K. Gharachorloo, W.-D. Weber, A. Gupta, J. L. Hennessy, M. A. Horowitz, and M. Lam [1992]. "The Stanford DASH multiprocessor," IEEE Computer 25:3 (March), 63-79.
- Levy, H., and R. Eckhouse [1989]. Computer Programming and Architecture: The VAX, Digital Press, Boston.
- Li, K. [1988]. "IVY: A shared virtual memory system for parallel computing," Proc. 1988 Int'l. Conf. on Parallel Processing, Pennsylvania State University Press, University Park, Penn.
- Li, S., K. Chen, J. B. Brockman, and N. Jouppi [2011]. "Performance Impacts of Nonblocking Caches in Out-of-order Processors," HP Labs Tech Report HPL-2011-65 (full text available at http://Library.hp.com/techpubs/2011/Hpl-2011-65.html).
- Lim, K., P. Ranganathan, J. Chang, C. Patel, T. Mudge, and S. Reinhardt [2008]. "Understanding and designing new system architectures for emerging warehouse-computing environments," Proc. 35th Annual Int'l. Symposium on Computer Architecture (ISCA), June 21-25, 2008, Beijing, China.
- Lincoln, N. R. [1982]. "Technology and design trade offs in the creation of a modern supercomputer," IEEE Trans. on Computers C-31:5 (May), 363-376.
- Lindholm, T., and F. Yellin [1999]. The Java Virtual Machine Specification, 2nd ed., Addison-Wesley, Reading, Mass. (also available online at java.sun.com/docs/books/vmspec/).
- Lipasti, M. H., and J. P. Shen [1996]. "Exceeding the dataflow limit via value prediction," Proc. 29th Int'l. Symposium on Microarchitecture, December 2-4, 1996, Paris, France.
- Lipasti, M. H., C. B. Wilkerson, and J. P. Shen [1996]. "Value locality and load value prediction," Proc. Seventh Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 1-5, 1996, Cambridge, Mass., 138-147.
- Liptay, J. S. [1968]. "Structural aspects of the System/360 Model 85, Part II: The cache," IBM Systems J. 7:1, 15-21.
- Lo, J., L. Barroso, S. Eggers, K. Gharachorloo, H. Levy, and S. Parekh [1998]. "An analysis of database workload performance on simultaneous multithreaded processors," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA), July 3-14, 1998, Barcelona, Spain, 39-50.
- Lo, J., S. Eggers, J. Emer, H. Levy, R. Stamm, and D. Tullsen [1997]. "Converting thread-level parallelism into instruction-level parallelism via simultaneous multithreading," ACM Trans. on Computer Systems 15:2 (August), 322-354.
- Lovett, T., and S. Thakkar [1988]. "The Symmetry multiprocessor system," Proc. 1988 Int'l. Conf. of Parallel Processing, University Park, Penn., 303-310.
- Lubeck, O., J. Moore, and R. Mendez [1985]. "A benchmark comparison of three supercomputers: Fujitsu VP-200, Hitachi S810/20, and Cray X-MP/2," Computer 18:12 (December), 10-24.
- Luk, C.-K., and T. C. Mowry [1999]. "Automatic compiler-inserted prefetching for pointer-based applications," IEEE Trans. on Computers 48:2 (February), 134-141.
- Lunde, A. [1977]. "Empirical evaluation of some features of instruction set processor architecture," Communications of the ACM 20:3 (March), 143-152.
- Luszczek, P., J. J. Dongarra, D. Koester, R. Rabenseifner, B. Lucas, J. Kepner, J. McCalpin, D. Bailey, and D. Takahashi [2005]. "Introduction to the HPC challenge benchmark suite," Lawrence Berkeley National Laboratory, Paper LBNL-57493 (April 25), repositories.cdlib.org/lbnl/LBNL-57493.
- Maberly, N. C. [1966]. Mastering Speed Reading, New American Library, New York.
- Magenheimer, D. J., L. Peters, K. W. Pettis, and D. Zuras [1988]. "Integer multiplication and division on the HP precision architecture," IEEE Trans. on Computers 37:8, 980-990.
- Mahlke, S. A., W. Y. Chen, W.-M. Hwu, B. R. Rau, and M. S. Schlansker [1992]. "Sentinel scheduling for VLIW and superscalar processors," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 12-15, 1992, Boston, 238-247.
- Mahlke, S. A., R. E. Hank, J. E. McCormick, D. I. August, and W. W. Hwu [1995]. "A comparison of full and partial predicated execution support for ILP processors," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA), June 22-24, 1995, Santa Margherita, Italy, 138-149.
- Major, J. B. [1989]. "Are queuing models within the grasp of the unwashed?" Proc. Int'l. Conf. on Management and Performance Evaluation of Computer Systems, December 11-15, 1989, Reno, Nev., 831-839.
- Markstein, P. W. [1990]. "Computation of elementary functions on the IBM RISC System/6000 processor," IBM J. Research and Development 34:1, 111-119.
- Mathis, H. M., A. E. Mericas, J. D. McCalpin, R. J. Eickemeyer, and S. R. Kunkel [2005]. "Characterization of the multithreading (SMT) efficiency in Power5," IBM J. Research and Development 49:4/5 (July/September), 555-564.
- McCalpin, J. [2005]. "STREAM: Sustainable Memory Bandwidth in High Performance Computers," www.cs.virginia.edu/stream/.
- McCalpin, J., D. Bailey, and D. Takahashi [2005]. Introduction to the HPC Challenge Benchmark Suite, Paper LBNL-57493, Lawrence Berkeley National Laboratory, University of California, Berkeley, repositories.cdlib.org/lbnl/LBNL-57493.
- McCormick, J., and A. Knies [2002]. "A brief analysis of the SPEC CPU2000 benchmarks on the Intel Itanium 2 processor," paper presented at Hot Chips 14, August 18-20, 2002, Stanford University, Palo Alto, Calif.
- McFarling, S. [1989]. "Program optimization for instruction caches," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April 3-6, 1989, Boston, 183-191.
- McFarling, S. [1993]. Combining Branch Predictors, WRL Technical Note TN-36, Digital Western Research Laboratory, Palo Alto, Calif.
- McFarling, S., and J. Hennessy [1986]. "Reducing the cost of branches," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-5, 1986, Tokyo, 396-403.
- McGhan, H., and M. O'Connor [1998]. "PicoJava: A direct execution engine for Java bytecode," Computer 31:10 (October), 22-30.
- McKeeman, W. M. [1967]. "Language directed computer design," Proc. AFIPS Fall Joint Computer Conf., November 14-16, 1967, Washington, D.C., 413-417.
- McMahon, F. M. [1986]. "The Livermore FORTRAN Kernels: A Computer Test of Numerical Performance Range," Tech. Rep. UCRL-55745, Lawrence Livermore National Laboratory, University of California, Livermore.
- McNairy, C., and D. Soltis [2003]. "Itanium 2 processor microarchitecture," IEEE Micro 23:2 (March-April), 44-55.
- Mead, C., and L. Conway [1980]. Introduction to VLSI Systems, Addison-Wesley, Reading, Mass.
- Mellor-Crummey, J. M., and M. L. Scott [1991]. "Algorithms for scalable synchronization on shared-memory multiprocessors," ACM Trans. on Computer Systems 9:1 (February), 21-65.
- Menabrea, L. F. [1842]. "Sketch of the analytical engine invented by Charles Babbage," Bibliothèque Universelle de Genève 82 (October).
- Menon, A., J. Renato Santos, Y. Turner, G. Janakiraman, and W. Zwaenepoel [2005]. "Diagnosing performance overheads in the Xen virtual machine environment," Proc. First ACM/USENIX Int'l. Conf. on Virtual Execution Environments, June 11-12, 2005, Chicago, 13-23.
- Merlin, P. M., and P. J. Schweitzer [1980]. "Deadlock avoidance in store-and-forward networks. Part I. Store-and-forward deadlock," IEEE Trans. on Communications COM-28:3 (March), 345-354.
- Metcalfe, R. M. [1993]. "Computer/network interface design: Lessons from Arpanet and Ethernet," IEEE J. on Selected Areas in Communications 11:2 (February), 173-180.
- Metcalfe, R. M., and D. R. Boggs [1976]. "Ethernet: Distributed packet switching for local computer networks," Communications of the ACM 19:7 (July), 395-404.
- Metropolis, N., J. Howlett, and G. C. Rota (eds.) [1980]. A History of Computing in the Twentieth Century, Academic Press, New York.
- Meyer, R. A., and L. H. Seawright [1970]. "A virtual machine time sharing system," IBM Systems J. 9:3, 199-218.
- Meyers, G. J. [1978]. "The evaluation of expressions in a storage-to-storage architecture," Computer Architecture News 7:3 (October), 20-23.
- Meyers, G. J. [1982]. Advances in Computer Architecture, 2nd ed., Wiley, New York.
- Micron. [2004]. "Calculating Memory System Power for DDR2," http://download.micron.com/pdf/pubs/designline/dl1Q04.pdf.
- Micron. [2006]. "The Micron® System-Power Calculator," http://www.micron.com/systemcalc.
- MIPS. [1997]. "MIPS16 Application Specific Extension Product Description," www.sgi.com/MIPS/arch/MIPS16/mips16.pdf.
- Miranker, G. S., J. Rubenstein, and J. Sanguinetti [1988]. "Squeezing a Cray-class supercomputer into a single-user package," Proc. IEEE COMPCON, February 29-March 4, 1988, San Francisco, 452-456.
- Mitchell, D. [1989]. "The Transputer: The time is now," Computer Design (RISC suppl.), 40-41.
- Mitsubishi. [1996]. Mitsubishi 32-Bit Single Chip Microcomputer M32R Family Software Manual, Mitsubishi, Cypress, Calif.
- Miura, K., and K. Uchida [1983]. "FACOM vector processing system: VP100/200," Proc. NATO Advanced Research Workshop on High-Speed Computing, June 20-22, 1983, Jülich, West Germany. Also appears in K. Hwang, ed., "Superprocessors: Design and applications," IEEE (August 1984), 59-73.
- Miya, E. N. [1985]. "Multiprocessor/distributed processing bibliography," Computer Architecture News 13:1, 27-29.
- Montoye, R. K., E. Hokenek, and S. L. Runyon [1990]. "Design of the IBM RISC System/6000 floating-point execution," IBM J. Research and Development 34:1, 59-70.
- Moore, B., A. Padegs, R. Smith, and W. Bucholz [1987]. "Concepts of the System/370 vector architecture," 14th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-5, 1987, Pittsburgh, Penn., 282-292.
- Moore, G. E. [1965]. "Cramming more components onto integrated circuits," Electronics 38:8 (April 19), 114-117.
- Morse, S., B. Ravenal, S. Mazor, and W. Pohlman [1980]. "Intel microprocessors--8080 to 8086," Computer 13:10 (October).
- Moshovos, A., and G. S. Sohi [1997]. "Streamlining inter-operation memory communication via data dependence prediction," Proc. 30th Annual Int'l. Symposium on Microarchitecture, December 1-3, Research Triangle Park, N.C., 235-245.
- Moshovos, A., S. Breach, T. N. Vijaykumar, and G. S. Sohi [1997]. "Dynamic speculation and synchronization of data dependences," 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo.
- Moussouris, J., L. Crudele, D. Freitas, C. Hansen, E. Hudson, S. Przybylski, T. Riordan, and C. Rowen [1986]. "A CMOS RISC processor with integrated system functions," Proc. IEEE COMPCON, March 3-6, 1986, San Francisco, 191.
- Mowry, T. C., S. Lam, and A. Gupta [1992]. "Design and evaluation of a compiler algorithm for prefetching," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 12-15, 1992, Boston (SIGPLAN Notices 27:9 (September), 62-73).
- MSN Money. [2005]. "Amazon Shares Tumble after Rally Fizzles," http://moneycentral.msn.com/content/CNBCTV/Articles/Dispatches/P133695.asp.
- Muchnick, S. S. [1988]. "Optimizing compilers for SPARC," Sun Technology 1:3 (Summer), 64-77.
- Mueller, M., L. C. Alves, W. Fischer, M. L. Fair, and I. Modi [1999]. "RAS strategy for IBM S/390 G5 and G6," IBM J. Research and Development 43:5-6 (September-November), 875-888.
- Mukherjee, S. S., C. Weaver, J. S. Emer, S. K. Reinhardt, and T. M. Austin [2003]. "Measuring architectural vulnerability factors," IEEE Micro 23:6, 70-75.
- Murphy, B., and T. Gent [1995]. "Measuring system and software reliability using an automated data collection process," Quality and Reliability Engineering International 11:5 (September-October), 341-353.
- Myer, T. H., and I. E. Sutherland [1968]. "On the design of display processors," Communications of the ACM 11:6 (June), 410-414.
- Narayanan, D., E. Thereska, A. Donnelly, S. Elnikety, and A. Rowstron [2009]. "Migrating server storage to SSDs: Analysis of trade-offs," Proc. 4th ACM European Conf. on Computer Systems, April 1-3, 2009, Nuremberg, Germany.
- National Research Council. [1997]. The Evolution of Untethered Communications, Computer Science and Telecommunications Board, National Academy Press, Washington, D.C.
- National Storage Industry Consortium. [1998]. "Tape Roadmap," www.nsic.org.
- Nelson, V. P. [1990]. "Fault-tolerant computing: Fundamental concepts," Computer 23:7 (July), 19-25.
- Ngai, T.-F., and M. J. Irwin [1985]. "Regular, area-time efficient carry-lookahead adders," Proc. Seventh IEEE Symposium on Computer Arithmetic, June 4-6, 1985, University of Illinois, Urbana, 9-15.
- Nicolau, A., and J. A. Fisher [1984]. "Measuring the parallelism available for very long instruction word architectures," IEEE Trans. on Computers C-33:11 (November), 968-976.
- Nikhil, R. S., G. M. Papadopoulos, and Arvind [1992]. "*T: A multithreaded massively parallel architecture," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA), May 19-21, 1992, Gold Coast, Australia, 156-167.
- Noordergraaf, L., and R. van der Pas [1999]. "Performance experiences on Sun's WildFire prototype," Proc. ACM/IEEE Conf. on Supercomputing, November 13-19, 1999, Portland, Ore.
- Nyberg, C. R., T. Barclay, Z. Cvetanovic, J. Gray, and D. Lomet [1994]. "AlphaSort: A RISC machine sort," Proc. ACM SIGMOD, May 24-27, 1994, Minneapolis, Minn.
- Oka, M., and M. Suzuoki [1999]. "Designing and programming the emotion engine," IEEE Micro 19:6 (November-December), 20-28.
- Okada, S., S. Okada, Y. Matsuda, T. Yamada, and A. Kobayashi [1999]. "System on a chip for digital still camera," IEEE Trans. on Consumer Electronics 45:3 (August), 584-590.
- Oliker, L., A. Canning, J. Carter, J. Shalf, and S. Ethier [2004]. "Scientific computations on modern parallel vector systems," Proc. ACM/IEEE Conf. on Supercomputing, November 6-12, 2004, Pittsburgh, Penn., 10.
- Pabst, T. [2000]. "Performance Showdown at 133 MHz FSB--The Best Platform for Coppermine," www6.tomshardware.com/mainboard/00q1/000302/.
- Padua, D., and M. Wolfe [1986]. "Advanced compiler optimizations for supercomputers," Communications of the ACM 29:12 (December), 1184-1201.
- Palacharla, S., and R. E. Kessler [1994]. "Evaluating stream buffers as a secondary cache replacement," Proc. 21st Annual Int'l. Symposium on Computer Architecture (ISCA), April 18-21, 1994, Chicago, 24-33.
- Palmer, J., and S. Morse [1984]. The 8087 Primer, John Wiley & Sons, New York, 93.
- Pan, S.-T., K. So, and J. T. Rameh [1992]. "Improving the accuracy of dynamic branch prediction using branch correlation," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 12-15, 1992, Boston, 76-84.
- Partridge, C. [1994]. Gigabit Networking, Addison-Wesley, Reading, Mass.
- Patterson, D. [1985]. "Reduced instruction set computers," Communications of the ACM 28:1 (January), 8-21.
- Patterson, D. [2004]. "Latency lags bandwidth," Communications of the ACM 47:10 (October), 71-75.
- Patterson, D. A., and D. R. Ditzel [1980]. "The case for the reduced instruction set computer," Computer Architecture News 8:6 (October), 25-33.
- Patterson, D. A., and J. L. Hennessy [2004]. Computer Organization and Design: The Hardware/Software Interface, 3rd ed., Morgan Kaufmann, San Francisco.
- Patterson, D. A., G. A. Gibson, and R. H. Katz [1987]. A Case for Redundant Arrays of Inexpensive Disks (RAID), Tech. Rep. UCB/CSD 87/391, University of California, Berkeley. Also appeared in Proc. ACM SIGMOD, June 1-3, 1988, Chicago, 109-116.
- Patterson, D. A., P. Garrison, M. Hill, D. Lioupis, C. Nyberg, T. Sippel, and K. Van Dyke [1983]. "Architecture of a VLSI instruction cache for a RISC," 10th Annual Int'l. Conf. on Computer Architecture Conf. Proc., June 13-16, 1983, Stockholm, Sweden, 108-116.
- Pavan, P., R. Bez, P. Olivo, and E. Zanoni [1997]. "Flash memory cells--an overview," Proc. IEEE 85:8 (August), 1248-1271.
- Peh, L. S., and W. J. Dally [2001]. "A delay model and speculative architecture for pipelined routers," Proc. 7th Int'l. Symposium on High-Performance Computer Architecture, January 22-24, 2001, Monterrey, Mexico.
- Peng, V., S. Samudrala, and M. Gavrielov [1987]. "On the implementation of shifters, multipliers, and dividers in VLSI floating point units," Proc. 8th IEEE Symposium on Computer Arithmetic, May 19-21, 1987, Como, Italy, 95-102.
- Pfister, G. F. [1998]. In Search of Clusters, 2nd ed., Prentice Hall, Upper Saddle River, N. J.
- Pfister, G. F., W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfelder, K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss [1985]. "The IBM research parallel processor prototype (RP3): Introduction and architecture," Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA), June 17-19, 1985, Boston, Mass., 764-771.
- Pinheiro, E., W. D. Weber, and L. A. Barroso [2007]. "Failure trends in a large disk drive population," Proc. 5th USENIX Conference on File and Storage Technologies (FAST '07), February 13-16, 2007, San Jose, Calif.
- Pinkston, T. M. [2004]. "Deadlock characterization and resolution in interconnection networks," in M. C. Zhu and M. P. Fanti, eds., Deadlock Resolution in Computer-Integrated Systems, CRC Press, Boca Raton, FL, 445-492.
- Pinkston, T. M., and J. Shin [2005]. "Trends toward on-chip networked microsystems," Int'l. J. of High Performance Computing and Networking 3:1, 3-18.
- Pinkston, T. M., and S. Warnakulasuriya [1997]. "On deadlocks in interconnection networks," 24th Annual Int'l. Symposium on Computer Architecture (ISCA), June 2-4, 1997, Denver, Colo.
- Pinkston, T. M., A. Benner, M. Krause, I. Robinson, and T. Sterling [2003]. "InfiniBand: The 'de facto' future standard for system and local area networks or just a scalable replacement for PCI buses?" Cluster Computing (special issue on communication architecture for clusters) 6:2 (April), 95-104.
- Postiff, M. A., D. A. Greene, G. S. Tyson, and T. N. Mudge [1999]. "The limits of instruction level parallelism in SPEC95 applications," Computer Architecture News 27:1 (March), 31-40.
- Przybylski, S. A. [1990]. Cache Design: A Performance-Directed Approach, Morgan Kaufmann, San Francisco.
- Przybylski, S. A., M. Horowitz, and J. L. Hennessy [1988]. "Performance trade-offs in cache design," 15th Annual Int'l. Symposium on Computer Architecture, May 30-June 2, 1988, Honolulu, Hawaii, 290-298.
- Puente, V., R. Beivide, J. A. Gregorio, J. M. Prellezo, J. Duato, and C. Izu [1999]. "Adaptive bubble router: A design to improve performance in torus networks," Proc. 28th Int'l. Conference on Parallel Processing, September 21-24, 1999, Aizu-Wakamatsu, Fukushima, Japan.
- Radin, G. [1982]. "The 801 minicomputer," Proc. Symposium Architectural Support for Programming Languages and Operating Systems (ASPLOS), March 1-3, 1982, Palo Alto, Calif., 39-47.
- Bordawekar, R., U. Bondhugula, and R. Rao [2010]. "Believe it or not!: Multi-core CPUs can match GPU performance for a FLOP-intensive application!" 19th International Conference on Parallel Architecture and Compilation Techniques (PACT 2010), September 11-15, 2010, Vienna, Austria, 537-538.
- Ramamoorthy, C. V., and H. F. Li [1977]. "Pipeline architecture," ACM Computing Surveys 9:1 (March), 61-102.
- Ranganathan, P., P. Leech, D. Irwin, and J. Chase [2006]. "Ensemble-Level Power Management for Dense Blade Servers," Proc. 33rd Annual Int'l. Symposium on Computer Architecture (ISCA), June 17-21, 2006, Boston, Mass., 66-77.
- Rau, B. R. [1994]. "Iterative modulo scheduling: An algorithm for software pipelining loops," Proc. 27th Annual Int'l. Symposium on Microarchitecture, November 30-December 2, 1994, San Jose, Calif., 63-74.
- Rau, B. R., C. D. Glaeser, and R. L. Picard [1982]. "Efficient code generation for horizontal architectures: Compiler techniques and architectural support," Proc. Ninth Annual Int'l. Symposium on Computer Architecture (ISCA), April 26-29, 1982, Austin, Tex., 131-139.
- Rau, B. R., D. W. L. Yen, W. Yen, and R. A. Towle [1989]. "The Cydra 5 departmental supercomputer: Design philosophies, decisions, and trade-offs," IEEE Computer 22:1 (January), 12-34.
- Reddi, V. J., B. C. Lee, T. Chilimbi, and K. Vaid [2010]. "Web search using mobile cores: Quantifying and mitigating the price of efficiency," Proc. 37th Annual Int'l. Symposium on Computer Architecture (ISCA), June 19-23, 2010, Saint-Malo, France.
- Redmond, K. C., and T. M. Smith [1980]. Project Whirlwind--The History of a Pioneer Computer, Digital Press, Boston.
- Reinhardt, S. K., J. R. Larus, and D. A. Wood [1994]. "Tempest and Typhoon: User-level shared memory," 21st Annual Int'l. Symposium on Computer Architecture (ISCA), April 18-21, 1994, Chicago, 325-336.
- Reinman, G., and N. P. Jouppi [1999]. "Extensions to CACTI," research.compaq.com/wrl/people/jouppi/CACTI.html.
- Rettberg, R. D., W. R. Crowther, P. P. Carvey, and R. S. Tomlinson [1990]. "The Monarch parallel processor hardware design," IEEE Computer 23:4 (April), 18-30.
- Riemens, A., K. A. Vissers, R. J. Schutten, F. W. Sijstermans, G. J. Hekstra, and G. D. La Hei [1999]. "Trimedia CPU64 application domain and benchmark suite," Proc. IEEE Int'l. Conf. on Computer Design: VLSI in Computers and Processors (ICCD'99), October 10-13, 1999, Austin, Tex., 580-585.
- Riseman, E. M., and C. C. Foster [1972]. "Percolation of code to enhance parallel dispatching and execution," IEEE Trans. on Computers C-21:12 (December), 1411-1415.
- Robin, J., and C. Irvine [2000]. "Analysis of the Intel Pentium's ability to support a secure virtual machine monitor," Proc. USENIX Security Symposium, August 14-17, 2000, Denver, Colo.
- Robinson, B., and L. Blount [1986]. The VM/HPO 3880-23 Performance Results , IBM Tech. Bulletin GG66-0247-00, IBM Washington Systems Center, Gaithersburg, Md.Google Scholar
- Ropers, A., H. W. Lollman, and J. Wellhausen [1999]. DSPstone: Texas Instruments TMS320C54x, Tech. Rep. IB 315 1999/9-ISS-Version 0.9, Aachen University of Technology, Aachen, Germany (www.ert.rwth-aachen.de/Projekte/Tools/coal/dspstone_c54x/index.html).
- Rosenblum, M., S. A. Herrod, E. Witchel, and A. Gupta [1995]. "Complete computer simulation: The SimOS approach," in IEEE Parallel and Distributed Technology (now called Concurrency ) 4:3, 34-43. Google Scholar
- Rowen, C., M. Johnson, and P. Ries [1988]. "The MIPS R3010 floating-point coprocessor," IEEE Micro 8:3 (June), 53-62. Google Scholar
- Russell, R. M. [1978]. "The Cray-1 processor system," Communications of the ACM 21:1 (January), 63-72. Google Scholar
- Rymarczyk, J. [1982]. "Coding guidelines for pipelined processors," Proc. Symposium Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 12-19. Google Scholar
- Saavedra-Barrera, R. H. [1992]. "CPU Performance Evaluation and Execution Time Prediction Using Narrow Spectrum Benchmarking," Ph. D. dissertation, University of California, Berkeley. Google Scholar
- Salem, K., and H. Garcia-Molina [1986]. "Disk striping," Proc. 2nd Int'l. IEEE Conf. on Data Engineering , February 5-7, 1986, Washington, D.C., 249-259. Google Scholar
- Saltzer, J. H., D. P. Reed, and D. D. Clark [1984]. "End-to-end arguments in system design," ACM Trans. on Computer Systems 2:4 (November), 277-288. Google Scholar
- Samples, A. D., and P. N. Hilfinger [1988]. Code Reorganization for Instruction Caches , Tech. Rep. UCB/CSD 88/447, University of California, Berkeley. Google Scholar
- Santoro, M. R., G. Bewick, and M. A. Horowitz [1989]. "Rounding algorithms for IEEE multipliers," Proc. Ninth IEEE Symposium on Computer Arithmetic , September 6-8, Santa Monica, Calif., 176-183.Google Scholar
- Satran, J., D. Smith, K. Meth, C. Sapuntzakis, M. Wakeley, P. Von Stamwitz, R. Haagens, E. Zeidner, L. Dalle Ore, and Y. Klein [2001]. "iSCSI," IPS Working Group of IETF, Internet draft www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-07.txt.Google Scholar
- Saulsbury, A., T. Wilkinson, J. Carter, and A. Landin [1995]. "An argument for Simple COMA," Proc. First IEEE Symposium on High-Performance Computer Architectures , January 22-25, 1995, Raleigh, N.C., 276-285. Google Scholar
- Schneck, P. B. [1987]. Superprocessor Architecture , Kluwer Academic Publishers, Norwell, Mass.Google Scholar
- Schroeder, B., and G. A. Gibson [2007]. "Understanding failures in petascale computers," J. of Physics Conf. Series 78(1), 188-198.Google Scholar
- Schroeder, B., E. Pinheiro, and W.-D. Weber [2009]. "DRAM errors in the wild: A large-scale field study," Proc. Eleventh Int'l. Joint Conf. on Measurement and Modeling of Computer Systems (SIGMETRICS), June 15-19, 2009, Seattle, Wash.
- Schurman, E., and J. Brutlag [2009]. "The user and business impact of server delays," Proc. Velocity: Web Performance and Operations Conf. , June 22-24, 2009, San Jose, Calif.Google Scholar
- Schwartz, J. T. [1980]. "Ultracomputers," ACM Trans. on Programming Languages and Systems 2:4, 484-521.
- Scott, N. R. [1985]. Computer Number Systems and Arithmetic , Prentice Hall, Englewood Cliffs, N. J. Google Scholar
- Scott, S. L. [1996]. "Synchronization and communication in the T3E multiprocessor," Seventh Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 1-5, 1996, Cambridge, Mass. Google Scholar
- Scott, S. L., and J. Goodman [1994]. "The impact of pipelined channels on k -ary n -cube networks," IEEE Trans. on Parallel and Distributed Systems 5:1 (January), 1-16. Google Scholar
- Scott, S. L., and G. M. Thorson [1996]. "The Cray T3E network: Adaptive routing in a high performance 3D torus," Proc. IEEE HOT Interconnects '96 , August 15-17, 1996, Stanford University, Palo Alto, Calif., 14-156.Google Scholar
- Scranton, R. A., D. A. Thompson, and D. W. Hunter [1983]. The Access Time Myth, Tech. Rep. RC 10197 (45223), IBM, Yorktown Heights, N.Y.
- Seagate. [2000]. Seagate Cheetah 73 Family: ST173404LW/LWV/LC/LCV Product Manual , Vol. 1, Seagate, Scotts Valley, Calif. (www.seagate.com/support/disc/manuals/scsi/29478b.pdf).Google Scholar
- Seitz, C. L. [1985]. "The Cosmic Cube (concurrent computing)," Communications of the ACM 28:1 (January), 22-33. Google Scholar
- Senior, J. M. [1993]. Optical Fiber Communications: Principles and Practice, 2nd ed., Prentice Hall, Hertfordshire, U. K.
- Sharangpani, H., and K. Arora [2000]. "Itanium Processor Microarchitecture," IEEE Micro 20:5 (September-October), 24-43. Google Scholar
- Shurkin, J. [1984]. Engines of the Mind: A History of the Computer , W. W. Norton, New York. Google Scholar
- Shustek, L. J. [1978]. "Analysis and Performance of Computer Instruction Sets," Ph. D. dissertation, Stanford University, Palo Alto, Calif. Google Scholar
- Silicon Graphics. [1996]. MIPS V Instruction Set (see http://www.sgi.com/MIPS/arch/ISA5/#MIPSV_indx).Google Scholar
- Singh, J. P., J. L. Hennessy, and A. Gupta [1993]. "Scaling parallel programs for multiprocessors: Methodology and examples," Computer 26:7 (July), 22-33. Google Scholar
- Sinharoy, B., R. N. Kalla, J. M. Tendler, R. J. Eickemeyer, and J. B. Joyner [2005]. "POWER5 system microarchitecture," IBM J. Research and Development 49:4-5, 505-521.
- Sites, R. [1979]. Instruction Ordering for the CRAY-1 Computer , Tech. Rep. 78-CS-023, Dept. of Computer Science, University of California, San Diego.Google Scholar
- Sites, R. L. (ed.) [1992]. Alpha Architecture Reference Manual , Digital Press, Burlington, Mass. Google Scholar
- Sites, R. L., and R. Witek, (eds.) [1995]. Alpha Architecture Reference Manual , 2nd ed., Digital Press, Newton, Mass. Google Scholar
- Skadron, K., and D. W. Clark [1997]. "Design issues and tradeoffs for write buffers," Proc. Third Int'l. Symposium on High-Performance Computer Architecture , February 1-5, 1997, San Antonio, Tex., 144-155. Google Scholar
- Skadron, K., P. S. Ahuja, M. Martonosi, and D. W. Clark [1999]. "Branch prediction, instruction-window size, and cache size: Performance tradeoffs and simulation techniques," IEEE Trans. on Computers 48:11 (November). Google Scholar
- Slater, R. [1987]. Portraits in Silicon , MIT Press, Cambridge, Mass. Google Scholar
- Slotnick, D. L., W. C. Borck, and R. C. McReynolds [1962]. "The Solomon computer," Proc. AFIPS Fall Joint Computer Conf. , December 4-6, 1962, Philadelphia, Penn., 97-107. Google Scholar
- Smith, A. J. [1982]. "Cache memories," Computing Surveys 14:3 (September), 473-530. Google Scholar
- Smith, A., and J. Lee [1984]. "Branch prediction strategies and branch-target buffer design," Computer 17:1 (January), 6-22. Google Scholar
- Smith, B. J. [1978]. "A pipelined, shared resource MIMD computer," Proc. Int'l. Conf. on Parallel Processing (ICPP) , August, Bellaire, Mich., 6-8.Google Scholar
- Smith, B. J. [1981]. "Architecture and applications of the HEP multiprocessor system," Real-Time Signal Processing IV 298 (August), 241-248.Google Scholar
- Smith, J. E. [1981]. "A study of branch prediction strategies," Proc. Eighth Annual Int'l. Symposium on Computer Architecture (ISCA) , May 12-14, 1981, Minneapolis, Minn., 135-148. Google Scholar
- Smith, J. E. [1984]. "Decoupled access/execute computer architectures," ACM Trans. on Computer Systems 2:4 (November), 289-308. Google Scholar
- Smith, J. E. [1988]. "Characterizing computer performance with a single number," Communications of the ACM 31:10 (October), 1202-1206. Google Scholar
- Smith, J. E. [1989]. "Dynamic instruction scheduling and the Astronautics ZS-1," Computer 22:7 (July), 21-35. Google Scholar
- Smith, J. E., and J. R. Goodman [1983]. "A study of instruction cache organizations and replacement policies," Proc. 10th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1982, Stockholm, Sweden, 132-137. Google Scholar
- Smith, J. E., and A. R. Pleszkun [1988]. "Implementing precise interrupts in pipelined processors," IEEE Trans. on Computers 37:5 (May), 562-573. (This paper is based on an earlier paper that appeared in Proc. 12th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 17-19, 1985, Boston, Mass.) Google Scholar
- Smith, J. E., G. E. Dermer, B. D. Vanderwarn, S. D. Klinger, C. M. Rozewski, D. L. Fowler, K. R. Scidmore, and J. P. Laudon [1987]. "The ZS-1 central processor," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 199-204. Google Scholar
- Smith, M. D., M. Horowitz, and M. S. Lam [1992]. "Efficient superscalar performance through boosting," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston, 248-259. Google Scholar
- Smith, M. D., M. Johnson, and M. A. Horowitz [1989]. "Limits on multiple instruction issue," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 290-302. Google Scholar
- Smotherman, M. [1989]. "A sequencing-based taxonomy of I/O systems and review of historical machines," Computer Architecture News 17:5 (September), 5-15. Reprinted in Computer Architecture Readings , M. D. Hill, N. P. Jouppi, and G. S. Sohi, eds., Morgan Kaufmann, San Francisco, 1999, 451-461. Google Scholar
- Sodani, A., and G. Sohi [1997]. "Dynamic instruction reuse," Proc. 24th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-4, 1997, Denver, Colo. Google Scholar
- Sohi, G. S. [1990]. "Instruction issue logic for high-performance, interruptible, multiple functional unit, pipelined computers," IEEE Trans. on Computers 39:3 (March), 349-359. Google Scholar
- Sohi, G. S., and S. Vajapeyam [1989]. "Tradeoffs in instruction format design for horizontal architectures," Proc. Third Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 3-6, 1989, Boston, 15-25. Google Scholar
- Soundararajan, V., M. Heinrich, B. Verghese, K. Gharachorloo, A. Gupta, and J. L. Hennessy [1998]. "Flexible use of memory for replication/migration in cache-coherent DSM multiprocessors," Proc. 25th Annual Int'l. Symposium on Computer Architecture (ISCA), July 3-14, 1998, Barcelona, Spain, 342-355.
- SPEC. [1989]. SPEC Benchmark Suite Release 1.0 (October 2).Google Scholar
- SPEC. [1994]. SPEC Newsletter (June).Google Scholar
- Sporer, M., F. H. Moss, and C. J. Mathais [1988]. "An introduction to the architecture of the Stellar Graphics supercomputer," Proc. IEEE COMPCON , February 29-March 4, 1988, San Francisco, 464.Google Scholar
- Spurgeon, C. [2001]. "Charles Spurgeon's Ethernet Web Site," wwwhost.ots.utexas.edu/ethernet/ethernet-home.html.Google Scholar
- Spurgeon, C. [2006]. "Charles Spurgeon's Ethernet Web Site," www.ethermanage.com/ethernet/ethernet.html.
- Stenstrom, P., T. Joe, and A. Gupta [1992]. "Comparative performance evaluation of cache-coherent NUMA and COMA architectures," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 80-91. Google Scholar
- Sterling, T. [2001]. Beowulf PC Cluster Computing with Windows and Beowulf PC Cluster Computing with Linux , MIT Press, Cambridge, Mass. Google Scholar
- Stern, N. [1980]. "Who invented the first electronic digital computer?" Annals of the History of Computing 2:4 (October), 375-376.Google Scholar
- Stevens, W. R. [1994-1996]. TCP/IP Illustrated (three volumes), Addison-Wesley, Reading, Mass.Google Scholar
- Stokes, J. [2000]. "Sound and Vision: A Technical Overview of the Emotion Engine," arstechnica.com/reviews/1q00/playstation2/ee-1.html.Google Scholar
- Stone, H. [1991]. High Performance Computers , Addison-Wesley, New York.Google Scholar
- Strauss, W. [1998]. "DSP Strategies 2002," www.usadata.com/market_research/spr_05/spr_r127-005.htm.Google Scholar
- Strecker, W. D. [1976]. "Cache memories for the PDP-11?," Proc. Third Annual Int'l. Symposium on Computer Architecture (ISCA) , January 19-21, 1976, Tampa, Fla., 155-158. Google Scholar
- Strecker, W. D. [1978]. "VAX-11/780: A virtual address extension of the PDP-11 family," Proc. AFIPS National Computer Conf. , June 5-8, 1978, Anaheim, Calif., 47, 967-980.Google Scholar
- Sugumar, R. A., and S. G. Abraham [1993]. "Efficient simulation of caches under optimal replacement with applications to miss characterization," Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems , May 17-21, 1993, Santa Clara, Calif., 24-35. Google Scholar
- Sun Microsystems. [1989]. The SPARC Architectural Manual , Version 8, Part No. 8001399-09, Sun Microsystems, Santa Clara, Calif.Google Scholar
- Sussenguth, E. [1999]. "IBM's ACS-1 Machine," IEEE Computer 22:11 (November).Google Scholar
- Swan, R. J., S. H. Fuller, and D. P. Siewiorek [1977]. "Cm*--a modular, multimicroprocessor," Proc. AFIPS National Computing Conf. , June 13-16, 1977, Dallas, Tex., 637-644. Google Scholar
- Swan, R. J., A. Bechtolsheim, K. W. Lai, and J. K. Ousterhout [1977]. "The implementation of the Cm* multi-microprocessor," Proc. AFIPS National Computing Conf. , June 13-16, 1977, Dallas, Tex., 645-654. Google Scholar
- Swartzlander, E. (ed.) [1990]. Computer Arithmetic , IEEE Computer Society Press, Los Alamitos, Calif. Google Scholar
- Takagi, N., H. Yasuura, and S. Yajima [1985]. "High-speed VLSI multiplication algorithm with a redundant binary addition tree," IEEE Trans. on Computers C-34:9, 789-796.
- Talagala, N. [2000]. "Characterizing Large Storage Systems: Error Behavior and Performance Benchmarks," Ph. D. dissertation, Computer Science Division, University of California, Berkeley. Google Scholar
- Talagala, N., and D. Patterson [1999]. An Analysis of Error Behavior in a Large Storage System, Tech. Report UCB/CSD-99-1042, Computer Science Division, University of California, Berkeley.
- Talagala, N., R. Arpaci-Dusseau, and D. Patterson [2000]. Micro-Benchmark Based Extraction of Local and Global Disk Characteristics , CSD-99-1063, Computer Science Division, University of California, Berkeley. Google Scholar
- Talagala, N., S. Asami, D. Patterson, R. Futernick, and D. Hart [2000]. "The art of massive storage: A case study of a Web image archive," Computer (November). Google Scholar
- Tamir, Y., and G. Frazier [1992]. "Dynamically-allocated multi-queue buffers for VLSI communication switches," IEEE Trans. on Computers 41:6 (June), 725-734. Google Scholar
- Tanenbaum, A. S. [1978]. "Implications of structured programming for machine architecture," Communications of the ACM 21:3 (March), 237-246. Google Scholar
- Tanenbaum, A. S. [1988]. Computer Networks , 2nd ed., Prentice Hall, Englewood Cliffs, N. J. Google Scholar
- Tang, C. K. [1976]. "Cache design in the tightly coupled multiprocessor system," Proc. AFIPS National Computer Conf. , June 7-10, 1976, New York, 749-753. Google Scholar
- Tanqueray, D. [2002]. "The Cray X1 and supercomputer road map," Proc. 13th Daresbury Machine Evaluation Workshop , December 11-12, 2002, Daresbury Laboratories, Daresbury, Cheshire, U. K.Google Scholar
- Tarjan, D., S. Thoziyoor, and N. Jouppi [2005]. "HPL Technical Report on CACTI 4.0," www.hpl.hp.com/techreports/2006/HPL-2006-86.html.
- Taylor, G. S. [1981]. "Compatible hardware for division and square root," Proc. 5th IEEE Symposium on Computer Arithmetic , May 18-19, 1981, University of Michigan, Ann Arbor, Mich., 127-134.Google Scholar
- Taylor, G. S. [1985]. "Radix 16 SRT dividers with overlapped quotient selection stages," Proc. Seventh IEEE Symposium on Computer Arithmetic , June 4-6, 1985, University of Illinois, Urbana, Ill., 64-71.Google Scholar
- Taylor, G., P. Hilfinger, J. Larus, D. Patterson, and B. Zorn [1986]. "Evaluation of the SPUR LISP architecture," Proc. 13th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1986, Tokyo. Google Scholar
- Taylor, M. B., W. Lee, S. P. Amarasinghe, and A. Agarwal [2005]. "Scalar operand networks," IEEE Trans. on Parallel and Distributed Systems 16:2 (February), 145-162. Google Scholar
- Tendler, J. M., J. S. Dodson, J. S. Fields, Jr., H. Le, and B. Sinharoy [2002]. "Power4 system microarchitecture," IBM J. Research and Development 46:1, 5-26. Google Scholar
- Texas Instruments. [2000]. "History of Innovation: 1980s," www.ti.com/corp/docs/company/history/1980s.shtml.Google Scholar
- Tezzaron Semiconductor. [2004]. Soft Errors in Electronic Memory, White Paper, Tezzaron Semiconductor, Naperville, Ill. (http://www.tezzaron.com/about/papers/soft_errors_1_1_secure.pdf).
- Thacker, C. P., E. M. McCreight, B. W. Lampson, R. F. Sproull, and D. R. Boggs [1982]. "Alto: A personal computer," in D. P. Siewiorek, C. G. Bell, and A. Newell, eds., Computer Structures: Principles and Examples , McGraw-Hill, New York, 549-572.Google Scholar
- Thadhani, A. J. [1981]. "Interactive user productivity," IBM Systems J. 20:4, 407-423. Google Scholar
- Thekkath, R., A. P. Singh, J. P. Singh, S. John, and J. L. Hennessy [1997]. "An evaluation of a commercial CC-NUMA architecture--the CONVEX Exemplar SPP1200," Proc. 11th Int'l. Parallel Processing Symposium (IPPS) , April 1-7, 1997, Geneva, Switzerland. Google Scholar
- Thorlin, J. F. [1967]. "Code generation for PIE (parallel instruction execution) computers," Proc. Spring Joint Computer Conf. , April 18-20, 1967, Atlantic City, N. J., 27. Google Scholar
- Thornton, J. E. [1964]. "Parallel operation in the Control Data 6600," Proc. AFIPS Fall Joint Computer Conf. , Part II , October 27-29, 1964, San Francisco, 26, 33-40. Google Scholar
- Thornton, J. E. [1970]. Design of a Computer, the Control Data 6600 , Scott, Foresman, Glenview, Ill. Google Scholar
- Tjaden, G. S., and M. J. Flynn [1970]. "Detection and parallel execution of independent instructions," IEEE Trans. on Computers C-19:10 (October), 889-895. Google Scholar
- Tomasulo, R. M. [1967]. "An efficient algorithm for exploiting multiple arithmetic units," IBM J. Research and Development 11:1 (January), 25-33. Google Scholar
- Torrellas, J., A. Gupta, and J. Hennessy [1992]. "Characterizing the caching and synchronization performance of a multiprocessor operating system," Proc. Fifth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 12-15, 1992, Boston ( SIGPLAN Notices 27:9 (September), 162-174). Google Scholar
- Touma, W. R. [1993]. The Dynamics of the Computer Industry: Modeling the Supply of Workstations and Their Components , Kluwer Academic, Boston. Google Scholar
- Tuck, N., and D. Tullsen [2003]. "Initial observations of the simultaneous multithreading Pentium 4 processor," Proc. 12th Int. Conf. on Parallel Architectures and Compilation Techniques (PACT'03 ), September 27-October 1, 2003, New Orleans, La., 26-34. Google Scholar
- Tullsen, D. M., S. J. Eggers, and H. M. Levy [1995]. "Simultaneous multithreading: Maximizing on-chip parallelism," Proc. 22nd Annual Int'l. Symposium on Computer Architecture (ISCA) , June 22-24, 1995, Santa Margherita, Italy, 392-403. Google Scholar
- Tullsen, D. M., S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, and R. L. Stamm [1996]. "Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor," Proc. 23rd Annual Int'l. Symposium on Computer Architecture (ISCA) , May 22-24, 1996, Philadelphia, Penn., 191-202. Google Scholar
- Ungar, D., R. Blau, P. Foley, D. Samples, and D. Patterson [1984]. "Architecture of SOAR: Smalltalk on a RISC," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 188-197. Google Scholar
- Unger, S. H. [1958]. "A computer oriented towards spatial problems," Proc. Institute of Radio Engineers 46:10 (October), 1744-1750. Google Scholar
- Vahdat, A., M. Al-Fares, N. Farrington, R. Niranjan Mysore, G. Porter, and S. Radhakrishnan [2010]. "Scale-Out Networking in the Data Center," IEEE Micro 30:4 (July/August), 29-41. Google Scholar
- Vaidya, A. S., A. Sivasubramaniam, and C. R. Das [1997]. "Performance benefits of virtual channels and adaptive routing: An application-driven study," Proc. ACM/IEEE Conf. on Supercomputing, November 16-21, 1997, San Jose, Calif.
- Vajapeyam, S. [1991]. "Instruction-Level Characterization of the Cray Y-MP Processor," Ph. D. thesis, Computer Sciences Department, University of Wisconsin-Madison. Google Scholar
- van Eijndhoven, J. T. J., F. W. Sijstermans, K. A. Vissers, E. J. D. Pol, M. I. A. Tromp, P. Struik, R. H. J. Bloks, P. van der Wolf, A. D. Pimentel, and H. P. E. Vranken [1999]. "Trimedia CPU64 architecture," Proc. IEEE Int'l. Conf. on Computer Design: VLSI in Computers and Processors (ICCD'99) , October 10-13, 1999, Austin, Tex., 586-592. Google Scholar
- Van Vleck, T. [2005]. "The IBM 360/67 and CP/CMS," http://www.multicians.org/thvv/360-67.html.Google Scholar
- von Eicken, T., D. E. Culler, S. C. Goldstein, and K. E. Schauser [1992]. "Active Messages: A mechanism for integrated communication and computation," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia. Google Scholar
- Waingold, E., M. Taylor, D. Srikrishna, V. Sarkar, W. Lee, V. Lee, J. Kim, M. Frank, P. Finch, R. Barua, J. Babb, S. Amarasinghe, and A. Agarwal [1997]. "Baring it all to software: Raw Machines," IEEE Computer 30 (September), 86-93. Google Scholar
- Wakerly, J. [1989]. Microcomputer Architecture and Programming , Wiley, New York. Google Scholar
- Wall, D. W. [1991]. "Limits of instruction-level parallelism," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 248-259. Google Scholar
- Wall, D. W. [1993]. Limits of Instruction-Level Parallelism , Research Rep. 93/6, Western Research Laboratory, Digital Equipment Corp., Palo Alto, Calif.Google Scholar
- Walrand, J. [1991]. Communication Networks: A First Course , Aksen Associates/Irwin, Homewood, Ill. Google Scholar
- Wang, W.-H., J.-L. Baer, and H. M. Levy [1989]. "Organization and performance of a two-level virtual-real cache hierarchy," Proc. 16th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 28-June 1, 1989, Jerusalem, 140-148. Google Scholar
- Watanabe, T. [1987]. "Architecture and performance of the NEC supercomputer SX system," Parallel Computing 5, 247-255.Google Scholar
- Waters, F. (ed.) [1986]. IBM RT Personal Computer Technology , SA 23-1057, IBM, Austin, Tex.Google Scholar
- Watson, W. J. [1972]. "The TI ASC--a highly modular and flexible super processor architecture," Proc. AFIPS Fall Joint Computer Conf. , December 5-7, 1972, Anaheim, Calif., 221-228. Google Scholar
- Weaver, D. L., and T. Germond [1994]. The SPARC Architectural Manual , Version 9, Prentice Hall, Englewood Cliffs, N. J. Google Scholar
- Weicker, R. P. [1984]. "Dhrystone: A synthetic systems programming benchmark," Communications of the ACM 27:10 (October), 1013-1030. Google Scholar
- Weiss, S., and J. E. Smith [1984]. "Instruction issue logic for pipelined supercomputers," Proc. 11th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 5-7, 1984, Ann Arbor, Mich., 110-118. Google Scholar
- Weiss, S., and J. E. Smith [1987]. "A study of scalar compilation techniques for pipelined supercomputers," Proc. Second Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , October 5-8, 1987, Palo Alto, Calif., 105-109. Google Scholar
- Weiss, S., and J. E. Smith [1994]. Power and PowerPC , Morgan Kaufmann, San Francisco. Google Scholar
- Wendel, D., R. Kalla, J. Friedrich, J. Kahle, J. Leenstra, C. Lichtenau, B. Sinharoy, W. Starke, and V. Zyuban [2010]. "The Power7 processor SoC," Proc. Int'l. Conf. on IC Design and Technology , June 2-4, 2010, Grenoble, France, 71-73.Google Scholar
- Weste, N., and K. Eshraghian [1993]. Principles of CMOS VLSI Design: A Systems Perspective , 2nd ed., Addison-Wesley, Reading, Mass.Google Scholar
- Wiecek, C. [1982]. "A case study of the VAX 11 instruction set usage for compiler execution," Proc. Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 177-184. Google Scholar
- Wilkes, M. [1965]. "Slave memories and dynamic storage allocation," IEEE Trans. Electronic Computers EC-14:2 (April), 270-271.Google Scholar
- Wilkes, M. V. [1982]. "Hardware support for memory protection: Capability implementations," Proc. Symposium on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , March 1-3, 1982, Palo Alto, Calif., 107-116. Google Scholar
- Wilkes, M. V. [1985]. Memoirs of a Computer Pioneer , MIT Press, Cambridge, Mass. Google Scholar
- Wilkes, M. V. [1995]. Computing Perspectives , Morgan Kaufmann, San Francisco. Google Scholar
- Wilkes, M. V., D. J. Wheeler, and S. Gill [1951]. The Preparation of Programs for an Electronic Digital Computer , Addison-Wesley, Cambridge, Mass.Google Scholar
- Williams, S., A. Waterman, and D. Patterson [2009]. "Roofline: An insightful visual performance model for multicore architectures," Communications of the ACM , 52:4 (April), 65-76. Google Scholar
- Williams, T. E., M. Horowitz, R. L. Alverson, and T. S. Yang [1987]. "A self-timed chip for division," in P. Losleben, ed., 1987 Stanford Conference on Advanced Research in VLSI , MIT Press, Cambridge, Mass.Google Scholar
- Wilson, A. W., Jr. [1987]. "Hierarchical cache/bus architecture for shared-memory multiprocessors," Proc. 14th Annual Int'l. Symposium on Computer Architecture (ISCA) , June 2-5, 1987, Pittsburgh, Penn., 244-252. Google Scholar
- Wilson, R. P., and M. S. Lam [1995]. "Efficient context-sensitive pointer analysis for C programs," Proc. ACM SIGPLAN'95 Conf. on Programming Language Design and Implementation , June 18-21, 1995, La Jolla, Calif., 1-12. Google Scholar
- Wolfe, A., and J. P. Shen [1991]. "A variable instruction stream extension to the VLIW architecture," Proc. Fourth Int'l. Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS) , April 8-11, 1991, Palo Alto, Calif., 2-14. Google Scholar
- Wood, D. A., and M. D. Hill [1995]. "Cost-effective parallel computing," IEEE Computer 28:2 (February), 69-72. Google Scholar
- Wulf, W. [1981]. "Compilers and computer architecture," Computer 14:7 (July), 41-47. Google Scholar
- Wulf, W., and C. G. Bell [1972]. "C.mmp--A multi-mini-processor," Proc. AFIPS Fall Joint Computer Conf. , December 5-7, 1972, Anaheim, Calif., 765-777. Google Scholar
- Wulf, W., and S. P. Harbison [1978]. "Reflections in a pool of processors--an experience report on C.mmp/Hydra," Proc. AFIPS National Computing Conf., June 5-8, 1978, Anaheim, Calif., 939-951.
- Wulf, W. A., and S. A. McKee [1995]. "Hitting the memory wall: Implications of the obvious," ACM SIGARCH Computer Architecture News , 23:1 (March), 20-24. Google Scholar
- Wulf, W. A., R. Levin, and S. P. Harbison [1981]. Hydra/C.mmp: An Experimental Computer System , McGraw-Hill, New York.Google Scholar
- Yamamoto, W., M. J. Serrano, A. R. Talcott, R. C. Wood, and M. Nemirovsky [1994]. "Performance estimation of multistreamed, superscalar processors," Proc. 27th Annual Hawaii Int'l. Conf. on System Sciences, January 4-7, 1994, Maui, 195-204.
- Yang, Y., and G. Mason [1991]. "Nonblocking broadcast switching networks," IEEE Trans. on Computers 40:9 (September), 1005-1015. Google Scholar
- Yeager, K. [1996]. "The MIPS R10000 superscalar microprocessor," IEEE Micro 16:2 (April), 28-40. Google Scholar
- Yeh, T., and Y. N. Patt [1993a]. "Alternative implementations of two-level adaptive branch prediction," Proc. 19th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 19-21, 1992, Gold Coast, Australia, 124-134. Google Scholar
- Yeh, T., and Y. N. Patt [1993b]. "A comparison of dynamic branch predictors that use two levels of branch history," Proc. 20th Annual Int'l. Symposium on Computer Architecture (ISCA) , May 16-19, 1993, San Diego, Calif., 257-266. Google Scholar
Cited By
- Venkatesha S and Parthasarathi R (2024). Survey on Redundancy Based-Fault tolerance methods for Processors and Hardware accelerators - Trends in Quantum Computing, Heterogeneous Systems and Reliability, ACM Computing Surveys, 56:11, (1-76), Online publication date: 30-Nov-2024.
- Elderhalli Y, Hasan O and Tahar S (2024). Dynamic dependability analysis of shuffle-exchange networks, Formal Methods in System Design, 62:1-3, (285-325), Online publication date: 1-Jun-2024.
- Fu X, Yang W, Dong D and Su X Optimizing Attention by Exploiting Data Reuse on ARM Multi-core CPUs Proceedings of the 38th ACM International Conference on Supercomputing, (137-149)
- Mosquera F, Ekanayake A, Hua W, Kavi K, Mehta G and John L (2024). SecurityCloak, Journal of Systems Architecture: the EUROMICRO Journal, 150:C, Online publication date: 1-May-2024.
- Miao X, Oliaro G, Zhang Z, Cheng X, Wang Z, Zhang Z, Wong R, Zhu A, Yang L, Shi X, Shi C, Chen Z, Arfeen D, Abhyankar R and Jia Z SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, (932-949)
- Ottaviano A, Balas R, Bambini G, Del Vecchio A, Ciani M, Rossi D, Benini L and Bartolini A (2024). ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation, International Journal of Parallel Programming, 52:1-2, (93-123), Online publication date: 1-Apr-2024.
- Wang Z, Liu L and Xiao L (2024). iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments, ACM Transactions on Architecture and Code Optimization, 0:0
- Zhou C, Hassman Z, Shah D, Richard V and Li Y YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, (212-226)
- Chang L, Zhao X, Yue T, Yang X, Li C, Lin S and Zhou J (2024). IPOCIM: Artificial Intelligent Architecture Design Space Exploration With Scalable Ping-Pong Computing-in-Memory Macro, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 32:2, (256-268), Online publication date: 1-Feb-2024.
- Nicolás-Conesa V, Titos-Gil R, Fernández-Pascual R, Ros A and Acacio M (2024). On the interactions between ILP and TLP with hardware transactional memory, Microprocessors & Microsystems, 104:C, Online publication date: 1-Feb-2024.
- Yang S, Dong C, Xiao Y, Cheng Y, Shi Z, Li Z and Sun L (2023). Asteria-Pro: Enhancing Deep Learning-based Binary Code Similarity Detection by Incorporating Domain Knowledge, ACM Transactions on Software Engineering and Methodology, 33:1, (1-40), Online publication date: 31-Jan-2024.
- Jiang Z, Yang K, Fisher N, Guan N, Audsley N and Dong Z (2024). Hopscotch: A Hardware-Software Co-Design for Efficient Cache Resizing on Multi-Core SoCs, IEEE Transactions on Parallel and Distributed Systems, 35:1, (89-104), Online publication date: 1-Jan-2024.
- Yang J, Yang Z, Casas J and Ray S (2024). Correct-by-Construction Design of Custom Accelerator Microarchitectures, IEEE Transactions on Computers, 73:1, (278-291), Online publication date: 1-Jan-2024.
- Dong P, Kong Z, Meng X, Yu P, Gong Y, Yuan G, Tang H and Wang Y HotBEV Proceedings of the 37th International Conference on Neural Information Processing Systems, (2824-2836)
- Li Y, Li N, Zhang Y, Guo J, Huang B, Xing M and Huang W Hmem: A Holistic Memory Performance Metric for Cloud Computing Benchmarking, Measuring, and Optimizing, (171-187)
- Jiang Z, Dai X, Wei R, Gray I, Gu Z, Zhao Q and Zhao S (2023). NPRC-I/O: An NoC-Based Real-Time I/O System With Reduced Contention and Enhanced Predictability, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:12, (4629-4642), Online publication date: 1-Dec-2023.
- Lee J, Min D, Byun I, Jang H and Kim J Fast, Light-weight, and Accurate Performance Evaluation using Representative Datacenter Behaviors Proceedings of the 24th International Middleware Conference, (220-233)
- Cheshmi K, Strout M and Mehri Dehnavi M Runtime Composition of Iterations for Fusing Loop-carried Sparse Dependence Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-15)
- Peng J, Fang J, Liu J, Xie M, Dai Y, Yang B, Li S and Wang Z Optimizing MPI Collectives on Shared Memory Multi-Cores Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-15)
- Lin Z, Liang T, Zhao J, Sinha S and Zhang W (2023). HL-Pow: Learning-Assisted Pre-RTL Power Modeling and Optimization for FPGA HLS, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:11, (3925-3938), Online publication date: 1-Nov-2023.
- Moghimi A, Hattori J, Li A, Ben Chikha M and Shahrad M Parrotfish Proceedings of the 2023 ACM Symposium on Cloud Computing, (177-192)
- Zeng J, Jeong J and Jung C Persistent Processor Architecture Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, (1075-1091)
- Villon L, Susskind Z, Bacellar A, Miranda I, de Araújo L, Lima P, Breternitz M, John L, França F and Dutra D (2023). A conditional branch predictor based on weightless neural networks, Neurocomputing, 555:C, Online publication date: 28-Oct-2023.
- Kong X, Zheng X, Zhu Y, Duan G and Chen Z (2023). I/O-efficient GPU-based acceleration of coherent dedispersion for pulsar observation, Journal of Systems Architecture: the EUROMICRO Journal, 142:C, Online publication date: 1-Sep-2023.
- Kong L, Tan J, Huang J, Chen G, Wang S, Jin X, Zeng P, Khan M and Das S (2022). Edge-computing-driven Internet of Things: A Survey, ACM Computing Surveys, 55:8, (1-41), Online publication date: 31-Aug-2023.
- Chen C, Kande R, Nguyen N, Andersen F, Tyagi A, Sadeghi A and Rajendran J HyPFuzz Proceedings of the 32nd USENIX Conference on Security Symposium, (1361-1378)
- Min D, Kim K, Moon C, Khan A, Lee S, Yun C, Chung W and Kim Y (2023). A Multi-tenant Key-value SSD with Secondary Index for Search Query Processing and Analysis, ACM Transactions on Embedded Computing Systems, 22:4, (1-27), Online publication date: 31-Jul-2023.
- Naghibijouybari H, Koruyeh E and Abu-Ghazaleh N (2022). Microarchitectural Attacks in Heterogeneous Systems: A Survey, ACM Computing Surveys, 55:7, (1-40), Online publication date: 31-Jul-2023.
- Orts F, Ortega G, Combarro E, Rúa I, Puertas A and Garzón E (2023). Efficient design of a quantum absolute-value circuit using Clifford+T gates, The Journal of Supercomputing, 79:11, (12656-12670), Online publication date: 1-Jul-2023.
- Khanna G, Chaturvedi S and Othman M (2023). On design and performance analysis of improved shuffle exchange gamma interconnection network layouts, The Journal of Supercomputing, 79:11, (11611-11640), Online publication date: 1-Jul-2023.
- Li X, Parazeres M, Oberman A, Ghaffari A, Asgharian M and Nia V (2023). EuclidNets: An Alternative Operation for Efficient Inference of Deep Learning Models, SN Computer Science, 4:5, Online publication date: 30-Jun-2023.
- Resch S, Cilasun H, Chowdhury Z, Zabihi M, Zhao Z, Wang J, Sapatnekar S and Karpuzcu U On Endurance of Processing in (Nonvolatile) Memory Proceedings of the 50th Annual International Symposium on Computer Architecture, (1-13)
- Friedman R, Goaz O and Hovav D PKache: A Generic Framework for Data Plane Caching Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, (1268-1276)
- Mhatre S and Chandran P On the Measurement of Performance Metrics for Virtualization-Enhanced Architectures Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, (49-56)
- Araújo De Medeiros D, Markidis S and Bo Peng I LibCOS: Enabling Converged HPC and Cloud Data Stores with MPI Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, (106-116)
- Hessien S and Hassan M (2022). PISCOT: A Pipelined Split-Transaction COTS-Coherent Bus for Multi-Core Real-Time Systems, ACM Transactions on Embedded Computing Systems, 22:1, (1-27), Online publication date: 31-Jan-2023.
- Sahabandu D, Mertoguno J and Poovendran R (2023). A Natural Language Processing Approach for Instruction Set Architecture Identification, IEEE Transactions on Information Forensics and Security, 18, (4086-4099), Online publication date: 1-Jan-2023.
- Xu M, Ng W, Lim W, Kang J, Xiong Z, Niyato D, Yang Q, Shen X and Miao C (2023). A Full Dive Into Realizing the Edge-Enabled Metaverse: Visions, Enabling Technologies, and Challenges, IEEE Communications Surveys & Tutorials, 25:1, (656-700), Online publication date: 1-Jan-2023.
- Neto A, Neto J and Moreno E (2022). The development of a low-cost big data cluster using Apache Hadoop and Raspberry Pi. A complete guide, Computers and Electrical Engineering, 104:PA, Online publication date: 1-Dec-2022.
- Kopper P, Copplestone S, Pfeiffer M, Koch C, Fasoulas S and Beck A (2022). Hybrid parallelization of Euler–Lagrange simulations based on MPI-3 shared memory, Advances in Engineering Software, 174:C, Online publication date: 1-Dec-2022.
- Han Y, Yuan Z, Pu Y, Xue C, Song S, Sun G and Huang G Latency-aware spatial-wise dynamic networks Proceedings of the 36th International Conference on Neural Information Processing Systems, (36845-36857)
- Song C, Wright S, Lin C and Diakonikolas J Coordinate linear variance reduction for generalized linear programming Proceedings of the 36th International Conference on Neural Information Processing Systems, (22049-22063)
- Du X, Chen A, He B, Chen H, Zhang F and Chen Y (2022). AflIot, Computers and Security, 122:C, Online publication date: 1-Nov-2022.
- Bang T, May N, Petrov I and Binnig C (2022). The full story of 1000 cores, The VLDB Journal — The International Journal on Very Large Data Bases, 31:6, (1185-1213), Online publication date: 1-Nov-2022.
- Gebregiorgis A, Du Nguyen H, Yu J, Bishnoi R, Taouil M, Catthoor F and Hamdioui S (2022). A Survey on Memory-centric Computer Architectures, ACM Journal on Emerging Technologies in Computing Systems, 18:4, (1-50), Online publication date: 31-Oct-2022.
- Zhang J, Cheng Y, Deng X, Wang B, Xie J, Yang Y and Zhang M (2022). A Reputation-Based Mechanism for Transaction Processing in Blockchain Systems, IEEE Transactions on Computers, 71:10, (2423-2434), Online publication date: 1-Oct-2022.
- Jeong I, Lee J, Yoon M and Ro W Reconstructing Out-of-Order Issue Queue Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, (144-161)
- Resch S, Khatamifard S, Chowdhury Z, Zabihi M, Zhao Z, Cilasun H, Wang J, Sapatnekar S and Karpuzcu U (2022). Energy-efficient and Reliable Inference in Nonvolatile Memory under Extreme Operating Conditions, ACM Transactions on Embedded Computing Systems, 21:5, (1-36), Online publication date: 30-Sep-2022.
- Baldassin A, Barreto J, Castro D and Romano P (2021). Persistent Memory, ACM Computing Surveys, 54:7, (1-37), Online publication date: 30-Sep-2022.
- Resch S and Karpuzcu U (2021). Benchmarking Quantum Computers and the Impact of Quantum Noise, ACM Computing Surveys, 54:7, (1-35), Online publication date: 30-Sep-2022.
- Jiang Z, Yang K, Fisher N, Audsley N and Dong Z (2022). Towards an energy-efficient quarter-clairvoyant mixed-criticality system, Journal of Systems Architecture: the EUROMICRO Journal, 130:C, Online publication date: 1-Sep-2022.
- Rosenbloom P Thoughts on Architecture Artificial General Intelligence, (364-373)
- Mahafzah B, Al-Adwan A and Zaghloul R (2022). Topological properties assessment of optoelectronic architectures, Telecommunications Systems, 80:4, (599-627), Online publication date: 1-Aug-2022.
- Beckmann N, Gibbons P and McGuffey C Brief Announcement: Spatial Locality and Granularity Change in Caching Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures, (173-175)
- Wu N, Yang H, Xie Y, Li P and Hao C High-level synthesis performance prediction using GNNs Proceedings of the 59th ACM/IEEE Design Automation Conference, (49-54)
- Orts F, Ortega G, Filatovas E and Garzón E (2022). Implementation of three efficient 4-digit fault-tolerant quantum carry lookahead adders, The Journal of Supercomputing, 78:11, (13323-13341), Online publication date: 1-Jul-2022.
- Mbongue J, Kwadjo D, Shuping A and Bobda C (2022). Deploying Multi-tenant FPGAs within Linux-based Cloud Infrastructure, ACM Transactions on Reconfigurable Technology and Systems, 15:2, (1-31), Online publication date: 30-Jun-2022.
- Paul A, Choi J, Karimi A and Wang F Machine Learning Assisted HPC Workload Trace Generation for Leadership Scale Storage Systems Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, (199-212)
- Shukla S, Bandishte S, Gaur J and Subramoney S Register file prefetching Proceedings of the 49th Annual International Symposium on Computer Architecture, (410-423)
- Gudaparthi S and Shrestha R (2022). Selective register-file cache: an energy saving technique for embedded processor architecture, Design Automation for Embedded Systems, 26:2, (105-124), Online publication date: 1-Jun-2022.
- Xiong W and Szefer J (2021). Survey of Transient Execution Attacks and Their Mitigations, ACM Computing Surveys, 54:3, (1-36), Online publication date: 30-Apr-2022.
- Li Y, Yu X, Yang Y, Zhou Y, Yang T, Ma Z and Chen S (2021). Pyramid Family: Generic Frameworks for Accurate and Fast Flow Size Measurement, IEEE/ACM Transactions on Networking, 30:2, (586-600), Online publication date: 1-Apr-2022.
- Arras P, Andronidis A, Pina L, Mituzas K, Shu Q, Grumberg D and Cadar C (2022). SaBRe: load-time selective binary rewriting, International Journal on Software Tools for Technology Transfer (STTT), 24:2, (205-223), Online publication date: 1-Apr-2022.
- Guerrero-Balaguera J, Condia J and Reorda M A compaction method for STLs for GPU in-field test Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, (454-459)
- Bura A, Rengarajan D, Kalathil D, Shakkottai S and Chamberland J (2021). Learning to Cache and Caching to Learn: Regret Analysis of Caching Algorithms, IEEE/ACM Transactions on Networking, 30:1, (18-31), Online publication date: 1-Feb-2022.
- Jiang Z, Dong P, Wei R, Zhao Q, Wang Y, Zhu D, Zhuang Y and Audsley N (2022). PSpSys, Journal of Systems Architecture: the EUROMICRO Journal, 123:C, Online publication date: 1-Feb-2022.
- Berg B, Whitehouse J, Moseley B, Wang W and Harchol-Balter M (2021). The case for phase-aware scheduling of parallelizable jobs, Performance Evaluation, 153:C, Online publication date: 1-Feb-2022.
- Wang M, Wen C and Chao H (2021). Roadrunner+: An Autonomous Intersection Management Cooperating with Connected Autonomous Vehicles and Pedestrians with Spillback Considered, ACM Transactions on Cyber-Physical Systems, 6:1, (1-29), Online publication date: 31-Jan-2022.
- Gade S and Deb S (2021). A Novel Hybrid Cache Coherence with Global Snooping for Many-core Architectures, ACM Transactions on Design Automation of Electronic Systems, 27:1, (1-31), Online publication date: 31-Jan-2022.
- Moti N, Schimmelpfennig F, Salkhordeh R, Klopp D, Cortes T, Rückert U and Brinkmann A Simurgh Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-14)
- Chowdhury S, Yang K and Nuzzo P ReIGNN: State Register Identification Using Graph Neural Networks for Circuit Reverse Engineering 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), (1-9)
- Zeitak A and Morrison A Cuckoo Trie Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, (147-162)
- Moreira A, Ottoni G and Quintão Pereira F (2021). VESPA: static profiling for binary optimization, Proceedings of the ACM on Programming Languages, 5:OOPSLA, (1-28), Online publication date: 20-Oct-2021.
- Zhang M, Xie L, Zhang Z, Yu Q, Xi G, Zhang H, Liu F, Zheng Y, Zheng Y and Zhang S Exploiting Different Levels of Parallelism in the Quantum Control Microarchitecture for Superconducting Qubits MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, (898-911)
- LeMay M, Rakshit J, Deutsch S, Durham D, Ghosh S, Nori A, Gaur J, Weiler A, Sultana S, Grewal K and Subramoney S Cryptographic Capability Computing MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, (253-267)
- Carvalho D and Seznec A (2021). Understanding Cache Compression, ACM Transactions on Architecture and Code Optimization, 18:3, (1-27), Online publication date: 30-Sep-2021.
- Wu Y, Li J, Dai H, Yi X, Wang Y and Yang X micROS.BT: An Event-Driven Behavior Tree Framework for Swarm Robots 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (9146-9153)
- Nair A, Pai A, Raveendran B and Patil G MOESI Proceedings of the 2021 IEEE/ACM 25th International Symposium on Distributed Simulation and Real Time Applications, (1-8)
- Das A, Jose J and Mishra P (2021). Data Criticality in Multithreaded Applications: An Insight for Many-Core Systems, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 29:9, (1675-1679), Online publication date: 1-Sep-2021.
- Zhang J, Zhou X, Ge T, Wang X and Hwang T (2021). Joint Task Scheduling and Containerizing for Efficient Edge Computing, IEEE Transactions on Parallel and Distributed Systems, 32:8, (2086-2100), Online publication date: 1-Aug-2021.
- Kim H, Amarnath A, Bagherzadeh J, Talati N and Dreslinski R (2021). A Survey Describing Beyond Si Transistors and Exploring Their Implications for Future Processors, ACM Journal on Emerging Technologies in Computing Systems, 17:3, (1-44), Online publication date: 31-Jul-2021.
- Min D and Kim Y Isolating namespace and performance in key-value SSDs for multi-tenant environments Proceedings of the 13th ACM Workshop on Hot Topics in Storage and File Systems, (8-13)
- Chen J, Lu C, Ni J, Guo X, Girard P and Cheng Y (2021). DOVA PRO: A Dynamic Overwriting Voltage Adjustment Technique for STT-MRAM L1 Cache Considering Dielectric Breakdown Effect, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 29:7, (1325-1334), Online publication date: 1-Jul-2021.
- Mustard C, Goswami S, Gharavi N, Nider J, Beschastnikh I and Fedorova A Jumpgate Proceedings of the 14th ACM International Conference on Systems and Storage, (1-12)
- Skarlatos D, Zhao Z, Paccagnella R, Fletcher C and Torrellas J Jamais vu: thwarting microarchitectural replay attacks Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, (1061-1076)
- C. A, Lee W and Lin W Branchboozle Proceedings of the 36th Annual ACM Symposium on Applied Computing, (1617-1625)
- Bazzaz M, Hoseinghorban A and Ejlali A (2021). Fast and Predictable Non-Volatile Data Memory for Real-Time Embedded Systems, IEEE Transactions on Computers, 70:3, (359-371), Online publication date: 1-Mar-2021.
- Zhou C, Wu W, He H, Yang P, Lyu F, Cheng N and Shen X (2021). Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in SAGIN, IEEE Transactions on Wireless Communications, 20:2, (911-925), Online publication date: 1-Feb-2021.
- Schuiki F, Zaruba F, Hoefler T and Benini L (2021). Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores, IEEE Transactions on Computers, 70:2, (212-227), Online publication date: 1-Feb-2021.
- Parra P, Guzmán D, Polo Ó, da Silva A, Martínez A, Sánchez S and Prieto M (2021). Improving performance and determinism of multitasking systems on the LEON architecture, Microprocessors & Microsystems, 80:C, Online publication date: 1-Feb-2021.
- Salehnamadi N, Alshayban A, Ahmed I and Malek S ER catcher Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, (324-335)
- Park H, Ahn H and Jung S (2020). A Novel Matchline Scheduling Method for Low-Power and Reliable Search Operation in Cross-Point-Array Nonvolatile Ternary CAM, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 28:12, (2650-2657), Online publication date: 1-Dec-2020.
- Mozafari S and Meyer B (2020). Hot sparing for lifetime-chip-performance and cost improvement in application specific SIMT processors, Design Automation for Embedded Systems, 24:4, (249-266), Online publication date: 1-Dec-2020.
- Coffin E, Young S, Kaur H, Brown J, Pirvu M and Kent K MicroJIT Proceedings of the 30th Annual International Conference on Computer Science and Software Engineering, (179-188)
- Jeon Y, Park B, Kwon S, Kim B, Yun J and Lee D BiQGEMM Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-16)
- Berg B, Berger D, McAllister S, Grosof I, Gunasekar S, Lu J, Uhlar M, Carrig J, Beckmann N, Harchol-Balter M and Ganger G The CacheLib caching engine Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, (769-786)
- Orts F, Ortega G, Puertas A, García I and Garzón E (2020). On solving the unrelated parallel machine scheduling problem: active microrheology as a case study, The Journal of Supercomputing, 76:11, (8494-8509), Online publication date: 1-Nov-2020.
- Salazar C and Bobby Birrer M Instrumentation and Extension of reduced, simulated Single Cycle MIPS architecture to improve Student Comprehension 2020 IEEE Frontiers in Education Conference (FIE), (1-5)
- Wang M, Wang J, Wen C and Chao H Roadrunner: Autonomous Intersection Management with Dynamic Lane Assignment 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), (1-7)
- Fichte J, Hecher M and Szeider S A Time Leap Challenge for SAT-Solving Principles and Practice of Constraint Programming, (267-285)
- Alam M, Nahiyan A, Sadi M, Forte D and Tehranipoor M (2020). Soft-HaT, ACM Transactions on Design Automation of Electronic Systems, 25:4, (1-22), Online publication date: 2-Sep-2020.
- Zhang Z, Henderson T, Karaman S and Sze V (2020). FSMI, International Journal of Robotics Research, 39:9, (1155-1177), Online publication date: 1-Aug-2020.
- Sheikh S and Pasha M (2020). Energy-efficient Real-time Scheduling on Multicores, ACM Transactions on Embedded Computing Systems, 19:4, (1-25), Online publication date: 31-Jul-2020.
- Atre N, Sherry J, Wang W and Berger D Caching with Delayed Hits Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication, (495-513)
- Manjith B and Ramasubramanian N (2020). Securing AES Accelerator from Key-Leaking Trojans on FPGA, International Journal of Embedded and Real-Time Communication Systems, 11:3, (84-105), Online publication date: 1-Jul-2020.
- Ritter F and Hack S PMEvo: portable inference of port mappings for out-of-order processors by evolutionary optimization Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, (608-622)
- Lozano R and Schulte C (2019). Survey on Combinatorial Register Allocation and Instruction Scheduling, ACM Computing Surveys, 52:3, (1-50), Online publication date: 31-May-2020.
- Lipp M, Schwarz M, Gruss D, Prescher T, Haas W, Horn J, Mangard S, Kocher P, Genkin D, Yarom Y, Hamburg M and Strackx R (2020). Meltdown, Communications of the ACM, 63:6, (46-56), Online publication date: 21-May-2020.
- Lanuza J, Trabes G and Wainer G Parallel execution of DEVS in shared-memory multicore architectures Proceedings of the 2020 Spring Simulation Conference, (1-11)
- El-Moursy A, Sibai F, El-Moursy M and Mohamed A (2020). PMSMC, Journal of Parallel and Distributed Computing, 139:C, (135-147), Online publication date: 1-May-2020.
- Nguyen H, Yu J, Lebdeh M, Taouil M, Hamdioui S and Catthoor F (2020). A Classification of Memory-Centric Computing, ACM Journal on Emerging Technologies in Computing Systems, 16:2, (1-26), Online publication date: 30-Apr-2020.
- Jošilo S and Dán G (2020). Computation Offloading Scheduling for Periodic Tasks in Mobile Edge Computing, IEEE/ACM Transactions on Networking, 28:2, (667-680), Online publication date: 1-Apr-2020.
- Hahn S and Reineke J (2019). Design and analysis of SIC: a provably timing-predictable pipelined processor core, Real-Time Systems, 56:2, (207-245), Online publication date: 1-Apr-2020.
- Vineyard C, Plagge M and Green S Comparing Neural Accelerators & Neuromorphic Architectures: The False Idol of Operations Proceedings of the 2020 Annual Neuro-Inspired Computational Elements Workshop, (1-6)
- Zhang R, Biswas S, Balaji V, Bond M and Lucia B Peacenik Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, (317-333)
- Szymczyk M and Szymczyk P (2020). Automatic processing of Z-transform artificial neural networks using parallel programming, Neurocomputing, 379:C, (74-88), Online publication date: 28-Feb-2020.
- Liu B, Cheshmi K, Soori S, Strout M and Dehnavi M MatRox Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, (389-402)
- Damaj I, Elshafei M, El-Abd M and Aydin M (2022). An analytical framework for high-speed hardware particle swarm optimization, Microprocessors & Microsystems, 72:C, Online publication date: 1-Feb-2020.
- Edelkamp S and Weiß A (2019). BlockQuicksort, ACM Journal of Experimental Algorithmics, 24, (1-22), Online publication date: 17-Dec-2019.
- García-Martín E, Rodrigues C, Riley G and Grahn H (2019). Estimation of energy consumption in machine learning, Journal of Parallel and Distributed Computing, 134:C, (75-88), Online publication date: 1-Dec-2019.
- Wang L, Gao W, Yang K and Jiang Z BOPS, A New Computation-Centric Metric for Datacenter Computing Benchmarking, Measuring, and Optimizing, (262-277)
- Coffin E, Young S, Kent K and Pirvu M A roadmap for extending MicroJIT Proceedings of the 29th Annual International Conference on Computer Science and Software Engineering, (293-298)
- Zaruba F and Benini L (2019). The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 27:11, (2629-2640), Online publication date: 1-Nov-2019.
- Park S, Wu Y, Lee J, Aupov A and Mahlke S (2019). Multi-objective Exploration for Practical Optimization Decisions in Binary Translation, ACM Transactions on Embedded Computing Systems, 18:5s, (1-19), Online publication date: 31-Oct-2019.
- Castro-Godínez J, Shafique M and Henkel J (2019). ECAx, ACM Transactions on Embedded Computing Systems, 18:5s, (1-20), Online publication date: 31-Oct-2019.
- Nongpoh B, Ray R and Banerjee A Approximate computing for multithreaded programs in shared memory architectures Proceedings of the 17th ACM-IEEE International Conference on Formal Methods and Models for System Design, (1-9)
- Nair A, Colaco L, Patil G, Raveendran B and Punnekkatt S MEDIATOR Proceedings of the 23rd IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, (146-153)
- Hou Y, He H, Shamsi K, Jin Y, Wu D and Wu H (2019). On-Chip Analog Trojan Detection Framework for Microprocessor Trustworthiness, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 38:10, (1820-1830), Online publication date: 1-Oct-2019.
- Lozano R, Carlsson M, Blindell G and Schulte C (2019). Combinatorial Register Allocation and Instruction Scheduling, ACM Transactions on Programming Languages and Systems, 41:3, (1-53), Online publication date: 30-Sep-2019.
- Sperl P and Böttinger K Side-Channel Aware Fuzzing Computer Security – ESORICS 2019, (259-278)
- Nadeem M, Li Z, Malik A, Biglari-Abhari M and Salcic Z (2019). Allocation and scheduling of SystemJ programs on chip multiprocessors with weighted TDMA scheduling, Journal of Systems Architecture: the EUROMICRO Journal, 98:C, (63-78), Online publication date: 1-Sep-2019.
- Nadeau D, Ezzati-Jivan N and Dagenais M (2019). Efficient large-scale heterogeneous debugging using dynamic tracing, Journal of Systems Architecture: the EUROMICRO Journal, 98:C, (346-360), Online publication date: 1-Sep-2019.
- Ponugoti M and Milenkovic A (2019). Enabling On-the-Fly Hardware Tracing of Data Reads in Multicores, ACM Transactions on Embedded Computing Systems, 18:4, (1-27), Online publication date: 31-Jul-2019.
- Liu Z, Nath A, Ding X, Fu H, Muhib Khan M and Yu W (2022). Multivariate modeling and two-level scheduling of analytic queries, Parallel Computing, 85:C, (66-78), Online publication date: 1-Jul-2019.
- Reichenbach M, Holzinger P, Häublein K, Lieske T, Blinzer P and Fey D (2019). Heterogeneous Computing Utilizing FPGAs, Journal of Signal Processing Systems, 91:7, (745-757), Online publication date: 1-Jul-2019.
- Geng T, Wang T, Wu C, Yang C, Wu W, Li A and Herbordt M O3BNN Proceedings of the ACM International Conference on Supercomputing, (461-472)
- Chen Y and Louri A An online quality management framework for approximate communication in network-on-chips Proceedings of the ACM International Conference on Supercomputing, (217-226)
- Real P, Molina-Abril H, Díaz-del-Río F, Blanco-Trejo S and Onchis D Enhanced Parallel Generation of Tree Structures for the Recognition of 3D Images Pattern Recognition, (292-301)
- Van Sandt P, Chronis Y and Patel J Efficiently Searching In-Memory Sorted Arrays Proceedings of the 2019 International Conference on Management of Data, (36-53)
- Ayers G, Nagendra N, August D, Cho H, Kanev S, Kozyrakis C, Krishnamurthy T, Litz H, Moseley T and Ranganathan P AsmDB Proceedings of the 46th International Symposium on Computer Architecture, (462-473)
- Pittino F, Bonfà P, Bartolini A, Affinito F, Benini L and Cavazzoni C Prediction of Time-to-Solution in Material Science Simulations Using Deep Learning Proceedings of the Platform for Advanced Scientific Computing Conference, (1-9)
- Hanford N, Ahuja V, Farrens M, Tierney B and Ghosal D (2018). A Survey of End-System Optimizations for High-Speed Networks, ACM Computing Surveys, 51:3, (1-36), Online publication date: 31-May-2019.
- Calciu I, Puddu I, Kolli A, Nowatzyk A, Gandhi J, Mutlu O and Subrahmanyam P Project PBerry Proceedings of the Workshop on Hot Topics in Operating Systems, (127-135)
- Moreira F, Oliveira D and Navaux P SPADA Proceedings of the 16th ACM International Conference on Computing Frontiers, (50-58)
- Li G, Yang Y, Le F, Lim Y and Wang J Update Algebra: Toward Continuous, Non-Blocking Composition of Network Updates in SDN IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (1081-1089)
- Gurung A and Ray R Simultaneous Solving of Batched Linear Programs on a GPU Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, (59-66)
- Gebai M and Dagenais M (2018). Survey and Analysis of Kernel and Userspace Tracers on Linux, ACM Computing Surveys, 51:2, (1-33), Online publication date: 31-Mar-2019.
- Nongpoh B, Ray R, Das M and Banerjee A (2019). Enhancing Speculative Execution With Selective Approximate Computing, ACM Transactions on Design Automation of Electronic Systems, 24:2, (1-29), Online publication date: 21-Mar-2019.
- Li F, Xu L, Duan S, Wu W, Zhao H and Ling Q (2019). Improving hierarchical mobile video caching through distributed cross-layer coordination, Multimedia Tools and Applications, 78:5, (6049-6071), Online publication date: 1-Mar-2019.
- Jordan H, Subotić P, Zhao D and Scholz B A specialized B-tree for concurrent datalog evaluation Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, (327-339)
- Ying B, Yuan K and Sayed A (2019). Supervised Learning Under Distributed Features, IEEE Transactions on Signal Processing, 67:4, (977-992), Online publication date: 1-Feb-2019.
- Al-Adwan A, Sharieh A and Mahafzah B (2019). Parallel heuristic local search algorithm on OTIS hyper hexa-cell and OTIS mesh of trees optoelectronic architectures, Applied Intelligence, 49:2, (661-688), Online publication date: 1-Feb-2019.
- Rhisheekesan A, Jeyapaul R and Shrivastava A (2019). Control Flow Checking or Not? (for Soft Errors), ACM Transactions on Embedded Computing Systems, 18:1, (1-25), Online publication date: 31-Jan-2019.
- Guo X, Wang H, Zhang C, Tang H and Yuan Y Leakage-aware thermal management for multi-core systems using piecewise linear model based predictive control Proceedings of the 24th Asia and South Pacific Design Automation Conference, (64-69)
- Pontarelli S, Bonola M and Bianchi G (2018). Smashing OpenFlow's “atomic” actions, International Journal of Network Management, 29:1, Online publication date: 11-Jan-2019.
- Shelor C and Kavi K Reconfigurable dataflow graphs for processing-in-memory Proceedings of the 20th International Conference on Distributed Computing and Networking, (110-119)
- Jošilo S and Dán G (2018). Selfish Decentralized Computation Offloading for Mobile Cloud Computing in Dense Wireless Networks, IEEE Transactions on Mobile Computing, 18:1, (207-220), Online publication date: 1-Jan-2019.
- Chen Y (2019). Reshaping Future Computing Systems With Emerging Nonvolatile Memory Technologies, IEEE Micro, 39:1, (54-57), Online publication date: 1-Jan-2019.
- Jiang Z, Gao W, Wang L, Xiong X, Zhang Y, Wen X, Luo C, Ye H, Lu X, Zhang Y, Feng S, Li K, Xu W and Zhan J HPC AI500: A Benchmark Suite for HPC AI Systems Benchmarking, Measuring, and Optimizing, (10-22)
- Tran K, Jimborean A, Carlson T, Koukos K, Själander M and Kaxiras S (2018). SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores, ACM SIGPLAN Notices, 53:4, (328-343), Online publication date: 2-Dec-2018.
- Zhang J, Wu C, Yang D, Chen Y, Meng X, Xu L and Guo M (2018). HSCS, Frontiers of Computer Science: Selected Publications from Chinese Universities, 12:6, (1090-1104), Online publication date: 1-Dec-2018.
- Breß S, Köcher B, Funke H, Zeuch S, Rabl T and Markl V (2018). Generating custom code for efficient query execution on heterogeneous processors, The VLDB Journal — The International Journal on Very Large Data Bases, 27:6, (797-822), Online publication date: 1-Dec-2018.
- Lee J, Kim C, Lin K, Cheng L, Govindaraju R and Kim J (2018). WSMeter, ACM SIGPLAN Notices, 53:2, (549-563), Online publication date: 30-Nov-2018.
- Einziger G, Eytan O, Friedman R and Manes B Adaptive Software Cache Management Proceedings of the 19th International Middleware Conference, (94-106)
- Asăvoae I, Asăvoae M and Riesco A (2018). Slicing from formal semantics, International Journal on Software Tools for Technology Transfer (STTT), 20:6, (739-769), Online publication date: 1-Nov-2018.
- Dey M, Nazari A, Zajic A and Prvulovic M TEMProf Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, (881-893)
- Yan M, Choi J, Skarlatos D, Morrison A, Fletcher C and Torrellas J InvisiSpec Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, (428-441)
- Khattab O, Hammoud M and Shekfeh O PolyHJ Proceedings of the 27th ACM International Conference on Information and Knowledge Management, (1323-1332)
- Rashid S, Nelissen G and Tovar E Trading Between Intra- and Inter-Task Cache Interference to Improve Schedulability Proceedings of the 26th International Conference on Real-Time Networks and Systems, (125-136)
- Zoni D, Barenghi A, Pelosi G and Fornaciari W (2018). A Comprehensive Side-Channel Information Leakage Analysis of an In-Order RISC CPU Microarchitecture, ACM Transactions on Design Automation of Electronic Systems, 23:5, (1-30), Online publication date: 30-Sep-2018.
- Jimenez L and Agyeman M A Study of Techniques to Increase Instruction Level Parallelisms Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control, (1-5)
- García-Martín E, Lavesson N, Grahn H, Casalicchio E and Boeva V How to Measure Energy Consumption in Machine Learning Algorithms ECML PKDD 2018 Workshops, (243-255)
- Ognawala S, Amato R, Pretschner A and Kulkarni P Automatically assessing vulnerabilities discovered by compositional analysis Proceedings of the 1st International Workshop on Machine Learning and Software Engineering in Symbiosis, (16-25)
- Gu J, Yin S, Liu L and Wei S (2018). Stress-Aware Loops Mapping on CGRAs with Dynamic Multi-Map Reconfiguration, IEEE Transactions on Parallel and Distributed Systems, 29:9, (2105-2120), Online publication date: 1-Sep-2018.
- Ji K, Ling M, Shi L and Pan J (2018). An Analytical Cache Performance Evaluation Framework for Embedded Out-of-Order Processors Using Software Characteristics, ACM Transactions on Embedded Computing Systems, 17:4, (1-25), Online publication date: 29-Aug-2018.
- Tan W, Chang S, Fong L, Li C, Wang Z and Cao L Matrix Factorization on GPUs with Memory Optimization and Approximate Computing Proceedings of the 47th International Conference on Parallel Processing, (1-10)
- Catalán S, Herrero J, Quintana-Ortí E and Rodríguez-Sánchez R (2018). Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors, Parallel Computing, 76:C, (18-27), Online publication date: 1-Aug-2018.
- Jakovljević R, Berić A, Van Dalen E and Milićev D (2018). New access modes of parallel memory subsystem for sub-pixel motion estimation, Journal of Real-Time Image Processing, 15:2, (279-296), Online publication date: 1-Aug-2018.
- Psychou G, Rodopoulos D, Sabry M, Gemmeke T, Atienza D, Noll T and Catthoor F (2017). Classification of Resilience Techniques Against Functional Errors at Higher Abstraction Layers of Digital Systems, ACM Computing Surveys, 50:4, (1-38), Online publication date: 31-Jul-2018.
- Schulz L, Broneske D and Saake G (2018). An eight-dimensional systematic evaluation of optimized search algorithms on modern processors, Proceedings of the VLDB Endowment, 11:11, (1550-1562), Online publication date: 1-Jul-2018.
- Kwon K, Amid A, Gholami A, Wu B, Asanovic K and Keutzer K Co-design of deep neural nets and neural net accelerators for embedded vision applications Proceedings of the 55th Annual Design Automation Conference, (1-6)
- Kwon K, Amid A, Gholami A, Wu B, Asanovic K and Keutzer K Invited: Co-Design of Deep Neural Nets and Neural Net Accelerators for Embedded Vision Applications 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), (1-6)
- Kougkas A, Devarajan H and Sun X IRIS Proceedings of the 2018 International Conference on Supercomputing, (33-42)
- Tran K, Jimborean A, Carlson T, Koukos K, Själander M and Kaxiras S SWOOP: software-hardware co-design for non-speculative, execute-ahead, in-order cores Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, (328-343)
- Zhang J and Gruenwald L Regularizing irregularity Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), (1-8)
- Bae D, Jo I, Choi Y, Hwang J, Cho S, Lee D and Jeong J 2B-SSD Proceedings of the 45th Annual International Symposium on Computer Architecture, (425-438)
- Parasar M, Bhattacharjee A and Krishna T SEESAW Proceedings of the 45th Annual International Symposium on Computer Architecture, (193-206)
- Morse J, Kerrison S and Eder K (2018). On the Limitations of Analyzing Worst-Case Dynamic Energy of Processing, ACM Transactions on Embedded Computing Systems, 17:3, (1-22), Online publication date: 31-May-2018.
- Crawford P, Barnes Jr. P, Eidenbenz S and Wilsey P Sampling Simulation Model Profile Data for Analysis Proceedings of the 2018 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (17-28)
- Kelefouras V and Djemame K A methodology for efficient code optimizations and memory management Proceedings of the 15th ACM International Conference on Computing Frontiers, (105-112)
- Malas T, Hager G, Ltaief H and Keyes D (2017). Multidimensional Intratile Parallelization for Memory-Starved Stencil Computations, ACM Transactions on Parallel Computing, 4:3, (1-32), Online publication date: 27-Apr-2018.
- Liao C, Lee S, Chiou Y, Lee C and Lee C (2018). Power consumption minimization by distributive particle swarm optimization for luminance control and its parallel implementations, Expert Systems with Applications: An International Journal, 96:C, (479-491), Online publication date: 15-Apr-2018.
- Chen K and Chen C (2018). Enabling SIMT Execution Model on Homogeneous Multi-Core System, ACM Transactions on Architecture and Code Optimization, 15:1, (1-26), Online publication date: 31-Mar-2018.
- Lee J, Kim C, Lin K, Cheng L, Govindaraju R and Kim J WSMeter Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, (549-563)
- Prakash A, Clarke C, Lam S and Srikanthan T (2018). Rapid Memory-Aware Selection of Hardware Accelerators in Programmable SoC Design, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 26:3, (445-456), Online publication date: 1-Mar-2018.
- Dolbeau R (2018). Theoretical peak FLOPS per instruction set: a tutorial, The Journal of Supercomputing, 74:3, (1341-1377), Online publication date: 1-Mar-2018.
- Baba T, Watanabe S, Jackin B, Ohkawa T, Ootsu K, Yokota T, Hayasaki Y and Yatagai T Overcoming the difficulty of large-scale CGH generation on multi-GPU cluster Proceedings of the 11th Workshop on General Purpose GPUs, (13-21)
- Josipović L, Ghosal R and Ienne P Dynamically Scheduled High-level Synthesis Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, (127-136)
- Siddique N, Grubel P, Badawy A and Cook J (2018). A performance study of the time-varying cache behavior, The Journal of Supercomputing, 74:2, (665-695), Online publication date: 1-Feb-2018.
- Al-Adwan A, Mahafzah B and Sharieh A (2018). Solving traveling salesman problem using parallel repetitive nearest neighbor algorithm on OTIS-Hypercube and OTIS-Mesh optoelectronic architectures, The Journal of Supercomputing, 74:1, (1-36), Online publication date: 1-Jan-2018.
- Chen X, Wardi Y and Yalamanchili S Power regulation in high performance multicore processors 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2674-2679)
- Sieber C, Durner R, Ehm M, Kellerer W and Sharma P Towards optimal adaptation of NFV packet processing to modern CPU memory architectures Proceedings of the 2nd Workshop on Cloud-Assisted Networking, (7-12)
- Crawford P, Eidenbenz S, Barnes P and Wilsey P Some properties of communication behaviors in discrete-event simulation models Proceedings of the 2017 Winter Simulation Conference, (1-12)
- Zhang Y, Hou J, Cao Y, Gu J and Huang C (2017). OpenMP parallelization of a gridded SWAT (SWATG), Computers & Geosciences, 109:C, (228-237), Online publication date: 1-Dec-2017.
- He H, Cui L, Zhou F and Wang D (2017). Distributed proxy cache technology based on autonomic computing in smart cities, Future Generation Computer Systems, 76:C, (370-383), Online publication date: 1-Nov-2017.
- Ortega G, Filatovas E, Garzón E and Casado L (2017). Non-dominated sorting procedure for Pareto dominance ranking on multicore CPU and/or GPU, Journal of Global Optimization, 69:3, (607-627), Online publication date: 1-Nov-2017.
- Wan H, Gao X, Long X and Jiang B Introducing parallel computing concepts in computer system related courses 2017 IEEE Frontiers in Education Conference (FIE), (1-7)
- Kulkarni C, Kesavan A, Zhang T, Ricci R and Stutsman R Rocksteady Proceedings of the 26th Symposium on Operating Systems Principles, (390-405)
- Huang Y, Guo N, Seok M, Tsividis Y, Mandli K and Sethumadhavan S Hybrid analog-digital solution of nonlinear partial differential equations Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, (665-678)
- Milic U, Villa O, Bolotin E, Arunkumar A, Ebrahimi E, Jaleel A, Ramirez A and Nellans D Beyond the socket Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, (123-135)
- Fu X, Rol M, Bultink C, van Someren J, Khammassi N, Ashraf I, Vermeulen R, de Sterke J, Vlothuizen W, Schouten R, Almudever C, DiCarlo L and Bertels K An experimental microarchitecture for a superconducting quantum processor Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, (813-825)
- Tsai P, Beckmann N and Sanchez D (2017). Jenga, ACM SIGARCH Computer Architecture News, 45:2, (652-665), Online publication date: 14-Sep-2017.
- Wang K and Lin C (2017). Decoupled Affine Computation for SIMT GPUs, ACM SIGARCH Computer Architecture News, 45:2, (295-306), Online publication date: 14-Sep-2017.
- Aghaei Khouzani H, Hosseini F and Yang C (2017). Segment and Conflict Aware Page Allocation and Migration in DRAM-PCM Hybrid Main Memory, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36:9, (1458-1470), Online publication date: 1-Sep-2017.
- Stanic M, Palomar O, Hayes T, Ratkovic I, Cristal A, Unsal O and Valero M (2017). An Integrated Vector-Scalar Design on an In-Order ARM Core, ACM Transactions on Architecture and Code Optimization, 14:2, (1-26), Online publication date: 21-Jul-2017.
- Bělohoubek J, Fišer P and Schmidt J (2017). Error masking method based on the short-duration offline test, Microprocessors & Microsystems, 52:C, (236-250), Online publication date: 1-Jul-2017.
- Mai V and Khalil I (2017). Design and implementation of a secure cloud-based billing model for smart meters as an Internet of things using homomorphic cryptography, Future Generation Computer Systems, 72:C, (327-338), Online publication date: 1-Jul-2017.
- Tsai P, Beckmann N and Sanchez D Jenga Proceedings of the 44th Annual International Symposium on Computer Architecture, (652-665)
- Wang K and Lin C Decoupled Affine Computation for SIMT GPUs Proceedings of the 44th Annual International Symposium on Computer Architecture, (295-306)
- Gutierrez-Alcoba A, Ortega G, Hendrix E and García I (2017). Accelerating an algorithm for perishable inventory control on heterogeneous platforms, Journal of Parallel and Distributed Computing, 104:C, (12-18), Online publication date: 1-Jun-2017.
- Khan A, Al-Mouhamed M, Al-Mulhem M and Ahmed A (2017). RT-CUDA, International Journal of Parallel Programming, 45:3, (551-594), Online publication date: 1-Jun-2017.
- Gupta S and Wilsey P Quantitative Driven Optimization of a Time Warp Kernel Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (27-38)
- Paredes M, Riley G and Luján M Vectorization of Hybrid Breadth First Search on the Intel Xeon Phi Proceedings of the Computing Frontiers Conference, (127-135)
- Wickerson J, Batty M, Sorensen T and Constantinides G (2017). Automatically comparing memory consistency models, ACM SIGPLAN Notices, 52:1, (190-204), Online publication date: 11-May-2017.
- Liu Y and Sun X (2017). Evaluating the Combined Effect of Memory Capacity and Concurrency for Many-Core Chip Design, ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2:2, (1-25), Online publication date: 5-May-2017.
- Deng S and Suresh K (2017). Topology optimization under thermo-elastic buckling, Structural and Multidisciplinary Optimization, 55:5, (1759-1772), Online publication date: 1-May-2017.
- Chow K and Zhu W Software Performance Analytics in the Cloud Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, (419-421)
- Palangappa P and Mohanram K (2017). CompEx++, ACM Transactions on Architecture and Code Optimization, 14:1, (1-30), Online publication date: 14-Apr-2017.
- Zhang Y, Anwer B, Gopalakrishnan V, Han B, Reich J, Shaikh A and Zhang Z ParaBox Proceedings of the Symposium on SDN Research, (143-149)
- Melani A, Bertogna M, Davis R, Bonifaci V, Marchetti-Spaccamela A and Buttazzo G (2017). Exact Response Time Analysis for Fixed Priority Memory-Processor Co-Scheduling, IEEE Transactions on Computers, 66:4, (631-646), Online publication date: 1-Apr-2017.
- Qin H, Liu Z, Liu Y and Zhong H (2017). An object-oriented MATLAB toolbox for automotive body conceptual design using distributed parallel optimization, Advances in Engineering Software, 106:C, (19-32), Online publication date: 1-Apr-2017.
- Brandalero M and Beck A A mechanism for energy-efficient reuse of decoding and scheduling of x86 instruction streams Proceedings of the Conference on Design, Automation & Test in Europe, (1472-1477)
- Alioto M Energy-quality scalable adaptive VLSI circuits and systems beyond approximate computing Proceedings of the Conference on Design, Automation & Test in Europe, (127-132)
- Tang Q, Basten T, Geilen M, Stuijk S and Wei J (2017). Mapping of synchronous dataflow graphs on MPSoCs based on parallelism enhancement, Journal of Parallel and Distributed Computing, 101:C, (79-91), Online publication date: 1-Mar-2017.
- Tran K, Carlson T, Koukos K, Själander M, Spiliopoulos V, Kaxiras S and Jimborean A Clairvoyance: look-ahead compile-time scheduling Proceedings of the 2017 International Symposium on Code Generation and Optimization, (171-184)
- Chen Q, Wang X, Wan H and Yang R (2017). A Logic Circuit Design for Perfecting Memristor-Based Material Implication, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36:2, (279-284), Online publication date: 1-Feb-2017.
- Wickerson J, Batty M, Sorensen T and Constantinides G Automatically comparing memory consistency models Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, (190-204)
- Ortega G, Puertas A and Garzón E (2017). Accelerating the problem of microrheology in colloidal systems on a GPU, The Journal of Supercomputing, 73:1, (370-383), Online publication date: 1-Jan-2017.
- Fernandes F, Weigel L, Jung C, Navaux P, Carro L and Rech P (2016). Evaluation of Histogram of Oriented Gradients Soft Errors Criticality for Automotive Applications, ACM Transactions on Architecture and Code Optimization, 13:4, (1-25), Online publication date: 28-Dec-2016.
- Brock J and Bruce R (2016). Power labs, Journal of Computing Sciences in Colleges, 32:2, (104-110), Online publication date: 1-Dec-2016.
- Sewall J, Pennycook S, Duran A, Tian X and Narayanaswamy R A modern memory management system for OpenMP Proceedings of the Third International Workshop on Accelerator Programming Using Directives, (25-35)
- Bederián C and Wolovick N A project-based HPC course for single-box computers Proceedings of the Workshop on Education for High Performance Computing, (1-6)
- Qu P, Yan J and Gao G Toward a Parallel Turing Machine Model Network and Parallel Computing, (191-204)
- Hahn S, Jacobs M and Reineke J Enabling Compositionality for Multicore Timing Analysis Proceedings of the 24th International Conference on Real-Time Networks and Systems, (299-308)
- Siegl P, Buchty R and Berekovic M Data-Centric Computing Frontiers Proceedings of the Second International Symposium on Memory Systems, (295-308)
- Tran K Student Research Poster Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, (458-458)
- Catalán S, Malossi A, Bekas C and Quintana-Ortí E The Impact of Voltage-Frequency Scaling for the Matrix-Vector Product on the IBM POWER8 Proceedings of the 22nd International Conference on Euro-Par 2016: Parallel Processing - Volume 9833, (103-116)
- Masliah I, Abdelfattah A, Haidar A, Tomov S, Baboulin M, Falcou J and Dongarra J High-Performance Matrix-Matrix Multiplications of Very Small Matrices Proceedings of the 22nd International Conference on Euro-Par 2016: Parallel Processing - Volume 9833, (659-671)
- Joshi A, Vollala S, Begum B and Ramasubramanian N Performance Analysis of Cache Coherence Protocols for Multi-core Architectures Proceedings of the International Conference on Advances in Information Communication Technology & Computing, (1-7)
- Darav N, Kennings A, Tabrizi A, Westwick D and Behjat L (2016). Eh?Placer, ACM Transactions on Design Automation of Electronic Systems, 21:3, (1-27), Online publication date: 26-Jul-2016.
- Shahvarani A and Jacobsen H A Hybrid B+-tree as Solution for In-Memory Indexing on CPU-GPU Heterogeneous Computing Platforms Proceedings of the 2016 International Conference on Management of Data, (1523-1538)
- Banerjee K, Banerjee S and Sarkar S Data-race detection: the missing piece for an end-to-end semantic equivalence checker for parallelizing transformations of array-intensive programs Proceedings of the 3rd ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, (1-8)
- Hong J and Kim S (2016). Flexible ECC Management for Low-Cost Transient Error Protection of Last-Level Caches, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 24:6, (2152-2164), Online publication date: 1-Jun-2016.
- Ünal E and Savaş E (2016). On Acceleration and Scalability of Number Theoretic Private Information Retrieval, IEEE Transactions on Parallel and Distributed Systems, 27:6, (1727-1741), Online publication date: 1-Jun-2016.
- Dai Y, Fang Y, Yang L and Jeon G (2016). Graphics processing unit-accelerated joint-bitplane belief propagation algorithm in DSC, The Journal of Supercomputing, 72:6, (2351-2375), Online publication date: 1-Jun-2016.
- Luppold A, Kittsteiner C and Falk H Cache-Aware Instruction SPM Allocation for Hard Real-Time Systems Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems, (77-85)
- Wilsey P Some Properties of Events Executed in Discrete-Event Simulation Models Proceedings of the 2016 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (165-176)
- Bijo S, Johnsen E, Pun K and Tarifa S An operational semantics of cache coherent multicore architectures Proceedings of the 31st Annual ACM Symposium on Applied Computing, (1219-1224)
- Elkhouly R, El-Mahdy A and Elmasry A Optimality analysis of if-conversion transformation Proceedings of the 24th High Performance Computing Symposium, (1-8)
- Savidis I, Ciftcioglu B, Xu J, Hu J, Jain M, Berman R, Xue J, Liu P, Moore D, Wicks G, Huang M, Wu H and Friedman E (2016). Heterogeneous 3-D circuits, Microelectronics Journal, 50:C, (66-75), Online publication date: 1-Apr-2016.
- Souza J, Carro L, Rutzig M and Beck A A reconfigurable heterogeneous multicore with a homogeneous ISA Proceedings of the 2016 Conference on Design, Automation & Test in Europe, (1598-1603)
- Yao Y and Lu Z Memory-access aware DVFS for network-on-chip in CMPs Proceedings of the 2016 Conference on Design, Automation & Test in Europe, (1433-1436)
- Goossens B, Parello D, Porada K and Rahmoune D Parallel Locality and Parallelization Quality Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores, (59-68)
- Fadolalkarim D, Sallam A and Bertino E PANDDE Proceedings of the Sixth ACM Conference on Data and Application Security and Privacy, (267-276)
- Johnson P and Ekstedt M (2016). The Tarpit - A general theory of software engineering, Information and Software Technology, 70:C, (181-203), Online publication date: 1-Feb-2016.
- Madarbux M, Van Laer A, Watts P and Jones T Energy Efficient And Low Latency Interconnection Network For Multicast Invalidates In Shared Memory Systems Proceedings of the 1st International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems, (1-6)
- Kanev S, Darago J, Hazelwood K, Ranganathan P, Moseley T, Wei G and Brooks D (2015). Profiling a warehouse-scale computer, ACM SIGARCH Computer Architecture News, 43:3S, (158-169), Online publication date: 4-Jan-2016.
- Lee Y, Kim J, Jang H, Yang H, Kim J, Jeong J and Lee J (2015). A fully associative, tagless DRAM cache, ACM SIGARCH Computer Architecture News, 43:3S, (211-222), Online publication date: 4-Jan-2016.
- Quéva C, Couroussé D and Charles H Self-optimisation using runtime code generation for wireless sensor networks Proceedings of the 17th International Conference on Distributed Computing and Networking, (1-6)
- Kleanthous M, Sazeides Y, Ozer E, Nicopoulos C, Nikolaou P and Hadjilambrou Z (2016). Toward Multi-Layer Holistic Evaluation of System Designs, IEEE Computer Architecture Letters, 15:1, (58-61), Online publication date: 1-Jan-2016.
- Fang Y, Hoang T, Becchi M and Chien A Fast support for unstructured data processing Proceedings of the 48th International Symposium on Microarchitecture, (533-545)
- Beyer J, Hadwiger M and Pfister H (2015). State-of-the-Art in GPU-Based Large-Scale Volume Visualization, Computer Graphics Forum, 34:8, (13-37), Online publication date: 1-Dec-2015.
- Ben Youssef B (2015). A parallel cellular automata algorithm for the deterministic simulation of 3-D multicellular tissue growth, Cluster Computing, 18:4, (1561-1579), Online publication date: 1-Dec-2015.
- Eslami H, Kougkas A, Kotsifakou M, Kasampalis T, Feng K, Lu Y, Gropp W, Sun X, Chen Y and Thakur R Efficient disk-to-disk sorting Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, (1-8)
- Liu Y and Sun X C2-bound Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, (1-11)
- Jacobs M, Hahn S and Hack S WCET analysis for multi-core processors with shared buses and event-driven bus arbitration Proceedings of the 23rd International Conference on Real Time and Networks Systems, (193-202)
- Zhang J, You S and Gruenwald L Efficient Parallel Zonal Statistics on Large-Scale Global Biodiversity Data on GPUs Proceedings of the 4th International ACM SIGSPATIAL Workshop on Analytics for Big Geospatial Data, (35-44)
- Zhang J, You S and Xia Y Prototyping A Web-based High-Performance Visual Analytics Platform for Origin-Destination Data Proceedings of the 1st International ACM SIGSPATIAL Workshop on Smart Cities and Urban Analytics, (16-23)
- Altamimi M and Naik K A Computing Profiling Procedure for Mobile Developers to Estimate Energy Cost Proceedings of the 18th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, (301-305)
- Lai B, Chen K and Wu P (2015). A High-Performance Double-Layer Counting Bloom Filter for Multicore Systems, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:11, (2473-2486), Online publication date: 1-Nov-2015.
- Diaz I, Zhang C, Hollevoet L, Svensson J, Rodrigues J, Wilhelmsson L, Olsson T, Van der Perre L and Öwall V (2015). A new digital front-end for flexible reception in software defined radio, Microprocessors & Microsystems, 39:8, (889-900), Online publication date: 1-Nov-2015.
- Gottscho M, BanaiyanMofrad A, Dutt N, Nicolau A and Gupta P (2015). DPCS, ACM Transactions on Architecture and Code Optimization, 12:3, (1-26), Online publication date: 6-Oct-2015.
- Oxley M, Pasricha S, Maciejewski A, Siegel H, Apodaca J, Young D, Briceno L, Smith J, Bahirat S, Khemka B, Ramirez A and Zou Y (2015). Makespan and Energy Robust Stochastic Static Resource Allocation of a Bag-of-Tasks to a Heterogeneous Computing System, IEEE Transactions on Parallel and Distributed Systems, 26:10, (2791-2805), Online publication date: 1-Oct-2015.
- Zhu F, Yao Y, Tang W and Chen D (2015). A high performance framework for modeling and simulation of large-scale complex systems, Future Generation Computer Systems, 51:C, (132-141), Online publication date: 1-Oct-2015.
- Abadal S, Nemirovsky M, Alarcón E and Cabellos-Aparicio A Networking Challenges and Prospective Impact of Broadcast-Oriented Wireless Networks-on-Chip Proceedings of the 9th International Symposium on Networks-on-Chip, (1-8)
- Sanchez E and Reorda M (2015). On the Functional Test of Branch Prediction Units, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23:9, (1675-1688), Online publication date: 1-Sep-2015.
- Zhang H, Chen G, Ooi B, Tan K and Zhang M (2015). In-Memory Big Data Management and Processing: A Survey, IEEE Transactions on Knowledge and Data Engineering, 27:7, (1920-1948), Online publication date: 1-Jul-2015.
- Kandemir M, Zhao H, Tang X and Karakoy M (2015). Memory Row Reuse Distance and its Role in Optimizing Application Performance, ACM SIGMETRICS Performance Evaluation Review, 43:1, (137-149), Online publication date: 24-Jun-2015.
- Li A, Tay Y, Kumar A and Corporaal H Transit Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, (101-106)
- Kandemir M, Zhao H, Tang X and Karakoy M Memory Row Reuse Distance and its Role in Optimizing Application Performance Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, (137-149)
- Ul-Abdin Z and Svensson B Towards teaching embedded parallel computing Proceedings of the Workshop on Computer Architecture Education, (1-6)
- Kanev S, Darago J, Hazelwood K, Ranganathan P, Moseley T, Wei G and Brooks D Profiling a warehouse-scale computer Proceedings of the 42nd Annual International Symposium on Computer Architecture, (158-169)
- Tan Z, Qian Z, Chen X, Asanovic K and Patterson D (2015). DIABLO, ACM SIGARCH Computer Architecture News, 43:1, (207-221), Online publication date: 29-May-2015.
- Cilku B and Puschner P (2015). Designing a time predictable memory hierarchy for single-path code, ACM SIGBED Review, 12:2, (16-21), Online publication date: 20-May-2015.
- Mozafari S, Meyer B and Skadron K Yield-aware Performance-Cost Characterization for Multi-Core SIMT Proceedings of the 25th edition on Great Lakes Symposium on VLSI, (237-240)
- Tan Z, Qian Z, Chen X, Asanovic K and Patterson D (2015). DIABLO, ACM SIGPLAN Notices, 50:4, (207-221), Online publication date: 12-May-2015.
- Gallenmüller S, Emmerich P, Wohlfart F, Raumer D and Carle G Comparison of Frameworks for High-Performance Packet IO Proceedings of the Eleventh ACM/IEEE Symposium on Architectures for networking and communications systems, (29-38)
- Li W, Jin G, Cui X and See S An evaluation of unified memory technology on NVIDIA GPUs Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, (1092-1098)
- Subedi T, Nguyen K and Cheriet M (2015). OpenFlow-based in-network Layer-2 adaptive multipath aggregation in data centers, Computer Communications, 61:C, (58-69), Online publication date: 1-May-2015.
- Zhang J, You S and Gruenwald L (2015). Large-scale spatial data processing on GPUs and GPU-accelerated clusters, SIGSPATIAL Special, 6:3, (27-34), Online publication date: 22-Apr-2015.
- Damodaran P, Zaib A, Wallentowitz S, Wild T and Herkersdorf A Sharer status-based caching in tiled multiprocessor systems-on-chip Proceedings of the Symposium on High Performance Computing, (67-74)
- Carretero J, Distefano S, Petcu D, Pop D, Rauber T, Runger G and Singh D (2015). Energy-efficient Algorithms for Ultrascale Systems, Supercomputing Frontiers and Innovations: an International Journal, 2:2, (77-104), Online publication date: 6-Apr-2015.
- Wang L, Zhou M, Zhang Z, Shan M and Zhou A (2015). NUMA-Aware Scalable and Efficient In-Memory Aggregation on Large Domains, IEEE Transactions on Knowledge and Data Engineering, 27:4, (1071-1084), Online publication date: 1-Apr-2015.
- Cilku B, Kammerer R and Puschner P (2015). Aligning single path loops to reduce the number of capacity cache misses, ACM SIGBED Review, 12:1, (13-18), Online publication date: 27-Mar-2015.
- Tan Z, Qian Z, Chen X, Asanovic K and Patterson D DIABLO Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, (207-221)
- Chaker H, Cudennec L, Dahmani S, Gogniat G and Sepúlveda M Cycle-based Model to Evaluate Consistency Protocols within a Multi-protocol Compilation Tool-chain Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, (1-10)
- Fox A and Patterson D (2015). Do-it-yourself textbook publishing, Communications of the ACM, 58:2, (40-43), Online publication date: 28-Jan-2015.
- Lazarescu M and Lavagno L (2015). Interactive Trace-Based Analysis Toolset for Manual Parallelization of C Programs, ACM Transactions on Embedded Computing Systems, 14:1, (1-20), Online publication date: 21-Jan-2015.
- Gadouleau M and Riis S (2015). Memoryless computation, Theoretical Computer Science, 562:C, (129-145), Online publication date: 11-Jan-2015.
- Kiran D, Gurunarayanan S, Misra J and Nawal A (2015). Global scheduling heuristics for multicore architecture, Scientific Programming, 2015, (18-18), Online publication date: 1-Jan-2015.
- Riemens D, Gaydadjiev G, Zeeuw C and Strydis C (2014). Towards scalable arithmetic units with graceful degradation, ACM Transactions on Embedded Computing Systems, 13:4, (1-26), Online publication date: 5-Dec-2014.