- research-article, October 2018
Architecting a hardware-managed hybrid DIMM optimized for cost/performance
- Fred Ware,
- Javier Bueno,
- Liji Gopalakrishnan,
- Brent Haukness,
- Chris Haywood,
- Toni Juan,
- Eric Linstadt,
- Sally A. McKee,
- Steven C. Woo,
- Kenneth L. Wright,
- Craig Hampel,
- Gary Bronner
MEMSYS '18: Proceedings of the International Symposium on Memory Systems, Pages 327–340
https://doi.org/10.1145/3240302.3240303
Rapidly evolving workloads and exploding data volumes place great pressure on data-center compute, IO, and memory performance, and especially on memory capacity. Increasing memory capacity requires a commensurate reduction in memory cost per bit. DRAM ...
- Article, October 2009
Reusing cached schedules in an out-of-order processor with in-order issue logic
The complex and powerful out-of-order issue logic dismisses the repetitive nature of the code, unlike what caches or branch predictors do. We show that in 90% of the cycles, the group of instructions selected by the issue logic belongs to just 13% of the ...
- research-article, January 2009
Larrabee: A Many-Core x86 Architecture for Visual Computing
- Larry Seiler,
- Doug Carmean,
- Eric Sprangle,
- Tom Forsyth,
- Pradeep Dubey,
- Stephen Junkins,
- Adam Lake,
- Robert Cavin,
- Roger Espasa,
- Ed Grochowski,
- Toni Juan,
- Michael Abrash,
- Jeremy Sugerman,
- Pat Hanrahan
The Larrabee many-core visual computing architecture uses multiple in-order x86 cores augmented by wide vector processor units, together with some fixed-function logic. This increases the architecture's programmability as compared to standard GPUs. The ...
- research-article, August 2008
Larrabee: a many-core x86 architecture for visual computing
- Larry Seiler,
- Doug Carmean,
- Eric Sprangle,
- Tom Forsyth,
- Michael Abrash,
- Pradeep Dubey,
- Stephen Junkins,
- Adam Lake,
- Jeremy Sugerman,
- Robert Cavin,
- Roger Espasa,
- Ed Grochowski,
- Toni Juan,
- Pat Hanrahan
SIGGRAPH '08: ACM SIGGRAPH 2008 papers, Article No.: 18, Pages 1–15
https://doi.org/10.1145/1399504.1360617
This paper presents a many-core visual computing architecture code named Larrabee, a new software rendering pipeline, a manycore programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are ...
- research-article, August 2008
Larrabee: a many-core x86 architecture for visual computing
- Larry Seiler,
- Doug Carmean,
- Eric Sprangle,
- Tom Forsyth,
- Michael Abrash,
- Pradeep Dubey,
- Stephen Junkins,
- Adam Lake,
- Jeremy Sugerman,
- Robert Cavin,
- Roger Espasa,
- Ed Grochowski,
- Toni Juan,
- Pat Hanrahan
ACM Transactions on Graphics (TOG), Volume 27, Issue 3, Pages 1–15
https://doi.org/10.1145/1360612.1360617
This paper presents a many-core visual computing architecture code named Larrabee, a new software rendering pipeline, a manycore programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are ...
- Article, May 2002
Tarantula: a vector extension to the alpha architecture
- Roger Espasa,
- Federico Ardanaz,
- Joel Emer,
- Stephen Felix,
- Julio Gago,
- Roger Gramunt,
- Isaac Hernandez,
- Toni Juan,
- Geoff Lowney,
- Matthew Mattina,
- André Seznec
ISCA '02: Proceedings of the 29th annual international symposium on Computer architecture, Pages 281–292
Tarantula is an aggressive floating point machine targeted at technical, scientific and bioinformatics workloads, originally planned as a follow-on candidate to the EV8 processor [6, 5]. Tarantula adds to the EV8 core a vector unit capable of 32 double-...
Also Published in:
ACM SIGARCH Computer Architecture News: Volume 30 Issue 2
- research-article, February 2002
Asim: A Performance Model Framework
- Joel Emer,
- Pritpal Ahuja,
- Eric Borch,
- Artur Klauser,
- Chi-Keung Luk,
- Srilatha Manne,
- Shubhendu S. Mukherjee,
- Harish Patil,
- Steven Wallace,
- Nathan Binkert,
- Roger Espasa,
- Toni Juan
The longevity and usefulness of a microprocessor performance model has historically depended on the model writer's skills and discipline. However, at Compaq the models became extremely complex and unmanageable because designers lacked a structured way to ...
- Article, April 1998
Dynamic history-length fitting: a third level of adaptivity for branch prediction
ISCA '98: Proceedings of the 25th annual international symposium on Computer architecture, Pages 155–166
https://doi.org/10.1145/279358.279379
Accurate branch prediction is essential for obtaining high performance in pipelined superscalar processors that execute instructions speculatively. Some of the best current predictors combine a part of the branch address with a fixed amount of global ...
Also Published in:
ACM SIGARCH Computer Architecture News: Volume 26 Issue 3
- Article, May 1996
The difference-bit cache
ISCA '96: Proceedings of the 23rd annual international symposium on Computer architecture, Pages 114–120
https://doi.org/10.1145/232973.232986
The difference-bit cache is a two-way set-associative cache with an access time that is smaller than that of a conventional one and close or equal to that of a direct-mapped cache. This is achieved by noticing that the two tags for a set have to differ ...
Also Published in:
ACM SIGARCH Computer Architecture News: Volume 24 Issue 2
- Article, July 1994
MOB forms: a class of multilevel block algorithms for dense linear algebra operations
ICS '94: Proceedings of the 8th international conference on Supercomputing, Pages 354–363
https://doi.org/10.1145/181181.181561
Multilevel block algorithms exploit the data locality in linear algebra operations when executed in machines with several levels in the memory hierarchy. It is shown that the family we call Multilevel Orthogonal Block (MOB) algorithms is optimal and ...