2021 IEEE International Symposium on Workload Characterization (IISWC), 2021
An increasing number of edge systems have large computational demands, stringent resource constraints, and end-to-end quality-driven goodness metrics. Architects have embraced domain-specific accelerators to meet the demands of such systems. We make the case for research that shifts emphasis from domain-specific accelerators to domain-specific systems, with a consequent shift from evaluations using benchmarks that are collections of independent applications to those using testbeds that are full integrated systems. We describe extended reality (XR) as an exciting domain motivating such domain-specific systems research, but hampered by the lack of an end-to-end evaluation testbed. We present ILLIXR (Illinois Extended Reality testbed), the first fully open source XR system and research testbed. ILLIXR enables system innovations with end-to-end co-designed hardware, compiler, OS, and algorithm, and driven by end-user perceived quality-of-experience (QoE) metrics. Using ILLIXR, we perfor...
As the need for specialization increases and architectures become increasingly domain-specific, it is important for architects to understand the requirements of emerging application domains. Augmented and virtual reality (AR/VR) or extended reality (XR) is one such important domain. This paper presents a generic XR workflow and the first benchmark suite, ILLIXR (Illinois Extended Reality Benchmark Suite), that represents key computations from this workflow. Our analysis shows a large number of interesting implications for architects, including demanding performance, thermal, and energy requirements and a large diversity of critical tasks such that an accelerator per task is likely to overshoot area constraints. ILLIXR and our analysis have the potential to propel new directions in architecture research in general, and impact XR in particular. ILLIXR is open-source and available at this https URL
Hardware specialization is becoming a key enabler of energy-efficient performance. Future systems will be increasingly heterogeneous, integrating multiple specialized and programmable accelerators, each with different memory demands. Traditionally, communication between accelerators has been inefficient, typically orchestrated through explicit DMA transfers between different address spaces. More recently, industry has proposed unified coherent memory which enables implicit data movement and more data reuse, but often these interfaces limit the coherence flexibility available to heterogeneous systems. This paper demonstrates the benefits of fine-grained coherence specialization for heterogeneous systems. We propose an architecture that enables low-complexity independent specialization of each individual coherence request in heterogeneous workloads by building upon a simple and flexible baseline coherence interface, Spandex. We then describe how to optimize individual memory requests t...
Goal: to integrate specialization techniques from the OS community (hybrid runtimes) and the DB community (compiled queries) for high-performance querying on big data.
• Certain abstractions improve generality, but get in the way of performance at exascale computing.
• In specific cases, give up flexibility and generality in exchange for performance.
• We prototype NautDB: specialized using a hybrid runtime kernel (based on the Nautilus Aerokernel [1]) and executing pre-compiled building-blocks.
• We demonstrate performance benefits in certain cases, while maintaining a simple interface for users.
Specialized Hybrid Runtimes [1]
• Kernel + Runtime run in ring 0 (unikernel-like) ⇒ fewer context switches
• Partition physical resources between general purpose OS and hybrid runtime [2] ⇒ can call general purpose OS where needed.
• Unikernel-inspired design gives programmer fine-grained control over . . .
• Not time-shared ⇒ No context-switches or thread-migration
• Avoid interrupts ⇒ Faster and mor...
Papers by Samuel Grayson