- research-article, June 2024
CuPBoP: Making CUDA a Portable Language
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 29, Issue 4, Article No.: 60, Pages 1–25
https://doi.org/10.1145/3659949
CUDA is designed specifically for NVIDIA GPUs and is not compatible with non-NVIDIA devices. Enabling CUDA execution on alternative backends could greatly benefit the hardware community by fostering a more diverse software ecosystem.
To address the need ...
- research-article, July 2023
Cache Programming for Scientific Loops Using Leases
- Benjamin Reber,
- Matthew Gould,
- Alexander H. Kneipp,
- Fangzhou Liu,
- Ian Prechtl,
- Chen Ding,
- Linlin Chen,
- Dorin Patru
ACM Transactions on Architecture and Code Optimization (TACO), Volume 20, Issue 3, Article No.: 39, Pages 1–25
https://doi.org/10.1145/3600090
Cache management is important in exploiting locality and reducing data movement. This article studies a new type of programmable cache called the lease cache. By assigning leases, software exerts the primary control on when and how long data stays in the ...
- research-article, September 2022
COX: Exposing CUDA Warp-level Functions to CPUs
ACM Transactions on Architecture and Code Optimization (TACO), Volume 19, Issue 4, Article No.: 59, Pages 1–25
https://doi.org/10.1145/3554736
As CUDA becomes the de facto programming language among data parallel applications such as high-performance computing or machine learning applications, running CUDA on other platforms becomes a compelling option. Although several efforts have attempted to ...
- research-article, November 2017
A compiler transformation-based approach to scientific workflow enactment
WORKS '17: Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science, Article No.: 4, Pages 1–11
https://doi.org/10.1145/3150994.3150999
We investigate in this paper the application of compiler transformations to workflow applications using the Manycore Workflow Runtime Environment (MWRE), a compiler-based workflow environment for modern manycore computing architectures. MWRE translates ...
- research-article, October 2016
ROP Gadget Prevalence and Survival under Compiler-based Binary Diversification Schemes
SPRO '16: Proceedings of the 2016 ACM Workshop on Software PROtection, Pages 15–26
https://doi.org/10.1145/2995306.2995309
Diversity has been suggested as an effective alternative to the current trend in rules-based approaches to cybersecurity. However, little work to date has focused on how various techniques generalize to new attacks. That is, there is no accepted ...
- research-article, September 2016
Reduction Drawing: Language Constructs and Polyhedral Compilation for Reductions on GPU
PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, Pages 87–97
https://doi.org/10.1145/2967938.2967950
Reductions are common in scientific and data-crunching codes, and a typical source of bottlenecks on massively parallel architectures such as GPUs. Reductions are memory-bound, and achieving peak performance involves sophisticated optimizations. There ...
- research-article, September 2012
Integrating Memory Optimization with Mapping Algorithms for Multi-Processors System-on-Chip
- Bruno Girodias,
- Luiza Gheorghe Iugan,
- Youcef Bouchebaba,
- Gabriela Nicolescu,
- El Mostapha Aboulhamid,
- Michel Langevin,
- Pierre Paulin
ACM Transactions on Embedded Computing Systems (TECS), Volume 11, Issue 3, Article No.: 64, Pages 1–26
https://doi.org/10.1145/2345770.2345776
Due to their great ability to parallelize at a very high integration level, Multi-Processors Systems-on-Chip (MPSoCs) are good candidates for systems and applications such as multimedia. Memory is becoming a key player for significant improvements in ...
- Article, September 2011
Design and Implementation of OpenMP Tasks in the OMPi Compiler
PCI '11: Proceedings of the 2011 15th Panhellenic Conference on Informatics, Pages 265–269
https://doi.org/10.1109/PCI.2011.34
In this paper we present the design and implementation of tasks in the context of the OMPi OpenMP compiler. The modular architecture of OMPi's runtime system allows a wide range of choices for experimenting with OpenMP structures. We present two ...
- article, September 2007
MPSoC memory optimization using program transformation
- Youcef Bouchebaba,
- Bruno Girodias,
- Gabriela Nicolescu,
- El Mostapha Aboulhamid,
- Bruno Lavigueur,
- Pierre Paulin
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 12, Issue 4, Pages 43–es
https://doi.org/10.1145/1278349.1278356
Multiprocessor system-on-a-chip (MPSoC) architectures have received a lot of attention in the past years, but few advances in compilation techniques target these architectures. This is particularly true for the exploitation of data locality. Most of the ...
- article, January 2006
Improving the energy behavior of block buffering using compiler optimizations
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 11, Issue 1, Pages 228–250
https://doi.org/10.1145/1124713.1124727
On-chip caches consume a significant fraction of the energy in current microprocessors. As a result, architectural/circuit-level techniques such as block buffering and sub-banking have been proposed and shown to be very effective in reducing the energy ...
- research-article, September 1999
Statically Safe Speculative Execution for Real-Time Systems
IEEE Transactions on Software Engineering (ISOF), Volume 25, Issue 5, Pages 701–721
https://doi.org/10.1109/32.815328
Deterministic worst-case execution for satisfying hard-real-time constraints, and speculative execution with rollback for improving average-case throughput, appear to lie on opposite ends of a spectrum of performance requirements and strategies. ...