Newsletter Downloads
AOT vs. JIT: impact of profile data on code quality
Just-in-time (JIT) compilation during program execution and ahead-of-time (AOT) compilation during software installation are alternate techniques used by managed language virtual machines (VM) to generate optimized native code while simultaneously ...
Adaptive optimization for OpenCL programs on embedded heterogeneous systems
Heterogeneous multi-core architectures consisting of CPUs and GPUs are commonplace in today’s embedded systems. These architectures offer potential for energy efficient computing if the application task is mapped to the right core. Realizing such ...
Auto-vectorization for image processing DSLs
The parallelization of programs and distributing their workloads to multiple threads can be a challenging task. In addition to multi-threading, harnessing vector units in CPUs proves highly desirable. However, employing vector units to speed up ...
Dynamic translation of structured Loads/Stores and register mapping for architectures with SIMD extensions
More and more modern processors have been supporting non-contiguous SIMD data accesses. However, translating such instructions has been overlooked in the Dynamic Binary Translation (DBT) area. For example, in the popular QEMU dynamic binary translator, ...
Optimal functional unit assignment and voltage selection for pipelined MPSoC with guaranteed probability on time performance
Pipelined heterogeneous multiprocessor system-on-chip (MPSoC) can provide high throughput for streaming applications. In the design of such systems, time performance and system cost are the most concerning issues. By analyzing runtime behaviors of ...
Integrated IoT programming with selective abstraction
The explosion of networked devices has driven a new computing environment called the Internet of Things (IoT), enabling various services such as home automation and health monitoring. Despite the promising applicability of the IoT, developing an IoT ...
Towards SMT-based LTL model checking of clock constraint specification language for real-time and embedded systems
The Clock Constraint Specification Language (CCSL) is a formal language companion to MARTE (shorthand for Modeling and Analysis of Real-Time and Embedded systems), a UML profile used to facilitate the design and analysis of real-time and embedded ...
Integrating task scheduling and cache locking for multicore real-time embedded systems
Modern embedded processors provide hardware support for cache locking, a mechanism used to facilitate the WCET (Worst-Case Execution Time) calculation of a task. We investigate the problem of integrating task scheduling and cache locking for a set of ...
Towards memory-efficient processing-in-memory architecture for convolutional neural networks
Convolutional neural networks (CNNs) are widely adopted in artificial intelligent systems. In contrast to conventional computing centric applications, the computational and memory resources of CNN applications are mixed together in the network weights. ...
Unified nvTCAM and sTCAM architecture for improving packet matching performance
Software-Defined Networking (SDN) allows controlling applications to install fine-grained forwarding policies in the underlying switches. Ternary Content Addressable Memory (TCAM) enables fast lookups in hardware switches with flexible wildcard rule ...
A lightweight progress maximization scheduler for non-volatile processor under unstable energy harvesting
Energy harvesting techniques become increasingly popular as power supplies for embedded systems. However, the harvested energy is intrinsically unstable. Thus, the program execution may be interrupted frequently. Although the development of non-...
OSEK-V: application-specific RTOS instantiation in hardware
The employment of a real-time operating system (RTOS) in an embedded control systems is often an all-or-nothing decision: While the RTOS-abstractions provide for easier software composition and development, the price in terms of event latencies and ...