Design and Analysis of Approximate Redundant Binary Multipliers
As technology scaling is reaching its limits, new approaches have been proposed for computational efficiency. Approximate computing is a promising technique for high-performance, low-power circuits used in error-tolerant applications. Among ...
Handling Transients of Dynamic Real-Time Workload Under EDF Scheduling
Real-time dynamic workload consists of tasks that can arbitrarily join and leave the system at run-time. To avoid incurring deadline misses, tasks that request to join the system must pass an admission test, which has to cope with potential scheduling ...
Increasing the Reliability of Software Timing Analysis for Cache-Based Processors
Real-time systems are witnessing a significant increase in critical software's size, complexity, and performance needs, which can only be satisfied with high-performance hardware features. Cache memories, pervasively used to improve average performance, ...
Learning-Based Application-Agnostic 3D NoC Design for Heterogeneous Manycore Systems
- Biresh Kumar Joardar,
- Ryan Gary Kim,
- Janardhan Rao Doppa,
- Partha Pratim Pande,
- Diana Marculescu,
- Radu Marculescu
The rising use of deep learning and other big-data algorithms has led to an increasing demand for hardware platforms that are computationally powerful, yet energy-efficient. Due to the amount of data parallelism in these algorithms, high-performance ...
Promoting the Harmony between Sparsity and Regularity: A Relaxed Synchronous Architecture for Convolutional Neural Networks
There are two approaches to improving the performance of Convolutional Neural Networks (CNNs): 1) accelerating computation and 2) reducing the amount of computation. The acceleration approaches take advantage of CNN computing regularity which enables ...
Resource-Oriented Partitioning for Multiprocessor Systems with Shared Resources
Predictable scheduling and resource sharing primitives are fundamental aspects of real-time systems. To prevent race conditions, access to shared resources must ensure mutual exclusion, e.g., using semaphores. Further, real-time locking protocols are ...
Self-Synchronized Encryption for Physical Layer in 10Gbps Optical Links
In this work, a new self-synchronized encryption method for 10 Gigabit optical links is proposed and developed. Necessary modifications to introduce this kind of encryption in physical layers based on 64b/66b encoding, such as 10GBASE-R, have been ...
SparCE: Sparsity Aware General-Purpose Core Extensions to Accelerate Deep Neural Networks
Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. The enormous computational demand posed by DNNs is a key challenge for computing system designers and has most commonly been addressed ...
The Concept of Unschedulability Core for Optimizing Real-Time Systems with Fixed-Priority Scheduling
In the design optimization of real-time systems scheduled with fixed priority, schedulability analysis is used to define the feasibility region within which tasks meet their deadlines, so that optimization algorithms can find the best solution within ...
Value Iteration Architecture Based Deep Learning for Intelligent Routing Exploiting Heterogeneous Computing Platforms
Recently, the rapid advancement of high computing platforms has accelerated the development and applications of artificial intelligence techniques. Deep learning, which has been regarded as the next paradigm to revolutionize users' experiences, has ...