ACM Transactions on Embedded Computing Systems
Just Accepted
The below articles have been recently accepted to the journal and are currently in the production process. These Author’s Accepted Manuscripts (AAM) will be available for preview until the “Version of Record” is available and assigned to its proper issue. The AAM carries the article’s permanent DOI and can be cited immediately.
research-article
Free
August 2024
JUST ACCEPTED
Revealing CNN Architectures via Side-Channel Analysis in Dataflow-based Inference Accelerators

Convolutional Neural Networks (CNNs) are widely used in various domains, including image recognition, medical diagnosis and autonomous driving. Recent advances in dataflow-based CNN accelerators have enabled CNN inference in resource-constrained edge ...

research-article
Free
August 2024
JUST ACCEPTED
Optimizing Dilithium Implementation with AVX2/-512

Dilithium is a signature scheme that is currently being standardized to the Module-Lattice-Based Digital Signature Standard by NIST. It is believed to be secure even against attacks from large-scale quantum computers based on lattice problems. The ...

research-article
Free
August 2024
JUST ACCEPTED
Combining Weight Approximation, Sharing and Retraining for Neural Network Model Compression

Neural network model compression is very important to achieve model deployment based on the memory and storage available in different computing systems. Generally, the continuous drive for higher accuracy in these models increases their size and ...

research-article
Free
August 2024
JUST ACCEPTED
Transient Fault Detection in Tensor Cores for Modern GPUs

Deep neural networks (DNNs) have emerged as an effective solution for many machine learning applications. However, the great success comes with the cost of excessive computation. The Volta graphics processing unit (GPU) from NVIDIA introduced a ...

research-article
Free
August 2024
JUST ACCEPTED
High Performance and Predictable Shared Last-level Cache for Safety-Critical Systems

We propose ZeroCost-LLC (ZCLLC), a novel shared inclusive last-level cache (LLC) design for timing predictable multi-core platforms that offers lower worst-case latency (WCL) when compared to a traditional shared inclusive LLC design. ZCLLC achieves low ...

research-article
Free
August 2024
JUST ACCEPTED
Tutorial: A Novel Runtime Environment for Accelerator-Rich Heterogeneous Architectures

As the landscape of computing advances, system designers are increasingly exploring methodologies that leverage higher levels of heterogeneity to enhance performance within constrained size, weight, power, and cost parameters. CEDR stands as an ecosystem ...

research-article
Free
August 2024
JUST ACCEPTED
Load-balanced Routing Heuristics for Bandwidth Allocation of AVB Flow in TSN

Time-Sensitive Networking (TSN) is a new technology developed from Ethernet that guarantees deterministic transmission of various types of flows, such as Time-triggered (TT) flows and Audio-video-bridging (AVB) flows, in the same network. Currently, Time-...

research-article
Free
August 2024
JUST ACCEPTED
Software Optimization and Design Methodology for Low Power Computer Vision Systems

This tutorial paper addresses a low power computer vision system as an example of a growing application domain of neural networks, exploring various technologies developed to enhance accuracy within the resource and performance constraints imposed by the ...

research-article
Free
August 2024
JUST ACCEPTED
NeuroTAP: Thermal and Memory Access Pattern-aware Data Mapping on 3D DRAM for Maximizing DNN Performance

Deep neural networks (DNNs) have been widely adopted, owing to break-through performance and high accuracy. DNNs exhibit varying memory behavior involving specific and recognizable memory access patterns and access intensity, depending on the selected ...

research-article
Free
August 2024
JUST ACCEPTED
A Comprehensive Study of Systems Challenges in Visual Simultaneous Localization and Mapping Systems

Visual SLAM systems are concurrent, performance-critical systems that respond to real-time environmental conditions and are frequently deployed on resource-constrained hardware. Previous work has identified three interconnected systems challenges to ...

research-article
Free
July 2024
JUST ACCEPTED
OffloaD: Detection Failure-based Scheduler for Offloading Object Detection

The current times ask for resource-constrained devices such as drones, light mobile robots, XR glasses, or mobile phones to perform object detection efficiently and in real time. However, when executed on the device, object detection fails to achieve the ...

research-article
Free
July 2024
JUST ACCEPTED
APB-tree: An Adaptive Pre-built Tree Indexing Scheme for NVM-based IoT Systems

With the proliferation of sensors and the emergence of novel applications, IoT data has grown exponentially in recent years. Given this trend, efficient data management is crucial for a system to easily access vast amounts of information. For decades, B+-...

research-article
Free
July 2024
JUST ACCEPTED
DOCTOR: A Multi-Disease Detection Continual Learning Framework Based on Wearable Medical Sensors

Modern advances in machine learning (ML) and wearable medical sensors (WMSs) in edge devices have enabled ML-driven disease detection for smart healthcare. Conventional ML-driven methods for disease detection rely on customizing individual models for each ...

research-article
Free
July 2024
JUST ACCEPTED
Co-Approximator: Enabling Performance Prediction in Colocated Applications

Today’s Internet of Things (IoT) devices can colocate multiple applications on a platform with hardware resource sharing. Such colocations allow for increasing the throughput of contemporary IoT applications, similar to the use of multi-tenancy in clouds. ...

research-article
Free
July 2024
JUST ACCEPTED
Trust Based Active Game Data Collection Scheme in Smart Cities

The concept of a smart city is to equip sensors to various objects in urban life to monitor areas and collect sensing data, and make wise decisions based on the collected data. However, some malicious sensor devices may interrupt and interfere with data ...

research-article
Open Access
PredATW: Predicting the Asynchronous Time Warp Latency For VR Systems

With the advent of low-power ultra-fast hardware and GPUs, virtual reality (VR) has gained a lot of prominence in the last few years and is being used in various areas such as education, entertainment, scientific visualization, and computer-aided design. ...

research-article
Free
July 2024
JUST ACCEPTED
Lightweight Champions of the World: Side-Channel Resistant Open Hardware for Finalists in the NIST Lightweight Cryptography Standardization Process

Cryptographic competitions played a significant role in stimulating the development and release of open hardware for cryptography. The primary reason was the focus of standardization organizations and other contest organizers on transparency and fairness ...

research-article
Free
July 2024
JUST ACCEPTED
Performance and Communication Cost of Hardware Accelerators for Hashing in Post-Quantum Cryptography

SPHINCS+ is a signature scheme included in the first NIST post-quantum standard, that bases its security on the underlying hash primitive. As most of the runtime of SPHINCS+ is caused by the evaluation of several hash- and pseudo-random functions, ...

research-article
Free
July 2024
JUST ACCEPTED
LiteHash: Hash Functions for Resource-Constrained Hardware

The global paradigm shift towards edge computing has led to a growing demand for efficient integrity verification. Hash functions are one-way algorithms which act as a zero-knowledge proof of a datum’s contents. However, it is infeasible to compute hashes ...

research-article
Open Access
A Hybrid Sparse-dense Defensive DNN Accelerator Architecture against Adversarial Example Attacks

Understanding how to defend against adversarial attacks is crucial for ensuring the safety and reliability of these systems in real-world applications. Various adversarial defense methods are proposed, which aim to improve the robustness of neural ...

research-article
Free
June 2024
JUST ACCEPTED
NIR-sighted: A Programmable Streaming Architecture for Low-Energy Human-Centric Vision Applications

Human studies often rely on wearable lifelogging cameras that capture videos of individuals and their surroundings to aid in visual confirmation or recollection of daily activities like eating, drinking and smoking. However, this may include private or ...

research-article
Open Access
TEFLON: Thermally Efficient Dataflow-Aware 3D NoC for Accelerating CNN Inferencing on Manycore PIM Architectures

Resistive random-access memory (ReRAM) based processing-in-memory (PIM) architectures are used extensively to accelerate inferencing/training with convolutional neural networks (CNNs). Three-dimensional (3D) integration is an enabling technology to ...

research-article
Free
March 2024
JUST ACCEPTED
REC: REtime Convolutional layers to fully exploit harvested energy for ReRAM-based CNN accelerators

As the Internet of Things (IoTs) increasingly combines AI technology, it is a trend to deploy neural network algorithms at edges and make IoT devices more intelligent than ever. Moreover, energy-harvesting technology-based IoT devices have shown the ...

research-article
Free
March 2024
JUST ACCEPTED
Implementing Privacy Homomorphism with Random Encoding and Computation Controlled by a Remote Secure Server

Remote IoT devices face significant security risks due to their inherent physical vulnerability. An adversarial actor with sufficient capability can monitor the devices or exfiltrate data to access sensitive information. Remotely deployed devices such as ...

research-article
Free
February 2024
JUST ACCEPTED
SPIMulator: A Spintronic Processing-In-Memory Simulator for Racetracks

In-memory processing is becoming a popular method to alleviate the memory bottleneck of the von Neumann computing model. With the goal of improving both latency and energy cost associated with such in-memory processing, emerging non-volatile memory ...

research-article
Free
January 2024
JUST ACCEPTED
Customized FPGA Implementation of Authenticated Lightweight Cipher Fountain for IoT Systems

Authenticated Encryption with Associated-Data (AEAD) can ensure both confidentiality and integrity of information in encrypted communication. Distinctive variants are customized from AEAD to satisfy various requirements. In this paper, we take a 128-bit ...

research-article
Free
January 2024
JUST ACCEPTED
Securing Pacemakers using Runtime Monitors over Physiological Signals

Wearable and implantable medical devices (IMDs) are increasingly deployed to diagnose, monitor, and provide therapy for critical medical conditions. Such medical devices are safety-critical cyber-physical systems (CPSs). These systems support wireless ...

research-article
Free
December 2023
JUST ACCEPTED
A Design Flow for Scheduling Spiking Deep Convolutional Neural Networks on Heterogeneous Neuromorphic System-on-Chip

Neuromorphic systems-on-chip (NSoCs) integrate CPU cores and neuromorphic hardware accelerators on the same chip. These platforms can execute spiking deep convolutional neural networks (SDCNNs) with a low energy footprint. Modern NSoCs are heterogeneous ...

research-article
Free
November 2023
JUST ACCEPTED
IoV-Fog-Assisted Framework for Accident Detection and Classification

The evolution of vehicular research into an effectuating area like the Internet of Vehicles (IoV) was verified by technical developments in hardware. The integration of the Internet of Things (IoT) and Vehicular Ad-hoc Networks (VANET) has significantly ...

research-article
Free
October 2023
JUST ACCEPTED
Reachability Analysis of Sigmoidal Neural Networks

This paper extends the star set reachability approach to verify the robustness of feed-forward neural networks (FNNs) with sigmoidal activation functions such as Sigmoid and TanH. The main drawbacks of the star set approach in Sigmoid/TanH FNN ...

research-article
Open Access
Deterministic Coordination Across Multiple Timelines

We discuss a novel approach for constructing deterministic reactive systems that revolves around a temporal model that incorporates a multiplicity of timelines. This model is central to Lingua Franca (LF), a polyglot coordination language and compiler ...

research-article
Synchronised Shared Memory and Model Checking

In this paper, a formal generic framework for defining and reasoning about deterministic concurrency in synchronous systems is implemented in the Spin model checker. Concretely, the paper implements the clock-synchronised shared memory (csm) theory, which ...

research-article
Free
September 2023
JUST ACCEPTED
An Asynchronous Compaction Acceleration Scheme for Near-Data Processing-enabled LSM-Tree-based KV Stores

LSM-tree-based key-value stores (KV stores) convert random-write requests to sequence-write ones to achieve high I/O performance. Meanwhile, compaction operations in KV stores update SSTables in forms of reorganizing low-level data components to high-...

research-article
Free
September 2023
JUST ACCEPTED
Evolution Function Based Reach-Avoid Verification for Time-varying Systems with Disturbances

In this work, we investigate the reach-avoid problem of a class of time-varying analytic systems with disturbances described by uncertain parameters. Firstly, by proposing the concepts of maximal and minimal reachable sets, we connect the avoidability and ...

research-article
Free
September 2023
JUST ACCEPTED
AMULET: a Mutation Language Enabling Automatic Enrichment of SysML Models

SysML models are widely used for designing and analyzing complex systems. Model-based design methods often require successive modifications of the models, whether for incrementally refining the design (e.g. in agile development methods) or for testing ...

research-article
Free
September 2023
JUST ACCEPTED
An Analytical Model-based Capacity Planning Approach for Building CSD-based Storage Systems

The data movement in large-scale computing facilities (from compute nodes to data nodes) is categorized as one of the major contributors to high cost and energy utilization. To tackle it, in-storage processing (ISP) within storage devices, such as Solid-...

research-article
Free
September 2023
JUST ACCEPTED
A Robust and Energy Efficient Hyperdimensional Computing System for Voltage-scaled Circuits

Voltage scaling is one of the most promising approaches for energy efficiency improvement but also brings challenges to fully guaranteeing stable operation in modern VLSI. To tackle such issues, we further extend the DependableHD to the second version ...

research-article
Free
August 2023
JUST ACCEPTED
Real-Time Fixed Priority Scheduling Synthesis using Affine DataFlow Graphs: from Theory to Practice

The major drawback of using static schedules to execute dataflow applications is their high inflexibility. In real-time systems, periodic schedules make it easier to assert safety guarantees and to decrease the schedule size, but their characteristics ...

research-article
Free
August 2023
JUST ACCEPTED
Static Scheduling of Weight Programming for DNN Acceleration with Resource Constrained PIM

Most existing architectural studies on ReRAM-based processing-in-memory (PIM) DNN accelerators assume that all weights of the DNN can be mapped to the crossbar at once. However, these studies are over-idealized. ReRAM crossbar resources for calculation ...

research-article
Free
August 2023
JUST ACCEPTED
ReSG: A Data Structure for Verification of Majority based In-Memory Computing on ReRAM Crossbars

Recent advancements in the fabrication of Resistive Random Access Memory (ReRAM) devices have led to the development of large scale crossbar structures. In-memory computing architectures relying on ReRAM crossbars aim to mitigate the processor-memory ...

research-article
Open Access
Fast Loosely-Timed Deep Neural Network Models with Accurate Memory Contention

The emergence of data-intensive applications, such as Deep Neural Networks (DNN), exacerbates the well-known memory bottleneck in computer systems and demands early attention in the design flow. Electronic System-Level (ESL) design using SystemC ...

research-article
Free
July 2023
JUST ACCEPTED
Hercules: Enabling Atomic Durability for Persistent Memory with Transient Persistence Domain

Persistent memory (pmem) products bring the persistence domain up to the memory level. Intel recently introduced the eADR feature that guarantees to flush data buffered in CPU cache to pmem on a power outage, thereby making the CPU cache a transient ...

research-article
Free
June 2023
JUST ACCEPTED
Analog In-memory Circuit Design of Polynomial Multiplication for Lattice Cipher Acceleration Application

As the core operation of lattice cipher, large-scale polynomial multiplication is the biggest computational bottleneck in its realization process. How to quickly calculate polynomial multiplication under resource constraints has become an urgent problem ...

research-article
Free
June 2023
JUST ACCEPTED
AMP: Total Variation Reduction for Lossless Compression via Approximate Median-based Preconditioning

With the increasing scale of cloud computing applications of next-generation embedded systems, a major challenge that domain scientists are facing is how to efficiently store and analyze the vast volume of output data. Compression can reduce the amount of ...

research-article
Open Access
A Design Flow based on Docker and Kubernetes for ROS-based Robotic Software Applications

Human-centered robotic applications are becoming pervasive in the context of robotics and smart manufacturing and such a pervasiveness is even more expected with the shift to Industry 5.0. The always increasing level of autonomy of modern robotic ...

research-article
Open Access
HDLRuby: A Ruby Extension for Hardware Description and its Translation to Synthesizable Verilog HDL

HDLRuby is a new hardware description language defined as an extension of the Ruby programming language aiming to improve circuit design productivity. HDLRuby allows to model digital circuits at the register transfer level while supporting high-level ...

research-article
A proven translation from a UML state machine subset to timed automata

Although UML state machines constitute a convenient modeling formalism that is widely used in many applications, the lack of formal semantics impedes carrying out some automatic processing such as formal verification for instance. In this paper, we aim to ...

research-article
Open Access
Supervisory Control for Dynamic Feature Configuration in Product Lines

In this paper a framework for engineering supervisory controllers for product lines with dynamic feature configuration is proposed. The variability in valid configurations is described by a feature model. Behavior of system components is achieved using (...

research-article
The Sparse Synchronous Model on Real Hardware

We present the Sparse Synchronous model (SSM) of computation, which allows a programmer to specify software timing more precisely than the traditional “heartbeat” of mainstream operating systems or the synchronous languages. SSM is a mix of semantics ...

research-article
Synchronized Shared Memory and Black-box Procedural Abstraction: Towards a Formal Semantics of Blech

Traditional imperative synchronous programming languages heavily rely on a strict separation between data memory and communication signals. Signals can be shared between computational units but cannot be overwritten within a synchronous reaction cycle. ...

research-article
Code Generation For Neural Networks Based On Fixed-Point Arithmetic

Over the last few years, neural networks have started penetrating safety critical systems to make decisions as for example in robots, rockets and autonomous driving car. Neural networks based on floating-point arithmetic are very time and memory consuming,...

research-article
Open Access
From Lustre to Graphical Models and SCCharts

We introduce a systematic approach for automatically creating a visual diagram, akin to the graphical SCADE model, from a Lustre program. This not only saves tedious manual drawing effort but also enables modeling software to automatically provide the ...

research-article
Early SoCs Information Flow Policies Validation using SystemC-based Virtual Prototypes at the ESL

Virtual Prototypes (VPs) at the Electronic System Level (ESL) are being increasingly adopted by the semiconductor industry and play an important role in modernizing the System-on-Chips (SoCs) design flow to raise design productivity and reduce time-to-...