-
ONNX-to-Hardware Design Flow for Adaptive Neural-Network Inference on FPGAs
Authors:
Federico Manca,
Francesco Ratto,
Francesca Palumbo
Abstract:
The challenges involved in executing neural networks (NNs) at the edge include providing diversity, flexibility, and sustainability. That implies, for instance, supporting evolving applications and algorithms energy-efficiently. Using hardware or software accelerators can deliver fast and efficient computation of the NNs, while flexibility can be exploited to support long-term adaptivity. Nonetheless, handcrafting an NN for a specific device, despite the possibility of leading to an optimal solution, takes time and experience, and that's why frameworks for hardware accelerators are being developed. This work, starting from a preliminary semi-integrated ONNX-to-hardware toolchain [21], focuses on enabling approximate computing leveraging the distinctive ability of the original toolchain to favor adaptivity. The goal is to allow lightweight adaptable NN inference on FPGAs at the edge.
Submitted 13 June, 2024;
originally announced June 2024.
-
CPS Workshop 2023 Proceedings
Authors:
Christian Pilato,
Francesca Palumbo
Abstract:
These proceedings contain the contributions to the CPS workshop 2023 (http://www.cpsschool.eu/cps-workshop/). The CPS Workshop 2023 is an initiative of the CPS Summer School 2023 community to offer participants close contact with leading experts in the field and the opportunity to present and discuss their ideas in a dynamic and friendly setting.
Submitted 22 December, 2023;
originally announced December 2023.
-
Harnessing Attention Mechanisms: Efficient Sequence Reduction using Attention-based Autoencoders
Authors:
Daniel Biermann,
Fabrizio Palumbo,
Morten Goodwin,
Ole-Christoffer Granmo
Abstract:
Many machine learning models use the manipulation of dimensions as a driving force to enable models to identify and learn important features in data. In the case of sequential data, this manipulation usually happens at the token dimension level. Although many tasks require a change in sequence length itself, the step of sequence length reduction usually happens out of necessity and in a single step. As far as we are aware, no model uses the sequence length reduction step as an additional opportunity to tune the model's performance. In fact, sequence length manipulation as a whole seems to be an overlooked direction. In this study, we introduce a novel attention-based method that allows for the direct manipulation of sequence lengths. To explore the method's capabilities, we employ it in an autoencoder model. The autoencoder reduces the input sequence to a smaller sequence in latent space. It then aims to reproduce the original sequence from this reduced form. In this setting, we explore the method's reduction performance for different input and latent sequence lengths. We are able to show that the autoencoder retains all the significant information when reducing the original sequence to half its original size. When reducing down to as low as a quarter of its original size, the autoencoder is still able to reproduce the original sequence with an accuracy of around 90%.
Submitted 23 October, 2023;
originally announced October 2023.
-
A multithread AES accelerator for Cyber-Physical Systems
Authors:
Francesco Ratto,
Luigi Raffo,
Francesca Palumbo
Abstract:
Computing elements of CPSs must be flexible to ensure interoperability, and adaptive to cope with the evolving internal and external state, such as battery level and critical tasks. Cryptography is a common task needed in CPSs to guarantee private communication among different devices. In this work, we propose a reconfigurable FPGA accelerator for AES workloads with different key lengths. The accelerator architecture exploits tagged-dataflow models to support the concurrent execution of multiple threads on the same accelerator. This solution proves to be more resource- and energy-efficient than a set of non-reconfigurable accelerators while keeping high performance and flexibility of execution.
Submitted 19 June, 2023;
originally announced June 2023.
-
Reconfigurable and approximate computing for video coding
Authors:
Francesca Palumbo,
Carlo Sau
Abstract:
The Chapter begins with a discussion of the constraints and needs of video coding systems. The lack of flexibility of traditional monolithic codec specifications, which are not suitable to model commonalities among codecs and foster reusability among successive codec generations/updates, was the main trigger for the development of a new standard initiative within the ISO/IEC MPEG committee, called reconfigurable video coding (RVC). The MPEG-RVC framework exploits the dataflow nature behind video coding to foster flexible and reconfigurable codec design, as well as to support dynamic reconfiguration. The Chapter goes on to consider that the inherent resiliency of various functional blocks (like motion estimation in high-efficiency video coding, HEVC) and the varying levels of user perception make video coding suitable for approximate computing techniques. Approximate computing, if properly supported at design time, allows achieving run-time trade-offs, representing a new direction in hardware-software codesign research. The main assumption behind approximate computing, exploited within video coding, is that the degree of accuracy (in this case during codec execution) is not required to be the same all the time. The final part of the Chapter attempts to put together the concepts addressed and remarks on what are, in the authors' opinion, some interesting research directions.
Submitted 5 March, 2021;
originally announced March 2021.
-
The Multi-Dataflow Composer Tool: an open-source tool suite for Optimized Coarse-Grain Reconfigurable Hardware Accelerators and Platform Design
Authors:
Carlo Sau,
Tiziana Fanni,
Claudio Rubattu,
Luigi Raffo,
Francesca Palumbo
Abstract:
Modern embedded and cyber-physical systems require ever more performance, power efficiency and flexibility, to execute several profiles and functionalities targeting the ever-growing adaptivity needs while preserving execution efficiency. Such requirements pushed designers towards the adoption of heterogeneous and reconfigurable substrates, whose development and management are not that straightforward. Although acceleration and flexibility are desirable in many domains, the barrier of hardware deployment and operation is still there, since specific advanced expertise and skills are needed. Related challenges are effectively tackled by leveraging automation strategies that in some cases, as in the proposed work, exploit model-based approaches. This paper is focused on the Multi-Dataflow Composer (MDC) tool, which intends to solve issues related to design, optimization and operation of coarse-grain reconfigurable hardware accelerators and their easy adoption in modern heterogeneous substrates. MDC's latest features and improvements are introduced in detail and have been assessed on the so-far unexplored robotics application field. A multi-profile trajectory generator for a robotic arm is implemented over a Xilinx FPGA board to show in which cases coarse-grain reconfiguration can be applied and which parameters and trade-offs MDC allows users to play with.
Submitted 5 March, 2021;
originally announced March 2021.
-
Run-time Performance Monitoring of Heterogenous Hw/Sw Platforms Using PAPI
Authors:
Tiziana Fanni,
Daniel Madronal,
Claudio Rubattu,
Carlo Sau,
Francesca Palumbo,
Eduardo Juarez,
Maxime Pelcat,
Cesar Sanz,
Luigi Raffo
Abstract:
In the era of Cyber Physical Systems, designers need to offer support for run-time adaptivity considering different constraints, including the internal status of the system. This work presents a run-time monitoring approach, based on the Performance Application Programming Interface (PAPI), that offers a unified interface to transparently access both the standard Performance Monitoring Counters (PMCs) in the CPUs and the custom ones integrated into hardware accelerators. Automatic tools offer software programmers support for designing and implementing Coarse-Grain Virtual Reconfigurable Circuits, instrumented with custom PMCs. This approach has been validated on a heterogeneous application for image/video processing with an overhead of 6% of the execution time.
Submitted 1 March, 2021;
originally announced March 2021.
-
Wireless communication, identification and sensing technologies enabling integrated logistics: a study in the harbor environment
Authors:
Mario G. C. A. Cimino,
Nedo Celandroni,
Erina Ferro,
Davide La Rosa,
Filippo Palumbo,
Gigliola Vaglini
Abstract:
In the last decade, integrated logistics has become an important challenge in the development of wireless communication, identification and sensing technology, due to the growing complexity of logistics processes and the increasing demand for adapting systems to new requirements. The advancement of wireless technology provides a wide range of options for maritime container terminals. Electronic devices employed in container terminals reduce manual effort, facilitating timely information flow and enhancing control, quality of service and decision making. In this paper, we examine the technology that can be used to support integration in harbor logistics. In the literature, most systems have been developed to address specific needs of particular harbors, but a systematic study is missing. The purpose is to give the reader an overview of which integrated logistics technologies can be implemented and what remains to be addressed in the future.
Submitted 21 October, 2015;
originally announced October 2015.