
Dimitrios Soudris

Mitigating software vulnerabilities typically requires source code refactoring to implement the necessary security mechanisms. Although these mechanisms enhance software security, they usually execute a large number of instructions, adding a performance/energy penalty to the overall application. Conversely, developers extensively apply source code transformations to improve the runtime quality of applications in terms of performance and energy efficiency. These transformations may indirectly affect software security, since they may introduce new security issues. In this work, we empirically examine the impact of source code-level energy/performance optimizations on software security and vice versa. The preliminary results of the empirical study suggest that the energy-related transformations may indirectly affect software security, whereas the incremental addition of security mechanisms may lead to an important increase in the energy con...
Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have brought levels beyond human accuracy in many tasks, but at the cost of high computational complexity. To enable efficient execution of DNN inference, more and more research works, therefore, are exploiting the inherent error resilience of DNNs and employing Approximate Computing (AC) principles to address the elevated energy demands of DNN accelerators. This article provides a comprehensive survey and analysis of hardware approximation techniques for DNN accelerators. First, we analyze the state of the art, and by identifying approximation families, we cluster the respective works with respect to the approximation type. Next, we analyze the complexity of the performed evaluations (with respect to the dataset and DNN size) to assess the efficiency, potential, and limitations of approximate DNN accelerators. Moreover, a br...
In future mobile networks, the evolution of optical transport architectures enabling the flexible, scalable interconnection of Baseband Units (BBUs) and Radio Units (RUs) with heterogeneous interfaces is a significant issue. In this paper, we propose a multi-technology hybrid transport architecture that comprises both analog and digital Radio over Fiber (RoF) mobile network segments relying on a dynamically reconfigurable optical switching node. As a step forward, the integration of the discussed network layout into an existing mobile infrastructure is demonstrated, enabling the support of real-world services through both standard digital and Analog–Intermediate-Frequency over Fiber (A-IFoF)-based converged fiber–wireless paths. Emphasis has been placed on the implementation of a real-time A-IFoF transceiver that is realized on a single embedded field-programmable gate array (FPGA)-based platform that serves as an Ethernet to Intermediate Frequency (IF) bridge for the trans...
... B. Exploration for the connection box connectivity. The effect of the connection box connectivity value, Fc, on the FPGA characteristics is explored for the combination of the Disjoint SB with L1 and L1&L2, since they present the lowest energy consumption with small ...
A systematic methodology for synthesizing optimal VLSI residue number system architectures using full adders (FAs) as the basic building block is introduced. The design methodology derives array architectures starting from the algorithmic level. Taking into account the target architecture, the proposed synthesis procedure derives a dependence graph of the algorithm using uniform recurrent equations, specifies the architecture topology, and allocates and schedules the computations within FAs. The derived architectures, called inner product step processors, can be used as the processing element of a regular array architecture. The design methodology derives FA-based implementations that completely eliminate the need for ROM-table look-up. The resulting architectures exhibit lower hardware complexity and much higher throughput rates than ROM-based ones.
The workloads of Convolutional Neural Networks (CNNs) exhibit a streaming nature that makes them attractive for reconfigurable architectures such as Field-Programmable Gate Arrays (FPGAs), while their increased need for low power and speed has established Application-Specific Integrated Circuit (ASIC)-based accelerators as efficient alternative solutions. During the last five years, the development of Hardware Description Language (HDL)-based CNN accelerators, either for FPGA or ASIC, has seen huge academic interest due to their high performance and room for optimization. In this direction, we propose a library-based framework, which extends TensorFlow, the well-established machine learning framework, and automatically generates high-throughput CNN inference engines for FPGAs and ASICs. The framework allows software developers to exploit the benefits of FPGA/ASIC acceleration without requiring any expertise in HDL development and low-level design. Moreover, it provides a s...
Efficient system-level prototyping of power-aware dynamic
ABSTRACT
This chapter describes a complete system for the implementation of digital logic in a fine-grain reconfigurable platform (FPGA). The energy-efficient FPGA architecture is designed and simulated in STM 0.18 μm CMOS technology. The detailed design and circuit characteristics of the Configurable Logic Block and the interconnection network are determined and evaluated in terms of energy, delay and area. A number of circuit-level low-power techniques are employed because power consumption is the primary concern. Additionally, a complete tool framework for the implementation of digital logic circuits in FPGA platforms is introduced. The framework is composed of i) non-modified academic tools, ii) modified academic tools and iii) new tools. The developed tool framework supports a variety of FPGA architectures. Qualitative and quantitative comparisons with existing academic and commercial architectures and tools are provided, yielding promising results. 3.1 Introduction FPGAs have recently ...
ABSTRACT
As discussed in Chap. 1, in nomadic embedded systems an increasing number of applications (e.g., 3D games, video players) coming from the general-purpose domain need to be mapped onto a cheap and compact device. However, embedded systems struggle to execute these complex applications because they come from desktop systems, which have very different restrictions on memory use and, more concretely, are not concerned with the efficient use of dynamic memory. Today, a desktop computer typically includes at least 2–8 GB of RAM, as opposed to the 256–1024 MB range present in low-power and high-end nomadic embedded systems, respectively. Therefore, one of the main steps in porting multimedia applications (initially developed on a PC) onto embedded multimedia systems involves the optimization of the dynamic memory subsystem.
Our aim is the development of a novel probabilistic method to estimate the power consumption of a combinational circuit under a real gate delay model, handling temporal, structural and input pattern dependencies. The chosen gate delay model allows handling both functional and spurious transitions. It is proved that the switching activity evaluation problem under a real gate delay model reduces to the zero-delay switching activity evaluation problem at specific time instances. A modified Boolean function, which includes a time parameter and describes the logic behavior of a signal at any time instance, is introduced. Moreover, a mathematical model based on Markov stochastic processes, which describes the temporal and spatial correlation in terms of the associated zero-delay-based parameters, is presented. Based on the mathematical model and considering the modified Boolean function, a new algorithm to evaluate the switching activity at specific time instances using Ordering Binary Dec...
Management in MPSoC Platforms Arindam Mallik, Stylianos Mamagkakis, Christos Baloukas, Lazaros Papadopoulos, Dimitrios Soudris, Sander Stuijk, Olivera Jovanovic, Florian Schmoll, Daniel Cordes, Robert Pyka, Peter Marwedel, François Capman, Séverin Collet, Nikolaos Mitas, Dimitrios Kritharidis. IMEC vzw, Belgium (arindam, mamagka)@imec.be; Institute of Communication and Computer Systems, Greece (dsoudris@microlab.ntua.gr, cmpalouk@ee.duth.gr); Eindhoven University of Technology, Netherlands (s. tuijk@tue.nl); TU-Dortmund, Germany (florian.schmoll@tu-dortmund.de, olivera.jovanovic@udo.edu); Informatik Centrum Dortmund, Germany (marwedel, pyka, cordes)@icd.de; THALES Communications, France (francois.capman, severin.collet)@fr.thalesgroup.com; INTRACOM Telecom, Greece (nmitas, dkri)@intracom.gr
The rapid evolution of sub-micron process technology now allows more complex systems to be implemented in embedded devices. In the near future, portable consumer devices must run multimedia and wireless network applications that require enormous computational performance (1-40 GOPS) at low energy consumption (0.1-2 W). In these multimedia and wireless network applications, the dynamic memory subsystem is currently
Software-defined radio (SDR) terminals are critical to enable concrete and consecutive inter-working between fourth generation wireless access systems or communication modes. The next generation of SDR terminals is intended to have heavy hardware resource requirements, and switching between them will introduce dynamism with respect to the timing and size of resource requests. This paper presents a system-level framework which combines a
Today, wireless networks are becoming increasingly ubiquitous. Usually, several complex multi-threaded applications are mapped onto a single embedded system, and all of them are triggered by a single wireless stream (which corresponds to the dynamic run-time behavior of the user). It is almost impossible to analyze these systems fully at design time. Therefore, run-time information also has to be used in
ABSTRACT The EU FP7 SWAN-iCare project aims at developing an integrated autonomous device for the monitoring and the personalized management of chronic wounds, mainly diabetic foot ulcers and venous leg ulcers. Most foot and leg ulcers are caused by diabetes and vascular problems, respectively, but a remarkable number of them are also due to the co-morbidity influence of many other diseases (e.g. kidney disease, congestive heart failure, high blood pressure, inflammatory bowel disease). More than 10 million people in Europe suffer from chronic wounds, a number which is expected to grow due to the aging of the population. The core of the project is the fabrication of a conceptually new wearable negative pressure device equipped with Information and Communication Technologies. Such a device will allow users to: (a) accurately monitor many wound parameters via non-invasive integrated micro-sensors, (b) identify infections early and (c) remotely provide an innovative personalized two-line therapy via non-invasive micro-actuators to supplement the negative pressure wound therapy. This paper describes the main components of the SWAN-iCare system and its potential impact in the area of wound management.
This chapter addresses the design of two-level pipelined processor arrays. The parallelism of algorithms is exploited both in word-level and in bit-level operations. Given an algorithm in the form of a Fortran-like nested loop program, a two-step procedure is applied. First, any word-level parallelism is exploited by using loop transformation techniques, which include a uniformization method, if required, and a