- research-article, August 2024
XCS with dynamic sized experience replay for memory constrained applications
GECCO '24 Companion: Proceedings of the Genetic and Evolutionary Computation Conference Companion, Pages 1807–1814, https://doi.org/10.1145/3638530.3664148
The eXtended Classifier System (XCS) is the most widely studied classifier system in the community. It is a class of interpretable AI which has shown strong capability to master various classification and regression tasks. It has also shown strong ...
- poster, July 2024
Hardware Assist for Linux IPC on an FPGA Platform
CF '24: Proceedings of the 21st ACM International Conference on Computing Frontiers, Pages 322–323, https://doi.org/10.1145/3649153.3652998
Specialized hardware units often accelerate compute-intensive or memory-heavy functions. In previous publications, we proposed concepts to assist Linux with a hardware unit for managing waiting threads to improve blocking inter-process communication (IPC)...
- research-article, July 2024
HASIIL: Hardware-Assisted Scheduling to Improve IPC Latency in Linux
CF '24: Proceedings of the 21st ACM International Conference on Computing Frontiers, Pages 80–87, https://doi.org/10.1145/3649153.3649197
Inter-processes communication (IPC) is essential for multi-threaded applications to achieve efficient execution. Synchronization through IPC can become a bottleneck for these applications. The effectiveness of IPC is determined by both its latency and ...
- research-article, October 2023
HW-FUTEX: Hardware-Assisted Futex Syscall
IEEE Transactions on Very Large Scale Integration (VLSI) Systems (ITVL), Volume 32, Issue 1, Pages 16–29, https://doi.org/10.1109/TVLSI.2023.3317926
Efficient thread synchronization primitives are crucial in modern computer systems for the performant execution of interdependent code segments. In Linux, the futex() syscall is used to construct blocking synchronization primitives such as mutexes or ...
- poster, July 2023
LCT-DER: Learning Classifier Table with Dynamic-Sized Experience Replay for Run-time SoC Performance-Power Optimization
GECCO '23 Companion: Proceedings of the Companion Conference on Genetic and Evolutionary Computation, Pages 331–334, https://doi.org/10.1145/3583133.3590573
Learning classifier tables (LCTs) are lightweight, classifier based, hardware implemented reinforcement learning (RL) building blocks which enable self-adaptivity and self-optimization properties in multicore systems. LCTs are deployed per-core to ...
- Article, August 2023
CoLeCTs: Cooperative Learning Classifier Tables for Resource Management in MPSoCs
Abstract: The increasing complexity and unpredictability of emerging applications makes it challenging for multi-processor system-on-chips to satisfy their performance requirements while keeping power consumption within bounds. In order to tackle this ...
- Article, September 2022
GAE-LCT: A Run-Time GA-Based Classifier Evolution Method for Hardware LCT Controlled SoC Performance-Power Optimization
Abstract: Learning classifier tables (LCTs) are classifier based and lightweight hardware reinforcement learning building blocks which inherit the concepts of learning classifier systems. LCTs are used as a per-core low level controllers to learn and ...
- research-article, April 2022
SmartNIC-based Load Management and Network Health Monitoring for Time Sensitive Applications
- Kilian Holzinger,
- Franz Biersack,
- Henning Stubbe,
- Angela Gonzalez Mariño,
- Abdoul Kane,
- Francesc Fons,
- Zhang Haigang,
- Thomas Wild,
- Andreas Herkersdorf,
- Georg Carle
NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, Pages 1–6, https://doi.org/10.1109/NOMS54207.2022.9789863
Time sensitive network applications, for example in Intra-Vehicular Networks, aim to give predictable end-to-end latency guarantees. As a consequence, processing resources of involved host systems remain partially unused, because they are reserved for ...
- poster, December 2021
Precise real-time monitoring of time-critical flows
- Kilian Holzinger,
- Henning Stubbe,
- Franz Biersack,
- Angela Gonzalez Mariño,
- Abdoul Kane,
- Francisco Fons Lluis,
- Zhang Haigang,
- Thomas Wild,
- Andreas Herkersdorf,
- Georg Carle
CoNEXT '21: Proceedings of the 17th International Conference on emerging Networking EXperiments and Technologies, Pages 489–490, https://doi.org/10.1145/3485983.3493356
Ethernet is increasingly used in areas where time-critical and safety-relevant data are transported over the network along with best-effort flows, for example in intra vehicle networks or industrial networks. The resulting complex network architectures, ...
- research-article, November 2021
Protection switching schemes and mapping strategies for fail-operational hard real-time NoCs
Microprocessors & Microsystems (MSYS), Volume 87, Issue C, https://doi.org/10.1016/j.micpro.2021.104385
Abstract: Communication infrastructures designed for mixed-critical MPSoCs must provide isolation of traffic, hard real-time guarantees, and fault-tolerance. In previous work, we proposed the combination of protection-switching with a hybrid ...
- research-article, November 2021
Exploring a Hybrid Voting-based Eviction Policy for Caches and Sparse Directories on Manycore Architectures
Microprocessors & Microsystems (MSYS), Volume 87, Issue C, https://doi.org/10.1016/j.micpro.2021.104384
Abstract: In manycore systems, eviction decisions related to caches and memory coherence greatly impact system performance, thereby emphasizing their importance. Extensive research has produced numerous standalone eviction policies such as LRU, ...
- short-paper, May 2022
PEPERONI: Pre-Estimating the Performance of Near-Memory Integration
MEMSYS '21: Proceedings of the International Symposium on Memory Systems, Article No.: 9, Pages 1–6, https://doi.org/10.1145/3488423.3519329
Near-memory integration strives to tackle the challenge of low data locality and power consumption originating from cross-chip data transfers, meanwhile referred to as locality wall. In order to keep costly engineering efforts bounded when ...
- research-article, August 2021
DynaCo: Dynamic Coherence Management for Tiled Manycore Architectures
International Journal of Parallel Programming (IJPP), Volume 49, Issue 4, Pages 570–599, https://doi.org/10.1007/s10766-020-00688-6
Abstract: Embedded system applications, with their inherently limited parallelism, rarely exploit all available processing resources in large DSM-based manycore architectures. From a cache coherence perspective, this provides an opportunity to move away ...
- research-article, August 2021
DySHARQ: Dynamic Software-Defined Hardware-Managed Queues for Tile-Based Architectures
- research-article, July 2021
SEAMS: Self-Optimizing Runtime Manager for Approximate Memory Hierarchies
ACM Transactions on Embedded Computing Systems (TECS), Volume 20, Issue 5, Article No.: 48, Pages 1–26, https://doi.org/10.1145/3466875
Memory approximation techniques are commonly limited in scope, targeting individual levels of the memory hierarchy. Existing approximation techniques for a full memory hierarchy determine optimal configurations at design-time provided a goal and ...
- research-article, March 2021
X-Centric: A Survey on Compute-, Memory- and Application-Centric Computer Architectures
MEMSYS '20: Proceedings of the International Symposium on Memory Systems, Pages 178–193, https://doi.org/10.1145/3422575.3422792
Big Data and machine learning constitute the multifaceted challenge of computer engineering in the past decade. The meaningful processing of vast amounts of unstructured data from a myriad of sensors and devices is a complicated endeavor already. ...
- research-article, August 2020
Machine Learning Approaches for Efficient Design Space Exploration of Application-Specific NoCs
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 25, Issue 5, Article No.: 44, Pages 1–27, https://doi.org/10.1145/3403584
In many Multi-Processor Systems-on-Chip (MPSoCs), traffic between cores is unbalanced. This motivates the use of an application-specific Network-on-Chip (NoC) that is customized and can provide a high performance at low cost in terms of power and area. ...
- research-article, July 2020
Combinatorial Auctions for Temperature-Constrained Resource Management in Manycores
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 31, Issue 7, Pages 1605–1620, https://doi.org/10.1109/TPDS.2020.2965523
Although manycore processors have plenty of cores, not all of them may run simultaneously at full speed and even some of them might need to be power-gated in order to keep the chip within safe temperature limits. Hence, a resource management technique, ...