- research-article, February 2024
Extreme-scale Direct Numerical Simulation of Incompressible Turbulence on the Heterogeneous Many-core System
PPoPP '24: Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, March 2024, Pages 120–132, https://doi.org/10.1145/3627535.3638479
Direct numerical simulation (DNS) is a technique that directly solves the fluid Navier-Stokes equations with high spatial and temporal resolutions, which has driven much research regarding the nature of turbulence. For high-Reynolds number (Re) ...
- research-article, December 2022
Spatz: A Compact Vector Processing Unit for High-Performance and Energy-Efficient Shared-L1 Clusters
ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, October 2022, Article No.: 22, Pages 1–9, https://doi.org/10.1145/3508352.3549367
While parallel architectures based on clusters of Processing Elements (PEs) sharing L1 memory are widespread, there is no consensus on how lean their PE should be. Architecting PEs as vector processors holds the promise to greatly reduce their ...
- research-article, April 2022
Queuing Ports for Mesh Based Many-Core Processors
ACM SIGAda Ada Letters (SIGADA), Volume 41, Issue 2, December 2021, Pages 66–70, https://doi.org/10.1145/3530801.3530804
This paper presents the implementation of Queuing Ports, a blocking communication protocol developed for manycore architectures that perform a synchronized communication between cores without the need of polling. This implementation has been performed ...
- research-article, May 2022
Design of many-core big little μbrains for energy-efficient embedded neuromorphic computing
DATE '22: Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, March 2022, Pages 1011–1016
As spiking-based deep learning inference applications are increasing in embedded systems, these systems tend to integrate neuromorphic accelerators such as μBrain to improve energy efficiency. We propose a μBrain-based scalable many-core neuromorphic ...
- research-article, May 2022
MemPool-3D: boosting performance and efficiency of shared-l1 memory many-core clusters with 3D integration
Matheus Cavalcante, Anthony Agnesina, Samuel Riedel, Moritz Brunion, Alberto García-Ortiz, Dragomir Milojevic, Francky Catthoor, Sung Kyu Lim, Luca Benini
DATE '22: Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, March 2022, Pages 394–399
Three-dimensional integrated circuits promise power, performance, and footprint gains compared to their 2D counterparts, thanks to drastic reductions in the interconnects' length through their smaller form factor. We can leverage the potential of 3D ...
- research-article, January 2022
Federated scheduling in clustered many-core processors
DS-RT '21: Proceedings of the 2021 IEEE/ACM 25th International Symposium on Distributed Simulation and Real Time Applications, September 2021, Article No.: 12, Pages 1–8, https://doi.org/10.1109/DS-RT52167.2021.9576150
High-performance embedded systems, such as self-driving systems require platforms that reduce power consumption and perform high-performance processing. As satisfying both requirements, multi-/many-core processors are attracting attention. This paper ...
- research-article, February 2022
Research on Full-Chip Programming for Sunway Heterogeneous Many-core Processor
WSSE '21: Proceedings of the 3rd World Symposium on Software Engineering, September 2021, Pages 174–179, https://doi.org/10.1145/3488838.3488868
Programming on many-core processors is a challenging task. It's a difficult topic to program and compile on heterogeneous many-core architectures in high-performance computing area. The bottom-level programming support on Sunway many-core processors is ...
- research-article, June 2020
Embedded social insect-inspired intelligence networks for system-level runtime management
DATE '20: Proceedings of the 23rd Conference on Design, Automation and Test in Europe, March 2020, Pages 1550–1555
Large-scale distributed computing architectures such as, e.g. systems on chip or many-core devices, offer advantages over monolithic or centralised single-core systems in terms of speed, power/thermal performance and fault tolerance. However, these are ...
- research-article, June 2020
Unified thread- and data-mapping for multi-threaded multi-phase applications on SPM many-cores
DATE '20: Proceedings of the 23rd Conference on Design, Automation and Test in Europe, March 2020, Pages 1496–1501
Scratchpad Memories (SPMs) are more scalable than caches as they offer better performance with lower power and area overheads. This scalability advocates their suitability as on-chip memory in many-cores. However, SPM many-cores delegate the ...
- research-article, October 2019
Distributed SDN architecture for NoC-based many-core SoCs
NOCS '19: Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip, October 2019, Article No.: 8, Pages 1–8, https://doi.org/10.1145/3313231.3352361
In the Software-Defined Networking (SDN) paradigm, routers are generic and programmable forwarding units that transmit packets according to a given policy defined by a software controller. Recent research has shown the potential of such a communication ...
- research-article, April 2020
Multi-rate DAG scheduling considering communication contention for NoC-based embedded many-core processor
DS-RT '19: Proceedings of the 23rd IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, October 2019, Pages 283–292
Computing platforms for embedded systems are increasingly being transformed into multi/many-core platforms because embedded systems have become extensive, complex, and automated. In the case of an autonomous driving system, various applications are ...
- research-article, August 2019
Fine-grain temperature monitoring for many-core systems
SBCCI '19: Proceedings of the 32nd Symposium on Integrated Circuits and Systems Design, August 2019, Article No.: 4, Pages 1–6, https://doi.org/10.1145/3338852.3339841
The power density may limit the amount of energy a many-core system can consume. A many-core at its maximum performance may lead to safe temperature violations and, consequently, result in reliability issues. Dynamic Thermal Management (DTM) techniques ...
- extended-abstract, June 2019
Scalable Reservoir Sampling on Many-Core CPUs
SIGMOD '19: Proceedings of the 2019 International Conference on Management of Data, June 2019, Pages 1817–1819, https://doi.org/10.1145/3299869.3300096
Database systems need to be able to convert queries to efficient execution plans. As recent research has shown, correctly estimating cardinalities of subqueries is an important factor in the efficiency of the resulting plans [7, 8]. Many algorithms have ...
- research-article, June 2019
Self-Adaptive QoS Management of Computation and Communication Resources in Many-Core SoCs
ACM Transactions on Embedded Computing Systems (TECS), Volume 18, Issue 4, Article No.: 37, Pages 1–21, https://doi.org/10.1145/3328755
Providing quality of service (QoS) for many-core systems with dynamic application admission is challenging due to the high amount of resources to manage and the unpredictability of computation and communication events. Related works propose a self-...
- research-article, May 2019
Transitioning Spiking Neural Network Simulators to Heterogeneous Hardware
SIGSIM-PADS '19: Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, May 2019, Pages 115–126, https://doi.org/10.1145/3316480.3322893
Spiking neural networks (SNN) are among the most computationally intensive types of simulation models, with node counts on the order of up to 10^11. Currently, there is intensive research into hardware platforms suitable to support large-scale SNN ...
- research-article, April 2019
GEM5-X: a GEM5-based system level simulation framework to optimize many-core platforms
HPC '19: Proceedings of the High Performance Computing Symposium, April 2019, Article No.: 8, Pages 1–12
The rapid expansion of online-based services requires novel energy and performance efficient architectures to meet power and latency constraints. Fast architectural exploration has become a key enabler in the proposal of architectural innovation. In ...
- article, March 2019
Exploiting multi–core and many–core parallelism for subspace clustering
International Journal of Applied Mathematics and Computer Science (IJAMCS), Volume 29, Issue 1, Mar 2019, Pages 81–91, https://doi.org/10.2478/amcs-2019-0006
Finding clusters in high dimensional data is a challenging research problem. Subspace clustering algorithms aim to find clusters in all possible subspaces of the dataset, where a subspace is a subset of dimensions of the data. But the exponential ...
- research-article, February 2019
Process Barrier for Predictable and Repeatable Concurrent Execution
PMAM'19: Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores, February 2019, Pages 71–80, https://doi.org/10.1145/3303084.3309494
We study on how to design, debug and verify and validate (V&V) safety-critical control software running on shared-memory many-core platforms. Managing concurrency in a verifiable way is a certification requirement. The presented process barrier is a ...
- research-article, November 2018
A Design-Time/Run-Time Application Mapping Methodology for Predictable Execution Time in MPSoCs
ACM Transactions on Embedded Computing Systems (TECS), Volume 17, Issue 5, Article No.: 89, Pages 1–25, https://doi.org/10.1145/3274665
Executing multiple applications on a single MPSoC brings the major challenge of satisfying multiple quality requirements regarding real-time, energy, and so on. Hybrid application mapping denotes the combination of design-time analysis with run-time ...
- research-article, August 2018
Vectorized Parallel Sparse Matrix-Vector Multiplication in PETSc Using AVX-512
ICPP '18: Proceedings of the 47th International Conference on Parallel Processing, August 2018, Article No.: 55, Pages 1–10, https://doi.org/10.1145/3225058.3225100
Emerging many-core CPU architectures with high degrees of single-instruction, multiple data (SIMD) parallelism promise to enable increasingly ambitious simulations based on partial differential equations (PDEs) via extreme-scale computing. However, such ...