
Showing 1–4 of 4 results for author: Symons, A

  1. arXiv:2406.09804  [pdf, other]

    cs.AR

    Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms

    Authors: Steven Colleman, Arne Symons, Victor J. B. Jung, Marian Verhelst

    Abstract: The impact of transformer networks is booming, yet, they come with significant computational complexity. It is therefore essential to understand how to optimally map and execute these networks on modern neural processor hardware. So far, literature on transformer scheduling optimization has been focusing on deployment on GPU and specific ASICs. This work enables extensive hardware/mapping explorat…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to ISQED 2024

  2. arXiv:2304.12931  [pdf, other]

    cs.AR cs.AI

    SALSA: Simulated Annealing based Loop-Ordering Scheduler for DNN Accelerators

    Authors: Victor J. B. Jung, Arne Symons, Linyan Mei, Marian Verhelst, Luca Benini

    Abstract: To meet the growing need for computational power for DNNs, multiple specialized hardware architectures have been proposed. Each DNN layer should be mapped onto the hardware with the most efficient schedule, however, SotA schedulers struggle to consistently provide optimum schedules in a reasonable time across all DNN-HW combinations. This paper proposes SALSA, a fast dual-engine scheduler to gen…

    Submitted 14 June, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: 5 pages, 6 figures, open-source at https://github.com/ZigZag-Project/zigzag

  3. arXiv:2212.10612  [pdf, other]

    cs.AR

    Towards Heterogeneous Multi-core Accelerators Exploiting Fine-grained Scheduling of Layer-Fused Deep Neural Networks

    Authors: Arne Symons, Linyan Mei, Steven Colleman, Pouya Houshmand, Sebastian Karl, Marian Verhelst

    Abstract: To keep up with the ever-growing performance demand of neural networks, specialized hardware (HW) accelerators are shifting towards multi-core and chiplet architectures. So far, these multi-accelerator systems exploit the increased parallelism by pipelining different NN layers across input batches on different cores to increase throughput. Yet, when pursuing this with non-batched layer-by-layer sc…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: 9 pages + references, 15 figures

  4. arXiv:2212.05344  [pdf, other]

    cs.AR cs.DC

    DeFiNES: Enabling Fast Exploration of the Depth-first Scheduling Space for DNN Accelerators through Analytical Modeling

    Authors: Linyan Mei, Koen Goetschalckx, Arne Symons, Marian Verhelst

    Abstract: DNN workloads can be scheduled onto DNN accelerators in many different ways: from layer-by-layer scheduling to cross-layer depth-first scheduling (a.k.a. layer fusion, or cascaded execution). This results in a very broad scheduling space, with each schedule leading to varying hardware (HW) costs in terms of energy and latency. To rapidly explore this vast space for a wide variety of hardware archi…

    Submitted 14 June, 2024; v1 submitted 10 December, 2022; originally announced December 2022.

    Comments: Accepted by HPCA 2023