Hardware Execution Time Prediction for Neural Network Layers

  • Conference paper
  • First Online:
Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2022)

Abstract

We present an estimation methodology that accurately predicts the execution time of a neural network (NN) under analysis on a given embedded Artificial Intelligence (AI) accelerator. The timing prediction is implemented as a Python library called MONNET, which performs its predictions within milliseconds by analyzing the Keras description of the NN under test. This enables several techniques for designing NNs for embedded hardware. Designers can avoid training networks that might be functionally sufficient but would likely fail the timing requirements. The technique can also be included in automated network architecture search algorithms, so that exact hardware execution times become one contributor to the search’s target function.
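The library's actual interface is not reproduced on this page; the following minimal sketch only illustrates the general idea of walking the Keras description of a network and summing per-layer time estimates. The function names and the per-layer cost coefficients are hypothetical placeholders, not the MONNET API.

```python
# Minimal sketch of a layer-wise execution-time predictor. NOT the MONNET
# API: names and the per-layer cost model are hypothetical placeholders.
import tensorflow as tf

def predict_layer_time_ms(layer: tf.keras.layers.Layer) -> float:
    """Toy per-layer cost model based on the parameter count alone.

    A characterization-based predictor would instead query a model fitted
    to thousands of layer configurations benchmarked on the accelerator.
    """
    return 1e-4 * layer.count_params() + 0.01  # placeholder coefficients

def predict_model_time_ms(model: tf.keras.Model) -> float:
    """Walk the Keras description of the NN and sum per-layer estimates."""
    return sum(predict_layer_time_ms(layer) for layer in model.layers)

if __name__ == "__main__":
    # Small vision-style example network.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(224, 224, 3)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])
    print(f"Predicted execution time: {predict_model_time_ms(model):.2f} ms")
```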

To perform precise estimations for a target hardware platform, each new platform must first undergo an automatic characterization process using tens of thousands of different small NNs. Depending on the hardware, this process can take several days.
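As a rough illustration of such a characterization loop (not the paper's tooling; the network generator, the timing routine, and the CSV schema below are assumptions), one could generate small random Keras networks, time them on the target, and log the results for fitting the predictor. A real setup would deploy each network to the accelerator, for example via OpenVINO, rather than timing CPU inference locally.

```python
# Sketch of an automated characterization loop: generate many small random
# NNs, measure each on the target, and store the results. All names and the
# measurement method are illustrative assumptions.
import csv
import random
import time
import tensorflow as tf

def build_small_cnn(filters: int, kernel: int) -> tf.keras.Model:
    """Build one small CNN variant for characterization."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(64, 64, 3)),
        tf.keras.layers.Conv2D(filters, kernel, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10),
    ])

def measure_ms(model: tf.keras.Model) -> float:
    """Placeholder measurement: one timed CPU inference after a warm-up."""
    x = tf.random.uniform((1, 64, 64, 3))
    model(x)  # warm-up run
    start = time.perf_counter()
    model(x)
    return (time.perf_counter() - start) * 1e3

if __name__ == "__main__":
    rng = random.Random(0)
    with open("characterization.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filters", "kernel", "time_ms"])
        for _ in range(10):  # the paper uses tens of thousands of NNs
            filters = rng.choice([8, 16, 32, 64])
            kernel = rng.choice([1, 3, 5])
            writer.writerow([filters, kernel,
                             measure_ms(build_small_cnn(filters, kernel))])
```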

We tested our methodology on the Intel Neural Compute Stick 2, achieving a root mean square percentage error (RMSPE) below 21% across a large range of industry-relevant NNs from vision processing.
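The exact error definition is not restated on this page; assuming the standard root mean square percentage error, it can be computed as follows (the example values are hypothetical):

```python
# Standard definition of the root mean square percentage error (RMSPE),
# assumed here; errors are taken relative to the measured times.
import numpy as np

def rmspe(measured: np.ndarray, predicted: np.ndarray) -> float:
    """Root mean square percentage error, in percent."""
    rel_err = (predicted - measured) / measured
    return float(np.sqrt(np.mean(np.square(rel_err))) * 100.0)

# Hypothetical example (ms): predictions within a few percent of measurement.
measured = np.array([10.0, 20.0, 5.0])
predicted = np.array([11.0, 19.0, 5.5])
print(f"RMSPE = {rmspe(measured, predicted):.1f}%")  # -> RMSPE = 8.7%
```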

Acknowledgment

This publication was created as part of the research project KI Delta Learning (project number: 19A19013K) funded by the Federal Ministry for Economic Affairs and Energy (BMWi) on the basis of a decision by the German Bundestag.

Author information

Corresponding author

Correspondence to Adrian Osterwind.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Osterwind, A., Droste-Rehling, J., Vemparala, M.R., Helms, D. (2023). Hardware Execution Time Prediction for Neural Network Layers. In: Koprinska, I., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2022. Communications in Computer and Information Science, vol 1752. Springer, Cham. https://doi.org/10.1007/978-3-031-23618-1_39

  • DOI: https://doi.org/10.1007/978-3-031-23618-1_39

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23617-4

  • Online ISBN: 978-3-031-23618-1

  • eBook Packages: Computer Science, Computer Science (R0)
