
Abstract

We present Adrastea, an efficient FPGA design environment for developing scientific machine learning applications. FPGA development is challenging at every stage: deploying the boards, setting up the toolchain, choosing a programming method, interfacing with FPGA kernels, and, most importantly, exploring design space choices to get the best performance and area usage from the kernel design. Adrastea provides an automated and scalable design flow to parameterize, implement, and optimize complex FPGA kernels and their associated interfaces. We show how virtualizing the development environment with virtual machines simplifies toolchain setup and board deployment, and lets the automated design space exploration scale across multiple machines concurrently. Adrastea provides an automated build and test environment for FPGA kernels. By exposing design space hyper-parameters, Adrastea can automatically search the design space in parallel to optimize the FPGA design for a given metric, usually performance or area. Adrastea also eases the task of interfacing with FPGA kernels through a simplified interface API. To demonstrate the capabilities of Adrastea, we implement a complex random forest machine learning kernel with 10,000 input features that achieves extremely low computing latency without loss of prediction accuracy, as required by a scientific edge application at the Spallation Neutron Source (SNS). We also demonstrate Adrastea with an FFT kernel and show that, for both applications, Adrastea systematically and efficiently evaluates different design options, reducing the time and effort required to develop the kernel from months of manual work to days of automatic builds.
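
The parallel search over exposed hyper-parameters described in the abstract can be illustrated with a minimal sketch. This is not Adrastea's API: the parameter names (`unroll_factor`, `pipeline_ii`, `array_partition`), the function names, and the cost model in `mock_build_and_measure` are all hypothetical stand-ins, and a real flow would launch an FPGA build and test run and report measured latency and area instead.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical design space: names and values are illustrative only.
DESIGN_SPACE = {
    "unroll_factor": [1, 2, 4, 8],
    "pipeline_ii": [1, 2],
    "array_partition": [1, 4, 16],
}


def enumerate_designs(space):
    """Yield every combination of hyper-parameter values as a dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def mock_build_and_measure(design):
    """Stand-in for a real build-and-test run; the metrics are made up."""
    parallelism = design["unroll_factor"] * design["array_partition"]
    latency = 1000.0 / parallelism * design["pipeline_ii"]
    area = 50 + 10 * parallelism
    return {"design": design, "latency": latency, "area": area}


def explore(space, metric="latency", area_budget=None, workers=4):
    """Evaluate all designs concurrently and return the best by `metric`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(mock_build_and_measure, enumerate_designs(space)))
    if area_budget is not None:
        # Discard designs that do not fit the (hypothetical) area budget.
        results = [r for r in results if r["area"] <= area_budget]
    return min(results, key=lambda r: r[metric]), results


best, feasible = explore(DESIGN_SPACE, metric="latency", area_budget=400)
print(best)
```

The thread pool here only stands in for dispatching builds to multiple machines. An exhaustive grid is used for simplicity; a larger design space would likely call for sampling or a guided search instead.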

Acknowledgments

This research used resources of the Experimental Computing Laboratory (ExCL) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Author information

Corresponding author

Correspondence to Aaron R. Young.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Young, A.R., Miniskar, N.R., Liu, F., Blokland, W., Vetter, J.S. (2022). Adrastea: An Efficient FPGA Design Environment for Heterogeneous Scientific Computing and Machine Learning. In: Doug, K., Al, G., Pophale, S., Liu, H., Parete-Koon, S. (eds) Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation. SMC 2022. Communications in Computer and Information Science, vol 1690. Springer, Cham. https://doi.org/10.1007/978-3-031-23606-8_14

  • DOI: https://doi.org/10.1007/978-3-031-23606-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23605-1

  • Online ISBN: 978-3-031-23606-8

  • eBook Packages: Computer Science (R0)
