FPGA Implementation of a Pipelined On-Line Backpropagation

Published: 01 June 2005

Abstract

This paper describes a systolic-array implementation of a multilayer perceptron with a hardware-friendly learning algorithm. A pipelined modification of the on-line backpropagation algorithm is presented and explained; it exploits parallelism better because the forward and backward phases can be performed simultaneously. The neural network performance of the proposed modification is discussed and compared with that of the standard on-line backpropagation algorithm on typical benchmark databases and at the various precisions required. Although the preliminary results are positive, further theoretical analysis and experiments with different training sets remain necessary. For this reason, our VLSI systolic architecture, combined with FPGA reconfigurability and a design flow based on generic VHDL, provides a reusable, flexible, and fast method of implementing a complete ANN on a single FPGA, and permits very fast hardware verification in our trials of the pipelined on-line backpropagation algorithm against the standard algorithm.
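The forward/backward overlap the abstract describes can be sketched in software. The following is a minimal, hypothetical Python model, not the authors' VHDL design: in standard on-line backpropagation the weights are updated after every pattern, whereas in the pipelined variant the forward pass of pattern n+1 proceeds before the update from pattern n has landed, so it sees weights that are one update stale, mimicking the simultaneous forward and backward phases of the systolic array.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class MLP:
    """Minimal one-hidden-layer perceptron trained with on-line BP."""

    def __init__(self, n_in, n_hid, n_out, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_hid, n_in))
        self.W2 = rng.normal(0.0, 0.5, (n_out, n_hid))
        self.lr = lr

    def forward(self, x):
        h = sigmoid(self.W1 @ x)
        y = sigmoid(self.W2 @ h)
        return h, y

    def backward(self, x, h, y, t):
        # Sigmoid deltas; returns the weight gradients for W2 and W1.
        d_out = (y - t) * y * (1.0 - y)
        d_hid = (self.W2.T @ d_out) * h * (1.0 - h)
        return np.outer(d_out, h), np.outer(d_hid, x)

    def train_online(self, X, T):
        # Standard on-line BP: update immediately after each pattern.
        for x, t in zip(X, T):
            h, y = self.forward(x)
            g2, g1 = self.backward(x, h, y, t)
            self.W2 -= self.lr * g2
            self.W1 -= self.lr * g1

    def train_pipelined(self, X, T):
        # Pipelined variant: forward and backward of pattern n use the
        # same weights, but the resulting update is applied one pattern
        # late, so the next forward pass runs on one-step-stale weights.
        pending = None
        for x, t in zip(X, T):
            h, y = self.forward(x)
            grads = self.backward(x, h, y, t)
            if pending is not None:
                g2, g1 = pending
                self.W2 -= self.lr * g2
                self.W1 -= self.lr * g1
            pending = grads
        if pending is not None:  # drain the pipeline
            g2, g1 = pending
            self.W2 -= self.lr * g2
            self.W1 -= self.lr * g1
```

The one-pattern update delay is the price of the overlap; the paper's experiments examine whether this delayed gradient degrades convergence in practice.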



Reviews

Vladimir Botchev

This paper is a technical report that can be summarized in one line: there exist powerful field-programmable gate arrays (FPGAs), and there exist backpropagation neural network algorithms; the authors put the two together. Given the paper's length and its many figures, one might expect it to announce a breakthrough or to present a novel way of teaching state-of-the-art techniques in the field. Unfortunately, neither is true. The paper begins with some theoretical re-derivations of backpropagation, which accomplish that feat more obscurely than many similar re-derivations. It then explores two main threads: numerical-precision considerations and the architecture of the so-called systolic implementation. For the first, the results are simply borrowed, as the authors acknowledge, from previous research by others in the field. For the second, no mapping process is clearly shown, so one might suspect the architecture is ad hoc. That would have been a point in the paper's favor had the architecture turned out to be novel. More than a decade ago, S.Y. Kung published a book [1] with a chapter devoted to neural network implementations, almost exclusively systolic ones. In fact, one of his processing-element (PE) designs is a very close match to the PEs in this paper, yet Kung's book is not even cited. In conclusion, this paper has no merit beyond reporting a successful implementation of a somewhat complex algorithm on modern programmable hardware. Online Computing Reviews Service
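The precision considerations the reviewer refers to typically come down to fixed-point weight storage and a hardware-friendly activation function. The sketch below is a generic, hypothetical illustration of both, not taken from the paper: a fixed-point quantizer and a three-segment piecewise-linear sigmoid of the kind commonly used in digital VLSI neural networks.

```python
import numpy as np

def quantize(x, frac_bits=8):
    """Round to a fixed-point grid with frac_bits fractional bits,
    as is commonly done when mapping weights onto FPGA arithmetic."""
    scale = float(1 << frac_bits)
    return np.round(np.asarray(x) * scale) / scale

def pwl_sigmoid(x):
    """Three-segment piecewise-linear sigmoid approximation: saturates
    at 0 and 1, linear with slope 1/4 (the true sigmoid's slope at 0)
    in between. In hardware this needs only shifts and adds."""
    return np.clip(0.25 * np.asarray(x) + 0.5, 0.0, 1.0)
```

For this particular approximation the maximum absolute error against the true sigmoid is about 0.12, at the saturation knees x = ±2; finer segmentations or table look-up trade LUT size for accuracy.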



Information

Published In

Journal of VLSI Signal Processing Systems, Volume 40, Issue 2
June 2005
110 pages

Publisher

Kluwer Academic Publishers

United States


Author Tags

  1. Artificial Neural Networks (ANN)
  2. FPGA implementation
  3. VLSI systolic architecture
  4. backpropagation algorithm

Qualifiers

  • Article


Cited By

  • (2023) "αSechSig and αTanhSig: two novel non-monotonic activation functions," Soft Computing, vol. 27, no. 24, pp. 18451-18467, Dec. 2023. doi:10.1007/s00500-023-09279-2
  • (2021) "Adaptive Computation Reuse for Energy-Efficient Training of Deep Neural Networks," ACM Transactions on Embedded Computing Systems, vol. 20, no. 6, pp. 1-24, Oct. 2021. doi:10.1145/3487025
  • (2021) "LRADNN: High-throughput and energy-efficient Deep Neural Network accelerator using Low Rank Approximation," 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 581-586. doi:10.1109/ASPDAC.2016.7428074
  • (2020) "FPGA-based low-batch training accelerator for modern CNNs featuring high bandwidth memory," Proceedings of the 39th International Conference on Computer-Aided Design, pp. 1-8, Nov. 2020. doi:10.1145/3400302.3415643
  • (2020) "Fast LSTM by dynamic decomposition on cloud and distributed systems," Knowledge and Information Systems, vol. 62, no. 11, pp. 4169-4197, Nov. 2020. doi:10.1007/s10115-020-01487-8
  • (2019) "Addressing Sparsity in Deep Neural Networks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 10, pp. 1858-1871, Sep. 2019. doi:10.1109/TCAD.2018.2864289
  • (2016) "Cambricon-X," The 49th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 1-12, Oct. 2016. doi:10.5555/3195638.3195662
  • (2014) "FPGA implementation of high-speed neural network for power amplifier behavioral modeling," Analog Integrated Circuits and Signal Processing, vol. 79, no. 3, pp. 507-527, Jun. 2014. doi:10.1007/s10470-014-0263-7
  • (2010) "A dynamically configurable coprocessor for convolutional neural networks," ACM SIGARCH Computer Architecture News, vol. 38, no. 3, pp. 247-257, Jun. 2010. doi:10.1145/1816038.1815993
  • (2010) "A dynamically configurable coprocessor for convolutional neural networks," Proceedings of the 37th Annual International Symposium on Computer Architecture, pp. 247-257, Jun. 2010. doi:10.1145/1815961.1815993
