Improved parallel matrix multiplication using Strassen and Urdhvatiryagbhyam method

Bessant, Y. R. Annie; Jency, J. Grace; Sagayam, K. Martin; Jone, A. Amir Anton; Pandey, Digvijay; Pandey, Binay Kumar

doi:10.1007/s42514-023-00149-9

Regular Paper
Published: 24 May 2023

Volume 5, pages 102–115, (2023)

CCF Transactions on High Performance Computing

Y. R. Annie Bessant¹,
J. Grace Jency²,
K. Martin Sagayam³,
A. Amir Anton Jone³,
Digvijay Pandey ORCID: orcid.org/0000-0003-0353-174X⁴ &
…
Binay Kumar Pandey⁵

Abstract

The rapid growth of wireless communication, multimedia applications, robotics and graphics calls for digital signal processing (DSP) systems that use resources efficiently while delivering high throughput at low power. Across DSP systems ranging from audio/video signal processing to wireless sensor networks, floating-point matrix multiplication is used widely in most of the fundamental processing units. A hardware implementation of floating-point matrix multiplication demands a very large number of arithmetic operations, which affects speed and consumes more area and power. DSP systems essentially use two techniques to reduce dynamic power consumption, pipelining and parallel processing, both of which need high-performance processing elements with small area and low power across diverse scientific computing applications. However, the number of adders and multipliers used in the design of the floating-point unit increases as a consequence. Adders and multipliers are the data path elements that consume the most area, delay and power in the processing unit. Delay, power and area are reduced at the arithmetic level by selecting appropriate adders and multipliers. This article proposes a parallel multiplication architecture using the Strassen method and the Urdhvatiryagbhyam multiplier. It involves the design of an efficient parallel matrix multiplier with a flexible implementation on a field-programmable gate array (FPGA) device, used to analyse computation and area. The design incorporates scheduling of blocks, operations on processing elements, block size determination, parallelization and double buffering for storage of matrix elements.
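The abstract names the two techniques without showing them, so the following is a minimal software sketch of each, not the authors' FPGA architecture. It assumes unsigned integer operands rather than floating-point ones, and the names `urdhva_multiply`, `signed_mul` and `strassen_2x2` are hypothetical, chosen only for illustration. The first function forms each product column from the cross products whose bit positions sum to that column (the "vertically and crosswise" rule), and the second uses Strassen's seven products in place of the eight needed by the schoolbook 2×2 method.

```python
def urdhva_multiply(a: int, b: int, n_bits: int = 8) -> int:
    """Multiply two unsigned n_bits-wide integers column by column."""
    if not (0 <= a < (1 << n_bits) and 0 <= b < (1 << n_bits)):
        raise ValueError("operands must fit in n_bits unsigned bits")
    a_bits = [(a >> i) & 1 for i in range(n_bits)]
    b_bits = [(b >> i) & 1 for i in range(n_bits)]
    result, carry = 0, 0
    for k in range(2 * n_bits - 1):
        # Column k collects every cross product a_i * b_j with i + j == k.
        column = carry
        for i in range(max(0, k - n_bits + 1), min(k, n_bits - 1) + 1):
            column += a_bits[i] & b_bits[k - i]
        result |= (column & 1) << k   # low bit of the column is the result bit
        carry = column >> 1           # the rest carries into the next column
    return result | (carry << (2 * n_bits - 1))


def signed_mul(x: int, y: int) -> int:
    """Sign-magnitude wrapper (an assumption) so Strassen's differences work."""
    sign = -1 if (x < 0) != (y < 0) else 1
    return sign * urdhva_multiply(abs(x), abs(y), n_bits=16)


def strassen_2x2(A, B, mul=signed_mul):
    """Strassen's 2x2 product: seven multiplications instead of eight."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = mul(a11 + a22, b11 + b22)
    m2 = mul(a21 + a22, b11)
    m3 = mul(a11, b12 - b22)
    m4 = mul(a22, b21 - b11)
    m5 = mul(a11 + a12, b22)
    m6 = mul(a21 - a11, b11 + b12)
    m7 = mul(a12 - a22, b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]


if __name__ == "__main__":
    print(urdhva_multiply(13, 11))
    print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

In hardware, the cross products within one column are independent and can be generated in parallel, which is one reason this multiplier is often considered for low-delay designs. Applying the same seven-product scheme to matrix blocks rather than scalars gives the recursive form of Strassen's method.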

Data availability

The data supporting the findings of this study are available within the paper.

Acknowledgements

This research was funded by the research general direction at Universidad Santiago de Cali, Colombia, under call no. 01-2022. It was carried out in collaboration with authors at the following institutions: St. Xavier’s Catholic College of Engineering, Tamilnadu, India; GEMS Educational Institutions, SBTE; Karunya Institute of Technology and Sciences, Coimbatore, India; and Al-Nahrain University, Al-Nahrain Nonrenewable Energy Research Center, Baghdad, Iraq.

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, St. Xavier’s Catholic College of Engineering, Nagercoil, Tamilnadu, India
Y. R. Annie Bessant
GEMS Educational Institutions, SBTE, Patna, India
J. Grace Jency
Department of ECE, Karunya Institute of Technology and Sciences, Coimbatore, India
K. Martin Sagayam & A. Amir Anton Jone
Department of Technical Education, IET, Dr. A.P.J. Abdul Kalam Technical University, Govt. of U.P, Lucknow, Uttar Pradesh, India
Digvijay Pandey
Department of Information Technology, College of Technology, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India
Binay Kumar Pandey

Corresponding author

Correspondence to Digvijay Pandey.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bessant, Y.R.A., Jency, J.G., Sagayam, K.M. et al. Improved parallel matrix multiplication using Strassen and Urdhvatiryagbhyam method. CCF Trans. HPC 5, 102–115 (2023). https://doi.org/10.1007/s42514-023-00149-9

Received: 07 June 2022
Accepted: 24 April 2023
Published: 24 May 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s42514-023-00149-9
