Montgomery’s Multiplication Technique: How to Make It Smaller and Faster

Walter, Colin D.

doi:10.1007/3-540-48059-5_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1717)

Included in the following conference series:

International Workshop on Cryptographic Hardware and Embedded Systems

Abstract

Montgomery’s modular multiplication algorithm has enabled considerable progress to be made in the speeding up of RSA cryptosystems. Perhaps the systolic array implementation stands out most in the history of its success. This article gives a brief history of its implementation in hardware, taking a broad view of the many aspects which need to be considered in chip design. Among these are trade-offs between area and time, higher radix methods, communications both within the circuitry and with the rest of the world, and, as the technology shrinks, testing, fault tolerance, checker functions and error correction. We conclude that a linear, pipelined implementation of the algorithm may be part of best policy in thwarting differential power attacks against RSA.

Author information

Authors and Affiliations

Computation Department, UMIST, PO Box 88, Sackville Street, Manchester, M60 1QD, UK
Colin D. Walter

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Oregon State University, Corvallis, OR, 97330, USA
Çetin K. Koç
Department of Electrical and Computer Engineering, Worcester Polytechnic Institute, Worcester, MA, 01609, USA
Christof Paar

Cite this paper

Walter, C.D. (1999). Montgomery’s Multiplication Technique: How to Make It Smaller and Faster. In: Koç, Ç.K., Paar, C. (eds) Cryptographic Hardware and Embedded Systems. CHES 1999. Lecture Notes in Computer Science, vol 1717. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48059-5_9

DOI: https://doi.org/10.1007/3-540-48059-5_9
Published: 08 February 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66646-2
Online ISBN: 978-3-540-48059-4
eBook Packages: Springer Book Archive
