Fast Multiplication Algorithms
College of Engineering
An autonomous institution affiliated to VTU
Department of Electronics & Communication
VII Semester
ARM Processor Assignments
Date of Submission:
Marks Awarded:
To multiply two complex numbers (a + bi) and (c + di), compute
k1 = c · (a + b)
k2 = a · (d − c)
k3 = b · (c + d)
Real part = k1 − k3
Imaginary part = k1 + k2.
This algorithm uses only three multiplications, rather than four, at the cost of
five additions or subtractions rather than two. If a multiply is more expensive
than three adds or subtracts, as when calculating by hand, then there is a gain
in speed. On modern computers a multiply and an add can take about the same
time, so there may be no speed gain. There is also a trade-off: some precision
may be lost when using floating point.
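The scheme above can be sketched in a few lines (an illustration added here, not part of the original text; the function name is arbitrary):

```python
def complex_multiply(a, b, c, d):
    # Gauss's three-multiplication scheme for (a + bi) * (c + di).
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    # Real part = k1 - k3 = ac - bd; imaginary part = k1 + k2 = ad + bc.
    return (k1 - k3, k1 + k2)
```

For example, complex_multiply(3, 4, 5, 6) gives (-9, 38), matching (3 + 4i)(5 + 6i).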
For fast Fourier transforms the complex multiplies involve constant 'twiddle'
factors, so two of the additions can be precomputed. Only three multiplies and
three adds are then required, and modern hardware can often overlap multiplies
and adds.
Karatsuba multiplication
For systems that need to multiply numbers in the range of several thousand
digits, such as computer algebra systems and bignum libraries, long
multiplication is too slow. These systems may employ Karatsuba
multiplication, which was discovered in 1960 (published in 1962). The heart of
Karatsuba's method lies in the observation that two-digit multiplication can be
done with only three rather than the four multiplications classically required.
Suppose we want to multiply two 2-digit numbers, x1x2 and y1y2, i.e. x = x1·10 + x2
and y = y1·10 + y2. The classical method uses the four products x1y1, x1y2,
x2y1 and x2y2, but since
x1y2 + x2y1 = (x1 + x2)(y1 + y2) − x1y1 − x2y2,
the three products x1y1, x2y2 and (x1 + x2)(y1 + y2) suffice. Bigger numbers
can likewise be split into two halves x1 and x2, and the method works
analogously. To compute these three products of m-digit numbers, we can employ
the same trick again, effectively using recursion. Once the three products are
computed, the final recombination step takes about n additions.
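A minimal recursive sketch of the method, assuming decimal splitting and Python's arbitrary-precision integers (the helper name is our own):

```python
def karatsuba(x, y):
    # Multiply non-negative integers with three recursive products
    # instead of four: split x = a*B + b and y = c*B + d with B = 10^m,
    # then x*y = ac*B^2 + ((a+b)(c+d) - ac - bd)*B + bd.
    if x < 10 or y < 10:
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    B = 10 ** m
    a, b = divmod(x, B)
    c, d = divmod(y, B)
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    mid = karatsuba(a + b, c + d) - ac - bd
    return ac * B * B + mid * B + bd
```

For instance, karatsuba(1234, 5678) returns 7006652, the same as 1234 * 5678.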
Later the Karatsuba method came to be called 'divide and conquer'; other names
used for this approach at present are 'binary splitting' and 'dichotomy
principle'.
The appearance of the 'divide and conquer' method was the starting point of the
theory of fast multiplication. A number of authors (among them Toom, Cook and
Schönhage) continued to look for a multiplication algorithm with complexity
close to optimal, and 1971 saw the construction of the Schönhage–Strassen
algorithm, which held the best known (until 2007) upper bound for M(n).
The Karatsuba ‘divide and conquer’ is the most fundamental and general fast
method. Hundreds of different algorithms are constructed on its basis. Among
these algorithms the most well known are the algorithms based on Fast Fourier
Transform (FFT) and Fast Matrix Multiplication.
Toom–Cook
Toom–Cook generalizes Karatsuba by splitting each number into k parts; Toom-3,
for example, replaces nine multiplications with five. Although using more and
more parts can reduce the time spent on recursive multiplications further, the
overhead from additions and digit management also grows. For this reason, the
method of Fourier transforms is typically faster for numbers with several
thousand digits, and asymptotically faster for even larger numbers.
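As an illustrative sketch (not from the original text), here is Toom-3, which splits each operand into three parts and uses five pointwise products where the schoolbook method needs nine. The interpolation steps follow Bodrato's well-known sequence; for brevity the five small products use plain `*` rather than recursing:

```python
def toom3(x, y):
    # Toom-3: view each operand as a degree-2 polynomial in B = 10^m,
    # evaluate at the points 0, 1, -1, -2 and infinity, multiply
    # pointwise, then interpolate to recover the product polynomial.
    if x < 1000 or y < 1000:
        return x * y
    m = (max(len(str(x)), len(str(y))) + 2) // 3
    B = 10 ** m
    x0, x1, x2 = x % B, (x // B) % B, x // B**2
    y0, y1, y2 = y % B, (y // B) % B, y // B**2
    # Evaluate both polynomials at the five points.
    px = [x0, x0 + x1 + x2, x0 - x1 + x2, x0 - 2*x1 + 4*x2, x2]
    py = [y0, y0 + y1 + y2, y0 - y1 + y2, y0 - 2*y1 + 4*y2, y2]
    # Five products of third-size numbers (recursing here gives the
    # full algorithm; plain * keeps the sketch short).
    r = [a * b for a, b in zip(px, py)]
    # Bodrato's interpolation sequence; every division is exact.
    r0, r4 = r[0], r[4]
    r3 = (r[3] - r[1]) // 3
    r1 = (r[1] - r[2]) // 2
    r2 = r[2] - r[0]
    r3 = (r2 - r3) // 2 + 2 * r4
    r2 = r2 + r1 - r4
    r1 = r1 - r3
    return r0 + r1*B + r2*B**2 + r3*B**3 + r4*B**4
```

The divisions by 2 and 3 are exact by construction, which is why integer division is safe here.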
The idea, due to Strassen (1968), is the following: we choose the largest
integer w that will not cause overflow during the process outlined below, then
split the two numbers into m groups of w bits each. These groups can be
regarded as the coefficients of two polynomials; the polynomials are multiplied
by evaluating them with a fast Fourier transform, multiplying pointwise, and
interpolating back, after which carries are propagated to give the product.
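A toy version of this scheme (an added sketch, with all names our own), assuming a simple recursive floating-point FFT and a small group width w = 8 so that rounding error stays negligible:

```python
import cmath

def fft(a, invert=False):
    # Recursive Cooley-Tukey FFT; len(a) must be a power of two.
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def fft_multiply(x, y, w=8):
    # Split each number into groups of w bits (coefficients in base
    # 2**w), convolve the coefficient lists via FFT, then propagate
    # carries to recover the integer product.
    base = 1 << w
    xs, ys = [], []
    while x:
        xs.append(x % base); x //= base
    while y:
        ys.append(y % base); y //= base
    n = 1
    while n < len(xs) + len(ys):
        n <<= 1
    fa = fft([complex(c) for c in xs] + [0j] * (n - len(xs)))
    fb = fft([complex(c) for c in ys] + [0j] * (n - len(ys)))
    prod = fft([a * b for a, b in zip(fa, fb)], invert=True)
    result, carry = 0, 0
    for i, c in enumerate(prod):
        v = int(round(c.real / n)) + carry      # /n normalizes the inverse FFT
        carry, digit = divmod(v, base)
        result += digit << (w * i)
    return result
```

In a production implementation the transform is taken over a ring chosen to make all arithmetic exact, rather than over floating-point complex numbers.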
For many years, the fastest known method for truly massive numbers based on
this idea was described in 1971 by Schönhage and Strassen (Schönhage–
Strassen algorithm) and has a time complexity of Θ(n log(n) log(log(n))). In
2007 this was improved by Martin Fürer (Fürer's algorithm) to give a time
complexity of n·log(n)·2^Θ(log*(n)) using Fourier transforms over complex numbers.
Anindya De, Chandan Saha, Piyush Kurur and Ramprasad Saptharishi [6] gave a
similar algorithm using modular arithmetic in 2008 achieving the same running
time. It is important to note that these are purely theoretical results as the time
complexities are in the multitape Turing machine model which does not, for
example, allow random access to arbitrary memory locations in constant time.
As a result, they are not necessarily of great practical import.
Quarter square multiplication
Quarter square multipliers were first used to form an analog signal that was the
product of two analog input signals in analog computers. In this application, the
sum and difference of the two input voltages are formed using operational
amplifiers. The square of each of these is approximated using piecewise-linear
circuits. Finally, the difference of the two squares is formed and scaled by a
factor of one fourth using yet another operational amplifier.
Digitally, the method rests on the identity xy = ⌊(x + y)²/4⌋ − ⌊(x − y)²/4⌋,
which holds exactly because x + y and x − y always have the same parity. Below
is a lookup table of ⌊n²/4⌋ for applying Johnson's method to the digits 0
through 9.
n       0  1  2  3  4  5  6  7   8   9  10  11  12  13  14  15  16  17  18
⌊n²/4⌋  0  0  1  2  4  6  9  12  16  20  25  30  36  42  49  56  64  72  81
If, for example, you wanted to multiply 9 by 3, you observe that the sum and
difference are 12 and 6 respectively. Looking both those values up on the table
yields 36 and 9, the difference of which is 27, which is the product of 9 and 3.
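The digit-by-digit lookup just described can be sketched as follows (an added illustration; the names are arbitrary):

```python
# Lookup table of floor(n^2 / 4) for n = 0..18, covering sums and
# differences of two decimal digits, as in the table above.
QUARTER_SQUARES = [n * n // 4 for n in range(19)]

def quarter_square_multiply(x, y):
    # xy = floor((x+y)^2/4) - floor((x-y)^2/4); the floors cancel
    # exactly because x+y and x-y always have the same parity.
    return QUARTER_SQUARES[x + y] - QUARTER_SQUARES[abs(x - y)]
```

With x = 9 and y = 3 this looks up 36 and 9 and returns 27, reproducing the worked example.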
Booth Multiplication Algorithm
I. Of the two numbers, pick as the multiplier the one with the fewest
changes between consecutive bits, since each change costs an add or a
subtract.
i.e., 0010: from 0 to 0 no change, 0 to 1 one change, 1 to 0 another
change, so there are two changes in this one.
II. Let X = 1100 (multiplier) and Y = 0010 (multiplicand).
Take the 2's complement of Y and call it -Y:
-Y = 1110
III. Load 0 into X-1; it holds the previous least significant bit of X.
IV. Make four rows, one for each cycle; this is because we are multiplying
four-bit numbers.
Load the values:
           U    V    X    X-1
1st cycle  0000 0000 1100 0
2nd cycle
3rd cycle
Step 2: Booth Algorithm
I. Examine the pair formed by the least significant bit of X and X-1:
00: shift only
11: shift only
01: add Y to U, then shift
10: subtract Y from U (i.e., add -Y to U), then shift
II. Shift U and V together using an arithmetic right shift, which
preserves the sign bit of a 2's-complement number. Thus a positive number
remains positive, and a negative number remains negative.
III. Shift X with a circular right shift, because this prevents us from
using two registers for the X value; X-1 receives the bit shifted out of X.
U V X X-1
0000 0000 1100 0
0000 0000 0110 0
Repeat the same steps until the four cycles are completed.
U V X X-1
0000 0000 1100 0
0000 0000 0110 0
0000 0000 0011 0
U    V    X    X-1
0000 0000 1100 0
0000 0000 0110 0
0000 0000 0011 0
1110 0000 0011 0
1111 0000 1001 1
1111 1000 1100 1
We have finished four cycles (the last bit pair, 11, calls for a shift only),
so the answer is shown in the last row of U and V,
which is: 11111000 two, i.e. -8, as expected for X × Y = (-4) × 2.
Note: By the fourth cycle, the two algorithms have the same values in the
Product register.
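The whole procedure can be simulated in software as a sketch (added here for illustration; the register names follow the worked example, and the code is not from the assignment):

```python
def booth_multiply(x, y, bits=4):
    # Booth's algorithm on `bits`-bit two's-complement operands.
    # U:V is the double-length accumulator, X the multiplier register,
    # and X_prev plays the role of the X-1 bit in the worked example.
    mask = (1 << bits) - 1
    U = V = 0
    X = x & mask           # multiplier, e.g. 1100 (= -4)
    Y = y & mask           # multiplicand, e.g. 0010 (= 2)
    neg_Y = (-y) & mask    # two's complement of Y
    X_prev = 0
    for _ in range(bits):
        pair = (X & 1, X_prev)
        if pair == (0, 1):       # 01: add Y to U
            U = (U + Y) & mask
        elif pair == (1, 0):     # 10: add -Y to U
            U = (U + neg_Y) & mask
        # Arithmetic right shift of the combined U:V pair (sign-preserving).
        UV = (U << bits) | V
        sign = UV >> (2 * bits - 1)
        UV = (UV >> 1) | (sign << (2 * bits - 1))
        U, V = UV >> bits, UV & mask
        # Circular right shift of X; X-1 receives the bit shifted out.
        X_prev = X & 1
        X = (X >> 1) | (X_prev << (bits - 1))
    product = (U << bits) | V
    # Interpret the 2*bits-wide result as signed two's complement.
    if product >= 1 << (2 * bits - 1):
        product -= 1 << (2 * bits)
    return product
```

Tracing booth_multiply(-4, 2) reproduces the table above cycle by cycle and ends with U:V = 1111 1000, i.e. -8.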