CS365-Notes-01-Algorithmic-Thinking-Arithmetic-Problems
Contents
1 Parity check of an integer
  1.1 Parity Check: Formulation for large input
  1.2 Parity Check: A Different Formulation
4 Analysis of Algorithms
  4.1 Running Time
  4.2 Elementary operations
  4.3 Runtime as a function of input size
  4.4 Best/Worst/Average Case
  4.5 Growth of runtime as size of input
5 Arithmetic problems
  5.1 Addition of two long integers
      Correctness of Algorithm for adding two integers given as arrays
      Runtime of Algorithm for adding two integers given as arrays
  5.2 Multiplication of two long integers
      Runtime of grade-school multiplication algorithm
      5.2.1 A reformulation of the multiplication problem
  5.3 Exponentiation of an integer to a power
      5.3.1 Exponentiation by iterative multiplication
      5.3.2 Exponentiation by recursive multiplication
          Runtime of recursive exponentiation
      5.3.3 Exponentiation by repeated squaring
          Runtime of repeated squaring
  5.4 Dot Product
      Runtime of dot-prod
  5.5 Matrix-Vector Multiplication
  5.6 Matrix Multiplication via dot product
  5.7 Matrix-Matrix Multiplication via Matrix-Vector Product
1 Parity check of an integer
Input: An integer A
Output: True if A is even, else False

Algorithm 1 Parity-Test-with-mod
if A mod 2 = 0 then
    return true
else
    return false
The above solution is written in pseudocode (more on that later). This algorithm solves the problem; however, it has the following issues.
It only works if A fits in an int.
What if A does not fit in an int and A's digits are given in an array?
What if A is given in binary/unary/. . . ?
Note that these issues are in addition to the usual checks that the input is valid, i.e. checking that A is not a real number, a string, an image, etc.
1.1 Parity Check: Formulation for large input

index:  7 6 5 4 3 2 1 0
A =     5 4 6 9 2 7 5 8

Here A stores the digits of 54692758, with A[0] holding the least significant digit.
By definition of even integers, we need to check if A[0] mod 2 = 0. If so then output A is
even.
Algorithm 2 Parity-Test-with-mod
Input: A - digits array
Output: true if A is even, otherwise false
if A[0] mod 2 = 0 then
return true
else
return false
Algorithm 3 Parity-Test-with-no-mod
Input: A - digits array
Output: true if A is even, otherwise false
if A[0] = 0 then return true
else if A[0] = 2 then return true
else if A[0] = 4 then return true
else if A[0] = 6 then return true
else if A[0] = 8 then return true
else
return false
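For concreteness, here is a minimal Python sketch of both tests (an illustrative rendering of Algorithms 2 and 3, assuming the digit array stores the least significant digit at index 0):

    def parity_test_with_mod(A):
        # A is a list of decimal digits with A[0] the least significant digit.
        return A[0] % 2 == 0

    def parity_test_with_no_mod(A):
        # Same test, written with comparisons only (no mod operator).
        return A[0] in (0, 2, 4, 6, 8)

For example, parity_test_with_mod([8, 5, 7, 2, 9, 6, 4, 5]) returns True, since 54692758 is even.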
Input: This is the data that is passed to the algorithm, and may be in different
forms. For instance, it could be a simple number, or a list of numbers. Other examples
include matrices, strings, graphs, images, videos, and many more. The format of input
is an important factor in designing algorithms, since processing it can involve extensive
computations.
Size of Input: The size of input is a major factor in determining the effectiveness of
an algorithm. Algorithms which may be efficient for small sizes of inputs may not do
the same for large sizes.
Output: This is the final result that the algorithm outputs. It is imperative that the
result is in a form that can be interpreted in terms of the problem which is to be solved
by the algorithm. For instance, it makes no sense to output ‘Yes’ or ‘No’ to a problem
which asked to find the maximum number out of an array of numbers.
Pseudocode: This is the language in which an algorithm is described. Note that
when we say pseudocode, we mean to write down the steps of the algorithm using
almost plain English in a manner that is understandable to a general user. Our focus
will be the solution to the problem, without going into implementation details and
technicalities of a programming language. Given our background, we will be using
structure conventions of C/C++/Java. We will see many examples of this later on
in the notes.
Formulating the problem with precise notation and definitions often yields good ideas towards a solution; e.g. both of the above algorithms just use the definition of even numbers. This is the implement-the-definition algorithm design paradigm; one can also call it a brute-force solution, though that term is more often used for search problems.
One approach that I find very useful in designing algorithms is to not look for a smart solution right away. I instead ask: what is the dumbest/most obvious/laziest way to solve the problem? What are the easiest cases, and what are the hardest cases? Where does the hardness come from when going from the easy cases to the hard cases?
3.3.1 Runtime
Consider the runtime of the algorithms we discussed so far. The first algorithm performs one mod operation on an integer and one comparison (with 0). The second algorithm performs one mod operation on a single-digit integer (recall the size of the input) and one comparison (with 0). The third algorithm performs a certain number of comparisons of single-digit integers. The actual number of comparisons depends on the input. If A is even and its last digit is 0, then it takes only one comparison; if A[0] = 2, then we first compare A[0] with 0 and then with 2, hence there are two comparisons. Similarly, if A is odd or A[0] = 8, then it performs five comparisons.
This gives rise to the concepts of best-case and worst-case analysis. While best-case analysis gives us an optimistic view of the efficiency of an algorithm, common sense dictates that we should take a pessimistic view of efficiency, since this ensures that we can do no worse than the worst case; hence our focus will be on worst-case analysis.
Note that the runtimes of these algorithms do not depend on the particular input A; as discussed above, we consider the worst case only. But they do not even depend on the size of the input. We call this constant runtime: it stays constant as we increase the size of the input. Say we double the number of digits; in the worst case the algorithm still performs 5 comparisons.
4 Analysis of Algorithms
Analysis of algorithms is the theoretical study of performance and resource utilization of
algorithms. We typically consider the efficiency/performance of an algorithm in terms of
time it takes to produce output. We could also consider utilization of other resources such
as memory, communication bandwidth, etc. One could also consider various other factors such as user-friendliness, maintainability, stability, modularity, and security. In this course we will mainly be concerned with the time efficiency of algorithms.
4.1 Running Time
One could try to measure running time by counting low-level machine operations, which varies heavily across computing systems, or by the time taken (in seconds), which depends on the machine/hardware, operating system, other concurrent programs, implementation language, programming style, etc.
We need a consistent mechanism to measure the running time of an algorithm. This runtime should be independent of the platform on which the algorithm is actually implemented (such as the computer architecture, operating system, etc.). We also need the running time to be independent of the actual programming language used to implement the algorithm.
Over the last few decades, Moore's Law has predicted orders-of-magnitude growth in available computing power, so processing that might have been infeasible 20 years ago is trivial on today's computers. Hence a more stable measure is required, namely the number of elementary operations that are executed, as a function of the size of the input. We will define what constitutes an elementary operation below.
4.4 Best/Worst/Average Case
For an input I, let T(I) denote the number of elementary operations the algorithm performs on I. For inputs of size n we define:
Worst case runtime: t_worst(n) = max_{I : |I| = n} T(I)
Best case runtime: t_best(n) = min_{I : |I| = n} T(I)
Average case runtime: t_av(n) = Average_{I : |I| = n} T(I)
5 Arithmetic problems
Now that we have established the terminology and thinking style, we look at some representative problems and see how these questions apply to the algorithms employed.
5.1 Addition of two long integers
This problem is very simple if A, B, and their sum fit within the word size of the given computer. But for larger n, as given in the problem specification, we perform grade-school digit-by-digit addition, taking care of the carry, etc.
A = 5 4 6 9 2 7 5 8 and B = 8 5 1 7 2 2 6 1, stored as digit arrays with the least significant digit at index 0. The grade-school addition (with carries) is:

      5 4 6 9 2 7 5 8
  +   8 5 1 7 2 2 6 1
  -------------------
    1 3 9 8 6 5 0 1 9
In order to determine the tens digit and the units digit of a 2-digit number, we can employ the mod operator and the divide operator. To determine the units digit, we simply mod by 10. As for the tens digit, we divide by 10, truncating the decimal. We need this to determine the carry, as we do manually. Can there be a better way to determine the carry in this case? Think about the largest possible value of a carry when we add two 1-digit integers. We can determine that there should be a carry whenever a column sum is greater than 9; in that case we only have to extract its units digit. We discussed in class that if the mod operator and type-casting are not available, then we can still separate the digits of an integer by using the definition of the positional number system.
The algorithm, using the mod operator, is as follows.
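A minimal Python sketch of this digit-by-digit addition, assuming both numbers have n digits stored with the least significant digit at index 0:

    def add_digit_arrays(A, B):
        # A, B: lists of n decimal digits, least significant digit first.
        n = len(A)
        C = [0] * (n + 1)      # the sum can have n + 1 digits
        carry = 0
        for i in range(n):
            s = A[i] + B[i] + carry
            C[i] = s % 10      # units digit of the column sum
            carry = s // 10    # carry is the tens digit (0 or 1)
        C[n] = carry
        return C

For the arrays above, add_digit_arrays([8, 5, 7, 2, 9, 6, 4, 5], [1, 6, 2, 2, 7, 1, 5, 8]) returns [9, 1, 0, 5, 6, 8, 9, 3, 1], the digits of 139865019 (least significant first).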
Correctness of Algorithm for adding two integers given as arrays The correctness of this algorithm again follows from the definition of addition.
Runtime of Algorithm for adding two integers given as arrays The running time (the number of single-digit arithmetic operations) of this algorithm is determined in detail as follows; later on we will not go into so much detail. We count how many times each step is executed.
We do not count memory operations. The algorithm clearly performs 4n 1-digit additions, n divisions, and n mod operations. If we consider all arithmetic operations to be of the same complexity, then the runtime of this algorithm is 6n. You should be able to modify the algorithm so that it can add two integers that do not have the same number of digits.
Can we improve this algorithm? Since any algorithm for adding two n-digit integers must perform some arithmetic on every digit, we cannot really improve upon this algorithm.
5.2 Multiplication of two long integers
We apply the grade-school multiplication algorithm: multiply A by the first digit of B, then by the second digit of B, and so on, and add all of the resulting arrays. We use an n × n array Z (a 2-dimensional array, or matrix) to store the intermediate arrays. Of course, we now know how to add these arrays.
          2 7 5 8
  ×       9 6 3 2
  -----------------
          5 5 1 6
        8 2 7 4
      1 6 5 4 8
  2 4 8 2 2
  -----------------
  2 6 5 6 5 0 5 6
Here again we will use the same technique to determine the carry and the units digit, etc. Can we be sure that when we multiply two 1-digit integers, the result will have at most 2 digits?
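Before analyzing the runtime, here is a minimal Python sketch of the two-phase procedure (build a table of shifted partial products, then add column by column with carries), assuming both inputs have n digits stored least significant digit first; the exact operation counts in the analysis below may differ slightly from this particular rendering:

    def gradeschool_multiply(A, B):
        # A, B: lists of n decimal digits each, least significant digit first.
        n = len(A)
        # Phase 1: row i holds the digits of A multiplied by B[i], shifted by i.
        Z = [[0] * (2 * n) for _ in range(n)]
        for i in range(n):
            carry = 0
            for j in range(n):
                p = A[j] * B[i] + carry
                Z[i][i + j] = p % 10
                carry = p // 10
            Z[i][i + n] = carry
        # Phase 2: add the rows column by column, propagating carries.
        C = [0] * (2 * n)
        carry = 0
        for col in range(2 * n):
            s = carry + sum(Z[i][col] for i in range(n))
            C[col] = s % 10
            carry = s // 10
        return C

For the example above, gradeschool_multiply([8, 5, 7, 2], [2, 3, 6, 9]) returns [6, 5, 0, 5, 6, 5, 6, 2], the digits of 26565056 (least significant first).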
Runtime of grade-school multiplication algorithm
The algorithm has two phases: in the first phase it computes the matrix, where we do all the multiplications, and in the second phase it adds the elements of the matrix column-wise.
In the first phase, two for loops are nested, each running for n iterations. You should know that, by the product rule, the body of these two nested loops is executed n^2 times. As in addition, the loop body has 6 arithmetic operations, so in total the first phase performs 6n^2 operations.
In the second phase, the outer loop runs for 2n iterations, and for each value of i the inner loop iterates n times (over the different values of j). In the body of the nested loop one addition is performed, so in total 2n^2 additions. Furthermore, the outer loop (outside the nested loop) performs one mod and one division, for a total of 2n arithmetic operations.
The grand total number of arithmetic operations performed is 6n^2 + 2n^2 + 2n = 8n^2 + 2n.
Question: when we double the size of the input (that is, make n twice its previous value), what happens to the number of operations; how do they grow? Draw a table of the number of operations for n = 2, 4, 8, 16, 32, etc.
5.2.1 A reformulation of the multiplication problem
Writing both numbers in the positional number system, the product A × B is
(A[0] ∗ 10^0 + A[1] ∗ 10^1 + A[2] ∗ 10^2 + . . .) × (B[0] ∗ 10^0 + B[1] ∗ 10^1 + B[2] ∗ 10^2 + . . .)
Algorithm 7 Multiplying two long integers using the Distributive law of multiplication over addition
1: C ← 0
2: for i = 0 to n − 1 do
3:     for j = 0 to n − 1 do
4:         C ← C + 10^(i+j) ∗ A[i] ∗ B[j]
The correctness of this algorithm also follows from the definition of multiplication, and it performs n^2 single-digit multiplications and n^2 shifts (multiplications by powers of 10).
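A direct Python rendering of this reformulation (an illustrative sketch; it leans on Python's built-in arbitrary-precision integers and 0-based digit arrays):

    def multiply_distributive(A, B):
        # C = sum over all i, j of A[i] * B[j] * 10^(i+j),
        # where A[i], B[j] are decimal digits, least significant first.
        C = 0
        for i in range(len(A)):
            for j in range(len(B)):
                C += A[i] * B[j] * 10 ** (i + j)
        return C

For example, multiply_distributive([8, 5, 7, 2], [2, 3, 6, 9]) returns 26565056, matching the worked example above.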
Can we improve these algorithms? When we study the divide-and-conquer paradigm of algorithm design, we will improve upon this algorithm.
5.3 Exponentiation of an integer to a power
5.3.1 Exponentiation by iterative multiplication
Again we apply the grade-school repeated multiplication algorithm, i.e. just execute the definition of a^n and multiply by a, n times. More precisely,
x = a^n = a ∗ a ∗ . . . ∗ a ∗ a   (n times)
Algorithm 8 Exponentiation
Input: a, n - Integers a and n ≥ 0
Output: an
1: x ← 1
2: for i = 1 to n do
3: x ← x ∗ a
4: return x
This algorithm is clearly correct and takes n multiplications; this time they are integer multiplications, not 1-digit multiplications. We can tweak it to save one multiplication by initializing x to a, but be careful: what if n = 0?
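A minimal Python sketch of the iterative algorithm, together with the tweaked version that starts from a (note the special case needed for n = 0):

    def iter_exp(a, n):
        # Computes a**n with n multiplications (as in Algorithm 8).
        x = 1
        for _ in range(n):
            x = x * a
        return x

    def iter_exp_tweaked(a, n):
        # Saves one multiplication by starting from a, but must special-case n = 0.
        if n == 0:
            return 1
        x = a
        for _ in range(n - 1):
            x = x * a
        return x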
5.3.2 Exponentiation by recursive multiplication
a^n = a ∗ a^(n−1)   if n > 1
a^n = a             if n = 1
a^n = 1             if n = 0
Algorithm 9 Recursive Exponentiation
Input: a, n - Integers a and n ≥ 0
Output: an
1: function rec-exp(a,n)
2: if n = 0 then return 1
3: else if n = 1 then return a
4: else
5: return a ∗ rec-exp(a, n − 1)
Runtime of recursive exponentiation What is its runtime? Say its runtime is T(n) when the input exponent is n (we will discuss recurrences and their solutions in more detail later). We have
T(n) = 1               if n = 0
T(n) = 1               if n = 1
T(n) = T(n − 1) + 1    if n ≥ 2
which unrolls to T(n) = n, so the recursive version performs essentially the same number of multiplications as the iterative one.
5.3.3 Exponentiation by repeated squaring
a^n = (a^(n/2))^2            if n > 0 and n is even
a^n = a ∗ (a^((n−1)/2))^2    if n is odd
a^n = 1                      if n = 0
Note that when n is even, n/2 is an integer, and when n is odd, (n − 1)/2 is an integer, so in both cases we get the same problem (exponentiating one integer to the power of another integer) but of smaller size. And smaller powers are supposed to be easy, really! Well, at least n = 0 or 1, or sometimes even 2. So we exploit this formula and use recursion.
Algorithm 10 Exponentiation by repeated squaring
Input: a, n - Integers a and n ≥ 0
Output: an
1: function rep-sq-exp(a,n)
2: if n = 0 then return 1
3: else if n > 0 and n is even then
4: z ← rep-sq-exp(a, n/2)
5: return z ∗ z
6: else
7: z ← rep-sq-exp(a, (n − 1)/2)
8: return a ∗ z ∗ z
Again, the correctness of this algorithm follows from the above formula. But you need to prove it using induction, i.e. prove that this algorithm returns a^n for all integers n ≥ 0; prove the base case using the stopping condition of the function, etc.
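A minimal Python rendering of Algorithm 10 (illustrative):

    def rep_sq_exp(a, n):
        # Exponentiation by repeated squaring; n is a non-negative integer.
        if n == 0:
            return 1
        if n % 2 == 0:
            z = rep_sq_exp(a, n // 2)
            return z * z
        z = rep_sq_exp(a, (n - 1) // 2)
        return a * z * z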
Runtime of repeated squaring Let R(n) denote the number of operations performed by rep-sq-exp on input exponent n:
R(n) = 1                    if n = 0
R(n) = 1                    if n = 1
R(n) = R(n/2) + 2           if n > 1 and n is even
R(n) = R((n − 1)/2) + 3     if n > 1 and n is odd
We will discuss recurrence relations later, but for now think about it as follows. Assume n is a power of 2, so it stays even when we halve it. Then
R(n) = R(n/2) + 2 = R(n/4) + 2 + 2 = R(n/8) + 2 + 2 + 2 = . . .
In general we have
R(n) = R(n/2^j) + 2j     (2 added j times).
When j = log n, we get
R(n) = R(n/2^(log n)) + 2 log n = R(n/n) + 2 log n = R(1) + 2 log n = 1 + 2 log n.
It is easy to argue that in general, that is, for n not necessarily a power of 2 or even, the runtime is something like 1 + 3 log n.
Exercise: Give a non-recursive implementation of repeated-squaring-based exponentiation. You can also use the binary expansion of n.
5.4 Dot Product
Given two n-dimensional vectors A = (a_1, a_2, . . . , a_n) and B = (b_1, b_2, . . . , b_n), their dot product is
A · B = Σ_{i=1}^{n} a_i ∗ b_i
The dot product is also commonly called an inner product or a scalar product. A geometric interpretation of the dot product is the following. If u is a unit vector and v is any vector, then v · u is the (signed length of the) projection of v onto u. The projection of any point p on v onto u is the point on u closest to p. If v and u are both unit vectors, then v · u is the cosine of the angle between the two vectors. So in a way the dot product between two vectors measures their similarity; it tells us how much of v is in the direction of u.
What we're interested in is how to compute the dot product. We need to multiply all the corresponding elements and then sum them up. So we can run a for loop, keep a running sum, and at the i-th iteration add the product of the i-th entries to it. The exact code is given below.
Algorithm 11 Dot product of two vectors
Input: A, B - n dimensional vectors as arrays of length n
Output: s = A · B
1: function dot-prod(A, B)
2: s←0
3: for i = 1 to n do
4: s ← s + A[i] ∗ B[i]
5: return s
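The same loop as a minimal Python sketch (0-based indexing):

    def dot_prod(A, B):
        # Running-sum computation of the dot product of two equal-length vectors.
        s = 0
        for i in range(len(A)):
            s += A[i] * B[i]
        return s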
Runtime of dot-prod How much time does this take? Well, that depends on a lot of factors: what machine you're running the program on, how many other programs are being run at the same time, what operating system you're using, and many other things. So just asking how much time a program takes to terminate doesn't really tell us much. Instead we'll ask a different question: how many elementary operations does this program perform? Elementary operations are things such as adding or multiplying two 32-bit numbers, comparing two 32-bit numbers, swapping elements of an array, etc. In particular we're interested in how the number of elementary operations performed grows with the size of the input. This captures the efficiency of the algorithm much better.
So with that in mind, let’s analyze this algorithm. What’s the size of the input? A dot
product is always between two vectors. What can change however is the size of these vectors.
So that’s our input size. So suppose the vectors are both n dimensional. Then this code
performs n additions and n multiplications. So a total of 2n elementary operations are
performed. Can we do any better? Maybe for particular cases we can, when a vector
contains a lot of zeros. But in the general case probably this is as efficient as we can get.
5.5 Matrix-Vector Multiplication
Matrix-vector multiplication is defined only for the case when the number of columns of the matrix A equals the number of rows of the vector x. So, if A is an m × n matrix (i.e., with n columns), then the product Ax is defined for n × 1 column vectors x. If we let Ax = b, then b is an m × 1 column vector. Matrix-vector multiplication should be familiar to everyone; it is illustrated in the following diagram.
[Figure: an m × n matrix A (entries a_11 . . . a_mn) multiplied by an n × 1 column vector B = (b_1, . . . , b_n) gives an m × 1 column vector C; each entry of C is the dot product of the corresponding row of A with B.]
Correct by definition.
Runtime: m dot products of n-dimensional vectors.
Total runtime: m × n real multiplications and m × n additions.
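A minimal Python sketch of matrix-vector multiplication via dot products, reusing the dot_prod sketch above (the matrix is assumed to be given as a list of its rows):

    def mat_vec(A, B):
        # A: m x n matrix as a list of m rows (each a list of n numbers);
        # B: vector of length n. Returns the m-dimensional vector A * B.
        return [dot_prod(row, B) for row in A]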
5.6 Matrix Multiplication via dot product
(C)_ij = A_i · B_j ,
i.e. entry (i, j) of C is the dot product of the i-th row of A with the j-th column of B.
[Figure: an m × n matrix A (entries a_11 . . . a_mn) times an n × k matrix B (entries b_11 . . . b_nk) gives an m × k matrix C; entry c_ij is the dot product of row i of A with column j of B.]
We know how to compute the dot product of two vectors, so we can use that for matrix multiplication. The code is as follows.
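As a sketch (not necessarily the notes' exact pseudocode), matrix multiplication via dot products can be written in Python as follows, reusing dot_prod, with both matrices given as lists of rows:

    def mat_mult(A, B):
        # A: m x n matrix (list of m rows), B: n x k matrix (list of n rows).
        # C[i][j] is the dot product of row i of A with column j of B.
        m, n, k = len(A), len(B), len(B[0])
        C = [[0] * k for _ in range(m)]
        for i in range(m):
            for j in range(k):
                col_j = [B[t][j] for t in range(n)]   # j-th column of B
                C[i][j] = dot_prod(A[i], col_j)
        return C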
Analysis: How many elementary operations are performed in this algorithm? i goes from 1 to m, and for each value of i, j goes from 1 to k. So the inner loop runs a total of mk times. Each time the inner loop runs we compute a dot product of two n-dimensional vectors. Computing the dot product of two n-dimensional vectors takes 2n operations, so the algorithm uses a total of mk ∗ 2n = 2mnk operations. Can we do any better for this problem? We defined matrix multiplication in terms of dot products and we said 2n is the minimum number of operations needed for a dot product, so it would seem we can't. But in fact there is a better algorithm for this. For those interested, you can look up Strassen's Algorithm for matrix multiplication.
5.7 Matrix-Matrix Multiplication via Matrix-Vector Product
We can also compute the product C = AB one column at a time: the j-th column of C is the matrix-vector product of A (m × n) with the j-th column of B (n × k).
[Figure: the same m × n matrix A and n × k matrix B as above, with each column of C obtained as the matrix-vector product of A with the corresponding column of B.]
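A minimal Python sketch of this column-by-column approach, reusing the mat_vec sketch above (B is assumed to be given as a list of its rows):

    def mat_mult_via_mat_vec(A, B):
        # Computes C = A * B one column at a time: the j-th column of C is
        # the matrix-vector product of A with the j-th column of B.
        n, k = len(B), len(B[0])
        columns_of_C = []
        for j in range(k):
            b_j = [B[t][j] for t in range(n)]     # j-th column of B
            columns_of_C.append(mat_vec(A, b_j))  # j-th column of C
        # Reassemble C row by row from its columns.
        m = len(A)
        return [[columns_of_C[j][i] for j in range(k)] for i in range(m)]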