
CS Study Material PDF

Sets are collections of distinct objects that can be defined using curly braces. Basic set operations include union, intersection, complement, and membership. A Venn diagram uses overlapping circles to depict the relationships between sets visually. Sets are a fundamental concept in mathematics that allow discrete objects to be grouped and manipulated as a single unit.

Uploaded by

Adhish
Copyright
© © All Rights Reserved

Basic Set Theory

A set is a Many that allows itself to be thought of as a One.
- Georg Cantor

This chapter introduces set theory and mathematical induction, and formalizes the notion of mathematical functions. The material is mostly elementary. For those of you new to abstract mathematics, elementary does not mean simple (though much of the material is fairly simple). Rather, elementary means that the material requires very little previous education to understand it. Elementary material can be quite challenging and some of the material in this chapter, if not exactly rocket science, may require that you adjust your point of view to understand it. The single most powerful technique in mathematics is to adjust your point of view until the problem you are trying to solve becomes simple.

Another point at which this material may diverge from your previous experience is that it will require proof. In standard introductory classes in algebra, trigonometry, and calculus there is currently very little emphasis on the discipline of proof. Proof is, however, the central tool of mathematics. This text is for a course that is a student's formal introduction to tools and methods of proof.

2.1 Set Theory

A set is a collection of distinct objects. This means that {1, 2, 3} is a set but {1, 1, 3} is not because 1 appears twice in the second collection. The second collection is called a multiset. Sets are often specified with curly brace notation. The set of even integers can be written:

{2n : n is an integer}

The opening and closing curly braces denote a set, 2n specifies the members of the set, the colon says “such that” or “where” and everything following the colon is a condition that explains or refines the membership. All correct mathematics can be spoken in English. The set definition above is spoken “The set of twice n where n is an integer”.

The only problem with this definition is that we do not yet have a formal definition of the integers. The integers are the set of whole numbers, both positive and negative: {0, ±1, ±2, ±3, . . .}. We now introduce the operations used to manipulate sets, using the opportunity to practice curly brace notation.

Definition 2.1 The empty set is a set containing no objects. It is written as a pair of curly braces with nothing inside {} or by using the symbol ∅.

As we shall see, the empty set is a handy object. It is also quite strange. The set of all humans that weigh at least eight tons, for example, is the empty set. Sets whose definition contains a contradiction or impossibility are often empty.

Definition 2.2 The set membership symbol ∈ is used to say that an object is a member of a set. It has a partner symbol ∉ which is used to say an object is not in a set.

Definition 2.3 We say two sets are equal if they have exactly the same members.
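Curly brace notation has a close analogue in many programming languages. The following is a minimal sketch in Python, offered only as an illustration and not as part of the formal development. The variable names are our own, and because a program cannot store infinitely many objects, the set of even integers is cut down to those 2n with -5 ≤ n ≤ 5.

```python
# {2n : n is an integer}, restricted to a finite window of integers.
# The window -5..5 is an assumption made so the set fits in memory.
evens = {2 * n for n in range(-5, 6)}

# The empty set. In Python a bare {} is an empty dictionary, so set() is used.
empty = set()

S = {1, 2, 3}
T = {2, 3, 1}

print(3 in S)        # 3 ∈ S
print(4 not in S)    # 4 ∉ S
print(S == T)        # equal sets: same members, order of listing is irrelevant
print(len(empty))    # the empty set has no members

# Python silently discards the repeated 1, so the collection {1, 1, 3}
# is stored as the set {1, 3}; it does not become a multiset.
print({1, 1, 3} == {1, 3})
```

Note one difference from the text: where the text says {1, 1, 3} is not a set, Python simply drops the duplicate and keeps a set with two members.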

Example 2.1 If

S = {1, 2, 3}

then 3 ∈ S and 4 ∉ S. The set membership symbol is often used in defining operations that manipulate sets. The set

T = {2, 3, 1}

is equal to S because they have the same members: 1, 2, and 3. While we usually list the members of a set in a “standard” order (if one is available) there is no requirement to do so and sets are indifferent to the order in which their members are listed.

Definition 2.4 The cardinality of a set is its size. For a finite set, the cardinality of a set is the number of members it contains. In symbolic notation the size of a set S is written |S|. We will deal with the idea of the cardinality of an infinite set later.

Example 2.2 Set cardinality
For the set S = {1, 2, 3} we show cardinality by writing |S| = 3

We now move on to a number of operations on sets. You are already familiar with several operations on numbers such as addition, multiplication, and negation.

Definition 2.5 The intersection of two sets S and T is the collection of all objects that are in both sets. It is written S ∩ T . Using curly brace notation

S ∩ T = {x : (x ∈ S) and (x ∈ T )}

The symbol and in the above definition is an example of a Boolean or logical operation. It is only true when both the propositions it joins are also true. It has a symbolic equivalent ∧. This lets us write the formal definition of intersection more compactly:

S ∩ T = {x : (x ∈ S) ∧ (x ∈ T )}

Example 2.3 Intersections of sets
Suppose S = {1, 2, 3, 5}, T = {1, 3, 4, 5}, and U = {2, 3, 4, 5}. Then:

S ∩ T = {1, 3, 5},

S ∩ U = {2, 3, 5}, and

T ∩ U = {3, 4, 5}

Definition 2.6 If A and B are sets and A ∩ B = ∅ then we say that A and B are disjoint, or disjoint sets.

Definition 2.7 The union of two sets S and T is the collection of all objects that are in either set. It is written S ∪ T . Using curly brace notation

S ∪ T = {x : (x ∈ S) or (x ∈ T )}

The symbol or is another Boolean operation, one that is true if either of the propositions it joins is true. Its symbolic equivalent is ∨ which lets us re-write the definition of union as:

S ∪ T = {x : (x ∈ S) ∨ (x ∈ T )}

Example 2.4 Unions of sets.
Suppose S = {1, 2, 3}, T = {1, 3, 5}, and U = {2, 3, 4, 5}. Then:

S ∪ T = {1, 2, 3, 5},

S ∪ U = {1, 2, 3, 4, 5}, and

T ∪ U = {1, 2, 3, 4, 5}

When performing set theoretic computations, you should declare the domain in which you are working. In set theory this is done by declaring a universal set.

Definition 2.8 The universal set, at least for a given collection of set theoretic computations, is the set of all possible objects.

If we declare our universal set to be the integers then {1/2, 2/3} is not a well defined set because the objects used to define it are not members of the universal set. The symbols {1/2, 2/3} do define a set if a universal set that includes 1/2 and 2/3 is chosen. The problem arises from the fact that neither of these numbers is an integer. The universal set is commonly written U . Now that we have the idea of declaring a universal set we can define another operation on sets.
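The definitions of cardinality, intersection, union, and disjointness can be tried out directly. The sketch below is one possible Python rendering of Examples 2.2, 2.3, and 2.4. The names S3, T3, U3, S4, T4, and U4 are our own labels for the sets of Examples 2.3 and 2.4, and the comprehension is written to mirror the curly brace definition rather than to be the shortest code.

```python
# Sets from Example 2.3.
S3, T3, U3 = {1, 2, 3, 5}, {1, 3, 4, 5}, {2, 3, 4, 5}

# Intersection written the long way, following
# S ∩ T = {x : (x ∈ S) ∧ (x ∈ T)}.
s_and_t = {x for x in S3 if x in T3}

# Python's built-in operator & computes the same set.
assert s_and_t == S3 & T3 == {1, 3, 5}
assert S3 & U3 == {2, 3, 5}
assert T3 & U3 == {3, 4, 5}

# Sets from Example 2.4; the operator | is union.
S4, T4, U4 = {1, 2, 3}, {1, 3, 5}, {2, 3, 4, 5}
assert S4 | T4 == {1, 2, 3, 5}
assert S4 | U4 == {1, 2, 3, 4, 5}
assert T4 | U4 == {1, 2, 3, 4, 5}

# Cardinality |S| is len(S) for a finite set (Example 2.2).
assert len(S4) == 3

# Disjoint sets have empty intersection (Definition 2.6).
A, B = {1, 2}, {3, 4}
assert A & B == set()
assert A.isdisjoint(B)
```

Running the block silently means every assertion held; a failed assertion would raise an error.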

2.1.1 Venn Diagrams

A Venn diagram is a way of depicting the relationship between sets. Each set is shown as a circle and circles overlap if the sets intersect.

Example 2.5 The following are Venn diagrams for the intersection and union of two sets. The shaded parts of the diagrams are the intersections and unions respectively.

[Figure: Venn diagrams of A ∩ B and A ∪ B]

Notice that the rectangle containing the diagram is labeled with a U representing the universal set.

Definition 2.9 The complement of a set S is the collection of objects in the universal set that are not in S. The complement is written S^c . In curly brace notation

S^c = {x : (x ∈ U ) ∧ (x ∉ S)}

or more compactly as

S^c = {x : x ∉ S}

however it should be apparent that the complement of a set always depends on which universal set is chosen.

There is also a Boolean symbol associated with the complementation operation: the not operation. The notation for not is ¬. There is not much savings in space as the definition of complement becomes

S^c = {x : ¬(x ∈ S)}

Example 2.6 Set Complements

(i) Let the universal set be the integers. Then the complement of the even integers is the odd integers.
(ii) Let the universal set be {1, 2, 3, 4, 5}, then the complement of S = {1, 2, 3} is S^c = {4, 5} while the complement of T = {1, 3, 5} is T^c = {2, 4}.
(iii) Let the universal set be the letters {a, e, i, o, u, y}. Then {y}^c = {a, e, i, o, u}.

The Venn diagram for A^c is

[Figure: Venn diagram of A^c]

We now have enough set-theory operators to use them to define more operators quickly. We will continue to give English and symbolic definitions.

Definition 2.10 The difference of two sets S and T is the collection of objects in S that are not in T . The difference is written S − T . In curly brace notation

S − T = {x : x ∈ (S ∩ (T^c ))},

or alternately

S − T = {x : (x ∈ S) ∧ (x ∉ T )}

Notice how intersection and complementation can be used together to create the difference operation and that the definition can be rephrased to use Boolean operations. There is a set of rules that reduces the number of parentheses required. These are called operator precedence rules.

(i) Other things being equal, operations are performed left-to-right.

(ii) Operations between parentheses are done first, starting with the innermost of nested parentheses.

(iii) All complementations are computed next.

(iv) All intersections are done next.

(v) All unions are performed next.

(vi) Tests of set membership and computations, equality or inequality are performed last.

Special operations like the set difference or the symmetric difference, defined below, are not included in the precedence rules and thus always use parentheses.

Example 2.7 Operator precedence
Since complementation is done before intersection the symbolic definition of the difference of sets can be rewritten:

S − T = {x : x ∈ S ∩ T^c }

If we were to take the set operations

A ∪ B ∩ C^c

and put in the parentheses we would get

(A ∪ (B ∩ (C^c )))

Definition 2.11 The symmetric difference of two sets S and T is the set of objects that are in one and only one of the sets. The symmetric difference is written S∆T . In curly brace notation:

S∆T = (S − T ) ∪ (T − S)

Example 2.8 Symmetric differences
Let S be the set of non-negative multiples of two that are no more than twenty four. Let T be the non-negative multiples of three that are no more than twenty four. Then

S∆T = {2, 3, 4, 8, 9, 10, 14, 15, 16, 20, 21, 22}

Another way to think about this is that we need numbers that are positive multiples of 2 or 3 (but not both) that are no more than 24.

Another important tool for working with sets is the ability to compare them. We have already defined what it means for two sets to be equal, and so by implication what it means for them to be unequal. We now define another comparator for sets.

Definition 2.12 For two sets S and T we say that S is a subset of T if each element of S is also an element of T . In formal notation S ⊆ T if for all x ∈ S we have x ∈ T .

If S ⊆ T then we also say T contains S which can be written T ⊇ S. If S ⊆ T and S ≠ T then we write S ⊂ T and we say S is a proper subset of T .

Example 2.9 Subsets
If A = {a, b, c} then A has eight different subsets:

∅ {a} {b} {c}
{a, b} {a, c} {b, c} {a, b, c}

Notice that A ⊆ A and in fact each set is a subset of itself. The empty set ∅ is a subset of every set.

We are now ready to prove our first proposition. Some new notation is required and we must introduce an important piece of mathematical culture. If we say “A if and only if B” then we mean that either A and B are both true or they are both false in any given circumstance. For example: “an integer x is even if and only if it is a multiple of 2”. The phrase “if and only if” is used to establish logical equivalence. Mathematically, “A if and only if B” is a way of stating that A and B are simply different ways of saying the same thing. The phrase “if and only if” is abbreviated iff and is represented symbolically as the double arrow ⇔. Proving an iff statement is done by independently demonstrating that each may be deduced from the other.

Proposition 2.1 Two sets are equal if and only if each is a subset of the other. In symbolic notation:

(A = B) ⇔ (A ⊆ B) ∧ (B ⊆ A)

Proof:

Let the two sets in question be A and B. Begin by assuming that A = B. We know that every set is a subset of itself so A ⊆ A. Since A = B we may substitute into this expression on the left and obtain B ⊆ A. Similarly we may substitute on the right and obtain A ⊆ B. We have thus demonstrated that if A = B then A and B are both subsets of each other, giving us the first half of the iff.

Assume now that A ⊆ B and B ⊆ A. Then the definition of subset tells us that any element of A is an element of B. Similarly any element of B is an element of A. This means that A and B have the same elements which satisfies the definition of set equality. We deduce A = B and we have the second half of the iff. ✷

A note on mathematical grammar: the symbol ✷ indicates the end of a proof. On a paper turned in by a student it is usually taken to mean “I think the proof ends here”. Any proof should have a ✷ to indicate its end. The student should also note the lack of calculations in the above proof. If a proof cannot be read back in (sometimes overly formal) English then it is probably incorrect. Mathematical symbols should be used for the sake of brevity or clarity, not to obscure meaning.

Proposition 2.2 De Morgan’s Laws Suppose that S and T are sets. De Morgan’s Laws state that

(i) (S ∪ T )^c = S^c ∩ T^c , and

(ii) (S ∩ T )^c = S^c ∪ T^c .

Proof:

Let x ∈ (S ∪ T )^c ; then x is not a member of S or T . Since x is not a member of S we see that x ∈ S^c . Similarly x ∈ T^c . Since x is a member of both these sets we see that x ∈ S^c ∩ T^c and we see that (S ∪ T )^c ⊆ S^c ∩ T^c . Let y ∈ S^c ∩ T^c . Then the definition of intersection tells us that y ∈ S^c and y ∈ T^c . This in turn lets us deduce that y is not a member of S ∪ T , since it is not in either set, and so we see that y ∈ (S ∪ T )^c . This demonstrates that S^c ∩ T^c ⊆ (S ∪ T )^c . Applying Proposition 2.1 we get that (S ∪ T )^c = S^c ∩ T^c and we have proven part (i). The proof of part (ii) is left as an exercise. ✷

In order to prove a mathematical statement you must prove it is always true. In order to disprove a mathematical statement you need only find a single instance where it is false. It is thus possible for a false mathematical statement to be “true most of the time”. In the next chapter we will develop the theory of prime numbers. For now we will assume the reader has a modest familiarity with the primes. The statement “Prime numbers are odd” is false once, because 2 is a prime number. All the other prime numbers are odd. The statement is a false one. This very strict definition of what makes a statement true is a convention in mathematics. We call 2 a counter-example. It is thus necessary to find only one counter-example to demonstrate a statement is (mathematically) false.

Example 2.10 Disproof by counter-example
Prove that the statement A ∪ B = A ∩ B is false.

Let A = {1, 2} and B = {3, 4}. Then A ∩ B = ∅ while A ∪ B = {1, 2, 3, 4}. The sets A and B form a counter-example to the statement.

Problems

Problem 2.1 Which of the following are sets? Assume that a proper universal set has been chosen and answer by listing the names of the collections of objects that are sets. Warning: at least one of these items has an answer that, while likely, is not 100% certain.

(i) A = {2, 3, 5, 7, 11, 13, 19}

(ii) B = {A, E, I , O, U }

(iii) C = {√x : x < 0}

(iv) D = {1, 2, A, 5, B, Q, 1, V }

(v) E is the list of first names of people in the 1972 phone book in Lawrence Kansas in the order they appear in the book. There were more than 35,000 people in Lawrence that year.

(vi) F is a list of the weight, to the nearest kilogram, of all people that were in Canada at any time in 2007.

(vii) G is a list of all weights, to the nearest kilogram, that at least one person in Canada had in 2007.
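Looking back over this section, the complement, difference, and symmetric difference operations, together with Propositions 2.1 and 2.2, can all be checked on small finite sets. The following Python sketch does so. The helper name complement and the choice of universal set are our own assumptions, and checking a law on one pair of sets is only a sanity check: it does not replace the proofs in the text.

```python
# Universal set from Example 2.6 (ii); complements depend on this choice.
U = {1, 2, 3, 4, 5}
S = {1, 2, 3}
T = {1, 3, 5}

def complement(A, universe=U):
    """Hypothetical helper: the members of the universe that are not in A."""
    return universe - A

assert complement(S) == {4, 5}
assert complement(T) == {2, 4}

# Definition 2.10: S − T equals S ∩ T^c.
assert S - T == S & complement(T) == {2}

# Example 2.8: non-negative multiples of 2 and of 3 that are at most 24.
twos = {n for n in range(25) if n % 2 == 0}
threes = {n for n in range(25) if n % 3 == 0}
sym_diff = (twos - threes) | (threes - twos)      # Definition 2.11
assert sym_diff == twos ^ threes                  # Python's built-in operator
assert sorted(sym_diff) == [2, 3, 4, 8, 9, 10, 14, 15, 16, 20, 21, 22]

# Proposition 2.1: equality is the same as being subsets of each other.
A, B = {2, 3, 1}, {1, 2, 3}
assert (A == B) == (A <= B and B <= A)

# Proposition 2.2 (De Morgan's Laws) on this one pair of sets.
assert complement(S | T) == complement(S) & complement(T)
assert complement(S & T) == complement(S) | complement(T)

# Example 2.10: a single counter-example disproves A ∪ B = A ∩ B.
C, D = {1, 2}, {3, 4}
assert C | D != C & D
```

In Python the operator & also binds more tightly than |, so an expression such as A | B & C is read as A | (B & C), in line with precedence rules (iv) and (v).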

Problem 2.2 Suppose that we have the set U = {n : 0 ≤ n < 100} of whole numbers as our universal set. Let P be the prime numbers in U , let E be the even numbers in U , and let F = {1, 2, 3, 5, 8, 13, 21, 34, 55, 89}. Describe the following sets either by listing them or with a careful English sentence.

(i) E^c ,

(ii) P ∩ F ,

(iii) P ∩ E,

(iv) F ∩ E ∪ F ∩ E^c , and

(v) F ∪ F^c .

Problem 2.3 Suppose that we take the universal set U to be the integers. Let S be the even integers, let T be the integers that can be obtained by tripling any one integer and adding one to it, and let V be the set of numbers that are whole multiples of both two and three.

(i) Write S, T , and V using symbolic notation.

(ii) Compute S ∩ T , S ∩ V and T ∩ V and give symbolic representations that do not use the symbols S, T , or V on the right hand side of the equals sign.

Problem 2.4 Compute the cardinality of the following sets. You may use other texts or the internet.

(i) Two digit positive odd integers.

(ii) Elements present in a sucrose molecule.

(iii) Isotopes of hydrogen that are not radioactive.

(iv) Planets orbiting the same star as the planet you are standing on that have moons. Assume that Pluto is a minor planet.

(v) Elements with seven electrons in their valence shell. Remember that Ununoctium was discovered in 2002 so be sure to use a relatively recent reference.

(vi) Subsets of S = {a, b, c, d} with cardinality 2.

(vii) Prime numbers whose base-ten digits sum to ten. Be careful, some have three digits.

Problem 2.5 Find an example of an infinite set that has a finite complement; be sure to state the universal set.

Problem 2.6 Find an example of an infinite set that has an infinite complement; be sure to state the universal set.

Problem 2.7 Add parentheses to each of the following expressions that enforce the operator precedence rules as in Example 2.7. Notice that the first three describe sets while the last returns a logical value (true or false).

(i) A ∪ B ∪ C ∪ D

(ii) A ∪ B ∩ C ∪ D

(iii) A^c ∩ B^c ∪ C

(iv) A ∪ B = A ∩ C

Problem 2.8 Give the Venn diagrams for the following sets.

(i) A − B (ii) B − A (iii) A^c ∩ B

(iv) A∆B (v) (A∆B)^c (vi) A^c ∪ B^c

[Figure: Venn diagram referred to in Problem 2.9]

Problem 2.9 Examine the Venn diagram above. Notice that every combination of sets has a unique number in common. Construct a similar collection of four sets.

Problem 2.10 Read Problem 2.9. Can a system of sets of this sort be constructed for any number of sets? Explain your reasoning.

Problem 2.11 Suppose we take the universal set to be the set of non-negative integers. Let E be the set of even numbers, O be the set of odd numbers and F = {0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...} be the set of Fibonacci numbers. The Fibonacci sequence is 0, 1, 1, 2, 3, 5, 8, . . . in which the next term is obtained by adding the previous two.

(i) Prove that the intersections of F with E and with O are both infinite.

(ii) Make a Venn diagram for the sets E, F , and O, and explain why this is a Mickey-Mouse problem.

Problem 2.12 A binary operation ⊙ is commutative if x ⊙ y = y ⊙ x. An example of a commutative operation is multiplication. Subtraction is non-commutative. Determine, with proof, if union, intersection, set difference, and symmetric difference are commutative.

Problem 2.13 An identity for an operation ⊙ is an object i so that, for all objects x, i ⊙ x = x ⊙ i = x. Find, with proof, identities for the operations set union and set intersection.

Problem 2.14 Prove part (ii) of Proposition 2.2.

Problem 2.15 Prove that

A ∪ (B ∪ C ) = (A ∪ B) ∪ C

Problem 2.16 Prove that

A ∩ (B ∩ C ) = (A ∩ B) ∩ C

Problem 2.17 Prove that

A∆(B∆C ) = (A∆B)∆C

Problem 2.18 Disprove that

A∆(B ∪ C ) = (A∆B) ∪ C

Problem 2.19 Consider the set S = {1, 2, 3, 4}. For each k = 0, 1, . . . , 4 how many k element subsets does S have?

Problem 2.20 Suppose we have a set S with n ≥ 0 elements. Find a formula for the number of different subsets of S that have k elements.

Problem 2.21 For finite sets S and T , prove

|S ∪ T | = |S| + |T | − |S ∩ T |

2.2 Mathematical Induction

Mathematical induction is a technique used in proving mathematical assertions. The basic idea of induction is that we prove that a statement is true in one case and then also prove that if it is true in a given case it is true in the next case. This then permits the cases for which the statement is true to cascade from the initial true case. We will start with the mathematical foundations of induction.

We assume that the reader is familiar with the symbols <, >, ≤ and ≥. From this point on we will denote the set of integers by the symbol Z. The non-negative integers are called the natural numbers. The symbol for the set of natural numbers is N. Any mathematical system rests on a foundation of axioms. Axioms are things that we simply assume to be true. We will assume the truth of the following principle, adopting it as an axiom.

The well-ordering principle: Every non-empty set of natural numbers contains a smallest element.

The well-ordering principle is an axiom that agrees with the common sense of most people familiar with the natural numbers. An empty set does not contain a smallest member because it contains no members at all. As soon as we have a set of natural numbers with some members then we can order those members in the usual fashion. Having ordered them, one will be smallest. The intuition behind this latter claim depends strongly on the fact that the integers are “whole numbers” spaced out in increments of one. To see why this is important consider the smallest positive distance. If such a distance existed, we could cut it in half to obtain a smaller distance, so the quantity contradicts its own existence. The well-ordering principle can be used to prove the correctness of induction.

Theorem 2.1 Mathematical Induction I Suppose that P (n) is a proposition that is either true or false for any given natural number n. If

(i) P (0) is true and,

(ii) when P (n) is true so is P (n + 1)

Then we may deduce that P (n) is true for any natural number.

Proof:

Assume that (i) and (ii) are both true statements. Let S be the set of all natural numbers for which P (n) is false. If S is empty then we are done, so assume that S is not empty. Then, by the well-ordering principle, S has a least member m. By (i) above m ≠ 0 and so m − 1 is a natural number. Since m is the smallest member of S it follows that P (m − 1) is true. But this means, by (ii) above, that P (m) is true. We have a contradiction and so our assumption that S ≠ ∅ must be wrong. We deduce S is empty and that as a consequence P (n) is true for all n ∈ N. ✷

The technique used in the above proof is called proof by contradiction. We start by assuming the logical opposite of what we want to prove, in this case that there is some m for which P (m) is false, and from that assumption we derive an impossibility. If an assumption can be used to demonstrate an impossibility then it is false and its logical opposite is true.

A nice problem on which to demonstrate mathematical induction is counting how many subsets a finite set has.

Proposition 2.3 Subset counting. A set S with n elements has 2^n subsets.

Proof:

First we check that the proposition is true when n = 0. The empty set has exactly one subset: itself. Since 2^0 = 1 the proposition is true for n = 0. We now assume the proposition is true for some n. Suppose that S is a set with n + 1 members and that x ∈ S. Then S − {x} (the set difference of S and a set {x} containing only x) is a set of n elements and so, by the assumption, has 2^n subsets. Now every subset of S either contains x or it fails to. Every subset of S that does not contain x is a subset of S − {x} and so there are 2^n such subsets of S. Every subset of S that contains x may be obtained in exactly one way from one that does not by taking the union with {x}. This means that the number of subsets of S containing or failing to contain x are equal. This means there are 2^n subsets of S containing x. The total number of subsets of S is thus 2^n + 2^n = 2^(n+1) . So if we assume the proposition is true for n we can demonstrate that it is also true for n + 1. It follows by mathematical induction that the proposition is true for all natural numbers. ✷

The set of all subsets of a given set is itself an important object and so has a name.

Definition 2.13 The set of all subsets of a set S is called the powerset of S. The notation for the powerset of S is P (S).

This definition permits us to rephrase Proposition 2.3 as follows: the power set of a set of n elements has size 2^n .

Theorem 2.1 lets us prove propositions that are true on the natural numbers, starting at zero. A small modification of induction can be used to prove statements that are true only for those n ≥ k for any integer k. All that is needed is to use induction on a proposition Q(n − k) where Q(n − k) is logically equivalent to P (n). If Q(n − k) is true for n − k ≥ 0 then P (n) is true for n ≥ k and we have the modified induction. The practical difference is that we start with k instead of zero.

Example 2.11 Prove that n^2 ≥ 2n for all n ≥ 2.

Notice that 2^2 = 4 = 2 × 2 so the proposition is true when n = 2. We next assume that P (n) is true for some n and we compute:

n^2 ≥ 2n
n^2 + 2n + 1 ≥ 2n + 2n + 1
(n + 1)^2 ≥ 2n + 2n + 1
(n + 1)^2 ≥ 2n + 1 + 1
(n + 1)^2 ≥ 2n + 2
(n + 1)^2 ≥ 2(n + 1)

To move from the third step to the fourth step we use the fact that 2n > 1 when n ≥ 2. The last step is P (n + 1), which means we have deduced P (n + 1) from P (n). Using the modified form of induction we have proved that n^2 ≥ 2n for all n ≥ 2.

It is possible to formalize the procedure for using mathematical induction into a three-part process. Once we have a proposition P (n),

(i) First demonstrate a base case by directly demonstrating P (k),

(ii) Next make the induction hypothesis that P (n) is true for some n,

(iii) Finally, starting with the assumption that P (n) is true, demonstrate P (n + 1).

These steps permit us to deduce that P (n) is true for all n ≥ k.

Example 2.12 Using induction, prove

1 + 2 + · · · + n = (1/2) n(n + 1)

In this case P (n) is the statement

1 + 2 + · · · + n = (1/2) n(n + 1)

Base case: 1 = (1/2) · 1 · (1 + 1), so P (1) is true. Induction hypothesis: for some n,

1 + 2 + · · · + n = (1/2) n(n + 1)

Compute:

1 + 2 + · · · + (n + 1) = 1 + 2 + · · · + n + (n + 1)
= (1/2) n(n + 1) + (n + 1)
= (1/2) (n(n + 1) + 2(n + 1))
= (1/2) (n^2 + 3n + 2)
= (1/2) (n + 1)(n + 2)
= (1/2) (n + 1)((n + 1) + 1)

and so we have shown that if P (n) is true then so is P (n + 1). We have thus proven that P (n) is true for all n ≥ 1 by mathematical induction.

We now introduce sigma notation which makes problems like the one worked in Example 2.12 easier to state and manipulate. The symbol Σ is used to add up lists of numbers. If we wished to sum some formula f (i) over a range from a to b, that is to say a ≤ i ≤ b, then we write:

Σ_{i=a}^{b} f (i)

On the other hand if S is a set of numbers and we want to add up f (s) for all s ∈ S we write:

Σ_{s∈S} f (s)

The result proved in Example 2.12 may be stated in the following form using sigma notation.

Σ_{i=1}^{n} i = (1/2) n(n + 1)

Proposition 2.4 Suppose that c is a constant and that f (i) and g(i) are formulas. Then

(i) Σ_{i=a}^{b} (f (i) + g(i)) = Σ_{i=a}^{b} f (i) + Σ_{i=a}^{b} g(i)

(ii) Σ_{i=a}^{b} (f (i) − g(i)) = Σ_{i=a}^{b} f (i) − Σ_{i=a}^{b} g(i)

(iii) Σ_{i=a}^{b} c · f (i) = c · Σ_{i=a}^{b} f (i).

Proof:

Part (i) and (ii) are both simply the associative law for addition: a + (b + c) = (a + b) + c applied many times. Part (iii) is a similar multiple application of the distributive law ca + cb = c(a + b). ✷

The sigma notation lets us work with indefinitely long (and even infinite) sums. There are other similar notations. If A_1 , A_2 , . . . , A_n are sets then the intersection or union of all these sets can be written:

⋂_{i=1}^{n} A_i

⋃_{i=1}^{n} A_i

Similarly if f (i) is a formula on the integers then

∏_{i=1}^{n} f (i)

is the notation for computing the product f (1) · f (2) · · · · · f (n). This notation is called Pi notation.
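The results of this section lend themselves to quick numerical spot checks. The Python sketch below tests Proposition 2.3, Example 2.11, Example 2.12, and Proposition 2.4 on a finite range of values and shows how sigma and Pi notation map onto sum and math.prod. The helper powerset, the test ranges, and the sample formulas f and g are our own choices; passing checks are evidence, not a substitute for the induction proofs.

```python
from itertools import combinations
from math import prod

def powerset(s):
    """Hypothetical helper: list every subset of the finite collection s."""
    items = list(s)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Proposition 2.3: a set with n elements has 2^n subsets (checked for n < 8).
for n in range(8):
    assert len(powerset(range(n))) == 2 ** n

# Example 2.11: n^2 >= 2n for n >= 2 (checked up to 999).
assert all(n * n >= 2 * n for n in range(2, 1000))

# Example 2.12: 1 + 2 + ... + n = n(n + 1)/2 (checked up to 199).
for n in range(1, 200):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2

# Proposition 2.4 for two sample formulas and a sample constant.
f = lambda i: i * i
g = lambda i: 3 * i + 1
a, b, c = 2, 10, 7
idx = range(a, b + 1)                      # the indices a <= i <= b
assert sum(f(i) + g(i) for i in idx) == sum(map(f, idx)) + sum(map(g, idx))
assert sum(f(i) - g(i) for i in idx) == sum(map(f, idx)) - sum(map(g, idx))
assert sum(c * f(i) for i in idx) == c * sum(map(f, idx))

# Pi notation: the product f(1) * f(2) * ... * f(n) with f(i) = i and n = 5.
assert prod(range(1, 6)) == 120

# Intersection and union of a whole family of sets A_1, ..., A_n.
family = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
assert set.intersection(*family) == {3}
assert set.union(*family) == {1, 2, 3, 4, 5}
```

Note that range(a, b + 1) is used because Python ranges exclude their upper end, while the sigma notation includes i = b.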

Definition 2.14 When we solve an expression involving Σ to obtain a formula that does not use Σ or “. . .” as in Example 2.12 then we say we have found a closed form for the expression. Example 2.12 finds a closed form for Σ_{i=1}^{n} i.

At this point we introduce a famous mathematical sequence in order to create an arena for practicing proofs by induction.

Definition 2.15 The Fibonacci numbers are defined as follows. f_1 = f_2 = 1 and, for n ≥ 3, f_n = f_{n−1} + f_{n−2} .

Example 2.13 The Fibonacci numbers with four or fewer digits are: f_1 = 1, f_2 = 1, f_3 = 2, f_4 = 3, f_5 = 5, f_6 = 8, f_7 = 13, f_8 = 21, f_9 = 34, f_10 = 55, f_11 = 89, f_12 = 144, f_13 = 233, f_14 = 377, f_15 = 610, f_16 = 987, f_17 = 1597, f_18 = 2584, f_19 = 4181, and f_20 = 6765.

Example 2.14 Prove that the Fibonacci number f_{3n} is even.

Solution:

Notice that f_3 = 2 and so the proposition is true when n = 1. Assume that the proposition is true for some n ≥ 1. Then:

f_{3(n+1)} = f_{3n+3} (2.1)
= f_{3n+2} + f_{3n+1} (2.2)
= f_{3n+1} + f_{3n} + f_{3n+1} (2.3)
= 2 · f_{3n+1} + f_{3n} (2.4)

but this suffices because f_{3n} is even by the induction hypothesis while 2 · f_{3n+1} is also even. The sum is thus even and so f_{3(n+1)} is even. It follows by induction that f_{3n} is even for all n. ✷

Problems

Problem 2.22 Suppose that S = {a, b, c}. Compute and list explicitly the members of the powerset, P (S).

Problem 2.23 Prove that for a finite set X that

|X | ≤ |P (X )|

Problem 2.24 Suppose that X ⊆ Y with |Y | = n and |X | = m. Compute the number of subsets of Y that contain X .

Problem 2.25 Compute the following sums.

(i) Σ_{i=1}^{20} i,

(ii) Σ_{i=10}^{30} i, and

(iii) Σ_{i=−20}^{21} i.

Problem 2.26 Using mathematical induction, prove the following formulas.

(i) Σ_{i=1}^{n} 1 = n,

(ii) Σ_{i=1}^{n} i^2 = n(n + 1)(2n + 1)/6, and

(iii) Σ_{i=1}^{n} i^3 = n^2 (n + 1)^2 /4.

Problem 2.27 If f (i) and g(i) are formulas and c and d are constants prove that

Σ_{i=a}^{b} (c · f (i) + d · g(i)) = c · Σ_{i=a}^{b} f (i) + d · Σ_{i=a}^{b} g(i)

[Figure: a 6 × 4 chocolate bar]

Problem 2.28 Suppose you want to break an n × m chocolate bar, like the 6 × 4 example shown above, into pieces corresponding to the small squares shown. What is the minimum number of breaks you can make? Prove your answer is correct.

Problem 2.29 Prove by induction that the sum of the first n odd numbers equals n^2 .

Problem 2.30 Compute the sum of the first n positive even numbers.

Problem 2.31 Find a closed form for

Σ_{i=1}^{n} (i^2 + 3i + 5)

Problem 2.32 Let f (n, 3) be the number of subsets of {1, 2, . . . , n} of size 3. Using induction, prove that f (n, 3) = (1/6) n(n − 1)(n − 2).

Problem 2.33 Suppose that we have sets X_1 , X_2 , . . . , X_n and Y_1 , Y_2 , . . . , Y_n such that X_i ⊆ Y_i . Prove that the intersection of all the X_i is a subset of the intersection of all the Y_i :

⋂_{i=1}^{n} X_i ⊆ ⋂_{i=1}^{n} Y_i

Problem 2.34 Suppose that S_1 , S_2 , . . . , S_n are sets. Prove the following generalization of De Morgan’s laws:

(i) (⋂_{i=1}^{n} S_i )^c = ⋃_{i=1}^{n} S_i^c , and

(ii) (⋃_{i=1}^{n} S_i )^c = ⋂_{i=1}^{n} S_i^c .

Problem 2.35 Prove by induction that the Fibonacci number f_{4n} is a multiple of 3.

Problem 2.36 Prove that if r is a real number, r ≠ 1 and r ≠ 0, then

Σ_{i=0}^{n} r^i = (1 − r^(n+1) )/(1 − r)

Problem 2.37 Prove by induction that the Fibonacci number f_{5n} is a multiple of 5.

Problem 2.38 Prove by induction that the Fibonacci number f_n has the value

f_n = (√5/5) · ((1 + √5)/2)^n − (√5/5) · ((1 − √5)/2)^n

Problem 2.39 Prove that for sufficiently large n the Fibonacci number f_n is the integer closest to

(√5/5) · ((1 + √5)/2)^n

and compute the exact value of f_30 . Show your work (i.e. don’t look the result up on the net).

Problem 2.40 Prove that n(n − 1)(n − 2)(n − 3)/24 is a whole number for any whole number n.

Problem 2.41 Consider the statement “All cars are the same color.” and the following “proof”.

Proof:

We will prove for n ≥ 1 that for any set of n cars all the cars in the set have the same color.

• Base Case: n = 1. If there is only one car then clearly there is only one color the car can be.

• Inductive Hypothesis: Assume that for any set of n cars there is only one color.

• Inductive step: Look at any set of n + 1 cars. Number them: 1, 2, 3, . . . , n, n + 1. Consider the sets {1, 2, 3, . . . , n} and {2, 3, 4, . . . , n + 1}. Each is a set of only n cars, therefore for each set there is only one color. But the nth car is in both sets so the color of the cars in the first set must be the same as the color of the cars in the second set. Therefore there must be only one color among all n + 1 cars.

• The proof follows by induction. ✷

What are the problems with this proof?

2.3 Functions

In this section we will define functions and extend much of our ability to work with sets to infinite sets. There are a number of different types of functions and so this section contains a great deal of terminology. Recall that two finite sets are the same size if they contain the same number of elements. It is possible to make this idea formal by using functions and, once the notion is formally defined, it can be applied to infinite sets.

Definition 2.16 An ordered pair is a collection of two elements with the added property that one element comes first and one element comes second. The set containing only x and y (for x ≠ y) is written {x, y}. The ordered pair containing x and y with x first is written (x, y). Notice that while {x, x} is not a well defined set, (x, x) is a well defined ordered pair because the two copies of x are different by virtue of coming first and second.

The reason for defining ordered pairs at this point is that it permits us to make an important formal definition that pervades the rest of mathematics.

Definition 2.17 A function f with domain S and range T is a set of ordered pairs (s, t) with first element from S and second element from T that has the property that every element of S appears exactly once as the first element in some ordered pair. We write f : S → T for such a function.

Example 2.15 Suppose that A = {a, b, c} and B = {0, 1} then

f = {(a, 0), (b, 1), (c, 0)}

is a function from A to B. The function f : A → B can also be specified by saying f(a) = 0, f(b) = 1 and f(c) = 0.

The set of ordered pairs {(a, 0), (b, 1)} is not a function from A to B because c is not the first coordinate of any ordered pair. The set of ordered pairs {(a, 0), (a, 1), (b, 0), (c, 0)} is not a function from A to B because a appears as the first coordinate of two different ordered pairs.

In calculus you may have learned the vertical line rule that states that the graph of a function may not intersect a vertical line at more than one point. This corresponds to requiring that each point in the domain of the function appear in only one ordered pair. In set theory, all functions are required to state their domain and range when they are defined. In calculus functions had a domain that was a subset of the real numbers and you were sometimes required to identify the subset.

Example 2.16 This example contrasts the way functions were treated in a typical calculus course with the way we treat them in set theory.

Calculus: find the domain of the function $f(x) = \sqrt{x}$. Since we know that the square root function exists only for non-negative real numbers the domain is {x : x ≥ 0}.

Set theory: the function $f = \sqrt{x}$ from the non-negative real numbers to the real numbers is the set of ordered pairs $\{(r^2, r) : r \geq 0\}$. This function is well defined because each non-negative real number is the square of exactly one non-negative real number.

The major contrasts between functions in calculus and functions in set theory are:

(i) The domain of functions in calculus are often specified only by implication (you have to know how all the functions used work) and are almost always a subset of the real numbers. The domain in set theory must be explicitly specified and may be any set at all.

(ii) Functions in calculus typically had graphs that you could draw and look at. Geometric intuition driven by the graphs plays a major role in our understanding of functions. Functions in set theory are seldom graphed and often don't have a graph.

A point of similarity between calculus and set theory is that the range of the function is not explicitly specified. When we have a function f : S → T then the range of f is a subset of T.

Definition 2.18 If f is a function then we denote the domain of f by dom(f) and the range of f by rng(f).

Example 2.17 Suppose that f : N → N is defined by f(n) = 2n. Then the domain and range of f are the natural numbers: dom(f) = rng(f) = N. If we specify the ordered pairs of f we get

f = {(n, 2n) : n ∈ N}

There are actually two definitions of range that are used in mathematics. The definition we are using, the set from which second coordinates of ordered pairs in a function are drawn, is also the definition typically used in computer science. The other definition is the set of second coordinates that actually appear in ordered pairs. This set, which we will define formally later, is the image of the function. To make matters even worse the set we are calling the range of a function is also called the co-domain. We include these confusing terminological notes for students who may try to look up supplemental material.
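For finite sets, the condition in Definition 2.17 can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper name `is_function` and the variable names are assumptions made for illustration, and the data is taken from Example 2.15.

```python
def is_function(pairs, domain, codomain):
    """Check Definition 2.17 for a finite set of ordered pairs.

    Every pair must run from `domain` to `codomain`, and every element
    of `domain` must appear exactly once as a first coordinate.
    """
    firsts = [s for (s, _) in pairs]
    well_typed = all(s in domain and t in codomain for (s, t) in pairs)
    exactly_once = all(firsts.count(s) == 1 for s in domain)
    return well_typed and exactly_once


# Data from Example 2.15.
A = {"a", "b", "c"}
B = {0, 1}
f = {("a", 0), ("b", 1), ("c", 0)}                    # a function from A to B
missing_c = {("a", 0), ("b", 1)}                      # c never appears first
repeated_a = {("a", 0), ("a", 1), ("b", 0), ("c", 0)}  # a appears first twice

print(is_function(f, A, B))
print(is_function(missing_c, A, B))
print(is_function(repeated_a, A, B))
```

The check mirrors the two failure modes described in Example 2.15: a domain element that is missing as a first coordinate, and a domain element that appears as a first coordinate more than once.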

Definition 2.19 Let X, Y, and Z be sets. The composition of two functions f : X → Y and g : Y → Z is a function h : X → Z for which h(x) = g(f(x)) for all x ∈ X. We write g ◦ f for the composition of g with f.

The definition of the composition of two functions requires a little checking to make sure it makes sense. Since every point must appear as a first coordinate of an ordered pair in a function, every result of applying f to an element of X is an element of Y to which g can be applied. This means that h is a well-defined set of ordered pairs. Notice that the order of composition is important - if the sets X, Y, and Z are distinct there is only one order in which composition even makes sense.

Example 2.18 Suppose that f : N → N is given by f(n) = 2n while g : N → N is given by g(n) = n + 4. Then

(g ◦ f)(n) = 2n + 4

while

(f ◦ g)(n) = 2(n + 4) = 2n + 8

We now start a series of definitions that divide functions into a number of classes. We will arrive at a point where we can determine if the mapping of a function is reversible, if there is a function that exactly reverses the action of a given function.

Definition 2.20 A function f : S → T is injective or one-to-one if no element of T (no second coordinate) appears in more than one ordered pair. Such a function is called an injection.

Example 2.19 The function f : N → N given by f(n) = 2n is an injection. The ordered pairs of f are (n, 2n) and so any number that appears as a second coordinate does so once.

The function g : Z → Z given by $g(n) = n^2$ is not an injection. To see this notice that g contains the ordered pairs (1, 1) and (−1, 1) so that 1 appears twice as the second coordinate of an ordered pair.

Definition 2.21 A function f : S → T is surjective or onto if every element of T appears in an ordered pair. Surjective functions are called surjections.

We use the symbol R for the real numbers. We also assume familiarity with interval notation for contiguous subsets of the reals. For real numbers a ≤ b

(a, b) is {x : a < x < b}
(a, b] is {x : a < x ≤ b}
[a, b) is {x : a ≤ x < b}
[a, b] is {x : a ≤ x ≤ b}

Example 2.20 The function f : Z → Z given by f(n) = 5 − n is a surjection. If we set m = 5 − n then n = 5 − m. This means that if we want to find some n so that f(n) is, for example, 8, then 5 − 8 = −3 and we see that f(−3) = 8. This demonstrates that all m have some n so that f(n) = m, showing that all m appear as the second coordinate of an ordered pair in f.

The function g : R → R given by $g(x) = \frac{x^2}{1+x^2}$ is not a surjection because −1 < g(x) < 1 for all x ∈ R.

Definition 2.22 A function that is both surjective and injective is said to be bijective. Bijective functions are called bijections.

Example 2.21 The function f : Z → Z given by f(n) = n is a bijection. All of its ordered pairs have the same first and second coordinate. This function is called the identity function.

[Figure: graph of $g(x) = x^3 - 4x$ for −3 ≤ x ≤ 3.]

The function g : R → R given by $g(x) = x^3 - 4x$ is not a bijection. It is not too hard to show that it is a surjection, but it fails to be an injection. The portion of the graph shown above demonstrates that g(x) takes on the same value more than once. This means that

some numbers appear twice as second coordinates of ordered pairs in g. We can use the graph because g is a function from the real numbers to the real numbers.

For a function f : S → T to be a bijection every element of S appears in an ordered pair as the first member of an ordered pair and every element of T appears in an ordered pair as the second member of an ordered pair. Another way to view a bijection is as a matching of the elements of S and T so that every element of S is paired with an element of T. For finite sets this is clearly only possible if the sets are the same size and, in fact, this is the formal definition of "same size" for sets.

Definition 2.23 Two sets S and T are defined to be the same size or to have equal cardinality if there is a bijection f : S → T.

Example 2.22 The sets A = {a, b, c} and Z = {1, 2, 3} are the same size. This is obvious because they have the same number of elements, |A| = |Z| = 3, but we can construct an explicit bijection

f = {(a, 3), (b, 1), (c, 2)}

with each member of A appearing once as a first coordinate and each member of Z appearing once as a second coordinate. This bijection is a witness that A and Z are the same size.

Now let Z denote the set of integers and let E be the set of even integers. Then the function

g : Z → E

in which g(n) = 2n is a bijection. Notice that each integer can be put into g and that each even integer has exactly one integer that can be doubled to make it. The existence of g is a witness that the set of integers and the set of even integers are the same size. This may seem a bit bizarre because the set Z − E is the infinite set of odd integers. In fact one hallmark of an infinite set is that it can be the same size as a proper subset. This also means we now have an equality set for sizes of infinite sets. We will do a good deal more with this in Chapter 3.

Bijections have another nice property: they can be unambiguously reversed.

Definition 2.24 The inverse of a function f : S → T is a function g : T → S so that for all x ∈ S, g(f(x)) = x and for all y ∈ T, f(g(y)) = y.

If a function f has an inverse we use the notation $f^{-1}$ for that inverse. Since an exponent of −1 also means reciprocal in some circumstances this can be a bit confusing. The notational confusion is resolved by considering context. So long as we keep firmly in mind that functions are sets of ordered pairs it is easy to prove the proposition/definition that follows after the next example.

Example 2.23 If E is the set of even integers then the bijection f(n) = 2n from Z to E has the inverse $g = f^{-1} : E \to Z$ given by g(2n) = n. Notice that defining the rule for g as depending on the argument 2n seamlessly incorporates the fact that the domain of g is the even integers.

[Figure: graph of $g(x) = \frac{x}{x-1}$ with its asymptotes x = 1 and y = 1.]

If $g(x) = \frac{x}{x-1}$, shown above with its asymptotes x = 1 and y = 1, then g is a function from the set H = R − {1} to itself. The function was chosen to have asymptotes at equal x and y values; this is a bit unusual. The function g is a bijection. Notice that the graph intersects any horizontal or vertical line in at most one point. Every value except x = 1 may be put into g meaning that g is a function on H. Since the vertical asymptote goes off to ∞ in both directions, all values in H come out of g. This demonstrates g is a bijection. This means that it has an inverse which we now compute using a standard

technique from calculus classes.

$$\begin{aligned}
y &= \frac{x}{x-1} \\
y(x-1) &= x \\
xy - y &= x \\
xy - x &= y \\
x(y-1) &= y \\
x &= \frac{y}{y-1}
\end{aligned}$$

which tells us that $g^{-1}(x) = \frac{x}{x-1}$ so $g = g^{-1}$: the function is its own inverse.

Proposition 2.5 A function has an inverse if and only if it is a bijection.

Proof:
Suppose that f : S → T is a bijection. Then if g : T → S has ordered pairs that are the exact reverse of those given by f it is obvious that for all x ∈ S, g(f(x)) = x, likewise that for all y ∈ T, f(g(y)) = y. We have that bijections possess inverses. It remains to show that non-bijections do not have inverses.

If f : S → T is not a bijection then either it is not a surjection or it is not an injection. If f is not a surjection then there is some t ∈ T that appears in no ordered pair of f. This means that no matter what g(t) is, f(g(t)) ≠ t and we fail to have an inverse. If, on the other hand, f : S → T is a surjection but fails to be an injection then for some distinct a, b ∈ S we have that f(a) = t = f(b). For g : T → S to be an inverse of f we would need g(t) = a and g(t) = b, forcing t to appear as the first coordinate of two ordered pairs in g and so rendering g a non-function. We thus have that non-bijections do not have inverses. ✷

The type of inverse we are discussing above is a two-sided inverse. The functions f and $f^{-1}$ are mutually inverses of one another. It is possible to find a function that is a one-way inverse of a function so that f(g(x)) = x but g(f(x)) is not even defined. These are called one-sided inverses.

Note on mathematical grammar: Recall that when two notions, such as "bijection" and "has an inverse" are equivalent we use the phrase "if and only if" (abbreviated iff) to phrase a proposition declaring that the notions are equivalent. A proposition that A iff B is proven by first assuming A and deducing B and then separately assuming B and deducing A. The formal symbol for A iff B is A ⇔ B. Likewise we have symbols for the ability to deduce B given A, A ⇒ B and vice-versa B ⇒ A. These symbols are spoken "A implies B" and "B implies A" respectively.

Proposition 2.6 Suppose that X, Y, and Z are sets. If f : X → Y and g : Y → Z are bijections then so is g ◦ f : X → Z.

Proof: this proof is left as an exercise.

Definition 2.25 Suppose that f : A → B is a function. The image of A in B is the subset of B made of elements that appear as the second element of ordered pairs in f. Colloquially the image of f is the set of elements of B hit by f. We use the notation Im(f) for images. In other words Im(f) = {f(a) : a ∈ A}.

Example 2.24 If f : N → N is given by the rule f(n) = 3n then the set T = {0, 3, 6, . . .} of natural numbers that are multiples of three is the image of f. Notation: Im(f) = T.

If g : R → R is given by $g(x) = x^2$ then

Im(g) = {y : y ≥ 0, y ∈ R}

There is a name for the set of all ordered pairs drawn from two sets.

Definition 2.26 If A and B are sets then the set of all ordered pairs with the first element from A and the second from B is called the Cartesian Product of A and B.

The notation for the Cartesian product of A and B is A × B. Using curly brace notation:

A × B = {(a, b) : a ∈ A, b ∈ B}

Example 2.25 If A = {1, 2} and B = {x, y} then

A × B = {(1, x), (1, y), (2, x), (2, y)}

The Cartesian plane is an example of a Cartesian product of the real numbers with themselves: R × R.
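For finite functions stored as sets of ordered pairs, Definitions 2.20 through 2.26 translate almost word for word into code. The sketch below is an illustration only: the helper names (`image`, `is_injective`, `is_surjective`, `inverse`, `cartesian_product`) are assumptions, not notation from the text, and each input is assumed to already be a function in the sense of Definition 2.17.

```python
def image(f):
    """Im(f): the second coordinates that actually appear (Definition 2.25)."""
    return {t for (_, t) in f}


def is_injective(f):
    """No second coordinate appears in more than one pair (Definition 2.20)."""
    seconds = [t for (_, t) in f]
    return len(seconds) == len(set(seconds))


def is_surjective(f, codomain):
    """Every element of the codomain appears as a second coordinate (Definition 2.21)."""
    return image(f) == set(codomain)


def inverse(f, codomain):
    """Reverse every pair if f is a bijection; return None otherwise (Proposition 2.5)."""
    if is_injective(f) and is_surjective(f, codomain):
        return {(t, s) for (s, t) in f}
    return None


def cartesian_product(A, B):
    """All ordered pairs (a, b) with a in A and b in B (Definition 2.26)."""
    return {(a, b) for a in A for b in B}


# The bijection of Example 2.22 and the non-injective function of Example 2.15.
f22 = {("a", 3), ("b", 1), ("c", 2)}
f15 = {("a", 0), ("b", 1), ("c", 0)}

print(inverse(f22, {1, 2, 3}))
print(inverse(f15, {0, 1}))
print(cartesian_product({1, 2}, {"x", "y"}))
```

Returning `None` for a non-bijection is a design choice of this sketch; it reflects the half of Proposition 2.5 that says a function which is not a bijection has no inverse.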

2.3.1 Permutations

In this section we will look at a very useful sort of function, bijections of finite sets.

Definition 2.27 A permutation is a bijection of a finite set with itself. Likewise a bijection of a finite set X with itself is called a permutation of X.

Example 2.26 Let A = {a, b, c} then the possible permutations of A consist of the following six functions:

{(a,a),(b,b),(c,c)}   {(a,a),(b,c),(c,b)}
{(a,b),(b,a),(c,c)}   {(a,b),(b,c),(c,a)}
{(a,c),(b,a),(c,b)}   {(a,c),(b,b),(c,a)}

Notice that the number of permutations of three objects does not depend on the identity of those objects. In fact there are always six permutations of any set of three objects. We now define a handy function that uses a rather odd notation. The method of showing permutations in Example 2.26, explicit listing of ordered pairs, is a bit cumbersome.

Definition 2.28 Assume that we have agreed on an order, e.g. a, b, c, for the members of a set X = {a, b, c}. Then one-line notation for a permutation f consists of listing the second coordinates of the ordered pairs, with the first coordinates taken in the agreed on order. The table in Example 2.26 would become:

abc acb bac bca cab cba

in one line notation. Notice the saving of space.

Definition 2.29 The factorial of a natural number n is the product

$$n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = \prod_{i=1}^{n} i$$

with the convention that the factorial of 0 is 1. We denote the factorial of n as n!, spoken "n factorial".

Example 2.27 Here are the first few factorials:

n    0  1  2  3  4   5    6    7
n!   1  1  2  6  24  120  720  5040

Proposition 2.7 The number of permutations of a finite set with n elements is n!.

Proof: this proof is left as an exercise.

Notice that one implication of Proposition 2.6 is that the composition of two permutations is a permutation. This means that the set of permutations of a set is closed under functional composition.

Definition 2.30 A fixed point of a function f : S → S is any x ∈ S such that f(x) = x. We say that f fixes x.

Problems

Problem 2.42 Suppose for finite sets A and B that f : A → B is an injective function. Prove that |B| ≥ |A|.

Problem 2.43 Suppose that for finite sets A and B that f : A → B is a surjective function. Prove that |A| ≥ |B|.

Problem 2.44 Using functions from the integers to the integers give an example of

(i) A function that is an injection but not a surjection.

(ii) A function that is a surjection but not an injection.

(iii) A function that is neither an injection nor a surjection.

(iv) A bijection that is not the identity function.

Problem 2.45 For each of the following functions from the real numbers to the real numbers say if the function is surjective or injective. It may be neither.

(i) $f(x) = x^2$

(ii) $g(x) = x^3$

(iii) $h(x) = \begin{cases} \sqrt{x} & x \geq 0 \\ -\sqrt{-x} & x < 0 \end{cases}$
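The one-line notation of Definition 2.28 and the count in Proposition 2.7 can be explored with Python's standard library. This is a small sketch for experimentation, not a proof; the helper names `one_line_permutations` and `fixed_points` are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def one_line_permutations(elements):
    """List every permutation of `elements` in one-line notation (Definition 2.28)."""
    return ["".join(p) for p in permutations(elements)]


def fixed_points(one_line, elements):
    """Elements x with f(x) = x, where f is given in one-line notation (Definition 2.30)."""
    return [x for x, fx in zip(elements, one_line) if x == fx]


perms = one_line_permutations("abc")
print(perms)                        # the six entries of Example 2.26
print(len(perms) == factorial(3))   # agrees with Proposition 2.7 for n = 3
print(fixed_points("acb", "abc"))   # the permutation acb fixes only a
```

Here `itertools.permutations` generates the orderings of the agreed order "abc", and each ordering is read as the list of second coordinates f(a), f(b), f(c).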

Interlude
The Collatz Conjecture

One of the most interesting features of mathematics is that it is possible to phrase problems
in a few lines that turn out to be incredibly hard. The Collatz conjecture was first posed
in 1937 by Lothar Collatz. Define the function f from the natural numbers to the natural
numbers with the rule
$$f(n) = \begin{cases} 3n + 1 & n \text{ odd} \\ \frac{n}{2} & n \text{ even} \end{cases}$$
Collatz’ conjecture is that if you apply f repeatedly to a positive integer then the resulting
sequence of numbers eventually arrives at one. If we start with 17, for example, the result
of repeatedly applying f is:

f (17) = 52, f (52) = 26, f (26) = 13, f (13) = 40, f (40) = 20, f (20) = 10, f (10) = 5,
f (5) = 16, f (16) = 8, f (8) = 4, f (4) = 2, f (2) = 1

The sequences of numbers generated by repeatedly applying f to a natural number are
called hailstone sequences with the collapse of the value when a large power of 2 appears
being analogous to the impact of a hailstone. If we start with the number 27 then 111 steps
are required to reach one and the largest intermediate number is 9232. This quite irregular
behavior of the sequence is not at all apparent in the original phrasing of the problem.
The Collatz conjecture has been checked for numbers up to $5 \times 2^{60}$ (about $5.764 \times 10^{18}$)
by using a variety of computational tricks. It has not, however, been proven or disproven.
The very simple statement of the problem causes mathematicians to underestimate the
difficulty of the problem. At one point a mathematician suggested that the problem might
have been developed by the Russians as a way to slow American mathematical research.
This was after several of his colleagues spent months working on the problem without
obtaining results.
A simple (but incorrect) argument suggests that hailstone sequences ought to grow indefi-
nitely. Half of all numbers are odd, half are even. The function f slightly more than triples
odd numbers and divides even numbers in half. Thus, on average, f increases the value of
numbers. The problem is this: half of all even numbers are multiples of four and so are
divided in half twice. One-quarter of all even numbers are multiples of eight and so get
divided in half three times, and so on. The net effect of factors that are powers of two is
to defeat the simple argument that f grows “on average”.
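Hailstone sequences are easy to generate by machine. The following minimal sketch uses the function names `collatz_step` and `hailstone`, which are assumptions made for illustration. It assumes a positive integer input, and its loop terminates only for starting values whose sequence actually reaches one, which is exactly what the conjecture asserts but does not prove.

```python
def collatz_step(n):
    """Apply the rule f once: 3n + 1 for odd n, n / 2 for even n."""
    return 3 * n + 1 if n % 2 == 1 else n // 2


def hailstone(n):
    """Return the hailstone sequence starting at the positive integer n and ending at 1."""
    sequence = [n]
    while n != 1:
        n = collatz_step(n)
        sequence.append(n)
    return sequence


seq17 = hailstone(17)
seq27 = hailstone(27)
print(seq17)                        # starts 17, 52, 26, 13, ... and ends at 1
print(len(seq27) - 1, max(seq27))   # number of steps and largest value for 27
```

Running it on 17 reproduces the chain of values listed above, and running it on 27 reproduces the step count and peak value quoted in the text.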

Problem 2.46 True or false (and explain): The function $f(x) = \frac{x-1}{x+1}$ is a bijection from the real numbers to the real numbers.

Problem 2.47 Find a function that is an injection of the integers into the even integers that does not appear in any of the examples in this chapter.

Problem 2.48 Suppose that B ⊂ A and that there exists a bijection f : A → B. What may be reasonably deduced about the set A?

Problem 2.49 Suppose that A and B are finite sets. Prove that |A × B| = |A| · |B|.

Problem 2.50 Suppose that we define h : N → N as follows. If n is even then h(n) = n/2 but if n is odd then h(n) = 3n + 1. Determine if h is a (i) surjection or (ii) injection.

Problem 2.51 Prove proposition 2.6.

Problem 2.52 Prove or disprove: the composition of injections is an injection.

Problem 2.53 Prove or disprove: the composition of surjections is a surjection.

Problem 2.54 Prove proposition 2.7.

Problem 2.55 List all permutations of

X = {1, 2, 3, 4}

using one-line notation.

Problem 2.56 Suppose that X is a set and that f, g, and h are permutations of X. Prove that the equation f ◦ g = h has a solution g for any given permutations f and h.

Problem 2.57 Examine the permutation f of Q = {a, b, c, d, e} which is bcaed in one line notation. If we create the series f, f ◦ f, f ◦ (f ◦ f), . . . does the identity function, abcde, ever appear in the series? If so, what is its first appearance? If not, why not?

Problem 2.58 If f is a permutation of a finite set, prove that the sequence f, f ◦ f, f ◦ (f ◦ f), . . . must contain repeated elements.

Problem 2.59 Suppose that X and Y are finite sets and that |X| = |Y| = n. Prove that there are n! bijections of X with Y.

Problem 2.60 Suppose that X and Y are sets with |X| = n, |Y| = m. Count the number of functions from X to Y.

Problem 2.61 Suppose that X and Y are sets with |X| = n, |Y| = m for m > n. Count the number of injections of X into Y.

Problem 2.62 For a finite set S with a subset T prove that the permutations of S that have all members of T as fixed points form a set that is closed under functional composition.

Problem 2.63 Compute the number of permutations of a set S with n members that fix at least m < n points.

Problem 2.64 Using any technique at all, estimate the fraction of permutations of an n-element set that have no fixed points. This problem is intended as an exploration.

Problem 2.65 Let X be a finite set with |X| = n. Let C = X × X. How many subsets of C have the property that every element of X appears once as a first coordinate of some ordered pair and once as a second coordinate of some ordered pair?

Problem 2.66 An alternate version of Sigma ($\sum$) and Pi ($\prod$) notation works by using a set as an index. So if S = {1, 3, 5, 7} then

$$\sum_{s \in S} s = 16 \quad \text{and} \quad \prod_{s \in S} s = 105$$

Given all the material so far, give and defend reasonable values for the sum and product of an empty set.

Problem 2.67 Suppose that $f_\alpha : [0, 1] \to [0, 1]$ for −1 < α < ∞ is given by

$$f_\alpha(x) = \frac{(\alpha + 1)x}{\alpha x + 1},$$

prove that $f_\alpha$ is a bijection.

Problem 2.68 Find, to five decimals accuracy:

Ln(200!)

Explain how you obtained the answer.


2.4 ∞ + 1

We conclude the chapter with a brief section that demonstrates a strange thing that can be accomplished with set notation. We choose to represent the natural numbers 0, 1, 2, . . . by sets that contain the number of elements counted by the corresponding natural number. We also choose to do so as simply as possible, using only curly braces and commas. Given this the numbers and their corresponding sets are:

0 : {}
1 : {{}} = {0}
2 : {{},{{}}} = {0,1}
3 : {{},{{}},{{},{{}}}} = {0,1,2}
4 : {{},{{}},{{},{{}}},{{},{{}},{{},{{}}}}} = {0,1,2,3}

The trick for the above representation is this. Zero is represented by the empty set. One is represented by the set of the only thing we have constructed - zero, represented as the empty set. Similarly the representation of two is the set of the representation of zero and one (the empty set and the set of the empty set). This representation is incredibly inefficient but it uses a very small number of symbols. This representation also has a useful property. As always, we will start with a definition.

Definition 2.31 The minimal set representation of the natural numbers is constructed as follows:

(i) Let 0 be represented by the empty set.

(ii) For n > 0 let n be represented by the set {0, 1, . . . , n − 1}.

The shorthand {0, 1} for {{}, {{}}} is called the simplified notation for the minimal set representation. We now give the useful property of the minimal set representation.

Proposition 2.8 n + 1 = n ∪ {n}

Proof:
This follows directly from Definition 2.31 by considering the set difference of the representations of n and n − 1. ✷

The definition says that any set of the representations of consecutive natural numbers, starting at zero, is the representation of the next natural number. This permits us to conclude that the set of all natural numbers

{0, 1, 2, . . .}

fits the definition of a natural number. Which natural number is it? It is easy to see, in the minimal set representation, that for natural numbers m and n, m < n implies that the representation of m is a subset of the representation of n. Every finite natural number is a subset of the set of all natural numbers and so we conclude that {0, 1, 2, . . .} is an infinite natural number. The set notation thus permits us to construct an infinite number.

The set consisting of the representations of all finite natural numbers is an infinite natural number. The number has been given the name ω, the lower-case omega. In addition to being a letter omega traditionally also means "the last". The number ω comes after all the finite natural numbers. If we now apply Proposition 2.8 we see that

ω ∪ {ω} = ω + 1

This means that we can add one to an infinite number. Is the resulting number ω + 1 a different number from ω? It turns out the answer is "yes", because the representations of these numbers are different as sets. The representation of ω contains no infinite sets while the representation of ω + 1 contains one.

Problems

Problem 2.69 Find the representation for 5 using the curly-brace-and-comma notation.

Problem 2.70 Give the minimal set representation of ω + 2 using the simplified notation.

Problem 2.71 Suppose that n > m are natural numbers and that S is the minimal set representation of n while T is the minimal set representation of m. Is the representation of n − m a member of the set difference S − T?

Problem 2.72 Give a formula, as a function of n, for the number of times that the symbol { appears in the representation of n.

Problem 2.73 Prove or disprove: there are an infinite number of distinct infinite numbers.
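The finite part of the minimal set representation can be built directly from Proposition 2.8. The sketch below is an illustration under stated assumptions: the function names `representation` and `braces` are hypothetical, Python's `frozenset` stands in for a set (so that sets can be members of sets), and nothing here attempts to model ω.

```python
def representation(n):
    """Build the minimal set representation of n by repeating n + 1 = n ∪ {n}."""
    rep = frozenset()           # 0 is represented by the empty set
    for _ in range(n):
        rep = rep | {rep}       # Proposition 2.8
    return rep


def braces(rep):
    """Render a representation using only curly braces and commas."""
    # Shorter strings belong to smaller numbers, so sorting by length
    # lists the members in the order 0, 1, 2, ...
    members = sorted((braces(member) for member in rep), key=len)
    return "{" + ",".join(members) + "}"


three = representation(3)
print(braces(three))
print(len(representation(4)))               # n is represented by a set with n elements
print(representation(2) in three)           # 2 is a member of 3
print(representation(2) < three)            # and also a proper subset of 3
```

The last two lines illustrate the property used in the text: in this representation a smaller natural number is both a member and a subset of every larger one.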

Linear Equations and Matrices

The purpose of this chapter is to learn about linear systems. We will restrict our discussion for now to
equations whose coefficients are real numbers. In order to develop the algorithmic approach to linear
systems known as Gaussian reduction, we will introduce the notion of a matrix so that we can approach any
system via its coefficient matrix. This allows us to state a set of rules called row operations to bring our
equations into a normal form called the reduced row echelon form of the system. The set of solutions may
then be expressed in terms of fundamental and particular solutions. Along the way, we will develop the
criterion for a system to have a unique solution. After we have developed some further algebraic tools, which
will come in the next chapter, we’ll be able to considerably strengthen the techniques we developed in this
chapter.

2.1 Linear equations: the beginning of algebra


The subject of algebra arose from studying equations. For example, one might want to find all the real
numbers x such that $x = x^2 - 6$. To solve, we could rewrite our equation as $x^2 - x - 6 = 0$ and then factor its
left hand side. This would tell us that (x − 3)(x + 2) = 0, so we would conclude that either x = 3 or x = −2
since either x − 3 or x + 2 has to be zero. Finding the roots of a polynomial is a nonlinear problem,
whereas the topic to be studied here is the theory of linear equations.
The simplest linear equation is the equation ax = b. The letter x is the variable, and a and b are fixed
numbers. For example, consider 4x = 3. The solution is x = 3/4. In general, if a ≠ 0, then x = b/a, and
this solution is unique. If a = 0 and b ≠ 0, there is no solution, since the equation says 0 = b. And in
the case where a and b are both 0, every real number x is a solution. This points out a general property
of linear equations. Either there is a unique solution (i.e. exactly one), no solution or infinitely many
solutions.
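The three cases for the equation ax = b can be written as a short case analysis. This is a minimal sketch, assuming integer or rational coefficients; the function name `solve_linear` and the labels "unique", "none" and "all" are assumptions made for illustration, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Describe the solution set of a*x = b for rational a and b."""
    a, b = Fraction(a), Fraction(b)
    if a != 0:
        return ("unique", b / a)   # exactly one solution, x = b/a
    if b != 0:
        return ("none", None)      # the equation reads 0 = b with b nonzero
    return ("all", None)           # 0 = 0, so every real number x is a solution


print(solve_linear(4, 3))   # the example 4x = 3
print(solve_linear(0, 3))
print(solve_linear(0, 0))
```

The same trichotomy (one solution, no solution, infinitely many solutions) is the general property of linear equations noted in the paragraph above.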
More generally, if x1, x2, . . . xn are variables and a1, a2, . . . an and c are fixed real
numbers, then the equation
a1x1 + a2x2 + · · · + anxn = c
is said to be a linear equation. The ai are the coefficients, the xi the variables and c is the
constant. While in familiar situations, the coefficients are real numbers, it will turn out that in
other important settings, such as coding theory, the coefficients might be elements of some
general field. We will study fields in the next chapter. For now, let us just say that in a field it
is possible to carry out division. The real numbers are a field, but the integers are not (3/4 isn’t
an integer).
Let’s take another example. Suppose you are planning to make a cake using 10
ingredients, and you want the cake to have 2000 calories. Let ai be the number of calories per
gram of the ith ingredient. Presumably, each ai is nonnegative, although in the future, foods
with negative calories may actually be available. Similarly, let xi be the number of grams of
the ith ingredient. Then a1x1 + a2x2 + · · · + a10x10 is the total number of calories in the
recipe. Since you want the total number of calories in your cake to be exactly 2000, you
consider the equation a1x1 + a2x2 + · · · + a10x10 = 2000. The totality of possible solutions x1,
x2, . . . , x10 for this equation is the set of all possible recipes you can concoct.
The following more complicated example illustrates how linear equations can be used in
nonlinear problems. Let R denote the real numbers, and suppose we want to know something
about the set of common solutions of the equations z = x + xy5 and z2 = x + y4. These
2

equations represent two surfaces in real three space R3, so we’d expect the set of common
solutions to lie on a curve. Here it’s impossible to express the solutions in a closed form, but we
can study them locally using linear methods. For example, both surfaces meet at (1, 1, 1), and
they both have a tangent plane at (1, 1, 1). The tangent line to the curve of intersection at (1,
1, 1) is the intersection of these two tangent planes. This will give us a linear approximation to
the curve near (1, 1, 1).
Nonlinear systems such as in the above example are usually difficult to solve; their theory
involves highly sophisticated mathematics. On the other


hand, it turns out that systems of linear equations are handled quite simply by elementary methods, and
modern computers make it possible to solve gigantic linear systems with fantastic speed.
A general linear system consisting of m equations in n unknowns will look like:

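$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned} \tag{2.1}$$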

Notice how the coefficients aij are labelled. The first index gives its row and the second index its column.
The case where all the constants bi are zero is called the homogeneous case. Otherwise, the system is said
to be nonhomogeneous.
The main problem, of course, is to find a procedure or algorithm for describing the solution set of a
linear system. The principal procedure for solving a linear system is called Gaussian reduction. We will
take this up below.


2.2 Matrices
To simplify the cumbersome notation for a system used above, we will now introduce the notion
of a matrix.

Definition 2.1. A matrix is simply a rectangular array of real numbers.


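An m × n matrix is an array having m rows and n columns, such as
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.
\]
If m = n, we say A is square of degree n. The set of all m × n matrices with
real entries will be denoted by $R^{m \times n}$.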

2.2.1 Matrix Addition and Vectors


It turns out to be very useful to introduce addition and multiplication for matrices. We will
begin with sums.

Definition 2.2. The matrix sum (or simply the sum) A + B of two m × n matrices A
and B is defined to be the m × n matrix C such that $c_{ij} = a_{ij} + b_{ij}$ for all pairs of indices
(i, j). The scalar multiple αA of A by a real number α is the matrix obtained by
multiplying each entry of A by α.

Example 2.1. Here are a couple of examples. Let


The m × n matrix all of whose entries are zero is called the zero matrix. If O is the m × n zero
matrix and A is any m × n matrix, then A + O = A. Thus O is the additive identity for matrix
addition. Now that the additive identity for matrix addition is defined, we can observe that the
matrix −A is the additive inverse of A, in the sense that A + (−A) = (−A) + A = O.
A column matrix is usually simply called a vector. The set of all n × 1 column matrices (or vectors) is
denoted by $R^n$. Vectors with the same number of components are combined via the component-wise addition
and scalar multiplication defined above. We will use the notation $(u_1, u_2, \ldots, u_n)^T$ to express the column
matrix
\[
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
\]
in a more compact form. What the superscript T stands for will be clarified later. Vectors will usually be
written as bold faced letters. For example, x will stand for
\[
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.
\]

If $u_1, u_2, \ldots, u_m$ are vectors in $R^n$ and if $a_1, a_2, \ldots, a_m$ are scalars, that is elements of R, then the
vector
\[
a_1u_1 + a_2u_2 + \cdots + a_mu_m
\]
is called a linear combination of $u_1, u_2, \ldots, u_m$.
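As a small illustration of the definition, the following Python sketch forms a linear combination of vectors stored as plain lists. The function name `linear_combination` and the sample vectors are illustrative assumptions, not part of the text.

```python
def linear_combination(scalars, vectors):
    """Return a1*u1 + ... + am*um for vectors given as equal-length lists."""
    if len(scalars) != len(vectors):
        raise ValueError("need exactly one scalar per vector")
    n = len(vectors[0])
    result = [0] * n
    for a, u in zip(scalars, vectors):
        if len(u) != n:
            raise ValueError("all vectors must have the same number of components")
        # Component-wise: scale u by a and add it to the running total.
        result = [r + a * ui for r, ui in zip(result, u)]
    return result

# Hypothetical sample data: two vectors in R^3.
u1 = [1, 0, 2]
u2 = [0, 1, -1]
combo = linear_combination([3, 2], [u1, u2])  # 3*u1 + 2*u2
```

Here `combo` holds the components of 3u1 + 2u2, computed with exactly the component-wise addition and scalar multiplication defined above.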

2.2.2 Some Examples


So far we have only considered matrices over the real numbers. After we define fields in the next chapter,
we will be able to study matrices over arbitrary fields, which will give us a much wider range of applications.
Briefly, a field is a set with the operations of addition and multiplication which satisfies the basic algebraic
properties of the integers. However, fields also have division in the sense that every nonzero element of a field has a
multiplicative inverse. We will leave the precise meaning of this statement for the next chapter.
The field with the smallest number of elements is the integers mod 2, which is denoted by F2. This
field consists of two elements 0 and 1, with
addition being defined by 0 + 0 = 0, 0 + 1 = 1 + 0 = 1 and 1 + 1 = 0.
Multiplication is the usual one: 0 × 0 = 0, 0 × 1 = 0 and 1 × 1 = 1. In other words, 0 and 1 behave as they do in
the integers, except that 1 + 1 is defined to be 0. F2 is very useful in computer science since adding 1 represents
a change of state (off to on, on to off), while adding 0 represents the status quo.
Matrices over F2 are themselves quite interesting. For example, since F2 has only two elements, there are
precisely $2^{mn}$ such m × n matrices. Addition of such matrices has an interesting property, as the
following example shows.
Example 2.2. For example,

In the first sum, the parity of every element in the first matrix is reversed. In the second, we see every matrix
over F2 is its own additive inverse.
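The two observations above can be tried out on a small case. The sketch below adds 0/1 matrices entry by entry mod 2; the function name `add_f2` and the sample matrices are assumptions chosen for illustration and are not the matrices of Example 2.2.

```python
def add_f2(A, B):
    """Entrywise sum of two equal-sized 0/1 matrices, computed mod 2."""
    return [[(a + b) % 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

# Hypothetical 2 x 3 matrices over F2.
A = [[1, 0, 1],
     [0, 1, 1]]
ones = [[1, 1, 1],
        [1, 1, 1]]
zero = [[0, 0, 0],
        [0, 0, 0]]

flipped = add_f2(A, ones)  # adding the all ones matrix reverses every parity
doubled = add_f2(A, A)     # A + A is the zero matrix
```

Adding the all ones matrix flips every entry of `A`, and `A + A` comes out as the zero matrix, so `A` is its own additive inverse.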

Example 2.3. Random Key Crypts. Suppose Rocky the flying squirrel wants to send a message to his sidekick,
Bullwinkle the moose, and he wants to make sure that the notorious villains Boris and Natasha won’t be able to
learn what it says. Here is what the ever resourceful squirrel does. First he assigns the number 1 to a, 2 to b and
so forth up to 26 to z. He also assigns 0 to the space between two words. He then computes the binary
expansion of each integer between 1 and 26. Thus 1=1, 2=10, 3=11, 4=100, . . . , 26=11010. He now converts
his message into a sequence of five digit strings. Note that 00000 represents a space. The result is his encoded
message, which is normally referred to as the plaintext. To make things more compact, he arranges the
plaintext into a matrix. For example, if there are 5 words, hence 4 spaces, he could make a 3 × 3 matrix of 5 digit
strings of zeros and ones.
Let’s denote the matrix containing the plaintext by P, and suppose P is m × n. Now the fun starts. Rocky and
Bullwinkle have a list of m × n matrices of zeros and ones that only they know. The flying squirrel selects one of
these matrices, say number 47, and tells Bullwinkle. Let E be matrix number 47. Cryptographers call E the
key. Now he sends the ciphertext encE(P) = P + E to Bullwinkle. If only Rocky and Bullwinkle know E, then
the matrix P containing the plaintext is secure. Even if Boris and Natasha succeed in learning the ciphertext P + E,
they will still have to know E to find out what P is. The trick is that the key E has to be sufficiently random so
that neither Boris nor Natasha can guess it. For example, if E is the all ones matrix, then P isn’t very secure since
Boris and Natasha will surely try it. Notice that once Bullwinkle receives the ciphertext, all he has to do is add
the key E to recover the plaintext P since

encE(P ) + E = (P + E) + E = P + (E + E) = P + O = P.

This is something even a mathematically challenged moose can do.


The hero’s encryption scheme is extremely secure if the key E is sufficiently random and only used
once. (Such a crypt is called a one time pad.) However, if he uses E to encrypt another plaintext message
Q, and Boris and Natasha pick up both encE(P ) = P + E and encE(Q) = Q + E, then they can likely find
out what both P and Q say. The reason for this is that

(P + E) + (Q + E) = (P + Q) + (E + E) = P + Q + O = P + Q.

The point is that knowing P + Q may be enough for a good cryptographer to deduce both P and Q. But, as
a one time pad, the random key is quite secure (in fact, apparently secure enough for communications on the
hot line between Washington and Moscow).
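The scheme of Example 2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the plaintext is kept as a flat list of bits rather than arranged into a matrix, the key is drawn with the standard-library `secrets` module rather than picked from a shared list, and the helper names `to_bits`, `from_bits` and `add_key` are hypothetical.

```python
import secrets

ALPHABET = " abcdefghijklmnopqrstuvwxyz"  # space -> 0, a -> 1, ..., z -> 26

def to_bits(message):
    """Encode each character as five binary digits (the plaintext P)."""
    bits = []
    for ch in message:
        code = ALPHABET.index(ch)
        bits.extend(int(d) for d in format(code, "05b"))
    return bits

def from_bits(bits):
    """Invert to_bits by reading five digits at a time."""
    chars = []
    for i in range(0, len(bits), 5):
        code = int("".join(str(b) for b in bits[i:i + 5]), 2)
        chars.append(ALPHABET[code])
    return "".join(chars)

def add_key(bits, key):
    """Add the key over F2; the same function encrypts and decrypts."""
    return [(b + k) % 2 for b, k in zip(bits, key)]

plaintext = to_bits("hello moose")
key = [secrets.randbits(1) for _ in plaintext]  # assumed: fresh random key, used once
ciphertext = add_key(plaintext, key)
recovered = from_bits(add_key(ciphertext, key))

# Reusing the key on a second message leaks P + Q, as computed in the text.
other = to_bits("hello rocky")
leak = add_key(ciphertext, add_key(other, key))
```

Adding the key a second time returns the original message because E + E = O, and `leak` equals P + Q regardless of which key was drawn, which is why a key must not be reused.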

Example 2.4. (Scanners) We can also interpret matrices over F2 in another natural way. Consider a black
and white photograph as being a rectangular array consisting of many black and white dots. By giving the
white dots the value 0 and the black dots the value 1, the black and white photo is transformed into a
matrix over F2. Now suppose we want to compare two black and white photographs whose matrices A
and B are both m × n. It’s inefficient for a computer to scan the two matrices to see in how many
positions they agree. However, when A and B are added, the sum A + B has a 1 in any component
where A and B differ, and a 0 wherever they coincide. For example, the sum of two identical photographs is
the zero matrix, and the sum of two complementary photographs is the all ones matrix. An obvious
measure of how similar the two matrices A and B are is the number of nonzero entries of A + B, i.e.
$\sum_{i,j} (a_{ij} + b_{ij})$, where each term $a_{ij} + b_{ij}$ is computed in F2 and the resulting zeros and ones are then
added as ordinary integers. This easily tabulated number is known as the Hamming distance between A and B.
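A minimal sketch of this count in Python, assuming photographs are stored as lists of rows of zeros and ones (the function name `hamming_distance` and the sample photographs are illustrative assumptions):

```python
def hamming_distance(A, B):
    """Count the positions where two equal-sized 0/1 matrices differ."""
    # (a + b) % 2 is the F2 sum: 1 where the entries differ, 0 where they agree.
    return sum((a + b) % 2
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

# Hypothetical 2 x 3 "photographs".
photo_a = [[1, 0, 1],
           [0, 1, 1]]
photo_b = [[1, 1, 1],
           [0, 0, 1]]
complement = [[1 - a for a in row] for row in photo_a]

distance = hamming_distance(photo_a, photo_b)
```

Identical photographs are at distance 0, and a photograph and its complement are at distance mn, here 6.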

2.2.3 Matrix Product


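We will introduce matrix multiplication in the next chapter. To treat linear systems, however, we need to
define the product Ax of an m × n matrix A
and a (column) vector x in $R^n$. Put
\[
A\mathbf{x} = \begin{pmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{pmatrix}.
\]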
From now on, one should think of the left hand side of the linear system
Ax = b as a product.
Let us point out a basic property of multiplication.
Proposition 2.1. The matrix product Ax is distributive. That is, for any
x and y in Rn and any A ∈ Rm×n, A(x + y) = Ax + Ay.
Proof. This is obvious from the distributive property of real numbers.
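The product Ax and the distributive property of Proposition 2.1 can be checked numerically on a small case. The sketch below is an illustration only; the name `mat_vec` and the sample matrix and vectors are assumptions.

```python
def mat_vec(A, x):
    """Return Ax, whose i-th component is a_i1*x_1 + ... + a_in*x_n."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A needs as many entries as x has components")
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Hypothetical 2 x 3 matrix and two vectors in R^3.
A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 0, -1]
y = [2, 1, 0]

x_plus_y = [xi + yi for xi, yi in zip(x, y)]
left = mat_vec(A, x_plus_y)                                     # A(x + y)
right = [p + q for p, q in zip(mat_vec(A, x), mat_vec(A, y))]   # Ax + Ay
```

Both `left` and `right` come out equal, as the proposition predicts. A check on one example is of course not a proof.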


2.3 Solving Linear Systems via Gaussian Reduction

Gaussian reduction is an algorithmic procedure for finding the solution set of a linear system. We
will say that two linear systems are equivalent if their solution sets are equal. The strategy in
Gaussian reduction is to replace the original system with a sequence of equivalent systems
until the final system is in reduced row echelon form. That is, its coefficient matrix is in
reduced row echelon form. The sequence of equivalent systems is produced by applying row
operations.

2.3.1 Row Operations and Equivalent Systems

Let A be an m × n matrix and consider the linear system Ax = b. The augmented coefficient
matrix of this system is (A | b). The first thing is to point out the role of row operations. What
happens when one performs an elementary row operation on (A | b)? In fact, I claim that the
new system is equivalent to the original system.
For example, row swaps simply interchange two equations, so they clearly leave the solution
set unchanged. Similarly, multiplying the ith equation by a non-zero constant a does likewise,
since the original system can be recaptured by multiplying the ith equation by $a^{-1}$. The only
question is whether a row operation of type III changes the solutions. Suppose the ith equation
is replaced by itself plus a multiple k of the jth equation, where i ≠ j. Then any solution of the
original system is still a solution of the new system. But any solution of the new system is also
a solution of the original system since subtracting k times the jth equation from the ith
equation of the new system gives us back the original system. Therefore the systems are
equivalent.
To summarize this, we state
Proposition 2.3. Performing a sequence of row operations on the augmented
coefficient matrix of a linear system gives a new system which is equivalent to the
original system.
To reiterate, to solve a linear system by Gaussian reduction, the first step is to put the
augmented coefficient matrix in reduced row echelon form via a sequence of row operations.
The next step will be to find the solution set.
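The first step can be sketched in Python. The function below is one possible way to organize the reduction, not necessarily the sequence of operations the text has in mind; it uses only the three kinds of row operations discussed above (swap, scale by a nonzero constant, add a multiple of another row) and exact fractions to avoid rounding. The name `rref` and the sample system are assumptions.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M, using exact fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        source = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if source is None:
            continue
        # Row swap.
        rows[pivot_row], rows[source] = rows[source], rows[pivot_row]
        # Scale the pivot row so that its leading entry becomes 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [v / pivot for v in rows[pivot_row]]
        # Type III: subtract multiples of the pivot row from every other row.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                k = rows[r][col]
                rows[r] = [v - k * p for v, p in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

# Hypothetical augmented matrix (A | b) for
#   x1 + 2x2 + 3x3 = 1,  4x1 + 5x2 + 6x3 = 1,  7x1 + 8x2 + 9x3 = 1.
augmented = [[1, 2, 3, 1],
             [4, 5, 6, 1],
             [7, 8, 9, 1]]
reduced = rref(augmented)
```

The rows of `reduced` describe an equivalent system from which the solution set can be read off: x1 − x3 = −1, x2 + 2x3 = 1, and 0 = 0.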


2.3.2 The Homogeneous Case


Solving a linear system Ax = b involves several steps. The first step is to solve the associated
homogeneous system.
Definition 2.7. A linear system Ax = b is said to be homogeneous if b = 0. The solution set of a
homogeneous linear system Ax = 0 is called the null space of A. The null space of A is denoted
throughout by N (A).
The efficient way to describe the solution set of a homogeneous linear system is to use vectors. Note first
that since performing row operations on (A | 0) doesn’t alter the last column, we only need to use the
coefficient matrix A.
The following example shows how to write down the null space.
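Example 2.6. Consider the homogeneous linear system
\[
\begin{aligned}
0x_1 + x_2 + 2x_3 + 0x_4 + 3x_5 - x_6 &= 0 \\
0x_1 + 0x_2 + 0x_3 + x_4 + 2x_5 + 0x_6 &= 0.
\end{aligned}
\]

Notice that the coefficient matrix A is already reduced. Indeed,
\[
A = \begin{pmatrix}
0 & 1 & 2 & 0 & 3 & -1 \\
0 & 0 & 0 & 1 & 2 & 0
\end{pmatrix}.
\]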

The procedure is to solve for the variables corresponding to the columns with corners, which we call the
corner variables. Since the corner variables have nonzero coefficients, they can be expressed in terms of
the remaining variables, which are called the free variables. For A, the corner columns are the second and
fourth, so x2 and x4 are the corner variables, and the variables x1, x3, x5 and x6 are the free variables.
Solving for x2 and x4 gives
\[
x_2 = -2x_3 - 3x_5 + x_6, \qquad x_4 = -2x_5.
\]
In this expression, the corner variables are dependent variables which are functions of the free variables.
Now let $x = (x_1, x_2, x_3, x_4, x_5, x_6)^T$ denote an arbitrary vector in $R^6$ in the solution set of the system, and
let us call x the general solution vector. Notice that we have expressed it as a column vector. Replacing
the corner variables by their expressions in terms of the free variables gives a new expression for the general
solution vector involving only the free variables. Namely
\[
x = (x_1,\; -2x_3 - 3x_5 + x_6,\; x_3,\; -2x_5,\; x_5,\; x_6)^T.
\]


The general solution vector now depends only on the free variables, and there is a solution for
any choice of these variables.
Using a little algebra, we can compute the vector coefficients of each one of the free
variables in x. These vectors are called the fundamental solutions.
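In this example, the free variables are $x_1, x_3, x_5$ and $x_6$, and the general solution vector x has the form
\[
x = x_1 f_1 + x_3 f_2 + x_5 f_3 + x_6 f_4, \tag{2.5}
\]
where
\[
f_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad
f_2 = \begin{pmatrix} 0 \\ -2 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad
f_3 = \begin{pmatrix} 0 \\ -3 \\ 0 \\ -2 \\ 1 \\ 0 \end{pmatrix}, \quad
f_4 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.
\]
The equation (2.5) tells us that every solution of Ax = 0 is a linear combination of the fundamental solutions
$f_1, \ldots, f_4$.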
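As a check on Example 2.6, the sketch below reads the coefficient vectors of the free variables x1, x3, x5, x6 off the general solution vector (x1, −2x3 − 3x5 + x6, x3, −2x5, x5, x6)^T and confirms that a linear combination of them solves the system. The helper names and the particular values chosen for the free variables are assumptions.

```python
# Coefficient matrix of the system in Example 2.6.
A = [[0, 1, 2, 0, 3, -1],
     [0, 0, 0, 1, 2, 0]]

# Coefficient vectors of the free variables x1, x3, x5, x6 in the general solution.
f1 = [1, 0, 0, 0, 0, 0]
f2 = [0, -2, 1, 0, 0, 0]
f3 = [0, -3, 0, -2, 1, 0]
f4 = [0, 1, 0, 0, 0, 1]

def mat_vec(A, x):
    """Return the product Ax as a list."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def general_solution(x1, x3, x5, x6):
    """Build x = x1*f1 + x3*f2 + x5*f3 + x6*f4."""
    return [x1 * a + x3 * b + x5 * c + x6 * d
            for a, b, c, d in zip(f1, f2, f3, f4)]

x = general_solution(7, 1, -2, 4)   # arbitrary values for the free variables
residual = mat_vec(A, x)            # should be the zero vector
```

Any other choice of the four free values gives another solution, which is the content of the statement that every solution is a linear combination of the fundamental solutions.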

This example illustrates a trivial but useful fact.


Proposition 2.4. In an arbitrary homogeneous linear system with coefficient matrix
A, every solution is a linear combination of the fundamental solutions, and the
number of fundamental solutions is the number of free variables. Thus,

#corner variables + #free variables = #variables. (2.6)

Proof. The proof that every solution is a linear combination of the fundamental solutions
goes exactly like the above example, so we will omit it. Equation (2.6) is an application of the
fact that every variable is either a free variable or a corner variable, but not both.
We will eventually prove several refinements of this property which will say considerably
more about the structure of the solution set.
Let us point out something a bit unusual in Example 2.6. The variable $x_1$ never actually
appears in the system, but it does give a free variable and a corresponding fundamental
solution $(1, 0, 0, 0, 0, 0)^T$. Suppose instead of A the coefficient matrix is
\[
\begin{pmatrix}
1 & 2 & 0 & 3 & -1 \\
0 & 0 & 1 & 2 & 0
\end{pmatrix}.
\]

Now $(1, 0, 0, 0, 0, 0)^T$ is no longer a fundamental solution. In fact the solution set is now a subset of $R^5$. The
corner variables are $x_1$ and $x_3$, and there are now only three fundamental solutions corresponding to the
free variables $x_2$, $x_4$, and $x_5$.
Even though identity (2.6) is completely obvious, it gives some very useful information. Here is a
typical application.

Example 2.7. Consider a linear system with 25 variables and assume there are 10 free variables. Then there
are 15 corner variables, so the system has to have at least 15 equations. That is, there have to be at least 15
linear constraints on the 25 variables.

We can also use (2.6) to say when the homogeneous system Ax = 0 has a unique solution (that is,
exactly one solution). Note that 0 is always a solution: the trivial solution. Hence if the solution is to
be unique, then the only possibility is that N (A) = {0}. But this happens exactly when there are no free
variables, since if there is a free variable there have to be non trivial solutions. Thus a homogeneous
system has a unique solution if and only if every variable is a corner variable, which is the case exactly
when the number of corner variables is the number of columns of A. By the same reasoning, if a homogeneous
system has more variables than equations, there have to be non trivial solutions, since there has to be at least
one free variable.

2.3.3 The Non-homogeneous Case

A system Ax = b with b ≠ 0 is said to be non-homogeneous. A non-homogeneous system requires that we
use an augmented coefficient matrix (A | b).
To resolve the non-homogeneous case, we need to observe a result sometimes called the
Super-Position Principle.
Proposition 2.5. If a system with augmented coefficient matrix (A | b) has a particular solution p,
then any other solution has the form p + x, where x is an arbitrary element of N (A).

Proof. The proof is quite easy. Suppose $p = (p_1, \ldots, p_n)^T$, and let $x = (x_1, \ldots, x_n)^T$ be an
element of N (A). By the distributivity of matrix multiplication (Proposition 2.1),

A(p + x) = Ap + Ax = b + 0 = b.

Conversely, if q is also a particular solution, then p − q is a solution to the homogeneous system,
since

A(p − q) = Ap − Aq = b − b = 0.

Thus, p − q is an element of N (A), and hence so is q − p. Therefore q = p + x, where x = q − p,
as asserted. This completes the proof.
In the above proof, we made the statement that A(p + x) = Ap + Ax. This follows from
a general algebraic identity called the distributive law which we haven’t yet discussed.
However, our particular use of the distributive law is easy to verify from first principles.
Example 2.8. Consider the system involving the counting matrix C of Example 2.5:

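\[
\begin{aligned}
1x_1 + 2x_2 + 3x_3 &= a \\
4x_1 + 5x_2 + 6x_3 &= b \\
7x_1 + 8x_2 + 9x_3 &= c,
\end{aligned}
\]

where a, b and c are fixed arbitrary constants. This system has augmented coefficient matrix
\[
\left(\begin{array}{ccc|c}
1 & 2 & 3 & a \\
4 & 5 & 6 & b \\
7 & 8 & 9 & c
\end{array}\right).
\]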

reduced system turns out to be the same one we obtained by using the sequence in Example 11.2. We get

\[
\begin{aligned}
1x_1 + 0x_2 - 1x_3 &= (-5/3)a + (2/3)b \\
0x_1 + 1x_2 + 2x_3 &= (4/3)a - (1/3)b \\
0x_1 + 0x_2 + 0x_3 &= a - 2b + c.
\end{aligned}
\]
Clearly the above system may in fact have no solutions. Indeed, from the last equation, we see that
whenever a − 2b + c ≠ 0, there cannot be a solution, since the left side of the third equation is always zero. Such
a system is called inconsistent. For a simpler example of an inconsistent system, think of three lines in $R^2$
which don’t pass through a common point. This is an example where the system has three equations but
only two variables.
Example 2.9. Let’s solve the system of Example 2.8 for a = 1, b = 1 and
c = 1. In that case, the original system is equivalent to
1x1 + 0x2 − 1x3 = −1
0x1 + 1x2 + 2x3 = 1
0x1 + 0x2 + 0x3 = 0

It follows that x1 = −1 + x3 and x2 = 1 − 2x3, with x3 free. This represents a line in $R^3$.


The line of the previous example is parallel to the line of intersection of the three planes
\[
\begin{aligned}
1x_1 + 2x_2 + 3x_3 &= 0 \\
4x_1 + 5x_2 + 6x_3 &= 0 \\
7x_1 + 8x_2 + 9x_3 &= 0.
\end{aligned}
\]
In fact, one can see directly that these three planes meet in a line since our computation with row
operations shows that the vectors normal to the three planes are contained in a single plane through the
origin. On the other hand, when a − 2b + c ≠ 0, what happens is that the line of intersection of any two of the
planes is parallel to the third plane but doesn’t meet it.

3.1 Matrix Multiplication


In this section, we will define the product of two matrices and state the basic properties of the resulting matrix
algebra. Let $R^{m \times n}$ denote the set of all m × n matrices with real entries, and let $(F_2)^{m \times n}$ denote the set of all
m × n matrices over F2.
We have already defined matrix addition and the multiplication of a matrix by a scalar, and we’ve seen
how to multiply an m × n matrix and a column vector with n components. We will now define matrix
multiplication. In general, the product AB of two matrices A and B is defined only when the number of
columns of A equals the number of rows of B. Suppose $A = (a_1\ a_2\ \cdots\ a_n)$ is m × n and
$B = (b_1\ b_2\ \cdots\ b_p)$ is n × p. Since we already know the definition of each $Ab_j$, let us simply put
\[
AB = (Ab_1\ \ Ab_2\ \ \cdots\ \ Ab_p). \tag{3.1}
\]
To write this out more precisely, let C = AB, and suppose the entry of
C in the i-th row and k-th column is denoted $c_{ik}$. Then, using summation
notation, we have
\[
c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk},
\]
so
\[
AB = \Bigl( \sum_{j=1}^{n} a_{ij} b_{jk} \Bigr).
\]

Thus, for real matrices, we have
\[
R^{m \times n} \cdot R^{n \times p} \subset R^{m \times p},
\]
where · denotes matrix multiplication.
Another way of putting the definition is to say that if the columns of A
are $a_1, \ldots, a_n$, then the r-th column of AB is
\[
b_{1r}a_1 + b_{2r}a_2 + \cdots + b_{nr}a_n. \tag{3.2}
\]
Hence the r-th column of AB is the linear combination of all n columns of
A using the n entries in the r-th column of B as the scalars. One can also express AB as a
linear combination of the rows of B. The reader is invited to work this out explicitly. We will
in fact use it to express row operations in terms of matrix multiplication below.

Example 3.1. Here are two examples.

This example points out that there exist 2 × 2 matrices A and B such that AB ≠ BA,
even though both products AB and BA are defined. In general, matrix multiplication is
not commutative. In fact, almost any pair of 2 × 2 matrices you choose will not commute. In
general, the multiplication of n × n matrices is not commutative. The only exception is that
all 1 × 1 matrices commute (why?).
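To see a concrete pair that fails to commute, the following sketch implements the entry formula for the product of two matrices and multiplies two 2 × 2 matrices in both orders. The matrices are chosen here for illustration and are not necessarily those of Example 3.1; the name `mat_mul` is an assumption.

```python
def mat_mul(A, B):
    """Return AB, whose (i, k) entry is the sum over j of a_ij * b_jk."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical 2 x 2 matrices.
A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = mat_mul(A, B)  # multiplying by B on the right swaps the columns of A
BA = mat_mul(B, A)  # multiplying by B on the left swaps the rows of A
```

Here `AB` and `BA` differ, so this A and B do not commute.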
3.1.1 The Transpose of a Matrix

Another operation on matrices is transposition, or taking the transpose. If A is m × n, the
transpose $A^T$ of A is the n × m matrix $A^T := (c_{rs})$, where $c_{rs} = a_{sr}$. This is easy to remember:
the ith row of $A^T$ is just the ith column of A.


Definition 3.1. A matrix A which is equal to its transpose (that is, A = $A^T$) is called symmetric.
Clearly, every symmetric matrix is square. The symmetric matrices over
R turn out to be especially fundamental, as we will see later.
The dot product v · w of two vectors v, w ∈ $R^n$ is defined to be the matrix product
\[
v \cdot w = v^T w = \sum_{i=1}^{n} v_i w_i.
\]
Proposition 3.1. Let A and B be m × n matrices. Then
\[
(A + B)^T = A^T + B^T.
\]
Furthermore, whenever the product AB is defined,
\[
(AB)^T = B^T A^T.
\]
Proof. The first identity is left as an exercise. The product transpose identity can be seen as follows.
The (i, j)-entry of $B^T A^T$ is the dot product of the i-th row of $B^T$ and the j-th column of $A^T$. Since this is
the same thing as the dot product of the j-th row of A and the i-th column of B, which is the (j, i)-
entry of AB, and hence the (i, j)-entry of $(AB)^T$, we see that $(AB)^T = B^T A^T$. Suggestion: try this out
on an example.
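Following the suggestion, here is one way to try the product transpose identity on an example in Python. The helper names `transpose` and `mat_mul` and the sample matrices are assumptions.

```python
def transpose(A):
    """Return the transpose of A: its rows are the columns of A."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Return AB as dot products of the rows of A with the columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Hypothetical matrices: A is 2 x 3 and B is 3 x 2, so AB is defined.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [2, 1],
     [0, 3]]

lhs = transpose(mat_mul(A, B))             # (AB)^T
rhs = mat_mul(transpose(B), transpose(A))  # B^T A^T
```

Both sides come out as the same 2 × 2 matrix. Note that the order of the factors is reversed on the right: for these shapes the product of the transposes in the original order would be a 3 × 3 matrix.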


3.1.2 The Algebraic Laws


Except for the commutativity of multiplication, the usual algebraic properties of addition and
multiplication in the reals also hold for matrices.
Proposition 3.2. Assuming all the sums and products below are defined, then matrix
addition and multiplication satisfy:
(1) the associative law: Matrix addition and multiplication are associative:
(A + B) + C = A + (B + C) and (AB)C = A(BC).

(2) the distributive law: Matrix addition and multiplication are distributive:
A(B + C) = AB + AC and (A + B)C = AC + BC.

(3) the scalar multiplication law: For any scalar r,
(rA)B = A(rB) = r(AB).

(4) the commutative law for addition: Matrix addition is commutative: A + B = B + A.
Verifying these properties is a routine exercise, so we will omit the details. I suggest working a
couple of examples to convince yourself, if necessary. Though the associative law for
multiplication doesn’t seem to be exciting, it often turns out to be extremely useful. We will soon
see some examples of why.
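Recall that the n × n identity matrix $I_n$ is the matrix having one in each diagonal entry
and zero in each entry off the diagonal. For example,
\[
I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad \text{and} \quad
I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]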

Note the interesting fact that we can also construct the identity matrix over F2. The off diagonal entries
are 0, of course, and the diagonal entries consist of the nonzero element 1, which is the multiplicative
identity of F2.
We have
Proposition 3.3. If A is an m × n matrix (over R or F2), then $AI_n = A$
and $I_mA = A$.
Proof. This is an exercise in using the definition of multiplication.

Determinant of a Matrix

What is Determinant of a Matrix?


The determinant of a matrix is a number that is defined only for square matrices (matrices
with the same number of rows and columns). Determinants appear in many places in calculus
and linear algebra. The determinant summarizes a square matrix by a single real number, which
can be used in solving systems of linear equations and in finding the inverse of a matrix.
How to calculate?
The value of the determinant of a matrix can be calculated by the following procedure:
For each element of the first row (or first column), form the submatrix obtained by deleting that
element's row and column, multiply the element by the determinant of that submatrix, and add the
products with alternating signs, starting with +. As a base case, the determinant of a 1 × 1 matrix
is its single entry.
The submatrix obtained by removing the row and column of an element is called here the cofactor
of that element. (Many texts call its determinant the minor, and reserve the word cofactor for the
minor with its sign attached.)
Determinant of 2 x 2 Matrix: for a matrix A with rows (a, b) and (c, d), the procedure above gives det(A) = ad − bc.
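The recursive procedure described above (expansion along the first row, with a 1 × 1 base case) can be sketched as follows. For a 2 × 2 matrix with rows (a, b) and (c, d) it reduces to ad − bc. The function names `minor` and `det` and the sample matrices are assumptions; this direct recursion is meant for small matrices only, since its cost grows very quickly with the size.

```python
def minor(M, row, col):
    """Return M with the given row and column deleted."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(M) if i != row]

def det(M):
    """Determinant by expansion along the first row with alternating signs."""
    if any(len(r) != len(M) for r in M):
        raise ValueError("the determinant is defined only for square matrices")
    if len(M) == 1:
        return M[0][0]  # base case: a 1 x 1 matrix
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))

d2 = det([[1, 2],
          [3, 4]])      # 1*4 - 2*3
d3 = det([[2, 0, 1],
          [1, 3, 2],
          [1, 1, 4]])
```

For the 3 × 3 example the expansion is 2·(3·4 − 2·1) − 0·(1·4 − 2·1) + 1·(1·1 − 3·1).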
Basics of Limits and Continuity
Cauchy and Heine Definitions of Limit
Let f(x) be a function that is defined on an open interval X containing x = a, except possibly
at a itself. (The value f(a) need not be defined.)
The number L is called the limit of function f(x) as x→a if and only if, for
every ε > 0 there exists δ > 0 such that
|f(x) − L| < ε
whenever
0 < |x − a| < δ.
This definition is known as the ε-δ or Cauchy definition of the limit.
There’s also the Heine definition of the limit of a function, which states that a
function f(x) has a limit L at x = a if, for every sequence {xn} of points xn ≠ a in X that
converges to a, the sequence f(xn) converges to L. The Heine and Cauchy definitions of limit of a
function are equivalent.
One-Sided Limits
Let $\lim_{x \to a-0}$ denote the limit as x goes toward a by taking on values of x such
that x < a. The corresponding limit $\lim_{x \to a-0} f(x)$ is called the left-hand limit of f(x) at
the point x = a.
Similarly, let $\lim_{x \to a+0}$ denote the limit as x goes toward a by taking on values
of x such that x > a. The corresponding limit $\lim_{x \to a+0} f(x)$ is called the right-hand
limit of f(x) at x = a.
Note that the two-sided limit $\lim_{x \to a} f(x)$ exists if and only if both one-sided limits exist and are
equal to each other, that is $\lim_{x \to a-0} f(x) = \lim_{x \to a+0} f(x)$. In this case,
$\lim_{x \to a} f(x) = \lim_{x \to a-0} f(x) = \lim_{x \to a+0} f(x)$.
Lecture Notes on Differentiation

A tangent line to a function at a point is the line that approximates the function near that
point better than any other line.

The slope of the function at a given point is the slope of the tangent line to the function at that
point.

The derivative of f at x = a is the slope, m, of the function f at the point x = a (if m
exists), denoted by f'(a) = m. Other notations:
\[
y', \quad \frac{dy}{dx}, \quad \frac{df}{dx}, \quad \frac{d}{dx} f(x), \quad D_x y, \quad D_x f(x).
\]

The function f(x) is differentiable at a point $x_0$ if $f'(x_0)$ exists. If a function is
differentiable at all points in its domain (i.e. f'(x) is defined for all x in the domain), then we
consider f'(x) as a function and call it the derivative of f(x).

The derivative of f that we have been talking about is called the first derivative. Now, we define
the second derivative of a function to be the derivative of f', denoted by f''(x) or
\[
\frac{d^2 f}{dx^2} \left( = \frac{d}{dx}\left(\frac{df}{dx}\right) \right).
\]
Example 1: Given f(x) = c where c is a constant. Then f'(x) = 0 because the slope of the
function at each point is zero.

Example 2: If f(x) = 2 − 3x, then the derivative f'(x) = −3 because the slope of the
function at each point is −3.

Example 3: Given f(x) = |x|. We have
\[
f'(x) = \begin{cases} -1 & \text{if } x < 0, \\ \phantom{-}1 & \text{if } x > 0. \end{cases}
\]
However, f'(0) is not defined because there is no unique tangent line to f(x) at x = 0.

The following is a table of derivatives of some basic functions:

f(x)        f'(x)
c           0
mx + c      m
x^a         a x^(a−1)
e^x         e^x
ln x        1/x

Rules of Differentiation:
1. (f ± g)' = f' ± g'
2. (c · f)' = c f'
3. (Product Rule) (f · g)' = f' g + f g'
4. (Quotient Rule) (f / g)' = (f' g − f g') / g^2 (where g(x) ≠ 0)
5. (Chain Rule) (f ∘ g)' = (f(g(x)))' = f'(g(x)) · g'(x)

The equation of the tangent line to the function at point $x = x_0$ is:
\[
y - f(x_0) = f'(x_0)(x - x_0).
\]

Theorem (The Extreme-Value Theorem for Continuous Functions)


If f is continuous at every point of a closed interval I, then f assumes both an absolute maximum
value M and an absolute minimum value m somewhere in I.

Definition
A point in the domain of a function f at which f' = 0 or f' does not exist is a critical point of
f.

Theorem
Extreme values (local or global) occur only at critical points and endpoints.

Examples:
1. Find absolute maximum and minimum values of $f(x) = 4 - x^2$ on the interval [−3, 1].
2. Find absolute maximum and minimum values of $f(x) = x^{2/3}$ on the interval [−1, 8].
3. Find absolute maximum and minimum values of $f(x) = x^{1/3}$ on the interval [−1, 1].
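Since extreme values occur only at critical points and endpoints, the first exercise comes down to comparing the value of the function at a short list of candidate points. The sketch below does this for the first function, f(x) = 4 − x² on [−3, 1]; the helper name `extreme_values` is an assumption, and the critical point is found by hand in the comment rather than by the code.

```python
def extreme_values(f, candidates):
    """Evaluate f at the candidates; return (max point, max value, min point, min value)."""
    values = [(f(x), x) for x in candidates]
    max_value, max_point = max(values)
    min_value, min_point = min(values)
    return max_point, max_value, min_point, min_value

def f(x):
    return 4 - x ** 2

# f'(x) = -2x vanishes only at x = 0, so the candidates are the critical
# point 0 together with the endpoints -3 and 1.
result = extreme_values(f, [-3, 0, 1])
```

The candidate values are f(−3) = −5, f(0) = 4 and f(1) = 3, so the absolute maximum is 4 at x = 0 and the absolute minimum is −5 at x = −3. The other two exercises work the same way, with x = 0 as a critical point where the derivative does not exist.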

Theorem (The Mean Value Theorem)


Suppose f(x) is continuous on a closed interval [a, b] and differentiable on the interval’s interior
(a, b). Then there is at least one point c in (a, b) at which
\[
f'(c) = \frac{f(b) - f(a)}{b - a}.
\]
Facts:
• If f'(x) > 0 for all x in some interval, then f increases on this interval.
• If f'(x) < 0 for all x in some interval, then f decreases on this interval.

Example: Given $f(x) = \dfrac{x}{1 + x^2}$. By the quotient rule,
\[
f'(x) = \frac{1 \cdot (1 + x^2) - x(2x)}{(1 + x^2)^2} = \frac{1 - x^2}{(1 + x^2)^2} = \frac{(1 + x)(1 - x)}{(1 + x^2)^2}.
\]

We can use the Key Number Method to test the signs of f'(x). The key numbers are x = −1 and x = 1, and
the sign of f'(x) on the three resulting intervals is

interval:         (−∞, −1)    (−1, 1)    (1, ∞)
sign of f'(x):       −           +          −

We know that f'(x) is positive on (−1, 1). Thus, f is increasing on (−1, 1).

Also, f'(x) < 0 on (−∞, −1) and (1, ∞). Thus, f is decreasing on (−∞, −1) and (1, ∞).
The following is the graph of f(x).

The 1st Derivative Test


Suppose f is continuous on some open interval containing x = a and differentiable there, except possibly
at x = a.
a. If f' changes from − to + at x = a, then f has a local minimum at x = a.
b. If f' changes from + to − at x = a, then f has a local maximum at x = a.

The function f(x) is concave up on the interval (a, b) if f'(x) is increasing on (a, b).
The function f(x) is concave down on the interval (a, b) if f'(x) is decreasing on (a, b).

Facts:

• If f''(x) > 0 for all x in some interval I, then f' increases on I and thus f is concave up on I.
• If f''(x) < 0 for all x in some interval I, then f' decreases on I and thus f
is concave down on I.

The inflection point (or point of inflection) of a function f is defined to be the point at which
the concavity changes.
Below is a picture illustrating when a function is concave up or concave down. Notice the tangent
lines and their slopes. A point of inflection is also labeled on the picture.

Note: To find the inflection points, we look at the second derivative. Find all the points such
that f'' is zero or undefined at those points. Then use the Key Number Method to test the sign
changes of f'' at those points.

Examples:
1. $f(x) = x^3 - 12x - 5$.
2. $f(x) = x^4 - 4x^3 + 10$.
Examples from Economics
Suppose that
r(x) = the revenue from selling x items
c(x) = the cost of producing the x items
p(x) = r(x) − c(x) = the profit from producing and selling x items.

The marginal revenue, marginal cost, and marginal profit when producing and selling x
items are the derivatives
dr/dx = marginal revenue,
dc/dx = marginal cost,
dp/dx = marginal profit.

Let’s consider the relationship of p to these derivatives.


If r(x) and c(x) are differentiable for all x > 0, and if p(x) = r(x) − c(x) has a maximum value, it
occurs at a production level at which p'(x) = 0. Since p'(x) = r'(x) − c'(x), p'(x) = 0 implies that
r'(x) − c'(x) = 0 or r'(x) = c'(x).

Therefore,

At a production level yielding maximum profit, marginal revenue equals marginal cost.

Figure. The graph of a typical cost function starts concave down and later turns concave
up. It crosses the revenue curve at the break-even point B. To the left of B, the company
operates at a loss. To the right, the company operates at a profit, with the maximum profit
occurring where c'(x) = r'(x). Farther to the right, cost exceeds revenue (perhaps because of a
combination of rising labor and materials costs and market saturation) and production levels
become unprofitable again.
Integration
Mean Value Theorem Suppose f(x) is continuous on [a, b] and differentiable on (a, b).
Then there exists a point c in (a, b) at which
\[
\frac{f(b) - f(a)}{b - a} = f'(c). \tag{1}
\]

Corollary 1 If f'(x) = 0 at each point of an interval I, then f(x) = C for all x in I, where
C is a constant.

Corollary 2 If f'(x) = g'(x) at each point of an interval I, then there exists a constant C
such that f(x) = g(x) + C for all x in I.

A function F(x) is an antiderivative of a function f(x) if F'(x) = f(x) for all x in the
domain of f.

Example: The function $F(x) = x^2$ is an antiderivative of f(x) = 2x. The function
$G(x) = x^2 + 4$ is also an antiderivative of f(x) = 2x.

The set of all antiderivatives of f is the indefinite integral of f with respect to x, denoted
by
\[
\int f(x)\,dx.
\]

The symbol ∫ is an integral sign. The function f(x) is the integrand of the integral, and
x is the variable of integration.

To verify $\int x e^x\,dx = x e^x - e^x + C$, we take the derivative of the right hand side:
\[
\frac{d}{dx}\left(x e^x - e^x + C\right) = e^x + x e^x - e^x = x e^x.
\]
Thus, the integral statement is correct.

Definition: (Definite Integral)

\[
\int_a^b f(x)\,dx = \text{(signed or net) area between the curve and the x-axis from } a \text{ to } b.
\]

The number a is called the lower limit and the number b is called the upper limit.
Note: If the curve from a to b is below the x-axis, the definite integral of f (x) from a
to b will be negative.


If part of the curve from a to b is below the x-axis and part of it is above the x-axis, the definite
integral of f (x) from a to b could be zero.

We can approximate the area under the curve using rectangles.

Left Endpoint Rule: using rectangles with left top corner on the curve
Right Endpoint Rule: using rectangles with right top corner on the curve
Midpoint Rule: using rectangles with top midpoint on the curve


Ln = sum of the areas of n rectangles using the left endpoint rule.
Rn = sum of the areas of n rectangles using the right endpoint rule.
Mn = sum of the areas of n rectangles using the midpoint rule.
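
The three rectangle rules can be sketched in a few lines of Python. This is only an illustration under the assumption that all n rectangles have equal width; the function name riemann_sums is hypothetical and not part of the original notes.

```python
def riemann_sums(f, a, b, n):
    """Return (L_n, R_n, M_n) for f on [a, b] using n equal-width rectangles."""
    width = (b - a) / n
    # Height of rectangle i is f evaluated at the left end, right end or midpoint.
    left = sum(f(a + i * width) for i in range(n)) * width
    right = sum(f(a + (i + 1) * width) for i in range(n)) * width
    mid = sum(f(a + (i + 0.5) * width) for i in range(n)) * width
    return left, right, mid


if __name__ == "__main__":
    # Approximate the area under y = x^2 on [0, 1] with 4 rectangles.
    print(riemann_sums(lambda x: x * x, 0.0, 1.0, 4))
```

For an increasing function the left rule underestimates and the right rule overestimates the area, with the midpoint rule usually falling in between.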

The Fundamental Theorem of Calculus


Note: If a question asks you to find the area of a region, it means the total area, i.e. the
(positive) measure of the size of the region.

Example: Find the area of the region between the x-axis and the graph of f(x) = x^3 − x^2 −
2x, −1 ≤ x ≤ 2.

If the graph of the function is not given, you may want to sketch the graph first and see what
the regions are. We also want to factor the function and find the x-intercepts. Thus, f(x) = x^3 − x^2
− 2x = x(x^2 − x − 2) = x(x + 1)(x − 2), so the x-intercepts are x = −1, 0 and 2.

Since from x = −1 to x = 0 the curve is positive and from x = 0 to x = 2 the curve is negative,
we can integrate the function from x = −1 to x = 0 and from x = 0 to x = 2 separately.

Note that the first integral is positive (5/12) but the second is negative (−8/3). Thus, the total area
is 5/12 + 8/3 = 37/12.
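
As a check on this example, the two integrals can be evaluated exactly with the antiderivative x^4/4 − x^3/3 − x^2. The sketch below uses Python's Fraction type to avoid rounding; the helper names F and definite_integral are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction


def F(x):
    """Antiderivative of f(x) = x^3 - x^2 - 2x, i.e. x^4/4 - x^3/3 - x^2."""
    x = Fraction(x)
    return x**4 / 4 - x**3 / 3 - x**2


def definite_integral(a, b):
    """Evaluate the definite integral of f from a to b as F(b) - F(a)."""
    return F(b) - F(a)


first = definite_integral(-1, 0)   # region above the x-axis
second = definite_integral(0, 2)   # region below the x-axis
total_area = abs(first) + abs(second)
print(first, second, total_area)
```

Taking absolute values before adding is what turns the signed integrals into a total (positive) area.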
Combinatorics
An area of mathematics primarily concerned with counting, both as a means and an end in obtaining results,
and certain properties of finite structures.

The Sum Rule: If there are n(A) ways to do A and, distinct from them, n(B) ways to do B, then the number of
ways to do A or B is n(A) + n(B). This rule generalizes: there are n(A) + n(B) + n(C) ways to do A or B or C.

Example: A woman has decided to shop at one store today, either in the north part of town or the south part of town.
If she visits the north part of town, she will shop at either a mall, a furniture store, or a jewelry store (3 ways). If she
visits the south part of town then she will shop at either a clothing store or a shoe store (2 ways).

Thus there are 3+2=5 possible shops the woman could end up shopping at today.

The Product Rule: If there are n(A) ways to do A and n(B) ways to do B, then the number of ways to do A and B
is n(A) × n(B). This holds when the choices are independent, that is, when the number of choices for
doing B is the same regardless of which choice you made for A. Again, this generalizes. There are n(A)×n(B)×n(C)
ways to do A and B and C.

Example: A traveling salesman wants to do a tour of all 50 state capitals. How many ways can he do this?

Answer: 50 choices for the first place to visit, 49 for the second, . . . : 50! altogether.

Permutations

A permutation of n things taken r at a time, written P(n, r), is an arrangement in a row of r things, taken from a set of
n distinct things. Order matters.
Example : How many permutations are there of 5 things taken 3 at a time?
Answer: 5 choices for the first thing, 4 for the second, 3 for the third: 5 × 4 × 3 = 60.

If the 5 things are a, b, c, d, e, some possible permutations are: abc abd abe acb acd ace adb adc ade aeb aec aed
...
In general, P(n, r) = n × (n − 1) × … × (n − r + 1) = n! / (n − r)!.
Combinations
A combination of n things taken r at a time, written C(n, r) (“n choose r”), is any subset of r things
from n things. Order makes no difference.
Example : How many ways are there of choosing 3 things from 5?
Answer : If order mattered, then it would be 5 × 4 × 3. Since order doesn’t matter, abc, acb, bac, bca, cab, cba are all
the same.
For each way of choosing three elements, there are 3! = 6 ways of ordering them.
Therefore, the right answer is (5 × 4 × 3)/3! = 10: abc abd abe acd ace ade bcd bce bde cde

In general, C(n, r) = P(n, r) / r! = n! / (r! (n − r)!).
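
The counts in the permutation and combination examples can be reproduced with Python's standard library. This is a small sketch, assuming Python 3.8 or later (for math.perm and math.comb); the variable names are illustrative only.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

things = "abcde"

# Ordered arrangements of 3 out of 5 things: P(5, 3).
arrangements = ["".join(p) for p in permutations(things, 3)]
# Unordered selections of 3 out of 5 things: C(5, 3).
selections = ["".join(c) for c in combinations(things, 3)]

print(len(arrangements), perm(5, 3))  # both count permutations
print(len(selections), comb(5, 3))    # both count combinations
print(selections)

# The tour of 50 state capitals from the product rule example is 50!.
tours = factorial(50)
```

Listing the selections shows the same ten subsets as in the worked answer, each of which corresponds to 3! = 6 of the arrangements.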
Probability and Statistics
Probability refers to the extent of occurrence of events. When an event occurs, like throwing a
ball or picking a card from a deck, there must be some probability associated with that event.
In terms of mathematics, probability refers to the ratio of wanted outcomes to the total number of
possible outcomes. There are three approaches to the theory of probability, namely:
1. Empirical Approach
2. Classical Approach
3. Axiomatic Approach

Basic Terminologies:
• Random Experiment – If an experiment is repeated several times under similar conditions and
does not produce the same outcome every time, but the outcome of each trial is one of several
possible outcomes, then such an experiment is called a random experiment or a probabilistic
experiment.
• Elementary Event – Each outcome of a random experiment is called an elementary event.
• Sample Space – The sample space is the set of all possible outcomes of a random
experiment. For example, when a coin is tossed, the possible outcomes are head and tail.
• Event – An event is a subset of the sample space associated with a random experiment.
• Occurrence of an Event – An event associated with a random experiment is said to occur if any
one of the elementary events belonging to it is an outcome.
• Sure Event – An event associated with a random experiment is said to be a sure event if it always
occurs whenever the random experiment is performed.
• Impossible Event – An event associated with a random experiment is said to be an impossible event
if it never occurs whenever the random experiment is performed.
• Compound Event – An event associated with a random experiment is said to be a compound event
if it is the disjoint union of two or more elementary events.
• Mutually Exclusive Events – Two or more events associated with a random experiment are said
to be mutually exclusive if the occurrence of any one of them prevents the occurrence of
all the others. This means that no two of the events can occur at the same time.
• Exhaustive Events – Two or more events associated with a random experiment are said to be
exhaustive events if their union is the sample space.
Probability of an Event – If there are p equally likely possible outcomes associated with a random
experiment and q of them are favourable to the event A, then the probability of event A
is denoted by P(A) and is given by
P(A) = q/p
The probability of non-occurrence of event A is P(A’) = 1 – P(A).
Note –
• If P(A) = 1, then event A is called a sure event.
• If P(A) = 0, then event A is called an impossible event.
• Also, P(A) + P(A’) = 1.
Theorems:
• General – Let A, B and C be events associated with a random experiment. Then
1. P(A∪B) = P(A) + P(B) – P(A∩B)
2. P(A∪B) = P(A) + P(B) if A and B are mutually exclusive
3. P(A∪B∪C) = P(A) + P(B) + P(C) – P(A∩B) – P(B∩C) – P(C∩A) + P(A∩B∩C)
4. P(A∩B’) = P(A) – P(A∩B)
5. P(A’∩B) = P(B) – P(A∩B)
• Extension of Multiplication Theorem – Let A1, A2, …, An be n events associated with a
random experiment. Then
P(A1∩A2∩A3∩ … ∩An) = P(A1)P(A2/A1)P(A3/A1∩A2) … P(An/A1∩A2∩ … ∩An-1)
Here P(B/A) denotes the conditional probability of B given A.
Example-1: A bag contains 10 oranges and 20 apples, out of which 5 apples and 3 oranges are
defective. If a person takes out two at random, what is the probability that either both are good or
both are apples?
Solution –
Out of 30 items, two can be selected in 30C2 ways.
Thus, total elementary events = 30C2.
Consider the events:
A = Getting two apples
B = Getting two good items
Required probability is:
P(A∪B) = P(A) + P(B) – P(A∩B) …(i)
There are 20 apples, out of which 2 can be drawn in 20C2 ways.
P(A) = 20C2/30C2
There are 8 defective items and 22 good ones. Out of 22 good items, two can be drawn
in 22C2 ways.
P(B) = 22C2/30C2
Since there are 15 items which are good apples, 2 of them can be selected in 15C2 ways.
P(A∩B) = 15C2/30C2
Substituting the values of P(A), P(B) and P(A∩B) in (i),
Required probability = (20C2/30C2) + (22C2/30C2) – (15C2/30C2)
= (190 + 231 – 105)/435
= 316/435
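The arithmetic of this example can be checked with exact fractions. The sketch below mirrors the events A and B from the solution; the variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

total = comb(30, 2)                          # ways to pick any 2 of the 30 items
p_apples = Fraction(comb(20, 2), total)      # P(A): both are apples
p_good = Fraction(comb(22, 2), total)        # P(B): both are good
p_good_apples = Fraction(comb(15, 2), total) # P(A∩B): both are good apples

# Addition theorem: P(A∪B) = P(A) + P(B) - P(A∩B)
p_union = p_apples + p_good - p_good_apples
print(p_union)
```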
Example-2: The probability that a person will get an electric contract is 2/5 and the probability that
he will not get a plumbing contract is 4/7. If the probability of getting at least one contract is 2/3,
what is the probability of getting both?
Solution:
Consider the two events:
A = Person gets electric contract
B = Person gets plumbing contract
We have,
P(A) = 2/5
P(B’) = 4/7
P(A∪B) = 2/3
Now,
P(A∩B) = P(A) + P(B) – P(A∪B)
= (2/5) + (1 – 4/7) – (2/3)
= 17/105
Law of Total Probability – Let S be the sample space associated with a random experiment and
E1, E2, …, En be n mutually exclusive and exhaustive events associated with the random
experiment. If A is any event which occurs with E1 or E2 or … or En, then
P(A) = P(E1)P(A/E1) + P(E2)P(A/E2) + ... + P(En)P(A/En)
Example-1: A bag contains 3 black balls and 4 red balls. A second bag contains 4 black balls and
2 red balls. One bag is selected at random. From the selected bag, one ball is drawn. Find the
probability that the ball drawn is red.
Solution:
A red ball can be drawn in two ways:
1. Selecting bag I and then drawing a red ball from it.
2. Selecting bag II and then drawing a red ball from it.
Let E1, E2 and A be the defined events as follows :
E1 = Selecting bag I
E2 = Selecting bag II
A = Drawing red ball
Since one of the two bags is selected at random,
P(E1) = 1/2
P(E2) = 1/2
Now, probability of drawing red ball when first bag has been chosen
P(A/E1) = 4/7
and, probability of drawing red ball when second bag has been chosen
P(A/E2) = 2/6
Using the law of total probability, we have
P(A) = P(E1)P(A/E1) + P(E2)P(A/E2)
= (1/2)(4/7) + (1/2)(2/6)
= 19/42
Hence, the probability of drawing a red ball is 19/42
Example-2: In a bulb factory, three machines A, B and C produce 25%, 35% and 40% of the
total bulbs respectively. Of their output, 5, 4 and 2 percent respectively are defective bulbs. A
bulb is drawn at random from the products. What is the probability that the bulb drawn is
defective?
Solution:
Let E1, E2, E3 and A be the defined events as follows :
E1 = The bulb is manufactured by machine A
E2 = The bulb is manufactured by machine B
E3 = The bulb is manufactured by machine C
A = The bulb is defective
According to the given conditions:
P(E1) = 25/100
P(E2) = 35/100
P(E3) = 40/100
Now, the probability that the bulb is defective given that it is produced by machine A is
P(A/E1) = 5/100
and the probability that the bulb is defective given that it is produced by machine B is
P(A/E2) = 4/100
and the probability that the bulb is defective given that it is produced by machine C is
P(A/E3) = 2/100
Using the law of total probability, we have
P(A) = P(E1)P(A/E1) + P(E2)P(A/E2) + P(E3)P(A/E3)
= (25/100)(5/100) + (35/100)(4/100) + (40/100)(2/100)
= 0.0125 + 0.0140 + 0.0080
= 0.0345
Hence, the probability that the bulb is defective is 0.0345.
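
Both worked examples follow the same pattern, which can be sketched as a small function. The name total_probability and its argument names are hypothetical, chosen here for illustration; exact fractions are used so that the results match the hand calculations.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """Return P(A) = sum of P(Ei) * P(A/Ei) over mutually exclusive, exhaustive Ei."""
    if sum(priors) != 1:
        raise ValueError("the events Ei must be exhaustive: priors must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))


# Two bags chosen with equal probability; red-ball chances 4/7 and 2/6.
p_red = total_probability(
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(4, 7), Fraction(2, 6)],
)

# Three machines producing 25%, 35%, 40% with 5%, 4%, 2% defective.
p_defective = total_probability(
    [Fraction(25, 100), Fraction(35, 100), Fraction(40, 100)],
    [Fraction(5, 100), Fraction(4, 100), Fraction(2, 100)],
)
print(p_red, float(p_defective))
```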

Mean
The mean is the average of a given set of data. For example, for the data 2, 4, 4, 4, 5, 5, 7, 9 the
mean (average) is (2 + 4 + 4 + 4 + 5 + 5 + 7 + 9)/8 = 40/8 = 5.


Facts about the mean:

1. The mean (or average) is the most popular and well-known measure of central tendency.
2. It can be used with both discrete and continuous data, although it is most often used with
continuous data.
3. There are several types of mean: the arithmetic mean, the geometric mean and the harmonic mean.
4. The mean is the only measure of central tendency for which the sum of the deviations of each value
from the mean is always zero.
Formula for the mean of ungrouped data:
Mean = (x1 + x2 + … + xn) / n = (Σ xi) / n
Formula for the mean of grouped data:
Mean = (Σ fi xi) / (Σ fi), where fi is the frequency of the class with mid-value xi

Median
The median is the middle value of a set of data. To determine the median of a sequence of
numbers, the numbers must first be arranged in ascending order.
• If there is an odd number of values, the median is the value in the middle, with the
same number of values below and above it.
• If there is an even number of values, the median is the average of the two middle values.

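Facts about the median:

1. The median, the mean and the mode together form a group called the measures of central
tendency.
2. The median is a more robust measure than the mean for skewed data, because it is
not so easily distorted by extreme values. For example, the median of {1, 2, 2, 5, 100} is 2 while the mean is 22.
3. If a constant is added to every value, the mean and the median increase by the same constant.
4. If every value is multiplied by a constant, the mean and the median are also multiplied
by that constant.
Formula for the median of ungrouped data (n values sorted in ascending order):
Median = the ((n + 1)/2)-th value if n is odd
Median = the average of the (n/2)-th and (n/2 + 1)-th values if n is even

Formula for the median of grouped data:
Median = l + ((n/2 − cf) / f) × h
where l is the lower limit of the median class, n is the total frequency, cf is the cumulative
frequency of the class preceding the median class, f is the frequency of the median class and h is
the class width.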

Mode
The mode is the value which occurs most frequently in a set of observations. For example, in {6, 3, 9, 6,
6, 5, 9, 3} the mode is 6, as it occurs most often.
Facts about the mode:
1. Sometimes there can be more than one mode. Having two modes is called bimodal. Having more than
two modes is called multimodal.
2. There is an empirical relationship between mean, median and mode:
Mean – Mode = 3 [ Mean – Median ]
3. The mode can be useful for qualitative data.
4. The mode can be located graphically.
5. The mode can be computed from an open-ended frequency table.
6. The mode is not affected by extremely large or small values.
Formula for the mode of grouped data:
Mode = l + ((f1 − f0) / (2 f1 − f0 − f2)) × h
where l is the lower limit of the modal class, f1 is the frequency of the modal class, f0 and f2 are
the frequencies of the classes before and after it, and h is the class width.
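
The three measures of central tendency for ungrouped data can be computed with Python's statistics module. This is a minimal sketch, assuming Python 3.8 or later (for statistics.multimode); the data sets are the ones used in the examples above.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(data)
median = statistics.median(data)      # average of the two middle values (n is even)
modes = statistics.multimode(data)    # list of all most frequent values

# One extreme value pulls the mean far more than the median.
skewed = [1, 2, 2, 5, 100]
skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)

observations = [6, 3, 9, 6, 6, 5, 9, 3]
observation_modes = statistics.multimode(observations)

print(mean, median, modes)
print(skewed_mean, skewed_median)
print(observation_modes)
```

multimode returns a list, so bimodal or multimodal data are reported with all of their modes.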


Variance and Standard Deviation


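The mean is the average of a given set of data. Consider the example
2, 4, 4, 4, 5, 5, 7, 9
These eight data points have a mean (average) of 5. The variance is the mean of the squared
deviations from the mean: (9 + 1 + 1 + 1 + 0 + 0 + 4 + 16)/8 = 32/8 = 4. The standard deviation is the
square root of the variance, which here is 2.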
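A short sketch of the variance and standard deviation for these eight data points follows. It assumes the population form (dividing by n); the sample form, which divides by n − 1, would give a different value. The function name population_variance is illustrative.

```python
from math import sqrt


def population_variance(data):
    """Mean of the squared deviations from the mean (divides by n, not n - 1)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)


data = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(data)
std_dev = sqrt(variance)
print(variance, std_dev)
```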
Boolean Algebra
Boolean Algebra is used to analyze and simplify digital (logic) circuits. It uses only the
binary numbers, i.e. 0 and 1. It is also called Binary Algebra or Logical Algebra. Boolean
algebra was invented by George Boole in 1854.

Rule in Boolean Algebra


Following are the important rules used in Boolean algebra.
• A variable can have only two values: binary 1 for HIGH and binary 0 for LOW.
• The complement of a variable is represented by an overbar. Thus, the complement of variable B is
written as B with a bar over it (or as B′). Thus if B = 0 then B′ = 1, and if B = 1 then B′ = 0.
• ORing of variables is represented by a plus (+) sign between them. For example, ORing of A, B, C
is represented as A + B + C.
• Logical ANDing of two or more variables is represented by writing a dot between them, such as
A.B.C. Sometimes the dot may be omitted, as in ABC.

Boolean Laws
There are six types of Boolean Laws.

Commutative law
Any binary operation which satisfies the following expressions is referred to as a commutative
operation.
A.B = B.A
A + B = B + A
The commutative law states that changing the sequence of the variables does not have any effect on
the output of a logic circuit.

Associative law
This law states that the order in which the logic operations are performed is irrelevant, as their
effect is the same.
(A.B).C = A.(B.C)
(A + B) + C = A + (B + C)

Distributive law
The distributive law states the following condition.
A.(B + C) = A.B + A.C

AND law
These laws use the AND operation. Therefore they are called AND laws.
A.0 = 0
A.1 = A
A.A = A
A.A′ = 0

OR law
These laws use the OR operation. Therefore they are called OR laws.
A + 0 = A
A + 1 = 1
A + A = A
A + A′ = 1

INVERSION law
This law uses the NOT operation. The inversion law states that double inversion of a variable
results in the original variable itself.
(A′)′ = A

Logic gates
Logic gates are the basic building blocks of any digital system. A logic gate is an electronic circuit having
one or more inputs and only one output. The relationship between the input and the
output is based on a certain logic. Based on this, logic gates are named AND gate, OR gate,
NOT gate etc.

AND Gate
A circuit which performs an AND operation is shown in the figure. It has n inputs (n >= 2) and one
output.

Logic diagram

Truth Table

OR Gate
A circuit which performs an OR operation is shown in the figure. It has n inputs (n >= 2) and one
output.
Logic diagram


Truth Table

NOT Gate
NOT gate is also known as Inverter. It has one input A and one output Y.

Logic diagram

Truth Table

NAND Gate
A NOT-AND operation is known as a NAND operation. It has n inputs (n >= 2) and one output.

Logic diagram

Truth Table

NOR Gate
A NOT-OR operation is known as a NOR operation. It has n inputs (n >= 2) and one output.

Logic diagram

Truth Table

XOR Gate
The XOR or Ex-OR gate is a special type of gate. It can be used in the half adder, full adder and
subtractor. The exclusive-OR gate is abbreviated as EX-OR gate or sometimes as X-OR gate. It
has n inputs (n >= 2) and one output.

Logic diagram

Truth Table

XNOR Gate
The XNOR gate is a special type of gate. It can be used in the half adder, full adder and subtractor.
The exclusive-NOR gate is abbreviated as EX-NOR gate or sometimes as X-NOR gate. It has n
inputs (n >= 2) and one output.

Logic diagram

Truth Table

Boolean Expressions & Functions


Boolean algebra is algebra of logic. It deals with variables that can have two discrete values, 0
(False) and 1 (True); and operations that have logical significance. The earliest method of
manipulating symbolic logic was invented by George Boole and subsequently came to be known
as Boolean Algebra.
Boolean algebra has now become an indispensable tool in computer science for its wide
applicability in switching theory, building basic electronic circuits and design of digital
computers.

Boolean Functions
A Boolean function is a special kind of mathematical function f: X^n → X of degree n,
where X = {0, 1} is a Boolean domain and n is a non-negative integer. It describes
how to derive a Boolean output from Boolean inputs.
Example − Let F(A, B) = A′B′. This is a function of degree 2 from the set of ordered
pairs of Boolean variables to the
set {0, 1}, where F(0,0) = 1, F(0,1) = 0, F(1,0) = 0 and F(1,1) = 0.
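The example function can be tabulated by enumerating all input pairs. This is a minimal sketch in which 0 and 1 are plain Python integers and the complement A′ is computed as 1 − A; the names F and table are illustrative.

```python
from itertools import product


def F(a, b):
    """F(A, B) = A'B': 1 only when both inputs are 0."""
    return (1 - a) & (1 - b)


# Map every ordered pair of inputs to the function's output.
table = {(a, b): F(a, b) for a, b in product((0, 1), repeat=2)}
for inputs, output in table.items():
    print(inputs, output)
```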
Boolean Expressions
A Boolean expression always produces a Boolean value. A Boolean expression is composed of a
combination of the Boolean constants (True or False), Boolean variables and logical
connectives.
Each Boolean expression represents a Boolean function.
Example − AB′C is a Boolean expression.
Boolean Identities
Double Complement Law
∼(∼A)=A
Complement Law
A+∼A=1 (OR Form)
A.∼A=0 (AND Form)
Idempotent Law
A+A=A (OR Form)
A.A=A (AND Form)
Identity Law
A+0=A (OR Form)
A.1=A (AND Form)
Dominance Law
A+1=1 (OR Form)
A.0=0 (AND Form)
Commutative Law
A+B=B+A (OR Form)
A.B=B.A (AND Form)
Associative Law
A+(B+C)=(A+B)+C (OR Form)
A.(B.C)=(A.B).C (AND Form)
Absorption Law
A.(A+B)=A
A+(A.B)=A
Simplification Law
A.(∼A+B)=A.B
A+(∼A.B)=A+B
Distributive Law
A+(B.C)=(A+B).(A+C)
A.(B+C)=(A.B)+(A.C)
De-Morgan's Law
∼(A.B)=∼A+∼B
∼(A+B)=∼A.∼B
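Because each variable takes only the values 0 and 1, any of these identities can be confirmed by trying every combination of inputs. The sketch below checks a selection of them; the helper names NOT, AND, OR and holds are illustrative assumptions rather than anything defined in the notes.

```python
from itertools import product


def NOT(a):
    return 1 - a


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


# Each entry pairs the left-hand side with the right-hand side of an identity.
IDENTITIES = {
    "absorption (AND)": (lambda a, b, c: AND(a, OR(a, b)), lambda a, b, c: a),
    "absorption (OR)": (lambda a, b, c: OR(a, AND(a, b)), lambda a, b, c: a),
    "simplification (AND)": (lambda a, b, c: AND(a, OR(NOT(a), b)), lambda a, b, c: AND(a, b)),
    "simplification (OR)": (lambda a, b, c: OR(a, AND(NOT(a), b)), lambda a, b, c: OR(a, b)),
    "distributive (OR over AND)": (
        lambda a, b, c: OR(a, AND(b, c)),
        lambda a, b, c: AND(OR(a, b), OR(a, c)),
    ),
    "De Morgan (AND)": (lambda a, b, c: NOT(AND(a, b)), lambda a, b, c: OR(NOT(a), NOT(b))),
    "De Morgan (OR)": (lambda a, b, c: NOT(OR(a, b)), lambda a, b, c: AND(NOT(a), NOT(b))),
}


def holds(lhs, rhs):
    """True if both sides agree for every assignment of A, B, C."""
    return all(lhs(*v) == rhs(*v) for v in product((0, 1), repeat=3))


for name, (lhs, rhs) in IDENTITIES.items():
    print(name, holds(lhs, rhs))
```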
Canonical Forms
For a Boolean expression there are two kinds of canonical forms −
• The sum of minterms (SOM) form
• The product of maxterms (POM) form
The Sum of Min-terms (SOM) or Sum of Products (SOP) form
A minterm is a product of all variables taken either in their direct or complemented form. Any
Boolean function can be expressed as a sum of its 1-minterms and the inverse of the function can
be expressed as a sum of its 0-minterms. Hence,
F (list of variables) = ∑ (list of 1-minterm indices)
and
F' (list of variables) = ∑ (list of 0-minterm indices)

x y z   Term     Minterm
0 0 0   x’y’z’   m0
0 0 1   x’y’z    m1
0 1 0   x’yz’    m2
0 1 1   x’yz     m3
1 0 0   xy’z’    m4
1 0 1   xy’z     m5
1 1 0   xyz’     m6
1 1 1   xyz      m7

Example
Let F(x,y,z) = x′y′z′ + xy′z + xyz′ + xyz
Or, F(x,y,z) = m0 + m5 + m6 + m7
Hence,
F(x,y,z) = ∑(0,5,6,7)
Now we will find the complement of F(x,y,z)
F′(x,y,z) = x′yz + x′y′z + x′yz′ + xy′z′
Or, F′(x,y,z) = m3 + m1 + m2 + m4
Hence,
F′(x,y,z) = ∑(3,1,2,4) = ∑(1,2,3,4)
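The minterm indices of a function and of its complement can be found by evaluating the function on every input row, with x as the most significant bit of the index. The sketch below does this for the example function; the helper name minterm_indices is hypothetical.

```python
from itertools import product


def F(x, y, z):
    """F = x'y'z' + xy'z + xyz' + xyz, with 0/1 integer inputs."""
    nx, ny, nz = 1 - x, 1 - y, 1 - z
    return (nx & ny & nz) | (x & ny & z) | (x & y & nz) | (x & y & z)


def minterm_indices(f, n):
    """Return (1-minterm indices, 0-minterm indices) of an n-variable function."""
    ones, zeros = [], []
    # product yields rows 00...0 to 11...1 in order, so the row number is the index.
    for index, bits in enumerate(product((0, 1), repeat=n)):
        (ones if f(*bits) else zeros).append(index)
    return ones, zeros


ones, zeros = minterm_indices(F, 3)
print(ones, zeros)
```

The second list gives the minterms of the complement F′, and the same indices are the maxterm indices of F in the product of maxterms form.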
The Product of Max-terms (POM) or Product of Sums (POS) form
A maxterm is a sum of all variables taken either in their direct or complemented form. Any
Boolean function can be expressed as a product of its 0-maxterms and the inverse of the function
can be expressed as a product of its 1-maxterms. Hence,
F(list of variables) = ∏ (list of 0-maxterm indices)
and
F'(list of variables) = ∏ (list of 1-maxterm indices)
x y z   Term           Maxterm
0 0 0   x + y + z      M0
0 0 1   x + y + z’     M1
0 1 0   x + y’ + z     M2
0 1 1   x + y’ + z’    M3
1 0 0   x’ + y + z     M4
1 0 1   x’ + y + z’    M5
1 1 0   x’ + y’ + z    M6
1 1 1   x’ + y’ + z’   M7

Example
Let F(x,y,z) = (x+y+z).(x+y+z′).(x+y′+z).(x′+y+z)
Or, F(x,y,z) = M0.M1.M2.M4
Hence,
F(x,y,z) = ∏(0,1,2,4)
Now we will find the complement of F(x,y,z)
F′(x,y,z) = (x+y′+z′).(x′+y+z′).(x′+y′+z).(x′+y′+z′)
Or, F′(x,y,z) = M3.M5.M6.M7
Hence,
F′(x,y,z) = ∏(3,5,6,7)
Logic Gates
Boolean functions are implemented by using logic gates. The following are the logic gates −

NOT Gate
A NOT gate inverts a single bit input to a single bit of output.

A ~A

0 1

1 0
AND Gate
An AND gate is a logic gate that gives a high output only if all its inputs are high, otherwise it
gives low output. A dot (.) is used to show the AND operation.

A B A.B

0 0 0

0 1 0

1 0 0

1 1 1

OR Gate
An OR gate is a logic gate that gives high output if at least one of the inputs is high. A plus (+) is
used to show the OR operation.

A B A+B

0 0 0

0 1 1

1 0 1

1 1 1

NAND Gate
A NAND gate is a logic gate that gives a low output only if all its inputs are high, otherwise it
gives high output.

A B ~(A.B)

0 0 1

0 1 1

1 0 1

1 1 0

NOR Gate
A NOR gate is a logic gate that gives a high output only if all its inputs are low, otherwise it gives
a low output.

A B ~(A+B)

0 0 1

0 1 0
1 0 0

1 1 0
XOR (Exclusive OR) Gate
An XOR gate is a logic gate that gives high output if the inputs are different, otherwise it gives
low output.

A B A⊕B

0 0 0

0 1 1

1 0 1

1 1 0

X-NOR (Exclusive NOR) Gate


An X-NOR gate is a logic gate that gives a high output if the inputs are the same, otherwise it gives
a low output.

A B A X-NOR B

0 0 1

0 1 0

1 0 0

1 1 1
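
The truth tables above can be regenerated in software. The following is a small sketch for two-input gates, modelling each gate as a function on the integers 0 and 1; the names GATES, NOT and truth_table are illustrative and not part of the original material.

```python
from itertools import product

# Two-input gates as functions of 0/1 integers.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "XOR": lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}


def NOT(a):
    """Single-input inverter."""
    return 1 - a


def truth_table(gate):
    """List (A, B, output) rows in the order 00, 01, 10, 11."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]


for name, gate in GATES.items():
    print(name, [row[2] for row in truth_table(gate)])
```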
Karnaugh Map
Karnaugh introduced a method for simplifying Boolean functions in an easy way. This
method is known as the Karnaugh map method or K-map method. It is a graphical method, which
consists of 2^n cells for n variables. Adjacent cells differ in only a single bit position.
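
The single-bit-difference property comes from labelling rows and columns in Gray code order (00, 01, 11, 10). The sketch below builds such a layout of min term numbers and checks the adjacency property, including wrap-around at the edges. It assumes the row variables form the high-order bits of the min term index; the function names are hypothetical.

```python
def gray_code(bits):
    """Return the Gray code sequence for the given number of bits."""
    return [i ^ (i >> 1) for i in range(2 ** bits)]


def kmap_layout(row_bits, col_bits):
    """Grid of min term indices with Gray-coded rows and columns."""
    rows, cols = gray_code(row_bits), gray_code(col_bits)
    return [[(r << col_bits) | c for c in cols] for r in rows]


def differ_by_one_bit(a, b):
    return bin(a ^ b).count("1") == 1


def is_adjacent_everywhere(grid):
    """Check horizontal and vertical neighbours, wrapping around the edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    for i in range(n_rows):
        for j in range(n_cols):
            right = grid[i][(j + 1) % n_cols]
            below = grid[(i + 1) % n_rows][j]
            if not differ_by_one_bit(grid[i][j], right):
                return False
            if n_rows > 1 and not differ_by_one_bit(grid[i][j], below):
                return False
    return True


for row in kmap_layout(2, 2):
    print(row)
```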

K-Maps for 2 to 5 Variables


The K-Map method is most suitable for minimizing Boolean functions of 2 to 5 variables.
Now, let us discuss the K-Maps for 2 to 5 variables one by one.

2 Variable K-Map
The number of cells in a 2 variable K-map is four, since the number of variables is two. The
following figure shows a 2 variable K-Map.

• There is only one possibility of grouping 4 adjacent min terms.
• The possible combinations of grouping 2 adjacent min terms are {(m0, m1), (m2, m3), (m0, m2) and (m1,
m3)}.

3 Variable K-Map
The number of cells in a 3 variable K-map is eight, since the number of variables is three. The
following figure shows a 3 variable K-Map.

• There is only one possibility of grouping 8 adjacent min terms.
• The possible combinations of grouping 4 adjacent min terms are {(m0, m1, m3, m2), (m4, m5, m7, m6),
(m0, m1, m4, m5), (m1, m3, m5, m7), (m3, m2, m7, m6) and (m2, m0, m6, m4)}.
• The possible combinations of grouping 2 adjacent min terms are {(m0, m1), (m1, m3), (m3, m2), (m2, m0),
(m4, m5), (m5, m7), (m7, m6), (m6, m4), (m0, m4), (m1, m5), (m3, m7) and (m2, m6)}.
• If x=0, then the 3 variable K-map becomes a 2 variable K-map.

4 Variable K-Map
The number of cells in a 4 variable K-map is sixteen, since the number of variables is four. The
following figure shows a 4 variable K-Map.

• There is only one possibility of grouping 16 adjacent min terms.
• Let R1, R2, R3 and R4 represent the min terms of the first row, second row, third row and fourth row
respectively. Similarly, let C1, C2, C3 and C4 represent the min terms of the first column, second column,
third column and fourth column respectively. The possible combinations of grouping 8 adjacent min
terms are {(R1, R2), (R2, R3), (R3, R4), (R4, R1), (C1, C2), (C2, C3), (C3, C4), (C4, C1)}.
• If w=0, then the 4 variable K-map becomes a 3 variable K-map.

5 Variable K-Map
The number of cells in a 5 variable K-map is thirty-two, since the number of variables is 5. The
following figure shows a 5 variable K-Map.

• There is only one possibility of grouping 32 adjacent min terms.
• There are two possibilities of grouping 16 adjacent min terms, i.e., grouping of min terms from m0 to
m15 and from m16 to m31.
• If v=0, then the 5 variable K-map becomes a 4 variable K-map.
In all the above K-maps, we used exclusively the min terms notation. Similarly, you can use
exclusively the Max terms notation.

Minimization of Boolean Functions using K-Maps


If we consider the combination of inputs for which the Boolean function is ‘1’, then we will get
the Boolean function, which is in standard sum of products form after simplifying the K-map.
Similarly, if we consider the combination of inputs for which the Boolean function is ‘0’, then we
will get the Boolean function, which is in standard product of sums form after simplifying the
K-map.
Follow these rules for simplifying K-maps in order to get the standard sum of products form.
• Select the respective K-map based on the number of variables present in the Boolean function.
• If the Boolean function is given in sum of min terms form, then place ones at the respective min term
cells in the K-map. If the Boolean function is given in sum of products form, then place ones in all
possible cells of the K-map for which the given product terms are valid.
• Check for the possibilities of grouping the maximum number of adjacent ones. The group size should be a power of
two. Start from the highest power of two and go down to the least power of two. The highest power is equal to the
number of variables considered in the K-map and the least power is zero.
• Each grouping will give either a literal or one product term. It is known as a prime implicant. A prime
implicant is said to be an essential prime implicant if at least one ‘1’ is covered by that grouping
and by no other grouping.
• Note down all the prime implicants and essential prime implicants. The simplified Boolean function
contains all essential prime implicants and only the required prime implicants.
Note 1 − If outputs are not defined for some combinations of inputs, then those output values are
represented with the don’t care symbol ‘x’. That means we can consider them as either ‘0’ or
‘1’.
Note 2 − If don’t care terms are also present, then place don’t cares ‘x’ in the respective cells of the
K-map. Consider only the don’t cares ‘x’ that are helpful for grouping the maximum number of
adjacent ones. In those cases, treat the don’t care value as ‘1’.

Example
Let us simplify the following Boolean function, f(W, X, Y, Z) = WX’Y’ + WY + W’YZ’, using a
K-map.
The given Boolean function is in sum of products form. It has 4 variables W, X, Y & Z. So,
we require a 4 variable K-map. The 4 variable K-map with ones corresponding to the given
product terms is shown in the following figure.

Here, 1s are placed in the following cells of the K-map.

• The cells common to the intersection of Row 4 and columns 1 & 2 correspond to the
product term WX’Y’.
• The cells common to the intersection of Rows 3 & 4 and columns 3 & 4 correspond
to the product term WY.
• The cells common to the intersection of Rows 1 & 2 and column 4 correspond to the
product term W’YZ’.
There are no possibilities of grouping either 16 adjacent ones or 8 adjacent ones. There are three
possibilities of grouping 4 adjacent ones. After these three groupings, there is no single one left
ungrouped. So, we need not check for groupings of 2 adjacent ones. The 4 variable K-map
with these three groupings is shown in the following figure.

Here, we got three prime implicants WX’, WY & YZ’. All these prime implicants
are essential for the following reasons.
• Two ones (m8 & m9) of the fourth row grouping are not covered by any other grouping. Only the fourth row
grouping covers those two ones.
• A single one (m15) of the square shape grouping is not covered by any other grouping. Only the square shape
grouping covers that one.
• Two ones (m2 & m6) of the fourth column grouping are not covered by any other grouping. Only the fourth
column grouping covers those two ones.
Therefore, the simplified Boolean function is
f = WX’ + WY + YZ’
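A K-map simplification can always be double-checked by comparing the original and simplified expressions on all input combinations. The sketch below does this for the example above; the function names original and simplified are illustrative.

```python
from itertools import product


def original(w, x, y, z):
    """f = WX'Y' + WY + W'YZ'"""
    return (w and not x and not y) or (w and y) or (not w and y and not z)


def simplified(w, x, y, z):
    """f = WX' + WY + YZ'"""
    return (w and not x) or (w and y) or (y and not z)


# Min term index uses W as the most significant bit and Z as the least.
ones = [
    index
    for index, bits in enumerate(product((0, 1), repeat=4))
    if original(*bits)
]
equivalent = all(
    bool(original(*bits)) == bool(simplified(*bits))
    for bits in product((0, 1), repeat=4)
)
print(ones, equivalent)
```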
Follow these rules for simplifying K-maps in order to get the standard product of sums form.
• Select the respective K-map based on the number of variables present in the Boolean function.
• If the Boolean function is given in product of Max terms form, then place zeroes at the respective Max
term cells in the K-map. If the Boolean function is given in product of sums form, then place
zeroes in all possible cells of the K-map for which the given sum terms are valid.
• Check for the possibilities of grouping the maximum number of adjacent zeroes. The group size should be a power of
two. Start from the highest power of two and go down to the least power of two. The highest power is equal to the
number of variables considered in the K-map and the least power is zero.
• Each grouping will give either a literal or one sum term. It is known as a prime implicant. A prime
implicant is said to be an essential prime implicant if at least one ‘0’ is covered by that grouping
and by no other grouping.
• Note down all the prime implicants and essential prime implicants. The simplified Boolean function
contains all essential prime implicants and only the required prime implicants.
Note − If don’t care terms are also present, then place don’t cares ‘x’ in the respective cells of the
K-map. Consider only the don’t cares ‘x’ that are helpful for grouping the maximum number of
adjacent zeroes. In those cases, treat the don’t care value as ‘0’.
Example
Let us simplify the following Boolean function, f(X, Y, Z) = ∏M(0, 1, 2, 4), using a K-map.
The given Boolean function is in product of Max terms form. It has 3 variables X, Y & Z.
So, we require a 3 variable K-map. The given Max terms are M0, M1, M2 & M4. The 3 variable
K-map with zeroes corresponding to the given Max terms is shown in the following figure.

There are no possibilities of grouping either 8 adjacent zeroes or 4 adjacent zeroes. There are
three possibilities of grouping 2 adjacent zeroes. After these three groupings, there is no single
zero left as ungrouped. The 3 variable K-map with these three groupings is shown in the
following figure.

Here, we got three prime implicants X + Y, Y + Z & Z + X. All these prime implicants
are essential because one zero in each grouping is not covered by any other groupings except
with their individual groupings.
Therefore, the simplified Boolean function is
f = (X + Y).(Y + Z).(Z + X)
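As a quick check, the simplified product of sums should be 0 on exactly the rows named by the given Max terms. The following sketch lists those rows, taking X as the most significant bit of the index; the names f and zero_indices are illustrative.

```python
from itertools import product


def f(x, y, z):
    """f = (X + Y).(Y + Z).(Z + X) with 0/1 integer inputs."""
    return (x | y) & (y | z) & (z | x)


# Row numbers (Max term indices) on which the simplified function is 0.
zero_indices = [
    index
    for index, bits in enumerate(product((0, 1), repeat=3))
    if f(*bits) == 0
]
print(zero_indices)
```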
In this way, we can easily simplify Boolean functions of up to 5 variables using the K-map method.
For more than 5 variables, it is difficult to simplify the functions using K-Maps, because the
number of cells in the K-map doubles with each new variable.
Due to this, checking and grouping adjacent ones (min terms) or adjacent zeros (Max terms)
becomes complicated. We will discuss the Tabular method in the next chapter to overcome the
difficulties of the K-map method.
Combinational Circuits
A combinational circuit is a circuit in which we combine different gates, for
example an encoder, decoder, multiplexer or demultiplexer. Some of the characteristics of
combinational circuits are the following −
• The output of a combinational circuit at any instant of time depends only on the levels present at the input
terminals.
• A combinational circuit does not use any memory. The previous state of the input does not have any effect
on the present state of the circuit.
• A combinational circuit can have n inputs and m outputs.

Block diagram

We will now elaborate on a few important combinational circuits.

Half Adder
The half adder is a combinational logic circuit with two inputs and two outputs. The half adder circuit
is designed to add two single-bit binary numbers A and B. It is the basic building block for the
addition of two single-bit numbers. This circuit has two outputs, carry and sum.
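
A half adder can be sketched in software using the standard gate-level realization, in which the sum is A XOR B and the carry is A AND B. The function name half_adder is illustrative; the circuit diagram in the original is not reproduced here.

```python
def half_adder(a, b):
    """Add two single bits and return (sum, carry)."""
    total = a ^ b   # XOR gives the sum bit
    carry = a & b   # AND gives the carry bit
    return total, carry


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```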

Block diagram

Truth Table

Circuit Diagram

Full Adder
Full adder is developed to overcome the drawback of Half Adder circuit. It can add two one-bit
numbers A and B, and carry c. The full adder is a three input and two output combinational
circuit.
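
A full adder is commonly built from two half-adder stages plus an OR gate. The sketch below follows that standard construction, which is assumed here rather than taken from the missing circuit diagram; the function name full_adder is illustrative.

```python
def full_adder(a, b, c_in):
    """Add bits a and b plus a carry-in and return (sum, carry_out)."""
    partial = a ^ b                        # first half adder: sum of a and b
    total = partial ^ c_in                 # second half adder: add the carry-in
    carry = (a & b) | (partial & c_in)     # a carry arises in either stage
    return total, carry


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, full_adder(a, b, c))
```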

Block diagram

Truth Table

Circuit Diagram

N-Bit Parallel Adder


The full adder is capable of adding only two single-bit binary numbers along with a carry input.
In practice, we need to add binary numbers which are much longer than one bit. To add
two n-bit binary numbers we use the n-bit parallel adder. It uses a number of full adders
in cascade: the carry output of each full adder is connected to the carry input of the next full
adder.
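The cascade can be sketched in Python by chaining a one-bit full adder. This is a behavioural illustration only; the names full_adder and parallel_adder, and the choice of passing bits LSB first, are assumptions made for the example.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def parallel_adder(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists given LSB first.

    Returns (sum_bits, carry_out), with sum_bits also LSB first.
    """
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry ripples to the next stage
        sum_bits.append(s)
    return sum_bits, carry


# 5 (0101) + 6 (0110), written LSB first
print(parallel_adder([1, 0, 1, 0], [0, 1, 1, 0]))
```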

4 Bit Parallel Adder


In the block diagram, A0 and B0 represent the LSBs of the four-bit words A and B. Full
Adder-0 is therefore the lowest stage, and its Cin is permanently tied to 0. The rest of the
connections are exactly the same as those of the n-bit parallel adder shown in the figure. The
four-bit parallel adder is a very common logic circuit.

Block diagram

N-Bit Parallel Subtractor


The subtraction can be carried out by taking the 1's or 2's complement of the number to be
subtracted. For example we can perform the subtraction (A-B) by adding either 1's or 2's
complement of B to A. That means we can use a binary adder to perform the binary subtraction.

4 Bit Parallel Subtractor


The number to be subtracted (B) is first passed through inverters to obtain its 1's complement.
With its carry input set to 1, the 4-bit adder then effectively adds A and the 2's complement of B
to produce the subtraction.
S3 S2 S1 S0 represents the result of the binary subtraction (A-B) and the carry output Cout
indicates the polarity of the result. Taking Cout directly from the adder, if A ≥ B then Cout = 1
and the result is in true binary form (A-B); if A < B then Cout = 0 and
the result is in the 2's complement form. (A design that inverts this carry so that it acts as a
borrow flag reports the opposite values.)
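The following Python sketch models this subtraction numerically. It assumes the carry output is taken directly from the adder without inversion, so in this model Cout = 1 means the result is in true binary form; the function name subtract_4bit is hypothetical.

```python
def subtract_4bit(a, b):
    """Compute A - B for 4-bit unsigned values as A + (1's complement of B) + 1.

    Returns (result, cout), where result is the 4-bit output S3..S0.
    """
    b_inverted = ~b & 0xF        # inverters: 1's complement of B
    total = a + b_inverted + 1   # carry-in of 1 completes the 2's complement
    result = total & 0xF         # keep the four sum bits
    cout = (total >> 4) & 1      # carry out of the most significant stage
    return result, cout


print(subtract_4bit(9, 5))   # A >= B
print(subtract_4bit(5, 9))   # A < B: result is the 2's complement of 4
```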

Block diagram

Half Subtractors
A half subtractor is a combinational circuit with two inputs and two outputs (difference and borrow).
It produces the difference between the two binary bits at the input and also produces an output
(borrow) to indicate whether a 1 has been borrowed. In the subtraction (A-B), A is called the minuend
bit and B is called the subtrahend bit.
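A minimal Python sketch of the half subtractor, assuming the usual expressions difference = A XOR B and borrow = (NOT A) AND B; the function name is illustrative.

```python
def half_subtractor(a, b):
    """Subtract bit b from bit a; return (difference, borrow)."""
    difference = a ^ b
    borrow = (a ^ 1) & b   # a borrow is needed only when A = 0 and B = 1
    return difference, borrow


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_subtractor(a, b))
```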

Truth Table

Circuit Diagram

Full Subtractors
The disadvantage of the half subtractor, namely that it has no borrow input, is overcome by the
full subtractor. The full subtractor is a combinational circuit with three inputs A, B, C and two
outputs D and C'. A is the 'minuend', B is the 'subtrahend', C is the 'borrow' produced by the
previous stage, D is the difference output and C' is the borrow output.
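The full subtractor can be sketched the same way. The expressions below (D = A XOR B XOR C, borrow out = A'.B + (A XOR B)'.C) are the standard ones; the function name and the lower-case parameter names are assumptions for this example.

```python
def full_subtractor(a, b, c):
    """Compute a - b - c for single bits; return (difference, borrow_out)."""
    d = a ^ b ^ c
    borrow_out = ((a ^ 1) & b) | (((a ^ b) ^ 1) & c)
    return d, borrow_out


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, full_subtractor(a, b, c))
```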

Truth Table

Circuit Diagram

Multiplexers
A multiplexer is a special type of combinational circuit. There are n data inputs, one output and m
select inputs, with 2^m = n. It is a digital circuit which selects one of the n data inputs and routes it
to the output. The selection of one of the n inputs is done by the select inputs. Depending on
the digital code applied at the select inputs, one out of the n data sources is selected and
transmitted to the single output Y. E is called the strobe or enable input, which is useful for
cascading. It is generally an active-low terminal, which means it performs the required operation
when it is low.
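A small behavioural model in Python may help. It assumes the select bits are given MSB first and that a disabled multiplexer (E high) drives its output to 0; both conventions, and the function name, are assumptions rather than something the text specifies.

```python
def multiplexer(data, select, enable=0):
    """Route one of n = 2**m data inputs to the output.

    data:   list of n bits
    select: list of m bits, MSB first
    enable: active-low enable (0 = operate)
    """
    if enable != 0:
        return 0                      # assumed output when disabled
    index = 0
    for bit in select:
        index = (index << 1) | bit    # build the binary select code
    return data[index]


print(multiplexer([0, 1, 1, 0], [1, 0]))  # selects data input 2
```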

Block diagram

Multiplexers come in multiple variations −

• 2 : 1 multiplexer
• 4 : 1 multiplexer
• 16 : 1 multiplexer
• 32 : 1 multiplexer

Block Diagram

Truth Table

Demultiplexers
A demultiplexer performs the reverse operation of a multiplexer, i.e. it receives one input and
distributes it over several outputs. It has only one input, n outputs and m select inputs. At any time
only one output line is selected by the select lines, and the input is transmitted to the selected
output line. A demultiplexer is equivalent to a single-pole, multiple-way switch.
Demultiplexers come in multiple variations −

• 1 : 2 demultiplexer
• 1 : 4 demultiplexer
• 1 : 16 demultiplexer
• 1 : 32 demultiplexer
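The routing behaviour of a demultiplexer can be sketched as below. The select bits are assumed to be MSB first and unselected outputs are assumed to stay at 0; the function name is hypothetical.

```python
def demultiplexer(data_in, select):
    """Send data_in to one of 2**m outputs chosen by the m select bits."""
    index = 0
    for bit in select:
        index = (index << 1) | bit
    outputs = [0] * (2 ** len(select))   # all unselected lines stay at 0
    outputs[index] = data_in
    return outputs


print(demultiplexer(1, [1, 0]))  # 1 : 4 demultiplexer, output line 2 selected
```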

Block diagram

Truth Table

Decoder
A decoder is a combinational circuit. It has n inputs and a maximum of m = 2^n outputs. A decoder is
identical to a demultiplexer without any data input. It performs operations which are exactly
opposite to those of an encoder.
Block diagram


Examples of decoders are the following −

• Code converters
• BCD to seven segment decoders
• Nixie tube decoders
• Relay actuator

2 to 4 Line Decoder
The block diagram of a 2 to 4 line decoder is shown in the figure. A and B are the two inputs, and D0
through D3 are the four outputs. The truth table explains the operation of the decoder. It shows that
each output is 1 for only one specific combination of inputs.
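A rough Python model of a 2 to 4 line decoder follows. Since the truth table is not reproduced here, the sketch assumes A is the more significant input, so that output number 2A + B goes high; that ordering and the function name are assumptions.

```python
def decoder_2to4(a, b):
    """Return the outputs [D0, D1, D2, D3]; exactly one of them is 1."""
    index = (a << 1) | b          # assumed: A is the MSB, B the LSB
    return [1 if i == index else 0 for i in range(4)]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, decoder_2to4(a, b))
```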

Block diagram

Truth Table

Logic Circuit

Encoder
An encoder is a combinational circuit which is designed to perform the inverse operation of the
decoder. An encoder has n input lines and m output lines. It produces an m-bit binary code
corresponding to the active input line; in other words, the encoder accepts an
n-input digital word and converts it into an m-bit digital word.

Block diagram

Examples of encoders are the following −

• Priority encoders
• Decimal to BCD encoder
• Octal to binary encoder
• Hexadecimal to binary encoder

Priority Encoder
This is a special type of encoder in which priority is given to the input lines. If two or more input
lines are 1 at the same time, then the input line with the highest priority is considered. There are
four inputs D0, D1, D2, D3 and two outputs Y0, Y1. Of the four inputs, D3 has the highest priority and
D0 has the lowest priority. That means if D3 = 1 then Y1 Y0 = 11 irrespective of the other inputs.
Similarly, if D3 = 0 and D2 = 1 then Y1 Y0 = 10 irrespective of the other inputs.
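The priority rule can be sketched in Python as a chain of checks from the highest-priority input downwards. The text does not say what the outputs are when no input is active, so returning 00 in that case is an assumption of this sketch, as is the function name.

```python
def priority_encoder_4to2(d0, d1, d2, d3):
    """Return (Y1, Y0) for the highest-priority active input (D3 is highest)."""
    if d3:
        return (1, 1)
    if d2:
        return (1, 0)
    if d1:
        return (0, 1)
    return (0, 0)   # D0 active, or (assumed) no input active at all


print(priority_encoder_4to2(1, 1, 0, 1))  # D3 wins regardless of D0 and D1
```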

Block diagram

Truth Table

Logic Circuit
Sequential Circuits
A combinational circuit does not use any memory. Hence the previous state of the input does not
have any effect on the present state of the circuit. A sequential circuit, by contrast, has
memory, so its output can depend on past inputs as well as on the present input. This type of
circuit uses previous input, output, clock and a memory element.

Block diagram

Flip Flop
A flip flop is a sequential circuit which generally samples its inputs and changes its outputs only at
particular instants of time and not continuously. A flip flop is said to be edge sensitive or edge
triggered, rather than level triggered like a latch.

S-R Flip Flop


It is basically an S-R latch using NAND gates with an additional enable input. It is also called a
level triggered SR-FF. In this circuit, a change in the output takes place if and only if the enable
input (E) is made active. In short, this circuit operates as an S-R latch if E = 1, but there is no
change in the output if E = 0.

Block Diagram

Circuit Diagram

Truth Table

Operation
S.N. Condition Operation

1 S = R = 0 : No change
If S = R = 0 then output of NAND gates 3 and 4 are forced to become 1.
Hence R' and S' both will be equal to 1. Since S' and R' are the input of the
basic S-R latch using NAND gates, there will be no change in the state of
outputs.

2 S = 0, R = 1, E = 1
Since S = 0, the output of NAND-3, i.e. R' = 1, and with E = 1 the output of NAND-4, i.e. S' = 0.
Hence Qn+1 = 0 and Qn+1 bar = 1. This is the reset condition.

3 S = 1, R = 0, E = 1
Output of NAND-3 i.e. R' = 0 and output of NAND-4 i.e. S' = 1.
Hence output of S-R NAND latch is Qn+1 = 1 and Qn+1 bar = 0. This is the
set condition.

4 S = 1, R = 1, E = 1
As S = 1, R = 1 and E = 1, the output of NAND gates 3 and 4 both are 0
i.e. S' = R' = 0.
Hence the Race condition will occur in the basic NAND latch.
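
The four cases above can be summarised as a next-state function. The sketch below is purely behavioural (it does not model the individual NAND gates), and it represents the forbidden S = R = 1 case by returning None; the function name and that convention are assumptions.

```python
def gated_sr_latch(q, s, r, e):
    """Return the next value of Q for a gated S-R latch.

    q is the present state; None is returned for the forbidden S = R = 1 case.
    """
    if e == 0:
        return q            # latch disabled: no change
    if s == 0 and r == 0:
        return q            # no change
    if s == 0 and r == 1:
        return 0            # reset
    if s == 1 and r == 0:
        return 1            # set
    return None             # S = R = 1: race, output is indeterminate


print(gated_sr_latch(q=0, s=1, r=0, e=1))  # set
print(gated_sr_latch(q=1, s=0, r=1, e=1))  # reset
```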
Master Slave JK Flip Flop
A master slave JK FF is a cascade of two S-R FFs with feedback from the output of the second to the
input of the first. The master is positive level triggered, but due to the presence of the inverter
in the clock line, the slave responds to the negative level. Hence when the clock = 1 (positive
level) the master is active and the slave is inactive, whereas when clock = 0 (low level) the slave
is active and the master is inactive.

Circuit Diagram

Truth Table

Operation
S.N. Condition Operation

1 J = K = 0 (No change)
When clock = 0, the slave becomes active and master is inactive. But since
the S and R inputs have not changed, the slave outputs will also remain
unchanged. Therefore outputs will not change if J = K =0.

2 J = 0 and K = 1 (Reset)
Clock = 1 − Master active, slave inactive. Therefore outputs of the master
become Q1 = 0 and Q1 bar = 1. That means S = 0 and R =1.
Clock = 0 − Slave active, master inactive. Therefore outputs of the slave
become Q = 0 and Q bar = 1.
Again clock = 1 − Master active, slave inactive. Therefore even with the
changed outputs Q = 0 and Q bar = 1 fed back to master, its output will be
Q1 = 0 and Q1 bar = 1. That means S = 0 and R = 1.
Hence with clock = 0 and slave becoming active the outputs of slave will
remain Q = 0 and Q bar = 1. Thus we get a stable output from the Master
slave.

3 J = 1 and K = 0 (Set)
Clock = 1 − Master active, slave inactive. Therefore outputs of the master
become Q1 = 1 and Q1 bar = 0. That means S = 1 and R =0.
Clock = 0 − Slave active, master inactive. Therefore outputs of the slave
become Q = 1 and Q bar = 0.
Again clock = 1 − then it can be shown that the outputs of the slave are
stabilized to Q = 1 and Q bar = 0.

4 J = K = 1 (Toggle)
Clock = 1 − Master active, slave inactive. Outputs of master will toggle.
So S and R also will be inverted.
Clock = 0 − Slave active, master inactive. Outputs of slave will toggle.
These changed outputs are fed back to the master inputs. But since
clock = 0, the master is still inactive, so it does not respond to these
changed outputs. This avoids the multiple toggling which leads to the race
around condition. The master slave flip flop thus avoids the race around
condition.
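
The two-phase behaviour can be illustrated with a small Python class. This is a behavioural sketch only: clock_high models the master sampling J, K and the fed-back slave output, and clock_low models the slave copying the master. The class and method names are hypothetical.

```python
class MasterSlaveJK:
    """Behavioural model of a master slave JK flip flop."""

    def __init__(self, q=0):
        self.master = q   # output of the master latch (Q1)
        self.q = q        # output of the slave latch (Q)

    def clock_high(self, j, k):
        """Clock = 1: master active, slave inactive."""
        if j and k:
            self.master = 1 - self.q   # toggle relative to the slave output
        elif j:
            self.master = 1            # set
        elif k:
            self.master = 0            # reset
        # J = K = 0: master holds its value

    def clock_low(self):
        """Clock = 0: slave active, master inactive."""
        self.q = self.master

    def pulse(self, j, k):
        """Apply one full clock pulse and return the new Q."""
        self.clock_high(j, k)
        self.clock_low()
        return self.q


ff = MasterSlaveJK()
outputs = [ff.pulse(1, 1), ff.pulse(1, 1), ff.pulse(1, 0),
           ff.pulse(0, 0), ff.pulse(0, 1)]
print(outputs)
```

Because Q only changes in clock_low, calling clock_high repeatedly with J = K = 1 leaves the master at the same toggled value, which mirrors how the arrangement avoids the race around condition.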

Delay Flip Flop / D Flip Flop


The Delay Flip Flop or D Flip Flop is a simple gated S-R latch with a NAND inverter connected
between the S and R inputs. It has only one input. The input data appears at the output after
some time. Because of this delay between input and output, it is called a delay flip flop. S and R
are always complements of each other because of the NAND inverter. Hence the input conditions
S = R = 0 and S = R = 1 can never occur, which avoids the problems associated with the SR = 00
and SR = 11 conditions.

Block Diagram

Circuit Diagram

Truth Table

Operation
S.N. Condition Operation

1 E=0
Latch is disabled. Hence no change in output.

2 E = 1 and D = 0
If E = 1 and D = 0 then S = 0 and R = 1. Hence irrespective of the present state, the
next state is Qn+1 = 0 and Qn+1 bar = 1. This is the reset condition.

3 E = 1 and D = 1
If E = 1 and D = 1, then S = 1 and R = 0. This will set the latch and Qn+1 = 1 and
Qn+1 bar = 0 irrespective of the present state.
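
The three conditions above reduce to a very small next-state function, sketched here in Python. The function name d_latch is an assumption for illustration.

```python
def d_latch(q, d, e):
    """Return the next Q: follow D while enabled, otherwise hold."""
    if e == 0:
        return q   # latch disabled: no change in output
    return d       # E = 1: D = 0 resets, D = 1 sets


print(d_latch(q=1, d=0, e=1))  # reset
print(d_latch(q=0, d=1, e=1))  # set
print(d_latch(q=1, d=0, e=0))  # hold
```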

Toggle Flip Flop / T Flip Flop


A toggle flip flop is basically a JK flip flop with the J and K terminals permanently connected
together. It has only one input, denoted by T, as shown in the Symbol Diagram. The symbol for a
positive edge triggered T flip flop is shown in the Block Diagram.

Symbol Diagram

Block Diagram

Truth Table

Operation
S.N. Condition Operation

1 T = 0, J = K = 0 The output Q and Q bar won't change

2 T = 1, J = K = 1 Output will toggle corresponding to every leading edge of clock signal.
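
As a sketch, the state after each leading clock edge is simply the old state XORed with T. The function name and the sample input sequence are illustrative assumptions.

```python
def t_flip_flop(q, t):
    """Return Q after one leading clock edge: toggle when T = 1, hold when T = 0."""
    return q ^ t


q = 0
history = []
for t in (1, 1, 0, 1):      # T input sampled at four successive clock edges
    q = t_flip_flop(q, t)
    history.append(q)
print(history)
```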


Basics of Programming
Computer programming is the act of writing computer programs. Before getting into computer
programming, let us first understand computer programs and what they do.
A computer program is a sequence of instructions written using a Computer Programming
Language to perform a specified task by the computer.
The two important terms that we have used in the above definition are −

• Sequence of instructions
• Computer Programming Language
To understand these terms, consider a situation where someone asks you how to get to a
nearby KFC. What exactly do you do to tell them the way?
You will use Human Language to describe the way to KFC, something as follows −
First go straight, after half a kilometer take a left at the red light, then drive around one kilometer and you
will find KFC on the right.
Here, you have used the English Language to give several steps to be taken to reach KFC. If they are
followed in the following sequence, then you will reach KFC −
1. Go straight
2. Drive half kilometer
3. Take left
4. Drive around one kilometer
5. Search for KFC at your right side
Now, try to map the situation with a computer program. The above sequence of instructions is
actually a Human Program written in English Language, which instructs on how to reach KFC
from a given starting point. This same sequence could have been given in Spanish, Hindi, Arabic,
or any other human language, provided the person seeking direction knows any of these
languages.
Now, let's go back and try to understand a computer program, which is a sequence of instructions
written in a Computer Language to perform a specified task by the computer. Following is a
simple program written in Python programming Language −
print("Hello, World!")
The above computer program instructs the computer to print "Hello, World!" on the computer
screen.
• A computer program is also called computer software, and it can range from two lines to millions of
lines of instructions.
• Computer program instructions are also called program source code, and computer programming is
also called program coding.
• A computer without a computer program is just a dumb box; it is programs that make computers active.
Just as we have developed many languages to communicate among ourselves, computer scientists
have developed several computer programming languages to provide instructions to the computer
(i.e., to write computer programs). We will see several computer programming languages in the
subsequent chapters.

Introduction to Computer Programming


If you have understood what a computer program is, then we can say: the act of writing computer
programs is called computer programming.
There are hundreds of programming languages which can be used to
write computer programs, and the following are a few of them −

• Java
• C
• C++
• Python
• PHP
• Perl
• Ruby

Uses of Computer Programs


Today computer programs are used in almost every field: household, agriculture, medical,
entertainment, defense, communication, etc. Listed below are a few applications of computer
programs −
• MS Word, MS Excel, Adobe Photoshop, Internet Explorer, Chrome, etc., are examples of computer
programs.
• Computer programs are used to develop graphics and special effects in movie making.
• Computer programs are used to perform Ultrasounds, X-Rays, and other medical examinations.
• Computer programs are used in our mobile phones for SMS, Chat, and voice communication.

Computer Programmer
Someone who can write computer programs, or in other words, someone who can do computer
programming, is called a Computer Programmer.
Based on their expertise in a particular programming language, computer programmers can be
named as follows −

• C Programmer
• C++ Programmer
• Java Programmer
• Python Programmer
• PHP Programmer
• Perl Programmer
• Ruby Programmer

Algorithm
From a programming point of view, an algorithm is a step-by-step procedure to resolve a
problem. An algorithm is an effective method expressed as a finite set of well-defined
instructions.
Thus, a computer programmer lists all the steps required to resolve a problem before
writing the actual code. Following is a simple example of an algorithm to find the largest
number in a given list of numbers −
1. Get a list of numbers L1, L2, L3....LN
2. Assume L1 is the largest, Largest = L1
3. Take next number Li from the list and do the following
4. If Largest is less than Li
5. Largest = Li
6. If Li is last number from the list then
7. Print value stored in Largest and come out
8. Else repeat same process starting from step 3
The above algorithm has been written in a crude way to help beginners understand the concept.
You will come across more standardized ways of writing computer algorithms as you move on to
advanced levels of computer programming.
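As an illustration, the steps of the largest-number algorithm can be sketched in Python roughly as follows. The function name find_largest and the sample list are made up for the example, and the sketch assumes the list is not empty.

```python
def find_largest(numbers):
    """Return the largest value in a non-empty list of numbers."""
    largest = numbers[0]              # step 2: assume the first number is the largest
    for value in numbers[1:]:         # step 3: take each following number in turn
        if largest < value:           # step 4: compare with the largest so far
            largest = value           # step 5: remember the new largest
    return largest                    # step 7: the result after the last number


print(find_largest([12, 45, 7, 45, 3]))
```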
Though Environment Setup is not an element of any Programming Language, it is the first step to
be followed before setting out to write a program.
When we say Environment Setup, we simply mean a base on top of which we can do our
programming. Thus, we need to have the required software installed on our PC,
which will be used to write computer programs, compile, and execute them. For example, if you
need to browse the Internet, then you need the following setup on your machine −

• A working Internet connection to connect to the Internet
• A Web browser such as Internet Explorer, Chrome, Safari, etc.
If you are a PC user, then you will recognize the following screenshot, which we have taken from
the Internet Explorer while browsing tutorialspoint.com.
Similarly, you will need the following setup to start programming in any programming
language.

• A text editor to create computer programs.
• A compiler to compile the programs into binary format.
• An interpreter to execute the programs directly.
In case you don't have sufficient exposure to computers, you may not be able to set up any of
this software yourself. So, we suggest you take help from a technical person around you to set up
the programming environment on your machine from where you can start. But for you, it is
important to understand what these items are.

Text Editor
A text editor is software that is used to write computer programs. A Windows machine normally
has Notepad, which can be used to type programs. You can launch it by following these steps −

Start Icon → All Programs → Accessories → Notepad → Mouse Click on Notepad
It will launch Notepad with the following window −

You can use this software to type your computer program and save it in a file at any location.
You can download and install other good editors like Notepad++, which is freely available.
If you are a Mac user, then you will have TextEdit or you can install some other commercial
editor like BBEdit to start with.

Compiler
You write your computer program using your favorite programming language and save it in a
text file called the program file.
Now let us look in a little more detail at how the computer understands a program written by
you using a programming language. The computer cannot understand your program
directly in its text format, so we need to convert the program into a binary format which
can be understood by the computer.
The conversion from text program to binary file is done by another piece of software called a
Compiler, and this process of conversion from a text formatted program to a binary format file is
called program compilation. Finally, you can execute the binary file to perform the programmed task.
We are not going into the details of a compiler and the different phases of compilation.
The following flow diagram gives an illustration of the process −

So, if you are going to write your program in any such language, which needs compilation like C,
C++, Java and Pascal, etc., then you will need to install their compilers before you start
programming.

Interpreter
We just discussed compilers and the compilation process. Compilers are required in case
you are going to write your program in a programming language that needs to be compiled into
binary format before its execution.
There are other programming languages such as Python, PHP, and Perl, which do not need any
compilation into binary format, rather an interpreter can be used to read such programs line by
line and execute them directly without any further conversion.

So, if you are going to write your programs in PHP, Python, Perl, Ruby, etc., then you will need
to install their interpreters before you start programming.
Let's discuss a very simple but very important concept available in almost all
programming languages, which is called data types. As its name indicates, a data type represents
a type of data which you can process using your computer program. It can be numeric,
alphanumeric, decimal, etc.
Let’s keep Computer Programming aside for a while and take an easy example of adding two
whole numbers 10 & 20, which can be done simply as follows −
10 + 20
Let's take another problem where we want to add two decimal numbers 10.50 & 20.50, which
will be written as follows −
10.50 + 20.50
The two examples are straightforward. Now let's take another example where we want to record
student information in a notebook. Here we would like to record the following information −
Name:
Class:
Section:
Age:
Sex:
Now, let's put one student record as per the given requirement −
Name: Zara Ali
Class: 6th
Section: J
Age: 13
Sex: F
The first example dealt with whole numbers, the second example added two decimal numbers,
whereas the third example deals with a mix of different data. Let's put it as follows −
• Student name "Zara Ali" is a sequence of characters, which is also called a string.
• Student class "6th" has been represented by a mix of a whole number and a string of two characters. Such
a mix is called alphanumeric.
• Student section has been represented by a single character, which is 'J'.
• Student age has been represented by a whole number, which is 13.
• Student sex has been represented by a single character, which is 'F'.
This way, we realize that in our day-to-day life, we deal with different types of data such as
strings, characters, whole numbers (integers), and decimal numbers (floating point numbers).
Similarly, when we write a computer program to process different types of data, we need to
specify its type clearly; otherwise the computer does not understand how different operations can
be performed on that given data. Different programming languages use different keywords to
specify different data types. For example, C and Java programming languages use int to specify
integer data, whereas char specifies a character data type.
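As a rough illustration in Python, the values from the three examples can be stored and their types inspected as below. The variable names are made up for this sketch; note that Python has no separate character type, so a single character such as 'J' is simply a string of length 1.

```python
whole_sum = 10 + 20            # whole numbers are of type int
decimal_sum = 10.50 + 20.50    # decimal numbers are of type float

# The student record from the example, stored as a dictionary
student = {
    "name": "Zara Ali",   # string
    "class": "6th",       # alphanumeric string
    "section": "J",       # single character (a string of length 1)
    "age": 13,            # integer
    "sex": "F",           # single character
}

print(type(whole_sum).__name__, type(decimal_sum).__name__)
for field, value in student.items():
    print(field, type(value).__name__)
```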
Subsequent chapters will show you how to use different data types in different situations. For
now, let's check the important data types available in C, Java, and Python and the keywords we
will use to specify those data types.

C and Java Data Types


C and Java support almost the same set of data types, though Java supports additional data types.
For now, we are taking a few common data types supported by both programming languages −

Type            Keyword  Value range which can be represented by this data type
Character       char     -128 to 127 or 0 to 255
Number          int      -32,768 to 32,767 or -2,147,483,648 to 2,147,483,647
Small Number    short    -32,768 to 32,767
Long Number     long     -2,147,483,648 to 2,147,483,647
Decimal Number  float    1.2E-38 to 3.4E+38 till 6 decimal places

The ranges shown are typical for C and depend on the platform. Java fixes the sizes instead: int is
always 32 bits, long is 64 bits, and char is a 16-bit unsigned type.
These data types are called primitive data types, and you can use them to build more
complex data types, which are called user-defined data types; for example, a string is a
sequence of characters.

Python Data Types


Python has five standard data types, but this programming language does not use any
keyword to specify a particular data type; rather, Python determines the type of a given
value automatically.

• Numbers
• String
• List
• Tuple
• Dictionary
Here, Number covers all types of numbers including decimal numbers, and String represents a
sequence of zero or more characters. For now, let's proceed with these
two data types and skip List, Tuple, and Dictionary, which are advanced data types in Python.

Variables
Variables are the names you give to computer memory locations which are used to store values in
a computer program.
For example, assume you want to store two values 10 and 20 in your program and at a later stage,
you want to use these two values

Keywords
Like int, long, and float, there are many other keywords supported by the C programming language
which we will use for different purposes. Different programming languages provide different sets
of reserved keywords, but there is one important rule common to all programming
languages: we cannot use a reserved keyword to name our variables. This means we cannot
name a variable int or float; these keywords can only be used to specify a variable's
data type.
For example, if you try to use a reserved keyword as a variable name, then
you will get a syntax error.
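The rule can be tried out in Python, whose standard library includes a keyword module. The two helper functions below are hypothetical names written for this sketch; compile() is used so that the syntax error is caught instead of stopping the program.

```python
import keyword


def can_be_variable_name(name):
    """True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


def assignment_compiles(name):
    """Try to compile 'name = 10' and report whether it is valid syntax."""
    try:
        compile(f"{name} = 10", "<example>", "exec")
        return True
    except SyntaxError:
        return False


for candidate in ("total", "while", "class"):
    print(candidate, can_be_variable_name(candidate), assignment_compiles(candidate))
```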

C Programming Reserved Keywords


Here is a table having almost all the keywords supported by C Programming language −

auto      else    long      switch
break     enum    register  typedef
case      extern  return    union
char      float   short     unsigned
const     for     signed    void
continue  goto    sizeof    volatile
default   if      static    while
do        int     struct    _Packed
double

Java Programming Reserved Keywords


Here is a table having almost all the keywords supported by Java Programming language −

abstract  assert        boolean   break
byte      case          catch     char
class     const         continue  default
do        double        else      enum
extends   final         finally   float
for       goto          if        implements
import    instanceof    int       interface
long      native        new       package
private   protected     public    return
short     static        strictfp  super
switch    synchronized  this      throw
throws    transient     try       void
volatile  while

Python Programming Reserved Keywords


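Here is a table having almost all the keywords supported by Python Programming language −

and       exec     not
assert    finally  or
break     for      pass
class     from     print
continue  global   raise
def       if       return
del       import   try
elif      in       while
else      is       with
except    lambda   yield

This list reflects Python 2. In Python 3, exec and print are built-in functions rather than
keywords, and a few keywords such as True, False, None and nonlocal were added.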


Operators
An operator in a programming language is a symbol that tells the compiler or interpreter to
perform a specific mathematical, relational or logical operation and produce a final result. This
chapter explains the concept of operators and takes you through the important
arithmetic and relational operators available in C, Java, and Python.

Arithmetic Operators
Computer programs are widely used for mathematical calculations. We can write a computer
program which does a simple calculation like adding two numbers (2 + 3), and we can also write
a program which evaluates a more complex expression like P(x) = x^4 + 7x^3 - 5x + 9. From basic
arithmetic, you will recall that in the first expression 2 and 3 are operands and + is an
operator. Similar concepts exist in Computer Programming.
Take a look at the following two examples −
2 + 3

P(x) = x^4 + 7x^3 - 5x + 9
These two statements are called arithmetic expressions in a programming language;
the plus and minus signs used in these expressions are called arithmetic operators, and the values
used in these expressions, like 2, 3 and x, are called operands. In their simplest form, such
expressions produce numerical results.
Similarly, a programming language provides various arithmetic operators. The following table
lists down a few of the important arithmetic operators available in C programming language.
Assume variable A holds 10 and variable B holds 20, then −

Operator  Description                                 Example
+         Adds two operands                           A + B will give 30
-         Subtracts second operand from the first     A - B will give -10
*         Multiplies both operands                    A * B will give 200
/         Divides numerator by denominator            B / A will give 2
%         Gives the remainder of an integer division  B % A will give 0
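
The same operations can be tried in Python with A = 10 and B = 20. One difference worth noting as an aside: in Python 3 the / operator always performs real division (B / A gives 2.0), so the sketch uses // for the integer division that C performs on int operands.

```python
A = 10
B = 20

results = {
    "A + B": A + B,     # addition
    "A - B": A - B,     # subtraction
    "A * B": A * B,     # multiplication
    "B // A": B // A,   # integer division, like C's / on int operands
    "B % A": B % A,     # remainder of the integer division
}

for expression, value in results.items():
    print(expression, "=", value)
```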

Relational Operators
Consider a situation where we create two variables and assign them some values as follows −
A = 20
B = 10
Here, it is obvious that variable A is greater than B in values. So, we need the help of some
symbols to write such expressions which are called relational expressions. If we use C
programming language, then it will be written as follows −
(A > B)

Here, we used the symbol >, which is called a relational operator. In their simplest form,
relational operators produce Boolean results, which means the result will be either true or false.
Similarly, a programming language provides various relational operators. The following table lists
a few of the important relational operators available in the C programming language. For the
table, assume variable A holds 10 and variable B holds 20 −

Operator  Description                                                      Example
==        Checks whether the values of the two operands are equal;         (A == B) is not true.
          if yes, the condition becomes true.
!=        Checks whether the values of the two operands are equal;         (A != B) is true.
          if they are not equal, the condition becomes true.
>         Checks whether the value of the left operand is greater than     (A > B) is not true.
          the value of the right operand; if yes, the condition
          becomes true.
<         Checks whether the value of the left operand is less than        (A < B) is true.
          the value of the right operand; if yes, the condition
          becomes true.
>=        Checks whether the value of the left operand is greater than     (A >= B) is not true.
          or equal to the value of the right operand; if yes, the
          condition becomes true.
<=        Checks whether the value of the left operand is less than or     (A <= B) is true.
          equal to the value of the right operand; if yes, the
          condition becomes true.
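
These six operators are written the same way in Python, so the table can be checked with a short sketch using A = 10 and B = 20 (the dictionary name comparisons is just for illustration).

```python
A = 10
B = 20

comparisons = {
    "A == B": A == B,
    "A != B": A != B,
    "A > B": A > B,
    "A < B": A < B,
    "A >= B": A >= B,
    "A <= B": A <= B,
}

for expression, value in comparisons.items():
    print(expression, "is", value)
```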

Logical Operators
Logical operators are very important in any programming language, and they help us make
decisions based on certain conditions. Suppose we want to combine the results of two conditions;
then the logical AND and logical OR operators help us produce the final result.
The following table shows all the logical operators supported by the C language. Assume
variable A holds 1 and variable B holds 0, then −

Operator  Description                                                      Example
&&        Called Logical AND operator. If both the operands are            (A && B) is false.
          non-zero, then the condition becomes true.
||        Called Logical OR operator. If any of the two operands is        (A || B) is true.
          non-zero, then the condition becomes true.
!         Called Logical NOT operator. Used to reverse the logical         !(A && B) is true.
          state of its operand. If a condition is true, then the
          Logical NOT operator will make it false.
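
Python spells these operators as the words and, or and not instead of &&, || and !. The sketch below uses A = 1 and B = 0; bool() is applied because Python's and/or return one of their operands rather than a strict true/false value.

```python
A = 1
B = 0

logical_and = bool(A and B)       # both operands must be non-zero
logical_or = bool(A or B)         # at least one operand must be non-zero
logical_not = not (A and B)       # reverses the result of (A and B)

print(logical_and, logical_or, logical_not)
```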

Decision statements
Decision making is critical to computer programming. There will be many situations when you
are given two or more options and you have to select one based on the given
conditions. For example, suppose we want to print a remark about a student based on the marks secured.
Following is the situation −
Assume given marks are x for a student:

If given marks are more than 95, then
Student is brilliant

If given marks are less than 30, then
Student is poor

If given marks are less than 95 and more than 30, then
Student is average
Now, the question is how to write a programming code to handle such situations. Almost all the
programming languages provide conditional statements that work based on the following flow
diagram −

if...else statement
An if statement can be followed by an optional else statement, which executes when the Boolean
expression is false. The syntax of an if...else statement in C programming language is −
if(boolean_expression) {
   /* Statement(s) will execute if the boolean expression is true */
} else {
   /* Statement(s) will execute if the boolean expression is false */
}
The above syntax can be represented in the form of a flow diagram as shown below −

An if...else statement is useful when we have to take a decision out of two options.

if...elseif...else statement
An if statement can be followed by an optional else if...else statement, which is very useful to test
various conditions.
While using if, else if, else statements, there are a few points to keep in mind −
• An if can have zero or one else, and it must come after any else if's.
• An if can have zero to many else if's, and they must come before the else.
• Once an else if succeeds, none of the remaining else if's or else's will be tested.
The syntax of an if...else if...else statement in C programming language is −
if(boolean_expression 1) {
   /* Executes when the boolean expression 1 is true */
}
else if( boolean_expression 2) {
   /* Executes when the boolean expression 2 is true */
}
else if( boolean_expression 3) {
   /* Executes when the boolean expression 3 is true */
} else {
   /* Executes when none of the above conditions is true */
}
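The student-remark example from the start of this section can be sketched in Python, where else if is written elif. The original conditions do not say what happens at exactly 95 or exactly 30 marks; treating those two values as "average" is an assumption of this sketch, as is the function name remark.

```python
def remark(marks):
    """Return a remark for a student based on the marks secured."""
    if marks > 95:
        return "Student is brilliant"
    elif marks < 30:
        return "Student is poor"
    else:
        # Covers marks from 30 to 95 inclusive (the boundary handling is assumed)
        return "Student is average"


for marks in (98, 55, 12):
    print(marks, remark(marks))
```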
The Switch Statement
A switch statement is an alternative to if statements which allows a variable to be tested for
equality against a list of values. Each value is called a case, and the variable being switched on is
checked against each case. It has the following syntax −
switch(expression){
case ONE :
statement(s);
break;
case TWO:
statement(s);
break;
......

default :
statement(s);
}
The expression used in a switch statement must give an integer value, which is compared
for equality with the different cases given. Whenever the expression value matches a case value,
the body of that case is executed, and the switch is then terminated by
a break statement. If no break statement is provided, execution continues into the
statements of the cases below the matched case. If none of the cases matches, then the
default case body is executed.
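
The effect of break, of leaving it out, and of default can be seen in a minimal sketch such as the following. The function daysInMonth() is a made-up example, and it deliberately ignores leap years.

```c
/* Hypothetical helper: returns the number of days in a month (1-12),
   ignoring leap years; returns 0 for an invalid month number. */
int daysInMonth(int month) {
   int days;

   switch (month) {
      case 4:      /* no break: cases 4, 6 and 9 fall through to case 11 */
      case 6:
      case 9:
      case 11:
         days = 30;
         break;    /* leave the switch */
      case 2:
         days = 28;
         break;
      case 1: case 3: case 5: case 7: case 8: case 10: case 12:
         days = 31;
         break;
      default:     /* executed when no case matches */
         days = 0;
   }
   return days;
}
```

Here several case labels share one body by omitting break between them, which is the intentional use of the fall-through behaviour described above.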
The above syntax can be represented in the form of a flow diagram as shown below −

Looping statements
Almost all the programming languages provide a concept called loop, which helps in executing one or
more statements up to a desired number of times. All high-level programming languages provide
various forms of loops, which can be used to execute one or more statements repeatedly.
The while Loop
A while loop available in C Programming language has the following syntax −
while ( condition ) {
/*....while loop body ....*/
}
The above code can be represented in the form of a flow diagram as shown below −

The following important points are to be noted about a while loop −


 A while loop starts with a keyword while followed by a condition enclosed in ( ).
 After the while() statement comes the body of the loop, enclosed in curly braces {...}.
 A while loop body can have one or more lines of source code to be executed repeatedly.
 If the body of a while loop has just one statement, then it's optional to use curly braces {...}.
 A while loop keeps executing its body as long as the given condition holds true. As soon as the condition
becomes false, the while loop exits and execution continues from the statement immediately
after the while loop body.
 A condition is usually a relational statement, which is evaluated to either true or false. A value equal to
zero is treated as false and any non-zero value works like true.
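
The points above can be sketched with a short function. The name countDigits() is hypothetical; the loop relies on the rule that a zero value ends the loop and any non-zero value keeps it running.

```c
/* Hypothetical helper: counts the decimal digits of n (0 has one digit). */
int countDigits(unsigned int n) {
   int count = 1;          /* every number has at least one digit */

   n = n / 10;
   while (n) {             /* non-zero is treated as true, zero as false */
      count = count + 1;
      n = n / 10;          /* drop the last digit; n eventually becomes 0 */
   }
   return count;
}
```

For example, countDigits(12345) returns 5: the body runs four times, once for each digit after the first.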

The do...while Loop


A while loop checks a given condition before it executes any statements given in the body part. C
programming provides another form of loop, called do...while, that executes the loop body
once before checking the given condition. It has the following syntax −
do {
/*....do...while loop body ....*/
}
while ( condition );
The above code can be represented in the form of a flow diagram as shown below −

The break statement


When the break statement is encountered inside a loop, the loop is immediately terminated and
the program control resumes at the next statement following the loop. The syntax for
a break statement in C is as follows −
break;
A break statement can be represented in the form of a flow diagram as shown below −

The continue statement


The continue statement in C programming language works somewhat like the break statement.
Instead of forcing termination, continue forces the next iteration of the loop to take place,
skipping any code in between. The syntax for a continue statement in C is as follows −
continue;
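
To contrast the two statements, here is a minimal sketch that uses both in one loop. The function name sumEvenUntilNegative() and its stopping rule are assumptions chosen only for illustration.

```c
/* Hypothetical helper: adds up the even values in arr[0..n-1],
   stopping at the first negative value. */
int sumEvenUntilNegative(const int arr[], int n) {
   int i = 0;
   int sum = 0;

   while (i < n) {
      int value = arr[i];
      i = i + 1;            /* advance before any continue, to avoid an endless loop */

      if (value < 0) {
         break;             /* leave the loop entirely */
      }
      if (value % 2 != 0) {
         continue;          /* skip the rest of this iteration */
      }
      sum = sum + value;
   }
   return sum;
}
```

With the values {2, 3, 4, -1, 6} the result is 6: the value 3 is skipped by continue, and the loop ends at -1 through break, so 6 is never reached.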
A continue statement can be represented in the form of a flow diagram as shown below −


Arrays
Consider a situation where we need to store five integer numbers. If we use programming's simple
variable and data type concepts, then we need five variables of int data type. It was simple, because
we had to store just five integer numbers. Now let's assume we have to store 5000 integer
numbers. Are we going to use 5000 variables?
To handle such situations, almost all the programming languages provide a concept called array.
An array is a data structure, which can store a fixed-size collection of elements of the same data
type. An array is used to store a collection of data, but it is often more useful to think of an array
as a collection of variables of the same type.
Instead of declaring individual variables, such as number0, number1, ..., number99, you just
declare one array variable number of integer type and use number[0], number[1], ...,
number[99] to represent the individual variables. Here, 0, 1, 2, ..., 99 are the indices associated
with the number variable, and they are used to refer to the individual elements of the array.
All arrays consist of contiguous memory locations. The lowest address corresponds to the first
element and the highest address to the last element.

Create Arrays
To create an array variable in C, a programmer specifies the type of the elements and the number
of elements to be stored in that array. Given below is a simple syntax to create an array in C
programming −
type arrayName [ arraySize ];
This is called a single-dimensional array. The arraySize must be an integer constant greater than
zero and type can be any valid C data type. For example, to declare a 10-element array
called number of type int, use this statement −

int number[10];
Here, number is an array variable that can hold up to 10 integer numbers.

Initializing Arrays
You can initialize an array in C either one by one or using a single statement as follows −
int number[5] = {10, 20, 30, 40, 50};
The number of values between braces { } cannot be larger than the number of elements that we
declare for the array between square brackets [ ].
If you omit the size of the array, an array just big enough to hold the initialization is created.
Therefore, if you write −
int number[] = {10, 20, 30, 40, 50};
You will create exactly the same array as you did in the previous example. Following is an
example to assign a single element of the array −
number[4] = 50;
The above statement assigns the value 50 to the 5th element of the array. All arrays have
0 as the index of their first element, which is also called the base index, and the last index of an
array is the total size of the array minus 1. The following image shows the pictorial
representation of the array we discussed above −

Accessing Array Elements


An element is accessed by indexing the array name. This is done by placing the index of the
element within square brackets after the name of the array. For example −
int var = number[9];
The above statement takes the 10th element of the array (assuming the 10-element declaration
shown earlier) and assigns its value to the variable var.
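
Indexing is most often combined with a loop that visits every element in turn. The following is a minimal sketch; the function name sumArray() is hypothetical.

```c
/* Hypothetical helper: returns the sum of the first n elements of arr. */
long sumArray(const int arr[], int n) {
   long total = 0;
   int i = 0;

   while (i < n) {          /* valid indices run from 0 to n - 1 */
      total = total + arr[i];
      i = i + 1;
   }
   return total;
}
```

With int number[5] = {10, 20, 30, 40, 50}; the call sumArray(number, 5) returns 150. In the scope where the array is declared, the element count can also be computed as sizeof(number) / sizeof(number[0]).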

Functions
A function is a block of organized, reusable code that is used to perform a single, related action.
Functions provide better modularity for your application and a high degree of code reuse. You
have already seen functions like printf() and main(). printf() is provided by the C standard
library, and main() is the function that every C program defines as its entry point. We can write
our own functions as well, and this tutorial will teach you how to write and use them in the
C programming language.
Functions are known by several names. Different programming languages name them
differently, for example, functions, methods, sub-routines, procedures, etc.
If you come across any such terminology, just think of the same concept.
Let's start with a program where we will define two arrays of numbers and then from each array,
we will find the biggest number. Given below are the steps to find out the maximum number
from a given set of numbers −
1. Get a list of numbers L1, L2, L3....LN
2. Assume L1 is the largest, Set max = L1
3. Take next number Li from the list and do the following
4. If max is less than Li
5. Set max = Li
6. If Li is last number from the list then
7. Print value stored in max and come out
8. Else repeat the same process starting from step 3
Let's translate the above steps into the C programming language −

#include <stdio.h>

int main() {
int set1[5] = {10, 20, 30, 40, 50};
int set2[5] = {101, 201, 301, 401, 501};
int i, max;

/* Process first set of numbers available in set1[] */


max = set1[0];
i = 1;
while( i < 5 ) {
if( max < set1[i] ) {
max = set1[i];
}
i = i + 1;
}

printf("Max in first set = %d\n", max );

/* Now process second set of numbers available in set2[] */


max = set2[0];
i = 1;
while( i < 5 ) {
if( max < set2[i] ) {
max = set2[i];
}
i = i + 1;
}
printf("Max in second set = %d\n", max );
}
When the above code is compiled and executed, it produces the following result −
Max in first set = 50
Max in second set = 501
If you are clear about the above example, then it will become easy to understand why we need a
function. In the above example, there are only two sets of numbers, set1 and set2, but consider a
situation where we have 10 or more similar sets of numbers and need to find the maximum number
in each set. In such a situation, we would have to repeat the processing 10 or more times, and
ultimately the program would become too large with repeated code. To handle such situations, we
write functions, in which we keep the source code that will be used again and again in
our program.
Now, let's see how to define a function in C programming language and then in the subsequent
sections, we will explain how to use them.

Defining a Function
The general form of a function definition in C programming language is as follows −
return_type function_name( parameter list ) {
body of the function

return [expression];
}
A function definition in C programming consists of a function header and a function body. Here
are all the parts of a function −
 Return Type − A function may return a value. The return_type is the data type of the value the
function returns. Some functions perform the desired operations without returning a value. In this case,
the return_type is the keyword void.
 Function Name − This is the actual name of the function. The function name and the parameter list
together constitute the function signature.
 Parameter List − A parameter is like a placeholder. When a function is invoked, you pass a value as a
parameter. This value is referred to as the actual parameter or argument. The parameter list refers to
the type, order, and number of the parameters of a function. Parameters are optional; that is, a function
may contain no parameters.
 Function Body − The function body contains a collection of statements that defines what the function
does.

Calling a Function
While creating a C function, you give a definition of what the function has to do. To use a
function, you will have to call that function to perform a defined task.
Now, let's write the above example with the help of a function −

#include <stdio.h>

int getMax( int set[] ) {


int i, max;

max = set[0];
i = 1;
while( i < 5 ) {
if( max < set[i] ) {
max = set[i];
}
i = i + 1;
}
return max;
}
int main() {
int set1[5] = {10, 20, 30, 40, 50};
int set2[5] = {101, 201, 301, 401, 501};
int max;

/* Process first set of numbers available in set1[] */


max = getMax(set1);
printf("Max in first set = %d\n", max );

/* Now process second set of numbers available in set2[] */


max = getMax(set2);
printf("Max in second set = %d\n", max );
}
When the above code is compiled and executed, it produces the following result −
Max in first set = 50
Max in second set = 501
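
The getMax() function above works only for arrays of exactly five elements, because the loop bound is written into the function. One possible variant, sketched here under the assumption that the caller knows the element count, takes that count as a second parameter. The name getMaxN() is made up to keep it apart from the original.

```c
/* Variant of getMax() that takes the element count as a second
   parameter (an assumption; the original fixes the size at 5).
   The caller must pass n >= 1. */
int getMaxN(const int set[], int n) {
   int max = set[0];        /* assume the first element is the largest */
   int i = 1;

   while (i < n) {
      if (max < set[i]) {
         max = set[i];
      }
      i = i + 1;
   }
   return max;
}
```

The same function can then serve sets of different lengths, for example getMaxN(set1, 5) for the five-element array above or getMaxN(other, 3) for a three-element one.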

When a program calls a function, the program control is transferred to the called function. A
called function performs a defined task and when its return statement is executed or when its
function-ending closing brace is reached, it returns the program control back to the main
program.
To call a function, you simply need to pass the required parameters along with the function name,
and if the function returns a value, then you can store the returned value. For example −
#include <stdio.h>

/* function declaration */
int max(int num1, int num2);

int main () {

/* local variable definition */


int a = 100;
int b = 200;
int ret;

/* calling a function to get max value */


ret = max(a, b);

printf( "Max value is : %d\n", ret );


return 0;
}

/* function returning the max between two numbers */


int max(int num1, int num2) {

/* local variable declaration */


int result;

if (num1 > num2)


result = num1;
else
result = num2;

return result;
}
We have kept max() along with main() and compiled the source code. While running the final
executable, it would produce the following result −
Max value is : 200

Function Arguments
If a function is to use arguments, it must declare variables that accept the values of the arguments.
These variables are called the formal parameters of the function.
Formal parameters behave like other local variables inside the function and are created upon
entry into the function and destroyed upon exit.
While calling a function, there are two ways in which arguments can be passed to a function −

Sr.No. Call Type & Description

1 Call by value

This method copies the actual value of an argument into the formal parameter of the function. In this
case, changes made to the parameter inside the function have no effect on the argument.

2 Call by reference
This method copies the address of an argument into the formal parameter. Inside the function, the
address is used to access the actual argument used in the call. This means that changes made to the
parameter affect the argument.

By default, C uses call by value to pass arguments. In general, it means the code within a
function cannot alter the arguments used to call the function.
A scope in any programming language is a region of the program where a defined variable
exists; beyond that region the variable cannot be accessed. There are three places where variables
can be declared in the C programming language −
 Inside a function or a block, in which case they are called local variables.
 Outside of all functions, in which case they are called global variables.
 In the definition of function parameters, in which case they are called formal parameters.
Let us understand what local variables, global variables, and formal parameters are.

Local Variables
Variables that are declared inside a function or block are called local variables. They can be used
only by statements that are inside that function or block of code. Local variables are not known to
functions outside their own. The following example shows how local variables are used. Here all
the variables a, b, and c are local to main() function.
#include <stdio.h>

int main () {

/* local variable declaration */


int a, b;
int c;

/* actual initialization */
a = 10;
b = 20;
c = a + b;

printf ("value of a = %d, b = %d and c = %d\n", a, b, c);

return 0;
}
Global Variables
Global variables are defined outside a function, usually on top of the program. Global variables
hold their values throughout the lifetime of your program and they can be accessed inside any of
the functions defined for the program.
A global variable can be accessed by any function. That is, a global variable is available for use
throughout your entire program after its declaration. The following program shows how global
variables are used in a program.
#include <stdio.h>

/* global variable declaration */


int g;

int main () {

/* local variable declaration */


int a, b;
/* actual initialization */
a = 10;
b = 20;
g = a + b;

printf ("value of a = %d, b = %d and g = %d\n", a, b, g);

return 0;
}
A program can use the same name for a local and a global variable, but inside the function the
local variable takes precedence. Here is an example −
#include <stdio.h>

/* global variable declaration */


int g = 20;

int main () {

/* local variable declaration */


int g = 10;

printf ("value of g = %d\n", g);

return 0;
}
When the above code is compiled and executed, it produces the following result −
value of g = 10
Formal Parameters
Formal parameters are treated as local variables within a function, and they take precedence over
global variables. Following is an example −
#include <stdio.h>

/* global variable declaration */
int a = 20;

/* function declaration, needed because sum() is defined after main() */
int sum(int a, int b);

int main () {

/* local variable declaration in main function */


int a = 10;
int b = 20;
int c = 0;

printf ("value of a in main() = %d\n", a);


c = sum( a, b);
printf ("value of c in main() = %d\n", c);

return 0;
}

/* function to add two integers */


int sum(int a, int b) {

printf ("value of a in sum() = %d\n", a);


printf ("value of b in sum() = %d\n", b);

return a + b;
}
When the above code is compiled and executed, it produces the following result −
value of a in main() = 10
value of a in sum() = 10
value of b in sum() = 20
value of c in main() = 30
Initializing Local and Global Variables
When a local variable is defined, it is not initialized by the system; you must initialize it yourself.
Global variables are initialized automatically by the system when you define them, as follows −

Data Type    Initial Default Value
int          0
char         '\0'
float        0
double       0
pointer      NULL

It is a good programming practice to initialize variables properly, otherwise your program may
produce unexpected results, because uninitialized variables will take some garbage value already
available at their memory location.
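
The default values listed above can be checked with a short sketch. The variable and function names here are hypothetical. The sketch does not read an uninitialized local variable, because such a value is indeterminate and reading it is a bug.

```c
#include <stddef.h>   /* for NULL */

/* File-scope (global) variables without an initializer start at zero. */
int gCount;
double gRatio;
char gLetter;
int *gPointer;

/* Hypothetical helper: returns 1 when all four globals still hold
   their default values, 0 otherwise. */
int globalsAreZero(void) {
   return gCount == 0 && gRatio == 0.0 && gLetter == '\0' && gPointer == NULL;
}
```

Called before anything assigns to these variables, globalsAreZero() returns 1.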

Parameter Passing Techniques in C/C++


There are different ways in which parameter data can be passed into and out of methods and
functions. Let us assume that a function B() is called from another function A(). In this case A is
called the “caller function” and B is called the “called function or callee function”. Also, the
arguments which A sends to B are called actual arguments and the parameters of B are called
formal arguments.

Terminology

 Formal Parameter : A variable and its type as they appear in the prototype of the function or
method.
 Actual Parameter : The variable or expression corresponding to a formal parameter that appears
in the function or method call in the calling environment.
 Modes:
o IN: Passes info from caller to callee.
o OUT: Callee writes values in caller.
o IN/OUT: Caller tells callee value of variable, which may be updated by callee.

Important methods of Parameter Passing

1. Pass By Value : This method uses in-mode semantics. Changes made to the formal parameter are
not transmitted back to the caller. Any modifications to the formal parameter inside the
called function or method affect only its separate storage location and are not reflected in the
actual parameter in the calling environment. This method is also called call by value.

// C program to illustrate
// call by value
#include <stdio.h>

void func(int a, int b)


{
a += b;
printf("In func, a = %d b = %d\n", a, b);
}
int main(void)
{
int x = 5, y = 7;

// Passing parameters
func(x, y);
printf("In main, x = %d y = %d\n", x, y);
return 0;
}
Output:

In func, a = 12 b = 7
In main, x = 5 y = 7

Languages like C, C++, Java support this type of parameter passing. Java in fact is strictly
call by value.
Shortcomings:

o Inefficiency in storage allocation


o For objects and arrays, the copy semantics are costly
2. Pass by reference (aliasing) : This technique uses in/out-mode semantics. Changes made to the
formal parameter are transmitted back to the caller through parameter passing. Any changes to the
formal parameter are reflected in the actual parameter in the calling environment, because the
formal parameter receives a reference (or pointer) to the actual data. This method is also called
call by reference. It is efficient in both time and space.

// C program to illustrate
// call by reference
#include <stdio.h>

void swapnum(int* i, int* j)


{
int temp = *i;
*i = *j;
*j = temp;
}

int main(void)
{
int a = 10, b = 20;

// passing parameters
swapnum(&a, &b);
printf("a is %d and b is %d\n", a, b);
return 0;
}

Output:

a is 20 and b is 10

C++ supports both call by value and call by reference. C achieves the same effect by passing
pointers, as in the example above, whereas Java doesn't support call by reference.
Shortcomings:

o Many potential scenarios can occur


o Programs are difficult to understand sometimes

Other methods of Parameter Passing

These techniques are older and were used in earlier programming languages like Pascal, Algol and
Fortran. C does not provide them directly.

1. Pass by Result : This method uses out-mode semantics. Just before control is transferred back to
the caller, the value of the formal parameter is transmitted back to the actual parameter. This
method is sometimes called call by result. In general, the pass by result technique is implemented by
copy.
2. Pass by Value-Result : This method uses in/out-mode semantics. It is a combination of
Pass-by-Value and Pass-by-Result. Just before control is transferred back to the caller, the value of the
formal parameter is transmitted back to the actual parameter. This method is sometimes called
call by value-result.
3. Pass by name : This technique is used in programming languages such as Algol. In this technique,
the symbolic “name” of a variable is passed, which allows it to be both accessed and updated.
Example:
To double the value of C[j], you can pass its name (not its value) into the following procedure.

procedure double(x);
real x;
begin
x:=x*2
end;

In general, the effect of pass-by-name is to textually substitute the argument in a procedure
call for the corresponding parameter in the body of the procedure.
Implications of Pass-by-Name mechanism:

o The argument expression is re-evaluated each time the formal parameter is accessed.
o The procedure can change the values of variables used in the argument expression and
hence change the expression’s value.
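
C has no pass-by-name, but a preprocessor macro pastes the argument text into its body, which gives a rough analogy for the re-evaluation behaviour described above. The sketch below is only that analogy, not an implementation of Algol's mechanism; the names nextValue(), TWICE and demoTwice() are made up for illustration.

```c
static int calls = 0;

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
int nextValue(void) {
   calls = calls + 1;
   return calls;
}

/* The macro body mentions its parameter twice, so the argument text
   is pasted in, and evaluated, twice. */
#define TWICE(expr) ((expr) + (expr))

/* Returns 1 + 2 = 3 on the first call, because nextValue() runs two times. */
int demoTwice(void) {
   return TWICE(nextValue());
}
```

Had nextValue() been evaluated once and its value passed, as with call by value, doubling it would have produced 2 rather than 3.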
Recursion
Recursion is the process of repeating items in a self-similar way. In programming languages, if a
program allows you to call a function inside the same function, then it is called a recursive call of
the function.
void recursion() {
recursion(); /* function calls itself */
}

int main() {
recursion();
}
The C programming language supports recursion, i.e., a function can call itself. But while using
recursion, programmers need to be careful to define an exit condition for the function;
otherwise it will keep calling itself without end.
Recursive functions are very useful to solve many mathematical problems, such as calculating the
factorial of a number, generating Fibonacci series, etc.
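
Before the larger examples, here is a minimal sketch of a recursive function with an explicit exit condition. The function sumTo() is a hypothetical example chosen for its simplicity.

```c
/* Hypothetical helper: returns 1 + 2 + ... + n using recursion. */
unsigned int sumTo(unsigned int n) {
   if (n == 0) {
      return 0;               /* exit condition: stops the chain of calls */
   }
   return n + sumTo(n - 1);   /* each call moves n one step closer to 0 */
}
```

For example, sumTo(4) evaluates 4 + 3 + 2 + 1 + 0 and returns 10. Without the check for n == 0, the calls would never stop.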

Number Factorial
The following example calculates the factorial of a given number using a recursive function −
#include <stdio.h>

unsigned long long int factorial(unsigned int i) {

if(i <= 1) {
return 1;
}
return i * factorial(i - 1);
}

int main() {
int i = 12;
printf("Factorial of %d is %llu\n", i, factorial(i));
return 0;
}
When the above code is compiled and executed, it produces the following result −
Factorial of 12 is 479001600
Fibonacci Series
The following example generates the first 10 numbers of the Fibonacci series using a recursive
function −
#include <stdio.h>

int fibonacci(int i) {

if(i == 0) {
return 0;
}

if(i == 1) {
return 1;
}
return fibonacci(i-1) + fibonacci(i-2);
}

int main() {

int i;

for (i = 0; i < 10; i++) {


printf("%d\t\n", fibonacci(i));
}

return 0;
}
When the above code is compiled and executed, it produces the following result −
0
1
1
2
3
5
8
13
21
34
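
The recursive version recomputes the same values many times, so the number of calls grows quickly as i increases. A loop-based alternative, which is not part of the original example, can be sketched as follows. The name fibonacciIter() is hypothetical.

```c
/* Hypothetical helper: returns the i-th Fibonacci number (fib(0) = 0,
   fib(1) = 1) with a loop; assumes i >= 0 and a result that fits in int. */
int fibonacciIter(int i) {
   int prev = 0;    /* fib(0) */
   int curr = 1;    /* fib(1) */
   int k;

   if (i == 0) {
      return 0;
   }
   for (k = 2; k <= i; k++) {
      int next = prev + curr;   /* fib(k) = fib(k-2) + fib(k-1) */
      prev = curr;
      curr = next;
   }
   return curr;
}
```

It produces the same series as the recursive function, for example fibonacciIter(9) returns 34, while doing one addition per step.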

Stack
A stack is an Abstract Data Type (ADT), commonly used in most programming languages. It is
named stack as it behaves like a real-world stack, for example – a deck of cards or a pile of plates,
etc.

A real-world stack allows operations at one end only. For example, we can place or remove a card
or plate from the top of the stack only. Likewise, Stack ADT allows all data operations at one end
only. At any given time, we can only access the top element of a stack.
This feature makes it a LIFO data structure. LIFO stands for Last-in-first-out. Here, the element
which is placed (inserted or added) last is accessed first. In stack terminology, the insertion operation
is called the PUSH operation and the removal operation is called the POP operation.
Stack Representation

The following diagram depicts a stack and its operations −

A stack can be implemented by means of an array, structure, pointer, or linked list. A stack can
either be of fixed size or support dynamic resizing. Here, we are going to
implement the stack using an array, which makes it a fixed-size stack implementation.

Basic Operations

Stack operations may involve initializing the stack, using it and then de-initializing it. Apart from
these basics, a stack is used for the following two primary operations −

 push() − Pushing (storing) an element on the stack.


 pop() − Removing (accessing) an element from the stack.

To use a stack efficiently, we also need to check the status of the stack. For this purpose, the
following functionality is added to stacks −

 peek() − get the top data element of the stack, without removing it.
 isfull() − check if stack is full.
 isempty() − check if stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack. As this pointer always
represents the top of the stack, it is named top. The top pointer lets us read the top value of the
stack without actually removing it.
First we should learn about procedures to support stack functions −

peek()
Algorithm of peek() function −

begin procedure peek


return stack[top]
end procedure

Implementation of peek() function in C programming language −

Example

int peek() {
return stack[top];
}

isfull()
Algorithm of isfull() function −

begin procedure isfull

if top equals to MAXSIZE


return true
else
return false
endif

end procedure

Implementation of isfull() function in C programming language −

Example

bool isfull() {
if(top == MAXSIZE - 1)   /* top starts at -1, so the last valid index is MAXSIZE - 1 */
return true;
else
return false;
}

isempty()
Algorithm of isempty() function −

begin procedure isempty

if top less than 1


return true
else
return false
endif

end procedure

Implementation of the isempty() function in the C programming language is slightly different. We
initialize top at -1, as array indices start from 0. So we check whether top is -1 (that is, below
zero) to determine if the stack is empty. Here's the code −

Example

bool isempty() {
if(top == -1)
return true;
else
return false;
}
Push Operation

The process of putting a new data element onto stack is known as a Push Operation. Push
operation involves a series of steps −

 Step 1 − Checks if the stack is full.


 Step 2 − If the stack is full, produces an error and exits.
 Step 3 − If the stack is not full, increments top to point to the next empty space.
 Step 4 − Adds data element to the stack location, where top is pointing.
 Step 5 − Returns success.

If a linked list is used to implement the stack, then in step 3 we need to allocate space
dynamically.

Algorithm for PUSH Operation


A simple algorithm for Push operation can be derived as follows −

begin procedure push: stack, data


if stack is full
return null
endif

top ← top + 1
stack[top] ← data

end procedure

Implementation of this algorithm in C is straightforward. See the following code −

Example

void push(int data) {


if(!isfull()) {
top = top + 1;
stack[top] = data;
} else {
printf("Could not insert data, Stack is full.\n");
}
}
Pop Operation

Accessing the content while removing it from the stack is known as a Pop Operation. In an array
implementation of the pop() operation, the data element is not actually removed; instead, top is
decremented to a lower position in the stack to point to the next value. But in a linked-list
implementation, pop() actually removes the data element and deallocates memory space.

A Pop operation may involve the following steps −

 Step 1 − Checks if the stack is empty.


 Step 2 − If the stack is empty, produces an error and exits.
 Step 3 − If the stack is not empty, accesses the data element at which top is pointing.
 Step 4 − Decreases the value of top by 1.
 Step 5 − Returns success.

Algorithm for Pop Operation


A simple algorithm for Pop operation can be derived as follows −
begin procedure pop: stack

if stack is empty
return null
endif

data ← stack[top]
top ← top - 1
return data

end procedure

Implementation of this algorithm in C is as follows −

Example

int pop() {
int data;

if(!isempty()) {
data = stack[top];
top = top - 1;
return data;
} else {
printf("Could not retrieve data, Stack is empty.\n");
return -1;   /* sentinel value (an assumption); callers must treat -1 as "empty" */
}
}
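
Putting the pieces together, a self-contained array-based stack might look like the sketch below. It follows the conventions used above (top starts at -1), but two details are assumptions of this sketch and not part of the original code: MAXSIZE is set to 8, and push() and pop() report success or failure through a bool return value, with pop() delivering the element through a pointer parameter instead of returning it.

```c
#include <stdbool.h>

#define MAXSIZE 8          /* capacity chosen arbitrarily for this sketch */

int stack[MAXSIZE];
int top = -1;              /* -1 means the stack is empty */

bool isempty(void) { return top == -1; }
bool isfull(void)  { return top == MAXSIZE - 1; }

/* Stores data on the stack; returns false if the stack is full. */
bool push(int data) {
   if (isfull()) {
      return false;
   }
   top = top + 1;          /* move to the next free slot */
   stack[top] = data;
   return true;
}

/* Copies the top element into *out and removes it;
   returns false (leaving *out untouched) if the stack is empty. */
bool pop(int *out) {
   if (isempty()) {
      return false;
   }
   *out = stack[top];
   top = top - 1;          /* the old slot is simply no longer in use */
   return true;
}
```

Returning a status separately from the data avoids reserving one int value as an "empty" marker. After pushing 10, 20 and 30, three calls to pop() deliver 30, 20 and 10, in that order.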

Queue
A queue is an abstract data structure, somewhat similar to a stack. Unlike a stack, a queue is open at
both its ends. One end is always used to insert data (enqueue) and the other is used to remove data
(dequeue). A queue follows the First-In-First-Out methodology, i.e., the data item stored first will be
accessed first.

A real-world example of a queue is a single-lane one-way road, where the vehicle that enters first
exits first. More real-world examples are the queues at ticket windows and bus stops.

Queue Representation

As we now understand, in a queue we access both ends for different reasons. The following
diagram tries to explain the queue representation as a data structure −

As in stacks, a queue can also be implemented using Arrays, Linked-lists, Pointers and Structures.
For the sake of simplicity, we shall implement queues using one-dimensional array.

Basic Operations

Queue operations may involve initializing or defining the queue, utilizing it, and then completely
erasing it from the memory. Here we shall try to understand the basic operations associated with
queues −

 enqueue() − add (store) an item to the queue.


 dequeue() − remove (access) an item from the queue.

A few more functions are required to make the above-mentioned queue operations efficient. These
are −

 peek() − Gets the element at the front of the queue without removing it.
 isfull() − Checks if the queue is full.
 isempty() − Checks if the queue is empty.

In a queue, we always dequeue (or access) the data pointed to by the front pointer, and while
enqueuing (or storing) data in the queue we use the rear pointer.

Let's first learn about supportive functions of a queue −

peek()
This function helps to see the data at the front of the queue. The algorithm of peek() function is as
follows −

Algorithm

begin procedure peek


return queue[front]
end procedure

Implementation of peek() function in C programming language −

Example

int peek() {
return queue[front];
}
isfull()
As we are using a single-dimensional array to implement the queue, we just check whether the rear
pointer has reached MAXSIZE (index MAXSIZE - 1 in the zero-based C code) to determine that the
queue is full. In case we maintain the queue in a circular
linked-list, the algorithm will differ. Algorithm of isfull() function −

Algorithm

begin procedure isfull

if rear equals to MAXSIZE


return true
else
return false
endif

end procedure

Implementation of isfull() function in C programming language −

Example

bool isfull() {
if(rear == MAXSIZE - 1)
return true;
else
return false;
}

isempty()
Algorithm of isempty() function −

Algorithm

begin procedure isempty

if front is less than MIN OR front is greater than rear


return true
else
return false
endif

end procedure

If the value of front is less than MIN (or 0), the queue has not yet been initialized and is therefore empty.

Here's the C programming code −

Example

bool isempty() {
if(front < 0 || front > rear)
return true;
else
return false;
}
Enqueue Operation

Queues maintain two data pointers, front and rear. Therefore, their operations are comparatively
more difficult to implement than those of stacks.

The following steps should be taken to enqueue (insert) data into a queue −

 Step 1 − Check if the queue is full.


 Step 2 − If the queue is full, produce overflow error and exit.
 Step 3 − If the queue is not full, increment the rear pointer to point to the next empty space.
 Step 4 − Add the data element to the queue location where rear is pointing.
 Step 5 − Return success.

Sometimes, we also check to see if a queue is initialized or not, to handle any unforeseen
situations.

Algorithm for enqueue operation


procedure enqueue(data)

if queue is full
return overflow
endif

rear ← rear + 1
queue[rear] ← data
return true

end procedure

Implementation of enqueue() in C programming language −

Example
int enqueue(int data) {
if(isfull())
return 0;

rear = rear + 1;
queue[rear] = data;

return 1;
}
Dequeue Operation

Accessing data from the queue is a process of two tasks − access the data where front is pointing
and remove the data after access. The following steps are taken to perform dequeue operation −

 Step 1 − Check if the queue is empty.


 Step 2 − If the queue is empty, produce underflow error and exit.
 Step 3 − If the queue is not empty, access the data where front is pointing.
 Step 4 − Increment front pointer to point to the next available data element.
 Step 5 − Return success.

Algorithm for dequeue operation


procedure dequeue

if queue is empty
return underflow
end if

data = queue[front]
front ← front + 1
return true

end procedure

Implementation of dequeue() in C programming language −


Example

int dequeue() {
   //underflow: 0 is returned, so a stored 0 cannot be told apart from an empty queue
   if(isempty())
      return 0;

   int data = queue[front];
   front = front + 1;

   return data;
}
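
The four routines above can be put together as in the following minimal sketch. It assumes a small capacity (MAXSIZE set to 4), front starting at 0 and rear starting at -1, none of which is stated in the text. It also uses a hypothetical variant of dequeue() that reports underflow through its return value and hands the data back through a pointer, so that a stored 0 is not confused with an empty queue.

```c
#include <stdbool.h>

#define MAXSIZE 4              /* assumed capacity, chosen only for illustration */

static int queue[MAXSIZE];
static int front = 0;          /* index of the next element to remove (assumed start) */
static int rear = -1;          /* index of the last element inserted (assumed start) */

bool isfull(void)  { return rear == MAXSIZE - 1; }
bool isempty(void) { return front < 0 || front > rear; }

/* Returns true on success, false on overflow. */
bool enqueue(int data) {
    if (isfull())
        return false;
    rear = rear + 1;
    queue[rear] = data;
    return true;
}

/* Stores the removed value in *out; returns false on underflow. */
bool dequeue(int *out) {
    if (isempty())
        return false;
    *out = queue[front];
    front = front + 1;
    return true;
}
```

Note that in this simple linear layout the slots freed by dequeue() are never reused: once rear reaches MAXSIZE - 1 the queue stays full. A circular queue avoids this.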

Tree
Trees are graphs that do not contain even a single cycle. They represent hierarchical structure in a
graphical form. Trees belong to the simplest class of graphs. Despite their simplicity, they have a
rich structure.

Trees provide a range of useful applications as simple as a family tree to as complex as trees in
data structures of computer science.

Tree

A connected acyclic graph is called a tree. In other words, a connected graph with no cycles is
called a tree.

The edges of a tree are known as branches. Elements of trees are called their nodes. The nodes
without child nodes are called leaf nodes.

A tree with ‘n’ vertices has ‘n-1’ edges. If one more edge is added, it must join two vertices that
are already connected by a path, which forms a cycle. The graph then becomes cyclic and is no
longer a tree.

Example 1
The graph shown here is a tree because it has no cycles and it is connected. It has four vertices and
three edges, i.e., for ‘n’ vertices ‘n-1’ edges as mentioned in the definition.
Note − Every tree with at least two vertices has at least two vertices of degree one.

Example 2
In the above example, the vertices ‘a’ and ‘d’ have degree one, and the other two vertices ‘b’ and
‘c’ have degree two. This is possible because a graph without cycles must have at least two
vertices of degree one: any longest path in the tree has to end at such a vertex on both sides.

Forest

An acyclic graph is called a forest; when it is disconnected, each of its connected components is a
tree. In other words, a disjoint collection of trees is called a forest.

Example
The following graph looks like two sub-graphs; but it is a single disconnected graph. There are no
cycles in this graph. Hence, clearly it is a forest.

Spanning Trees

Let G be a connected graph, then the sub-graph H of G is called a spanning tree of G if −

 H is a tree
 H contains all vertices of G.

Equivalently, a spanning tree T of an undirected graph G is a subgraph of G that is a tree and
includes all of the vertices of G.

Example
In the above example, G is a connected graph and H is a sub-graph of G.

Clearly, the graph H is a tree: it has no cycles and has six edges, which is one less than the total
number of vertices. Hence H is a spanning tree of G.

Circuit Rank

Let ‘G’ be a connected graph with ‘n’ vertices and ‘m’ edges. A spanning tree ‘T’ of G contains
(n-1) edges.

Therefore, the number of edges you need to delete from ‘G’ in order to get a spanning tree is
m – (n – 1), which is called the circuit rank of G.

This formula holds because a spanning tree must have exactly ‘n–1’ edges. Out of the ‘m’ edges of
G, you keep ‘n–1’ edges in the graph.

Hence, subtracting ‘n–1’ from ‘m’ gives the number of edges to be removed from the graph in order
to get a spanning tree, which must not contain a cycle.

Example
Take a look at the following graph −
For the graph given in the above example, you have m=7 edges and n=5 vertices.

Then the circuit rank is

Circuit rank = m – (n – 1)
= 7 – (5 – 1)
= 3

Example
Let ‘G’ be a connected graph with six vertices and the degree of each vertex is three. Find the
circuit rank of ‘G’.

By the sum of degree of vertices theorem,

$\sum_{i=1}^{n} \deg(V_i) = 2|E|$

6 × 3 = 2|E|

|E| = 9

Circuit rank = |E| – (|V| – 1)

= 9 – (6 – 1) = 4
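
Both worked examples can be checked with a few lines of C. The function names below are hypothetical and the sketch assumes the graph is connected, as the definition of circuit rank above requires.

```c
/* Circuit rank of a connected graph with the given numbers of
   vertices (n) and edges (m): m - (n - 1). */
int circuit_rank(int vertices, int edges) {
    return edges - (vertices - 1);
}

/* Number of edges of a graph in which every vertex has the same degree,
   from the sum of degrees theorem: n * degree = 2 * |E|. */
int regular_graph_edges(int vertices, int degree) {
    return vertices * degree / 2;
}
```

For the first example, circuit_rank(5, 7) gives 3; for the second, regular_graph_edges(6, 3) gives 9 edges and circuit_rank(6, 9) gives 4.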

Kirchhoff’s Theorem

Kirchhoff’s theorem is useful in finding the number of spanning trees that can be formed from a
connected graph.

Example

The matrix ‘A’ is filled as follows: if there is an edge between two vertices, the corresponding
entry is ‘1’, else it is ‘0’.

A tree represents nodes connected by edges. We will discuss the binary tree and the binary search
tree specifically.

A binary tree is a special data structure used for data storage purposes. A binary tree has the
special condition that each node can have a maximum of two children. A binary search tree that is
kept balanced combines the benefits of an ordered array and a linked list: search is as quick as in
a sorted array, and insertion and deletion are as fast as in a linked list.

Important Terms

Following are the important terms with respect to tree.

 Path − Path refers to the sequence of nodes along the edges of a tree.
 Root − The node at the top of the tree is called root. There is only one root per tree and one
path from the root node to any node.
 Parent − Any node except the root node has one edge upward to a node called parent.
 Child − The node below a given node connected by its edge downward is called its child
node.
 Leaf − The node which does not have any child node is called the leaf node.
 Subtree − Subtree represents the descendants of a node.
 Visiting − Visiting refers to checking the value of a node when control is on the node.
 Traversing − Traversing means passing through nodes in a specific order.
 Levels − Level of a node represents the generation of a node. If the root node is at level 0,
then its next child node is at level 1, its grandchild is at level 2, and so on.
 Keys − A key is the value of a node on which a search operation is carried out.

Binary Search Tree Representation

A binary search tree exhibits a special behavior. Every value in a node's left subtree must be less
than the node's value, and every value in its right subtree must be greater than the node's value.

We're going to implement a tree using node objects and connecting them through references.

Tree Node

The code to write a tree node would be similar to what is given below. It has a data part and
references to its left and right child nodes.

struct node {
int data;
struct node *leftChild;
struct node *rightChild;
};

In a tree, all nodes share a common construct.

BST Basic Operations

The basic operations that can be performed on a binary search tree data structure, are the following

 Insert − Inserts an element in a tree/create a tree.


 Search − Searches an element in a tree.
 Preorder Traversal − Traverses a tree in a pre-order manner.
 Inorder Traversal − Traverses a tree in an in-order manner.
 Postorder Traversal − Traverses a tree in a post-order manner.

We shall learn creating (inserting into) a tree structure and searching a data item in a tree in this
chapter. We shall learn about tree traversing methods in the coming chapter.

Insert Operation

The very first insertion creates the tree. Afterwards, whenever an element is to be inserted, first
locate its proper location. Start searching from the root node, then if the data is less than the key
value, search for the empty location in the left subtree and insert the data. Otherwise, search for
the empty location in the right subtree and insert the data.

Algorithm
If root is NULL
   then create root node
   return

If root exists then
   compare the data with node.data

   while insertion position is not located

      If data is less than node.data
         goto left subtree
      else
         goto right subtree

   endwhile

   insert data

end If

Implementation
The implementation of insert function should look like this −

void insert(int data) {


struct node *tempNode = (struct node*) malloc(sizeof(struct node));
struct node *current;
struct node *parent;

tempNode->data = data;
tempNode->leftChild = NULL;
tempNode->rightChild = NULL;

//if tree is empty, create root node


if(root == NULL) {
root = tempNode;
} else {
current = root;
parent = NULL;

while(1) {
parent = current;

//go to left of the tree


if(data < parent->data) {
current = current->leftChild;

//insert to the left


if(current == NULL) {
parent->leftChild = tempNode;
return;
}
}

//go to right of the tree


else {
current = current->rightChild;

//insert to the right


if(current == NULL) {
parent->rightChild = tempNode;
return;
}
}
}
}
}
Search Operation

Whenever an element is to be searched, start searching from the root node. Then, if the data is less
than the key value, search for the element in the left subtree. Otherwise, search for the element in
the right subtree. Follow the same algorithm for each node.

Algorithm
If root.data is equal to search.data
return root
else
while data not found

If data is greater than node.data


goto right subtree
else
goto left subtree

If data found
return node
endwhile

return data not found

end if

The implementation of this algorithm should look like this.

struct node* search(int data) {
   struct node *current = root;
   printf("Visiting elements: ");

   //stop when the key is found or there is nowhere left to look
   while(current != NULL && current->data != data) {
      printf("%d ", current->data);

      //go to left tree
      if(current->data > data) {
         current = current->leftChild;
      }
      //else go to right tree
      else {
         current = current->rightChild;
      }
   }

   //current is NULL here when the key was not found
   return current;
}

Traversal is a process to visit all the nodes of a tree, and it may print their values too. Because
all nodes are connected via edges (links), we always start from the root (head) node. That is, we
cannot randomly access a node in a tree. There are three ways which we use to traverse a tree −

 In-order Traversal
 Pre-order Traversal
 Post-order Traversal

Generally, we traverse a tree to search or locate a given item or key in the tree or to print all the
values it contains.

In-order Traversal

In this traversal method, the left subtree is visited first, then the root and later the right sub-tree.
We should always remember that every node may represent a subtree itself.

If a binary search tree is traversed in-order, the output will produce sorted key values in
ascending order.

We start from A, and following in-order traversal, we move to its left subtree B. B is also
traversed in-order. The process goes on until all the nodes are visited. The output of inorder
traversal of this tree will be −

D→B→E→A→F→C→G
Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Visit root node.
Step 3 − Recursively traverse right subtree.
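
The three steps map directly onto a recursive function. The following is a minimal sketch that reuses the struct node layout shown earlier; instead of printing, it collects the visited values into a caller-supplied array (the out and count parameters are assumptions made so that the result is easy to inspect).

```c
#include <stddef.h>

struct node {
    int data;
    struct node *leftChild;
    struct node *rightChild;
};

/* Appends the values of the subtree rooted at n to out[] in in-order
   and returns the updated element count. out[] must be large enough. */
int inorder(const struct node *n, int out[], int count) {
    if (n == NULL)
        return count;
    count = inorder(n->leftChild, out, count);   /* Step 1: left subtree */
    out[count++] = n->data;                      /* Step 2: visit root */
    return inorder(n->rightChild, out, count);   /* Step 3: right subtree */
}
```

Called on the root of a binary search tree with count set to 0, it fills out[] with the keys in ascending order.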
Pre-order Traversal

In this traversal method, the root node is visited first, then the left subtree and finally the right
subtree.

We start from A, and following pre-order traversal, we first visit A itself and then move to its left
subtree B. B is also traversed pre-order. The process goes on until all the nodes are visited. The
output of pre-order traversal of this tree will be −

A→B→D→E→C→F→G

Algorithm
Until all nodes are traversed −
Step 1 − Visit root node.
Step 2 − Recursively traverse left subtree.
Step 3 − Recursively traverse right subtree.
Post-order Traversal

In this traversal method, the root node is visited last, hence the name. First we traverse the left
subtree, then the right subtree and finally the root node.

We start from A, and following Post-order traversal, we first visit the left subtree B. B is also
traversed post-order. The process goes on until all the nodes are visited. The output of post-order
traversal of this tree will be −

D→E→B→F→G→C→A

Algorithm
Until all nodes are traversed −
Step 1 − Recursively traverse left subtree.
Step 2 − Recursively traverse right subtree.
Step 3 − Visit root node.
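
Pre-order and post-order traversal differ from in-order traversal only in where the root is visited. The sketch below uses a hypothetical node type cnode that stores a one-letter label, so that the A to G tree of the examples can be rebuilt; as an assumption for easy checking, the visited labels are written to a caller-supplied buffer rather than printed.

```c
#include <stddef.h>

struct cnode {                 /* hypothetical node type with a one-letter label */
    char label;
    struct cnode *left;
    struct cnode *right;
};

/* Root first, then left subtree, then right subtree. */
int preorder(const struct cnode *n, char out[], int count) {
    if (n == NULL)
        return count;
    out[count++] = n->label;
    count = preorder(n->left, out, count);
    return preorder(n->right, out, count);
}

/* Left subtree, then right subtree, and the root last. */
int postorder(const struct cnode *n, char out[], int count) {
    if (n == NULL)
        return count;
    count = postorder(n->left, out, count);
    count = postorder(n->right, out, count);
    out[count++] = n->label;
    return count;
}
```

For a tree with root A, children B and C, and leaves D, E under B and F, G under C, preorder() yields A B D E C F G and postorder() yields D E B F G C A, matching the outputs given above.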

Searching Algorithms
Searching Algorithms are designed to check for an element or retrieve an element from any data
structure where it is stored. Based on the type of search operation, these algorithms are generally
classified into two categories:

1. Sequential Search: In this, the list or array is traversed sequentially and every element is checked.
For example: Linear Search.
2. Interval Search: These algorithms are specifically designed for searching in sorted data structures.
This type of searching algorithm is much more efficient than Linear Search, as it repeatedly
probes the search structure and narrows the search space, for example by dividing it in half. For
Example: Binary Search.

Linear Search
Problem: Given an array arr[] of n elements, write a function to search a given element x in arr[].
Examples :
Input : arr[] = {10, 20, 80, 30, 60, 50,
110, 100, 130, 170}
x = 110;
Output : 6
Element x is present at index 6

Input : arr[] = {10, 20, 80, 30, 60, 50,
110, 100, 130, 170}
x = 175;
Output : -1
Element x is not present in arr[].
A simple approach is to do linear search, i.e

 Start from the leftmost element of arr[] and one by one compare x with each element of arr[]
 If x matches with an element, return the index.
 If x doesn’t match with any of elements, return -1.
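
The three steps above can be sketched in C as follows. The function name linear_search is an assumption; the parameters follow the problem statement.

```c
/* Returns the index of the first element equal to x, or -1 if x is absent. */
int linear_search(const int arr[], int n, int x) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == x)
            return i;          /* match: report the index */
    }
    return -1;                 /* no element matched */
}
```

With the array from the examples, searching for 110 returns 6 and searching for 175 returns -1.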

Binary Search
Given a sorted array arr[] of n elements, write a function to search a given element x in arr[].
A simple approach is to do a linear search. The time complexity of that algorithm is O(n).
Another approach to perform the same task is using Binary Search.
Binary Search: Search a sorted array by repeatedly dividing the search interval in half. Begin
with an interval covering the whole array. If the value of the search key is less than the item in the
middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half.
Repeatedly check until the value is found or the interval is empty.
Example :


The idea of binary search is to use the information that the array is sorted and reduce the time
complexity to O(Log n).
We basically ignore half of the elements just after one comparison.
1. Compare x with the middle element.
2. If x matches with middle element, we return the mid index.
3. Else If x is greater than the mid element, then x can only lie in right half subarray after the mid
element. So we recur for right half.
4. Else (x is smaller) recur for the left half.
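
A minimal sketch of these steps is given below. It is written with a loop instead of recursion, which is a choice made here and not something the text prescribes; the function name binary_search is also an assumption.

```c
/* Searches the sorted array arr[0..n-1] for x.
   Returns an index holding x, or -1 if x is absent. */
int binary_search(const int arr[], int n, int x) {
    int lo = 0;
    int hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* written this way to avoid overflow of lo + hi */
        if (arr[mid] == x)
            return mid;
        if (arr[mid] < x)
            lo = mid + 1;               /* x can only lie in the right half */
        else
            hi = mid - 1;               /* x can only lie in the left half */
    }
    return -1;                          /* the interval became empty */
}
```

Each pass through the loop discards half of the remaining interval, which is where the O(Log n) bound comes from.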

Jump Search
Like Binary Search, Jump Search is a searching algorithm for sorted arrays. The basic idea is to
check fewer elements (than linear search) by jumping ahead by fixed steps or skipping some
elements in place of searching all elements.
For example, suppose we have an array arr[] of size n and a block (to be jumped) of size m. Then we
probe the elements arr[0], arr[m], arr[2m], …, arr[km] and so on. Once we find the interval
(arr[km] ≤ x < arr[(k+1)m]), we perform a linear search from the index km to find the
element x.
Let’s consider the following array: (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610).
Length of the array is 16. Jump search will find the value of 55 with the following steps assuming
that the block size to be jumped is 4.
STEP 1: Jump from index 0 to index 4;
STEP 2: Jump from index 4 to index 8;
STEP 3: Jump from index 8 to index 12;
STEP 4: Since the element at index 12 is greater than 55 we will jump back a step to come to
index 8.
STEP 5: Perform linear search from index 8 to get the element 55.
What is the optimal block size to be skipped?
In the worst case, we have to do n/m jumps, and if the last checked value is greater than the
element to be searched for, we perform m-1 more comparisons for the linear search. Therefore the
total number of comparisons in the worst case will be ((n/m) + m-1). The value of the function
((n/m) + m-1) will be minimum when m = √n. Therefore, the best step size is m = √n.
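
The following sketch implements jump search with the block size m = √n derived above. The function name jump_search is an assumption, and the integer square root is computed with a small loop so that no math library is needed.

```c
/* Searches the sorted array arr[0..n-1] for x using blocks of size floor(sqrt(n)).
   Returns an index holding x, or -1 if x is absent. */
int jump_search(const int arr[], int n, int x) {
    if (n <= 0)
        return -1;

    int step = 1;
    while ((step + 1) * (step + 1) <= n)
        step++;                          /* step = floor(sqrt(n)) */

    int prev = 0;                        /* start of the current block */
    int next = step;                     /* start of the following block */

    /* Jump ahead while the following block still starts at or below x. */
    while (next < n && arr[next] <= x) {
        prev = next;
        next += step;
    }

    /* Linear search inside the block that may contain x. */
    int end = next < n ? next : n;
    for (int i = prev; i < end; i++) {
        if (arr[i] == x)
            return i;
    }
    return -1;
}
```

For the 16-element array of the example, the block size is 4 and a search for 55 jumps to indexes 4, 8 and 12 before scanning forward from index 8.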

Interpolation Search
Given a sorted array of n uniformly distributed values arr[], write a function to search for a
particular element x in the array.
Linear Search finds the element in O(n) time, Jump Search takes O(√n) time and Binary
Search takes O(Log n) time.
The Interpolation Search is an improvement over Binary Search for instances where the values in
a sorted array are uniformly distributed. Binary Search always goes to the middle element to
check. On the other hand, interpolation search may go to different locations according to the value
of the key being searched. For example, if the value of the key is closer to the last element,
interpolation search is likely to start the search toward the end of the array.
To find the position to be searched, it uses following formula.

// The idea of the formula is to return a higher value of pos
// when the element to be searched is closer to arr[hi], and a
// smaller value when it is closer to arr[lo]
pos = lo + [ (x-arr[lo])*(hi-lo) / (arr[hi]-arr[lo]) ]

arr[] ==> Array where elements need to be searched
x ==> Element to be searched
lo ==> Starting index in arr[]
hi ==> Ending index in arr[]
Algorithm
The rest of the Interpolation Search algorithm is the same as Binary Search, except for the above partition logic.
Step1: In a loop, calculate the value of “pos” using the probe position formula.
Step2: If it is a match, return the index of the item, and exit.
Step3: If the item is less than arr[pos], calculate the probe position of the left sub-array. Otherwise
calculate the same in the right sub-array.
Step4: Repeat until a match is found or the sub-array reduces to zero.
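
A minimal sketch of these steps, using the probe position formula given above, might look as follows. The function name interpolation_search is an assumption, and the guard against arr[hi] being equal to arr[lo] is an addition made here to avoid a division by zero.

```c
/* Searches the sorted array arr[0..n-1] for x by interpolating the probe position.
   Returns an index holding x, or -1 if x is absent. */
int interpolation_search(const int arr[], int n, int x) {
    int lo = 0;
    int hi = n - 1;

    /* x can only be present while it lies within [arr[lo], arr[hi]]. */
    while (lo <= hi && x >= arr[lo] && x <= arr[hi]) {
        if (arr[hi] == arr[lo])                    /* all remaining values are equal */
            return arr[lo] == x ? lo : -1;

        /* Probe position formula; long long avoids overflow in the product. */
        long long num = ((long long)x - arr[lo]) * (hi - lo);
        int pos = lo + (int)(num / ((long long)arr[hi] - arr[lo]));

        if (arr[pos] == x)
            return pos;
        if (arr[pos] < x)
            lo = pos + 1;                          /* continue in the right sub-array */
        else
            hi = pos - 1;                          /* continue in the left sub-array */
    }
    return -1;
}
```

The loop condition also covers Step 4: the search stops as soon as the sub-array is empty or x falls outside its range of values.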

Sublist Search (Search a linked list in another list)


Given two linked lists, the task is to check whether the first list is present in 2nd list or not.
Examples:
Input : list1 = 10->20
list2 = 5->10->20
Output : LIST FOUND

Input : list1 = 1->2->3->4
list2 = 1->2->1->2->3->4
Output : LIST FOUND

Input : list1 = 1->2->3->4
list2 = 1->2->2->1->2->3
Output : LIST NOT FOUND
Algorithm:
1- Take the first node of the second list.
2- Start matching the first list from this node.
3- If the whole first list matches, return true.
4- Else break and move the first list back to its first node.
5- And move the second list to its next node.
6- Repeat these steps until either of the linked lists becomes empty.
7- If the first list becomes empty, the list is found; otherwise it is not.
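
These steps can be sketched in C as below. The node type lnode and the function name find_sublist are hypothetical, and treating an empty first list as always found is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

struct lnode {                 /* hypothetical singly linked list node */
    int data;
    struct lnode *next;
};

/* Returns true if the values of `first` appear consecutively inside `second`. */
bool find_sublist(const struct lnode *first, const struct lnode *second) {
    if (first == NULL)
        return true;           /* assumption: an empty list is found in any list */

    /* Try every node of the second list as a starting point. */
    for (const struct lnode *start = second; start != NULL; start = start->next) {
        const struct lnode *p = first;
        const struct lnode *q = start;
        while (p != NULL && q != NULL && p->data == q->data) {
            p = p->next;
            q = q->next;
        }
        if (p == NULL)         /* the whole first list matched */
            return true;
    }
    return false;
}
```

In the worst case every starting node of the second list is compared against the whole first list, so the running time is proportional to the product of the two lengths.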

Fibonacci Search
Given a sorted array arr[] of size n and an element x to be searched in it. Return index of x if it is
present in array else return -1.
Examples:
Input: arr[] = {2, 3, 4, 10, 40}, x = 10
Output: 3
Element x is present at index 3.

Input: arr[] = {2, 3, 4, 10, 40}, x = 11
Output: -1
Element x is not present.
Fibonacci Search is a comparison-based technique that uses Fibonacci numbers to search an
element in a sorted array.
Similarities with Binary Search:
1. Works for sorted arrays.
2. A Divide and Conquer Algorithm.
3. Has Log n time complexity.
Differences with Binary Search:
1. Fibonacci Search divides the given array into unequal parts.
2. Binary Search uses the division operator to divide the range. Fibonacci Search doesn’t use /, but
uses + and -. The division operator may be costly on some CPUs.
3. Fibonacci Search examines relatively closer elements in subsequent steps. So when the input array
is too big to fit in the CPU cache or even in RAM, Fibonacci Search can be useful.
Background:
Fibonacci Numbers are recursively defined as F(n) = F(n-1) + F(n-2), F(0) = 0, F(1) = 1. First few
Fibonacci Numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
Observations:
The observation below is used for range elimination, and hence for the O(log(n)) complexity.
F(n - 2) ≈ (1/3)*F(n) and
F(n - 1) ≈ (2/3)*F(n).
Algorithm:
Let the searched element be x.
The idea is to first find the smallest Fibonacci number that is greater than or equal to the length of
the given array. Let the found Fibonacci number be fib (the m’th Fibonacci number). We use the (m-2)’th
Fibonacci number as the index (if it is a valid index). Let the (m-2)’th Fibonacci number be i. We
compare arr[i] with x; if they are the same, we return i. Else if x is greater, we recur for the
subarray after i, else we recur for the subarray before i.
Below is the complete algorithm
Let arr[0..n-1] be the input array and element to be searched be x.
1. Find the smallest Fibonacci Number greater than or equal to n. Let this number be fibM [m’th
Fibonacci Number]. Let the two Fibonacci numbers preceding it be fibMm1 [(m-1)’th Fibonacci
Number] and fibMm2 [(m-2)’th Fibonacci Number].
2. While the array has elements to be inspected:
1. Compare x with the last element of the range covered by fibMm2
2. If x matches, return index
3. Else If x is less than the element, move the three Fibonacci variables two Fibonacci down,
indicating elimination of approximately rear two-third of the remaining array.
4. Else x is greater than the element, move the three Fibonacci variables one Fibonacci down.
Reset offset to index. Together these indicate elimination of approximately front one-third of
the remaining array.
3. Since there might be a single element remaining for comparison, check if fibMm1 is 1. If Yes,
compare x with that remaining element. If match, return index.
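
The complete algorithm can be sketched as follows. The variable names fibM, fibMm1 and fibMm2 follow the description above; the function name fibonacci_search, the offset variable marking the eliminated front part, and the bounds check before the final comparison are assumptions of this sketch.

```c
/* Searches the sorted array arr[0..n-1] for x.
   Returns an index holding x, or -1 if x is absent. */
int fibonacci_search(const int arr[], int n, int x) {
    int fibMm2 = 0;                  /* (m-2)'th Fibonacci number */
    int fibMm1 = 1;                  /* (m-1)'th Fibonacci number */
    int fibM = fibMm2 + fibMm1;      /* m'th Fibonacci number */

    /* Step 1: smallest Fibonacci number greater than or equal to n. */
    while (fibM < n) {
        fibMm2 = fibMm1;
        fibMm1 = fibM;
        fibM = fibMm2 + fibMm1;
    }

    int offset = -1;                 /* last index of the eliminated front part */

    /* Step 2: inspect elements while there is a range left. */
    while (fibM > 1) {
        int i = offset + fibMm2;
        if (i > n - 1)
            i = n - 1;               /* keep the probe inside the array */

        if (arr[i] < x) {            /* x is greater: move one Fibonacci down */
            fibM = fibMm1;
            fibMm1 = fibMm2;
            fibMm2 = fibM - fibMm1;
            offset = i;
        } else if (arr[i] > x) {     /* x is less: move two Fibonacci down */
            fibM = fibMm2;
            fibMm1 = fibMm1 - fibMm2;
            fibMm2 = fibM - fibMm1;
        } else {
            return i;
        }
    }

    /* Step 3: a single element may remain for comparison. */
    if (fibMm1 == 1 && offset + 1 < n && arr[offset + 1] == x)
        return offset + 1;

    return -1;
}
```

Only additions and subtractions are used to move between Fibonacci numbers, which is the difference from Binary Search noted above.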

Basics of Sorting
Sorting refers to arranging data in a particular format. Sorting algorithm specifies the way to
arrange data in a particular order. Most common orders are in numerical or lexicographical order.
The importance of sorting lies in the fact that data searching can be optimized to a very high
level, if data is stored in a sorted manner. Sorting is also used to represent data in more readable
formats. Following are some of the examples of sorting in real-life scenarios −
 Telephone Directory − The telephone directory stores the telephone numbers of people sorted by their
names, so that the names can be searched easily.
 Dictionary − The dictionary stores words in alphabetical order so that searching for any word
becomes easy.
In-place Sorting and Not-in-place Sorting
Sorting algorithms may require some extra space for comparison and temporary storage of a few
data elements. Algorithms that need no more than a small, fixed amount of extra space sort the
data in-place, for example within the array itself. This is called in-place sorting. Bubble sort is
an example of in-place sorting.
However, in some sorting algorithms, the program requires space which is more than or equal to
the elements being sorted. Sorting which uses equal or more space is called not-in-place sorting.
Merge-sort is an example of not-in-place sorting.

Stable and Not Stable Sorting


If a sorting algorithm, after sorting the contents, does not change the relative order in which
elements with equal keys appear, it is called stable sorting.

If a sorting algorithm, after sorting the contents, changes the relative order in which elements
with equal keys appear, it is called unstable sorting.

Stability of an algorithm matters when we wish to maintain the original order of equal elements,
for example when sorting tuples by one field.

Adaptive and Non-Adaptive Sorting Algorithm


A sorting algorithm is said to be adaptive if it takes advantage of already 'sorted' elements in the
list that is to be sorted. That is, if the source list has some elements already sorted, adaptive
algorithms will take this into account and will try not to re-order them.
A non-adaptive algorithm is one which does not take into account the elements which are already
sorted. It tries to force every single element to be re-ordered to confirm its sortedness.

Important Terms
Some terms are generally coined while discussing sorting techniques, here is a brief introduction
to them −

Increasing Order
A sequence of values is said to be in increasing order, if the successive element is greater than
the previous one. For example, 1, 3, 4, 6, 8, 9 are in increasing order, as every next element is
greater than the previous element.

Decreasing Order
A sequence of values is said to be in decreasing order, if the successive element is less than the
current one. For example, 9, 8, 6, 4, 3, 1 are in decreasing order, as every next element is less
than the previous element.

Non-Increasing Order
A sequence of values is said to be in non-increasing order, if the successive element is less than
or equal to its previous element in the sequence. This order occurs when the sequence contains
duplicate values. For example, 9, 8, 6, 3, 3, 1 are in non-increasing order, as every next element is
less than or equal to (in case of 3) but not greater than any previous element.

Non-Decreasing Order
A sequence of values is said to be in non-decreasing order, if the successive element is greater
than or equal to its previous element in the sequence. This order occurs when the sequence
contains duplicate values. For example, 1, 3, 3, 6, 8, 9 are in non-decreasing order, as every next
element is greater than or equal to (in case of 3) but not less than the previous one.

Bubble Sort Algorithm


Bubble sort is a simple sorting algorithm. This sorting algorithm is a comparison-based algorithm
in which each pair of adjacent elements is compared and the elements are swapped if they are not
in order. This algorithm is not suitable for large data sets as its average and worst case
complexity are of O(n²), where n is the number of items.

How Bubble Sort Works?


We take an unsorted array for our example. Bubble sort takes O(n²) time, so we're keeping the
example short.
Bubble sort starts with the very first two elements, comparing them to check which one is greater.

In this case, value 33 is greater than 14, so these two are already in sorted positions. Next, we
compare 33 with 27.

We find that 27 is smaller than 33 and these two values must be swapped.

The new array should look like this −

Next we compare 33 and 35. We find that both are in already sorted positions.

Then we move to the next two values, 35 and 10.

We know then that 10 is smaller than 35. Hence they are not sorted.

We swap these values. We find that we have reached the end of the array. After one iteration, the
array should look like this −

To be brief, we now show how the array should look after each iteration. After the
second iteration, it should look like this −

Notice that after each iteration, at least one value moves to the end.
And when no swap is required, bubble sort learns that the array is completely sorted.

Now we should look into some practical aspects of bubble sort.

Algorithm
We assume list is an array of n elements. We further assume that swap function swaps the values
of the given array elements.
begin BubbleSort(list)

   repeat
      swapped = false

      for i = 0 to n-2
         if list[i] > list[i+1]
            swap(list[i], list[i+1])
            swapped = true
         end if
      end for

   until not swapped

   return list

end BubbleSort
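
A minimal C sketch of bubble sort is shown below. It repeats the pass over the list until a pass makes no swap, as the walkthrough above describes; the function name bubble_sort and the shrinking of each pass by one element are choices made here.

```c
#include <stdbool.h>

/* Sorts list[0..n-1] into ascending order in place. */
void bubble_sort(int list[], int n) {
    bool swapped = true;
    /* After each pass the largest remaining value has moved to the end,
       so the next pass can stop one element earlier. */
    for (int pass = 0; pass < n - 1 && swapped; pass++) {
        swapped = false;
        for (int i = 0; i < n - 1 - pass; i++) {
            if (list[i] > list[i + 1]) {
                int tmp = list[i];
                list[i] = list[i + 1];
                list[i + 1] = tmp;
                swapped = true;
            }
        }
    }
}
```

Stopping when no swap occurred means that an already sorted array is handled in a single pass.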

Insertion sort
This is an in-place comparison-based sorting algorithm. Here, a sub-list is maintained which is
always sorted. For example, the lower part of an array is maintained to be sorted. An element
which is to be inserted into this sorted sub-list has to find its appropriate place and then it has to
be inserted there. Hence the name, insertion sort.
The array is searched sequentially and unsorted items are moved and inserted into the sorted
sub-list (in the same array). This algorithm is not suitable for large data sets as its average and
worst case complexity are of O(n²), where n is the number of items.

How Insertion Sort Works?


We take an unsorted array for our example.

Insertion sort compares the first two elements.

It finds that both 14 and 33 are already in ascending order. For now, 14 is in the sorted sub-list.
Insertion sort moves ahead and compares 33 with 27.

And finds that 33 is not in the correct position.

It swaps 33 with 27. It also checks with all the elements of sorted sub-list. Here we see that the
sorted sub-list has only one element 14, and 27 is greater than 14. Hence, the sorted sub-list
remains sorted after swapping.

By now we have 14 and 27 in the sorted sub-list. Next, it compares 33 with 10.

These values are not in a sorted order.

So we swap them.

However, swapping makes 27 and 10 unsorted.

Hence, we swap them too.

Again we find 14 and 10 in an unsorted order.


We swap them again. By the end of third iteration, we have a sorted sub-list of 4 items.

This process goes on until all the unsorted values are covered in a sorted sub-list. Now we shall
see some programming aspects of insertion sort.

Algorithm
Now we have a bigger picture of how this sorting technique works, so we can derive simple steps
by which we can achieve insertion sort.
Step 1 − If it is the first element, it is already sorted.
Step 2 − Pick the next element
Step 3 − Compare it with the elements in the sorted sub-list
Step 4 − Shift all the elements in the sorted sub-list that are greater than the
value to be sorted
Step 5 − Insert the value
Step 6 − Repeat until the list is sorted
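
The steps above can be sketched in C as follows. The function name insertion_sort and the variable names value and hole are assumptions; instead of swapping neighbours one pair at a time, this sketch shifts the larger elements and inserts the value once, which gives the same result.

```c
/* Sorts list[0..n-1] into ascending order in place. */
void insertion_sort(int list[], int n) {
    /* list[0..i-1] is the sorted sub-list; list[i] is the next element to insert. */
    for (int i = 1; i < n; i++) {
        int value = list[i];
        int hole = i;

        /* Shift every element greater than value one place to the right. */
        while (hole > 0 && list[hole - 1] > value) {
            list[hole] = list[hole - 1];
            hole--;
        }

        list[hole] = value;    /* insert the value at its place */
    }
}
```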

Selection sort
Selection sort is a simple sorting algorithm. This sorting algorithm is an in-place comparison-based
algorithm in which the list is divided into two parts, the sorted part at the left end and the
unsorted part at the right end. Initially, the sorted part is empty and the unsorted part is the entire
list.
The smallest element is selected from the unsorted part and swapped with the leftmost element of
the unsorted part, and that element becomes a part of the sorted part. This process continues,
moving the boundary of the unsorted part by one element to the right.
This algorithm is not suitable for large data sets as its average and worst case complexities are of
O(n²), where n is the number of items.

How Selection Sort Works?


Consider the following depicted array as an example.

For the first position in the sorted list, the whole list is scanned sequentially. Starting from the
first position, where 14 is stored presently, we search the whole list and find that 10 is the lowest value.

So we swap 14 with 10. After one iteration 10, which happens to be the minimum value in the
list, appears in the first position of the sorted list.

For the second position, where 33 is residing, we start scanning the rest of the list in a linear
manner.

We find that 14 is the second lowest value in the list and it should appear at the second place. We
swap these values.

After two iterations, two least values are positioned at the beginning in a sorted manner.

The same process is applied to the rest of the items in the array.
Following is a pictorial depiction of the entire sorting process −

Now, let us learn some programming aspects of selection sort.

Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search for the minimum element in the unsorted part of the list, from MIN to the end
Step 3 − Swap it with the value at location MIN
Step 4 − Increment MIN to point to the next element
Step 5 − Repeat until the list is sorted
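
A minimal C sketch of these steps is given below. The function name selection_sort and the variable smallest are assumptions; min plays the role of MIN in the steps above.

```c
/* Sorts list[0..n-1] into ascending order in place. */
void selection_sort(int list[], int n) {
    /* list[0..min-1] is the sorted part; list[min..n-1] is the unsorted part. */
    for (int min = 0; min < n - 1; min++) {
        int smallest = min;

        /* Find the minimum element of the unsorted part. */
        for (int i = min + 1; i < n; i++) {
            if (list[i] < list[smallest])
                smallest = i;
        }

        /* Swap it with the value at location min. */
        if (smallest != min) {
            int tmp = list[min];
            list[min] = list[smallest];
            list[smallest] = tmp;
        }
    }
}
```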

Merge sort
Merge sort is a sorting technique based on the divide and conquer technique. With worst-case time
complexity being O(n log n), it is one of the most respected algorithms.
Merge sort first divides the array into equal halves and then combines them in a sorted manner.

How Merge Sort Works?

To understand merge sort, we take an unsorted array as the following −

We know that merge sort first divides the whole array repeatedly into equal halves until atomic
values are reached. We see here that an array of 8 items is divided into two arrays of size
4.

This does not change the sequence of appearance of items in the original. Now we divide these
two arrays into halves.

We further divide these arrays until we reach atomic values, which cannot be divided any further.

Now, we combine them in exactly the same manner as they were broken down. Please note the
color codes given to these lists.
We first compare the element for each list and then combine them into another list in a sorted
manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10 and in the target
list of 2 values we put 10 first, followed by 27. We change the order of 19 and 35 whereas 42 and
44 are placed sequentially.

In the next iteration of the combining phase, we compare lists of two data values, and merge them
into lists of four data values, placing all in sorted order.

After the final merging, the list should look like this −
Now we should learn some programming aspects of merge sorting.

Algorithm
Merge sort keeps on dividing the list into equal halves until it cannot be divided any further. By
definition, if there is only one element in the list, it is sorted. Then, merge sort combines the smaller
sorted lists, keeping the new list sorted too.
Step 1 − If there is only one element in the list, it is already sorted; return.
Step 2 − Divide the list recursively into two halves until it cannot be divided any further.
Step 3 − Merge the smaller lists into a new list in sorted order.
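
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names merge and merge_sort are chosen for this example, and the sample list is the one inferred from the walkthrough above (14, 33, 27, 10, 35, 19, 42, 44). This version returns a new list rather than sorting in place.

```python
def merge(left, right):
    """Merge two already-sorted lists into one new sorted list (Step 3)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Using <= keeps equal elements in their original order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the two lists still has elements left over.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list containing the elements of items."""
    # Step 1: a list of zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)
    # Step 2: divide the list into two halves and sort each recursively.
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge([14, 33], [10, 27]))
# prints [10, 14, 27, 33]
print(merge_sort([14, 33, 27, 10, 35, 19, 42, 44]))
# prints [10, 14, 19, 27, 33, 35, 42, 44]
```

The first call reproduces one merge from the combining phase described earlier; the second sorts the whole list.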

Quick sort
Quick sort is a highly efficient sorting algorithm based on partitioning an array of data into
smaller arrays. A large array is partitioned into two arrays: one holds values smaller than a
specified value, called the pivot, on which the partition is based, and the other holds
values greater than the pivot value.
Quick sort partitions an array and then calls itself recursively twice to sort the two resulting
subarrays. This algorithm is quite efficient for large data sets: its average-case complexity
is Ο(n log n) and its worst-case complexity is Ο(n²), where n is the number of items.

Partition in Quick Sort


Following animated representation explains how to find the pivot value in an array.

The pivot value divides the list into two parts. We then recursively find the pivot for each
sublist until every list contains only one element.

Quick Sort Pivot Algorithm


Based on our understanding of partitioning in quick sort, we will now try to write an algorithm
for it, which is as follows.
Step 1 − Choose the value at the highest index as pivot
Step 2 − Take two variables to point left and right of the list excluding pivot
Step 3 − left points to the low index
Step 4 − right points to the high index
Step 5 − While value at left is less than pivot, move left to the right
Step 6 − While value at right is greater than pivot, move right to the left
Step 7 − If neither step 5 nor step 6 applies, swap the values at left and right
Step 8 − If left ≥ right, the point where they met is the new position of the pivot
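
The pivot steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names partition and quick_sort and the sample list are assumptions made for this example. Two details are added beyond the steps as written: both pointers advance after each swap so that duplicate values cannot stall the loop, and the pivot is swapped into the meeting point at the end.

```python
def partition(items, low, high):
    """Partition items[low..high] around items[high]; return the pivot's final index."""
    pivot = items[high]          # Step 1: value at the highest index
    left = low                   # Step 3: left starts at the low index
    right = high - 1             # Step 4: right starts at the high end, excluding pivot
    while True:
        # Step 5: move left to the right past values smaller than the pivot.
        while left <= right and items[left] < pivot:
            left += 1
        # Step 6: move right to the left past values greater than the pivot.
        while right >= left and items[right] > pivot:
            right -= 1
        # Step 8: the pointers have met or crossed.
        if left >= right:
            break
        # Step 7: both pointers are stuck, so swap and keep going.
        items[left], items[right] = items[right], items[left]
        left += 1
        right -= 1
    # Place the pivot at the meeting point.
    items[left], items[high] = items[high], items[left]
    return left


def quick_sort(items, low=0, high=None):
    """Sort items in place in ascending order and return the list."""
    if high is None:
        high = len(items) - 1
    if low < high:
        p = partition(items, low, high)
        quick_sort(items, low, p - 1)    # values not greater than the pivot
        quick_sort(items, p + 1, high)   # values not smaller than the pivot
    return items


print(quick_sort([35, 33, 42, 10, 14, 19, 27, 44]))
# prints [10, 14, 19, 27, 33, 35, 42, 44]
```

Choosing the last element as pivot keeps the sketch close to the steps, but it leads to the Ο(n²) worst case on input that is already sorted.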

Graph and its representation
A graph is a pictorial representation of a set of objects where some pairs of objects are connected
by links. The interconnected objects are represented by points termed as vertices, and the links
that connect the vertices are called edges.
Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges,
connecting the pairs of vertices. Take a look at the following graph −

In the above graph,


V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}

Graph Data Structure


Mathematical graphs can be represented as data structures. We can represent a graph using an
array of vertices and a two-dimensional array of edges. Before we proceed further, let's
familiarize ourselves with some important terms −
• Vertex − Each node of the graph is represented as a vertex. In the following example, the labeled circles
represent vertices. Thus, A to G are vertices. We can represent them using an array as shown in the
following image. Here A can be identified by index 0, B by index 1, and so on.
• Edge − An edge represents a path or a line between two vertices. In the following
example, the lines from A to B, B to C, and so on represent edges. We can use a two-dimensional
array to represent the edges as shown in the following image. Here AB can be represented as 1 at row 0,
column 1, BC as 1 at row 1, column 2, and so on, keeping other combinations as 0.
• Adjacency − Two nodes or vertices are adjacent if they are connected to each other through an edge. In
the following example, B is adjacent to A, C is adjacent to B, and so on.
• Path − A path represents a sequence of edges between two vertices. In the following example, ABCD
represents a path from A to D.

Basic Operations
Following are the basic primary operations of a graph −
• Add Vertex − Adds a vertex to the graph.
• Add Edge − Adds an edge between the two vertices of the graph.
• Display Vertex − Displays a vertex of the graph.
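
The array-based representation and the three operations can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class name Graph and the method names add_vertex, add_edge and display_vertex are chosen for this example, and edges are treated as undirected, so the matrix is kept symmetric. The sample data is the graph V = {a, b, c, d, e}, E = {ab, ac, bd, cd, de} given earlier.

```python
class Graph:
    """A graph stored as an array of vertices and a 2-D array of edges."""

    def __init__(self):
        self.vertices = []    # vertices[i] is the label of vertex i
        self.adjacency = []   # adjacency[i][j] is 1 if i and j share an edge

    def add_vertex(self, label):
        """Add a vertex and return its index."""
        # Grow every existing row by one column, then add a new row.
        for row in self.adjacency:
            row.append(0)
        self.vertices.append(label)
        self.adjacency.append([0] * len(self.vertices))
        return len(self.vertices) - 1

    def add_edge(self, start, end):
        """Add an undirected edge between two vertex indices."""
        self.adjacency[start][end] = 1
        self.adjacency[end][start] = 1

    def display_vertex(self, index):
        """Print the label of the vertex at the given index."""
        print(self.vertices[index])


graph = Graph()
index = {label: graph.add_vertex(label) for label in "abcde"}
for u, v in ["ab", "ac", "bd", "cd", "de"]:
    graph.add_edge(index[u], index[v])

graph.display_vertex(0)          # prints a
for row in graph.adjacency:
    print(row)
```

An adjacency matrix uses space proportional to the square of the number of vertices, which is acceptable for small graphs like this one.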

Depth first traversal


The Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses a stack to
remember the next vertex from which to resume the search when a dead end occurs in any iteration.

As in the example given above, the DFS algorithm traverses from S to A to D to G to E to B first,
then to F and lastly to C. It employs the following rules.
• Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a stack.
• Rule 2 − If no adjacent unvisited vertex is found, pop a vertex from the stack. (This pops all the vertices
from the stack that do not have adjacent unvisited vertices.)
• Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

Step 1 − Initialize the stack.
Step 2 − Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S. We
have three nodes and we can pick any of them. For this example, we shall take the nodes in
alphabetical order.
Step 3 − Mark A as visited and put it onto the stack. Explore any unvisited adjacent node from A.
Both S and D are adjacent to A but we are concerned with unvisited nodes only.
Step 4 − Visit D, mark it as visited and put it onto the stack. Here, we have B and C nodes, which are
adjacent to D and both are unvisited. However, we shall again choose in alphabetical order.
Step 5 − We choose B, mark it as visited and put it onto the stack. Here B does not have any unvisited
adjacent node. So, we pop B from the stack.
Step 6 − We check the stack top to return to the previous node and check if it has any unvisited
nodes. Here, we find D to be on the top of the stack.
Step 7 − The only unvisited adjacent node from D is now C. So we visit C, mark it as visited and put it
onto the stack.

As C does not have any unvisited adjacent node, we keep popping the stack until we find a
node that has an unvisited adjacent node. In this case, there is none, and we keep popping until the
stack is empty.
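
The three rules and the walkthrough above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name depth_first_traversal is chosen for this example, and the edge list (S-A, S-B, S-C, A-D, B-D, C-D) is inferred from the walkthrough text rather than taken from the original figure. Neighbours are listed alphabetically so that the code makes the same choices as the walkthrough.

```python
def depth_first_traversal(graph, start):
    """Return the vertices reachable from start in depth-first visiting order."""
    order = [start]      # vertices in the order they were visited
    visited = {start}
    stack = [start]      # a Python list used as a stack
    while stack:         # Rule 3: repeat until the stack is empty
        top = stack[-1]
        unvisited = [v for v in graph[top] if v not in visited]
        if unvisited:
            # Rule 1: visit the first adjacent unvisited vertex and push it.
            nxt = unvisited[0]
            visited.add(nxt)
            order.append(nxt)
            stack.append(nxt)
        else:
            # Rule 2: dead end, so pop and resume from the previous vertex.
            stack.pop()
    return order


# Adjacency lists inferred from the walkthrough (an assumption).
example_graph = {
    "S": ["A", "B", "C"],
    "A": ["D", "S"],
    "B": ["D", "S"],
    "C": ["D", "S"],
    "D": ["A", "B", "C"],
}

print(depth_first_traversal(example_graph, "S"))
# prints ['S', 'A', 'D', 'B', 'C']
```

The visiting order S, A, D, B, C matches steps 2 to 7 of the walkthrough.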

Breadth first traversal


The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue
to remember the next vertex from which to resume the search when a dead end occurs in any iteration.

As in the example given above, the BFS algorithm traverses from A to B to E to F first, then to C and
G, and lastly to D. It employs the following rules.
• Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.
• Rule 2 − If no adjacent unvisited vertex is found, remove the first vertex from the queue.
• Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.

Step 1 − Initialize the queue.
Step 2 − We start by visiting S (the starting node) and mark it as visited.
Step 3 − We then see an unvisited adjacent node from S. In this example, we have three nodes but
alphabetically we choose A, mark it as visited and enqueue it.
Step 4 − Next, the unvisited adjacent node from S is B. We mark it as visited and enqueue it.
Step 5 − Next, the unvisited adjacent node from S is C. We mark it as visited and enqueue it.
Step 6 − Now, S is left with no unvisited adjacent nodes. So, we dequeue and find A.
Step 7 − From A we have D as an unvisited adjacent node. We mark it as visited and enqueue it.

At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm we keep
on dequeuing in order to get all unvisited nodes. When the queue gets emptied, the program is
over.
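
The three rules and the walkthrough above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name breadth_first_traversal is chosen for this example, and the edge list (S-A, S-B, S-C, A-D, B-D, C-D) is inferred from the walkthrough text rather than taken from the original figure. The sketch uses the common formulation in which the start vertex is also placed in the queue; the visiting order is the same as in the walkthrough.

```python
from collections import deque


def breadth_first_traversal(graph, start):
    """Return the vertices reachable from start in breadth-first visiting order."""
    order = [start]          # vertices in the order they were visited
    visited = {start}
    queue = deque([start])
    while queue:             # Rule 3: repeat until the queue is empty
        # Rule 2: take the first vertex out of the queue.
        current = queue.popleft()
        # Rule 1: visit every adjacent unvisited vertex and enqueue it.
        for neighbour in graph[current]:
            if neighbour not in visited:
                visited.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


# Adjacency lists inferred from the walkthrough (an assumption).
example_graph = {
    "S": ["A", "B", "C"],
    "A": ["D", "S"],
    "B": ["D", "S"],
    "C": ["D", "S"],
    "D": ["A", "B", "C"],
}

print(breadth_first_traversal(example_graph, "S"))
# prints ['S', 'A', 'B', 'C', 'D']
```

A deque is used because removing from the front of a plain Python list takes time proportional to its length.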
