Data Structures and Algorithms

The document provides lecture notes on Data Structures and Algorithms, covering fundamental concepts such as data structures, algorithms, data types, and graph theory. It details the organization of data, the characteristics of algorithms, and the significance of records and files in data management. Additionally, it introduces concepts related to sets, relations, and string structures, emphasizing their importance in computing and mathematics.

Uploaded by

Paul Oshos
Copyright
© © All Rights Reserved

DELTA CENTRAL COLLEGE OF MANAGEMENT AND

SCIENCE (DECCOMS)
UGHELLI, DELTA STATE.

in affiliation with,
TEMPLE GATE POLYTECHNIC
ABA, ABIA STATE.

LECTURE NOTES

ON
DATA STRUCTURES AND ALGORITHMS
(COM 124)

BY

MR. PAUL APELEOKHA

CHAPTER ONE
INTRODUCTION
CONCEPT OF DATA STRUCTURES
DATA STRUCTURE
A structure is a manner of building, constructing or organizing; it is
something built or constructed. We also say that a structure is the
arrangement or interrelation of all the parts of a whole.
A data structure is a structure whose elements are items of data and
whose organization is determined both by the relationship between
the data items and by the access functions used to store and retrieve
them. E.g. a vector (array) is linearly ordered and accessed with subscripts.
Data structures can be defined as a way of organizing data items such
that they can easily be stored, referenced/retrieved or manipulated by
a computer program.
A data item is a single unit of values. It is a raw fact which becomes
information after processing. Data items such as a date are called
group items if they can be divided into sub-items; a date, for
instance, is represented by the day, the month and the year. A number,
by contrast, is called an elementary item because it cannot be
sub-divided into sub-items. It is indeed treated as a single item.
An entity is anything that has certain attributes or properties, which
may be assigned values. For example, the following are possible
attributes and their corresponding values for an entity known as
STUDENT.
ATTRIBUTES   NAME   AGE   SEX    MATRIC NO
VALUES       Paul   21    Male   800654
Entities with similar attributes for example, all the 200 level
Computer science & Statistics students form an entity set.
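As a minimal sketch, the STUDENT entity above can be written in Python, assuming a dataclass whose fields stand for the attributes. The class name Student, the field names and the second student are illustrative assumptions, not part of the notes.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """An entity type: each field is one attribute of the STUDENT entity."""
    name: str
    age: int
    sex: str
    matric_no: str


# One entity, using the attribute values given for STUDENT.
paul = Student(name="Paul", age=21, sex="Male", matric_no="800654")

# An entity set: entities that share the same attributes.
# The second student is invented purely for illustration.
entity_set = [paul, Student(name="Ada", age=20, sex="Female", matric_no="800655")]

print(paul.name, paul.matric_no)
print(len(entity_set))
```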
MAIN FUNCTIONS OF DATA STRUCTURES:
• Seek to identify and develop entities, operations and
appropriate classes of problems to use them on.
• Determine representations for abstract entities to implement
abstract operations on concrete representations.
ALGORITHM
An algorithm is a finite set of instructions which, if followed,
accomplishes a particular task. It is a prescribed set of well-defined
rules or instructions for the solution of a problem, such as the
performance of a calculation, in a finite number of steps. A word
closely related to algorithm is program. When an algorithm is
expressed in a language that a computer can understand, it is called
a program, which explains why computer languages are called
programming languages.
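To illustrate the step from algorithm to program, here is a small sketch that expresses one well-known algorithm in Python. Euclid's algorithm for the greatest common divisor is chosen only as an example; it is not taken from the notes.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    while b != 0:          # a definite test decides whether to continue
        a, b = b, a % b    # a definite step; b strictly decreases, so the loop stops
    return a


print(gcd(48, 18))  # prints 6
```

The steps are finite in number, each one is unambiguous, and the same inputs always give the same result.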
CHARACTERISTICS OF ALGORITHM:
• Has a finite set of steps with definite instructions.
• Instructions have a definite order.
• Algorithm must eventually stop.
• Actions are deterministic.
VALUE RANGE
The set of all possible values that could be assigned to a given attribute of an
entity set is called the range of values of the attribute.
GUIDELINES TOWARDS NAMING DATA ITEMS.
1. Give meaningful names that suggest clearly the purpose of the data
item.
2. Use simple names for data items. Use simple letters for loop
counters.
3. Use common prefixes or suffixes to associate names of the same
category. E.g. the records of a student could share the prefix std, as in
std.names, std.course, std.dept, etc. (The same applies to files of the same
software/application.)
4. Avoid choosing names whose meanings are not related to the
problem, etc.
DATA TYPES
In mathematics it is customary to classify variables according to
certain important characteristics. Clear distinctions are made
between real, complex, and logical variables or between variables
representing individual values, or sets of values, or sets of sets, or
between functions, functionals, sets of functions, and so on. This notion
of classification is equally if not more important in data processing.
We will adhere to the principle
that every constant, variable, expression, or function is of a certain
type. This type essentially characterizes the set of values to which a
constant belongs, or which can be assumed by a variable or
expression, or which can be generated by a function. Therefore, a
data type is a set of values together with the operations defined on the
values: {(values), (operations)}. The operations are performed on the
values defined. E.g. for the type integer, -4, -1, 1, 3, 4 are values while
+, -, *, / are operations. Data types also allow us to associate meaning
with a sequence of bits in the computer memory, e.g. the string "A", the integer 5, etc.
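A short sketch of how a type gives meaning to a bit pattern, assuming the ASCII character code. The bit pattern and variable names are chosen only for illustration.

```python
bits = 0b01000001             # one byte of memory: 01000001

as_integer = int(bits)        # the bits read as an integer
as_character = chr(bits)      # the same bits read as a character code

print(as_integer)     # prints 65
print(as_character)   # prints A

# The type also fixes the operations: + adds integers but joins strings.
print(as_integer + 1)       # prints 66
print(as_character + "B")   # prints AB
```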

THE RECORD STRUCTURE


The most general method to obtain structured types is to join
elements of arbitrary types that are possibly themselves structured
types, into a compound. Examples from mathematics are complex
numbers, consisting of two real numbers, and coordinates of points,
consisting of two or more
numbers according to the dimensionality of the space spanned by the
coordinate system. An example from data processing is describing
people by a few relevant characteristics, such as their first and last
names, their date of birth, sex, and marital status. In mathematics
such a compound type is the Cartesian product of its constituent
types. This stems from the fact that the set of values defined by this
compound type consists of all possible combinations of values, taken
one from each set defined by each constituent type. Thus, the number
of such combinations, also called n-tuples, is the product of the
number of elements in each constituent set, that is, the cardinality of
the compound type is the product of the cardinalities of the
constituent types. In data processing, composite types, such as
descriptions of persons or objects, usually occur in files or data banks
and record the relevant characteristics of a person or object. The
word record has therefore become widely accepted to describe a
compound of data of this nature, and we adopt this nomenclature in
preference to the term Cartesian product. In general, a record type T
with components of the types T1, T2, ... , Tn is defined as follows:
TYPE T = RECORD s1: T1; s2: T2; ... sn: Tn END
card(T) = card(T1) * card(T2) * ... * card(Tn)
Examples
TYPE Complex = RECORD re, im: REAL END
TYPE Date = RECORD day, month, year: INTEGER END
TYPE Person = RECORD name, firstname: Name;
birthdate: Date;
sex: (male, female);
marstatus: (single, married, widowed, divorced)
END
We may visualize particular, record-structured values of, for example,
the variables
z: Complex
d: Date
p: Person
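The record types above could be sketched in Python as follows, assuming dataclasses stand in for RECORD and enumerations for the types (male, female) and (single, married, ...). The sample person is invented, and the last lines check the cardinality rule on the two enumerated components only.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Sex(Enum):
    MALE = "male"
    FEMALE = "female"


class MarStatus(Enum):
    SINGLE = "single"
    MARRIED = "married"
    WIDOWED = "widowed"
    DIVORCED = "divorced"


@dataclass
class Date:
    day: int
    month: int
    year: int


@dataclass
class Person:
    name: str
    firstname: str
    birthdate: Date
    sex: Sex
    marstatus: MarStatus


# A record-structured value; the person is hypothetical.
p = Person("Doe", "Jane", Date(1, 4, 1973), Sex.FEMALE, MarStatus.SINGLE)
print(p.birthdate.year)

# Cardinality of a compound type = product of the component cardinalities:
# card(Sex) * card(MarStatus) = 2 * 4 = 8 combinations.
combinations = list(product(Sex, MarStatus))
print(len(combinations))  # prints 8
```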

THE FOLLOWING ARE THE UNITS FOR IDENTIFYING DATA:
CHARACTER, FIELDS, SUB FIELDS, RECORDS, AND FILES.
A file is a collection of logically related records; e.g. students file,
stock file.
A record is a collection of logically related data fields, e.g. the data
relating to one student in a students file. In a database table records are
usually in rows. Therefore, the table below has three (3) records.
A field is a consecutive storage position of values. It is a unit of
data within a record, e.g. student's number, name, age. In a database,
fields are usually the columns of a given table. Data items such as a
date are called group items if they can be divided into sub-items,
otherwise known as sub fields; a date, for instance, is represented by
the day, the month and the year. A number is called an elementary item
because it cannot be sub-divided into sub-items. It is indeed treated
as a single item.
A character is the smallest unit of information. It includes letters,
digits and special symbols such as + (plus sign), - (minus sign), \, /,
$, a, b, …, z, A, B, …, Z, etc. Every character typically requires one byte of
memory for storage in a computer system.

CHAPTER TWO
GRAPH THEORY
INTRODUCTION
Graphs were first used by Leonhard Euler in 1736 to solve the
Königsberg bridge problem. Graphs are used in a wide variety of ways in
computing; the vertices will usually represent objects of some kind
and the edges will represent connections of a physical or logical nature
between the vertices, e.g. computer networks, parse trees, databases,
etc. Other areas where graphs can be applied are the analysis of electrical
circuits, the finding of chemical compounds, statistical analysis,
mechanics, genetics, cybernetics, linguistics, social sciences, etc.
Indeed it might be said that of all mathematical structures, graphs are
the most widely used.
Binary trees provide a very useful way of representing relationships in
which a hierarchy exists. That is, a node is pointed to by at most one
other node and each node points to at most two other nodes. If we
remove the restriction that each node can point to at most two other
nodes, we have a non-binary tree.
If we remove the restriction that each node can be pointed to by at
most one other node, we have a data structure called a directed
graph. If "pointed to" is replaced with "related to", then we have an
undirected graph.
DEFINITIONS AND TERMINOLOGY
A graph is made up of a set of nodes (also called vertices) and a set of
lines called edges that connect the different nodes. A graph G consists
of two sets V and E. V is a finite non-empty set of Vertices. E is a set of
pairs of Vertices. These pairs are called edges. V (G) and E (G) will
represent the sets of vertices and edges of graph G.

We can write G = (V, E) to represent a graph, where
V = {v1, v2, …} and E = {e1, e2, …, en}.
A graph is usually depicted in a pictorial form in which the vertices
appear as dots or other shapes, perhaps labeled for identification
purposes, and the edges are shown as lines joining the appropriate points.
If direction is added to each edge of a graph, a directed graph or
digraph is obtained. The edges then form a finite set of ordered pairs
of distinct vertices, i.e. the edge (v1, v2) is different from the edge (v2,
v1). When there are no directions in a graph it is an undirected
graph. If a graph is directed, the direction of the line is indicated by
which node is listed first. A tree is a special case of a directed graph.
It should be apparent that a tree is really a restricted instance of a
graph satisfying the three conditions below.
1. It is connected.
2. It has no closed circuits.
3. It has a distinguished node called the root.
Due to the above conditions, a tree with V vertices must have V-1
edges.
Some examples of graphs are
V (G1) = {A, B, C, D}
E (G1) = {(A,B), (A,D), (B,C), (B,D)}
Graph G1 (diagram)

V (G2) = {O, J, S, C, K, M, Q, U}
E (G2) = {(O,J), (O,S), (J,C), (J,C), (J,K), (S,Q), (S,U), (U,Q), (M,M)}
Graph G2 (diagram)
Vertices connected by an edge are adjacent. The vertices A and B from
G1 are the end vertices of the edge (A,B). They are said to be adjacent to
each other. Also the edge (A,B) is incident with A and B. In G2 above, J
is adjacent to C but C is not adjacent to J, because G2 is a digraph in
which (J,C) is an edge and (C,J) is not. Rather, C is
adjacent from J. Hence the adjacency property is symmetric in an
undirected graph but not in a digraph.
An edge that joins a vertex to itself is called a loop. Two or more
edges joining the same pair of vertices are called multiple edges.
When a graph has no loops or multiple edges it is called a simple
graph. A simple graph can be connected or disconnected.
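The adjacency property can be checked with a small sketch, assuming a graph is stored as an adjacency list (a dictionary from each vertex to the set of its neighbours). The helper name adjacency is an assumption; the edge lists are those of G1 and part of G2, leaving out the repeated edge and the loop.

```python
from collections import defaultdict


def adjacency(edges, directed=False):
    """Build an adjacency list: vertex -> set of vertices it is adjacent to."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        if not directed:
            adj[v].add(u)   # an undirected edge works in both directions
    return adj


# G1 read as an undirected graph.
g1 = adjacency([("A", "B"), ("A", "D"), ("B", "C"), ("B", "D")])
print("B" in g1["A"], "A" in g1["B"])   # prints True True

# Part of G2 read as a digraph: each pair is ordered.
g2 = adjacency(
    [("O", "J"), ("O", "S"), ("J", "C"), ("J", "K"),
     ("S", "Q"), ("S", "U"), ("U", "Q")],
    directed=True,
)
print("C" in g2["J"], "J" in g2["C"])   # prints True False
```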

RELATIONS AND SYMBOLS


Relations
A relation is determined by specifying all ordered pairs of objects in
that relation; it does not matter by what property the set of these
ordered pairs is described. We are led to the following definition.
Definition. A set R is a relation if all elements of R are ordered pairs,
i.e., if for any z ∈ R there exist x and y such that z = (x, y). It is
customary to write xRy instead of (x, y) ∈ R. We say that x is in
relation R with y if xRy holds. The set of all x which are in relation R
with some y is called the domain of R and denoted by “dom R.” So
dom R = {x | there exists y such that xRy}. dom R is the set of all first
coordinates of ordered pairs in R.
The set of all y such that, for some x, x is in relation R with y is called
the range of R, denoted by “ran R.” So ran R = {y | there exists x such
that xRy}.
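As a small sketch of the definition, a relation can be held as a Python set of ordered pairs (tuples), from which dom R and ran R follow directly. The sample pairs are arbitrary.

```python
# A relation is a set of ordered pairs.
R = {(1, "a"), (1, "b"), (2, "c")}

dom_R = {x for (x, y) in R}   # all first coordinates
ran_R = {y for (x, y) in R}   # all second coordinates

print(sorted(dom_R))   # prints [1, 2]
print(sorted(ran_R))   # prints ['a', 'b', 'c']
print((1, "a") in R)   # xRy holds for x = 1, y = "a": prints True
```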

A symbol is something such as an object, picture, written word, sound,
or particular mark that represents something else by association,
resemblance, or convention. For example, a red octagon may stand for
"STOP". On maps, crossed sabres may indicate a battlefield. Numerals
are symbols for numbers. All language consists of symbols. The word
"cat" is not a cat, but represents the idea of a cat.
Language and symbols
All languages are made up of symbols. Spoken words are the symbols
of mental experience, and written words are the symbols of spoken
words.
The word "cat", for example, whether spoken or written, is not a
literal cat but a sequence of symbols that by convention associate the
word with a concept. Hence, the written or spoken word "cat"
represents (or stands for) a particular concept formed in the mind. A
drawing of a cat, or a stuffed cat, could also serve as a symbol for the
idea of a cat. The study or interpretation of symbols is known as
symbology, and the study of signs is known as semiotics.
An equivalence relation is a binary relation between two elements of a
set which groups them together as being "equivalent" in some way.
Let a, b, and c be arbitrary elements of some set X. Then "a ~ b" or "a
≡ b" denotes that a is equivalent to b. An equivalence relation "~" is
reflexive, symmetric, and transitive. In other words, the following
must hold for "~" to be an equivalence relation on X:
· Reflexivity: a ~ a
· Symmetry: if a ~ b then b ~ a
· Transitivity: if a ~ b and b ~ c then a ~ c.
An equivalence relation partitions a set into several disjoint subsets, called
equivalence classes. All the elements in a given equivalence class are
equivalent among themselves, and no element
is equivalent with any element from a different class.
The equivalence class of a under "~", denoted [a], is the subset of X
consisting of every element b for which a ~ b. X together with "~" is called a setoid.
Examples of equivalence relations
A ubiquitous equivalence relation is the equality ("=") relation
between elements of any set.
Other examples include:
· "Has the same birthday as" on the set of all people, given naive set
theory.
· "Is similar to" or "congruent to" on the set of all triangles.
· "Is congruent to modulo n" on the integers.
· "Has the same image under a function" on the elements of the
domain of the function.
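The sketch below takes "is congruent to modulo n" with n = 3 on the integers 0 to 9, checks the three properties by brute force, and groups the elements into equivalence classes. The function names are illustrative assumptions.

```python
def related(a: int, b: int) -> bool:
    """a ~ b when a and b are congruent modulo 3."""
    return (a - b) % 3 == 0


def equivalence_classes(elements, rel):
    """Group elements into classes, comparing each with one representative."""
    classes = []
    for a in elements:
        for cls in classes:
            if rel(a, cls[0]):      # enough by symmetry and transitivity
                cls.append(a)
                break
        else:                       # no existing class matched
            classes.append([a])
    return classes


X = list(range(10))
reflexive = all(related(a, a) for a in X)
symmetric = all(related(b, a) for a in X for b in X if related(a, b))
transitive = all(
    related(a, c)
    for a in X for b in X for c in X
    if related(a, b) and related(b, c)
)
print(reflexive, symmetric, transitive)   # prints True True True

classes = equivalence_classes(X, related)
print(classes)   # prints [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```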

Composite Relations
If the elements of a set A are related to those of a set B, and those of
B are in turn related to the elements of a set C, then one can expect a
relation between A and C. For example, if Tom is my father (parent-child
relation) and Sarah is a sister of Tom (sister relation), then
Sarah is my aunt (aunt-nephew/niece relation). Composite relations
capture this kind of relationship.
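A minimal sketch of composition, assuming each relation is a set of ordered pairs: (a, c) is in the composite whenever some b links a to c. The relation names has_father, has_sister and has_aunt are illustrative.

```python
def compose(R, S):
    """Composite relation: pairs (a, c) with (a, b) in R and (b, c) in S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


has_father = {("me", "Tom")}      # Tom is my father
has_sister = {("Tom", "Sarah")}   # Sarah is a sister of Tom

has_aunt = compose(has_father, has_sister)
print(has_aunt)   # prints {('me', 'Sarah')}
```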

CHAPTER THREE
SETS RELATION AND STRING STRUCTURE

THE TYPE SET
The objects of study of Set Theory are sets. As sets are fundamental
objects that can be used to define all other concepts in mathematics,
they are not defined in terms of more fundamental concepts. Rather,
sets are introduced either informally, and are understood as
something self-evident, or, as is now standard in modern
mathematics, axiomatically, and their properties are postulated by the
appropriate formal axioms. The language of set theory is based on a
single fundamental relation, called membership. We say that A is a
member of B (in symbols A ∈ B), or that the set B contains A as its
element. The understanding is that a set is determined by its
elements; in other words, two sets are deemed equal if they have
exactly the same elements. In practice, one considers sets of numbers,
sets of points, sets of functions, sets of some other sets and so on. In
theory, it is not necessary to
distinguish between objects that are members and objects that
contain members; the only objects one needs for the theory are sets.
In programming, the type SET denotes sets whose elements are
integers in the range 0 to a small number, typically 31 or 63. Given,
for example, variables
VAR r, s, t: SET
possible assignments are
r := {5}; s := {x, y .. z}; t := {}
Here, the value assigned to r is the singleton set consisting of the
single element 5; to t is assigned the empty set, and to s the elements
x, y, y+1, … , z-1, z.
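The three assignments can be imitated in Python as a rough sketch. Python's built-in set is not limited to small integers, and the values chosen for x, y and z are assumptions made only so the example runs.

```python
x, y, z = 1, 4, 7

r = {5}                             # r := {5}, a singleton set
s = {x} | set(range(y, z + 1))      # s := {x, y .. z}
t = set()                           # t := {}; in Python a bare {} is an empty dict

print(sorted(s))   # prints [1, 4, 5, 6, 7]
print(len(t))      # prints 0
```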
Set Operations
A relation is a set. It is a set of ordered pairs if it is a binary relation,
and it is a set of ordered n-tuples if it is an n-ary relation. Thus all the
set operations, such as union (∪), intersection (∩) and complementing, apply to relations. For
example, the union of the "less than" and "equality" relations on the
set of integers is the "less than or equal to" relation on the set of
integers. The intersection of the "less than" and "less than or equal to"
relations on the set of integers is the "less than" relation on the same
set. The complement of the "less than" relation on the set of integers
is the "greater than or equal to" relation on the same set. Therefore,
the following are the elementary operators defined on variables of
type SET:
* set intersection
+ set union
- set difference
/ symmetric set difference
IN set membership
Constructing the intersection or the union of two sets is often called
set multiplication or set addition, respectively; the priorities of the set
operators are defined accordingly, with the intersection operator
having priority over the union and difference operators, which in turn have
priority over the membership operator, which is classified as a
relational operator. The following are examples of set expressions
and their fully parenthesized equivalents:
r * s + t = (r*s) + t
r - s * t = r - (s*t)
r - s + t = (r-s) + t
r + s / t = r + (s/t)
x IN s + t = x IN (s+t)
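For comparison, the same expressions can be sketched with Python's set operators: & (intersection), | (union), - (difference), ^ (symmetric difference) and in (membership). Python's operator priorities differ from those above (for instance, r - s & t groups as (r - s) & t), so the sketch writes every expression fully parenthesized. The sample sets are arbitrary.

```python
r = {1, 2, 3}
s = {2, 3, 4}
t = {3, 5}

print(sorted((r & s) | t))   # r * s + t = (r*s) + t   -> [2, 3, 5]
print(sorted(r - (s & t)))   # r - s * t = r - (s*t)   -> [1, 2]
print(sorted((r - s) | t))   # r - s + t = (r-s) + t   -> [1, 3, 5]
print(sorted(r | (s ^ t)))   # r + s / t = r + (s/t)   -> [1, 2, 3, 4, 5]
print(5 in (s | t))          # x IN s + t = x IN (s+t) -> True
```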
Sets, Elements, and Subsets
One dictionary has, among the many definitions for set, the following:
a number of things naturally connected by location, formation, or
order in time. Although set holds the record for words with the most
dictionary definitions, there are terms mathematicians choose to leave
undefined, or actually, defined by usage. Set, element, member, and
subset are four such terms which will be discussed in today's lesson.
Today's activity will also explore the concept. Each item inside a set is
termed an element.
The brace symbols { and } are used to enclose the elements in a set.
Each element is a member of the set (or belongs to the set).
The symbol for membership is ∈. It can be read "is an element of" and
looks quite similar to the Greek letter epsilon (ε). A subset is a portion
of a set.
The symbol for subset is ⊂. Some books will allow and use it reversed (⊃);
we will not. A superset is a set that includes other sets. For example:
if A ⊂ B, then A is a subset of B and B is a superset of A. A subset might
have no members, in which case it is termed the null set or empty
set.
The empty set is denoted either by {} or by ∅, a Norwegian letter. The
null set is a subset of every set. Note: a common mistake is to use {∅}
to denote the null set. This is actually a set with one element and that
element is the null set. Since some people slash their zeroes, it is
safest when handwriting to always use the notation {} to denote the
empty or null set. A singleton is a set with only one element. A subset
might contain every member of the original set. In this case it is
termed an improper subset.

A proper subset does not contain every member of the original set.
Sets may be finite, {1, 2, 3,..., 10}, or infinite, {1, 2, 3,...}. The
cardinality of a set A, n(A), is how many elements are in the set. The
symbol ..., called an ellipsis, means to continue in the indicated pattern.
There are 2^n subsets of any set, where n is the set's cardinality
(check it out for n=3!). The power set of a set is the complete set of
subsets of the set. In this class we will consider only safe sets, that is,
any set we consider should be well defined.
There should be no ambiguity as to whether or not an element
belongs to a set. That is why we will avoid things like the village
barber who shaves everyone in the village who does not shave
himself. This results in a contradiction as to whether or not he shaves
himself. Also consider Russell's Paradox: Form the set of sets that are
not members of themselves. It is both true and false that this set must
contain itself. These are examples of ill-defined sets. Sometimes,
instead of listing elements in a set, we use set builder notation: {x | x
is a letter in the word "mathematics"}. The symbol | can be read as
"such that." Sometimes the symbol ⊂ is reserved to mean proper subset
and the symbol ⊆ is used to allow the inclusion of the improper subset.
Compare this with the use of < and ≤ to exclude or include an endpoint.
We will make no such distinction. A set may contain the same
elements as another set. Such sets are equal or identical sets;
element order is unimportant. A = B where A = {m,o,r,e} and B =
{r,o,m,e}; in
general A = B if A ⊂ B and B ⊂ A. Sets may be termed equivalent if they
have the same cardinality. If they are equivalent, a one-to-one
correspondence can be established between their elements.
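The count of subsets and the equality of {m,o,r,e} and {r,o,m,e} can be checked with a short sketch. The helper power_set is an illustrative assumption built on itertools.combinations.

```python
from itertools import combinations


def power_set(elements):
    """Return every subset of the given elements, from the empty set upward."""
    items = list(elements)
    return [set(c) for k in range(len(items) + 1) for c in combinations(items, k)]


subsets = power_set({"a", "b", "c"})
print(len(subsets))   # n = 3, so 2**3 subsets: prints 8

# Set builder notation: {x | x is a letter in the word "mathematics"}
letters = {x for x in "mathematics"}
print(len(letters))   # cardinality: prints 8

# Equal sets: element order is unimportant.
A = {"m", "o", "r", "e"}
B = {"r", "o", "m", "e"}
print(A == B, A <= B and B <= A)   # prints True True
```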
The universal set is chosen arbitrarily, but must be large enough to
include all elements of all sets under discussion. Complementary set,
A', is a set that contains all the elements of the universal set that are
not included in A. The symbol ' can be read "prime." For example: if
U={0, 1, 2, 3, 4, 5, 6,...} and A={0, 2, 4, 6, ...}, then A'={1, 3, 5, ...}.
Such paradoxes as those mentioned above, particularly involving
infinities (discussed in the next lesson), were well known by the
ancient Greeks. During the 19th century, mathematicians were able to
tame such paradoxes and about the turn of the 20th century
Whitehead and Russell started an ambitious project to carefully codify
mathematics. Set theory was developed about this time and serves to
unify the many branches of mathematics.

Although in 1931 Kurt Gödel showed this approach to be fatally
flawed, it is still a good way to explore areas of mathematics such as:
arithmetic, number theory, [abstract] algebra, geometry, probability,
etc. Geometry has a long history of such systematic study.
The ancient Greek Euclid similarly codified the mathematics of his
time into 13 books called The Elements. Although these books were
not limited to Geometry that is what they are best known for. In fact,
up until about my grandfather’s day, The Elements was the textbook
of choice for the study of Geometry! The Elements carefully separated
the assumptions and definitions from what was to be proved. The
concept of proof dates back another couple hundred years to the
ancient Greek Pythagoras and his school, the Pythagorean School.
Intersection and Union
Once we have created the concept of a set, we can manipulate sets in
useful ways termed set operations. Consider the following sets:
animals, birds, and white things. Some animals are white: polar bears,
mountain goats, big horn sheep, for example. Some birds are white:
doves, storks, sea gulls. Some white things are not birds or animals (but
birds are animals!): snow, milk, wedding gowns (usually).
The intersection of sets is those elements
which belong to all intersected sets. Although we usually intersect
only two sets, the definition above is general. The symbol for
intersection is ∩. The union of sets is those elements which belong to any
set in the union.
Again, although we usually form the union of only two sets, the
definition above is general. The symbol for union is ∪. For the example
given above, we can see that:
{white things} ∩ {birds} = {white birds}
{white animals} ∪ {birds} = {white animals and all birds}
{white birds} ⊂ {white animals} ⊂ {animals}
Another name for intersection is conjunction. This comes from the
fact that an element must be a member of set A and set B to be a
member of A ∩ B. Another name for union is disjunction. This comes
from the fact that an element must be a member of set A or set B to
be a member of A ∪ B. Conjunction and disjunction are grammar terms and date back to
when Latin was widely used.
I should note the very mathematical use of the word or in the
sentence above. Common usage now of the word or means one or the
other, but not both (excludes both).
Mathematicians and computer scientists on the other hand mean one
or the other, possibly both (including both). This ambiguity can cause
all kinds of problems! Mathematicians term the former exclusive or
(EOR or XOR) and the latter inclusive or. We will see ands & ors again
in numbers lesson 6 on truth tables.
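The difference between inclusive or and exclusive or can be seen in a small sketch, first on truth values and then on sets. The set members are illustrative, and "crow" is added only to have a bird that is not white.

```python
# Truth values: inclusive or accepts "both", exclusive or rejects it.
for a in (False, True):
    for b in (False, True):
        inclusive = a or b
        exclusive = a != b          # XOR on booleans
        print(a, b, inclusive, exclusive)

# Sets: union uses inclusive or, symmetric difference uses exclusive or.
white_things = {"polar bear", "dove", "snow", "milk"}
birds = {"dove", "stork", "crow"}

in_either = white_things | birds    # in one set or the other, possibly both
in_one_only = white_things ^ birds  # in one set or the other, but not both
in_both = white_things & birds      # in one set and the other

print(sorted(in_both))   # prints ['dove']
```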
Representation of Sets
A set s is conveniently represented in a computer store by its
characteristic function C(s). This is an array of logical values whose
i-th component has the meaning “i is present in s”. As an example,
the set of small integers s = {2, 3, 5, 7, 11, 13} is represented by the
sequence of bits, that is, by a bit string: C(s) = (… 0010100010101100)
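A brief sketch of the characteristic function, assuming the bit string is held in a Python integer with bit i set when i is present in s. The function name characteristic is an assumption.

```python
def characteristic(s):
    """Return an integer whose i-th bit is 1 exactly when i is in s."""
    mask = 0
    for i in s:
        mask |= 1 << i
    return mask


s = {2, 3, 5, 7, 11, 13}
c = characteristic(s)
print(format(c, "016b"))   # prints 0010100010101100
```

Bit 0 is the rightmost digit, which is why the string is read from right to left.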
The representation of sets by their characteristic function has the
advantage that the operations of computing the union, intersection,
and difference of two sets may be implemented as elementary logical
operations. The following equivalences, which hold for all elements i
of the base type of the sets x and y, relate logical operations with
operations on sets:
i IN (x+y) = (i IN x) OR (i IN y)
i IN (x*y) = (i IN x) & (i IN y)
i IN (x-y) = (i IN x) & ~(i IN y)
These logical operations are available on all digital computers, and
moreover they operate concurrently on all corresponding elements
(bits) of a word. It therefore appears that in order to be able to
implement the basic set operations in an efficient manner, sets must
be represented in a small, fixed number of words upon which not only
the basic logical operations, but also those of shifting are available.
Testing for membership is then implemented by a single shift and a
subsequent (sign) bit test operation. As a consequence, a test of the
form x IN {c1, c2, ... , cn} can be implemented considerably more
efficiently than the equivalent Boolean expression
(x = c1) OR (x = c2) OR ... OR (x = cn)
A corollary is that the set structure should be used only for small
integers as elements, the largest one being the word length of the
underlying computer (minus 1).
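The equivalences and the shift-based membership test can be tried out in a sketch that uses Python integers as bit strings. Python integers are not limited to one machine word, so this shows only the logic, not the word-length restriction; the helper names are assumptions.

```python
def to_mask(s):
    """Bit string of a set of small non-negative integers."""
    mask = 0
    for i in s:
        mask |= 1 << i
    return mask


def to_set(mask):
    """Inverse of to_mask, for a non-negative mask."""
    return {i for i in range(mask.bit_length()) if (mask >> i) & 1}


def member(i, mask):
    """i IN x: one shift followed by a test of the lowest bit."""
    return ((mask >> i) & 1) == 1


x = to_mask({1, 2, 3})
y = to_mask({2, 3, 4})

print(sorted(to_set(x | y)))    # x + y (union):        [1, 2, 3, 4]
print(sorted(to_set(x & y)))    # x * y (intersection): [2, 3]
print(sorted(to_set(x & ~y)))   # x - y (difference):   [1]
print(member(3, x), member(4, x))   # prints True False
```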

CHAPTER FOUR
SORTING AND SEARCHING TECHNIQUES
INTRODUCTION
Sorting an array of values is a commonly occurring programming
task. A lecturer may want to sort exam scores to see a ranking of the
students in the class. There is every need for us to present
information in a usable form. The telephone directory is a very good
example of a book with entries ordered in alphabetical sequence by name. Knowing
the name of an individual, we go to the appropriate section of the
directory and search several pages until a match is found for the
name; the telephone number can then be found. Another example is
the dictionary, which is a large book of ordered words. Names in
telephone directories and matric numbers in a sorted student database
are called keys.
In this chapter we shall discuss two or more sorting techniques.
Sort keys must be capable of being ordered, i.e. either K1 < K2, K1 = K2 or
K1 > K2. Sort keys can serve as unique identifiers in records.

There are many situations in which data items within elements of an
array need to be arranged in ascending or descending order of key.
The order chosen is usually based upon the need of the user.
Reasons why stored data/information must be kept in ordered format
are as follows.
1. To provide fast access to the information, given an appropriate key.
2. To allow an orderly presentation of information when producing
reports.
3. To simplify changes, insertions, amendments and deletions to the
information without destroying the key order of the remaining
information.
4. To assist in searching.
SORTING
Sorting techniques can be categorized into two, namely internal and
external. Internal methods are used when the file to be
sorted is small enough for the entire sort to be carried out in
the main memory. External sorting, on the other hand, is used when
the file to be sorted is too large to fit into the main memory. The
data are then stored on an external secondary storage medium such as tape or
diskettes, and successive parts of the data are sorted in the main
memory. The sorting technique used by a programmer will be
determined by any of the following:
a. The amount of data/information to be sorted
b. The configuration of the computer
c. The nature of the data or application
d. The urgency of the results
Examples of sorting techniques
Internal sorting     External sorting
Bubble sort          Merge sort
Insertion sort         - 2-way merge
Quick sort             - k-way merge
Merge sort
Heap sort
Selection sort
Radix sort
In this section, we shall explain two sorting methods namely: Bubble
sort and insertion sort.
INSERTION SORT
Insertion sort builds the sorted list one element at a time. Each
element is compared with the element before it, and when the
second is lower than the first, a swap is performed. The comparison
is repeated towards the front of the list, so that the element is
swapped/moved to its correct position among the elements already
sorted. This continues until the entire list is sorted.
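A minimal sketch of insertion sort in Python, using the swap-based form described above. The function name and the choice to return a sorted copy rather than sort in place are assumptions.

```python
def insertion_sort(items):
    """Return a new list with the items in ascending order."""
    a = list(items)
    for i in range(1, len(a)):
        j = i
        # Swap the element backwards while it is lower than the one before it.
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))   # prints [1, 2, 3, 4, 5, 6]
```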
BUBBLE SORTING
The bubble sort technique sorts data in passes. When data are sorted,
the maximum or minimum value can easily be located. The bubble
sort resembles a selection sort but uses a different scheme for finding the
minimum/maximum value: it exchanges adjacent elements. The number of
passes in the worst case is
one less than the number of items in the array. Each iteration
puts the smallest unsorted element in its correct place but it also
changes the location of other elements in the array. The
first iteration will put the smallest element of the array in the first
position. We start at the bottom of the array and compare
successive pairs of adjacent elements, swapping whenever the bottom element
of the pair is smaller than the one above it. In this way the smallest
element “bubbles” up to the top of the array. This technique is
repeated for all other elements. Below is the algorithm or
subroutine/segment for bubble sort technique.
The positions of elements in the array are altered by the inner loop if
the condition being tested is true. By changing the relational operator to
less than (<) in the IF statement the list will be sorted in descending
order. There will be N-1 comparisons the first time through the outer
loop, N-2 comparisons the second time, and so on.
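One possible Python sketch of the bubble sort described above, assuming the inner loop scans from the bottom of the array so that the smallest element bubbles to the top on each pass. The early exit when a pass makes no swap is an added refinement, not something the notes require.

```python
def bubble_sort(items):
    """Return a new list sorted in ascending order using bubble sort."""
    a = list(items)
    n = len(a)
    for i in range(n - 1):                 # at most N-1 passes
        swapped = False
        for j in range(n - 1, i, -1):      # N-1 comparisons, then N-2, ...
            if a[j - 1] > a[j]:            # use < here for descending order
                a[j - 1], a[j] = a[j], a[j - 1]
                swapped = True
        if not swapped:                    # already sorted: stop early
            break
    return a


print(bubble_sort([32, 9, 25, 11, 18]))   # prints [9, 11, 18, 25, 32]
```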
SEARCHING
Data are usually stored in computer memory or mass storage devices
so that they may be examined later for a variety of reasons. For
example, records of airline reservations are stored so that they can be
confirmed at a later date, lists of students admitted into a university
are stored to determine eligibility for enrolling in courses, etc. Given
all these lists and other examples, the need might arise for us to search
for a particular item. Searching is a useful and common processing
activity and there are many methods which can be used. The way the
records are arranged and the choice of method used for searching can
make a substantial difference in the program's performance.

BINARY SEARCH
For binary search to be performed, the list must be presorted. This
prerequisite is what makes it a faster technique. In binary search an
element A is compared with the element in the middle of the array. If
they are the same, A is found; otherwise the search is restricted to
the first or second half of the array, depending on whether A is less
than or greater than the middle element of the array. Each time a
comparison is performed, the number of items left to search is half as
many as it was before the comparison. Thus one comparison reduces the
size of the list yet to be searched to N/2 items, and two comparisons
reduce it to N/4 items. This rapid reduction in the size of the list
remaining to be searched is the reason why the binary search technique
is very efficient.

The list is repeatedly split into two, hence the name binary.
The algorithm for the binary search technique is stated below.
1. Set high = N {N = number of items}
   Low = 1
2. If high < low then item x is not in the list; stop
3. Middle = (high + low) div 2 {to determine the middle subscript in
   the array}
4. If item x = Array(middle) then item is found; stop
5. If item x < Array(middle) then
   High = middle - 1
   Go to step 2 {lower part of array}
6. If item x > Array(middle) then
   Low = middle + 1
   Go to step 2 {upper part of array}
7. Stop
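The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name binary_search is chosen for the example, positions are reported 1-based to match the algorithm, and 0 is returned (an assumption of this sketch) when the item is not in the list.

```python
def binary_search(array, x):
    """Return the 1-based position of x in a sorted list, or 0 if absent."""
    low, high = 1, len(array)
    while low <= high:
        middle = (low + high) // 2      # middle subscript
        value = array[middle - 1]       # Python lists are 0-based
        if x == value:
            return middle               # item found
        if x < value:
            high = middle - 1           # continue in the lower part
        else:
            low = middle + 1            # continue in the upper part
    return 0                            # high < low: item not in the list


numbers = [9, 11, 16, 18, 25, 29, 32, 35]
print(binary_search(numbers, 25))
```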

Example of Binary Search


(a) Number searched for = 25
    Found = False

    Value:  9   11   16   18   25   29   32   35
    Index:  1    2    3    4    5    6    7    8

(b) First (Low) = 1, Last (High) = 8
    First <= Last and not found
    Midpoint = (1+8) div 2 = 4
    List(4) = 18 < 25, so move First to Midpoint + 1 = 5

(c) First = 5, Last = 8
    First <= Last and not found
    Midpoint = (5+8) div 2 = 6
    List(6) = 29 > 25, so set Last to Midpoint - 1 = 5

(d) First = 5, Last = 5
    First <= Last and not found
    Midpoint = (5+5) div 2 = 5
    List(5) = 25, so the item is found and Found = True

CHAPTER FIVE
LISTS
A list is a collection of homogeneous elements with a linear
relationship between the elements. This means that each element in the
list except the first one has a unique predecessor and each element
except the last one has a unique successor. The order of the elements
in the list affects its accessing function: if the list is in
ascending order, the successor of any element is greater than or equal
to that element. An array is one way of implementing a list, but a
list can also be seen as an abstract data type.
As an abstract data type, a list consists of a collection of
homogeneous data arranged in a sequence one after the other. Lists
provide a flexible way of handling data items serially. Elements in a
list may be names of students, scores, letters of the alphabet, etc.
Examples are:
(MON, TUE, WED, THUR, FRI, SAT, SUN) days of the week
(2, 4, 6, 8, 10) even integers
(JAN, FEB, MAR, APR, ….) months of the year
TYPES OF LISTS
Depending on how a list is constructed, we may have a linear
(sequential) list, a circular list or a linked list. Also, depending
on the restrictions imposed on how elements may be deleted from or
inserted into a list, it may be seen as a stack or a queue. We shall
discuss these various types below.
LINEAR LISTS
A linear list is one in which one of the elements (the last element)
points at no other element. This means that the list is discontinuous
after the last element. A linear list therefore experiences
discontinuity at the terminal points (beginning and end). Elements can
easily be added to it at the end of the list. Diagrammatically, it is
represented as below.

2 4 6 8 10 12
Data can be accessed directly or randomly.
In a sequential/linear list, the physical arrangement of the elements
matches their logical ordering. From the above, we have a list of even
numbers beginning with 2 in the first slot, 4 in the next slot and so
on. For any element in the list, the location of its successor is
implicitly defined: it is the element in the next physical slot. This
has a close resemblance to arrays. In fact, arrays are a type of
linear list of elements.

LINKED LIST
In computer science, a linked list is one of the fundamental data
structures, and can be used to implement other data structures. It
consists of a sequence of nodes, each containing arbitrary data fields
and one or two references “links” pointing to the next and/or previous
nodes. The principal benefit of a linked list over a conventional array
is that the order of the linked items may be different from the order
that the data items are stored in memory or on disk, allowing the list
of items to be traversed (visited) in a different order. Linked lists
permit insertion and removal of nodes at any point in the list in
constant time, but do not allow random access. Several different types
of linked list exist; singly-linked lists, doubly-linked lists, and
circularly-linked lists.
Linked lists were developed in 1955-56 by Allen Newell, Cliff Shaw
and Herbert Simon at RAND Corporation as the primary data
structure for their Information Processing Language.
Linked lists are used as a building block for many other data
structures, such as stacks, queues and their variations.

Link Data

A NODE
Types of linked lists
Singly-linked list
The simplest kind of linked list is a singly-linked list (or slist for
short), which has one link per node. This link points to the next node
in the list, or to a null value or empty list if it is the final node.

Doubly-linked list
A more sophisticated kind of linked list is a doubly-linked list or
two-way linked list. Each node has two links: one points to the
previous node, or points to a null value or empty list if it is the first

node; and one points to the next, or points to a null value or empty list
if it is the final node.
Circularly-linked list
In a circularly-linked list, the first and final nodes are linked
together. This can be done for both singly and doubly linked lists. To
traverse a circular linked list, you begin at any node and follow the
links until you return to the original node. Viewed another way,
circularly-linked lists can be seen as having no beginning or end.
This type of list is most useful for managing buffers
for data ingest, and in cases where you have one object in a list and
wish to see all other objects in the list.
The pointer pointing to the whole list may be called the access
pointer.
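The traversal of a circularly-linked list can be sketched in Python as follows. This is only a sketch for a singly circularly-linked list; the names Node, make_circular and traverse_from are assumptions made for the example and do not come from these notes.

```python
class Node:
    """A node with a data field and one link to the next node."""

    def __init__(self, data):
        self.data = data
        self.next = None


def make_circular(values):
    """Build a circular list and return the access pointer (None if empty)."""
    head = prev = None
    for value in values:
        node = Node(value)
        if head is None:
            head = node
        else:
            prev.next = node
        prev = node
    if prev is not None:
        prev.next = head        # the final node links back to the first
    return head


def traverse_from(start):
    """Visit every node once, beginning at any node, until we return to it."""
    visited = []
    node = start
    while node is not None:
        visited.append(node.data)
        node = node.next
        if node is start:       # back at the original node: stop
            break
    return visited


access = make_circular(["A", "B", "C", "D"])
print(traverse_from(access))            # begins at A
print(traverse_from(access.next.next))  # begins at C
```

Because the list has no real beginning or end, starting at C still reaches every other object in the list.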
Doubly-circularly-linked list
In a doubly-circularly-linked list, each node has two links, similar
to a doubly-linked list, except that the previous link of the first node
points to the last node and the next link of the last node points to the
first node. As in doubly-linked lists, insertions and removals can be
done at any point with access to any nearby node. Although
structurally a doubly-circularly-linked list has no beginning or end, an
external access pointer may formally establish the pointed node to be
the head node or the tail node, and maintain order just as well as a
doubly-linked list with sentinel nodes. Draw a diagram to illustrate
this type of linked list.
Sentinel Nodes
Linked lists sometimes have a special dummy or sentinel node at the
beginning and/or at the end of the list, which is not used to store data.
Its purpose is to simplify or speed up some operations, by ensuring
that every data node always has a previous and/or next node, and that
every list (even one that contains no data elements) always has a
“first” and “last” node.

LINKED LISTS VS. ARRAYS


Linked lists have several advantages over arrays. Elements can be
inserted into linked lists indefinitely, while an array will eventually
either fill up or need to be resized, an expensive operation that may
not even be possible if memory is fragmented. Similarly, an array

from which many elements are removed may become wastefully
empty or need to be made smaller.
On the other hand, arrays allow random access, while linked lists
allow only sequential access to elements. Singly-linked lists, in fact,
can only be traversed in one direction. This makes linked lists
unsuitable for applications where it’s useful to look up an element by
its index quickly, such as heapsort. Sequential access on arrays is also
faster than on linked lists on many machines due to locality of
reference and data caches. Linked lists receive almost no benefit from
the cache.
Another disadvantage of linked lists is the extra storage needed for
references, which often makes them impractical for lists of small data
items such as characters or Boolean values. It can also be slow, and
with a naive allocator, wasteful, to allocate memory separately for
each new element, a problem generally solved using memory pools.
A number of linked list variants exist that aim to ameliorate some of
the above problems. Unrolled linked lists store several elements in
each list node, increasing cache performance while decreasing
memory overhead for references. CDR coding does both these as well,
by replacing references with the actual data referenced, which
extends off the end of the referencing record.
A good example that highlights the pros and cons of using arrays vs.
linked lists is by implementing a program that resolves the Josephus
problem. The Josephus problem is an election method that works by
having a group of people stand in a circle. Starting at a
predetermined person, you count around the circle n times. Once you
reach nth person, take them out of the circle and have the members
close the circle. Then count around the circle the same n times and
repeat the process, until only one person is left. That person wins the
election. This shows the strengths and weaknesses of a linked list vs.
an array, because if you view the people as connected nodes in a
circular linked list then it shows how easily the linked list is able to
delete nodes (as it only has to rearrange the links to the different
nodes).
However, the linked list will be poor at finding the next person to
remove and will need to step through the list until it finds that
person. An array, on the other hand, will be poor at deleting nodes
(or elements), as it cannot remove one node without individually
shifting all the elements up the list by one. However, it is
exceptionally easy to find the nth person in the circle by directly
referencing them by their position in the array.
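The trade-off described above can be sketched in Python with two versions of the Josephus problem. This is a sketch under stated assumptions: the function names are chosen for the example, counting starts at the first person (who is counted as 1), and the group is assumed to be non-empty.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


def josephus_linked(people, n):
    """Circular linked list: removal is one link change, finding is a walk."""
    head = prev = None
    for person in people:
        node = Node(person)
        if head is None:
            head = node
        else:
            prev.next = node
        prev = node
    prev.next = head                    # close the circle
    # prev is now the node just before the person where counting starts
    while prev.next is not prev:
        for _ in range(n - 1):          # step through the list node by node
            prev = prev.next
        prev.next = prev.next.next      # unlink the nth person
    return prev.data


def josephus_array(people, n):
    """Array: finding the nth person is direct, removal shifts elements."""
    circle = list(people)
    index = 0
    while len(circle) > 1:
        index = (index + n - 1) % len(circle)   # direct reference by position
        circle.pop(index)                       # later elements shift up by one
    return circle[0]


print(josephus_linked(range(1, 8), 3))
print(josephus_array(range(1, 8), 3))
```

Both versions elect the same person; they differ only in which step (finding or removing) is the cheap one.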
LINKED LIST OPERATIONS
Singly-linked lists
Our node data structure will have two fields. We also keep a variable
firstNode which always points to the first node in the list, or is null for
an empty list.
The following code inserts a node after an existing node in a singly
linked list. The diagram shows how it works. Inserting a node before
an existing one cannot be done; instead, you have to locate it while
keeping track of the previous node.
Similarly, we have functions for removing the node after a given node,
and for removing a node from the beginning of the list. The diagram
demonstrates the former. To find and remove a particular node, one
must again keep track of the previous element.
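The singly-linked operations described above can be sketched in Python as follows. This is an illustrative sketch, not the original code: the names Node, SinglyLinkedList, insert_after and remove_after are assumptions, with first_node playing the role of the firstNode variable.

```python
class Node:
    """A node with two fields: the data and the link to the next node."""

    def __init__(self, data):
        self.data = data
        self.next = None


class SinglyLinkedList:
    def __init__(self):
        self.first_node = None          # null for an empty list

    def insert_beginning(self, new_node):
        new_node.next = self.first_node
        self.first_node = new_node

    def remove_beginning(self):
        obsolete = self.first_node
        if obsolete is not None:
            self.first_node = obsolete.next
        return obsolete

    def to_list(self):
        values, node = [], self.first_node
        while node is not None:
            values.append(node.data)
            node = node.next
        return values


def insert_after(node, new_node):
    """Insert new_node immediately after an existing node."""
    new_node.next = node.next
    node.next = new_node


def remove_after(node):
    """Remove and return the node that follows the given node, if any."""
    obsolete = node.next
    if obsolete is not None:
        node.next = obsolete.next
    return obsolete


words = SinglyLinkedList()
words.insert_beginning(Node("MAT"))
words.insert_beginning(Node("FAT"))
insert_after(words.first_node, Node("HAT"))
print(words.to_list())                  # FAT, HAT, MAT
remove_after(words.first_node)
print(words.to_list())                  # FAT, MAT
```

Neither operation moves any existing element; only the links of the neighbouring nodes change.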
Doubly-linked lists
With doubly-linked lists there are even more pointers to update, but
also less information is needed, since we can use backwards pointers
to observe preceding elements in the list. This enables new
operations, and eliminates special-case functions.
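A minimal Python sketch of a doubly-linked list follows. The names DNode, DoublyLinkedList and the method names are assumptions made for the example. It shows that there are more pointers to update, but that inserting before a node, or removing a node, needs only the node itself.

```python
class DNode:
    def __init__(self, data):
        self.data = data
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.first_node = None
        self.last_node = None

    def insert_after(self, node, new_node):
        new_node.prev = node
        new_node.next = node.next
        if node.next is None:
            self.last_node = new_node
        else:
            node.next.prev = new_node
        node.next = new_node

    def insert_before(self, node, new_node):
        new_node.next = node
        new_node.prev = node.prev
        if node.prev is None:
            self.first_node = new_node
        else:
            node.prev.next = new_node
        node.prev = new_node

    def append(self, new_node):
        if self.last_node is None:
            self.first_node = self.last_node = new_node
        else:
            self.insert_after(self.last_node, new_node)

    def remove(self, node):
        # The backward pointer gives us the preceding node directly.
        if node.prev is None:
            self.first_node = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.last_node = node.prev
        else:
            node.next.prev = node.prev

    def forward(self):
        values, node = [], self.first_node
        while node is not None:
            values.append(node.data)
            node = node.next
        return values

    def backward(self):
        values, node = [], self.last_node
        while node is not None:
            values.append(node.data)
            node = node.prev
        return values


numbers = DoublyLinkedList()
two, four, six = DNode(2), DNode(4), DNode(6)
numbers.append(two)
numbers.append(six)
numbers.insert_before(six, four)
print(numbers.forward(), numbers.backward())
```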
MORE OPERATIONS ON A LINKED LIST
Two operations that can be carried out on linked list are insertion and
deletion of nodes into/from the list.
Assume that we have a list of words that are in alphabetical order.
To insert the data HAT between FAT and MAT, we may have to take
the following steps:
1. Get a node which is currently unused, let its address be X or temp
i.e. create a new node.
2. Set the data field of this node to HAT.
3. Set the link field of X to point to the node after FAT, which contains
MAT.
4. Set the link field of the node containing FAT to X.
The new arrows are dashed. The important thing to note is that when
we insert HAT, we do not have to move any other elements which are
already in the list.
Suppose we want to delete HAT from the list. All we need to do is find
the element which immediately precedes HAT, which is FAT, and set
its link to the position of MAT. Here again, there is no need to move
the data around. Even though there is still an arrow from HAT to MAT,
HAT is no longer in the list. The data HAT is lost, i.e. deleted.
The sequence of statements required to insert a new node at the end of
a list is as follows:
New(temp);                  {creates a new node}
temp^.word := tempword;     {store the word to be inserted in the
                             new node}
temp^.link := last^.link;   {the new node points to the next node in
                             the list}
last^.link := temp;         {the last node points to the inserted node}
Also, for deletion of a node we have:
last^.link := current^.link;
Dispose(current);
ADVANTAGES OF LINKED LIST OVER ARRAYS
1. Greater flexibility for the location of nodes in memory
2. By adding more pointers, a list may be traversed (visited) in a
number of different orders/paths
3. Easier methods of inserting/deleting data from a list.

STACK
A stack is an ordered list in which all insertions and deletions are
made at one end called the top. A stack is a data structure
characterized by the expression "last in first out" (LIFO), meaning
that the most recent item added to the stack is the first one which
can be removed from the stack. A stack pointer is used to keep track
of the last item added to the stack. Another name for a stack is the
pushdown list or a bounded array (i.e. finite number of positions). An
array (one dimensional) or a linked list may be used to implement a
stack. Only 2 operations are possible with a stack: PUSH and POP. A
PUSH operation is said to occur when data items are added to a stack,
and a POP operation, on the other hand, occurs when items are removed
from a stack.

If a stack is stored in an array declared as stack[1..N], then Top = 0
if the stack is empty. The stack is considered an ordered group of
items because elements are ordered according to how long they have
been in the stack.

4 | D |  <- SP (Top = 4)
3 | C |
2 | B |
1 | A |

A special register SP is used to hold the value of the top of the
stack.
From the diagram above, D entered last into the stack but will be
removed first. To push an item onto a stack, do the following.
1. Check that there is room in the stack to add another item (e.g. a
stack A = 1 to 5). The stack is full when SP = 5. If any attempt is
made to add items to a full stack, a stack overflow error will occur.
2. If there is no overflow, the stack pointer (SP) is incremented and
the item is transferred to the array element pointed to by the stack
pointer.
Below is the PUSH algorithm
Algorithm PUSH
If SP < Max size of stack
Then SP = SP + 1
     Data(SP) = item
Else
     Write("stack overflow")
End if
End algorithm
OR
Begin
  If Top >= N then
    Writeln('stack full')
  Else
  Begin
    Top := Top + 1;
    Stack[Top] := item
  End;
End;

To POP an item from a stack do the following:
1. Check that the stack is not empty, i.e. SP > 0. Stack underflow
will occur if SP = 0.
2. If the stack is not empty, the item at the top of the stack, as
shown by the stack pointer, is transferred to its destination and the
SP is decremented.

Find below the POP algorithm

Algorithm POP
If SP > 0
Then item = Data(SP)
     SP = SP - 1
Else
     Write("stack underflow")
End if
End algorithm
OR
Begin
  If Top <= 0 then
    Writeln('stack is empty')
  Else
  Begin
    Item := Stack[Top];
    Top := Top - 1
  End;
End;
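The PUSH and POP algorithms can be sketched together in Python as follows. This is a sketch, not a definitive implementation: the class name Stack is an assumption, and raising an exception instead of writing a message is a design choice made for the example. Slot 0 of the array is left unused so that positions run from 1 to N as in the notes.

```python
class Stack:
    """A bounded stack stored in an array with positions 1..max_size."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.data = [None] * (max_size + 1)     # slot 0 is unused
        self.sp = 0                             # SP = 0 means empty

    def push(self, item):
        if self.sp < self.max_size:
            self.sp += 1                        # increment SP first
            self.data[self.sp] = item           # then store the item
        else:
            raise OverflowError("stack overflow")

    def pop(self):
        if self.sp > 0:
            item = self.data[self.sp]           # take the top item
            self.sp -= 1                        # then decrement SP
            return item
        raise IndexError("stack underflow")


stack = Stack(5)
for letter in "ABCD":
    stack.push(letter)
print(stack.sp, stack.pop())    # D entered last and is removed first
```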
USES OF STACK
1. It is used in programming for processing of procedure calls and
their termination. E.g given the program segment below.
Prg MAIN         Prg A1           Prg A2           Prg A3
   ...              ...              ...              ...
   call A1          call A2          call A3          ...
r: ...           s: ...           t: ...              ...
end              end              end              end
The MAIN procedure calls procedure A1; on completion of A1,
execution of MAIN will resume at location r. The address r is passed
to A1, which saves it in some location for later processing. A1 then
invokes A2, which in turn invokes A3. In each case the invoking
procedure passes the return address to the invoked procedure.

If we examine the memory while A3 is computing, there will be an
implicit stack which looks like (q, r, s, t), where q is the address
to which MAIN returns control. Hence A3 must finish before A2, then
A1 and then MAIN.
2. They can easily be used to reverse our inputs. For example, if B, A
and G are pushed in that order, G is on top of the stack, and when
popped we have GAB.
3. It can be used for storage.
4. It is used to keep track of recursion and local variables in
subprograms.
5. They are extensively used in the evaluation of arithmetic
expressions.
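Use 2 in the list above (reversing input) can be sketched in Python as follows. The function name reverse_with_stack is an assumption made for the example, and a Python list stands in for the stack.

```python
def reverse_with_stack(text):
    """Reverse the input by pushing every item and then popping them all."""
    stack = []
    for item in text:
        stack.append(item)              # PUSH
    reversed_items = []
    while stack:
        reversed_items.append(stack.pop())   # POP returns the newest item
    return "".join(reversed_items)


print(reverse_with_stack("BAG"))
```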
However, there are some demerits of using a stack.
1. It consumes extra but negligible memory space due to the pointer.
2. Only 2 possible operations can be performed on stacks
(PUSH and POP).
QUEUE
A queue (pronounced like the letter Q) is an ordered group of
homogeneous elements in which new elements are added at one end
(the rear) and elements are removed from the other end (the front).
Hence an element inserted at the rear of the queue can be deleted
only after all items inserted earlier are removed from the queue. A
queue is characterized by the expression "first in first out" (FIFO).
As an example of a FIFO queue, consider a line of students waiting to
pay their school fees in the bursary. Each new student enters the line
at the rear, and when the cashier is ready for a new student, he or
she takes the student at the front of the line. This is very different
from a stack.
Other examples of queues from everyday life include:
1. The list of cars to be repaired at the service department of a
vehicle dealer.
2. The line of students waiting to register.
3. For a computer, a queue of tasks waiting for the printer, for
access to disk storage, etc.
Diagrammatically we have a queue as

In -> (Rear)  E  D  C  B  A  (Front) -> Out

The letters represent elements already in the queue, while the
arrows show the direction in which elements are added to and
deleted from the queue.
Overflow and underflow can also occur in a queue.
OPERATIONS IN A QUEUE
Unlike the stack operations Push and Pop, the adding and removing
operations on a queue do not have standard names. Enq will be used
here to mean add or insert, and Deq will mean remove. Enq increments
rear. Deq increments front.
Find below the effects of queue operations.
[Figure: state of the queue, with the position of Rear marked, after
each of the operations below]
ClearQueue(Queue)
Enq(Queue, 10)
Enq(Queue, 5)
Enq(Queue, 6)
(Queue)
Enq(Queue, X)
Deq(Queue, X)
Procedure Enq(item: items);
Begin
  If rear = n then
    Write('queue is full')
  Else
  Begin
    Rear := rear + 1;
    Q[rear] := item
  End
End;

Procedure Deq(var item: items);
Begin
  If front = rear then
    Write('Empty queue')
  Else
  Begin
    Front := front + 1;
    Item := Q[front]
  End
End;
Note that queue is empty if front =rear
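The Enq and Deq procedures can be sketched in Python as follows. This is a sketch under stated assumptions: the class name Queue is chosen for the example, exceptions replace the written messages, and, as in the procedures above, the array is not reused once rear reaches n (a simple linear queue rather than a circular one).

```python
class Queue:
    """A linear queue stored in an array with positions 1..n."""

    def __init__(self, n):
        self.n = n
        self.q = [None] * (n + 1)       # slot 0 is unused
        self.front = 0
        self.rear = 0

    def is_empty(self):
        return self.front == self.rear  # empty when front = rear

    def enq(self, item):
        if self.rear == self.n:
            raise OverflowError("queue is full")
        self.rear += 1                  # Enq increments rear
        self.q[self.rear] = item

    def deq(self):
        if self.is_empty():
            raise IndexError("empty queue")
        self.front += 1                 # Deq increments front
        return self.q[self.front]


queue = Queue(5)
for value in (10, 5, 6):
    queue.enq(value)
print(queue.deq(), queue.deq())         # first in, first out
```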
APPLICATION OF QUEUES
Queues are also used in many ways by the Operating System (OS) to
schedule the use of the various computer resources. One of these
resources is the CPU itself. If you are working on a multi-user system,
when you tell the computer to run a particular program, the OS adds
your request to its 'job queue'. When your request gets to the front of
the queue, the program you requested is executed. Similarly, queues
are used to allocate time to the various users of the input/output
devices: printers, disks, tapes, etc.
The OS maintains queues of requests to print, read or write to each of
these devices.

CHAPTER SIX
TREES
BASIC TERMINOLOGY
In this chapter we shall study a very important data object: trees.
Intuitively, a tree structure means that the data are organized so that
items of information are related by branches. One very common place
where such a structure arises is in the investigation of genealogies.
Two types of genealogical charts are the pedigree and the lineal chart.
A pedigree chart shows one's ancestors, while the lineal chart below
describes the ancestry of languages. A lineal chart is a chart of
descendants rather than ancestors, and each item can produce several
others, unlike in a pedigree chart, which is always binary. Below are
illustrations.

ROSE (Daughter)

GRACE (Mother of Rose) JOHN (Father of Rose)

BETTY AUSTIN MARY CLIFFORD Grand parents to Rose

Proto Indo-European
   Italic
      Osco-Umbrian
         Oscan
         Umbrian
      Latin
         Spanish
         French
         Italian
         Romanian
   Hellenic
      Greek
   Germanic
      North Germanic
         Icelandic
         Norwegian
         Swedish
      West Germanic
         Low German
         High German
         Yiddish

The lineal chart above, though it has nothing to do with people, is
still a genealogy. It describes, in somewhat abbreviated form, the
ancestral lineage of modern European languages, or the descendants of
ancient languages.
Thus this is a chart of descendants rather than ancestors, and each
item can produce several others. Latin, for instance, is the forebear
of Spanish, French, Italian and Romanian. This tree does not have the
regular structure of the pedigree chart, but is a tree structure
nevertheless.
DEFINITION: A tree is a finite set of one or more nodes such that:
(i) There is a specially designated node called the root.
(ii) The remaining nodes are partitioned into n >= 0 disjoint sets T1,
T2, ..., Tn, where each of these sets is a tree. T1, T2, ..., Tn are
called sub trees of the root. Also, a tree is a special graph with a
root node and other nodes, without loops, multiple edges or closed
circuits.
Returning to fig 5.1, we see that ROSE and Proto Indo-European are
roots. They have two and three sub trees respectively, with roots
GRACE and JOHN for tree (a) and Italic, Hellenic and Germanic for tree
(b). The condition that T1, T2, ..., Tn be disjoint sets prohibits sub
trees from ever connecting together. It follows that every item in a
tree is the root of some sub tree of the whole.

TREE TERMINOLOGY
Many terms are used when referring to trees. A node stands for the
item of information plus the branches to other items. The tree in
fig 5.2 has 13 nodes. For convenience we have used letters to
represent the nodes or data. The root is A, which is always at the
top. The number of sub trees of a node is called its degree. A has a
degree of 3, B has 2, G has zero, etc. A node that has degree zero is
called a leaf or terminal node.
Here K, L, F, G, M, I and J are leaf nodes.

A
   B
      E
         K
         L
      F
   C
      G
   D
      H
         M
      I
      J

Depth   Level
0       1
1       2
2       3
3       4

The roots of the sub trees of a node X are the children of X, and X is
the parent of its children. Thus, the children of B are E and F, and
the parent of B is A. Children of the same parent are said to be
siblings: H, I and J are siblings. The ancestors of a node are all the
nodes along the path from the root to that node. The ancestors of L
are A, B and E. The level of a node is defined by initially letting
the root be at level one. If a node is at level r, then its children
are at level r + 1. The height or depth of a tree is defined to be the
maximum level of any node in the tree. A forest is a set of disjoint
trees. If the root of the tree in fig 5.2 and its edges are removed,
we have a forest with three trees. We can also represent a tree by
listing its nodes. Hence the above tree can be written as
(A(B(E(K,L),F),C(G),D(H(M),I,J)))
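The terminology above can be checked with a short Python sketch. The representation (a pair of a label and a list of sub trees) and the function names are assumptions made for the example; the tree itself is the 13-node tree described in the text.

```python
# Each node is written as (label, [sub trees]); a leaf has an empty list.
tree = ("A", [
    ("B", [("E", [("K", []), ("L", [])]), ("F", [])]),
    ("C", [("G", [])]),
    ("D", [("H", [("M", [])]), ("I", []), ("J", [])]),
])


def degree(node):
    """Number of sub trees of a node."""
    return len(node[1])


def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node[1])


def leaves(node):
    """Labels of all nodes of degree zero, from left to right."""
    label, children = node
    if not children:
        return [label]
    found = []
    for child in children:
        found.extend(leaves(child))
    return found


def height(node):
    """Maximum level of any node, with the root at level one."""
    return 1 + max((height(child) for child in node[1]), default=0)


print(count_nodes(tree), degree(tree), height(tree))
print(leaves(tree))
```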

TYPES OF TREES

There are several types of trees. Below is a list of some:
Binary tree/binary search tree     Pedigree
AVL tree                           Lineal
Orchard/Forest                     Decision tree
2-3 trees                          Game tree
B-trees                            Spanning tree, etc.
At this level we shall deal with the binary tree, or the binary search
tree.
BINARY TREES
A binary tree is an important type of tree structure which occurs very
often. It is characterized by the fact that any node can have at most
two branches, i.e. there is no node with degree greater than 2. For
binary trees we distinguish between the sub tree on the left and the
one on the right, whereas for trees the order of the sub trees is
irrelevant. Also, a binary tree may have zero nodes (be empty). A more
exact definition of a binary tree is given below:
A binary tree is a finite set of nodes which is either empty or
consists of a root and 2 disjoint binary trees called the left and
right sub trees.
Other tree types are different from a binary tree in that there is no
specific order of drawing an ordinary tree, unlike a binary tree.

(a)    3          (b)    3
      /                   \
     4                     4

The binary trees above are different. The first (a) has no right sub
tree and the second (b) has no left sub tree. Regarded as ordinary
trees they are the same, even though they are drawn slightly
differently; regarded as binary trees they are distinct. Also, a tree
cannot have zero nodes (be empty) as a binary tree can.
Given the special binary trees below

[Figure: (a) a binary tree skewed to the left; (b) a complete binary
tree]
The first one is called a skewed or unbalanced tree, skewed to the left.
The second (b) is called a complete binary tree. All the terminologies
earlier defined for trees also apply to binary trees. We shall at this
point make some relevant observations regarding binary trees.
(a) The maximum number of nodes on level i of a binary tree is
2^(i-1), where i >= 1.
If i = 1, then the number of nodes is
2^(1-1) = 2^0 = 1 (i.e. the root node)
If i = 2, then the number of nodes is
2^(2-1) = 2^1 = 2
If i = 3, then
2^(3-1) = 2^2 = 4
The binary tree will look thus:

Level 1:            o
Level 2:        o       o
Level 3:      o   o   o   o

BINARY SEARCH TREE


This is a binary tree in which the left sub tree, if any, of any node
contains smaller values than the parent node, and the right sub
tree, if any, contains larger values than the parent node.
To build a binary search tree we must do so in accordance with the
definition. Hence, given a list of the ages of 10 students in a
department, we shall illustrate how we can insert nodes to build a
binary search tree: 21 24 27 18 19 23 25 20 26 32

              21
            /    \
          18      24
            \    /  \
            19  23   27
              \     /  \
              20   25   32
                     \
                      26
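Building a binary search tree from the list of ages can be sketched in Python as follows. This is a minimal sketch: the names TreeNode, insert and in_order are assumptions, and sending equal values to the right sub tree is a choice made for the example (the list above has no duplicates).

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the tree rooted at root and return the root."""
    if root is None:
        return TreeNode(value)          # empty spot found: place the node
    if value < root.value:
        root.left = insert(root.left, value)     # smaller: go left
    else:
        root.right = insert(root.right, value)   # larger: go right
    return root


def in_order(root):
    """Left sub tree, then node, then right sub tree: ascending order."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


ages = [21, 24, 27, 18, 19, 23, 25, 20, 26, 32]
root = None
for age in ages:
    root = insert(root, age)
print(root.value, root.left.value, root.right.value)
print(in_order(root))
```

Visiting the finished tree in order lists the ages in ascending order, which is a quick check that every insertion followed the definition.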
