
DSA Unit - I Notes


GLWEC, Hyd PC301CS - DATA STRUCTURES AND ALGORITHMS Anitha.

Unit – I
Algorithms & Arrays

Introduction to Data Structures:


A data structure is a way of storing and organizing data in a computer's memory so that it can be
accessed and modified efficiently, which in turn allows efficient algorithms to be used. The choice of a
data structure begins with the choice of an abstract data type (ADT). A well-designed data structure
allows a variety of critical operations to be performed using as few resources, both execution time and
memory space, as possible. In other words, it is an arrangement of data in memory that makes the data
quickly available to the processor for the required calculations.

A data structure should be seen as a logical concept that must address two fundamental concerns.
1. First, how the data will be stored, and
2. Second, what operations will be performed on it.

Because a data structure is a scheme for organizing data, its functional definition should be independent
of its implementation. This functional definition is known as an ADT (Abstract Data Type). The way in
which the data is organized affects the performance of a program for different tasks. Programmers decide
which data structures to use based on the nature of the data and the operations that need to be performed
on that data. Some of the more commonly used data structures include lists, arrays, stacks, queues,
heaps, trees, and graphs.

Classification of Data Structures:


Data structures can be classified as
● Simple data structure
● Compound data structure, which is further divided into
   o Linear data structure
   o Non-linear data structure


Simple Data Structure: A simple data structure is built directly from primitive data structures. A
primitive data structure is used to represent the standard data types of a programming language.
Variables, arrays, pointers, structures, unions, etc. are examples of primitive data structures.
Compound Data Structure: A compound data structure is built from one or more primitive data structures
and has a specific functionality. It can be designed by the user. It can be classified as
● Linear data structure
● Non-linear data structure

Linear Data Structure:


Linear data structures can be constructed as a continuous arrangement of data elements in the memory. It
can be constructed by using array data type. In the linear Data Structures the relationship of adjacency is
maintained between the data elements.

Operations applied on linear data structure:

The following operations are applied on linear data structures:
1. Add an element
2. Delete an element
3. Traverse
4. Sort the list of elements
5. Search for a data element
Examples of linear data structures: Stack, Queue, Table, List, and Linked List.

Non-linear Data Structure:


A non-linear data structure is a collection of data items that are not arranged in sequence; the items
are joined together by special pointers (tags). In a non-linear data structure the relationship of
adjacency is not maintained between the data items.

Operations applied on non-linear data structures:


The following operations are applied on non-linear data structures:
1. Add elements
2. Delete elements
3. Display the elements
4. Sort the list of elements
5. Search for a data element
Examples of non-linear data structures: Tree, Decision tree, Graph, and Forest.

Abstract Data Type:


An abstract data type, sometimes abbreviated ADT, is a logical description of how we view the data and
the operations that are allowed without regard to how they will be implemented. This means that we are
concerned only with what the data represents and not with how it will eventually be constructed. By
providing this level of abstraction, we are creating an encapsulation around the data. The idea is that by
encapsulating the details of the implementation, we are hiding them from the user’s view. This is called
information hiding. The implementation of an abstract data type, often referred to as a data structure, will
require that we provide a physical view of the data using some collection of programming constructs and
primitive data types.
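
As a minimal sketch of this separation, a stack ADT could be written in C as below. The type name Stack, the function names, and the fixed capacity of 100 are illustrative assumptions and do not come from these notes. The struct is the physical view; the functions are the logical view that a user of the ADT relies on.

```c
#define STACK_CAPACITY 100   /* assumed fixed capacity, for illustration only */

/* Physical view: how the data is stored. Users should not touch these fields. */
typedef struct {
    int items[STACK_CAPACITY];
    int top;                 /* number of elements currently stored */
} Stack;

/* Logical view: the operations the ADT allows. */
void stack_init(Stack *s)           { s->top = 0; }
int  stack_is_empty(const Stack *s) { return s->top == 0; }

/* Push value; returns 1 on success, 0 if the stack is full. */
int stack_push(Stack *s, int value)
{
    if (s->top == STACK_CAPACITY) return 0;
    s->items[s->top++] = value;
    return 1;
}

/* Pop the most recently pushed value into *out; returns 1 on success, 0 if empty. */
int stack_pop(Stack *s, int *out)
{
    if (s->top == 0) return 0;
    *out = s->items[--s->top];
    return 1;
}
```

Code that only calls these functions keeps working if the array is later replaced by another representation, which is the point of information hiding.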


Definition:- An algorithm is a step-by-step sequence of unambiguous instructions to solve a given
problem.

Characteristics of Algorithms :- Any algorithm must have the following characteristics.


1) Input: There are zero or more quantities that are externally supplied.
2) Output: At least one quantity is produced
3) Definiteness: Each instruction is clear and unambiguous
4) Finiteness: If we trace out the instructions of an algorithm, then for all cases, the
algorithm terminates after a finite number of steps.
5) Effectiveness: Every instruction must be basic enough to be carried out, in principle, by
a person using only pencil and paper. Each operation not only should have the property
of definiteness but also should be feasible.

Writing an Algorithm:-

Algorithm algorithm_name(parameters)
{
Statements
}

Eg: The same swap algorithm written in four notational styles:

Algorithm swap(a,b)
{
temp=a;
a=b;
b=temp;
}

Algorithm swap(a,b)
begin
temp=a;
a=b;
b=temp;
end

Algorithm swap(a,b)
{
temp:=a;
a:=b;
b:=temp;
}

Algorithm swap(a,b)
{
temp←a;
a←b;
b←temp;
}
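
As a hedged illustration, the swap algorithm might be written in C as follows. C passes arguments by value, so this sketch takes pointers to the two variables; the name swap_ints is an assumption made for the example.

```c
/* Exchange the values stored at a and b using a temporary variable. */
void swap_ints(int *a, int *b)
{
    int temp = *a;   /* temp = a */
    *a = *b;         /* a = b    */
    *b = temp;       /* b = temp */
}
```

A caller would write swap_ints(&x, &y); to exchange two int variables x and y.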

Sample Algorithms for practice


1) Calculate average of 3 numbers


2) Convert temperature from centigrade to Fahrenheit


3) Simple Interest calculation
4) Calculate area of a circle
5) Biggest of 2 numbers
6) Eligibility to vote
7) Display all even numbers till N
8) Display the sum of N natural numbers
9) Linear search
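
As one possible answer to the last exercise, linear search could be sketched in C as below. The function name and the convention of returning -1 when the key is absent are assumptions, not something these notes prescribe.

```c
/* Return the index of the first element equal to key, or -1 if it is absent. */
int linear_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == key) {
            return i;
        }
    }
    return -1;
}
```

In the worst case every one of the n elements is compared once, so the running time grows linearly with n.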

Types of Algorithms:-
Several types of algorithms are in common use; some of them are listed below:
1. Brute Force Algorithm: It is the simplest approach to a problem. A brute force algorithm
is the first approach that comes to mind when we see a problem.
2. Recursive Algorithm: A recursive algorithm is based on recursion. In this case, a problem
is broken into several sub-parts, and the same function is called again and again on them.
3. Backtracking Algorithm: The backtracking algorithm builds the solution incrementally by
searching among all possible solutions. We keep extending a partial solution as long as it
satisfies the given criteria. Whenever a partial solution fails, we trace back to the failure
point and try the next alternative, and continue this process until we find a solution or all
possible solutions have been examined.
4. Searching Algorithm: Searching algorithms are the ones that are used for searching
elements or groups of elements from a particular data structure. They can be of different types
based on their approach or the data structure in which the element should be found.
5. Sorting Algorithm: Sorting is arranging a group of data in a particular manner according to
the requirement. The algorithms which help in performing this function are called sorting
algorithms. Generally sorting algorithms are used to sort groups of data in an increasing or
decreasing manner.
6. Hashing Algorithm: Hashing algorithms work similarly to searching algorithms, but they
use an index computed from a key. In hashing, a hash function maps each key to the location
where its data is stored.
7. Divide and Conquer Algorithm: This algorithm breaks a problem into sub-problems,
solves each sub-problem, and merges the solutions together to get the final solution. It
consists of the following three steps:
● Divide
● Solve
● Combine
8. Greedy Algorithm: In this type of algorithm the solution is built part by part. At each
step, the choice that gives the most immediate benefit is taken as the next part of the
solution.
9. Dynamic Programming Algorithm: This algorithm stores already computed solutions to avoid
repeated calculation of the same part of the problem. It divides the problem into smaller
overlapping subproblems and solves each of them once.
10. Randomized Algorithm: A randomized algorithm uses random numbers to make some of its
decisions, so its behaviour or running time is analysed in terms of the expected outcome.


Advantages of Algorithms:
● An algorithm is easy to understand.
● An algorithm is a step-wise representation of a solution to a given problem.
● In an algorithm the problem is broken down into smaller pieces or steps; hence, it is easier for
the programmer to convert it into an actual program.
Disadvantages of Algorithms:
● Writing an algorithm takes a long time, so it is time-consuming.
● Understanding complex logic through algorithms can be very difficult.
● Branching and looping statements are difficult to show in algorithms.

Recursive Algorithms:-
A recursive algorithm is an algorithm which calls itself with "smaller (or simpler)" input
values, and which obtains the result for the current input by applying simple operations to the
returned value for the smaller (or simpler) input. More generally if a problem can be solved
utilizing solutions to smaller versions of the same problem, and the smaller versions reduce to
easily solvable cases, then one can use a recursive algorithm to solve that problem. For example,
the elements of a recursively defined set, or the value of a recursively defined function can be
obtained by a recursive algorithm.

If a set or a function is defined recursively, then a recursive algorithm to compute its members or
values mirrors the definition. Initial steps of the recursive algorithm correspond to the basis
clause of the recursive definition and they identify the basic elements. They are then followed by
steps corresponding to the inductive clause, which reduce the computation for an element of one
generation to that of elements of the immediately preceding generation.

In general, recursive computer programs require more memory and computation compared with
iterative algorithms, but they are simpler and for many cases, it is a natural way of thinking about
the problem.

Example 1: Algorithm for finding the k-th even natural number


Note here that this can be solved very easily by simply outputting 2*(k - 1) for a given k . The
purpose here, however, is to illustrate the basic idea of recursion rather than solving the problem.

Algorithm 1: Even(positive integer k)


Input: k , a positive integer
Output: k-th even natural number (the first even being 0)

Algorithm:
if k = 1, then return 0;
else return Even(k-1) + 2 .

Here the computation of Even(k) is reduced to that of Even for a smaller input value, that
is Even(k-1). Even(k) eventually becomes Even(1), which is 0 by the first line. For example, to
compute Even(3), Algorithm Even(k) is called with k = 2. In the computation
of Even(2), Algorithm Even(k) is called with k = 1. Since Even(1) = 0, 0 is returned for the
computation of Even(2), and Even(2) = Even(1) + 2 = 2 is obtained. This value 2 for Even(2) is
now returned to the computation of Even(3), and Even(3) = Even(2) + 2 = 4 is obtained.
As can be seen by comparing this algorithm with the recursive definition of the set of
non-negative even numbers, the first line of the algorithm corresponds to the basis clause of the
definition, and the second line corresponds to the inductive clause.
By way of comparison, let us see how the same problem can be solved by an iterative
algorithm.

Algorithm 1-a: Even(positive integer k)


Input: k, a positive integer
Output: k-th even natural number (the first even being 0)

Algorithm:
int i, even;
i := 1;
even := 0;
while( i < k ) {
even := even + 2;
i := i + 1; }

return even
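
Algorithm 1 and Algorithm 1-a can be sketched in C roughly as follows. The function names are assumptions, and both functions assume k >= 1, as the algorithms do; calling the recursive version with k = 0 would not terminate correctly.

```c
/* Recursive version, mirroring Algorithm 1. Assumes k >= 1. */
unsigned even_recursive(unsigned k)
{
    if (k == 1) {
        return 0;                       /* basis clause */
    }
    return even_recursive(k - 1) + 2;   /* inductive clause */
}

/* Iterative version, mirroring Algorithm 1-a. Assumes k >= 1. */
unsigned even_iterative(unsigned k)
{
    unsigned even = 0;
    for (unsigned i = 1; i < k; i++) {
        even += 2;
    }
    return even;
}
```

Both versions return the same value; the recursive one uses one stack frame per call, while the iterative one uses a constant amount of memory.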
Types of Recursion: There are 2 types of recursion. They are:
1) Direct recursion :- When a function calls itself from within its own body, it is
called direct recursion.
● Tail Recursion: If a recursive function calls itself and that recursive call is the last
statement in the function, it is known as tail recursion. After that call the
function performs nothing. All processing is done at the time of calling
and nothing is done at returning time.
● Head Recursion: If a recursive function calls itself and that recursive call is the first
statement in the function, it is known as head recursion. There is no statement and no
operation before the call. The function does no processing at the time of calling,
and all operations are done at returning time.
2) Indirect recursion :- When a function calls another function that in turn calls the first
function, so that the calls form a cycle, it is called indirect recursion.
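
The three kinds of recursion can be illustrated with the hedged C sketch below. All function names are assumptions. In each function the base-case test comes first, as it must in practice; what distinguishes the tail and head forms is whether the work is done before the recursive call or after it returns.

```c
/* Tail recursion: the addition is done before the call (in the argument),
   and the recursive call is the last action. Call as sum_tail(n, 0). */
unsigned sum_tail(unsigned n, unsigned acc)
{
    if (n == 0) return acc;
    return sum_tail(n - 1, acc + n);
}

/* Head recursion: the call comes first and the addition happens
   at returning time. */
unsigned sum_head(unsigned n)
{
    if (n == 0) return 0;
    unsigned rest = sum_head(n - 1);
    return rest + n;
}

/* Indirect recursion: is_even calls is_odd, which calls is_even again. */
int is_odd(unsigned n);   /* forward declaration needed for the cycle */

int is_even(unsigned n) { return n == 0 ? 1 : is_odd(n - 1); }
int is_odd(unsigned n)  { return n == 0 ? 0 : is_even(n - 1); }
```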

Sample recursive algorithms


1) Factorial of a number
2) Fibonacci series
3) Sum of N natural numbers
4) Binary Search
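
Two of the sample algorithms above, factorial and binary search, might be written recursively in C as in this sketch. The function names and the convention of returning -1 for a missing key are assumptions.

```c
/* n! for n >= 0. The result overflows unsigned long long for n > 20. */
unsigned long long factorial(unsigned n)
{
    return (n <= 1) ? 1ULL : n * factorial(n - 1);
}

/* Recursive binary search in the sorted range a[lo..hi].
   Returns the index of key, or -1 if key is not present. */
int binary_search(const int a[], int lo, int hi, int key)
{
    if (lo > hi) return -1;              /* empty range: not found */
    int mid = lo + (hi - lo) / 2;
    if (a[mid] == key) return mid;
    if (a[mid] < key) return binary_search(a, mid + 1, hi, key);
    return binary_search(a, lo, mid - 1, key);
}
```

For an array of n sorted elements the initial call is binary_search(a, 0, n - 1, key).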

Performance Analysis of an algorithm :-


The performance analysis of algorithms can be measured on the scales of time and space. The
performance of a program is the amount of computer memory and time needed to run a program.
We use two approaches to determine the performance of a program or an algorithm. One is
analytical and the other is experimental. In performance analysis we use analytical methods,
while in performance measurement we conduct experiments.
Time Complexity: The time complexity of an algorithm or a program is a function of the
running time of the algorithm or a program. In other words, it is the amount of computer time it
needs to run to completion.
Space Complexity: The space complexity of an algorithm or program is a function of the space
needed by the algorithm or program to run to completion.

Frequency Count Method :-


algorithm swap(a,b)
{
temp=a; -> 1
a=b; -> 1
b=temp; -> 1
} -------
f(n) = 3 -> O(1)

space complexity analysis


a -> 1
b -> 1
temp ->1
--------
s(n)= 3 words (constant)
O(1)

algorithm sum(A,n)
{
s=0; -> 1
for(i=0;i<n;i++) -> n+1
s=s+A[i]; -> n
return s; -> 1
} -------
f(n) = 1+(n+1)+n+1 = 2n+3 -> O(n)

space complexity analysis

A -> n
n -> 1
s -> 1
i -> 1
--------
s(n)= n+3 -> O(n)
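
The frequency count for sum(A,n) can be checked experimentally. The sketch below is an instrumented version, with an assumed name and an assumed extra steps parameter, that counts statements the same way as the table: one for the initialization, n+1 for the loop test, n for the loop body, and one for the return.

```c
/* Sum the n elements of A and store the number of executed steps in *steps. */
long sum_with_count(const int A[], int n, long *steps)
{
    long s = 0;
    *steps = 1;                 /* s = 0          : 1   */
    for (int i = 0; ; i++) {
        (*steps)++;             /* test i < n     : n+1 */
        if (!(i < n)) break;
        s += A[i];
        (*steps)++;             /* s = s + A[i]   : n   */
    }
    (*steps)++;                 /* return s       : 1   */
    return s;
}
```

For any n the counter ends at 2n+3, matching f(n) above.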

algorithm Add(A,B, n)
{
for(i=0;i<n;i++) -> n+1
for(j=0;j<n;j++) -> n*(n+1)
C[i][j]=A[i][j]+B[i][j]; -> n*n
} ---------
f(n) = (n+1) + n(n+1) + n^2 = 2n^2+2n+1 -> O(n^2)

space complexity analysis


A -> n^2
B -> n^2
C -> n^2
i -> 1
j -> 1
n -> 1
--------
s(n)= 3n^2+3
O(n^2)

Asymptotic Notations:-
Asymptotic notation is a meaningful way of representing time complexity. It is used to describe how
the size of the input data affects an algorithm's usage of computational resources. The running time
of an algorithm is described as a function of the input size n for large n.
The notations are
i) Big Oh [ O (g(n)) ] { Upper bound }
ii) Big Omega [ Ω (g(n)) ] { Lower bound }
iii) Big Theta [ Θ (g(n)) ] { Tight bound: both upper and lower }
The order of growth increases in this sequence:
O (1), O (log n), O (n), O (n log n), O (n^2), O (n^3) . . .
Constant factors are ignored, so O (n/2) is the same as O (n).


Big oh(O): Definition: f(n) = O(g(n)) (read as f of n is big oh of g of n) if there exist a positive
integer n0 and a positive number c such that |f(n)| ≤ c|g(n)| for all n ≥ n0. Here g(n) is the upper
bound of the function f(n).

Omega(Ω): Definition: f(n) = Ω(g(n)) (read as f of n is omega of g of n) if there exist a positive
integer n0 and a positive number c such that |f(n)| ≥ c|g(n)| for all n ≥ n0. Here g(n) is the lower
bound of the function f(n).

Theta(Θ): Definition: f(n) = Θ(g(n)) (read as f of n is theta of g of n) if there exist a positive
integer n0 and two positive constants c1 and c2 such that c1|g(n)| ≤ |f(n)| ≤ c2|g(n)| for all n ≥ n0.
The function g(n) is both an upper bound and a lower bound for the function f(n) for all n ≥ n0.


Analysis: Sum of n natural numbers


Algorithm: Time taken
1) Start ------ 0
2) Input: Read 'n' ------ 1
3) Initialize: sum=0, i=1 ------ 1
4) Process: sum = sum + i ------ n
5) i = i + 1 ------ n
6) if ( i <= n ) go to step 4 ------ n + 1
7) Output: print 'sum' ------ 1
8) Stop ------ 0
Total time in function f(n) = 1 + 1 + n + n + (n + 1) + 1 = 3n + 4

Big Oh: O() { Upper bound }

The function f(n) = O(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≤ c . g(n) for all n ≥ n0. Big Oh is the notation most often used to state the worst-case
running time.

Analysis : Sum of n natural numbers

f(n) = 3n + 4 ; f(n) ≤ C . g(n)
3n + 4 ≤ C . g(n)
3n + 4 ≤ 3n + 4n for all n ≥ 1 (Upper bound)
Since 3n + 4 ≤ 7n for all n ≥ 1, C = 7 , n0 = 1 and g(n) = n
If n=1 then 7 ≤ 7 ; If n=2 then 10 ≤ 14 ; If n=3 then 13 ≤ 21


∴ f(n) = O(g(n)) = O(n), an upper bound on the running time.

Big Theta [ Θ () ] { Tight bound }

The function f(n) = Θ(g(n)) if and only if there exist positive constants c1, c2 and n0 such that
c1 . g(n) ≤ f(n) ≤ c2 . g(n) for all n ≥ n0.

Analysis : Sum of n natural numbers

f(n) = 3n + 4 ; c1 . g(n) ≤ f(n) ≤ c2 . g(n)
C1 . g(n) ≤ 3n + 4 ≤ C2 . g(n)
3n ≤ 3n + 4 ≤ 7n for all n ≥ 1 (Lower and Upper bound)
Since 3n ≤ 3n + 4 ≤ 7n, C1 = 3 , C2 = 7 , n0 = 1 and g(n) = n
If n=1 then 3 ≤ 7 ≤ 7 ; If n=2 then 6 ≤ 10 ≤ 14 ; If n=3 then 9 ≤ 13 ≤ 21

∴ f(n) = Θ(g(n)) = Θ(n), a tight bound: f(n) grows at the same rate as n.
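
The hand checks for n = 1, 2, 3 can be extended to a larger range with a short C sketch. The function names are assumptions, and a finite check like this is only a sanity test of the chosen constants, not a proof for all n.

```c
/* f(n) = 3n + 4, the step count derived for the sum of n natural numbers. */
long step_count(long n)
{
    return 3 * n + 4;
}

/* Return 1 if c1*n <= f(n) <= c2*n for every n in [n0, limit], else 0. */
int bounds_hold(long c1, long c2, long n0, long limit)
{
    for (long n = n0; n <= limit; n++) {
        long fn = step_count(n);
        if (c1 * n > fn || fn > c2 * n) {
            return 0;
        }
    }
    return 1;
}
```

With c1 = 3, c2 = 7 and n0 = 1 the check succeeds; with c2 = 3 it fails immediately, since 3n + 4 is never at most 3n.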

Big Omega [ Ω () ] { Lower bound }

The function f(n) = Ω(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≥ c . g(n) for all n ≥ n0.

Analysis : Sum of n natural numbers

f(n) = 3n + 4 ; f(n) ≥ C . g(n)
3n + 4 ≥ C . g(n)
3n + 4 ≥ 3n (Lower bound)
Since 3n + 4 ≥ 3n for all n ≥ 1, C = 3 , n0 = 1 and g(n) = n
If n=1 then 7 ≥ 3
If n=2 then 10 ≥ 6
If n=3 then 13 ≥ 9

∴ f(n) = Ω(g(n)) = Ω(n), a lower bound on the running time.
Example II : Sum of n natural numbers using the formula n(n+1)/2
Algorithm: Time taken
1) Start ------ 0
2) Input: Read n ------ 1
3) Process: sum = n * (n+1) / 2 ------ 1
4) print 'sum' ------ 1
5) Stop ------ 0
Total time in function f(n) = 3

f(n) = 3 . 1 = C . g(n) ; C = 3 , g(n) = 1

Therefore f(n) is O(1), Ω(1) and Θ(1): the running time is constant and does not depend on n.
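
The two ways of summing the first n natural numbers can be placed side by side in C. This is a sketch with assumed function names; both functions return the same value, but the first performs n additions while the second does a fixed amount of work.

```c
/* O(n): add 1, 2, ..., n one term at a time, as in the first analysis. */
unsigned long sum_loop(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* O(1): closed form n(n+1)/2, as in Example II. */
unsigned long sum_formula(unsigned long n)
{
    return n * (n + 1) / 2;
}
```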

Arrays – ADT:-

Arrays are defined as the collection of similar types of data items stored at contiguous memory
locations. It is one of the simplest data structures where each data element can be randomly
accessed by using its index number.

In C programming, they are the derived data types that can store the primitive type of data such
as int, char, double, float, etc. For example, if we want to store the marks of a student in 6
subjects, then we don't need to define a different variable for the marks in different subjects.
Instead, we can define an array that can store the marks in each subject at the contiguous
memory locations.
Properties of array
Some of the properties of an array are listed as follows -
o Each element in an array is of the same data type and has the same size (for example, 4 bytes
for an int on many platforms).
o Elements in the array are stored at contiguous memory locations, and the first
element is stored at the smallest memory location.
o Elements of the array can be randomly accessed since we can calculate the address of
each element of the array from the base address and the size of the data element.
Representation of an array
We can represent an array in various ways in different programming languages. As an
illustration, let's see the declaration of array in C language -
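
A declaration consistent with the points listed after this (10 elements, indices starting at 0) might look like the sketch below. The name arr, the element type int, the macro ARR_LEN, and the helper arr_length are assumptions made for illustration.

```c
#include <stddef.h>

#define ARR_LEN 10      /* assumed length: room for 10 elements */

int arr[ARR_LEN];       /* valid indices are 0 .. ARR_LEN - 1 */

/* Number of elements, computed from the total size and the element size. */
size_t arr_length(void)
{
    return sizeof arr / sizeof arr[0];
}
```

An element is read or written through its index, for example arr[0] = 5; for the first element and arr[9] = 7; for the last.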


As per the above illustration, there are some of the following important points -
o Index starts with 0.
o The array's length is 10, which means we can store 10 elements.
o Each element in the array can be accessed via its index.
Why are arrays required?
Arrays are useful because -
o Sorting and searching for a value in an array is easier.
o Arrays are well suited to processing multiple values quickly and easily.
o Arrays are good for storing multiple values in a single variable - In computer
programming, most cases require storing a large amount of data of a similar type. Without
arrays, we would need to define a large number of variables to store it, and it would be
very difficult to remember the names of all the variables while writing the programs.
Instead of giving every variable a different name, it is better to define an array
and store all the elements in it.
Memory allocation of an array
As stated above, all the data elements of an array are stored at contiguous locations in the main
memory. The name of the array represents the base address or the address of the first element in
the main memory. Each element of the array is represented by proper indexing.

We can define the indexing of an array in the following ways -
1. 0 (zero-based indexing): The first element of the array will be arr[0].
2. 1 (one-based indexing): The first element of the array will be arr[1].
3. n (n-based indexing): The first element of the array can reside at any chosen base index n.


In the above image, we have shown the memory allocation of an array arr of size 5. The array
follows a 0-based indexing approach. The base address of the array is 100. It is the address
of arr[0]. Here, the size of the data type used is 4 bytes; therefore, each element takes 4 bytes
in memory, and the address of arr[i] is 100 + 4 * i (for example, arr[3] is at address 112).
Basic operations supported in the array are
● Traversal - This operation is used to visit each element of the array, for example to print it.
● Insertion - It is used to add an element at a particular index.
● Deletion - It is used to delete an element from a particular index.
● Search - It is used to search for an element using the given index or by the value.
● Update - It updates an element at a particular index.
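
Insertion and deletion are the two operations above that require shifting elements. A minimal C sketch, with assumed function names and an assumed convention of returning the new length (or -1 on error), could look like this:

```c
/* Insert value at index pos, shifting later elements one place to the right.
   n is the current number of elements, capacity the size of the array.
   Returns the new length, or -1 if pos is invalid or the array is full. */
int array_insert(int a[], int n, int capacity, int pos, int value)
{
    if (n >= capacity || pos < 0 || pos > n) return -1;
    for (int i = n; i > pos; i--) {
        a[i] = a[i - 1];
    }
    a[pos] = value;
    return n + 1;
}

/* Delete the element at index pos, shifting later elements one place left.
   Returns the new length, or -1 if pos is invalid. */
int array_delete(int a[], int n, int pos)
{
    if (pos < 0 || pos >= n) return -1;
    for (int i = pos; i < n - 1; i++) {
        a[i] = a[i + 1];
    }
    return n - 1;
}
```

Both operations may move up to n elements, so each takes O(n) time in the worst case, whereas update and access by index take O(1).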

Polynomials:-
Polynomials and sparse matrices are two important applications of arrays and linked lists. A
polynomial is composed of different terms, each of which holds a coefficient and an
exponent. This section covers the representation of polynomials using linked lists and
arrays.
A polynomial p(x) is an expression in the variable x of the form (ax^n + bx^(n-1) + … +
jx + k), where a, b, …, k are real numbers and n is a non-negative integer,
which is called the degree of the polynomial.

An essential characteristic of the polynomial is that each term in the polynomial expression
consists of two parts:
 one is the coefficient
 other is the exponent

Example:

10x^2 + 26x, here 10 and 26 are the coefficients and 2 and 1 are the corresponding exponents.

Points to keep in Mind while working with Polynomials:

● The sign of each coefficient and exponent is stored within the coefficient and the
exponent itself
● Adding terms that have equal exponents is possible
● The storage allocation for each term in the polynomial must be done in ascending or
descending order of their exponents

Representation of a Polynomial:-


A polynomial can be represented in various ways. These are:

● By the use of arrays
● By the use of linked lists

Representation of Polynomials Using Arrays


There may be situations where you need to evaluate many polynomial expressions and
perform basic arithmetic operations like addition and subtraction on those polynomials. For this,
you need a way to represent them. A simple way is to take a polynomial of degree 'n' and store
its terms (at most n+1 of them) in an array, so that every array element consists of two values:

● Coefficient and
● Exponent

1) Program to create a polynomial ADT


2) Program to display the addition of two polynomials
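
One way to approach these two programs is sketched below in C. It assumes integer coefficients, terms stored in descending order of exponent, and a result array supplied by the caller; the names Term and poly_add are illustrative assumptions.

```c
/* One term of a polynomial: coeff * x^exp. */
typedef struct {
    int coeff;
    int exp;
} Term;

/* Add polynomials p (np terms) and q (nq terms), both sorted by descending
   exponent, into r. r must have room for np + nq terms.
   Returns the number of terms written to r. */
int poly_add(const Term p[], int np, const Term q[], int nq, Term r[])
{
    int i = 0, j = 0, k = 0;
    while (i < np && j < nq) {
        if (p[i].exp > q[j].exp) {
            r[k++] = p[i++];
        } else if (p[i].exp < q[j].exp) {
            r[k++] = q[j++];
        } else {
            int c = p[i].coeff + q[j].coeff;   /* equal exponents: add coefficients */
            if (c != 0) {                      /* drop terms that cancel out */
                r[k].coeff = c;
                r[k].exp = p[i].exp;
                k++;
            }
            i++;
            j++;
        }
    }
    while (i < np) r[k++] = p[i++];            /* copy whatever is left of p */
    while (j < nq) r[k++] = q[j++];            /* copy whatever is left of q */
    return k;
}
```

For example, adding 10x^2 + 26x and 5x^2 - 26x + 3 with this function gives the two terms 15x^2 and 3.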

Sparse matrices:-
A matrix is a two-dimensional data object made of m rows and n columns, therefore having
m x n values in total. If most of the elements of the matrix have the value 0, then it is called a
sparse matrix.
Why use a sparse matrix representation instead of a simple matrix?
● Storage: There are fewer non-zero elements than zeros, and thus less memory is needed
to store only those elements.
● Computing time: Computing time can be saved by logically designing a data structure
that traverses only the non-zero elements.
Example:
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
Representing a sparse matrix by a 2D array leads to wastage of lots of memory as zeroes in the
matrix are of no use in most of the cases. So, instead of storing zeroes with non-zero elements,
we only store non-zero elements. This means storing non-zero elements with triples- (Row,
Column, value).
A sparse matrix can be represented in many ways. The following are two common
representations:
1. Array representation
2. Linked list representation
Method 1: Using Arrays:
A 2D array is used to represent a sparse matrix in which there are three rows named as
● Row: Index of the row where the non-zero element is located
● Column: Index of the column where the non-zero element is located
● Value: Value of the non-zero element located at index (row, column)

1) Program to create and display a sparse matrix


2) Program to perform addition on two sparse matrices
3) Program to display the transpose of a sparse matrix
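
The first and third programs can be sketched in C using the three-row array representation. The dimensions match the 4 x 5 example matrix; the macro and function names are assumptions, and the transpose shown here simply swaps the row and column entries without re-sorting the triples.

```c
#define ROWS 4
#define COLS 5
#define MAX_NZ (ROWS * COLS)   /* upper limit on the number of non-zero elements */

/* Fill t[0][k], t[1][k], t[2][k] with the row, column and value of the
   k-th non-zero element of m. Returns the number of non-zero elements. */
int to_triples(int m[ROWS][COLS], int t[3][MAX_NZ])
{
    int k = 0;
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {
            if (m[i][j] != 0) {
                t[0][k] = i;
                t[1][k] = j;
                t[2][k] = m[i][j];
                k++;
            }
        }
    }
    return k;
}

/* Transpose in triple form: swap each row index with its column index.
   The triples are left in their original order, not re-sorted by row. */
void transpose_triples(int t[3][MAX_NZ], int count)
{
    for (int k = 0; k < count; k++) {
        int tmp = t[0][k];
        t[0][k] = t[1][k];
        t[1][k] = tmp;
    }
}
```

For the example matrix only 6 triples are stored instead of 20 values; the saving grows with the size and sparsity of the matrix.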

Strings-ADT:- An Abstract Data Type (ADT) consists of a set of values, a defined set of
properties of these values, and a set of operations for processing the values. The string ADT
values are all sequences of characters up to a specified length.
o Properties
▪ The component characters are from the ASCII character set
▪ They are comparable in lexicographic order
▪ They have a length, from 0 to the specified length
o Operations on the string ADT include
▪ Input
▪ Output
▪ Initialization and assignment
▪ Comparison: greater, equal, less
▪ Determination of length
▪ Concatenation
▪ Accessing component characters and substrings
String functions in C

The following functions are declared in <string.h>. Those marked (non-standard) are compiler-specific
extensions and are not part of standard C.

▪ strlen() - Returns the string's length.
▪ strlwr() - Converts a string to lowercase (non-standard).
▪ strupr() - Converts a string to uppercase (non-standard).
▪ strcat() - Appends one string to the end of another.
▪ strncat() - Appends at most the first n characters of a string to the end of another string.
▪ strcpy() - Copies a string into another string.
▪ strncpy() - Copies at most the first n characters of a string into another.
▪ strcmp() - Compares two strings.
▪ strncmp() - Compares at most the first n characters of two strings.
▪ strcmpi() - Compares two strings without regard to case; the i indicates that the
function ignores case (non-standard).
▪ stricmp() - Compares two strings regardless of case; identical to strcmpi (non-standard).
▪ strnicmp() - Compares the first n characters of two strings, ignoring case (non-standard).
▪ strdup() - Duplicates a string (POSIX; not in ISO C before C23).
▪ strchr() - Finds the first occurrence of a character in a string.
▪ strrchr() - Finds the last occurrence of a character in a string.
▪ strstr() - Finds the first occurrence of a string in another string.
▪ strset() - Sets all characters in a string to a specific character (non-standard).
▪ strnset() - Sets the first n characters of a string to a specific character (non-standard).
▪ strrev() - Reverses a string (non-standard).
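
A short C sketch using only standard functions from the list (strlen, strcpy, strcat) is given below. The helper name join_words and its convention of returning 0 when the result does not fit are assumptions made for the example.

```c
#include <string.h>

/* Join first and second with a single space into out, which has room for
   cap characters including the terminating '\0'.
   Returns the length of the result, or 0 if it would not fit. */
size_t join_words(char *out, size_t cap, const char *first, const char *second)
{
    size_t need = strlen(first) + 1 + strlen(second) + 1;
    if (need > cap) return 0;
    strcpy(out, first);      /* copy the first word */
    strcat(out, " ");        /* append the separator */
    strcat(out, second);     /* append the second word */
    return strlen(out);
}
```

Checking the required size before calling strcpy() and strcat() matters because neither function checks the capacity of the destination buffer.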

Pattern Matching:- Pattern matching is the process of checking whether a specific sequence of
characters/tokens/data exists among the given data. Many programming languages make use of
regular expressions (regex) for pattern matching. Pattern matching is also used by compilers when
checking whether source files of high-level languages are syntactically correct, and to find and
replace a matching pattern in a text or code with another text/code. Any application that supports
search functionality uses pattern matching in one way or another.
Exact string matching algorithms find one, several, or all occurrences of a defined
string (pattern) in a larger string (text or sequence) such that each match is perfect: every
character of the pattern must match the corresponding character of the text. Algorithms based
on character comparison:
● Naive Algorithm: It slides the pattern over the text one position at a time and checks for a
match. If a match is found, it slides by 1 again to check for subsequent matches.
● KMP (Knuth Morris Pratt) Algorithm: The idea is that whenever a mismatch is detected, we
already know some of the characters in the text of the next window. So, we take advantage
of this information to avoid matching the characters that we know will match anyway.
● Boyer Moore Algorithm: This algorithm starts matching from the last character of the pattern
and uses the bad-character and good-suffix heuristics to skip over positions of the text.
● Using the Trie data structure: A trie is an efficient information retrieval data structure.
It stores keys character by character along paths from the root, so keys with a common prefix
share the same path.

Program to search for a particular pattern in a string
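
A minimal C sketch of such a program's core, using the naive algorithm, is shown below. The function name and the convention of returning the index of the first match (or -1) are assumptions; a KMP or Boyer Moore version would replace only the sliding logic.

```c
#include <string.h>

/* Naive search: return the index of the first occurrence of pat in text,
   or -1 if pat does not occur. An empty pattern matches at index 0. */
int naive_search(const char *text, const char *pat)
{
    size_t n = strlen(text);
    size_t m = strlen(pat);
    if (m > n) return -1;
    for (size_t s = 0; s + m <= n; s++) {           /* slide the window by 1 */
        size_t j = 0;
        while (j < m && text[s + j] == pat[j]) {    /* compare character by character */
            j++;
        }
        if (j == m) return (int)s;                  /* whole pattern matched */
    }
    return -1;
}
```

In the worst case the inner loop runs m times for each of the n - m + 1 window positions, giving O(n * m) character comparisons.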
