Module-1 DS Notes

This document provides an introduction to data structures and related concepts. It defines data structures as organized methods for storing and accessing data efficiently. Data structures are classified as primitive or non-primitive, with linear non-primitive structures including arrays, stacks, queues and lists, and non-linear structures being trees and graphs. Common operations on data structures are creating, inserting, deleting, searching, sorting and merging data. Arrays, structures, nested structures, arrays of structures, self-referential structures and unions are described as examples of data structures. Pointers, dynamic memory allocation and initializing pointers are also discussed.

Uploaded by

Vaishali K

DSA/18CS32 Module-1

MODULE-1
Syllabus
Introduction: Data Structures, Classifications (Primitive & Non Primitive), Data structure
Operations, Review of Arrays, Structures, Self-Referential Structures, and Unions. Pointers and
Dynamic Memory Allocation Functions. Representation of Linear Arrays in Memory, Dynamically
allocated arrays, Array Operations: Traversing, inserting, deleting, searching, and sorting.
Multidimensional Arrays, Polynomials and Sparse Matrices. Strings: Basic Terminology, Storing,
Operations and Pattern Matching algorithms. Programming Examples.

Introduction
• Data is a value or a set of values. Data by itself may not convey any meaning.
Example: 90, Bob
• When data is interpreted so that it conveys a meaning, we call it information.
Example: Bob scored 90 marks.
• A data item refers to a single unit of values. Data items that are divided into sub-items are called group items.
Example: The name of an employee can be divided into three sub-items: first name, middle name and last name.
• Data items that are not divided into sub-items are called elementary items.

Data Structures: A data structure is a particular method of storing and organizing data in a computer so that it can be used efficiently. Data structures are classified into
a. Primitive data structures: These can be manipulated directly by machine instructions.
Examples: integer, character, float, etc.
b. Non-primitive data structures: These cannot be manipulated directly by machine instructions. Non-primitive data structures are further classified into linear and non-linear data structures.
• Linear data structures: These show a relationship of adjacency between the elements of the data structure. Examples are arrays, stacks, queues, lists, etc.
• Non-linear data structures: These do not show a relationship of adjacency between the elements. Examples are trees and graphs.

Operations on data structures

1. Create: Creating a new data structure.
2. Insert: Adding a new record to the structure.
3. Delete: Removing a record from the structure.
4. Search: Finding the location of a particular record with a given key value, or finding the locations of all records which satisfy one or more conditions.
5. Sort: Arranging the records in some logical order (ascending or descending).
6. Merge: Combining the records of two different sorted files into a single sorted file.
7. Traversal: Accessing each record exactly once so that certain items in the record may be processed.

Review of arrays
• An array is a collection of elements of the same data type.
• An array is declared by appending brackets, containing its size, to the name of a variable.

Dept. of ISE, SVIT Page 1



For example

int list[5]; // declares an array that can store 5 integers

In C, all array indices start at 0, so list[0], list[1], list[2], list[3] and list[4] are the names of the five array elements, each of which contains an integer value.

Structures: A structure is a user-defined data type that can store related information, of the same or of different data types, together.

The major difference between a structure and an array is that an array can store only information of the same data type. A structure is a collection of variables under a single name. The variables within a structure may be of different data types, and each has a name that is used to select it from the structure.

For example,

struct student {
char sname[10];
int age;
float average_marks;
} st;

It creates a variable whose name is st and that has three fields:

• a name that is a character array
• an integer value representing the age of the student
• a float value representing the average marks of the student.

To assign values to these fields, the dot operator (.) is used as the structure member operator. We use this operator to select a particular member of the structure.

strcpy(st.sname,"james");
st.age = 10;
st.average_marks = 35;

We can create our own structure data types by using the typedef statement. Consider an example that creates a structure for employee details.

typedef struct Employee {
    char name[10];
    int age;
    float salary;
} Employee;
or
typedef struct {
    char name[10];
    int age;
    float salary;
} Employee;


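Comparing structures: Structures cannot be compared with the == operator, so each field is compared in turn. The function below returns TRUE if employee 1 and employee 2 are the same, otherwise it returns FALSE.

#include <string.h>
#define FALSE 0
#define TRUE 1

int EmployeeEqual(Employee p1, Employee p2)
{
    /* strcmp returns 0 when the two strings are equal */
    if (strcmp(p1.name, p2.name) != 0)
        return FALSE;
    if (p1.age != p2.age)
        return FALSE;
    if (p1.salary != p2.salary)
        return FALSE;
    return TRUE;
}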

A typical function call might be:

if (EmployeeEqual(p1, p2))
    printf("The two employees are the same\n");
else
    printf("The two employees are not the same\n");

Nested Structure: A structure can be embedded within another structure. That is, a structure can have another structure as its member; such a structure is called a nested structure.

For example, we may wish to include the date of birth of an employee in our employee structure by using a nested structure.

typedef struct {
    int month;
    int day;
    int year;
} date;
typedef struct {
    char name[10];
    int age;
    float salary;
    date dob;
} Employee;

If we have to store the information of a person we declare a variable as


Employee p1;

A person born on September 10, 1974, would have the values for the date struct set as:
p1.dob.month = 9;
p1.dob.day = 10;
p1.dob.year = 1974;

Array of Structures: We rarely store the details of only one student or one employee. When we have to store the details of a group of students, we can declare an array of structures.

Example: struct student s[10];
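
As a minimal sketch of how such an array might be processed, the function below averages the average_marks field over an array of student records. The function name class_average is an assumption made for illustration and is not part of the notes.

```c
#include <assert.h>
#include <stddef.h>

struct student {
    char sname[10];
    int age;
    float average_marks;
};

/* Hypothetical helper: mean of average_marks over the first n records. */
float class_average(const struct student s[], size_t n)
{
    float total = 0.0f;
    size_t i;

    assert(s != NULL || n == 0);
    if (n == 0)
        return 0.0f;                   /* avoid dividing by zero */
    for (i = 0; i < n; i++)
        total += s[i].average_marks;   /* s[i] selects record i, the dot selects the field */
    return total / (float)n;
}
```

For the declaration struct student s[10], a call such as class_average(s, 10) would visit every record once.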


Self-Referential Structures: A self-referential structure is one in which one or more of its data members is a pointer to a structure of the same type. Such structures are usually used with dynamic memory allocation (malloc and free) to explicitly obtain and release memory.

Example:
typedef struct list {
    int data;
    struct list *link;
} list;
• Each instance of the structure list has two components, data and link. data is a single integer, while link is a pointer to a list structure.
• The value of link is either the address in memory of an instance of list or the null pointer.

Consider these statements, which create three structures and assign values to their respective fields:
list item1, item2, item3;
item1.data = 5;
item2.data = 10;
item3.data = 15;
item1.link = item2.link = item3.link = NULL;

We can attach these structures together by replacing the null link field in item2 with one that points to item3, and by replacing the null link field in item1 with one that points to item2.
item1.link = &item2; item2.link = &item3;
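
Once the items are linked, they can be visited by following the link fields until the null pointer is reached. The sketch below assumes the list type is declared with a struct tag so that it can refer to itself; the names sum_list and sum_list_example are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list {
    int data;
    struct list *link;
} list;

/* Hypothetical helper: follow the link fields from first until the
   null pointer is reached, adding up the data fields on the way. */
int sum_list(const list *first)
{
    int total = 0;
    const list *cur;

    for (cur = first; cur != NULL; cur = cur->link)
        total += cur->data;
    return total;
}

/* Builds the chain item1 -> item2 -> item3 and checks the sums. */
void sum_list_example(void)
{
    list item1, item2, item3;

    item1.data = 5;
    item2.data = 10;
    item3.data = 15;
    item1.link = &item2;
    item2.link = &item3;
    item3.link = NULL;
    assert(sum_list(&item1) == 30);   /* 5 + 10 + 15 */
    assert(sum_list(&item3) == 15);   /* last item only */
}
```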

Unions: A union is a user-defined data type that can store related information of different or of the same data types, but the fields of a union share the same memory space. This means that only one field of the union is "active" at any given time.

Example 1: Suppose a program uses a number that is either an int or a float. We can define a union as
union num
{
    int a;
    float b;
};
union num n1;
Now we can store a value as n1.a = 5 or as n1.b = 3.14, but only one member is active at any point in time.
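
Because the union itself does not record which member is active, programs often keep a separate tag next to it. The following is a sketch under that assumption; the names num_kind, tagged_num and num_as_double are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

union num {
    int a;
    float b;
};

/* Hypothetical tag recording which member of the union is active. */
enum num_kind { NUM_INT, NUM_FLOAT };

struct tagged_num {
    enum num_kind kind;
    union num value;
};

/* Read the number as a double, using only the active member. */
double num_as_double(const struct tagged_num *t)
{
    assert(t != NULL);
    return (t->kind == NUM_INT) ? (double)t->value.a : (double)t->value.b;
}
```

Since the members share storage, sizeof(union num) is at least as large as its largest member, not the sum of the member sizes.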

Pointer variables, Declaration and Definition

• A pointer variable is a variable that stores the address of another variable.
• Declaring pointer variables
A variable is declared as a pointer variable by placing the indirection operator (*) before its name.
Syntax: type *identifier; // type is the type of the variable pointed to
For example
char *p;
int *m;
// The variable p is declared as a pointer to char. The variable m is declared as a pointer to int.

Initialization of pointer variables: Uninitialized variables hold unknown garbage values. Similarly, an uninitialized pointer variable holds an unknown value that will be interpreted as a memory address, and using it may lead to a runtime error.

These errors are difficult to debug and correct, therefore a pointer should always be initialized with a valid memory address.

A pointer can be initialized as follows


int a;
int *p;

// Here p is a pointer to int and a is an int, so p can point at a. To make p point at a we write the statement

p = &a; // now the address of a is stored in the pointer variable p, and p is said to be pointing at a.

If we do not want the pointer variable to point at anything, we can initialize it to NULL.

For example: int *p = NULL;

A null pointer does not point at any valid object. Dereferencing a null pointer is undefined behavior in C and, on most systems, terminates the program with a runtime error.

NOTE: A pointer variable should only point at a variable of the matching type.
We can have more than one pointer variable pointing at the same variable. For example
int a;
int *p, *q;
p = &a;
q = &a;
Now both the pointer variables p and q are pointing at the same variable a. There is no limit to the number of pointer variables that can point to a variable.

Accessing variables through pointers

• A variable can be accessed through a pointer variable by using the indirection operator.
• Indirection operator (*): The indirection operator is a unary operator whose operand must be a pointer value.
• For example, to access a variable a through a pointer variable p we code it as follows
#include <stdio.h>
int main(void)
{
    int a = 5;
    int *p;
    p = &a;        // p is now pointing at a
    *p = *p + 1;   // modifies a through p
    printf("%d %d %p", a, *p, (void *)p);
    return 0;
}
Output: 6 6 XXXXX (address of variable a)
Now the value of a is modified through the pointer variable p

Note:
• We need parentheses for expressions like (*p)++, as the precedence of postfix increment is higher than that of the indirection operator (*). If the parentheses are omitted, as in *p++, the pointer itself (the address) is incremented instead of the value it points to.
• The indirection and address operators are the inverse of each other: when combined in an expression such as *&a they cancel each other.
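
The difference between incrementing the value and incrementing the pointer can be sketched with two small functions. Both function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/* Increments the int that p points at; p itself is unchanged. */
void increment_value(int *p)
{
    assert(p != 0);
    (*p)++;
}

/* Returns the element that *pp points at, then advances the caller's
   pointer to the next element. Here ++ is applied to the pointer *pp,
   not to the value it points at. */
int read_and_advance(const int **pp)
{
    assert(pp != 0 && *pp != 0);
    return *(*pp)++;
}
```

With int a = 5, calling increment_value(&a) leaves a equal to 6, whereas read_and_advance only moves a pointer through an array and never changes the array contents.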


Write a program to add two numbers using pointers.

#include <stdio.h>
int main()
{
    int num1, num2, *p, *q, sum;
    printf("Enter two integers to add\n");
    scanf("%d%d", &num1, &num2);
    p = &num1; q = &num2;
    sum = *p + *q;   /* add the values pointed to by p and q */
    printf("Sum of the numbers = %d\n", sum);
    return 0;
}

Write a program to swap two numbers.

#include <stdio.h>
int main()
{
    int num1, num2, *p, *q, temp;
    printf("Enter two integers to swap\n");
    scanf("%d%d", &num1, &num2);
    p = &num1; q = &num2;
    temp = *p;
    *p = *q;
    *q = temp;
    /* print the values pointed to, not the pointers themselves */
    printf("After swapping num1 = %d, num2 = %d\n", *p, *q);
    return 0;
}

Pointers and Functions

When we call a function by passing the address of a variable, we call it pass by reference. (Strictly, C passes the pointer itself by value, but the effect is the same.) By passing the addresses of variables defined in main, the called function can store data directly in the caller's variables rather than returning a value. Pointers can also be used when we have to return more than one value from a function.
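
As a sketch of returning more than one value through pointers, the function below reports both the smallest and the largest element of an array. The name min_max and its parameter list are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical helper: stores the smallest and largest of the n
   elements of a in *min and *max. Requires n >= 1. */
void min_max(const int a[], int n, int *min, int *max)
{
    int i;

    assert(a != 0 && n >= 1 && min != 0 && max != 0);
    *min = *max = a[0];
    for (i = 1; i < n; i++) {
        if (a[i] < *min)
            *min = a[i];
        if (a[i] > *max)
            *max = a[i];
    }
}
```

The caller declares two ordinary int variables and passes their addresses, for example min_max(v, 5, &lo, &hi).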

Program to swap two characters using functions.

#include <stdio.h>
void swap(char *p1, char *p2);   /* prototype, since swap is defined after main */

int main(void)
{
    char a, b;
    printf("\nenter two characters\n");
    scanf(" %c %c", &a, &b);
    printf("the value before swap: %c %c\n", a, b);
    swap(&a, &b);
    printf("the value after swap: %c %c\n", a, b);
    return 0;
}
void swap(char *p1, char *p2)
{
    char temp;
    temp = *p1;
    *p1 = *p2;
    *p2 = temp;
}


Memory allocation functions: In many high-level languages, data structures are fully defined at compile time. Languages like C can also allocate memory during execution; this feature is known as dynamic memory allocation.
There are two ways in which we can reserve memory locations for a variable
• Static memory allocation: The declaration and definition of memory are specified in the source program. The number of bytes reserved cannot be changed at run time.
• Dynamic memory allocation: Memory is allocated at run time. Predefined functions are used to allocate and release memory for data while the program is running. To use dynamic memory allocation, the programmer must use either standard data types or previously declared derived data types.

Memory usage: Four memory management functions are used with dynamic memory. malloc, calloc and realloc are used for memory allocation. The function free is used to return memory when it is no longer needed.

Heap: The heap is the unused memory allocated to the program. When requests are made by the memory allocation functions, memory is allocated from the heap at run time.

Memory Allocation (malloc)

• When malloc is invoked with a request for memory, it allocates a block of memory containing the number of bytes specified in its parameter and returns a pointer to the start of the allocated memory.
• When the requested memory is not available, the pointer NULL is returned.

Syntax: void *malloc(size_t size);

Example: malloc(sizeof(int));
• The pointer returned by malloc can be type cast to a pointer of the required type by using a type cast expression. In C the cast is optional, because void * converts implicitly to any object pointer type.

Example: To allocate an integer in the heap we code

int *pint;
pint = (int *)malloc(sizeof(int));

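Releasing memory (free): When allocated memory locations are no longer needed, they should be freed by using the predefined function free. Only a pointer obtained from malloc, calloc or realloc (or NULL) may be passed to free.

Syntax: void free(void *ptr);
Example: int *p;
p = (int *)malloc(sizeof(int));
free(p);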

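Program showing the allocation of memory using malloc

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int *pi;
    float *pf;
    pi = (int *)malloc(sizeof(int));
    pf = (float *)malloc(sizeof(float));
    if (pi == NULL || pf == NULL)
        return 1;   /* allocation failed */
    *pi = 1344;
    *pf = 3.14f;
    printf("integer value= %d float value= %f", *pi, *pf);
    free(pi);
    free(pf);
    return 0;
}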


Contiguous memory allocation (calloc)

• This function is used to allocate a contiguous block of memory. It is primarily used to allocate memory for arrays.
• The function calloc() allocates a user specified amount of memory and initializes the allocated memory to 0.
• A pointer to the start of the allocated memory is returned.
• In case there is insufficient memory it returns NULL.

Syntax: void *calloc(size_t count, size_t size);

Example: To allocate a one dimensional array of integers whose capacity is n, the following code can be written.
int *ptr;
ptr = (int *)calloc(n, sizeof(int));

Reallocation of memory (realloc): The function realloc resizes memory previously allocated by malloc, calloc or realloc.

Syntax: void *realloc(void *ptr, size_t new_size);

Example
int *p;
p = (int *)calloc(n, sizeof(int));
p = realloc(p, s); /* where s is the new size in bytes */

The statement realloc(p, s) changes the size of the memory block pointed to by p to s bytes. The existing contents of the block remain unchanged up to the smaller of s and the old size.

• When s > oldsize (block size increases), the additional (s - oldsize) bytes have unspecified values.
• When s < oldsize (block size reduces), the rightmost (oldsize - s) bytes of the old block are freed.
• When realloc is able to do the resizing, it returns a pointer to the start of the new block, which may be at a different address from the old block.
• When it is not able to do the resizing, the old block is unchanged and the function returns NULL.
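
Because realloc returns NULL on failure while leaving the old block allocated, assigning its result straight back to the same pointer would lose the only reference to that block. A common way to avoid this is to go through a temporary pointer, sketched below; the function name resize_int_array and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize the int array *arr to new_count elements.
   Returns 1 on success and 0 on failure. On failure *arr still points
   at the old, unchanged block, so nothing is leaked or lost. */
int resize_int_array(int **arr, size_t new_count)
{
    int *tmp;

    assert(arr != NULL && new_count > 0);
    tmp = realloc(*arr, new_count * sizeof(int));
    if (tmp == NULL)
        return 0;       /* the old block is still valid */
    *arr = tmp;         /* the block may have moved */
    return 1;
}
```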

Dangling Reference: Once a block of memory is freed using the free function, there is no way to retrieve this storage, and any pointer that still refers to this location is called a dangling reference.

Example 2:
int *p, *f;
p = (int *)malloc(sizeof(int));
*p = 2;
f = p;
free(p);
*f = *f + 2; /* Invalid: dangling reference */

The location that holds the value 2 is freed, but a reference to this location still exists through f. The pointer f will try to access a location that has been freed, so f is a dangling reference.
Pointers can be dangerous: When pointers are used, the following points need to be taken care of
1. When a pointer is not pointing at any object, it is good practice to set it to NULL so that no attempt is made to access a memory location that is out of range of our program or that does not contain a legitimate object.


2. Use explicit type casts when converting between pointer types.

int *pi;
float *pf;
pi = (int *)malloc(sizeof(int));
pf = (float *)pi;
3. Define explicit return types for functions. In older versions of C, an omitted return type defaults to int; a returned pointer may then be truncated or misinterpreted, because an int is not guaranteed to have the same size as a pointer.

Dynamically allocated Arrays

One dimensional array: When we cannot determine the exact size of the array in advance, the space for the array can be allocated at run time.
For example, consider the code given below
int i, n, *list;
printf("enter the size of the array");
scanf("%d", &n);
if (n < 1)
{
    fprintf(stderr, "Improper value of n\n");
    exit(EXIT_FAILURE);
}
list = (int *)malloc(n * sizeof(int)); /* or list = (int *)calloc(n, sizeof(int)); */

Two dimensional Arrays

• Example of a two dimensional array declaration: int x[3][5];
• Here a one dimensional array of length 3 is created, where each element of x is a one dimensional array of length 5.

Figure 1.2 Memory representation of a two dimensional array (rows x[0], x[1] and x[2], each holding columns [0] to [4])

In C we find the element x[i][j] by first evaluating x[i], which gives a pointer to the zeroth element of row i of the array. Then, by adding j*sizeof(int) bytes to this address, the address of the jth element of the ith row is determined.

Example: To find x[1][3] we first evaluate x[1], which gives the address of x[1][0]. By adding 3*sizeof(int) bytes, the address of the element x[1][3] is determined.

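Program to create a 2 dimensional array at run time.

#include <stdlib.h>
int **maketwodarray(int rows, int cols)
{
    int **x, i;
    /* allocate an array of row pointers */
    x = (int **)malloc(rows * sizeof(*x));
    /* allocate the storage for each row */
    for (i = 0; i < rows; i++)
        x[i] = (int *)malloc(cols * sizeof(int));
    return x;
}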


The function can be invoked as follows to allocate memory

int **twodarray;
twodarray = maketwodarray(5, 10);
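
An array built this way consists of one block per row plus the block of row pointers, so each of them has to be released separately. The notes do not give a matching release function; the following is a sketch of one, with the hypothetical name freetwodarray.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counterpart to maketwodarray: release every row,
   then the array of row pointers itself. */
void freetwodarray(int **x, int rows)
{
    int i;

    assert(rows >= 0);
    if (x == NULL)
        return;
    for (i = 0; i < rows; i++)
        free(x[i]);     /* free(NULL) is allowed, so a failed row is harmless */
    free(x);
}
```

For the array allocated above, the call would be freetwodarray(twodarray, 5).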

Arrays
Linear Arrays: A linear array is a list of a finite number (n) of homogeneous data elements.
a. The elements of the array are referenced by an index set consisting of n consecutive numbers (0 ... (n-1)).
b. The elements of the array are stored in successive memory locations.
c. The number n of elements is called the length or size of the array. The length of the array can be obtained from the index set using the formula
Length = Upper bound - Lower bound + 1
d. The elements of an array may be denoted by a[0], a[1], ..., a[n-1]. The number k in a[k] is called a subscript or index, and a[k] is called the subscripted value.
e. An array is usually implemented as a consecutive set of memory locations.

Declaration: Linear arrays are declared by appending brackets to the name of a variable. The size of the array is given within the brackets.

E.g. int list[5]; // Declares an array containing five integer elements.

In C all arrays start at index 0. Therefore, list[0], list[1], list[2], list[3] and list[4] are the names of the five array elements, each of which contains an integer value.

Representation of Linear Arrays in memory

• When the compiler encounters an array declaration such as int list[5], it allocates five consecutive memory locations to create list. Each memory location is large enough to hold a single integer.
• The address of the first element, list[0], is called the base address
base address = address(list[0])
• Using the base address, the address of any element of list can be calculated using the formula
address(list[k]) = base address + w*k
// where w is the size of each element in the array list

Example: int list[5]

• The elements of the array are list[0] ... list[4].
• If the size of an integer on the machine is denoted by sizeof(int), then
address(list[k]) = base address + sizeof(int)*k

Let α be the base address (the address of list[0]) and let w = sizeof(int).

Variable    Memory address calculation
list[0]     α
list[1]     α + w*1
list[2]     α + w*2
list[3]     α + w*3
list[4]     α + w*4
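
The formula can be checked on a real array by measuring the distance in bytes between two element addresses. This is only a sketch; the helper name byte_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance from the base address &list[0] to &list[k].
   By the formula above this should equal w*k with w = sizeof(int). */
ptrdiff_t byte_offset(const int list[], int k)
{
    assert(list != NULL && k >= 0);
    return (const char *)&list[k] - (const char *)&list[0];
}
```

The addresses are converted to char pointers because subtracting char pointers counts bytes, while subtracting int pointers would count elements.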


Multi dimensional arrays

Two dimensional arrays: C uses the array-of-arrays representation to represent a multidimensional array. In this representation a two dimensional array is represented as a one dimensional array in which each element is itself a one dimensional array.
• A two dimensional m x n array A is a collection of m*n data elements such that each element is specified by a pair of integers called subscripts.
• An element with subscripts i and j is represented as A[i][j].
• Declaration: int A[3][5];
// It declares an array A that contains three elements, where each element is a one dimensional array. Each one dimensional array has 5 integer elements.

Example: A two dimensional array A[3][4] is represented as

                    Columns
            0        1        2        3
Rows   0    A[0][0]  A[0][1]  A[0][2]  A[0][3]
       1    A[1][0]  A[1][1]  A[1][2]  A[1][3]
       2    A[2][0]  A[2][1]  A[2][2]  A[2][3]

Representation of two dimensional arrays in memory: A two dimensional m x n array A is stored in memory as a collection of m*n sequential memory locations. If the array is stored column by column it is called column major order, and if the array is stored row by row it is called row major order. C stores arrays in row major order.

Example: Representation of the two dimensional array A[3][4] in row major order and column major order

Row major order          Column major order
A[0][0]   Row 1          A[0][0]   Column 1
A[0][1]                  A[1][0]
A[0][2]                  A[2][0]
A[0][3]                  A[0][1]   Column 2
A[1][0]   Row 2          A[1][1]
A[1][1]                  A[2][1]
A[1][2]                  A[0][2]   Column 3
A[1][3]                  A[1][2]
A[2][0]   Row 3          A[2][2]
A[2][1]                  A[0][3]   Column 4
A[2][2]                  A[1][3]
A[2][3]                  A[2][3]

Figure 1.1 Representation of a two dimensional array

• Using the base address, the address of any element in an array A of size row x col can be calculated using the following formulas, where w is the size of each element and array indexing starts at 0.
• Row major order
Address(A[i][j]) = Base address + w*(i*col + j)
• Column major order
Address(A[i][j]) = Base address + w*(i + row*j)
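
The two formulas can be written as small functions, which makes it easy to compare the orderings for the same element. This is a minimal sketch; the function names are hypothetical, and offsets are counted in elements so that the address is base + w * offset.

```c
#include <assert.h>

/* Element offset (in elements, not bytes) of A[i][j] in a rows x cols
   array stored in row major order. */
int row_major_offset(int i, int j, int rows, int cols)
{
    assert(i >= 0 && i < rows && j >= 0 && j < cols);
    return i * cols + j;
}

/* Offset of the same element when the array is stored in column major order. */
int col_major_offset(int i, int j, int rows, int cols)
{
    assert(i >= 0 && i < rows && j >= 0 && j < cols);
    return i + rows * j;
}

/* Address = base + w * offset, with w the size of one element. */
long element_address(long base, int w, int offset)
{
    return base + (long)w * offset;
}
```

For a 3 x 4 array, A[1][0] is at offset 4 in row major order but at offset 1 in column major order, while the first and last elements have the same offset in both.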


Example: When the compiler encounters an array declaration such as int A[3][4], it creates an array A and allocates 3*4 = 12 consecutive memory locations. Each memory location is large enough to hold a single integer.

Let α be the address of the first element A[0][0], called the base address. In the calculations below we assume α = 100 and w = 2 bytes per integer.

Considering row major order: Using the base address we can calculate the addresses of the other elements
Address of A[0][1] = 100 + 2*(0*4 + 1) = 100 + 2 = 102
Address of A[0][2] = 100 + 2*(0*4 + 2) = 100 + 4 = 104
Address of A[1][0] = 100 + 2*(1*4 + 0) = 100 + 8 = 108
Address of A[2][3] = 100 + 2*(2*4 + 3) = 100 + 22 = 122

Representation of Multidimensional Arrays: In C, multidimensional arrays are represented using the array-of-arrays representation. A multidimensional array can also be mapped onto a linear list, which is then stored in consecutive memory just as we store a one-dimensional array.

If an n-dimensional array a is declared as a[upper0][upper1] ... [uppern-1]; then the number of elements in the array is

$$\prod_{i=0}^{n-1} \mathit{upper}_i = \mathit{upper}_0 \cdot \mathit{upper}_1 \cdots \mathit{upper}_{n-1}$$

where Π is the product of the upperi's. For instance, if we declare a as a[10][10][10], then we require 10·10·10 = 1000 memory cells to hold the array. There are two common ways to represent multidimensional arrays: row major order and column major order. We consider only row major order here. As its name implies, row major order stores multidimensional arrays by rows.

Two dimensional arrays

• We interpret the two dimensional array A[upper0][upper1] as upper0 rows, each row containing upper1 elements.
• Assume that α, the base address, is the address of A[0][0].
• Then the address of an arbitrary element is
Address(A[i][j]) = α + i·upper1 + j ---------- (1)
• Here the element size is not considered. Taking the size into account, the formula can be written as
Address(A[i][j]) = α + w(i·upper1 + j), where w is the size of each element.

Representation of a three-dimensional array

• For A[upper0][upper1][upper2], we interpret the array as upper0 two dimensional arrays of dimension upper1 × upper2.

Address of A[i][j][k] = α + i·upper1·upper2 + j·upper2 + k --------- (2)

Representation of a four-dimensional array

• A[upper0][upper1][upper2][upper3]
• We interpret the array as upper0 three dimensional arrays of dimension upper1 × upper2 × upper3.
Address of A[i][j][k][l] = α + i·upper1·upper2·upper3 + j·upper2·upper3 + k·upper3 + l
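
The two, three and four dimensional formulas above follow one pattern: each index is multiplied by the product of all later dimensions. As a sketch, the same offset can be computed for any number of dimensions with a single loop; the function name nd_offset and the use of arrays for the indices and bounds are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical helper: row major offset (in elements) of
   A[idx[0]][idx[1]]...[idx[n-1]] in an array declared as
   A[upper[0]][upper[1]]...[upper[n-1]].
   The loop evaluates ((i0*upper1 + i1)*upper2 + i2)..., which expands
   to each index times the product of the later dimensions. */
long nd_offset(const int idx[], const int upper[], int n)
{
    long offset = 0;
    int d;

    assert(idx != 0 && upper != 0 && n >= 1);
    for (d = 0; d < n; d++) {
        assert(idx[d] >= 0 && idx[d] < upper[d]);
        offset = offset * upper[d] + idx[d];
    }
    return offset;
}
```

The address of the element is then α + w·offset, where w is the element size.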

Representation of an n-dimensional array

• Generalizing the preceding discussion, we can obtain the addressing formula for any element A[i0][i1] ... [in-1] in an n dimensional array declared as:
A[upper0][upper1] ... [uppern-1]
• If α is the address of A[0][0] ... [0], the base address, then
Address of A[i0][0][0] ... [0] = α + i0·upper1·upper2 ··· uppern-1
Address of A[i0][i1][0] ... [0] = α + i0·upper1·upper2 ··· uppern-1 + i1·upper2·upper3 ··· uppern-1
• Repeating in this way, the address of A[i0][i1] ... [in-1] is

$$\alpha + \sum_{j=0}^{n-1} i_j a_j, \quad \text{where } a_j = \prod_{k=j+1}^{n-1} \mathit{upper}_k \text{ for } 0 \le j < n-1 \text{ and } a_{n-1} = 1$$

Array Operations: Operations that can be performed on any linear structure, whether it is an array or a linked list, include the following
a. Traversal: Processing each element in the list.
b. Search: Finding the location of the element with a given key.
c. Insertion: Adding a new element to the list.
d. Deletion: Removing an element from the list.
e. Sorting: Arranging the elements in some type of order.
f. Merging: Combining two lists into a single list.

Traversing Linear Arrays: Traversing an array means accessing and processing each element exactly once. Taking the display of elements as the processing applied during traversal, the array can be traversed as follows
void displayarray(int a[], int n)   /* n is the number of elements in a */
{
    int i;
    printf("The Array Elements are:\n");
    for (i = 0; i < n; i++)
        printf("%d\t", a[i]);
}

Insertion
• Inserting an element at the end of the array can be done provided the memory space allocated for the array is large enough to accommodate the additional element.
• If an element needs to be inserted in the middle, then all the elements from the specified position to the end of the array must be moved down by one position to accommodate the new element and to keep the order of the other elements.
• The following function inserts an element at the specified position

void insertelement(int item, int pos, int *n, int a[])
{
    int i;
    if (pos < 0 || pos > *n)
        printf("Invalid Position\n");
    else
    {
        for (i = *n - 1; i >= pos; i--)
            a[i + 1] = a[i]; // Make space for the new element at the given position

        a[pos] = item;
        (*n)++; // parentheses are needed: *n++ would advance the pointer instead
    }
}

Deletion
• If an element needs to be deleted from the middle, then all the elements after the specified position up to the end of the array must be moved up by one position to fill the gap.
• The following function deletes the element at the specified position

void deleteelement(int a[], int pos, int *n)
{
    int i;
    if (pos < 0 || pos > *n - 1)
        printf("Invalid Position\n");
    else
    {
        printf("The Deleted Element is %d\n", a[pos]);
        for (i = pos; i < *n - 1; i++)
            a[i] = a[i + 1]; // Delete by shifting the later elements up

        (*n)--; // parentheses are needed: *n-- would move the pointer instead
    }
}

Sorting: Sorting refers to the operation of rearranging the elements of an array in increasing or decreasing order.

Example: Write a program to sort the elements of the array in ascending order using bubble sort.
#include<stdio.h>
int main(void)
{
    int a[10], i, j, temp, n;
    printf("enter the size of the array : ");
    scanf("%d", &n);
    if (n < 1 || n > 10)   /* a has room for at most 10 elements */
        return 1;
    printf("enter the elements of the array\n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);

    /* pass i moves the i-th largest element to its final position */
    for (i = 1; i <= n - 1; i++)
        for (j = 0; j < n - i; j++)
            if (a[j] > a[j + 1])
            {
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
    printf("the sorted array is \n");
    for (i = 0; i < n; i++)
        printf("%d \t", a[i]);
    return 0;
}

Complexity of the bubble sort algorithm is O(n^2)

• The time for sorting is measured in terms of the number of comparisons. In bubble sort there are n-1 comparisons in the first pass, which places the largest element in the last position.
• There are n-2 comparisons in the second pass, which places the second largest element in the next-to-last position, and so on. Therefore

$$C(n) = (n-1) + (n-2) + \cdots + 2 + 1 = \frac{n(n-1)}{2} = O(n^2)$$
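
The count can be confirmed by instrumenting the sort. The sketch below uses the same two loops as the bubble sort program but counts every comparison; the function name bubble_sort_count is hypothetical.

```c
#include <assert.h>

/* Bubble sort on a[0..n-1] in ascending order, returning the number
   of element comparisons performed. */
long bubble_sort_count(int a[], int n)
{
    long comparisons = 0;
    int i, j, temp;

    assert(a != 0 && n >= 0);
    for (i = 1; i <= n - 1; i++)
        for (j = 0; j < n - i; j++) {
            comparisons++;
            if (a[j] > a[j + 1]) {
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    return comparisons;
}
```

For n = 5 the function performs 4 + 3 + 2 + 1 = 10 comparisons whatever the input order, which matches n(n-1)/2.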

Searching:
• Let DATA be a collection of data elements in memory and suppose a specific ITEM of information is given.
• Searching refers to the operation of finding the location LOC of the ITEM in DATA, or printing a message that the item does not appear there.
• The search is successful if the ITEM appears in DATA and unsuccessful otherwise.

The algorithm chosen for searching depends on the way the data is organised. The two algorithms considered here are linear search and binary search.

LINEAR SEARCH: This program traverses the array sequentially to locate the key
#include<stdio.h>
#include<stdlib.h>
int main(void)
{
    int a[10], i, key, n;
    printf("enter the size of the array : ");
    scanf("%d", &n);
    printf("enter the elements of the array\n");
    for (i = 0; i < n; i++)
        scanf("%d", &a[i]);
    printf("enter the key \n");
    scanf("%d", &key);
    for (i = 0; i <= n - 1; i++)
        if (a[i] == key)
        {
            /* report the 1-based position of the match */
            printf("key %d found at %d", key, i + 1);
            exit(0);
        }
    printf("key not found");
    return 0;
}

Complexity of linear search: The complexity is based on the number of comparisons C(n) required to find the key in the array.
• The best case occurs when the key is found at the first position: C(n) = O(1).
• The worst case occurs when the key is not found in the array or when it is in the last position. Thus in the worst case the running time is proportional to n: C(n) = O(n).
• The running time of the average case uses the probabilistic notion of expectation. The number of comparisons can be any number from 1 to n, and each occurs with probability p = 1/n. Then

$$C(n) = 1 \cdot \frac{1}{n} + 2 \cdot \frac{1}{n} + \cdots + n \cdot \frac{1}{n} = (1 + 2 + \cdots + n) \cdot \frac{1}{n} = \frac{n(n+1)}{2} \cdot \frac{1}{n} = \frac{n+1}{2}$$
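
These counts can be observed directly if the search is written as a function that also reports how many elements it examined. This is a sketch only; the name linear_search and the comparisons output parameter are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical helper: returns the index of key in a[0..n-1], or -1 if
   it is absent. *comparisons receives the number of elements examined. */
int linear_search(const int a[], int n, int key, int *comparisons)
{
    int i;

    assert(a != 0 && n >= 0 && comparisons != 0);
    *comparisons = 0;
    for (i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return i;
    }
    return -1;
}
```

Searching a five element array once for each of its elements takes 1 + 2 + 3 + 4 + 5 = 15 comparisons in total, an average of 3 = (5 + 1)/2.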

BINARY SEARCH: This algorithm can be used only when the array is sorted.

For example, when searching for a name in a telephone directory this algorithm is more efficient than linear search, as the number of elements to search is halved in each iteration.

#include<stdio.h>
#include<stdlib.h>
int main()
{
int a[10],i,key,mid,low,high,n;
printf("enter the size of the array : ");
scanf("%d",&n);
printf("enter the elements of the array in ascending order\n");
for(i=0;i<n;i++)
scanf("%d",&a[i]);
printf("enter the key \n");
scanf("%d",&key);

low=0;
high=n-1;
while(low<=high)
{
mid=(low+high)/2;
if (key==a[mid])
{
printf("element %d found at %d",key,mid+1);
exit(0);
}
else
{
if (key<a[mid])
high = mid-1;
else
low=mid+1;
}
}
printf("key not found");
return(0);
}

Complexity of binary search algorithm:
• The complexity is based on the number of comparisons C(n) required to find the key in an array
of n elements.
• Each comparison halves the number of elements still to be searched, so at most floor(log2 n) + 1
comparisons are needed. C(n) is therefore of the order log2 n, that is, O(log n).


Limitation of binary search :


1. The list must be sorted
2. One must have a direct access to the middle element in any subset.

Polynomials: A polynomial is a sum of terms, where each term has the form a·x^e, where x is the variable,
a is the coefficient, and e is the exponent.

Example for polynomials:
A(x) = 3x^20 + 2x^5 + 4
B(x) = x^4 + 10x^3 + 3x^2 + 1

The largest (or leading) exponent of a polynomial is called its degree. Coefficients that are zero are
not displayed.
• Standard mathematical definitions for the sum and product of polynomials are as follows.
• Assume that we have two polynomials
A(x) = Σ a_i·x^i and B(x) = Σ b_i·x^i
then
A(x) + B(x) = Σ (a_i + b_i)·x^i
A(x) · B(x) = Σ (a_i·x^i · Σ (b_j·x^j))

ADT Polynomial is
objects: a set of ordered pairs <ei, ai>, where ai is in Coefficients and ei is in
Exponents; ei are integers >= 0

functions:
for all poly, poly1, poly2 ∈ Polynomial, coef ∈ Coefficients, expon ∈ Exponents

Polynomial Zero() ::= return the polynomial p(x) = 0
Boolean IsZero(poly) ::= if (poly) return FALSE else return TRUE
Coefficient Coef(poly, expon) ::= if (expon ∈ poly) return its coefficient else return zero
Exponent LeadExp(poly) ::= return the largest exponent in poly
Polynomial Attach(poly, coef, expon) ::= if (expon ∈ poly) return error else return the polynomial
    poly with the term <coef, expon> inserted
Polynomial Remove(poly, expon) ::= if (expon ∈ poly) return the polynomial poly with the term
    whose exponent is expon deleted else return error
Polynomial SingleMult(poly, coef, expon) ::= return the polynomial poly · coef · x^expon
Polynomial Add(poly1, poly2) ::= return the polynomial poly1 + poly2
Polynomial Mult(poly1, poly2) ::= return the polynomial poly1 · poly2
end Polynomial

Polynomial Representation:
• A polynomial can be represented as an array of structures as follows.
• Only one global array, terms, is used to store all the polynomials.
• The C declarations needed are:

# define MAX_TERMS 100 /*size of terms array*/


typedef struct {
float coef;
int expon;
} polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;

Consider the two polynomials
A(x) = 2x^1000 + 1 and
B(x) = x^4 + 10x^3 + 3x^2 + 1.
The figure below shows how these polynomials are stored in the array terms. The index of the first term
of A and B is given by startA and startB, respectively; finishA and finishB give the index of the last
term of A and B, respectively. The index of the next free location in the array is given by avail.

For our example, startA = 0, finishA = 1, startB = 2, finishB = 5, and avail = 6.

        startA  finishA  startB                    finishB  avail
coef       2       1        1      10       3        1
exp     1000       0        4       3       2        0
index      0       1        2       3       4        5        6

Figure 1.3 Representation of polynomial in array

• There is no limit on the number of polynomials that we can place in terms.
• The total number of nonzero terms must not be greater than MAX_TERMS.

This representation handles sparse polynomials well, since A(x) = 2x^1000 + 1 uses only six units of
storage: one for startA, one for finishA, two for the coefficients, and two for the exponents.
However, when all the terms are nonzero, this representation requires about twice as much space as
one that stores only a coefficient for every exponent from 0 up to the degree, because each term also
stores its exponent. The representation is therefore most useful when the number of nonzero terms is
small compared with the degree.
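
The layout of Figure 1.3 can be reproduced with a short sketch. It repeats the declarations given earlier so that it compiles on its own; storePoly is a hypothetical helper introduced only for this illustration, and it assumes the terms are supplied in decreasing order of exponent.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TERMS 100 /* size of terms array */

typedef struct {
    float coef;
    int expon;
} polynomial;

polynomial terms[MAX_TERMS];
int avail = 0;

/* Hypothetical helper: appends count terms to the global array and reports
   the index of the first and last term through start and finish. */
void storePoly(const float coefs[], const int expons[], int count,
               int *start, int *finish)
{
    int i;
    if (avail + count > MAX_TERMS) {
        fprintf(stderr, "Too many terms in the polynomial\n");
        exit(EXIT_FAILURE);
    }
    *start = avail;
    for (i = 0; i < count; i++) {
        terms[avail].coef = coefs[i];
        terms[avail].expon = expons[i];
        avail++;
    }
    *finish = avail - 1;
}
```

Storing A(x) = 2x^1000 + 1 and then B(x) = x^4 + 10x^3 + 3x^2 + 1 with this helper gives startA = 0, finishA = 1, startB = 2, finishB = 5 and leaves avail at 6, as in the figure.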

Polynomial addition
• The C function padd adds two polynomials, A and B, to obtain the resultant polynomial D = A + B.
The polynomials are added term by term.
• The attach function places the terms of D into the array terms, starting at position avail.
• If there is not enough space in terms to accommodate D, an error message is printed to the
standard error device and the program exits with an error condition.

/* compare two exponents: -1 if x < y, 0 if x == y, 1 if x > y */
#define COMPARE(x,y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

void attach(float coefficient, int exponent); /* defined below */

void padd(int startA,int finishA,int startB, int finishB, int *startD,int *finishD)
{
    /* add A(x) and B(x) to obtain D(x) */
    float coefficient;
    *startD = avail;
    while (startA <= finishA && startB <= finishB)
    {
        switch(COMPARE(terms[startA].expon, terms[startB].expon))
        {
            case -1: /* exponent of A < exponent of B */
                attach(terms[startB].coef,terms[startB].expon);
                startB++;
                break;
            case 0: /* equal exponents */
                coefficient = terms[startA].coef + terms[startB].coef;
                if (coefficient)
                    attach(coefficient,terms[startA].expon);
                startA++;
                startB++;
                break;
            case 1: /* exponent of A > exponent of B */
                attach(terms[startA].coef,terms[startA].expon);
                startA++;
        }
    }
    while(startA <= finishA)
    {
        attach(terms[startA].coef,terms[startA].expon); /* add in remaining terms of A(x) */
        startA++;
    }
    while(startB <= finishB)
    {
        attach(terms[startB].coef, terms[startB].expon); /* add in remaining terms of B(x) */
        startB++;
    }
    *finishD = avail-1;
}

/* add a new term to the polynomial */

void attach(float coefficient, int exponent)


{
if (avail >= MAX_TERMS)
{
fprintf(stderr,"Too many terms in the polynomial\n") ;
exit(EXIT_FAILURE);
}
terms[avail].coef = coefficient;
terms[avail].expon = exponent;
avail++;
}

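Time complexity analysis
• The number of nonzero terms in A and in B are the most important factors in the time
complexity. Let m and n be the number of nonzero terms in A and B, respectively.
• Each iteration of the first while loop advances startA, startB, or both, so it iterates at most
m + n - 1 times. The two remaining loops copy the leftover terms and together iterate at most m + n
times. So, the asymptotic computing time of this algorithm is O(n + m).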
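
As a worked example, the sketch below adds the two polynomials of Figure 1.3. It restates the declarations, attach and padd in compact form so that it compiles on its own; addExample is a hypothetical driver written only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TERMS 100
#define COMPARE(x, y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)

typedef struct { float coef; int expon; } polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;

void attach(float coefficient, int exponent)
{
    if (avail >= MAX_TERMS) {
        fprintf(stderr, "Too many terms in the polynomial\n");
        exit(EXIT_FAILURE);
    }
    terms[avail].coef = coefficient;
    terms[avail].expon = exponent;
    avail++;
}

void padd(int startA, int finishA, int startB, int finishB,
          int *startD, int *finishD)
{
    float coefficient;
    *startD = avail;
    while (startA <= finishA && startB <= finishB) {
        switch (COMPARE(terms[startA].expon, terms[startB].expon)) {
        case -1:
            attach(terms[startB].coef, terms[startB].expon);
            startB++;
            break;
        case 0:
            coefficient = terms[startA].coef + terms[startB].coef;
            if (coefficient)
                attach(coefficient, terms[startA].expon);
            startA++;
            startB++;
            break;
        case 1:
            attach(terms[startA].coef, terms[startA].expon);
            startA++;
        }
    }
    for (; startA <= finishA; startA++)   /* remaining terms of A(x) */
        attach(terms[startA].coef, terms[startA].expon);
    for (; startB <= finishB; startB++)   /* remaining terms of B(x) */
        attach(terms[startB].coef, terms[startB].expon);
    *finishD = avail - 1;
}

/* Hypothetical driver: stores A and B, adds them, returns the number of terms in D. */
int addExample(int *startD, int *finishD)
{
    avail = 0;
    attach(2, 1000); attach(1, 0);                            /* A in terms[0..1] */
    attach(1, 4); attach(10, 3); attach(3, 2); attach(1, 0);  /* B in terms[2..5] */
    padd(0, 1, 2, 5, startD, finishD);
    return *finishD - *startD + 1;
}
```

Here D(x) = 2x^1000 + x^4 + 10x^3 + 3x^2 + 2 occupies terms[6] to terms[10]; the two constant terms are the only pair with equal exponents, so they are the only terms that are actually summed.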

Sparse Matrices
• If a matrix contains m rows and n columns, the total number of elements in such a matrix is
m*n. If m equals n, the matrix is a square matrix.
• When a matrix is represented as a two dimensional array defined as a[max_rows][max_cols],
we can locate each element quickly by writing a[i][j], where i is the row index and j is the
column index.
• Consider the matrix given below. It contains many zero entries; such a matrix is called a sparse
matrix.

       Col0  Col1  Col2  Col3  Col4  Col5
Row0    15     0     0    22     0   -15
Row1     0    11     3     0     0     0
Row2     0     0     0    -6     0     0
Row3     0     0     0     0     0     0
Row4    91     0     0     0     0     0
Row5     0     0    28     0     0     0

When a sparse matrix is represented as a two dimensional array, space is wasted. For example, if we
have a 1000 x 1000 matrix with only 2000 nonzero elements, the corresponding two dimensional array
still requires space for 1,000,000 elements.
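
The saving can be put in numbers. The sketch below is a rough count only; denseCells and tripleCells are hypothetical helper names, and the second one assumes the triple representation described under Sparse Matrix Representation, in which every nonzero element takes three integers and one extra header triple records the dimensions.

```c
/* Number of cells used by a dense rows x cols two dimensional array. */
long denseCells(long rows, long cols)
{
    return rows * cols;
}

/* Number of integers used by the triple form: one <row, col, value>
   triple per nonzero element plus one header triple. */
long tripleCells(long nonzero)
{
    return 3 * (nonzero + 1);
}
```

For the 1000 x 1000 matrix with 2000 nonzero elements this gives 1,000,000 cells for the array against 3 x 2001 = 6003 integers for the triples.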

ADT SparseMatrix is
objects: a set of triples, <row, column, value>, where row and column are
integers and form a unique combination, and value comes from the set item.
functions:
for all a, b ∈ SparseMatrix, x ∈ item, i, j, maxCol, maxRow ∈ index

SparseMatrix Create(maxRow, maxCol) ::= return a SparseMatrix that can hold up to
    maxItems = maxRow × maxCol and whose maximum
    row size is maxRow and whose maximum column size
    is maxCol.
SparseMatrix Transpose(a) ::= return the matrix produced by interchanging
    the row and column value of every triple.
SparseMatrix Add(a, b) ::= if the dimensions of a and b are the same
    return the matrix produced by adding corresponding
    items, namely those with identical row and column
    values else return error
SparseMatrix Multiply(a, b) ::= if number of columns in a equals number of
    rows in b return the matrix d produced by multiplying
    a by b according to the formula:
    d[i][j] = Σ (a[i][k]·b[k][j]), summed over k, else return error.

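Sparse Matrix Representation
• A sparse matrix can be represented by using an array of triples <row, col, value>.
• In addition, to ensure that the operations terminate, it is necessary to know the number of rows and
columns and the number of nonzero elements in the matrix. In the examples that follow this
information is kept in a[0]: a[0].row holds the number of rows, a[0].col the number of columns, and
a[0].value the number of nonzero elements. Putting all this information together, a sparse matrix can
be created as follows: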
SparseMatrix Create(maxRow, maxCol) ::=
#define MAX_TERMS 101 /* maximum number of terms +1*/
typedef struct {
int col;
int row;
int value;
} term;
term a[MAX_TERMS] ;


Example

       Col0  Col1  Col2  Col3  Col4  Col5
Row0    15     0     0    22     0   -15
Row1     0    11     3     0     0     0
Row2     0     0     0    -6     0     0
Row3     0     0     0     0     0     0
Row4    91     0     0     0     0     0
Row5     0     0    28     0     0     0

a) Two dimensional array

       Row   Col   Value
a[0]    6     6      8
a[1]    0     0     15
a[2]    0     3     22
a[3]    0     5    -15
a[4]    1     1     11
a[5]    1     2      3
a[6]    2     3     -6
a[7]    4     0     91
a[8]    5     2     28

b) Sparse matrix stored as triples

Figure 1.4 Two dimensional array and its sparse matrix stored as triples

Write a program to store a sparse matrix in triplet form and search an element specified by
the user

#include<stdio.h>
int main(void)
{
    struct sparse
    {
        int r;
        int c;
        int v;
    };
    struct sparse s[100];
    int ele,i,j,k,n,m,key;
    printf("enter the size of the array : ");
    scanf("%d %d",&m,&n);
    k=1;
    s[0].r=m; /* s[0] holds the number of rows, columns and nonzero elements */
    s[0].c=n;
    printf("\n enter the elements of the array\n");
    for(i=0;i<m;i++)
        for(j=0;j<n;j++)
        {
            scanf("%d",&ele);
            if(ele !=0)
            {
                s[k].r=i;
                s[k].c=j;
                s[k].v= ele;
                k++;
            }
        }
    s[0].v=k-1;
    for(i=0;i<=s[0].v;i++)
        printf(" %d\t %d \t %d \n ",s[i].r, s[i].c, s[i].v);
    printf(" enter the key to be searched");
    scanf("%d",&key);
    for(i=1;i<=s[0].v;i++) /* start at 1: s[0] is the header, not a matrix element */
        if (key== s[i].v)
        {
            printf("element found at %d row and %d column",s[i].r,s[i].c);
            return 0;
        }
    printf("element not found ");
    return 0;
}

Transposing a Matrix: To transpose a matrix we must interchange the rows and columns.
This means that each element a[i][j] in the original matrix becomes element b[j][i] in the
transpose matrix.

The algorithm finds all the elements in column 0 and stores them in row 0 of the transpose matrix, finds
all the elements in column 1 and stores them in row 1, and so on. Since the original matrix is ordered by
rows and the columns are ordered within each row, the transpose matrix will also be arranged in
ascending order. The variable currentb holds the position in b that will contain the next transposed
term. The terms in b are generated by rows, by collecting the nonzero terms from column i of a.

The transpose b of the sparse matrix a of figure 1.4b is shown in figure 1.5.

       Row   Col   Value
b[0]    6     6      8
b[1]    0     0     15
b[2]    0     4     91
b[3]    1     1     11
b[4]    2     1      3
b[5]    2     5     28
b[6]    3     0     22
b[7]    3     2     -6
b[8]    5     0    -15

Figure 1.5 Transpose of the matrix

Function to find the transpose of a sparse matrix

void transpose(term a[], term b[]) /* b is set to the transpose of a */


{
int n,i,j, currentb;
n = a[0].value; /* total number of elements */
b[0].row = a[0].col; /* rows in b = columns in a */
b[0] .col = a[0] .row; /* columns in b = rows in a */
b[0].value = n;


if (n > 0 ) /* non zero matrix */


{
currentb = 1;
for (i = 0; i < a[0].col; i++) /* transpose by the columns in a */
for (j = 1; j <= n; j++)
if (a[j].col == i) /* find elements from the current column */
{
b[currentb].row = a[j].col; /* element is in current column, add it to b */
b[currentb].col = a[j].row;
b[currentb].value = a[j].value;
currentb++;
}
}
}

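Analysis of transpose: The outer for loop is executed once for every column of a, and for each
column the inner loop scans all the nonzero elements. Hence, the asymptotic time complexity of the
transpose algorithm is O(columns·elements).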

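Algorithm to transpose a two dimensional array of size rows × columns

Input: two dimensional array a of size rows × columns
Output: two dimensional array b of size columns × rows that stores the transpose of a
for (i = 0; i < rows; i++)
    for (j = 0; j < columns; j++)
        b[j][i] = a[i][j];

The time required is O(columns·rows). When the number of nonzero elements is of the order
columns·rows, the triple-based transpose above takes O(columns^2·rows) time, which is worse than
this simple algorithm.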

Fast Transpose: We can transpose a matrix represented as a sequence of triples
in O(columns + elements) time. This algorithm, fastTranspose, is listed below.

It first determines the number of elements in each column of the original matrix. This gives us the
number of elements in each row of the transpose matrix. From this information, we can determine the
starting position of each row in the transpose matrix. We can now move the elements in the original
matrix one by one into their correct position in the transpose matrix. We assume that the number of
columns in the original matrix never exceeds MAX_COL.

Program Fast Transpose

void fastTranspose(term a[], term b[]) /* the transpose of a is placed in b */


{
int rowTerms[MAX_COL], startingPos[MAX_COL];
int i,j, numCols = a[0].col, numTerms = a[0].value;
b[0].row = numCols;
b[0].col = a[0].row;
b[0].value = numTerms;
if (numTerms > 0) { /* nonzero matrix */
for (i = 0; i < numCols; i++)
rowTerms[i] = 0;
for (i = 1; i <= numTerms; i++)
rowTerms[a[i].col]++;
startingPos[0] = 1;


for (i = 1; i < numCols; i++)


startingPos[i] = startingPos[i-1] + rowTerms[i-1];

for (i = 1; i <= numTerms; i++)


{
j = startingPos[a[i].col]++;
b[j].row = a[i].col;
b[j].col = a[i].row;
b[j].value = a[i].value;
}
}
}

Analysis of Fast Transpose

• The first two for loops compute the values for rowTerms, the third for loop carries out the
computation of startingPos, and the last for loop places the triples into the transpose matrix.
These four loops determine the computing time of fastTranspose.
• The bodies of the loops are executed numCols, numTerms, numCols - 1, and numTerms times,
respectively. The computing time for the algorithm is O(columns + elements).
• However, transpose requires less space than fastTranspose, since the latter function must
allocate space for the rowTerms and startingPos arrays.
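
To see what the two auxiliary arrays contain, the first three loops of fastTranspose can be run on their own. The sketch below is illustrative: columnStarts is a hypothetical helper name, and the caller is assumed to supply rowTerms and startingPos arrays with at least a[0].col entries.

```c
typedef struct {
    int col;
    int row;
    int value;
} term;

/* Fills rowTerms[] and startingPos[] in the same way as the first
   three loops of fastTranspose. a[0] is the header triple. */
void columnStarts(const term a[], int rowTerms[], int startingPos[])
{
    int i, numCols = a[0].col, numTerms = a[0].value;
    for (i = 0; i < numCols; i++)
        rowTerms[i] = 0;
    for (i = 1; i <= numTerms; i++)
        rowTerms[a[i].col]++;          /* count the terms in each column of a */
    startingPos[0] = 1;                /* row 0 of b starts right after the header */
    for (i = 1; i < numCols; i++)
        startingPos[i] = startingPos[i-1] + rowTerms[i-1];
}
```

For the matrix of Figure 1.4 this gives rowTerms = 2, 1, 2, 2, 0, 1 and startingPos = 1, 3, 4, 6, 8, 8. These are the positions at which rows 0 to 5 begin in Figure 1.5; row 4 is empty, so it shares its starting position with row 5.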

Strings: A string is an array of characters that is terminated by the null character (\0).

Example 1: char s[100] = {"class"};

The string is internally represented as follows

c     l     a     s     s     \0
s[0]  s[1]  s[2]  s[3]  s[4]  s[5]

The same can also be declared as char s[] = {"class"};

Using this declaration the compiler reserves just enough space to hold each character of the
word, including the null character. In such cases we cannot store a string of length more than 5 in s.
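
The difference between the two declarations shows up in sizeof. The following sketch is illustrative; the variable names fixed and exact and the helper maxLength are hypothetical.

```c
#include <string.h>

char fixed[100] = {"class"};  /* 100 bytes reserved: up to 99 characters plus '\0' */
char exact[] = {"class"};     /* 6 bytes reserved: 5 characters plus '\0' */

/* Longest string (not counting '\0') that fits in a buffer of the given size. */
size_t maxLength(size_t bufferSize)
{
    return bufferSize > 0 ? bufferSize - 1 : 0;
}
```

Here sizeof exact is 6 while strlen(exact) is 5, so maxLength(sizeof exact) is 5, whereas maxLength(sizeof fixed) is 99.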

ADT String is
objects: a finite set of zero or more characters
functions: for all s, t ∈ String, i, j, m ∈ non-negative integers

String Null(m) ::= return a string whose maximum length is m characters, but is initially set to
    NULL. We write NULL as ""
Integer Compare(s, t) ::= if s equals t return 0
    else if s precedes t return -1
    else return +1
Boolean IsNull(s) ::= if (Compare(s, NULL)) return FALSE
    else return TRUE
Integer Length(s) ::= if (Compare(s, NULL)) return the number of characters in s
    else return 0
String Concat(s, t) ::= if (Compare(t, NULL)) return a string whose elements are those of s
    followed by those of t
    else return s
String Substr(s, i, j) ::= if ((j > 0) && (i + j - 1) < Length(s))
    return the string containing the characters of s at positions i to i+j-1
    else return NULL

C provides several string functions, which we access by including the header file string.h.
Given below is a set of C string functions.

char *strcat(char *dest, const char *src)
    Appends the string pointed to by src to the end of the string pointed to by dest.

char *strncat(char *dest, const char *src, size_t n)
    Appends at most n characters of the string pointed to by src to the end of the string
    pointed to by dest.

int strcmp(const char *str1, const char *str2)
    Compares the string pointed to by str1 to the string pointed to by str2.
    Returns < 0 if str1 < str2, 0 if str1 = str2, > 0 if str1 > str2.

int strncmp(const char *str1, const char *str2, size_t n)
    Compares at most the first n characters of str1 and str2.
    Returns < 0 if str1 < str2, 0 if str1 = str2, > 0 if str1 > str2.

char *strcpy(char *dest, const char *src)
    Copies the string pointed to by src to dest and returns dest.

char *strncpy(char *dest, const char *src, size_t n)
    Copies at most n characters from the string pointed to by src to dest and returns dest.

size_t strlen(const char *str)
    Returns the length of the string str, not including the terminating null character.

char *strchr(const char *str, int c)
    Returns a pointer to the first occurrence of c in str. Returns NULL if not present.

char *strrchr(const char *str, int c)
    Returns a pointer to the last occurrence of c in str. Returns NULL if not present.

char *strtok(char *str, const char *delim)
    Returns a token from the string str. Tokens are separated by the characters in delim.

char *strstr(const char *str, const char *pat)
    Returns a pointer to the start of pat in str. Returns NULL if not present.

size_t strspn(const char *str, const char *spanset)
    Scans str for characters in spanset; returns the length of the initial span made up only
    of characters from spanset.

size_t strcspn(const char *str, const char *spanset)
    Scans str for characters not in spanset; returns the length of the initial span made up
    only of characters that are not in spanset.

char *strpbrk(const char *str, const char *spanset)
    Scans str for characters in spanset; returns a pointer to the first occurrence of a
    character from spanset. Returns NULL if there is none.
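
A few of the less familiar functions are easier to follow with small examples. The helpers below are hypothetical wrappers written only for illustration; each one relies on a single standard function from string.h.

```c
#include <string.h>

/* Length of the leading run of decimal digits in s (strspn). */
size_t leadingDigits(const char *s)
{
    return strspn(s, "0123456789");
}

/* Number of characters before the first comma, or the whole
   length of s if it contains no comma (strcspn). */
size_t lengthBeforeComma(const char *s)
{
    return strcspn(s, ",");
}

/* Part of path after the last '/', or path itself if there is none (strrchr). */
const char *baseName(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Counts the space separated words in buffer (strtok).
   strtok overwrites the separators, so buffer must be modifiable. */
int countWords(char *buffer)
{
    int count = 0;
    char *token = strtok(buffer, " ");
    while (token != NULL) {
        count++;
        token = strtok(NULL, " ");
    }
    return count;
}
```

Note that countWords must be given an array such as char text[] = "data structures and applications"; and not a string literal, because strtok writes null characters into the string it scans.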


Storing Strings
Strings are stored in three types of structures
1. Fixed Length structure
2. Variable Length structure with fixed maximums
3. Linked structures

1. Fixed length storage, record oriented: In this structure each line of text is viewed as a record,
where all records have the same length, that is, the same number of characters.

Example: Assuming each record holds a maximum of 12 characters, the strings are stored as
follows

0  D A T A
1  S T R U C T U R E S
2  A N D
3  A P P L I C A T I O N
4
5

Advantages:
• Ease of accessing data from any given record
• Ease of updating data in any given record (provided the length of the new data does not exceed
the record length)

Disadvantages:
• Time is wasted reading an entire record if most of the storage consists of inessential blank
spaces
• Certain records may require more space than is available
• When a correction consists of more or fewer characters than the original text, updating requires
the entire record to be changed (this disadvantage can be resolved by using an array of pointers)

2. Variable length storage with fixed maximum: The storage of variable length strings in memory
cells with fixed lengths can be done in two ways
• Use a marker such as ($) to mark the end of the string
• List the length of the string as an additional field in the pointer array

Example (the length field counts the end marker):
0   5   D A T A $
1  11   S T R U C T U R E S $
2   4   A N D $
3  12   A P P L I C A T I O N $
4   0
5   0

3. Linked storage: A linked list is an ordered sequence of memory cells called nodes, where each node
stores two pieces of information: the data and the address of the next node in the list. Strings may be
stored in a linked list with each node storing one character or a fixed number of characters, together
with a link containing the address of the node that holds the next group of characters.
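
A linked string of this kind might be declared as in the sketch below. Everything in it is an assumption made for illustration: the node type strNode, the choice of 4 characters per node (CHUNK), and the helper functions are not taken from the notes.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK 4  /* assumed number of characters stored in each node */

typedef struct strNode {
    char data[CHUNK];      /* a group of characters, not null terminated */
    int used;              /* number of characters of data in use */
    struct strNode *next;  /* address of the node holding the next group */
} strNode;

/* Releases every node of the list. */
void freeLinked(strNode *head)
{
    while (head != NULL) {
        strNode *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a linked list holding the characters of s.
   Returns NULL for an empty string or if memory runs out. */
strNode *toLinked(const char *s)
{
    strNode *head = NULL, *tail = NULL;
    size_t remaining = strlen(s);
    while (remaining > 0) {
        strNode *node = malloc(sizeof *node);
        if (node == NULL) {
            freeLinked(head);
            return NULL;
        }
        node->used = remaining < CHUNK ? (int)remaining : CHUNK;
        memcpy(node->data, s, (size_t)node->used);
        node->next = NULL;
        if (tail == NULL)
            head = node;
        else
            tail->next = node;
        tail = node;
        s += node->used;
        remaining -= (size_t)node->used;
    }
    return head;
}

/* Number of nodes in the list. */
int nodeCount(const strNode *head)
{
    int count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}
```

With 4 characters per node, the 10 characters of STRUCTURES occupy three nodes holding STRU, CTUR and ES; the last node uses only 2 of its 4 cells.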


String insertion function

Example: Insert string t in string s at position i = 1

s = a m o b i l e \0

t = u t o \0

Initially
temp = \0

strncpy(temp,s,i)
temp = a \0

strcat(temp,t)
temp = a u t o \0

strcat(temp,(s+i))
temp = a u t o m o b i l e \0

Consider two strings str1 and str2. The function below inserts string t into string s at position i.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define max_size 100
char str1[max_size];
char str2[max_size];

void strins(char *s, char *t, int i)
{
    char temp[max_size];

    if (i<0 || i>(int)strlen(s))
    {
        printf("position is out of bound");
        exit(0);
    }
    else
    if (strlen(t))
    {
        strncpy(temp,s,i);
        temp[i]='\0'; /* strncpy does not add the null character here */
        strcat(temp,t);
        strcat(temp,(s+i));
        strcpy(s,temp);
    }
}
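
The function above assumes that the result fits in both temp and s. A variant that checks the available space and reports failure instead of exiting could look like the sketch below. It is not the method used in the notes: strinsChecked is a hypothetical name, the capacity parameter is an added assumption, and the insertion is done in place with memmove rather than through a temporary buffer.

```c
#include <string.h>

/* Inserts t into s at position i. capacity is the size of the array
   that holds s. Returns 0 on success, or -1 if i is out of range or
   the result would not fit; in that case s is left unchanged. */
int strinsChecked(char *s, size_t capacity, const char *t, size_t i)
{
    size_t lens = strlen(s), lent = strlen(t);
    if (i > lens || lens + lent + 1 > capacity)
        return -1;
    /* shift the tail of s, including '\0', right by lent places */
    memmove(s + i + lent, s + i, lens - i + 1);
    /* copy t into the gap that was opened */
    memcpy(s + i, t, lent);
    return 0;
}
```

With char s[100] = "amobile", the call strinsChecked(s, sizeof s, "uto", 1) turns s into "automobile". The same call on an 8 byte array fails with -1, because the 11 bytes of the result do not fit.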

Pattern matching: Consider two strings str and pat, where pat is a pattern to be searched for in
str. The easiest way to find whether pat is in str is to use the built-in function strstr.

Example: If we have a declaration as follows

char pat[max_size], str[max_size], *t;

The pattern matching can be carried out as follows

if ((t = strstr(str,pat)))
    printf("The pattern found in the string is %s",t);
else
    printf("The pattern was not found");

Although strstr is convenient, there are other methods of pattern matching. Two functions that
find a pattern more efficiently are discussed below.

The simplest and least efficient method of pattern matching is sequential search, in which the
pattern is compared with the string at every possible starting position. The computing
time is O(n·m), where m is the length of the string and n is the length of the pattern.
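
Sequential search is not listed in these notes, so the following is only a minimal sketch of the idea, under the hypothetical name naiveFind. It tries every starting position in the string and compares the pattern character by character.

```c
#include <string.h>

/* Returns the index of the first occurrence of pat in string,
   or -1 if pat does not occur. An empty pattern matches at index 0. */
int naiveFind(const char *string, const char *pat)
{
    int lens = (int)strlen(string);
    int lenp = (int)strlen(pat);
    int start, j;
    for (start = 0; start + lenp <= lens; start++) {
        /* compare pat with the characters of string beginning at start */
        for (j = 0; j < lenp && string[start + j] == pat[j]; j++)
            ;
        if (j == lenp)
            return start;
    }
    return -1;
}
```

The outer loop runs at most m times and the inner loop at most n times for each start, which is where the O(n·m) bound comes from.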

Exhaustive pattern matching is improved in the nfind function by
• Quitting when strlen(pat) is greater than the number of remaining characters in the string
• Comparing the first and last characters of pat and string before checking the remaining
characters

int nfind(char *string,char *pat)
{
    /* match the last character of the pattern first, then match from the beginning;
       pat is assumed to be non-empty */
    int i,j,start=0;
    int lasts=strlen(string)-1;
    int lastp=strlen(pat)-1;
    int endmatch= lastp;

    for(i=0;endmatch<=lasts;endmatch++,start++)
    {
        if(string[endmatch] == pat[lastp])
        {
            j=0;i= start;
            while(j<lastp && string[i]== pat[j])
            {
                i++;
                j++;
            }
            if(j==lastp)
                return start; /* successful match */
        }
    }
    return -1;
}
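Simulation of nfind

Pattern: a a b (j starts at index 0, lastp = 2)
String:  a b a b b a a b a a (lasts = 9)

In each step s marks start, em marks endmatch and ls marks lasts.

start  endmatch  string[endmatch]  Result
0      2         a                 No Match (last character differs from b)
1      3         b                 No Match (string[1] = b differs from pat[0] = a)
2      4         b                 No Match (string[2] matches, string[3] = b differs from pat[1] = a)
3      5         a                 No Match (last character differs from b)
4      6         a                 No Match (last character differs from b)
5      7         b                 Match (nfind returns 5)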

Analysis of nfind algorithm: In the best and average cases the running time is linear in the length
of the string, O(m), but the worst case computing time is still O(n·m).


Knuth, Morris, Pratt pattern matching algorithm
• Ideally we would like an algorithm that works in O(strlen(string) + strlen(pat)) time. This is
optimal for this problem, as in the worst case it is necessary to look at all characters in the
pattern and string at least once.
• We want to search the string for the pattern without moving backwards in the string. That is, if
a mismatch occurs we want to use the knowledge of the characters in the pattern and the
position in the pattern where the mismatch occurred to determine where the search should
continue. Knuth, Morris, and Pratt have developed an algorithm that works in this way and
has linear complexity.

The following declarations are assumed.

#include<string.h>
#define max_string_size 100
#define max_pat_size 100
int failure[max_pat_size]; /* failure values of pat, computed before pmatch is called */

int pmatch(char *string,char *pat)
{
    /* Knuth, Morris, Pratt string matching algorithm */
    int i=0,j=0;
    int lens= strlen(string);
    int lenp= strlen(pat);
    while (i<lens && j<lenp)
    {
        if (string[i] == pat[j])
        {
            i++; j++;
        }
        else if (j==0) i++;
        else j= failure[j-1] +1;
    }
    return ((j==lenp) ? (i-lenp) : -1);
}

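Example: For the pattern pat = abcabcacab, the failure values are calculated as below. Here failure[j]
is the largest i < j such that the first i+1 characters of pat are the same as the i+1 characters
ending at position j, and -1 if there is no such i. This is the convention that pmatch relies on when
it computes failure[j-1] + 1.

j         0   1   2   3   4   5   6   7   8   9
pat       a   b   c   a   b   c   a   c   a   b
failure  -1  -1  -1   0   1   2   3  -1   0   1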

Analysis of pmatch: The time complexity of function pmatch is O(m) = O(strlen(string)).

Analysis of fail: The computing time of fail, the function that computes the failure values, is
O(n) = O(strlen(pat)).

Therefore, when the failure function is not known in advance, the total computing time is
O(strlen(string)) + O(strlen(pat)).
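
The function fail is referred to above but not listed. The sketch below shows one way it can be written so that it agrees with pmatch, assuming the convention that failure[j] is -1 when no proper prefix of the pattern ends at position j, and that the pattern is shorter than max_pat_size. pmatch is repeated so that the block compiles on its own.

```c
#include <string.h>

#define max_pat_size 100

int failure[max_pat_size];

/* Computes the failure values of pat into the global array failure. */
void fail(const char *pat)
{
    int i, j, n = (int)strlen(pat);
    failure[0] = -1;
    for (j = 1; j < n; j++) {
        i = failure[j-1];
        /* fall back to shorter prefixes until one can be extended by pat[j] */
        while (pat[j] != pat[i+1] && i >= 0)
            i = failure[i];
        if (pat[j] == pat[i+1])
            failure[j] = i + 1;
        else
            failure[j] = -1;
    }
}

/* Returns the index of the first occurrence of pat in string, or -1.
   fail(pat) must have been called beforehand. */
int pmatch(const char *string, const char *pat)
{
    int i = 0, j = 0;
    int lens = (int)strlen(string);
    int lenp = (int)strlen(pat);
    while (i < lens && j < lenp) {
        if (string[i] == pat[j]) {
            i++;
            j++;
        } else if (j == 0) {
            i++;
        } else {
            j = failure[j-1] + 1;  /* i never moves backwards */
        }
    }
    return (j == lenp) ? (i - lenp) : -1;
}
```

A typical use is to call fail(pat) once and then pmatch(string, pat). For pat = aab and string = ababbaabaa the failure values are -1, 0, -1 and pmatch returns 5, the same position that nfind reports for this pair.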
