
DSA-Chapter 1


Chapter 1

Introduction to Data Structures and Algorithms


Data Structures are the programmatic way of storing data so that data can be used efficiently.

Why Learn Data Structures and Algorithms?

As applications become more complex and data-rich, they commonly face three problems:

• Data Search − Consider a store inventory of 1 million (10^6) items. If the application
searches for an item by checking every entry, each search may examine up to 10^6 items.
As the data grows, search becomes slower.
• Processor speed − Processor speed, although very high, becomes a limiting factor when the
data grows to a billion records.
• Multiple requests − As thousands of users can search data simultaneously on a web
server, even a fast server can fail while searching the data.

To solve these problems, data structures come to the rescue. Data can be organized in a data
structure in such a way that not all items need to be examined, and the required data can be
found almost instantly.
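As a rough illustration of this point, the sketch below compares a scan of every item with a binary search over sorted data. The helper names (`linearSearchSteps`, `binarySearchSteps`, `makeInventory`) and the use of consecutive integer item IDs are assumptions made for this example only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: scans the items in order and counts comparisons.
std::size_t linearSearchSteps(const std::vector<int>& items, int target) {
    std::size_t steps = 0;
    for (int value : items) {
        ++steps;
        if (value == target) break;
    }
    return steps;
}

// Hypothetical helper: binary search on sorted data; counts probes.
std::size_t binarySearchSteps(const std::vector<int>& sorted, int target) {
    std::size_t steps = 0;
    std::size_t low = 0, high = sorted.size();   // search range is [low, high)
    while (low < high) {
        ++steps;
        std::size_t mid = low + (high - low) / 2;
        if (sorted[mid] == target) break;
        if (sorted[mid] < target) low = mid + 1;
        else high = mid;
    }
    return steps;
}

// Builds a sorted inventory of n item IDs: 0, 1, ..., n-1 (assumed data).
std::vector<int> makeInventory(int n) {
    std::vector<int> items(n);
    for (int i = 0; i < n; ++i) items[i] = i;
    return items;
}
```

With an inventory of 1,000,000 items, looking up the last ID takes 1,000,000 comparisons with the scan and at most 20 probes with binary search, because each probe halves the remaining range.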

What is abstract data type?

An abstract data type is an abstraction of a data structure that provides only the interface to
which the data structure must adhere. The interface does not give any specific details about
how something should be implemented or in what programming language.

In other words, abstract data types define data and operations but carry no implementation
details. We know what data is stored and which operations can be performed on it, but not how
they are implemented. The reason for leaving out implementation details is that every
programming language has a different implementation strategy; for example, a C data structure is
implemented using structures, while a C++ data structure is implemented using objects and
classes.

For example, a List is an abstract data type that can be implemented using a dynamic array or a
linked list. A Queue can be implemented as a linked-list-based queue, an array-based queue, or a
stack-based queue. A Map can be implemented using a tree map, hash map, or hash table.
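The idea that one abstract data type can have several implementations can be sketched in C++ as follows. The names `IntQueue`, `ArrayQueue`, `ListQueue`, and `drainSum` are hypothetical and exist only for this illustration; the array version is deliberately simple and never reclaims the slots of removed items.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical ADT: the interface names the operations and nothing else.
class IntQueue {
public:
    virtual ~IntQueue() = default;
    virtual void enqueue(int value) = 0;  // add at the back
    virtual int dequeue() = 0;            // remove from the front (queue must not be empty)
    virtual bool isEmpty() const = 0;
};

// One possible implementation: a growable array plus the index of the front item.
class ArrayQueue : public IntQueue {
    std::vector<int> data;
    std::size_t front = 0;
public:
    void enqueue(int value) override { data.push_back(value); }
    int dequeue() override { return data[front++]; }
    bool isEmpty() const override { return front == data.size(); }
};

// Another possible implementation: a linked list.
class ListQueue : public IntQueue {
    std::list<int> data;
public:
    void enqueue(int value) override { data.push_back(value); }
    int dequeue() override {
        int value = data.front();
        data.pop_front();
        return value;
    }
    bool isEmpty() const override { return data.empty(); }
};

// Code written against the ADT works with either implementation.
int drainSum(IntQueue& q) {
    int sum = 0;
    while (!q.isEmpty()) sum += q.dequeue();
    return sum;
}
```

`drainSum` only knows the interface, so it can be handed an `ArrayQueue` or a `ListQueue` without change.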

Abstract data type model

Before knowing about the abstract data type model, we should know about abstraction and
encapsulation.

Abstraction: The technique of hiding the internal details from the user and showing only the
necessary details.

Encapsulation: The technique of combining the data and the member functions in a single unit.

The ADT model has two kinds of functions: public functions and private functions. The ADT model
also contains the data structures that we are using in a program. In this model, encapsulation is
performed first: all the data is wrapped in a single unit, the ADT. Abstraction is performed
next: only the operations that can be performed on the data structure, and the data structures
in use, are shown to the user.

Let's understand the abstract data type with a real-world example.

Consider a smartphone. We look at its specifications, such as:

• 4 GB RAM
• Snapdragon 2.2 GHz processor
• 5-inch LCD screen
• Dual camera
• Android 8.0

The above specifications of the smartphone are the data, and we can also perform the following
operations on the smartphone:

• call(): We can make a call with the smartphone.
• text(): We can send a text message.
• photo(): We can take a photo.
• video(): We can also record a video.

The smartphone is an entity whose data (specifications) and operations are given above. Together,
this data and these operations form the abstract or logical view of a smartphone.

The implementation view of the above abstract/logical view is given below:

#include <string>
using namespace std;

class Smartphone
{
private:
    int ramSize;
    string processorName;
    float screenSize;
    int cameraCount;
    string androidVersion;
public:
    void call();
    void text();
    void photo();
    void video();
};

The above code is the implementation of the specifications and operations that can be performed
on the smartphone. The implementation view can differ because the syntax of programming
languages is different, but the abstract/logical view of the data structure would remain the same.
Therefore, we can say that the abstract/logical view is independent of the implementation view.

Note: We know the operations that can be performed on the predefined data types such as int,
float, char, etc., but we don't know the implementation details of the data types. Therefore, we
can say that the abstract data type is considered as the hidden box that hides all the internal
details of the data type.

Applications of Data Structure and Algorithms

An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a
certain order to get the desired output. Algorithms are generally created independently of the
underlying language, i.e. an algorithm can be implemented in more than one programming
language.

From the data structure point of view, the following are some important categories of algorithms −

• Search − Algorithm to search for an item in a data structure.
• Sort − Algorithm to sort items in a certain order.
• Insert − Algorithm to insert an item in a data structure.
• Update − Algorithm to update an existing item in a data structure.
• Delete − Algorithm to delete an existing item from a data structure.

A data structure is a systematic way to organize data in order to use it efficiently. The
following terms are the foundation terms of a data structure.

• Interface − Each data structure has an interface. Interface represents the set of operations
that a data structure supports. An interface only provides the list of supported operations,
type of parameters they can accept and return type of these operations.
• Implementation − Implementation provides the internal representation of a data
structure. Implementation also provides the definition of the algorithms used in the
operations of the data structure.

Characteristics of a Data Structure

• Correctness − Data structure implementation should implement its interface correctly.
• Time Complexity − Running time or the execution time of operations of data structure
must be as small as possible.
• Space Complexity − Memory usage of a data structure operation should be as little as
possible.

Execution Time Cases

Three cases are usually used to compare the execution time of data structure operations in a
relative manner.

• Worst Case – the scenario in which an operation performs the maximum number of steps on
input data of size n.
• Average Case – the scenario in which an operation performs an average number of steps on
input data of n elements.
• Best Case – the scenario in which an operation performs the minimum number of steps on
input data of n elements.
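The three cases can be made concrete with a sequential search that counts its own comparisons. This is only a sketch: `comparisonsToFind` and `averageComparisons` are hypothetical helpers, and the average assumes every stored element is equally likely to be the target.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sequential search that reports how many elements it compared.
std::size_t comparisonsToFind(const std::vector<int>& data, int target) {
    std::size_t comparisons = 0;
    for (int value : data) {
        ++comparisons;
        if (value == target) break;
    }
    return comparisons;
}

// Average number of comparisons when each stored element is searched for once.
// Assumes data is not empty.
double averageComparisons(const std::vector<int>& data) {
    std::size_t total = 0;
    for (int target : data) total += comparisonsToFind(data, target);
    return static_cast<double>(total) / data.size();
}
```

For the array {1, 3, 5, 7, 8}, searching for 1 is the best case (1 comparison), searching for 8 or for a missing value is the worst case (5 comparisons), and the average over all stored elements is 3 comparisons.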

Data Structures and Algorithms – Arrays

An array works like a variable that can store a group of values, all of the same type.

Following are the important terms to understand the concept of Array.

• Element − Each item stored in an array is called an element.
• Index − Each location of an element in an array has a numerical index, which is used to
identify the element.

Syntax

Creating an array in C and C++ programming languages −

data_type array_name[array_size] = {elements separated using commas};

or,

data_type array_name[array_size];

Need for Arrays

Arrays are used to solve many problems, from small sorting problems to more complex problems
such as the travelling salesperson problem. Many data structures other than arrays provide
efficient time and space complexity for these problems, so what makes arrays a better choice?
The answer lies in random access lookup time. An array is a random access data structure, where
each element can be accessed directly and in constant time. A typical illustration of random
access is a book: each page can be opened independently of the others. Random access is critical
to many algorithms, for example binary search. A linked list is a sequential access data
structure, where each element can be accessed only in a particular order. A typical illustration
of sequential access is a roll of paper or tape: all prior material must be unrolled to get to
the data you want.

Arrays provide O(1) random access lookup time. That means, accessing the 1st index of the array
and the 1000th index of the array will both take the same time. This is due to the fact that array
comes with a pointer and an offset value. The pointer points to the right location of the memory
and the offset value shows how far to look in the said memory.

array_name[index]
| |
Pointer Offset

Therefore, in an array with 6 elements, to access the 1st element, array is pointed towards the 0th
index. Similarly, to access the 6th element, array is pointed towards the 5th index.
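The pointer-plus-offset view can be sketched directly in C++, where `array_name[index]` on a built-in array is defined as `*(array_name + index)`. The helpers `elementAt` and `byteOffset` are hypothetical names introduced for this illustration.

```cpp
#include <cstddef>

// Returns the element at `index` using explicit pointer-plus-offset arithmetic.
// For a built-in array, base[index] means the same as *(base + index).
int elementAt(const int* base, std::size_t index) {
    return *(base + index);
}

// Distance in bytes between the start of the array and element `index`.
std::size_t byteOffset(const int* base, std::size_t index) {
    const char* start = reinterpret_cast<const char*>(base);
    const char* element = reinterpret_cast<const char*>(base + index);
    return static_cast<std::size_t>(element - start);
}
```

Because the address of any element is computed with one multiplication and one addition (index times element size, added to the base address), reaching the 6th element costs the same as reaching the 1st.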

Array Representation

Arrays are represented as a collection of buckets where each bucket stores one element. These
buckets are indexed from ‘0’ to ‘n-1’, where n is the size of that particular array. For example, an
array with size 10 will have buckets indexed from 0 to 9.

As per the above illustration, following are the important points to be considered.

• Index starts with 0.
• Array length is 9 which means it can store 9 elements.
• Each element can be accessed via its index. For example, we can fetch an element at
index 6 as 23.

Basic Operations in the Arrays

The basic operations in the Arrays are insertion, deletion, searching, display, traverse, and
update. These operations are usually performed to either modify the data in the array or to report
the status of the array.

Following are the basic operations supported by an array.

• Traverse − Prints all the array elements one by one.
• Insertion − Adds an element at the given index.
• Deletion − Deletes an element at the given index.
• Search − Searches for an element using the given index or by value.
• Update − Updates an element at the given index.
• Display − Displays the contents of the array.

Insertion Operation

In the insertion operation, we are adding one or more elements to the array. Based on the
requirement, a new element can be added at the beginning, end, or any given index of array. This
is done using input statements of the programming languages.

Algorithm

Following is an algorithm to insert elements into a Linear Array until we reach the end of the
array −

1. Start
2. Create an Array of a desired datatype and size.
3. Initialize a variable ‘i’ as 0.
4. Enter the element at ith index of the array.
5. Increment i by 1.
6. Repeat Steps 4 & 5 until the end of the array.
7. Stop

Example
#include <iostream>
using namespace std;

int main() {
    int LA[5] = {}, i;   // "= {}" zero-initializes every element
    cout << "Array Before Insertion:" << endl;
    for (i = 0; i < 5; i++)
        cout << "LA[" << i << "] = " << LA[i] << endl;   // prints 0 for each element

    cout << "Inserting elements.." << endl;
    cout << "Array After Insertion:" << endl;
    for (i = 0; i < 5; i++) {
        LA[i] = i + 2;   // store a value at index i
        cout << "LA[" << i << "] = " << LA[i] << endl;
    }
    return 0;
}

Output
Array Before Insertion:
LA[0] = 0
LA[1] = 0
LA[2] = 0
LA[3] = 0
LA[4] = 0
Inserting elements..
Array After Insertion:
LA[0] = 2
LA[1] = 3
LA[2] = 4
LA[3] = 5
LA[4] = 6
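The example above fills the array from the first index to the last. Inserting at an arbitrary index additionally requires shifting the later elements one position to the right. A minimal sketch, assuming the array has at least one unused slot at the end; `insertAt` is a hypothetical helper, not part of any library:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: inserts `value` at position `index` in an array that
// currently holds `count` elements and has room for at least count + 1.
// Returns the new element count.
std::size_t insertAt(int* arr, std::size_t count, std::size_t index, int value) {
    assert(index <= count);                 // index == count appends at the end
    for (std::size_t j = count; j > index; --j)
        arr[j] = arr[j - 1];                // shift right, starting from the back
    arr[index] = value;
    return count + 1;
}
```

Shifting from the back avoids overwriting elements that have not been moved yet. Inserting at the beginning moves every element, so the cost grows linearly with the number of elements.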

Deletion Operation

In this array operation, we delete an element from the particular index of an array.

Algorithm

Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to delete an element available at the Kth position of LA.

1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set LA[J] = LA[J + 1]
5. Set J = J+1
6. Set N = N-1
7. Stop

Example
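#include <iostream>
using namespace std;

int main() {
    int LA[] = {1, 3, 5};
    int i, n = 3;
    int k = 1;   // 0-based index of the element to delete (the value 3)
    cout << "The original array elements are :" << endl;
    for (i = 0; i < n; i++) {
        cout << "LA[" << i << "] = " << LA[i] << endl;
    }
    // Shift every element after index k one position to the left.
    for (i = k; i < n - 1; i++) {
        LA[i] = LA[i + 1];
    }
    n = n - 1;   // the array now holds one element fewer
    cout << "The array elements after deletion :" << endl;
    for (i = 0; i < n; i++) {
        cout << "LA[" << i << "] = " << LA[i] << endl;
    }
    return 0;
}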

Output
The original array elements are:
LA[0] = 1
LA[1] = 3
LA[2] = 5
The array elements after deletion:
LA[0] = 1
LA[1] = 5

Search Operation

This operation searches for an element in the array using a key. The key is compared
sequentially with every value in the array to check whether it is present.

Algorithm

Consider LA is a linear array with N elements.
Following is the algorithm to find an element with a value of ITEM using sequential search.

1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF LA[J] is equal ITEM THEN GOTO STEP 6
5. Set J = J +1
6. PRINT J, ITEM
7. Stop

Example

#include <iostream>
using namespace std;

int main() {
    int LA[] = {1, 3, 5, 7, 8};
    int item = 5, n = 5;
    int i = 0;
    cout << "The original array elements are" << endl;
    for (i = 0; i < n; i++) {
        cout << LA[i] << endl;
    }
    // Compare each element with the key until a match is found.
    for (i = 0; i < n; i++) {
        if (LA[i] == item) {
            cout << "Found element " << item << " at position " << i + 1 << endl;
            return 0;
        }
    }
    cout << "Element " << item << " not found" << endl;
    return 0;
}

Output

The original array elements are
1
3
5
7
8
Found element 5 at position 3

Traversal Operation

This operation traverses through all the elements of an array. We use loop statements to carry
this out.

Algorithm

Following is the algorithm to traverse through all the elements present in a Linear Array −

1. Start
2. Initialize an Array of certain size and datatype.
3. Initialize another variable ‘i’ with 0.
4. Print the ith value in the array and increment i.
5. Repeat Step 4 until the end of the array is reached.
6. End

Example
#include <iostream>
using namespace std;
int main(){
int LA[] = {1,3,5,7,8};
int n = 5;
int i = 0;
cout << "The original array elements are:\n";
for(i = 0; i<n; i++)
cout << "LA[" << i << "] = " << LA[i] << endl;
return 0;
}

Output
The original array elements are:
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8

Update Operation

Update operation refers to updating an existing element from the array at a given index.

Algorithm

Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to update an element available at the Kth position of LA.

1. Start
2. Set LA[K-1] = ITEM
3. Stop

Example
#include <iostream>
using namespace std;
int main(){
int LA[] = {1,3,5,7,8};
int item = 10, k = 3, n = 5;
int i = 0;
cout << "The original array elements are :\n";
for(i = 0; i<n; i++)
cout << "LA[" << i << "] = " << LA[i] << endl;
LA[k-1] = item;   // update the kth element, which sits at index k-1
cout << "The array elements after updation are :\n";
for(i = 0; i<n; i++)
cout << "LA[" << i << "] = " << LA[i] << endl;
return 0;
}

Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after updation are :
LA[0] = 1
LA[1] = 3
LA[2] = 10
LA[3] = 7
LA[4] = 8

Display Operation

This operation displays all the elements in the entire array using a print statement.

Algorithm
1. Start
2. Print all the elements in the Array
3. Stop

Example
#include <iostream>
using namespace std;
int main(){
int LA[] = {1,3,5,7,8};
int n = 5;
int i;
cout << "The original array elements are :\n";
for(i = 0; i<n; i++)
cout << "LA[" << i << "] = " << LA[i] << endl;
return 0;
}

Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8

Properties of Algorithm:

An algorithm is an effective, efficient method for expressing the solution to a problem within
a finite amount of space and time, written in a well-defined formal language. An algorithm has
five properties, as given below:

1. Input: An algorithm should have some inputs.

2. Output: At least one output should be returned by the algorithm after the completion of the
specific task based on the given inputs.

3. Definiteness: Every statement of the algorithm should be unambiguous.

4. Finiteness: An algorithm must terminate after a finite number of steps; no infinite loop
is allowed.

Example (this loop violates finiteness, because the condition 1<2 is always true):

while(1<2)
{
number=number/2;
}

5. Effectiveness: An algorithm is written before its actual implementation. A person should
therefore be able to analyze it in a finite amount of time with pen and paper, to judge its
performance before settling on the final version of the algorithm.
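For contrast with the `while(1<2)` loop shown under Finiteness, here is a sketch of a halving loop that does terminate. The function name `halvingSteps` and the stopping condition `number > 1` are assumptions chosen for this illustration.

```cpp
// Hypothetical helper: halves `number` until it is 1 or less and returns how
// many halvings were performed. The loop condition depends on `number`, which
// shrinks on every pass, so the loop is guaranteed to stop.
int halvingSteps(int number) {
    int steps = 0;
    while (number > 1) {
        number = number / 2;   // integer division
        ++steps;
    }
    return steps;
}
```

For example, starting from 8 the values are 4, 2, 1, so the loop stops after 3 steps.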

Algorithm Analysis Concepts

Complexity analysis is concerned with determining the efficiency of algorithms.

How do we measure the efficiency of algorithms?

1. Empirical (Computational) Analysis

• Here the total running time of the program is considered.

• It uses the system time to calculate the running time, so it is not a reliable measure of
the efficiency of an algorithm.

• This is because the total running time of the program varies with the:
◦ Processor speed
◦ Current processor load
◦ Input size of the given algorithm
◦ Software environment (multitasking, single tasking…)

What does efficiency depend on?

• Execution speed (most important)

• Amount of memory used

• Efficiency differences may not be noticeable for small data, but become very important
for large amounts of data
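An empirical measurement can be sketched with the standard `<chrono>` clock. The workload `sumUpTo` and the wrapper `elapsedMicros` are hypothetical names for this example; the measured time is expected to differ between runs and machines, which is exactly the weakness of empirical analysis described above.

```cpp
#include <chrono>
#include <cstdint>

// The work being measured: sums the integers 1..n with a simple loop.
std::uint64_t sumUpTo(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) total += i;
    return total;
}

// Hypothetical helper: returns the wall-clock time, in microseconds, of one
// call to sumUpTo(n), and stores the computed sum in `result`.
long long elapsedMicros(std::uint64_t n, std::uint64_t& result) {
    auto start = std::chrono::steady_clock::now();
    result = sumUpTo(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `elapsedMicros` several times with the same n typically gives different timings even though the algorithm and the input have not changed.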

2. Theoretical (Asymptotic Complexity) Analysis

• Consider t = f(n) = n^2 + 5n. For all n > 5, n^2 is the larger term, and for very large n the
5n term is insignificant.

• Therefore we can approximate f(n) by the n^2 term only. This is called asymptotic
complexity.

• Used when it is difficult or unnecessary to determine the true computational complexity.

• It is usually difficult or impossible to determine the exact computational complexity, so
asymptotic complexity is the most common measure.
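How quickly the 5n term stops mattering can be checked numerically. The sketch below divides f(n) by its dominant term; the function names are chosen for this illustration.

```cpp
// f(n) = n^2 + 5n, the running-time function from the text.
double f(double n) { return n * n + 5.0 * n; }

// Ratio of f(n) to its dominant term n^2; algebraically this is 1 + 5/n.
// Assumes n > 0.
double ratioToDominantTerm(double n) { return f(n) / (n * n); }
```

For n = 10 the ratio is 1.5, for n = 1000 it is 1.005, and it keeps approaching 1 as n grows, which is why f(n) is approximated by n^2 alone.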

Guidelines for Asymptotic Analysis

Loops

for(i=1;i<=n;i++)
    // statement           loop executes n times

Total time = O(n), linear growth

Nested Loops

for(i=1;i<=n;i++){         // outer loop executes n times
    for(j=1;j<=n;j++){     // inner loop executes n times
        //statement
    }
}

Total time = n*n = O(n^2), quadratic growth

Consecutive Statement

int x=2;
int i;
x=x+1;                     // 3 units

for(i=1;i<=n;i++){
    //statement
}                          // n times

for(i=1;i<=n;i++){
    for(j=1;j<=n;j++){
        //statement
    }
}                          // n^2 times

Total time = n^2 + n + 3 = O(n^2)
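The total of n^2 + n + 3 units can be checked by running the same three fragments with a counter. This is a sketch under the counting convention used above (one unit per simple statement and per loop-body execution); `countUnits` is a hypothetical name.

```cpp
// Hypothetical counter: executes the three code fragments from the text and
// counts one unit per simple statement or loop-body execution.
long long countUnits(int n) {
    long long units = 0;

    int x = 2;  ++units;   // declaration with initialization
    int i;      ++units;   // declaration
    x = x + 1;  ++units;   // assignment
    (void)x;               // silence "unused variable" warnings

    for (i = 1; i <= n; i++) {
        ++units;           // single loop body: runs n times
    }

    for (i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            ++units;       // nested loop body: runs n * n times
        }
    }
    return units;
}
```

For n = 10 the function returns 113 (100 + 10 + 3), and for n = 100 it returns 10103, so the nested loop dominates the total as n grows.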

Execution of each of the following operations takes 1 time unit:

• Assignment statement. E.g. sum=0;
• Single I/O statement. E.g. cin>>sum; cout<<sum;
• Single Boolean statement. E.g. !done
• Single arithmetic operation. E.g. a+b
• Function return. E.g. return(sum);

IF-THEN-ELSE statement

if(n==0){
    //statement
}
else{
    for(i=1;i<=n;i++)
        //statement
}

Total time = test + (if part or else part)

If part (n==0): constant time
Total time = 1+1 = O(1)

Else part (n!=0): linear time
Total time = 1+n = O(n)

Logarithmic complexity

log2(8) is the number of times 2 has been multiplied by itself to obtain the value 8, i.e.
log2(8) = 3.

for(i=1;i<=n;){
    //statement
    i=i*2;
}

Iteration 1    i=1     2^0
Iteration 2    i=2     2^1
Iteration 3    i=4     2^2
Iteration 4    i=8     2^3
Iteration 5    i=16    2^4
------------
Iteration k    i=n     2^(k-1)

n = 2^(k-1); apply log2 on both sides:

log2(n) = log2(2^(k-1))
k-1 = log2(n)
k = log2(n) + 1

Total time = O(log2 n)
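The result k = log2(n) + 1 can be confirmed by counting the iterations of the doubling loop. A minimal sketch; `doublingIterations` is a hypothetical name, and when n is not a power of two the count is the integer part of log2(n), plus 1.

```cpp
// Counts how many times the body of the doubling loop from the text runs.
// Assumes n is small enough that doubling i does not overflow long long.
int doublingIterations(long long n) {
    int iterations = 0;
    for (long long i = 1; i <= n; ) {
        ++iterations;   // stands in for "//statement"
        i = i * 2;
    }
    return iterations;
}
```

For n = 8 the loop runs 4 times (i = 1, 2, 4, 8), matching log2(8) + 1, and for n = 1000 it runs only 10 times.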

Examples

Calculate T(n) for the following

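1. k=0;
   cout<<"enter an integer";
   cin>>n;
   for(i=0;i<n;i++)
       k++;

Each of the first three statements takes 1 unit. In the loop, the initialization takes 1 unit,
the test runs n+1 times, the increment runs n times, and k++ runs n times.

T(n) = 1+1+1+(1+n+1+n+n)
     = 5+3n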

2. i=0;
   while(i<n){
       x++;
       i++;
   }
   j=1;
   while(j<=10){
       x++;
       j++;
   }

T(n) = 1+(n+1)+n+n+1+11+10+10
     = 3n+34
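The hand count for example 2 can be cross-checked with an instrumented version of the same code. This is a sketch, not part of the original exercise: `countExample2` is a hypothetical name, and x is assumed to start at 0 because the exercise does not initialize it.

```cpp
// Hypothetical instrumented version of example 2: every assignment, loop test,
// and increment adds one unit, mirroring the hand count in the text.
long long countExample2(int n) {
    long long units = 0;
    int x = 0;                                // assumed starting value

    int i = 0;              ++units;          // i=0
    while (++units, i < n) {                  // test runs n+1 times
        x++;                ++units;
        i++;                ++units;
    }
    int j = 1;              ++units;          // j=1
    while (++units, j <= 10) {                // test runs 11 times
        x++;                ++units;
        j++;                ++units;
    }
    (void)x;                                  // silence "unused variable" warnings
    return units;
}
```

The comma operator in each loop condition counts the test before evaluating it, so failed tests are counted too. For n = 10 the function returns 64, which equals 3(10) + 34.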

Space complexity of an algorithm

The space complexity of an algorithm is the amount of memory space that it requires.

S(P) = C + SP

C = constant part (independent of the input)

SP = variable part (dependent on the input)

Examples

Sum(a,b,c){
    a=10;
    b=20;
    c=a+b;
}

a = 1 unit of space
b = 1 unit of space
c = 1 unit of space

S(P) = 3+0 = 3, i.e. O(1)

Sum(a,n){
    total=0;
    For i=0 to n do
        total=total+a[i];
}

Constant part = n, total, i (3 units); variable part = a (n units)

S(P) = 3+n = O(n)

Algorithm Analysis Categories

An algorithm must be examined under different situations to correctly determine its efficiency
for accurate comparisons.

Best Case Analysis:

Assumes that data are arranged in the most advantageous order. It also assumes the minimum
input size.

E.g.

✓ For sorting – the best case is if the data are arranged in the required order.
✓ For searching – the required item is found at the first position.

Note: Best Case computes the lower boundary of T(n)

It causes fewest number of executions.

Worst case analysis:

Assumes that data are arranged in the most disadvantageous order.

It also assumes that the input size is infinite.

E.g.

✓ For sorting – data are arranged in the opposite of the required order.
✓ For searching – the required item is found at the end of the list, or the item is missing.

It computes the upper bound of T(n) and causes the maximum number of executions.

Average case Analysis

Assumes that data are found in random order.

It also assumes random or average input size.

E.g.

✓ For sorting – data are in random order.


✓ For searching – the required item is found at any position or missing.

It computes optimal bound of T(n)

It also causes average number of executions.

Best case and average case cannot be used to estimate (determine) the complexity of algorithms.

Worst case is the best choice for determining the complexity of algorithms. It gives an upper
bound on the resources required by the algorithm. In the case of running time, the worst-case
time complexity indicates the longest running time of an algorithm for any input of size n, and
thus guarantees that the algorithm will finish within the indicated period of time.

Asymptotic Notations

Following are the commonly used asymptotic notations for expressing the running time
complexity of an algorithm.

1. Big Oh Notation, O

The notation O(n) is the formal way to express the upper bound of an algorithm’s running time.
It measures the worst-case time complexity or the longest amount of time an algorithm can
possibly take to complete.

2. Omega Notation, Ω

The notation Ω(n) is the formal way to express the lower bound of an algorithm’s running time.
It measures the best-case time complexity or the best amount of time an algorithm can possibly
take to complete.

3. Theta Notation, Θ

The notation Θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm’s running time, that is, a tight bound. It is often used when describing average-case
time complexity.
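For reference, the three notations are commonly defined as follows. The symbols c, c_1, c_2 and n_0 are the usual constants of these definitions, and the worked line reuses f(n) = n^2 + 5n from the asymptotic analysis discussion earlier in this chapter.

```latex
f(n) = O(g(n))      \iff \exists\, c > 0,\ n_0 > 0 :\ 0 \le f(n) \le c\, g(n) \quad \forall\, n \ge n_0

f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 > 0 :\ 0 \le c\, g(n) \le f(n) \quad \forall\, n \ge n_0

f(n) = \Theta(g(n)) \iff \exists\, c_1, c_2 > 0,\ n_0 > 0 :\ c_1\, g(n) \le f(n) \le c_2\, g(n) \quad \forall\, n \ge n_0

% Worked example: for n \ge 5 we have 5n \le n^2, hence
n^2 \;\le\; n^2 + 5n \;\le\; 2n^2 \quad (n \ge 5)
\;\Longrightarrow\; n^2 + 5n = \Theta(n^2) \ \text{with}\ c_1 = 1,\ c_2 = 2,\ n_0 = 5
```

The same inequality shows that n^2 + 5n is both O(n^2) and Ω(n^2), which is what the Θ bound states.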

Big O Analysis
