Data Structure Unit-1

Uploaded by

tushikasahu5

DATA STRUCTURES

An Introduction
Basic Terminology
• Data: Data may be a single value or a set of values.
• Information: Meaningful or processed data is called
information.
• Record: A record is a collection of related data items.
• File: A file is a collection of logically related records.
• Entity
– is a person, place, thing, event or concept about which information is
recorded.
– has certain attributes or properties which may be assigned values.
• Attributes give the characteristics of the entity.
• Entity set: Entities with similar attributes form an entity set.
• Range: The range is the set of all possible values that could be assigned to a
particular attribute.
Data Structures
• A logical or mathematical model of a particular
organization of data is called a data structure.
• Data structures are the building blocks of a program.
• The choice of a particular data structure depends on the
following two considerations:
– The data structure must be rich enough in structure to
reflect the relationships existing between the data.
– The structure should be simple enough that the data can be
processed effectively whenever required.

ALGORITHM + DATA STRUCTURE = PROGRAM

Classification of Data Structures
• Data structures are normally divided into two
broad categories:
– Primitive data structures
• Basic data structures that are operated upon directly by
machine instructions.
• Available in most programming languages as built-in
types.
• E.g. int, float, char, pointer
– Non-primitive data structures
• These data structures are sets of homogeneous or
heterogeneous data elements stored together.
Types of Data Structure

Non-primitive data structures
• These are further classified as:
• Linear data structure
– A data structure is said to be linear if its elements
form a sequence
• Non-linear data structure
– Represents data containing a hierarchical
relationship between elements, e.g. trees, graphs

Data Structure Operations
• The choice of data structure depends on the
frequency with which specific operations are
performed.
• Operations that can be performed are:
– Traversing
– Searching
– Insertion
– Deletion
– Sorting
– Merging
• Traversing
– Accessing each record exactly once so that certain items in
the record may be processed.
• Searching
– Finding the location of the record with a given key value, or
finding the location of all records satisfying one or more
conditions.
• Insertion
– Adding a new record to the structure.
• Deletion
– Removing a record from a structure.
• Sorting
– Arranging the records in some logical order
• Merging
– Combining the records in two different sorted files
into a single sorted file.
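
The merging operation can be sketched in C as follows. This is only an illustration: it assumes the "files" are plain int arrays already sorted in ascending order, and the name merge_sorted and its parameters are hypothetical, not part of the original slides.

```c
#include <assert.h>
#include <stddef.h>

/* Merge sorted arrays a (length na) and b (length nb) into out.
   Assumption: out has room for na + nb elements. */
void merge_sorted(const int *a, size_t na,
                  const int *b, size_t nb, int *out)
{
    assert(out != NULL);

    size_t i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two front elements. */
    while (i < na && j < nb) {
        if (a[i] <= b[j])
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    /* One input is exhausted; copy what remains of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element is copied exactly once, so the number of steps grows in proportion to na + nb.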

Data types
• Each variable in C has an associated data type.
• Different data types may require different amounts of memory.
• Some commonly used basic data types are:
– int
• Used to store an integer
• Size depends on the compiler: at least 2 bytes, typically 4 bytes on current systems
– char
• Stores a single character
• Requires one byte of memory
– float
• Used to store decimal numbers with single precision
– double
• Used to store decimal numbers with double precision
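
Apart from char, the C language does not fix the size of these types; the compiler and target machine decide, so any figure quoted for a particular system is best checked with sizeof. A minimal sketch (the function name print_type_sizes is made up for illustration):

```c
#include <assert.h>
#include <stdio.h>

/* Print the size in bytes of the basic types on this platform. */
void print_type_sizes(void)
{
    /* sizeof(char) is 1 by definition in C. */
    assert(sizeof(char) == 1);

    printf("char:   %zu\n", sizeof(char));
    printf("int:    %zu\n", sizeof(int));
    printf("float:  %zu\n", sizeof(float));
    printf("double: %zu\n", sizeof(double));
}
```

The printed values differ between systems, which is why no expected output is shown here.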
Algorithm
• An algorithm is a step-by-step procedure that defines a set of
instructions to be executed in a certain order to get the desired
output.
• An algorithm is a sequence of steps to solve a problem.
• An algorithm can be expressed in an English-like language called
pseudocode.
• There may be more than one algorithm to solve a problem.
• The choice of a particular algorithm depends on the following
considerations:
– Memory requirements (space complexity)
– Performance requirements (time complexity)
Complexity of Algorithms
• Space Complexity
– It is the amount of memory an algorithm needs to run to
completion.
• Time Complexity
– It is the amount of time an algorithm needs to run to
completion.

Characteristics of an algorithm
An algorithm should have the following characteristics:

• Definiteness/Unambiguity
– Each step of the algorithm must be clearly and precisely defined, with no
ambiguity.
• Input
– An algorithm must have zero or more inputs, but only a finite number of them.
• Output
– An algorithm must produce at least one desired output.
• Finiteness
– An algorithm must always terminate after a finite number of steps in a finite amount of
time.
• Effectiveness
– Each operation to be performed in an algorithm must be sufficiently basic that it
can be done exactly and in a finite length of time.
• Independence
– An algorithm should have step-by-step directions that are independent of any
programming code.
Algorithmic Notations
• The formal presentation of an
algorithm consists of two parts:
– The first part is a paragraph which:
• states the purpose of the algorithm
• identifies the variables which occur in the algorithm
• lists the input data
– The second part consists of the
list of steps to be executed.

An Example Algorithm
Problem − Design an algorithm to add two numbers and display the
result.
• Step 1 − START
• Step 2 − declare three integers a, b & c
• Step 3 − define values of a & b
• Step 4 − add values of a & b
• Step 5 − store output of step 4 to c
• Step 6 − print c
• Step 7 − STOP

Algorithm

Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ← a + b
Step 4 − display c
Step 5 − STOP

An Example Algorithm
A non-empty array DATA with N numerical values is given. Find the
location LOC and the value MAX of the largest element of DATA.

Algorithm: Given a nonempty array DATA with N numerical values, this
algorithm finds the location LOC and the value MAX of the largest
element of DATA. The variable K is used as a counter.
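
The steps of this algorithm are not reproduced in the text, so the following C function is only a sketch of one common way to do it. It assumes int data and 0-based positions (the text counts from 1), and the name find_max and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Find the largest element of data[0..n-1].
   Stores its 0-based position in *loc and returns its value. */
int find_max(const int *data, size_t n, size_t *loc)
{
    assert(n > 0);              /* the array must be non-empty */

    size_t k;                   /* counter, like K in the text */
    int max = data[0];          /* start with the first element */
    *loc = 0;

    for (k = 1; k < n; k++) {
        if (data[k] > max) {    /* a larger value has been found */
            *loc = k;
            max = data[k];
        }
    }
    return max;
}
```

For the array {3, 5, 9, 2, 6} the function returns 9 and sets the location to 2 (the third element).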

Steps, Control, Exit
• The steps of the algorithm are executed one
after the other, beginning with step 1.
• Control may be transferred to step n by the
statement “Go to step n”.
• If several statements appear in the same step,
e.g. Set K := 1, LOC := 1 and MAX := DATA[1],
they are executed from left to right.
• The algorithm is completed when the
statement “Exit” is encountered.
• Comments
– Each step may contain a comment in brackets which indicates the main
purpose of the step.
• Variable Names
– Variable names are written in capital letters, even though lowercase (e.g. k, n)
may be used for the same variables in the accompanying analysis.
• Assignment statements
– These statements use the dots-equal notation :=
• E.g. MAX := DATA[1]
• Assigns the value of DATA[1] to MAX
• Input and Output
– Data may be read or output by means of Read and Write statements.
• Read: Variable names
• Write: Messages and/or variable names
• Procedures
– Used for an independent algorithmic module (or subalgorithm) which solves a
particular problem

Why do we need Algorithms?

We need algorithms for the following
reasons:
• Scalability: Algorithms help us understand scalability.
When we have a big real-world problem, we need
to scale it down into small steps so that it is easy to
analyze.
• Performance: Real-world problems are not always easily broken
down into smaller steps. If a problem can be
easily broken into smaller steps, the
problem is feasible.
Control Structures
• Algorithms mainly use three types of logic, or flow of
control:
– Sequence logic, or sequential flow
– Selection logic, or conditional flow
– Iteration logic, or repetitive flow
• Sequential Logic

Selection Logic
• Selects one out of several alternative modules.
• These are called conditional structures.
• The end of such a structure can be indicated by the
statement:
[End of If structure.]
• These structures are divided into three categories:
• Single alternative
• Double alternative
• Multiple alternative

Iteration Logic
• Begins with a Repeat statement
• Followed by a module called the body of the loop
• The end of such a structure can be indicated by the
statement:
[End of loop.]

Algorithm: Quadratic Equation

Complexity of Algorithms
• To measure the efficiency of algorithms, we
must have some criteria.
• Time and Space are the two main measures
for the efficiency of an algorithm i.e.
– Time Complexity
– Space Complexity

• The complexity of an algorithm M is the
function f(n) which gives the running time and/or
storage space requirement of the algorithm in
terms of the size n of the input data.
• In simple words, the complexity of the
algorithm depends on the number of
statements executed.
• The total number of statements executed
depends on the conditional statements, such as loop conditions.

Example
• E.g.
i = 0;               // (1 time)
while (i < n)        // (n+1 times)
{
    printf("%d", i); // (n times)
    i = i + 1;       // (n times)
}

• Total number of executions
= 1 + (n+1) + n + n
= 3n + 2

• If we ignore constants, the complexity is of the order n.

• Hence the complexity is
O(n) // Big-O notation
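
The count 3n + 2 can be checked by instrumenting the loop. The sketch below uses the same accounting as the example (one step for the initialization, one per loop test, two per iteration of the body); the function name count_steps is hypothetical, and the printf call is replaced by a counter increment so that nothing is printed.

```c
#include <assert.h>

/* Count the statements the example loop executes for input size n. */
long count_steps(long n)
{
    assert(n >= 0);

    long steps = 0;
    long i = 0;
    steps++;                /* i = 0            : 1 time      */

    for (;;) {
        steps++;            /* test i < n       : n + 1 times */
        if (!(i < n))
            break;
        steps++;            /* the printf call  : n times     */
        i = i + 1;
        steps++;            /* i = i + 1        : n times     */
    }
    return steps;           /* 1 + (n+1) + n + n = 3n + 2     */
}
```

For n = 10 the function returns 32, which agrees with 3n + 2.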
Finding the complexity
• There are three cases to consider when finding the complexity:
– Worst case: the maximum value of f(n) for any possible input
– Average case: the expected value of f(n)
– Best case: sometimes also considered, as the
minimum possible value of f(n)
• E.g.
– Suppose the numbers n1, n2, ……., nk occur with respective probabilities
p1, p2, ……., pk.
– The expected or average value E is given by:
E = n1p1 + n2p2 + ……. + nkpk.
Linear Search

• The complexity of the searching algorithm is given by the number
C of comparisons between ITEM and DATA[K].

• Worst case
– Occurs when ITEM is the last element in the array DATA,
or when ITEM does not appear in the array at all.
– Then, C(n) = n

• Average case
– Assume ITEM appears in DATA and is equally likely to occur at any position in the array.
– The number of comparisons can be any of the numbers 1, 2, 3, …., n
– Each number occurs with probability p = 1/n.
– Then, C(n) = 1·(1/n) + 2·(1/n) + …. + n·(1/n) = (n+1)/2
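
A linear search that also reports the number of comparisons C can be sketched in C as follows. The function name linear_search, the 0-based indexing and the use of -1 for "not found" are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Search data[0..n-1] for item.
   Returns the 0-based index of the first match, or -1 if absent.
   *comparisons receives the number of comparisons with item. */
long linear_search(const int *data, size_t n, int item,
                   size_t *comparisons)
{
    assert(comparisons != NULL);

    *comparisons = 0;
    for (size_t k = 0; k < n; k++) {
        (*comparisons)++;           /* one comparison per element */
        if (data[k] == item)
            return (long)k;
    }
    return -1;                      /* item is not in the array */
}
```

Searching {7, 3, 9, 4} for 9 takes 3 comparisons, while searching for a missing value takes all 4, matching the worst case C(n) = n.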

Rate of Growth: Big O Notation
• Suppose,
– M is an algorithm
– n is the size of the input data
• Then the complexity f(n) of M increases as n
increases.
• f(n) = O(g(n)) if f(n) <= c·g(n), where c is a constant.

• Suppose f(n) and g(n) are functions defined on the positive
integers.
• f(n) is bounded by some multiple of g(n) for almost all n, that is:
• There exist a positive integer n0 and a positive number M such
that for all n > n0, we have
|f(n)| <= M|g(n)|
• Then, f(n) = O(g(n))
– It can be read as “f(n) is of order g(n)”.
– E.g. 3n + 2 = O(n), since 3n + 2 <= 5n for all n >= 1.

Omega Notation (Ω)
• The Big-O notation defines an upper bound
function g(n) for f(n), which represents the
time/space complexity of the algorithm.
• In Omega notation, the function g(n) defines
a lower bound for the function f(n).
• f(n) = Ω(g(n)) if there exist a positive integer n0 and a positive
number M such that for all n > n0, we have
|f(n)| >= M|g(n)|
• In short, f(n) >= c·g(n), where c is a constant.

Theta Notation (θ)
• It is used when the function f(n) is bounded both
from above and below by the function g(n).
• f(n) = θ(g(n)) if c1·g(n) <= f(n) <= c2·g(n), where c1 and c2 are positive constants.


Arrays
• An array is a finite set of homogeneous data elements.
• The elements are stored in consecutive memory locations.
• The elements of the array are referenced by an index set
consisting of n consecutive numbers.
• The number n of elements is called the length or size of the array.

Length = UB - LB + 1

where,
• UB – largest index, called the upper bound
• LB – smallest index, called the lower bound

Length = UB when LB = 1

…continued
• The elements of an array A may be denoted by:
– Subscript notation
A1, A2, A3, ……., An
– Parenthesis notation
A(1), A(2), …… , A(N)
– Bracket notation
A[1], A[2], A[3], …… , A[N]
• The number K in A[K] is called a subscript or index.
• A[K] is called a subscripted variable.


Representation of Array in memory
• Let LA be a linear array in memory.
– LOC(LA[K]) = address of the element LA[K] of the array
LA
• The computer keeps track only of the address of the first
element of LA, called the base address and denoted by
Base(LA).
• The address of any element is then computed as:

LOC(LA[K]) = Base(LA) + w(K - lower bound)

• w is the number of words per memory cell for the array LA
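
The address formula can be written directly as a small C function. This is a sketch for illustration: addresses are modelled as plain unsigned numbers, and the name element_address and the sample values in the note below are assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* LOC(LA[K]) = Base(LA) + w * (K - LB)
   base: address of the first element
   w:    number of words per element
   k:    index of the wanted element
   lb:   lower bound of the index set */
size_t element_address(size_t base, size_t w, long k, long lb)
{
    assert(k >= lb);                    /* K must not precede LB */
    return base + w * (size_t)(k - lb);
}
```

For example, with Base(LA) = 200, w = 4 and lower bound 1932, the element with index 1965 is at 200 + 4·(1965 - 1932) = 332.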
Operations on Arrays
• Traversing
– Accessing or processing (visiting) each element of array exactly
once
• Insertion
– To insert an element into array
• Deletion
– To delete element from array
• Searching
– To search any element from the given list
• Sorting
– To sort the given list of elements
Algorithm: Traversing
• LA is a linear array with lower bound LB and upper bound UB. This algorithm
traverses LA applying an operation PROCESS to each element of LA.

• Alternate algorithm

Insertion into Linear Array
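
The algorithm itself is not reproduced in the text, so the following is only a sketch of the usual approach: move the elements from the insertion position onwards one place to the right, starting from the end, then store the new item. The name array_insert, the 0-based position k and the assumption that the array has spare capacity are all choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Insert item at 0-based position k of la.
   la currently holds *n elements and must have room for *n + 1. */
void array_insert(int *la, size_t *n, size_t k, int item)
{
    assert(k <= *n);                /* k == *n appends at the end */

    /* Shift la[k..n-1] one place right, starting from the last element. */
    for (size_t j = *n; j > k; j--)
        la[j] = la[j - 1];

    la[k] = item;
    (*n)++;                         /* the array is one element longer */
}
```

Inserting 3 at position 2 of {1, 2, 4, 5} gives {1, 2, 3, 4, 5}.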

Deletion from Linear Array
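
The algorithm itself is not reproduced in the text, so the following is only a sketch of the usual approach: remember the element being removed, then move every later element one place to the left. The name array_delete and the 0-based position k are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Delete the element at 0-based position k of la, which holds *n
   elements, and return the deleted value. */
int array_delete(int *la, size_t *n, size_t k)
{
    assert(k < *n);                 /* position must exist */

    int item = la[k];

    /* Shift la[k+1..n-1] one place left to close the gap. */
    for (size_t j = k; j + 1 < *n; j++)
        la[j] = la[j + 1];

    (*n)--;                         /* the array is one element shorter */
    return item;
}
```

Deleting position 1 of {1, 2, 3, 4, 5} returns 2 and leaves {1, 3, 4, 5}.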

Binary search
• Using this technique, an element can be found with far fewer
comparisons than a linear search requires.
• The given list of elements must be in sorted order.
• It can be done as follows:
– Find the middle element of the array.
– Compare the middle element with the item to search for.
– There are three cases:
• If it is the desired element, the search is successful.
• If the middle element is greater than the desired item, search only the left half of the array.
• Else, if the middle element is less than the desired item, search only the right half of the array.
– Repeat on the chosen half until the item is found or the half is empty.

• Complexity of binary search: O(log2 n)
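
The three cases can be sketched in C as below. The sketch assumes an int array sorted in ascending order and 0-based indexing; the name binary_search and the -1 "not found" result are choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Search the sorted array data[0..n-1] for item.
   Returns its 0-based index, or -1 if it is not present. */
long binary_search(const int *data, size_t n, int item)
{
    assert(data != NULL || n == 0);

    size_t beg = 0, end = n;        /* search the range [beg, end) */

    while (beg < end) {
        size_t mid = beg + (end - beg) / 2;

        if (data[mid] == item)
            return (long)mid;       /* case 1: found */
        if (data[mid] > item)
            end = mid;              /* case 2: keep the left half */
        else
            beg = mid + 1;          /* case 3: keep the right half */
    }
    return -1;                      /* range is empty: not found */
}
```

Each pass halves the range being searched, which is where the O(log2 n) bound comes from.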


Two-dimensional Arrays
• A two-dimensional m×n array A is a collection
of m·n data elements.
• Each element is specified by a pair of integers
(such as J, K), called subscripts, such that
1 ≤ J ≤ m and 1 ≤ K ≤ n
• It is denoted by
– AJ,K or A[J,K]
• Two-dimensional arrays are also called matrix
arrays.
Representation of 2-D array in memory

• The following formulas can be applied to locate the
address of an element of an M×N array A:
• Column major order
– LOC(A[J,K]) = Base(A) + w(M(K-1)+(J-1))
• Row major order
– LOC(A[J,K]) = Base(A) + w(N(J-1)+(K-1))
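
Both formulas translate directly into C. The sketch below keeps the 1-based subscripts J and K used above and models addresses as plain unsigned numbers; the function names and the sample matrix in the note are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major: LOC(A[J,K]) = Base(A) + w(M(K-1) + (J-1)),
   where m is the number of rows. */
size_t loc_column_major(size_t base, size_t w, size_t m,
                        size_t j, size_t k)
{
    assert(j >= 1 && j <= m && k >= 1);
    return base + w * (m * (k - 1) + (j - 1));
}

/* Row-major: LOC(A[J,K]) = Base(A) + w(N(J-1) + (K-1)),
   where n is the number of columns. */
size_t loc_row_major(size_t base, size_t w, size_t n,
                     size_t j, size_t k)
{
    assert(k >= 1 && k <= n && j >= 1);
    return base + w * (n * (j - 1) + (k - 1));
}
```

For a 25×4 matrix with Base(A) = 200 and w = 4, the element A[12,3] is at 200 + 4(4·11 + 2) = 384 in row-major order and at 200 + 4(25·2 + 11) = 444 in column-major order.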

Bubble Sort

Selection Sort

Insertion Sort

Complexity of Insertion Sort
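• Worst Case
– Occurs when the array A is in reverse order
– Inserting the k-th element then takes (k-1) comparisons, so
f(n) = 1 + 2 + …… + (n-1) = n(n-1)/2 = O(n²)

• Average Case
– Inserting the k-th element takes approximately (k-1)/2 comparisons, so
f(n) = 1/2 + 2/2 + …… + (n-1)/2 = n(n-1)/4 = O(n²)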
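
The insertion sort algorithm itself is not reproduced in the text; the following C function is a sketch of the standard method, assuming an int array sorted into ascending order. The name insertion_sort is chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sort a[0..n-1] into ascending order by insertion. */
void insertion_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);

    for (size_t k = 1; k < n; k++) {
        int temp = a[k];            /* element to be inserted */
        size_t j = k;

        /* Shift larger elements of the sorted part a[0..k-1] right.
           In the worst case this compares temp with all k of them. */
        while (j > 0 && a[j - 1] > temp) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = temp;                /* drop it into its place */
    }
}
```

The inner loop is where the comparisons counted above take place: an array in reverse order forces it to run all the way to the front on every pass.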

Multi-dimensional Arrays
• A multi-dimensional or n-dimensional
m1×m2×……..×mn array B is a collection of
m1·m2·……..·mn data elements.
• Each element is specified by a list of n integers
(such as K1, K2, ….., Kn), called subscripts, such
that
1 ≤ K1 ≤ m1, 1 ≤ K2 ≤ m2, ………, 1 ≤ Kn ≤ mn
• It is denoted by
– BK1, K2, ….., Kn or B[K1, K2, ….., Kn]
Multi-dimensional Arrays
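• The length Li of dimension i can be calculated as
Li = upper bound – lower bound + 1
• For a given subscript Ki, the effective index Ei
is the number of indices preceding Ki in the
index set:
Ei = Ki - lower bound

• Column major order
– LOC(B[K1, K2, ….., Kn]) = Base(B) + w[((…(En·L(n-1) + E(n-1))·L(n-2) + …… + E3)·L2 + E2)·L1 + E1]

• Row major order
– LOC(B[K1, K2, ….., Kn]) = Base(B) + w[(…((E1·L2 + E2)·L3 + E3)·L4 + …… + E(n-1))·Ln + En]
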
Recursion
• Recursion is a process in which a function calls itself, directly or
indirectly.
• A recursive procedure must have the following two properties:
– There must be certain criteria, called base criteria, for which the
procedure does not call itself.
– Each time the procedure calls itself, it must be closer to the base
criteria.
• A recursive procedure with these two properties is said to be well
defined.
• It is of two types:
– Direct recursion
• When a function calls itself
– Indirect recursion
• When two functions call one another mutually
Factorial Function
• The product of the positive integers from 1 to n is
called “n factorial”, denoted by n!
n! = 1·2·3·……·(n-2)·(n-1)·n
or, n! = n·(n-1)!
• Formal Definition (Factorial function)
– If n = 0, then n! = 1
– If n > 0, then n! = n·(n-1)!

Algorithm: Factorial Function
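
The algorithm is not reproduced in the text. A minimal recursive sketch in C that follows the formal definition is given here; the name factorial and the limit of 20 are choices made for this illustration (20! is the largest factorial that fits in 64 bits).

```c
#include <assert.h>

/* Recursive factorial, following the formal definition. */
unsigned long long factorial(unsigned int n)
{
    assert(n <= 20);                /* 21! overflows 64 bits */

    if (n == 0)
        return 1;                   /* base criteria: 0! = 1 */
    return n * factorial(n - 1);    /* n! = n * (n-1)! */
}
```

Each call moves n one step closer to the base criteria n = 0, so the procedure is well defined.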

Fibonacci Sequence
• The Fibonacci sequence is as follows:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ……………..
– Here,
F0 = 0 and F1 = 1
• Each succeeding term is the sum of the two
preceding terms.

• Formal Definition:
– If n = 0 or n = 1, then Fn = n
– If n > 1, then Fn = Fn-2 + Fn-1
Algorithm: Fibonacci Sequence
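
The algorithm is not reproduced in the text. A minimal recursive sketch in C that follows the formal definition is given here; the name fibonacci and the limit of 93 are choices made for this illustration.

```c
#include <assert.h>

/* Recursive Fibonacci, following the formal definition. */
unsigned long long fibonacci(unsigned int n)
{
    /* F93 is the largest term that fits in 64 bits; this naive
       recursion becomes very slow long before that limit. */
    assert(n <= 93);

    if (n == 0 || n == 1)
        return n;                               /* base criteria */
    return fibonacci(n - 2) + fibonacci(n - 1); /* Fn = Fn-2 + Fn-1 */
}
```

Because every call above the base criteria makes two further calls, the running time grows exponentially with n; the sketch is meant to mirror the definition, not to be efficient.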
