
Chapter 1: Introduction to Data Structures and Algorithms (4 Hours)

1.1 Introduction to Data Structures


Introduction
A data structure is a collection of data elements that provides an efficient way to store and
organize data in a computer, enabling effective data manipulation and retrieval. Common
examples of data structures include arrays, linked lists, stacks, queues, and more.
Data structures play a fundamental role in various areas of computer science, including
operating systems, compiler design, artificial intelligence, graphics, and many others. They
are essential for designing efficient algorithms, as they enable programmers to manage and
process data effectively. By selecting the appropriate data structure, software performance can be
significantly enhanced, ensuring faster storage and retrieval of user data.
Basic Terminology
Data structures form the backbone of any software or program. Choosing the right data structure
is often a challenging task for programmers. Below are some essential terms related to data
structures:
o Data: The smallest unit of information, which can be a single value or a collection of
values. Example: A student's name and ID represent data related to the student.
o Group Items: Data elements that contain subordinate data items. Example: A student's
name can be divided into first name and last name.
o Record: A collection of related data items. Example: A student's name, address, course,
and marks together form a record.
o File: A collection of records of the same entity type. Example: A company's employee
database consists of multiple employee records stored in a file.
o Attribute & Entity: An entity represents a category of objects, and each entity has
attributes that define its properties. Example: A student is an entity, and name, age,
and roll number are attributes.
o Field: The smallest unit of data within a record, representing an attribute of an entity.
Need for Data Structures
As applications become increasingly complex and data volumes grow exponentially, several
challenges arise:
1. Processor Speed: Managing vast amounts of data requires high-speed processing. When
data grows into billions of records, even powerful processors may struggle to handle
such large datasets efficiently.
2. Data Search: Searching within large datasets can be time-consuming. For example, if an
inventory contains 1 million (10⁶) items, searching for a particular item requires
scanning each item sequentially, slowing down the process.
3. Multiple Requests: When thousands of users query data simultaneously (e.g., on a web
server), the system can become overloaded, leading to failures or slow performance.
To overcome these issues, efficient data structures are necessary. They help organize data so
that searching, inserting, updating, and deleting operations can be performed quickly without
scanning every element.
Advantages of Data Structures
1. Efficiency: Choosing the right data structure can significantly improve performance.
o Example: Searching in an unordered array requires scanning all elements
sequentially. However, binary search trees or hash tables allow faster lookups.
2. Reusability: Once implemented, data structures can be reused in multiple applications.
They can be compiled into libraries and shared among different programs.
3. Abstraction: Data structures can be presented as Abstract Data Types (ADTs), which
provide a level of abstraction: users interact with the interface without needing to
understand the underlying implementation.
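
As an illustration of this abstraction, here is a minimal C sketch (the names stack_init,
stack_push, and stack_pop are illustrative, not from any standard library) of a small
array-based stack whose callers use only the provided functions and never touch the internal
array or top index directly:

#include <stdio.h>
#include <stdbool.h>

/* A fixed-capacity stack of ints. Callers only use init/push/pop/is_empty;
   they never touch the internal array or the top index directly.           */
#define STACK_CAPACITY 100

struct stack {
    int items[STACK_CAPACITY];
    int top;                                    /* index of the next free slot */
};

void stack_init(struct stack *s)           { s->top = 0; }
bool stack_is_empty(const struct stack *s) { return s->top == 0; }

bool stack_push(struct stack *s, int value)     /* returns false on overflow */
{
    if (s->top == STACK_CAPACITY) return false;
    s->items[s->top++] = value;
    return true;
}

bool stack_pop(struct stack *s, int *value)     /* returns false on underflow */
{
    if (stack_is_empty(s)) return false;
    *value = s->items[--s->top];
    return true;
}

int main(void)
{
    struct stack s;
    stack_init(&s);
    stack_push(&s, 10);
    stack_push(&s, 20);

    int v;
    while (stack_pop(&s, &v))
        printf("%d\n", v);                      /* prints 20 then 10 (LIFO order) */
    return 0;
}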
Data Structure Classification
Linear and Non-Linear Data Structures
Linear Data Structures
A data structure is called linear if all its elements are arranged in a sequential order. In linear
data structures, elements are stored in a non-hierarchical manner: each element has a unique
predecessor and successor, except the first element (no predecessor) and the last element (no
successor).
Types of Linear Data Structures
1. Arrays:
o An array is a collection of elements of the same data type, where each element is called
an array element.
o Elements share the same variable name but are accessed using an index (subscript).
o Arrays can be one-dimensional, two-dimensional, or multidimensional.
o Example:
int age[100]; // An array of size 100
The elements are referenced as age[0], age[1], ..., age[99].
2. Linked List:
o A linked list is a linear data structure where elements (nodes) are stored at non-
contiguous memory locations.
o Each node contains a pointer to its adjacent node.
o It allows dynamic memory allocation and is more efficient for insertions and deletions
than arrays (a minimal node sketch follows this list).
3. Stack:
o A stack is a Last In, First Out (LIFO) data structure where insertions and deletions
happen at one end, called the top.
o It can be implemented using arrays or linked lists.
o Example: A stack behaves like a pile of plates, where only the topmost plate can be
accessed at any time.
4. Queue:
o A queue is a First In, First Out (FIFO) data structure where insertions happen at the
rear and deletions at the front.
o Example: A queue of people at a ticket counter follows FIFO order.
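
Returning to the linked list described in item 2 above, the following is a minimal C sketch
(the names node and push_front are illustrative) of a singly linked node, insertion at the
front, and a traversal that follows the next pointers:

#include <stdio.h>
#include <stdlib.h>

/* A node of a singly linked list: the data plus a pointer to the next node. */
struct node {
    int          data;
    struct node *next;
};

/* Insert a new value at the front of the list and return the new head.
   Inserting at the front takes constant time, regardless of list length. */
struct node *push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) return head;       /* allocation failed: leave list unchanged */
    n->data = value;
    n->next = head;
    return n;
}

int main(void)
{
    struct node *head = NULL;
    head = push_front(head, 3);
    head = push_front(head, 2);
    head = push_front(head, 1);

    /* Traverse the list by following the next pointers. */
    for (struct node *p = head; p != NULL; p = p->next)
        printf("%d ", p->data);       /* prints: 1 2 3 */
    printf("\n");

    /* Free the nodes. */
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
    return 0;
}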
Non-Linear Data Structures
A non-linear data structure does not maintain a sequential order. Instead, elements are
connected in a complex way where each element can be linked to multiple other elements.
Types of Non-Linear Data Structures
1. Trees:
o A tree is a hierarchical data structure consisting of nodes.
o The root node is the topmost node, and leaf nodes are nodes without children.
o Nodes are connected based on a parent-child relationship.
o Trees can be classified into binary trees, binary search trees, AVL trees, etc. (a minimal
node sketch follows this list).
2. Graphs:
o A graph consists of a set of vertices (nodes) connected by edges.
o Unlike trees, graphs can contain cycles.
o Graphs are widely used in networks, social media, and shortest path algorithms.
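
For the trees described above, here is a minimal C sketch, assuming a binary tree in which
each node links to at most two children; the names tnode, make_node, and inorder are
illustrative:

#include <stdio.h>
#include <stdlib.h>

/* A binary tree node: each node links to at most two children. */
struct tnode {
    int           data;
    struct tnode *left;
    struct tnode *right;
};

static struct tnode *make_node(int value)
{
    struct tnode *n = malloc(sizeof *n);
    if (n == NULL) { perror("malloc"); exit(1); }
    n->data = value;
    n->left = n->right = NULL;
    return n;
}

/* In-order traversal: visit the left subtree, then the node, then the right subtree. */
static void inorder(const struct tnode *n)
{
    if (n == NULL) return;
    inorder(n->left);
    printf("%d ", n->data);
    inorder(n->right);
}

int main(void)
{
    /* Build a small tree with root 2, left child 1, right child 3. */
    struct tnode *root = make_node(2);
    root->left  = make_node(1);
    root->right = make_node(3);

    inorder(root);                    /* prints: 1 2 3 */
    printf("\n");

    free(root->left);
    free(root->right);
    free(root);
    return 0;
}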
Operations on Data Structures
1. Traversing: Visiting each element of a data structure to perform an operation like
searching or sorting.
o Example: Calculating the average marks of a student requires traversing the entire
array of marks.
2. Insertion: Adding an element at any position in the data structure.
o If the structure is already full (it already holds its maximum number of elements),
attempting a further insertion results in overflow.
3. Deletion: Removing an element from the data structure at any position.
o If the structure is empty and a deletion is attempted, it results in underflow.
4. Searching: Locating an element in a data structure.
o Two common searching techniques (a C sketch of both appears after this list):
 Linear Search
 Binary Search (for sorted data)
5. Sorting: Arranging elements in a specific order (ascending or descending).
o Common sorting algorithms:
 Insertion Sort
 Selection Sort
 Bubble Sort
6. Merging: Combining two lists A and B of sizes M and N, respectively, to form a new list
C of size M+N.
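
The following is a minimal C sketch of the two searching techniques mentioned above; linear
search works on any array, while binary search assumes the array is already sorted:

#include <stdio.h>

/* Linear search: scan every element until the key is found.
   Works on any array; takes up to n comparisons.            */
int linear_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key) return i;
    return -1;                        /* not found */
}

/* Binary search: repeatedly halve the search range.
   Requires a sorted array; takes about log2(n) comparisons. */
int binary_search(const int a[], int n, int key)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key)      return mid;
        else if (a[mid] < key)  lo = mid + 1;
        else                    hi = mid - 1;
    }
    return -1;                        /* not found */
}

int main(void)
{
    int a[] = { 2, 5, 8, 12, 16, 23, 38 };     /* already sorted */
    int n = sizeof a / sizeof a[0];

    printf("%d\n", linear_search(a, n, 16));   /* prints 4 */
    printf("%d\n", binary_search(a, n, 16));   /* prints 4 */
    return 0;
}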

Algorithm

An algorithm is a procedure with well-defined steps for solving a particular problem. It is a
finite set of logic or instructions, written in order, to accomplish a certain predefined task. It is
not a complete program or code; it is just the solution (logic) to a problem, which can be
represented informally using a flowchart or pseudocode.

The major categories of algorithms are given below:

o Sort: Algorithm developed for sorting items in a certain order.

o Search: Algorithm developed for searching for an item inside a data structure.

o Delete: Algorithm developed for deleting an existing element from a data structure.

o Insert: Algorithm developed for inserting an item into a data structure.

o Update: Algorithm developed for updating an existing element inside a data structure.

The performance of an algorithm is measured on the basis of the following properties (a short
example follows this list):


o Time complexity: The amount of time a program needs to run to completion.
o Space complexity: The amount of memory space an algorithm requires during the course
of its execution. Space complexity matters most when memory is limited and in
multi-user systems.
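
A small C example, annotated with its complexities, makes the two measures concrete: summing
an array touches every element once, so time grows linearly with n, while only a fixed number of
extra variables is needed, so the extra space is constant:

#include <stdio.h>

/* Sum of the first n elements of an array.
   Time complexity:  O(n)  -- the loop body runs n times.
   Space complexity: O(1)  -- only a fixed number of extra variables
                              (sum, i) are used, regardless of n.     */
long sum_array(const int a[], int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

int main(void)
{
    int marks[] = { 70, 85, 90, 65 };
    printf("%ld\n", sum_array(marks, 4));   /* prints 310 */
    return 0;
}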

Each algorithm must have:

o Specification: Description of the computational procedure.

o Pre-conditions: The condition(s) on input.

o Body of the Algorithm: A sequence of clear and unambiguous instructions.

o Post-conditions: The condition(s) on output.

Example: Design an algorithm to multiply the two numbers x and y and display the result in z.

o Step 1 START

o Step 2 declare three integers x, y & z

o Step 3 define values of x & y

o Step 4 multiply values of x & y

o Step 5 store the output of step 4 in z

o Step 6 print z

o Step 7 STOP

Alternatively, the algorithm can be written as:

o Step 1 START MULTIPLY

o Step 2 get values of x & y

o Step 3 z← x * y

o Step 4 display z

o Step 5 STOP
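
The same steps translate into C roughly as follows (the concrete values 6 and 7 are chosen
arbitrarily for illustration):

#include <stdio.h>

int main(void)
{
    int x, y, z;               /* Step 2: declare three integers           */

    x = 6;                     /* Step 3: define values of x and y         */
    y = 7;

    z = x * y;                 /* Steps 4-5: multiply and store result in z */

    printf("z = %d\n", z);     /* Step 6: print z (prints z = 42)          */
    return 0;                  /* Step 7: stop                             */
}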
Characteristics of an Algorithm

An algorithm must have the following characteristics:

o Input: An algorithm must have zero or more well-defined inputs.

o Output: An algorithm must produce one or more well-defined outputs that match the
desired result.
o Finiteness: An algorithm must terminate after a finite number of steps.

o Independent: An algorithm must consist of step-by-step directions that are independent of
any programming language.
o Unambiguous: An algorithm must be unambiguous and clear. Each step and its
inputs/outputs must have only one possible interpretation.

Asymptotic Analysis

Asymptotic analysis of an algorithm is a method of describing the mathematical bounds of its
run-time performance. Using asymptotic analysis, we can draw conclusions about the best-case,
average-case, and worst-case behavior of an algorithm.

It is used to estimate mathematically the running time of any operation inside an algorithm.

Example: Suppose the running time of one operation grows as f(n) and the running time of
another grows as f(n²). The first operation's running time increases linearly as n increases,
while the second's increases quadratically. For sufficiently small n, the running times of the two
operations are nearly the same.
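
A C sketch of the two kinds of growth (the function names are hypothetical): the first function
makes a single pass over the data, so its running time grows linearly, while the second uses
nested passes, so its running time grows quadratically and doubling n makes it roughly four
times slower:

#include <stdio.h>

/* Count how many elements equal key: one pass, about n steps,
   so the running time grows linearly with n.                  */
long count_equal(const int a[], int n, int key)
{
    long count = 0;
    for (int i = 0; i < n; i++)
        if (a[i] == key) count++;
    return count;
}

/* Count pairs (i, j) with i < j whose values sum to key: nested passes,
   about n*n/2 steps, so the running time grows quadratically with n.   */
long count_pairs_with_sum(const int a[], int n, int key)
{
    long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (a[i] + a[j] == key) count++;
    return count;
}

int main(void)
{
    int a[] = { 1, 2, 3, 4, 5 };
    printf("%ld\n", count_equal(a, 5, 3));           /* prints 1 */
    printf("%ld\n", count_pairs_with_sum(a, 5, 6));  /* prints 2: (1,5) and (2,4) */
    return 0;
}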

Usually, the time required by an algorithm falls under three cases:

Worst case: It defines the input for which the algorithm takes the maximum time.

Average case: It defines the input for which the algorithm takes an average amount of time.

Best case: It defines the input for which the algorithm takes the minimum time.
Asymptotic Notations

The asymptotic notations commonly used to describe the running-time complexity of an
algorithm are given below:

o Big oh Notation (Ο)

o Omega Notation (Ω)

o Theta Notation (θ)

Big oh Notation (O)

It is the formal way to express the upper bound of an algorithm's running time. It measures the
worst-case time complexity, i.e. the longest amount of time the algorithm can take to complete
its operation.

For example: If f(n) and g(n) are two functions defined on the positive integers, then f(n) is
O(g(n)) (read "f(n) is big oh of g(n)" or "f(n) is of the order of g(n)") if there exist constants c
and n₀ such that:

f(n) ≤ c·g(n) for all n ≥ n₀


This implies that f(n) does not grow faster than g(n); in other words, g(n) is an upper bound on
the function f(n).
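
For example, if f(n) = 3n + 2 and g(n) = n, then f(n) is O(n): choosing c = 4 and n₀ = 2 gives
3n + 2 ≤ 4n for all n ≥ 2, so the condition above holds.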

Omega Notation (Ω)

It is the formal way to represent the lower bound of an algorithm's running time. It measures the
best-case time complexity, i.e. the least amount of time an algorithm can possibly take to
complete.

If we want to state that an algorithm takes at least a certain amount of time, without giving an
upper bound, we use big-Ω notation (the Greek letter "omega"). It bounds the growth of the
running time from below for large input sizes.

If the running time is Ω(f(n)), then for large enough n the running time is at least k·f(n) for
some constant k.

Theta Notation (θ)

It is the formal way to express both the upper bound and the lower bound of an algorithm's
running time.

If the running time of an algorithm is θ(f(n)), then once n gets large enough, the running time
is at most k₂·f(n) and at least k₁·f(n) for some constants k₁ and k₂.
Common Asymptotic Notations

constant - O(1)

linear - O(n)

logarithmic - O(log n)

n log n - O(n log n)

exponential - 2^O(n)

cubic - O(n³)

polynomial - n^O(1)

quadratic - O(n²)
