
DATA ANALYTICS, MACHINE LEARNING AND AI WITH PYTHON

Welcome To The World Of Data


Why Is Data So Important?
Simply put ... it can speak.

What does it speak, or do?

It gives the answers to your questions. Some examples:

1. What has your company's performance been over the last several years, how gradually is profit increasing or decreasing, and, most importantly, what steps are necessary for moving the business forward?
2. Who are my customers?
3. What should my best marketing strategy be to increase sales?


How Can We Hear What Data Is Saying?
We can hear those profitable words by using
Data Analytics


There are many confusing terms out there, so what is the difference between them?

Data Science
Data Analytics
Data Analysis
AI
ML
And
DL
What is Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining, machine learning and big data.
What Is Data Analysis
Data analysis refers to the process of examining, transforming and arranging a given data set in specific ways in order to study its individual parts and extract useful information.


What is Data Analytics
Data analytics is an overarching science or discipline that encompasses the complete management of data. This includes not only analysis, but also data collection, organisation, storage, and all the tools and techniques used.

Data analytics techniques enable you to take raw data and uncover patterns to extract valuable insights from it.
AI, ML and DL
AI : AI enables machines to think without any human intervention.
ML : ML is a subset of AI that uses statistical learning algorithms to build smart systems. ML systems can automatically learn and improve without being explicitly programmed.
DL : This subset of AI is a technique inspired by the way a human brain filters information. It is associated with learning from examples. DL systems help a computer model filter the input data through layers to predict and classify information.
So How Does Data Analytics Make It Possible?
Data analytics makes use of many different technologies, such as:
Statistics
Excel
ML, DL
And many more...
to hear those words and also to predict results.


Before Starting Our Data Journey, Let's Familiarize Ourselves With Some Basics


A Data Analyst Should Know...
How to get the data

How to explore the data

How to make the dataset fit for efficient prediction

How to do the analysis and modeling

How to communicate and present the findings


Get Your Data
You should know how to fetch a dataset from a database.

As you will be part of an organization, there is a very high chance that you will have a database from which to get the data for analysis.

So the main skill you need to have is...


Exploration of the Data
Make yourself familiar with the data.
Look at every aspect of the dataset and get as much insight as you can.

Tools required for that...


Make the Dataset Fit for Efficient Prediction
Every time you get a dataset, there is a high chance it will be very messy.

You cannot make an accurate analysis on such a dataset, so you need to clean it.
This process will usually take 60% to 70% of your project time.

Tools required for this...


Do the Analysis and Modeling

It's all about making predictions from your data and making the right decisions.

Here you will use the most exciting thing, which is machine learning.


Communicate and Present Your Findings

It's necessary to present your analysis to your client as well as to your management.


Now Let’s Start Our Journey



First, Let's Understand What Is...


Data
Data is a plural word; its singular form is datum.

Datum is a Latin word meaning "something given".

Data is a collection of text, numbers, dates, symbols and images.


Types Of Data
1. Structured Data: arranged in rows and columns; easy to locate. Example: Excel.
2. Unstructured Data: not arranged in rows and columns; difficult to locate. Example: images, videos.


Python



Python
Modules:
1. Fundamentals of Python
2. Variables and Data Types
3. Operators
4. Collections
5. Conditional Statements
6. Looping Statements
7. Control Statements
8. Functions
9. Scope of Variables
10. Input-Output
11. Files and Exception Handling
12. OOPs Concepts
Introduction of Python
● Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.
● Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development.
● Python supports modules and packages, which encourages program modularity and code reuse.
● The Python interpreter and the extensive standard library are available in source or binary form without charge for all major platforms, and can be freely distributed.
Why Python?
Designed to be easy to learn and master
○ Clean, clear syntax
○ Very few keywords

Highly portable
○ Runs almost anywhere, from high-end servers and workstations down to Windows CE
○ Uses machine-independent byte-code

Extensible
○ Designed to be extensible using C/C++

Features of Python
Clean syntax plus high-level data types
○ Leads to fast coding (first language in many universities abroad!)

Uses white-space to delimit blocks
○ Humans generally do, so why not the language?
○ Try it, you will end up liking it

Variables do not need declaration

Features of Python
● Reduced development time: code is 2-10x shorter than C, C++ or Java

● Improved program maintenance: code is extremely readable

● Less training: the language is very easy to learn

Programming Style
● Python programs/modules are written as text files, traditionally with a .py extension.

● Each Python module has its own discrete namespace.

● The namespace within a Python module is a global one.

● Python modules and programs are differentiated only by the way they are called.
Programming Style
● .py files executed directly are programs (often referred to as scripts).

● .py files referenced via the import statement are modules.

● Thus, the same .py file can be a program/script or a module.


Installation

Anaconda Installation

Individual Edition

Click Here
print() function
● The print function in Python outputs to your console window whatever you tell it to print.

● At first blush, it might appear that the print function is rather useless for programming, but it is actually one of the most widely used functions in all of Python. The reason for this is that it makes for a great debugging tool.
Refer this example:

1.1.1 Print sample

● Click here

1.1.2 single quotation and double quotation

● Click here
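As a quick illustration of the print calls referenced above (the strings and values are illustrative, not taken from the linked files):

  print("Hello, world")            # double quotation marks
  print('Hello, world')            # single quotation marks work the same way
  print("The answer is", 42)       # print is a handy quick debugging tool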
Escape Sequences

1.1.3 Escape sequence:

Click here
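A short sketch of the most common escape sequences, with illustrative strings:

  print("Line one\nLine two")      # \n inserts a newline
  print("Name:\tPython")           # \t inserts a tab
  print("She said \"hi\"")         # \" prints a literal double quote
  print('It\'s Python')            # \' prints a literal single quote
  print("C:\\temp")                # \\ prints a single backslash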
end=" "
● end=' ' just says that you want a space after the end of the printed statement instead of a new-line character.

Refer to this example:


● 1.1.4 end practical: click here
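A minimal sketch of the end= parameter; the output shown in the comments is what these illustrative calls produce:

  print("Loading", end=" ")        # no newline, just a trailing space
  print("done")                    # output: Loading done
  print("A", end="")
  print("B")                       # output: AB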
sep=" "
● The separator between the arguments to the print() function in Python is a space by default (the softspace feature); it can be changed to any character or string of our choice.

● The 'sep' parameter is used to achieve this; it is found only in Python 3.x or later. It is also used for formatting output strings.

Refer to this example:
● 1.1.5 sep practical:
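A minimal sketch of the sep parameter with illustrative values:

  print("2024", "01", "15", sep="-")   # 2024-01-15
  print("a", "b", "c", sep=" | ")      # a | b | c
  print("no", "separator", sep="")     # noseparator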
Comments
● A comment is a programmer-readable explanation or annotation in the source code of a computer program. Comments are added with the purpose of making the source code easier for humans to understand, and are generally ignored by compilers and interpreters.

● Types of comments:

○ Single-line comment: indicated by #

○ Documentation comment: """ statements """
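A short sketch of both comment styles (the code itself is illustrative):

  # This is a single-line comment; everything after # is ignored.
  x = 5  # comments can also follow code on the same line

  """
  A triple-quoted string used as a documentation comment (docstring).
  Written on its own, it has no effect on the program.
  """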


Variable and Data Types
Variables
Variable: a name which can store a value.
Variables
● Unlike some other programming languages, Python has no command for declaring a variable.

● A variable is created the moment you first assign a value to it.

E.g. number = 20

     age = 21

Note: variables do not need to be declared with any particular type.

Variable Declaration Rules:
● A variable can have a short name (like a and b) or a more descriptive name (age, username, product_price).

● A variable name must start with a letter or the underscore character.

● A variable name cannot start with a number:

  10name = "python"

● A variable name can only contain alphanumeric characters and underscores (A-z, 0-9, and _).
Variable Declaration Rules:
● Variable names are case-sensitive (age, Age and AGE are three different variables).

  NAME = "python"
  print(name)

  Error: NameError: name 'name' is not defined


Refer to these examples:
1.1.6 Variable sample

● Click here
https://github.com/TopsCode/Python/blob/master/Module1/1.1%20Programing%20Style/1.1.6%20Variable.py

1.1.7 Sum of two numbers

● Click here
https://github.com/TopsCode/Python/blob/master/Module1/1.1%20Programing%20Style/1.1.7%20sum%20of%20two%20numbers(variable).py

1.1.8 Swapping of two numbers
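A small sketch of variable assignment and the swap referenced in example 1.1.8, with illustrative values:

  a = 10          # no declaration needed; assignment creates the variable
  b = 20
  a, b = b, a     # Pythonic swap using tuple unpacking
  print(a, b)     # 20 10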


Data Types
1. Immutable:

   Whose value can't be modified.

   Ex: int, float, string, tuple

2. Mutable:

   Whose value can be modified.

   Ex: list, set, dictionary


Integer:

Integers are positive or negative whole numbers with no decimal point.

Float:

Floats represent real numbers and are written with a decimal point.

Strings:

Python does not have a character data type; a single character is simply a string with a length of 1. Square brackets can be used to access elements of the string.
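A brief sketch of these basic types with illustrative values:

  count = 42                 # int (immutable)
  price = 19.99              # float (immutable)
  name = "Python"            # str (immutable)
  print(name[0])             # 'P' - square brackets index into the string
  colors = ["red", "green"]  # list (mutable)
  colors.append("blue")      # lists can be modified in place
  print(type(count), type(price), type(name), type(colors))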
Operators in Python
● To perform specific operations we need to use some symbols, and those symbols are operators.

Example:

A + B

Here, + is an operator,

A and B are operands, and

A + B is an expression.
Arithmetic Operators
Assignment Operators
Logical Operators
Comparison Operators
Identity Operators
Membership Operators
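A compact sketch touching each operator category listed above (the values are illustrative):

  a, b = 7, 3
  print(a + b, a - b, a * b, a % b)   # arithmetic: 10 4 21 1
  a += 1                              # assignment (a is now 8)
  print(a > b, a == b)                # comparison: True False
  print(a > 0 and b > 0)              # logical: True
  print(a is b)                       # identity: False
  print(3 in [1, 2, 3])               # membership: True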
Collections
● List

Operations , Functions and methods

● Tuple

Operations , Functions and methods

● Dictionaries

Operations , Functions and methods

● Set

Operations , Functions and methods


List [ ]
1. Introduction
2. Accessing list
3. Operations
4. Working with lists
5. Function and Methods
1. Introduction
● Python knows a number of compound data types, used to group together other values.

● The most versatile is the list, which can be written as a list of comma-separated values (items) between square brackets.

● Lists might contain items of different types, but usually the items all have the same type.
2. Accessing List
● Like strings (and all other built-in sequence types), lists can be indexed and sliced.

  fruits = ['apple', 'orange', 'banana', 'grapes']

  Example: fruits[0]

  Example: fruits[-3:-1]
3. Operations
● "in" operator: used to check if an element is present in the list. Returns True if the element is present in the list, else returns False.

● "not in" operator: used to check if an element is not present in the list. Returns True if the element is not present in the list, else returns False.
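A small sketch of list indexing, slicing and the membership operators, using the fruits list from the earlier slide:

  fruits = ['apple', 'orange', 'banana', 'grapes']
  print(fruits[0])              # 'apple'  (first item)
  print(fruits[-1])             # 'grapes' (last item)
  print(fruits[-3:-1])          # ['orange', 'banana']  (slice, end index excluded)
  print('apple' in fruits)      # True
  print('mango' not in fruits)  # True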


4. Working:
2.1.1 List as Queue

● Click here

2.1.2 List Demo

● Click Here

2.1.3 List operations

● Click here

2.1.4 List pattern


5. Functions and Methods
Tuple
1. Introduction
2. Accessing tuples
3. Operations
4. Working
5. Functions and Methods
1. Introduction
● A tuple is a sequence of immutable Python objects.

● Tuples are sequences, just like lists.

● The differences between tuples and lists are that tuples cannot be changed, unlike lists, and tuples use parentheses whereas lists use square brackets.

● Eg: fruits = ("Mango", "Banana", "Oranges", 23, 44)

● Eg: numbers = (11, 22, 33, 44)

● Eg: fruits = "Mango", "Banana", "Oranges"
Introduction
● Unlike lists, tuples are immutable. This means that elements of a tuple cannot be changed once it has been assigned.

● But if an element is itself a mutable data type like a list, its nested items can be changed.

● We can also assign a tuple to different values (reassignment).

● Also, we cannot delete or remove individual items from a tuple.


2. Accessing tuples
● There are various ways in which we can access the elements of a tuple.

● We can use the index operator [] to access an item in a tuple.

● The index starts from 0 and must be an integer.

● Python allows negative indexing for its sequences.

● The index -1 refers to the last item, -2 to the second last item, and so on.

● We can access a range of items in a tuple by using the slicing operator : (colon).
3. Operations

● With tuples we can do concatenation, repetition, iteration, etc.
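A short sketch of tuple creation, indexing and the operations mentioned above (values are illustrative):

  fruits = ("Mango", "Banana", "Oranges")
  print(fruits[0], fruits[-1])        # Mango Oranges
  print(fruits + ("Apple",))          # concatenation -> a new tuple
  print(fruits * 2)                   # repetition
  for item in fruits:                 # iteration
      print(item)
  # fruits[0] = "Kiwi"                # would raise TypeError: tuples are immutable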


4. Working
2.2.1 add item tuple

Click here

2.2.2 convert list tuple

Click here

2.2.3 convert tuple string

Click here

2.2.4 create tuple with numbers


5. Functions
Refer this example :
2.2.5 create tuple

Click here

2.2.6 find repeat item tuple

Click here

2.2.7 slice tuple

Click here
Set
A set is an unordered collection of unique elements.

To create a set:

set1=set((1,2,3,4,3,4,5,6,5,5,4))

print(set1)

Output: {1, 2, 3, 4, 5, 6}
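A small sketch of common set operations on illustrative values:

  a = {1, 2, 3, 4}
  b = {3, 4, 5}
  a.add(6)               # add a single element
  print(a | b)           # union: {1, 2, 3, 4, 5, 6}
  print(a & b)           # intersection: {3, 4}
  print(a - b)           # difference: {1, 2, 6}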
Dictionaries
1. Introduction
2. Accessing values in dictionaries
3. Working with dictionaries
4. Properties
5. Functions
Introduction
● Dictionaries are sometimes found in other languages as "associative memories" or "associative arrays".

● Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys.

● Tuples can be used as keys if they contain only strings, numbers, or tuples; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key.
1. Introduction
● You can’t use lists as keys, since lists can be modified in place using index
assignments, slice assignments, or methods like append() and extend().

● The main operations on a dictionary are storing a value with some key and
extracting the value given the key.

● Like lists they can be easily changed, can be shrunk and grown ad libitum
at run time. They shrink and grow without the necessity of making copies.
Dictionaries can be contained in lists and vice versa.
Introduction

● A list is an ordered sequence of objects, whereas a dictionary is an unordered collection.

● But the main difference is that items in dictionaries are accessed via keys and not via their position.
2. Accessing Values
● To access dictionary elements, we can use the familiar square brackets
along with the key to obtain its value.

● It is an error to extract a value using a non-existent key.

● We can also create a dictionary using the built-in class dict() (constructor).

● We can test if a key is in a dictionary or not using the keyword in.

● The membership test is for keys only, not for values.
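A minimal sketch of creating a dictionary and accessing values by key (the data is illustrative):

  student = {"name": "Ramesh", "age": 21}
  print(student["name"])              # access by key -> 'Ramesh'
  student["city"] = "Ahmedabad"       # add / update an entry
  print("age" in student)             # membership test is on keys -> True
  print(dict(rollno=1, dept="CS"))    # building a dict with the dict() constructor
  # print(student["marks"])           # a non-existent key would raise KeyError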


3. Working
2.3.1 Dictionary example

Click here

2.3.2 Dictionary method demo

Click here
4. Properties
● Properties of Dictionaries

○ Dictionary values have no restrictions.

○ They can be any arbitrary Python object, either standard objects or user-defined objects.

○ However, the same is not true for the keys.

● More than one entry per key is not allowed.

○ This means no duplicate key is allowed.

○ When duplicate keys are encountered during assignment, the last assignment wins.


Properties

● Keys must be immutable.

○ This means you can use strings, numbers or tuples as dictionary keys, but something like ['key'] is not allowed.


5. Methods and Functions
Conditional Statements
Types of conditional Statements
● If .. Statement

● If.. else Statement

● If..Elif..else Statement

● Nested if Statement
If Statements
● It is similar to that of other languages.

● The if statement contains a logical expression using which data is compared, and a decision is made based on the result of the comparison.

● Syntax:

  if condition:
      statements
If .. else statement
● It is similar to that of other languages.

● It is frequently the case that you want one thing to happen when a condition is true, and something else to happen when it is false.

● For that we have the if else statement.

● Syntax:

  if condition:
      statements
  else:
      statements
If..elif..else statement
● It is similar to that of other languages.

● The elif is short for else if. It allows us to check for multiple expressions.

● If the condition for if is False, it checks the condition of the next elif block, and so on.

● If all the conditions are False, the body of else is executed.

● Only one block among the several if...elif...else blocks is executed, according to the condition.
If..elif..else statement
● Syntax:

  if condition:
      statement
  elif condition:
      statement
  else:
      statement
Nested if....else statement
● There may be a situation when you want to check for another condition after a condition resolves to true.

● In such a situation, you can use the nested if construct.

● Syntax:

  if condition:
      statements
      if condition:
          statements
      else:
          statement(s)
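A small worked sketch combining if, elif, else and a nested if (the marks value is illustrative):

  marks = 72
  if marks >= 60:
      print("First class")
      if marks >= 70:                 # nested if
          print("With distinction")
  elif marks >= 40:
      print("Pass")
  else:
      print("Fail")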
Refer this Example :
1.2.1 if statement

Click here

1.2.2 if else statement

Click here

1.2.3 elif statement

Click here

1.2.4 Nested if statement


Looping Statements
Loop Statement
● A loop statement allows us to execute a statement or group of statements multiple times.

● The Python programming language provides the following types of loops to handle looping requirements:

○ For Loop

○ While Loop
For Loops
● The for loop has the ability to iterate over the items of any sequence, such as a list or a string.

● Syntax:

  for iterating_var in sequence:
      statements(s)

● If the sequence contains an expression list, it is evaluated first.

● Then, the first item in the sequence is assigned to the iterating variable iterating_var.
● Next, the statements block is executed.

● Each item in the list is assigned to iterating_var, and the statement(s) block is executed until the entire sequence is exhausted.

Refer this Example :


1.3.1 for loop

Click here

1.3.2 for loop with list

Click here
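A short sketch of a for loop over a list and a string, in the spirit of examples 1.3.1 and 1.3.2 (data illustrative):

  fruits = ["apple", "orange", "banana"]
  for fruit in fruits:                # fruit takes each item in turn
      print(fruit)
  for ch in "abc":                    # strings are sequences too
      print(ch)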
Nested Loops
A nested loop means a loop statement inside another loop statement.

Syntax:

  for iterating_var in sequence:
      for iterating_var in sequence:
          statements(s)
      statements(s)
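A small sketch of a nested loop printing a simple pattern, similar in spirit to the pattern examples referenced below:

  rows = 3
  for i in range(1, rows + 1):        # outer loop: one pass per row
      for j in range(i):              # inner loop: i stars in row i
          print("*", end=" ")
      print()                         # newline after each row
  # Output:
  # *
  # * *
  # * * *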
Refer this Example :
1.3.5 nested for loop: Click here

1.3.6 pattern 1: Click here

1.3.7 pattern 2: Click here

1.3.8 pattern 3: Click here

1.3.9 pattern 4: Click here


range() function

● To loop through a set of code a specified number of times, we can use the range() function.

● The range() function returns a sequence of numbers, starting from 0 by default, incrementing by 1 (by default), and ending at a specified number.

● The range() function defaults to 0 as the starting value; however, it is possible to specify the starting value by adding a parameter: range(2, 6), which means values from 2 to 6 (but not including 6).
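A brief sketch of range() with one, two and three arguments:

  print(list(range(5)))          # [0, 1, 2, 3, 4]     start defaults to 0
  print(list(range(2, 6)))       # [2, 3, 4, 5]        6 is excluded
  print(list(range(10, 0, -2)))  # [10, 8, 6, 4, 2]    a negative step counts down
  for i in range(3):
      print("iteration", i)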
Refer this Example :

1.3.3 for loop with range

Click here

1.3.4 for loop with decrement range

Click here
While Loop
● A while loop statement in the Python programming language repeatedly executes a target statement as long as a given condition is true.

● Here, statement(s) may be a single statement or a block of statements. The condition may be any expression, and true is any non-zero value. The loop iterates while the condition is true.

● Syntax:

  while expression:
      statement(s)
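A minimal while loop sketch (the counting is illustrative):

  count = 1
  while count <= 3:        # loop runs while the condition is true
      print("count =", count)
      count += 1           # forgetting this would make the loop infinite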
Control Statements
Control Statements
● Loop control statements change execution from its normal sequence.

● When execution leaves a scope, all automatic objects that were created in
that scope are destroyed.

● Python supports the following control statements.

1. Break
2. Continue
3. Pass
Break Statement
● It brings control out of the loop and transfers execution to the
statement immediately following the loop.

● Syntax : break

Refer this Example :

1.4.1 break statement

Click Here
Continue Statements

● It continues with the next iteration of the loop.

● Syntax : continue

Refer this Example :

1.4.2 continue statement

Click here
Pass Statements
● The pass statement does nothing.

● It can be used when a statement is required syntactically but the program


requires no action.

● Syntax : pass
● Refer this Example :

1.4.3 pass statement

Click here
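One compact sketch showing break, continue and pass together (illustrative values):

  for i in range(1, 6):
      if i == 3:
          continue          # skip 3 and go to the next iteration
      if i == 5:
          break             # leave the loop entirely at 5
      print(i)              # prints 1, 2, 4

  def todo():
      pass                  # placeholder: syntactically required, does nothing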
FUNCTIONS
Function Definition
● A function is a block of organized, reusable code that is used to perform a single, related action.

● Functions provide better modularity for your application and a high degree of code reuse.
Types of Function
Defining a Function
● Defining a function

○ Python gives us many built-in functions like print(), etc., but we can also create our own functions.

○ A function in Python is defined by a def statement. The general syntax looks like this:

  def function_name(parameter_list):
      statements            # the function body
      return [expression]
Defining a Function
● The keyword "def" introduces a function definition.

● It must be followed by the function name and the parenthesized list of formal parameters. The statements that form the body of the function start at the next line, and must be indented.

● The "return" statement returns with a value from a function. "return" without an expression argument returns None.

● Falling off the end of a function also returns None.

Calling a Function
● Once a function is defined, we can call it directly or from within another function.

● Syntax:

  functionname() or functionname(argument)

3.1.1 Create function

Click here
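A minimal sketch of defining and calling a function, in the spirit of example 3.1.1 (the function name and values are illustrative):

  def greet(name):
      """Return a greeting for the given name."""
      return "Hello, " + name

  print(greet("Ramesh"))    # calling the function with an argument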
Function Arguments
● It is possible to define functions with a variable number of
arguments.

● The function arguments can be

○ Default arguments values

○ Keyword arguments
Function Arguments
● Default argument values

○ The most useful form is to specify a default value for one or more arguments.

○ This creates a function that can be called with fewer arguments than it is defined to allow.

○ Eg. def employeeDetails(name, gender='male', age=35)

○ This function can be called in several ways:

○ giving only the mandatory argument: employeeDetails("Ramesh")

Function Arguments
○ giving one of the optional arguments: employeeDetails("Ramesh", "Female")

○ or even giving all arguments: employeeDetails("Ramesh", "Female", 31)

● Note: the default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes.
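A runnable sketch of the employeeDetails example from these slides, showing the different ways it can be called:

  def employeeDetails(name, gender="male", age=35):
      print(name, gender, age)

  employeeDetails("Ramesh")                  # only the mandatory argument
  employeeDetails("Ramesh", "Female")        # one optional argument supplied
  employeeDetails("Ramesh", "Female", 31)    # all arguments supplied
  employeeDetails("Ramesh", age=40)          # keyword argument skips gender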
Keyword Arguments

○ Functions can also be called using keyword arguments of the form kwarg=value.

○ For instance, the following function:

  def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):

○ accepts one required argument (voltage) and three optional arguments (state, action, and type).
Function Arguments
● Arbitrary Argument Lists

○ These arguments will be wrapped up in a tuple. Before the variable number of arguments, zero or more normal arguments may occur.

○ Eg:

  def write_multiple_items(file, separator, *args):
      file.write(separator.join(args))

○ Normally, these variadic arguments will be last in the list of formal parameters, because they scoop up all remaining input arguments that are passed to the function.
Function Arguments
○ Any formal parameters which occur after the *args parameter are 'keyword-only' arguments, meaning that they can only be used as keywords rather than positional arguments.

  def concat(*args, sep="/"):
      return sep.join(args)

  concat("earth", "mars", "venus")           # 'earth/mars/venus'
  concat("earth", "mars", "venus", sep=".")  # 'earth.mars.venus'


Refer this example :
3.1.2 Function with parameter: click here

3.1.3 Function with default value: click here

3.1.4 Function with return values: click here

3.1.5 tuple as parameter: click here

3.1.6 dict as parameter: click here


Lambda functions
○ Lambda functions can have any number of arguments but only one expression.

○ The expression is evaluated and returned.

○ Lambda functions can be used wherever function objects are required.

○ Python supports a style of programming called functional programming, where you can pass functions to other functions to do stuff.

3.2.1 lambda function

Click here
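A brief lambda sketch, including the common pattern of passing a lambda to another function (values illustrative):

  square = lambda x: x * x                # one expression, implicitly returned
  print(square(5))                        # 25

  nums = [3, 1, 2]
  print(sorted(nums, key=lambda n: -n))   # [3, 2, 1]  lambda used as a key function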
Scope of Variables
Global variables
○ Defining a variable at the module level makes it a global variable; you don't need the global keyword.

○ The global keyword is needed only if you want to reassign a global variable inside a function/method.
Local variables
○ If a variable is assigned a value anywhere within the function's body, it's assumed to be local unless explicitly declared as global.

○ Local variables of functions can't be accessed from outside.

3.3.1 global variable: click here

3.3.2 local variable: click here

3.3.3 global keyword: click here
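A small sketch of global vs. local scope and the global keyword (names are illustrative):

  counter = 0                 # module-level (global) variable

  def bump():
      global counter          # needed because we reassign the global
      counter += 1

  def show():
      local_value = 10        # local: not visible outside this function
      print(counter, local_value)

  bump()
  show()                      # 1 10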


Modules
Introduction
● A module is a file containing Python definitions and statements.

● The file name is the module name with the suffix .py appended.

● Within a module, the module's name (as a string) is available as the value of the global variable __name__.


Importing Modules
● Modules can import other modules.

● It is customary but not required to place all import statements at the beginning of a module (or script, for that matter).

● The imported module names are placed in the importing module's global symbol table.

  Eg: import fibo

● Using the module name we can access its functions:

  fibo.fib(10)
Importing Module
● There is a variant of the import statement that imports names from a
module directly into the importing module’s symbol table.

For example:

from fibo import fib, fib2

fib(500)

another way to import is

from fibo import *
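The fibo module above comes from the Python tutorial; as a self-contained sketch of the same idea, a standard-library module such as math can be imported in exactly the same ways:

  import math                     # import the whole module
  print(math.sqrt(16))            # 4.0, accessed via the module name

  from math import sqrt, pi       # import specific names directly
  print(sqrt(25), pi)             # 5.0 3.141592653589793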


Math Module
● This module is always available.

● It provides access to the mathematical functions defined by the C standard.

● These functions are divided into categories such as number-theoretic and representation functions, power and logarithmic functions, trigonometric functions, angular conversion, hyperbolic functions, and special functions.

● Constants:

● math.pi : The mathematical constant π = 3.141592..., to available precision.
Math Module
● math.inf : A floating-point positive infinity. (For negative infinity, use -math.inf.) Equivalent to the output of float('inf').

● math.nan : A floating-point "not a number" (NaN) value. Equivalent to the output of float('nan').
4.3.1 Math module

Click here
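A few of the math module's functions and constants, as a quick sketch:

  import math

  print(math.pi)               # 3.141592653589793
  print(math.inf, -math.inf)   # positive and negative infinity
  print(math.isnan(math.nan))  # True
  print(math.sqrt(2))          # 1.4142135623730951
  print(math.factorial(5))     # 120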
Packages
● Packages are a way of structuring Python's module namespace by using "dotted module names".

● For example, the module name A.B designates a submodule named B in a package named A.

● A package is basically a directory with Python files.

4.4.1 Package example

Click here
Input-Output
Reading from keyboard
● To read data from the keyboard, input() is used.

● The input from the user will be returned as a string without any changes.

● If this raw input has to be transformed into another data type needed by the algorithm, we can use either a casting function or the eval function.
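A minimal input() sketch; the prompt text and variable names are illustrative, and int() serves as the casting function:

  name = input("Enter your name: ")      # always returns a string
  age = int(input("Enter your age: "))   # cast the string to an int
  print("Hello", name, "- next year you will be", age + 1)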
Printing on screen
● print() is used to print on screen.

● print(value, ..., sep=' ', end='\n', file=sys.stdout)

● Prints the values to a stream, or to sys.stdout by default.

● Optional keyword arguments:

  file: a file-like object (stream); defaults to the current sys.stdout.
  sep: string inserted between values, default a space.
  end: string appended after the last value, default a newline.
Files and Exceptions Handling
Opening and Closing Files
● open() is used to open a file.

● It returns a file object.

● Syntax: open(fileName, mode)

● fileName: name of the file that we want to open.

● mode: 'r' (only for reading), 'w' (only for writing), 'a' (for appending), 'r+' (for reading and writing).

● Normally, files are opened in text mode, which means you read and write strings from and to the file, encoded in a specific encoding.

● If the encoding is not specified, the default is platform dependent.

Closing the file
Call f.close() to close the file and free up any system resources taken up by the open file.
Reading and writing files
● f.read(size): used to read a file's contents.

● If size is omitted or negative, the entire content is returned.

● If the end of the file has been reached, f.read() will return an empty string ('').

● f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn't end in a newline.

● For reading lines from a file, you can loop over the file object.

Reading and writing files
● f.write(string): writes the contents of string to the file, returning the number of characters written.

● Other types of objects need to be converted, either to a string (in text mode) or a bytes object (in binary mode), before writing them.

● f.tell(): returns an integer giving the file object's current position in the file, represented as the number of bytes from the beginning of the file when in binary mode and an opaque number when in text mode.

● f.seek(): changes the file object's position: f.seek(offset, from_what)
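A small end-to-end sketch of writing and then reading a text file (the file name is illustrative):

  # Write, then read back, a small text file.
  f = open("demo.txt", "w")
  f.write("first line\nsecond line\n")
  f.close()

  f = open("demo.txt", "r")
  print(f.readline(), end="")   # 'first line\n'
  for line in f:                # loop over the remaining lines
      print(line, end="")
  f.close()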


Refer this examples
5.1.1 File example

Click here

5.2.1 Create folder

Click here

5.2.2 delete folder

Click here
Exception Handling
Exception

Exception handling

Try , except and finally clause

User Defined Exceptions


Exception
● In Python, there are two distinguishable kinds of errors: syntax errors and exceptions.

● Syntax errors are also known as parsing errors.

● Eg:

  for i in range(1,10)
      print(i)

  Here the missing ":" after the for statement is a syntax error.

● Exceptions: even if a statement or expression is syntactically correct, it may cause an error when an attempt is made to execute it.

● Errors detected during execution are called exceptions.


Exception
● Exception handling in Python is very similar to Java.

● But whereas in Java exceptions are caught by catch clauses, in Python we have statements introduced by an "except" keyword.

● Eg:

  n = int(input("Please enter a number: "))
  Please enter a number: 23.50

  An exception occurs:

  ValueError: invalid literal for int() with base 10: '23.50'


Except clause
● A try statement may have more than one except clause, for different exceptions.

● But at most one except clause will be executed.

6.1.1 Exception example 1

Click here

6.1.2 Exception example 2

Click here
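A brief try/except sketch with more than one except clause, following the int() example above:

  try:
      n = int(input("Please enter a number: "))
      print(100 / n)
  except ValueError:
      print("That was not a whole number.")
  except ZeroDivisionError:
      print("Cannot divide by zero.")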
Try ... finally clause
● So far, the try statement had always been paired with except clauses. But there is another way to use it as well.

● The try statement can be followed by a finally clause.

● Finally clauses are called clean-up or termination clauses, because they must be executed under all circumstances, i.e. a "finally" clause is always executed regardless of whether an exception occurred in the try block or not.

6.2.1 Try finally use

Click Here
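A minimal try/finally sketch; the finally block runs whether or not an exception occurs (the file name is illustrative):

  try:
      f = open("demo.txt", "r")
      print(f.read())
  except FileNotFoundError:
      print("File not found.")
  finally:
      print("This clean-up message always prints.")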
User defined Exception

● Python also allows you to create your own exceptions by deriving classes from the standard built-in exceptions.

  class MyNewError(Exception):
      pass

  raise MyNewError("Something happened in my program")


OOPs Concepts
● Class and object

● Attributes

● Inheritance

● Overloading

● Overriding
Class And Object
● Python classes provide all the standard features of Object Oriented Programming: the class inheritance mechanism allows multiple base classes, a derived class can override any methods of its base class or classes, and a method can call the method of a base class with the same name.

● The class definition looks like:

  class ClassName:
      statement 1
      statement 2
      ...
Class And Object
● The statements inside a class definition will usually be function definitions, but other statements are also allowed.

● When a class definition is entered, a new namespace is created and used as the local scope; thus, all assignments to local variables go into this new namespace.

● In particular, function definitions bind the name of the new function here.
Member methods in class
● The class_suite consists of all the component statements defining class
members, data attributes and functions.

● The class attributes are data members (class variables and instance
variables) and methods, accessed via dot notation.

● Eg. displayDetails()
Object
● Class objects support two kinds of operations: attribute references and instantiation.

● Attribute references use the standard syntax used for all attribute references in Python: obj.name

● Valid attribute names are all the names that were in the class's namespace when the class object was created.

● Class instantiation uses function notation.

● Just pretend that the class object is a parameterless function that returns a new instance of the class.

7.1.1 class and object example1: Click here

7.1.2 class and object example2: Click here
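A minimal class sketch showing a class definition, instantiation and attribute references (the class and attribute names are illustrative):

  class Student:
      college = "TOPS"                 # class attribute

      def __init__(self, name):
          self.name = name             # instance attribute

      def displayDetails(self):        # member method
          print(self.name, Student.college)

  s = Student("Ramesh")                # instantiation uses function notation
  s.displayDetails()                   # attribute/method reference: obj.name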


Inheritance
● Python supports inheritance; it even supports multiple inheritance.

● Classes can inherit from other classes.

● A class can inherit attributes and behaviour methods from another class, called the superclass.

● A class which inherits from a superclass is called a subclass, also called an heir class or child class.

● Superclasses are sometimes called ancestors as well.

7.2.1 Inheritance: Click Here
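A short inheritance sketch (the class names are illustrative):

  class Animal:                        # superclass
      def speak(self):
          return "Some sound"

  class Dog(Animal):                   # subclass inherits from Animal
      def speak(self):                 # overriding the superclass method
          return "Woof"

  print(Animal().speak())   # Some sound
  print(Dog().speak())      # Woof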


Overloading
● Python supports operator and function overloading.

● Method overloading

● Overloading is the ability to define the same method, with the same name but with a different number of arguments and types.
● It's the ability of one function to perform different tasks, depending on the number of parameters or the types of the parameters.
● Python operators work for built-in classes.
● But the same operator behaves differently with different types.
Overloading
● For example, the + operator will perform arithmetic addition on two numbers, merge two lists, and concatenate two strings.

● This feature in Python, which allows the same operator to have different meanings according to the context, is called operator overloading.

7.3.1 Operator overloading example1: Click here

7.3.2 Operator overloading example2: Click here

7.3.3 Operator overloading example3: Click here
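A small operator-overloading sketch: defining __add__ lets + work on a user-defined class (the Point class is illustrative):

  class Point:
      def __init__(self, x, y):
          self.x, self.y = x, y

      def __add__(self, other):            # overloads the + operator
          return Point(self.x + other.x, self.y + other.y)

  p = Point(1, 2) + Point(3, 4)
  print(p.x, p.y)                          # 4 6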


SQL
SQL
Modules:
● DBMS and RDBMS
● E-R Relational Schema
● Types of database
● Normalization
● Algebra
● Database Programming Language SQL
● Keys: Primary Key, Unique Key, Foreign Key
● SQL Statement types: DML, DDL, DQL, TCL
● Joins
● Functions
● Transaction Concept (Rollback, Commit, Save Point)
DBMS
● DBMS stands for Data Base Management System.

  Data + Management System

● A database is a collection of inter-related data, and a management system is a set of programs to store and retrieve that data.
● A DBMS is a collection of inter-related data and a set of programs to store and access that data in an easy and effective manner.
● For example, a university database organizes the data about students, faculty and admin staff etc., which helps in efficient retrieval, insertion and deletion of data.
Database Management Systems
● A DBMS consists of 2 main pieces:
○ the data
○ the DB engine
○ the data is typically stored in one or more files

● Two most common types of DBMS are:


○ Local
○ Server
Popular DBMS Software

Here, is the list of some popular DBMS system:

➢ MySQL
➢ Microsoft Access
➢ Oracle
➢ PostgreSQL
➢ SQLite
➢ Mongo DB
➢ IBM DB2, etc.
What is the need of DBMS?
Database systems are basically developed for large amount of data. When
dealing with huge amount of data, there are two things that require
optimization: Storage of data and retrieval of data.

Storage:
● According to the principles of database systems, the data is stored in such a way that it
acquires lot less space as the redundant data (duplicate data) has been removed before
storage.

Fast Retrieval of data:

● Along with storing the data in an optimized and systematic manner, it is also important that we retrieve the data quickly when needed. Database systems ensure that the data is retrieved as quickly as possible.
PURPOSE OF DBMS
The main purpose of database systems is to manage the data.

Consider a university that keeps the data of students, teachers, courses,


books etc. To manage this data we need to store this data somewhere where
we can add new data, delete unused data, update outdated data, retrieve
data, to perform these operations on data we need a Database management
system that allows us to store the data in such a way so that all these
operations can be performed on the data efficiently.
RDBMS
● Stands for "Relational Database Management System."
● An RDBMS is a type of DBMS designed specifically for relational databases.
● A relational database refers to a database that stores data in a structured format, using rows and columns.
● This makes it easy to locate and access specific values within the database.
● It is "relational" because the values within each table are related to each other.
● Tables may also be related to other tables. The relational structure makes it possible to run queries across multiple tables at once.
● The RDBMS refers to the software that executes queries on the data, including adding, updating, and searching for values.
● An RDBMS may also provide a visual representation of the data. For example, it may display data in a table like a spreadsheet, allowing you to view and even edit individual values in the table.
Types of DBMS ...
1.Hierarchical databases-

● It is very fast and simple.


● This kind of database model uses a tree-like structure which links a
number of dissimilar elements to one primary record –the "owner"
or "parent".
● Each record in a hierarchical database contains information about a
group of parent child relationships.
2. Network databases –

● The network database model can be viewed as a net-like form


where a single element can point to multiple data elements and can
itself be pointed to by multiple data elements.
● The network database model allows each record to have multiple
parents as well as multiple child records, which can be visualized as
a web-like structure of networked records.
3. Relational databases –

● A relational database is one in which data is stored in the form of


tables, using rows and columns.
● This arrangement makes it easy to locate and access specific data
within the database. It is “relational” because the data within each
table are related to each other.
4. Object-oriented databases -

● The recent development in database technology is the incorporation of the object concept that has become significant in programming languages.
● In object-oriented databases, all data are objects. Objects may be linked to each other by an "is-part-of" relationship to represent larger, composite objects.
● Object DBMSs increase the semantics of the C++ and Java programming languages.
Relational Databases
● RDBMS is the basis for SQL, and for all modern database systems like MS
SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.

● Most of today's databases are relational:


○ database contains 1 or more tables
○ table contains 1 or more records
○ record contains 1 or more fields
○ fields contain the data
E-R Model ...
● The ER or (Entity Relational Model) is a high-level conceptual data model
diagram.
● Entity-Relation model is based on the notion of real-world entities and
the relationship between them.
● ER modeling helps you to analyze data requirements systematically to
produce a well-designed database.
● So, it is considered a best practice to complete ER modeling before
implementing your database.
ER Diagram
● Entity relationship diagram displays the relationships of entity set stored
in a database.
● In other words, we can say that ER diagrams help you to explain the
logical structure of databases.
● At first look, an ER diagram looks very similar to the flowchart. However,
ER Diagram includes many specialized symbols, and its meanings make
this model unique.
ER Diagram
Components of ER Diagram
● Entities
● Attributes
● Relationships
● For example, in a University database, we might have entities for
Students, Courses, and Lecturers. Students entity can have attributes like
Rollno, Name, and DeptID. They might have relationships with Courses
and Lecturers.
Components of ER Diagram
Entity
● A real-world thing either living or non-living that is easily recognizable
and non recognizable. It is anything in the enterprise that is to be
represented in our database. It may be a physical thing or simply a fact
about the enterprise or an event that happens in the real world.
● An entity can be place, person, object, event or a concept, which stores
data in the database. The characteristics of entities are must have an
attribute, and a unique key. Every entity is made up of some 'attributes'
which represent that entity.
● The set of all entities is called an entity set.

Examples of entities:
Attribute
Attributes are the properties which define the entity type. For example, Roll_No, Name, DOB, Age, Address and Mobile_No are the attributes which define the entity type Student. In an ER diagram, an attribute is represented by an oval.

1. Key Attribute:

● The attribute which uniquely identifies each entity in the entity set is called a key attribute.
● For example, Roll_No will be unique for each student. In an ER diagram, a key attribute is represented by an oval with underlying lines.
2. Composite Attribute –

An attribute composed of many other attribute is called as composite


attribute. For example, Address attribute of student Entity type consists of
Street, City, State, and Country. In ER diagram, composite attribute is
represented by an oval comprising of ovals
3. Multivalued Attribute –

An attribute consisting of more than one value for a given entity. For example, Phone_No (can be more than one for a given student). In an ER diagram, a multivalued attribute is represented by a double oval.
4. Derived Attribute –

An attribute which can be derived from other attributes of the entity type is
known as derived attribute. e.g.; Age (can be derived from DOB). In ER
diagram, derived attribute is represented by dashed oval.
The complete entity type Student with its
attributes can be represented as:
Relationship Type and Relationship Set:
A relationship type represents the association between entity types. For
example, ‘Enrolled in’ is a relationship type that exists between entity type
Student and Course. In ER diagram, relationship type is represented by a
diamond and connecting the entities with lines.

A set of relationships of same type is known as relationship set. The


following relationship set depicts S1 is enrolled in C2, S2 is enrolled in C1 and
S3 is enrolled in C3.
Degree of a relationship set:
1. Unary Relationship –

When there is only ONE entity set participating in a relation, the relationship is called a unary relationship. For example, one person is married to only one person.

2. Binary Relationship –

When there are TWO entity sets participating in a relation, the relationship is called a binary relationship. For example, Student is enrolled in Course.

3. N-ary Relationship –

When there are n entity sets participating in a relation, the relationship is called an n-ary relationship.


The number of times an entity of an entity set participates in a relationship set is known as cardinality. Cardinality can be of different types:

One to one – When each entity in each entity set can take part only once in the relationship, the cardinality is one to one. Let us assume that a male can marry one female and a female can marry one male. So the relationship will be one to one.

Many to one – When entities in one entity set can take part only once in the relationship set and entities in the other entity set can take part more than once in the relationship set, the cardinality is many to one.

Many to many – When entities in all entity sets can take part more than once in the relationship, the cardinality is many to many. Let us assume that a student can take more than one course and one course can be taken by many students. So the relationship will be many to many.

Participation Constraint: a participation constraint is applied to the entity participating in the relationship set.

Total Participation – Each entity in the entity set must participate in the relationship. If each student must enroll in a course, the participation of student will be total. Total participation is shown by a double line in the ER diagram.

Partial Participation – The entity in the entity set may or may NOT participate in the relationship. If some courses are not enrolled in by any of the students, the participation of course will be partial. The diagram depicts the 'Enrolled in' relationship set with the Student entity set having total participation and the Course entity set having partial participation.
Weak Entity Type and Identifying Relationship:

● An entity type has a key attribute which uniquely identifies each entity in the entity set.
● But there exist some entity types for which a key attribute can't be defined. These are called weak entity types.
● For example, a company may store the information of dependents (parents, children, spouse) of an employee. But the dependents don't have existence without the employee. So Dependent will be a weak entity type and Employee will be the identifying entity type for Dependent.
Algebra
Relational algebra is a procedural query language, which takes a relation as input and generates a relation as output. Relational algebra mainly provides the theoretical foundation for relational databases and SQL.

Unary Relational Operations
● SELECT (symbol: σ)
● PROJECT (symbol: π)
● RENAME (symbol: ρ)

Binary Relational Operations
● JOIN
● DIVISION

Relational Algebra Operations From Set Theory
● UNION (∪)
● INTERSECTION (∩)
● DIFFERENCE (-)
● CARTESIAN PRODUCT (x)
What is SQL?
● SQL is Structured Query Language, which is a computer language for storing, manipulating and retrieving data stored in a relational database.
● SQL is the language of databases; it covers database creation, deletion, fetching rows, modifying rows, etc.
● SQL is the standard language for Relational Database Systems. All relational database management systems like MySQL, MS Access, Oracle, Sybase, Informix, PostgreSQL and SQL Server use SQL as the standard database language.
● They also use different dialects, such as:
● MS SQL Server using T-SQL, ANSI SQL
● Oracle using PL/SQL
● MS Access version of SQL is called JET SQL (native format) etc.
Objectives
“The large majority of today's business applications revolve around relational databases and the SQL
programming language (Structured Query Language). Few businesses could function without these
technologies…”
Why SQL?
● Allows users to access data in relational database management systems.

● Allows users to describe the data.

● Allows users to define the data in database and manipulate that data.

● Allows to embed within other languages using SQL modules, libraries & pre-compilers.

● Allows users to create and drop databases and tables.

● Allows users to create view, stored procedure, functions in a database.

● Allows users to set permissions on tables, procedures, and views


What is SQL? (Cont…)
● SQL stands for Structured Query Language
● SQL allows you to access a database
● SQL is an ANSI standard computer language
● SQL can execute queries against a database
● SQL can retrieve data from a database
● SQL can insert new records in a database
● SQL can delete records from a database
● SQL can update records in a database
● SQL is easy to learn
● SQL is written in the form of queries
● action queries insert, update & delete data
● select queries retrieve data from DB
Keys
Primary Key:

● A primary key is a column of a table which uniquely identifies each tuple (row) in that table.
● A primary key enforces integrity constraints on the table.
● Only one primary key is allowed in a table.
● The primary key does not accept duplicate or NULL values.
● The primary key value in a table changes very rarely, so it is chosen with care where changes can occur only in a seldom manner.
● A primary key of one table can be referenced by a foreign key of another table.
Keys
Unique Key:
● A unique key constraint also identifies an individual tuple uniquely in a relation or table.
● A table can have more than one unique key, unlike the primary key.
● A unique key constraint can accept only one NULL value for a column.
● Unique constraints are also referenced by the foreign key of another table.
● It can be used when someone wants to enforce unique constraints on a column or a group of columns which is not a primary key.
Keys
Foreign Key:
● When one table's primary key field is added to a related "many" table in order to create the common field which relates the two tables, it is called a foreign key in the "many" table.

● In the example given below, the salary of an employee is stored in the Salary table, but his employee info is stored in the Employee table. The relation is established via the foreign key column "Employee_ID_Ref", which refers to the "Employee_ID" field in the Employee table. For example, the salary of "Jhon" is stored in the "Salary" table; to identify the salary of "Jhon", his "employee id" is stored with each salary record.
Database Normalization ...
● Normalization is the process of minimizing redundancy (duplicity) from a relation or set of relations.
● Redundancy in relation may cause insertion, deletion and updation anomalies. So, it helps to
minimize the redundancy in relations.

Most Commonly used normal forms:


● First normal form(1NF)
● Second normal form(2NF)
● Third normal form(3NF)
● Boyce & Codd normal form (BCNF)
First Normal Form
● If a relation contains a composite or multi-valued attribute, it violates first normal form; conversely, a relation is in first normal form if it does not contain any composite or multi-valued attribute.
● A relation is in first normal form if every attribute in that relation is a single-valued attribute.
Example 1 – Relation STUDENT in table 1 is not in 1NF because of the multi-valued attribute STUD_PHONE. Its decomposition into 1NF has been shown in table 2.
Second Normal Form
● To be in second normal form, a relation must be in first normal form and must not contain any partial dependency.
● A relation is in 2NF if it has no partial dependency, i.e., no non-prime attribute (an attribute which is not part of any candidate key) is dependent on any proper subset of any candidate key of the table.
● Partial Dependency – If a proper subset of a candidate key determines a non-prime attribute, it is called a partial dependency.
● The table is not in 2NF.
● Note that there are many courses having the same course fee.
● COURSE_FEE cannot alone decide the value of COURSE_NO or STUD_NO;
● COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO;
● COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO;
● Hence, COURSE_FEE would be a non-prime attribute, as it does not belong to the only candidate key {STUD_NO, COURSE_NO};
● But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO, which is a proper subset of the candidate key.
● The non-prime attribute COURSE_FEE is dependent on a proper subset of the candidate key, which is a partial dependency, and so this relation is not in 2NF.

To convert the above relation to 2NF, we need to split the table into two tables: Table 1: (STUD_NO, COURSE_NO) and Table 2: (COURSE_NO, COURSE_FEE).
Third Normal Form
● A relation is in third normal form if there is no transitive dependency for non-prime attributes and it is in second normal form. A relation is in 3NF if at least one of the following conditions holds in every non-trivial functional dependency X -> Y:
○ X is a super key.
○ Y is a prime attribute (each element of Y is part of some candidate key).
● Transitive dependency – If A->B and B->C are two FDs, then A->C is called a transitive dependency.

In relation STUDENT given in Table 4, FD set: {STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}; Candidate Key: {STUD_NO}.

For this relation in table 4, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY are true, so STUD_COUNTRY is transitively dependent on STUD_NO. This violates the third normal form. To convert it to third normal form, we decompose the relation STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) as: STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE) and STATE_COUNTRY (STATE, COUNTRY).
Boyce Codd Normal Form (BCNF)
● It is an advanced version of 3NF, which is why it is also referred to as 3.5NF. BCNF is stricter than 3NF.
● A table complies with BCNF if it is in 3NF and, for every functional dependency X->Y, X is a super key of the table.

Example: Suppose there is a company wherein employees work in more than one department. They store the data like this:
● Functional dependencies:
  emp_id -> emp_nationality
  emp_dept -> {dept_type, dept_no_of_emp}

● Candidate key: {emp_id, emp_dept}

The table is not in BCNF as neither emp_id nor emp_dept alone is a key.

● To make the table comply with BCNF we can break the table into three tables like this:
SQL Process
● When you are executing an SQL command for any RDBMS, the system determines the best way to
carry out your request and SQL engine figures out how to interpret the task.

● There are various components included in the process.


○ These components are Query Dispatcher, Optimization Engines, Classic Query Engine and SQL
Query Engine, etc.
○ Classic query engine handles all non-SQL queries but SQL query engine won't handle logical
files.
SQL Statement Types
● DDL–Data Definition Language
● DML–Data Manipulation Language
● DCL–Data Control Language
● DQL–Data Query Language
DDL -Data Definition Language

DQL –Data Query Language


DML –Data Manipulation Language

DCL –Data Control Language


Create, Drop, Use Database Syntax

● SQL CREATE DATABASE STATEMENT


○ CREATE DATABASE database_name;

● SQL DROP DATABASE Statement:


○ DROP DATABASE database_name;

● SQL USE STATEMENT

○ USE database_name;
Create, Drop, Alter Table Syntax
● SQL CREATE TABLE STATEMENT
○ CREATE TABLE table_name( column1 datatype, column2 datatype, column3 datatype, ..... ,
columnN datatype, PRIMARY KEY( one or more columns ) );

● SQL DROP TABLE STATEMENT


○ DROP TABLE table_name;

● SQL TRUNCATE TABLE STATEMENT


○ TRUNCATE TABLE table_name;

● SQL ALTER TABLE STATEMENT

○ ALTER TABLE table_name ADD new_col INT;
○ ALTER TABLE table_name {ADD|DROP|MODIFY} column_name {data_type};

● SQL ALTER TABLE STATEMENT (RENAME)

○ ALTER TABLE table_name RENAME TO new_table_name;
Insert, Update, Delete Syntax
● SQL INSERT INTO STATEMENT
○ INSERT INTO table_name (column1, column2....columnN) VALUES (value1, value2....valueN);

● SQL UPDATE STATEMENT

○ UPDATE table_name SET column1 = value1, column2 = value2....columnN = valueN [WHERE CONDITION];

● SQL DELETE STATEMENT

○ DELETE FROM table_name WHERE {CONDITION};
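To tie this back to Python, here is a small sketch using the standard-library sqlite3 module to run CREATE, INSERT, UPDATE and SELECT statements; the table and data are illustrative, not part of the course material:

  import sqlite3

  conn = sqlite3.connect(":memory:")               # throwaway in-memory database
  cur = conn.cursor()
  cur.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
  cur.execute("INSERT INTO employee (name, salary) VALUES (?, ?)", ("Jhon", 50000))
  cur.execute("UPDATE employee SET salary = ? WHERE name = ?", (55000, "Jhon"))
  cur.execute("SELECT id, name, salary FROM employee WHERE salary > ?", (40000,))
  print(cur.fetchall())                            # [(1, 'Jhon', 55000.0)]
  conn.commit()
  conn.close()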
Select Statement Syntax
● SQL SELECT STATEMENT
○ SELECT column1, column2....columnN FROM table_name;

● SQL DISTINCT CLAUSE


○ SELECT DISTINCT column1, column2....columnN FROM table_name;

● SQL WHERE CLAUSE


○ SELECT column1, column2....columnN FROM table_nameWHERE CONDITION;

● SQL AND/OR CLAUSE


○ SELECT column1, column2....columnN FROM table_name WHERE CONDITION-1 {AND|OR}
CONDITION-2;
Select Statement Syntax
● SQL IN CLAUSE
○ SELECT column1, column2....columnN FROM table_name WHERE column_name IN (val-1, val-2,...val-N);

● SQL BETWEEN CLAUSE


○ SELECT column1, column2....columnN FROM table_name WHERE column_name BETWEEN val-1
AND val-2;

● SQL LIKE CLAUSE


○ SELECT column1, column2....columnN FROM table_name WHERE column_name LIKE { PATTERN };

● SQL ORDER BY CLAUSE


○ SELECT column1, column2....columnN FROM table_name WHERE CONDITION ORDER BY
column_name {ASC|DESC};
Select Statement Syntax
● SQL GROUP BY CLAUSE
○ SELECT SUM(column_name) FROM table_name WHERE CONDITION GROUP BY
column_name;

● SQL COUNT CLAUSE


○ SELECT COUNT(column_name)FROM table_name WHERE CONDITION;

● SQL HAVING CLAUSE


○ SELECT SUM(column_name) FROM table_name WHERE CONDITION GROUP BY
column_name HAVING (arithmetic function condition);
Create and Drop Index Syntax
● SQL CREATE INDEX Statement :

○ CREATE UNIQUE INDEX index_name ON table_name( column1, column2,...columnN);

● SQL DROP INDEX STATEMENT


○ ALTER TABLE table_name DROP INDEX index_name;

● SQL DESC Statement :


○ DESC table_name;
Commit and Rollback Syntax

● SQL COMMIT STATEMENT


○ COMMIT;

● SQL ROLLBACK STATEMENT


○ ROLLBACK;
SQL Join Types
● INNER JOIN: returns rows when there is a match in both tables.
● LEFT JOIN: returns all rows from the left table, even if there are no matches in the right table.

● RIGHT JOIN: returns all rows from the right table, even if there are no matches in the left table.

● FULL JOIN: returns rows when there is a match in one of the tables.
JOIN
● A SQL Join statement is used to combine data or rows from two or more tables based on a common
field between them.

● Different types of Joins are:


1.INNER JOIN
2.LEFT JOIN
3.RIGHT JOIN
4.FULL JOIN
Inner Join Syntax
● The most frequently used and important of the joins is the
INNER JOIN. They are also referred to as an EQUIJOIN.

● The INNER JOIN creates a new result table by combining column values of two tables (table1 and
table2) based upon the join-predicate.
● The query compares each row of table1 with each row of
table2 to find all pairs of rows which satisfy the join-predicate.
When the join-predicate is satisfied, column values for each
matched pair of rows of A and B are combined into a result
row.
● SYNTAX:
○ SELECT table1.column1, table2.column2...FROM table1 INNER JOIN table2 ON
table1.common_field = table2.common_field;
Left Join Syntax
● The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table.
This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row
in the result, but with NULL in each column from right table.
● This means that a left join returns all the values from the left table, plus matched values from the
right table or NULL in case of no matching join predicate.
● SYNTAX:
○ SELECT table1.column1, table2.column2...FROM table1 LEFT JOIN table2 ON table1.common_field
= table2.common_field;
Right Join Syntax
● The SQL RIGHT JOIN returns all rows from the right table, even if there are no matches in the left
table. This means that if the ON clause matches 0 (zero) records in left table, the join will still return a
row in the result, but with NULL in each column from left table.
● This means that a right join returns all the values from the right table, plus matched values from the
left table or NULL in case of no matching join predicate.
● SYNTAX:
○ SELECT table1.column1, table2.column2...FROM table1 RIGHT JOIN table2 ON
table1.common_field = table2.common_field;
Full Join Syntax
● The SQL FULL JOIN combines the results of both left and right outer joins.
● The joined table will contain all records from both tables, and fill in NULLs for missing matches on
either side.
● SYNTAX:
○ SELECT table1.column1, table2.column2...FROM table1 FULL JOIN table2 ON table1.common_field
= table2.common_field;
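The following is a minimal runnable sketch of INNER JOIN and LEFT JOIN driven from Python's built-in
sqlite3 module; the customers and orders tables and their rows are hypothetical, and only these two join
types are shown because FULL JOIN needs SQLite 3.39+ or another RDBMS.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi"), (3, "Meena")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, 250.0), (2, 1, 120.5), (3, 2, 99.9)])

# INNER JOIN: only customers that have at least one matching order
cur.execute("SELECT customers.name, orders.amount FROM customers "
            "INNER JOIN orders ON customers.id = orders.customer_id")
print(cur.fetchall())

# LEFT JOIN: every customer; amount is NULL (None) when there is no matching order
cur.execute("SELECT customers.name, orders.amount FROM customers "
            "LEFT JOIN orders ON customers.id = orders.customer_id")
print(cur.fetchall())

conn.close()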
Function
SQL has many built-in functions for performing calculations on data. They are divided into 2 categories:

1. Aggregate Function
2. Scalar Function
Aggregate Function
These functions operate on the values of a column and return a single value.

● AVG()-Returns the average value


● COUNT()-Returns the number of rows
● FIRST()-Returns the first value
● LAST()-Returns the last value
● MAX()-Returns the largest value
● MIN()-Returns the smallest value
● SUM() -Returns the sum
1. AVG()
● Syntax:
○ SELECT AVG(column_name) FROM table_name;

● Example:
○ SELECT AVG(AGE) AS AvgAge FROM Students;

● Output: Student Table


2. COUNT()
● Syntax:
○ SELECT COUNT(column_name) FROM table_name;

● Example:
○ SELECT COUNT(*) AS NumStudents FROM Students;

● Output: Student Table


3. FIRST()
● Syntax:
○ SELECT FIRST(column_name) FROM table_name;

● Example:
○ SELECT FIRST(MARKS) AS MarksFirst FROM Students;

● Output: Student Table


4. LAST()
● Syntax:
○ SELECT LAST(column_name) FROM table_name;

● Example:
○ SELECT LAST(MARKS) AS MarksLast FROM Students;

Student Table
● Output:
5. MAX()
● Syntax:
○ SELECT MAX(column_name) FROM table_name;

● Example:
○ SELECT MAX(MARKS) AS MaxMarks FROM Students;

● Output: Student Table


6. MIN()
● Syntax:
○ SELECT MIN(column_name) FROM table_name;

● Example:
○ SELECT MIN(MARKS) AS MinMarks FROM Students;

MinMarks
● Output: 50 Student Table
7. SUM()
● Syntax:
○ SELECT SUM(column_name) FROM table_name;

● Example:
○ SELECT SUM(MARKS) AS TotalMarks FROM Students;

● Output: Student Table


Scalar functions
These functions are based on user input and also return a single value.

● UCASE() -Converts a field to upper case


● LCASE() -Converts a field to lower case
● MID() -Extract characters from a text field
● LENGTH() -Returns the length of a text field
● ROUND() -Rounds a numeric field to the number of decimals specified
● NOW() -Returns the current system date and time
● FORMAT() -Formats how a field is to be displayed
1. UCASE()
● Syntax:
○ SELECT UCASE(column_name) FROM table_name;

● Example:
○ SELECT UCASE(NAME) FROM Students;

● Output: Student Table


2. LCASE()
● Syntax:
○ SELECT LCASE(column_name) FROM table_name;

● Example:
○ SELECT LCASE(NAME) FROM Students;

● Output: Student Table


NAME
harsh
suresh
pratik
dhanraj
ram
3. MID()
● Syntax:
○ SELECT MID(column_name,start,length) AS some_name FROM table_name;
○ specifying length is optional here, and start signifies start position ( starting from 1 )

● Example:
○ SELECT MID(NAME,1,4) FROM Students;

● Output: Student Table


4. LENGTH()
● Syntax:
○ SELECT LENGTH(column_name) FROM table_name;

● Example:
○ SELECT LENGTH(NAME) FROM Students;

● Output: Student Table


5. ROUND()
● Syntax:
○ SELECT ROUND(column_name,decimals) FROM table_name;
○ decimals-number of decimals to be fetched.

● Example:
○ SELECT ROUND(MARKS,0) FROM table_name;

Student Table
● Output:
6. NOW()
● Syntax:
○ SELECT NOW() FROM table_name;

● Example:
○ SELECT NAME, NOW() AS DateTime FROM Students

● Output: Student Table


7. FORMAT()
● Syntax:
○ SELECT FORMAT(column_name) FROM table_name;

● Example:
○ SELECT NAME, FORMAT(Now(),'YYYY-MM-DD') AS Date FROM Students;

● Output:
Student Table
PROCEDURE
● A stored procedure is a prepared SQL code that you can save, so the code can be reused
over and over again.
● So if you have an SQL query that you write over and over again, save it as a stored
procedure, and then just call it to execute it.
● You can also pass parameters to a stored procedure, so that the stored procedure can act
based on the parameter value(s) that is passed.

● To create procedure, use syntax:


CREATE PROCEDURE procedure_name
AS
sql_statement
GO;

● To execute the created procedure, use syntax:


EXEC procedure_name;
Refer This Example
● Cursor Example:
○ https://github.com/TopsCode/Data_AnalyticsWithPython_ML_AI/blob/master/SQL/Cursor/exa
mple

● Stored Procedure with one parameter:


○ https://github.com/TopsCode/Data_AnalyticsWithPython_ML_AI/blob/master/SQL/Cursor/curs
orExampleOneparam

● Stored Procedure with multiple parameter:


○ https://github.com/TopsCode/Data_AnalyticsWithPython_ML_AI/blob/master/SQL/Cursor/mul
tipleParam
Trigger
● A trigger is a stored procedure in a database which is automatically invoked whenever a special event
occurs in the database.
● For example, a trigger can be invoked when a row is inserted into a specified table.

● Syntax:
create trigger [trigger_name]
[before | after]
{insert | update | delete}
on [table_name]
[for each row]
[trigger_body]
Explanation of syntax:
create trigger [trigger_name]:Creates or replaces an existing trigger with the trigger_name.

[before | after]:This specifies when the trigger will be executed.

{insert | update | delete}:This specifies the DML operation.

on [table_name]:This specifies the name of the table associated with the trigger.

[for each row]: This specifies a row-level trigger, i.e., the trigger will be executed for each row being
affected.

[trigger_body]: This provides the operation to be performed as trigger is fired

BEFORE and AFTER of Trigger:


BEFORE triggers run the trigger action before the triggering statement is run.
AFTER triggers run the trigger action after the triggering statement is run.
Refer This Example
● Before Trigger:
○ https://github.com/TopsCode/Data_AnalyticsWithPython_ML_AI/blob/master/SQL/Tri
gger/beforeTrigger

● After Trigger:
○ https://github.com/TopsCode/Data_AnalyticsWithPython_ML_AI/blob/master/SQL/Trigger/
afterTrigger
Transaction
● A transaction is a logical unit of work of database processing that includes one or more database
access operations.
● A transaction can be defined as an action or series of actions that is carried out by a single user or
application program to perform operations for accessing the contents of the database.
● The operations can include retrieval (Read), insertion (Write), deletion and modification.
● Each transaction begins with a specific task and ends when all the tasks in the group successfully
complete. If any of the tasks fail, the transaction fails. Therefore, a transaction has only two
results: success or failure.
● In order to maintain consistency in a database, before and after transaction, certain properties are
followed. These are called ACID properties(Atomicity, Consistency, Isolation, Durability) .
ACID PROPERTY
3. Isolation:

● In a database system where more than one transaction are being executed simultaneously and in
parallel, the property of isolation states that all the transactions will be carried out and executed as if
it is the only transaction in the system.
● No transaction will affect the existence of any other transaction.
● For example, in an application that transfers funds from one account to another, the isolation
property ensures that another transaction sees the transferred funds in one account or the other, but
not in both, nor in neither.
Transaction Control
The following commands are used to control transactions.

● COMMIT− to save the changes.

● ROLLBACK− to roll back the changes.

● SAVEPOINT− creates points within the groups of transactions in which to ROLLBACK.


1. COMMIT
● The COMMIT command is the transactional command used to save changes invoked by a transaction
to the database.

● The COMMIT command saves all the transactions to the database since the last COMMIT or
ROLLBACK command.

● The syntax for the COMMIT command is as follows:

○ COMMIT;
2. ROLLBACK
● The ROLLBACK command is the transactional command used to undo transactions that have not
already been saved to the database.

● This command can only be used to undo transactions since the last COMMIT or ROLLBACK command
was issued.

● The syntax for the COMMIT command is as follows:

○ ROLLBACK;
3. SAVEPOINT
● A SAVEPOINT is a point in a transaction when you can roll the transaction back to a certain point
without rolling back the entire transaction.

● The syntax for a SAVEPOINT command is as shown below.

○ SAVEPOINT SAVEPOINT_NAME;

● This command serves only in the creation of a SAVEPOINT among all the transactional statements.
The ROLLBACK command is used to undo a group of transactions.

● The syntax for rolling back to a SAVEPOINT is as shown below.

○ ROLLBACK TO SAVEPOINT_NAME;
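A minimal sketch of COMMIT and ROLLBACK driven from Python with the built-in sqlite3 module; the
accounts table and the amounts are hypothetical values used only for illustration.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, balance REAL)")
conn.execute("INSERT INTO accounts VALUES ('A', 1000), ('B', 500)")
conn.commit()                       # COMMIT: make the inserts permanent

try:
    conn.execute("UPDATE accounts SET balance = balance - 200 WHERE name = 'A'")
    conn.execute("UPDATE accounts SET balance = balance + 200 WHERE name = 'B'")
    conn.commit()                   # both updates succeed -> save them together
except sqlite3.Error:
    conn.rollback()                 # any failure -> undo everything since the last COMMIT

print(conn.execute("SELECT * FROM accounts").fetchall())
conn.close()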
Cursor
● It is a temporary work area in system memory used while a SQL statement is executed.
● A Cursor in SQL is a set of rows together with a pointer that identifies the current row.
● It is a database object used to retrieve data from a result set one row at a time.
● It is helpful when we need to process the records of a table one row at a time. The set of rows the
cursor holds is known as the active set.
Main components of Cursors
Each cursor contains the following parts:
● Declare Cursor: In this part we declare variables and return a set of values.
○ DECLARE cursor_name CURSOR FOR select_statement;

● Open: This is the entering part of the cursor.


○ OPEN cursor_name;

● Fetch: Used to retrieve the data row by row from a cursor.


○ FETCH cursor_name INTO variables_list;

● Close: This is an exit part of the cursor and used to close a cursor.
○ CLOSE cursor_name;
Syntax:
DECLARE variables;
records;
create a cursor;
BEGIN
OPEN cursor;
FETCH cursor;
process the records;
CLOSE cursor;
END;
Database Backup and Recovery
● Database Backup is storage of data that means the copy of the data.

● It is a safeguard against unexpected data loss and application errors.

● It protects the database against data loss.

● If the original data is lost, then it can be reconstructed using the backup.

● The backups are divided into two types,


1. Physical Backup
2. Logical Backup
1. Physical Backups
● Physical Backups are the backups of the physical files used in storing and recovering your database,
such as datafiles, control files and archived redo logs, log files.

● It is a copy of files storing database information to some other location, such as disk, some offline
storage like magnetic tape.

● Physical backups are the foundation of the recovery mechanism in the database.

● Physical backup provides the minute details about the transaction and modification to the database.
2. Logical Backup
● Logical Backup contains logical data which is extracted from a database.

● It includes backup of logical data like views, procedures, functions, tables, etc.

● It is a useful supplement to physical backups in many circumstances but not a sufficient protection
against data loss without physical backups, because logical backup provides only structural
information.
Importance of Backups
● Planning and testing backup helps against failure of media, operating system, software and any other
kind of failures that cause a serious data crash.

● It determines the speed and success of the recovery.

● Physical backup extracts data from physical storage (usually from disk to tape). Operating system is an
example of physical backup.

● Logical backup extracts data using SQL from the database and store it in a binary file.

● Logical backup is used to restore the database objects into the database. So the logical backup utilities
allow DBA (Database Administrator) to back up and recover selected objects within the database.
Causes of Failure
1.System Crash
● System crash occurs when there is a hardware or software failure or external factors like a power
failure.
● The data in secondary memory is not affected when the system crashes because the database maintains
its integrity. Checkpoints prevent the loss of data from secondary memory.

2. Transaction Failure
● A transaction failure affects only a few tables or processes and is caused by logical errors in the
code.
● This failure occurs when there are system errors like deadlock or unavailability of system resources to
execute the transaction.
3. Network Failure
● A network failure occurs when the communication network connecting a client–server configuration or a
distributed database system breaks down.

4. Disk Failure
● Disk Failure occurs when there are issues with hard disks like formation of bad sectors, disk head
crash, unavailability of disk etc.

5. Media Failure
● Media failure is the most dangerous failure because, it takes more time to recover than any other kind
of failures.
● A disk controller or disk head crash is a typical example of media failure.
Recovery
● Recovery is the process of restoring a database to the correct state in the event of a failure.

● It ensures that the database is reliable and remains in consistent state in case of a failure.

Database recovery can be classified into two parts;


1. Rolling Forward applies redo records to the corresponding data blocks.

2. Rolling Back applies rollback segments to the datafiles. It is stored in transaction tables.
Excel

Modules:

1. Introductions to Excel
2. Excel Functions
3. Excel Charts
Example : click here for formula and function practical
Example : click here for count practical
Example : click here for countif practical
Example : click here for countifs practical
Example : click here for sum practical
Example : click here for sumif practical
Example : click here for average practical
Example : click here for averageif practical
Example : click here for averageifs practical
Example : click here for iferror practical
Example : click here for vlookup practical
Example : click here for hlookup practical
Practice

Pivot Data Link: Click Here

Pivot Solution Link: Click Here


Example : click here for column chart practical
Example : click here for bar chart practical
Example : click here for pie chart practical
Example : click here for scatter chart practical
Example : click here for line chart practical
Example : click here for area chart practical
Pareto charts highlight the biggest factors in a set of data.

Following the idea of 80/20 analysis, they try to show which (approximately) 20% of
the categories contribute 80% of the data being measured.

Example : click here for average practical

Pros

● Quickly highlights most important data
● Excel automatically builds histogram and adds Pareto line
Statistics
Modules:
● Introduction
● Population
● Sample
● Statistic and Parameter
● Introduction of Descriptive Statistics
● Measures of Central Tendency
● Measures of Spread
● Introduction of Probability
● Terminology
● Types of Probability
● Probability Distribution
● Introduction to Inferential Statistics
● Estimation and errors
● Point estimation
● Confidence Interval
● Hypothesis and its types
What is Statistics?
Statistics is a mathematical science including methods of collecting, organizing and analyzing data in such a
way that meaningful conclusions can be drawn from them. In general, its investigations and analyses fall
into two broad categories called descriptive and inferential statistics.

Descriptive statistics deals with the processing of data without attempting to draw any inferences from it.
The data are presented in the form of tables and graphs. The characteristics of the data are described in
simple terms. Events that are dealt with include everyday happenings such as accidents, prices of goods,
business, incomes, epidemics, sports data, population data.

Inferential statistics is a scientific discipline that uses mathematical tools to make forecasts and projections
by analyzing the given data. This is of use to people employed in such fields as engineering, economics,
biology, the social sciences, business, agriculture and communications.
Introduction to Population and Sample
A population often consists of a large group of specifically defined elements. For example, the population of a
specific country means all the people living within the boundaries of that country.

Usually, it is not possible or practical to measure data for every element of the population under study. We
randomly select a small group of elements from the population and call it a sample. Inferences about the
population are then made on the basis of several samples.

Example:
● A company is thinking about buying 50,000 electric batteries from a manufacturer. It will buy the
batteries if no more than 1% of the batteries are defective. It is not possible to test each battery in the
population of 50,000 batteries since it takes time and costs money. Instead, it will select a few samples of
500 batteries each and test them for defects. The results of these tests will then be used to estimate the
percentage of defective batteries in the population.
Quantitative Data and Qualitative Data
Data is quantitative if the observations or measurements made on a given variable of a sample or
population have numerical values.

Example: height, weight, number of children, blood pressure, current, voltage.

Data is qualitative if words, groups and categories represents the observations or measurements.

Example: colors, yes-no answers, blood group.

Quantitative data is discrete if the corresponding data values take discrete values and it is continuous if the
data values take continuous values.

Example of discrete data: number of children, number of cars.


Example of continuous data: speed, distance, time, pressure.
Statistic and Parameter
Many have trouble understanding the difference between parameter and
statistic, but it’s important to know what exactly these measures mean and
how to distinguish them.

Parameter vs statistic – both are similar, yet different measures. The first one
describes the whole population, while the second describes a part of the
population.
What is Parameter?
It is a measure of a characteristic of an entire population (a mass of all units under consideration that share
common characteristics) based on all the elements within that population. For example, all people living in
one city, all-male teenagers in the world, all elements in a shopping trolley, or all students in a classroom.

If you ask all employees in a factory what kind of lunch they prefer and half of them say pasta, you get a
parameter here – 50% of the employees like pasta for lunch. On the other hand, it’s impossible to count
how many men in the whole world like pasta for lunch, since you can’t ask all of them about their choice. In
that case, you’d probably survey just a representative sample (a portion) of them and extrapolate the
answer to the entire population of men. This brings us to the other measure called statistic.

It’s a measure of characteristic saying something about a fraction (a sample) of the population under study.
A sample in statistics is a part or portion of a population. The goal is to estimate a certain population
parameter. You can draw multiple samples from a given population, and the statistic (the result) acquired
from different samples will vary, depending on the samples. So, using data about a sample or portion allows
you to estimate the characteristics of an entire population.
Parameter vs Statistics
Can you tell the difference between statistics and parameters now?

● A parameter is a fixed measure describing the whole population (population being a group of people, things,
animals, phenomena that share common characteristics.) A statistic is a characteristic of a sample, a portion
of the target population.
● A parameter is fixed, unknown numerical value, while the statistic is a known number and a variable which
depends on the portion of the population.
● Sample statistic and population parameters have different statistical notations:

In population parameter, population proportion is represented by P, mean is represented by µ (Greek letter mu), σ2
represents variance, N represents population size, σ (Greek letter sigma) represents standard deviation, σx̄
represents Standard error of mean, σ/µ represents Coefficient of variation, (X-µ)/σ represents standardized variate
(z), and σp represents standard error of proportion.

In sample statistics, mean is represented by x̄ (x-bar), sample proportion is represented by p̂ (p-hat), s represents
standard deviation, s2 represents variance, sample size is represented by n, sx̄ represents Standard error of mean,
sp represents standard error of proportion, s/(x̄) represents Coefficient of variation, and (x-x̄)/s represents
standardized variate (z).
Example of Parameters

● 20% of U.S. senators voted for a specific measure. Since there are only 100 senators, you can count
how each of them voted.

Example of Statistic
● 50% of people living in the U.S. agree with the latest health care proposal. Researchers can’t ask
hundreds of millions of people if they agree, so they take samples, or part of the population and
calculate the rest.
What Are The Differences Between Population Parameters and Sample Statistics?

The average weight of adult men in the U.S. is a parameter with an exact value – but, we don’t know it.
Standard deviation and population mean are two common parameters.

A statistic is a characteristic of a group of population, or sample. You get sample statistics when you collect
a sample and calculate the standard deviation and the mean. You can use sample statistics to make certain
conclusions about an entire population thanks to inferential statistics. But, you need particular sampling
techniques to draw valid conclusions. Using these techniques ensures that samples deliver unbiased
estimates – correct on average. Biased estimates, by contrast, are systematically too low or too
high, so you don’t want them.

To estimate population parameters in inferential statistics, you use sample statistics. For instance, if you
collect a random sample of female teenagers in the U.S. and measure their weights, you can calculate the
sample mean. You can use the sample mean as an unbiased estimate of the population mean.
Introduction of Descriptive Statistics
Descriptive statistics is the term given to the analysis of data that helps describe, show or summarize data in
a meaningful way such that, for example, patterns might emerge from the data. Descriptive statistics do not,
however, allow us to make conclusions beyond the data we have analysed or reach conclusions regarding
any hypotheses we might have made. They are simply a way to describe our data.

Descriptive statistics are very important because if we simply presented our raw data it would be hard to
visualize what the data was showing, especially if there was a lot of it. Descriptive statistics therefore
enables us to present the data in a more meaningful way, which allows simpler interpretation of the data.
For example, if we had the results of 100 pieces of students' coursework, we may be interested in the overall
performance of those students. We would also be interested in the distribution or spread of the marks.
Descriptive statistics allow us to do this. How to properly describe data through statistics and graphs is an
important topic and discussed in other Laerd Statistics guides. Typically, there are two general types of
statistic that are used to describe data
Introduction of Descriptive Statistics
● Measures of central tendency: these are ways of describing the central position of a frequency
distribution for a group of data. In this case, the frequency distribution is simply the distribution and
pattern of marks scored by the 100 students from the lowest to the highest. We can describe this
central position using a number of statistics, including the mode, median, and mean.

● Measures of spread: these are ways of summarizing a group of data by describing how spread out
the scores are. For example, the mean score of our 100 students may be 65 out of 100. However, not
all students will have scored 65 marks. Rather, their scores will be spread out. Some will be lower
and others higher. Measures of spread help us to summarize how spread out these scores are. To
describe this spread, a number of statistics are available to us, including the range, quartiles,
absolute deviation, variance and standard deviation.

● When we use descriptive statistics it is useful to summarize our group of data using a combination of
tabulated description (i.e., tables), graphical description (i.e., graphs and charts) and statistical
commentary (i.e., a discussion of the results).
Measures of Central Tendency
A measure of central tendency (also referred to as measures of centre or central location) is a summary
measure that attempts to describe a whole set of data with a single value that represents the middle or
centre of its distribution.

There are three main measures of central tendency:


● Mode
● Median
● Mean

Each of these measures describes a different indication of the typical or central value in the distribution.
Mode
The mode is the most frequent occurring value in a distribution.

Consider this dataset showing the retirement age of 11 people, in whole years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

This table shows a simple frequency distribution of the retirement age data.

Total Frequency: 11

The most commonly occurring value is 54, therefore the mode of this distribution is 54 years.
Advantage of the mode:
The mode has an advantage over the median and the mean as it can be found for both numerical and
categorical (non-numerical) data.
Limitations of the mode:
There are some limitations to using the mode. In some distributions, the mode may not reflect the centre of
the distribution very well. When the distribution of retirement age is ordered from lowest to highest value,
it is easy to see that the centre of the distribution is 57 years, but the mode is lower, at 54 years.

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

It is also possible for there to be more than one mode for the same distribution of data, (bi-modal, or multi-
modal). The presence of more than one mode can limit the ability of the mode in describing the centre or
typical value of the distribution because a single value to describe the centre cannot be identified.

In some cases, particularly where the data are continuous, the distribution may have no mode at all (i.e. if
all values are different).

In cases such as these, it may be better to consider using the median or mean, or group the data in to
appropriate intervals, and find the modal class.
Median
The median is the middle value in distribution when the values are arranged in ascending or descending
order.

The median divides the distribution in half (there are 50% of observations on either side of the median
value). In a distribution with an odd number of observations, the median value is the middle value.
Median
Looking at the retirement age distribution (which has 11 observations), the median is the middle value,
which is 57 years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

When the distribution has an even number of observations, the median value is the mean of the two
middle values. In the following distribution, the two middle values are 56 and 57, therefore the median
equals 56.5 years:

52, 54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60
Advantage of the median:
The median is less affected by outliers and skewed data than the mean, and is usually the preferred
measure of central tendency when the distribution is not symmetrical.

Limitations of the median:


The median cannot be identified for categorical nominal data, as it cannot be logically ordered.
Mean
The mean is the sum of the value of each observation in a dataset divided by the number of
observations. This is also known as the arithmetic average.

Looking at the retirement age distribution again:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

The mean is calculated by adding together all the values (54+54+54+55+56+57+57+58+58+60+60 =
623) and dividing by the number of observations (11), which equals 56.6 years.
Advantage of the mean:
The mean can be used for both continuous and discrete numeric data.

Limitations of the mean:


The mean cannot be calculated for categorical data, as the values cannot be summed.

As the mean includes every value in the distribution the mean is influenced by outliers and skewed
distributions.
What else do I need to know about the mean?
The population mean is indicated by the Greek symbol µ (pronounced ‘mu’). When the mean is
calculated on a distribution from a sample it is indicated by the symbol x̅ (pronounced X-bar).
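A quick check of the three measures of central tendency in Python, reusing the retirement-age data from
the example above; a small sketch with the built-in statistics module.

import statistics

ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]   # retirement ages from the example

print(statistics.mode(ages))            # 54   -> most frequent value
print(statistics.median(ages))          # 57   -> middle value
print(round(statistics.mean(ages), 1))  # 56.6 -> 623 / 11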
Measures of Spread
Measures of spread describe how similar or varied the set of observed values are for a particular variable
(data item). Measures of spread include the range, quartiles and the interquartile range, variance and
standard deviation.

There are three main measures of spread:


● Range
● Quartiles
● Standard Deviation
● Variance
When can we measure spread?
The spread of the values can be measured for quantitative data, as the
variables are numeric and can be arranged into a logical order with a low
end value and a high end value.

Why do we measure spread?


Summarising the dataset can help us understand the data, especially when
the dataset is large. As discussed in the Measures of Central Tendency, the
mode, median, and mean summarise the data into a single value that is
typical or representative of all the values in the dataset, but this is only part
of the 'picture' that summarises a dataset. Measures of spread summarise
the data in a way that shows how scattered the values are and how much
they differ from the average value.
Example
Range
The range is the difference between the smallest value and the largest
value in a dataset.
Example
Quartiles
Quartiles divide an ordered dataset into four equal parts, and refer to the values of the point
between the quarters. A dataset may also be divided into quintiles (five equal parts) or deciles (ten
equal parts).

The lower quartile (Q1) is the point between the lowest 25% of values and the highest 75% of values. It is also
called the 25th percentile.

The second quartile (Q2) is the middle of the data set. It is also called the 50th percentile, or the median.

The upper quartile (Q3) is the point between the lowest 75% and highest 25% of values. It is also called the 75th
percentile.
Example
Interquartile Range(IQR)
The interquartile range (IQR) is the difference between the upper (Q3) and lower (Q1) quartiles,
and describes the middle 50% of values when ordered from lowest to highest. The IQR is often
seen as a better measure of spread than the range as it is not affected by outliers.
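A short sketch computing the range, quartiles, IQR, variance and standard deviation with NumPy, reusing
the retirement-age data as a hypothetical example.

import numpy as np

ages = np.array([54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60])

data_range = ages.max() - ages.min()            # range
q1, q2, q3 = np.percentile(ages, [25, 50, 75])  # quartiles (q2 is the median)
iqr = q3 - q1                                   # interquartile range

print(data_range, q1, q2, q3, iqr)
print(ages.var(ddof=1), ages.std(ddof=1))       # sample variance and standard deviation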
Introduction of Probability
How likely something is to happen.

Many events can't be predicted with total certainty. The best we can say is how likely they are to happen,
using the idea of probability.

Tossing a Coin
When a coin is tossed, there are two possible outcomes:

● Heads(H) or
● Tails(T)

We say that the probability of the coin landing H is ½

And the probability of the coin landing T is ½


Introduction of Probability
Throwing Dice
When a single die is thrown, there are six possible outcomes: 1, 2, 3, 4, 5, 6.
The probability of any one of them is 1/6.
Probability
In general:
Example
● The chance of rolling a "4" with a die is 1/6, since there are six equally likely outcomes.

● There are 5 marbles in a bag: 4 are blue, and 1 is red. The probability that a blue
marble gets picked is 4/5 = 0.8.
Probability Tree
The tree diagram helps to organize and visualize the different possible outcomes. Branches and ends of the
tree are two main positions. Probability of each branch is written on the branch, whereas the ends are
containing the final outcome. Tree diagram is used to figure out when to multiply and when to add. You can
see below a tree diagram for the coin:
Types of Probability
There are two major types of probabilities:
● Theoretical Probability
● Experimental Probability
Theoretical Probability
Theoretical Probability is what is expected to happen based on
mathematics.

Example:
A coin is tossed. The theoretical probability of getting a head is 1/2.
Experimental Probability
Experimental Probability is found by repeating an experiment and observing
the outcomes.

Example:
A coin is tossed 10 times: a head is recorded 7 times and a tail 3 times. The experimental probability of
getting a head is 7/10 = 0.7.
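A small sketch contrasting theoretical and experimental probability for a coin toss; the number of tosses is
arbitrary, and the experimental value will vary from run to run.

import random

theoretical_p_head = 1 / 2                               # two equally likely outcomes

tosses = [random.choice(["H", "T"]) for _ in range(10)]  # simulate 10 tosses
experimental_p_head = tosses.count("H") / len(tosses)

print(theoretical_p_head, experimental_p_head)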
Probability Distribution
A probability distribution is a statistical function that describes all the possible values and likelihoods
that a random variable can take within a given range. This range will be bounded between the
minimum and maximum possible values, but precisely where the possible value is likely to be plotted
on the probability distribution depends on a number of factors. These factors include the
distribution's mean (average), standard deviation, skewness.

Types of Distributions:

1. Bernoulli Distribution
2. Uniform Distribution
3. Binomial Distribution
4. Normal Distribution
5. Poisson Distribution
6. Exponential Distribution
Bernoulli Distribution
All you cricket junkies out there! At the beginning of any cricket match, how do you decide who is
going to bat or bowl? A toss! It all depends on whether you win or lose the toss, right? Let’s say if the
toss results in a head, you win. Else, you lose. There’s no midway.

A Bernoulli distribution has only two possible outcomes, namely 1 (success) and 0 (failure), and a
single trial. So the random variable X which has a Bernoulli distribution can take value 1 with the
probability of success, say p, and the value 0 with the probability of failure, say q or 1-p.

the occurrence of a head denotes success, and the occurrence of a tail denotes failure.

Probability of getting a head = 0.5 = Probability of getting a tail since there are only two possible
outcomes.
Bernoulli Distribution
The probability mass function is given by: p^x * (1-p)^(1-x), where x ∈ {0, 1}.
It can also be written as

The probabilities of success and failure need not be equally likely, like the result of a fight between
me and Undertaker. He is pretty much certain to win. So in this case probability of my success is 0.15
while my failure is 0.85
Bernoulli Distribution
Here, the probability of success(p) is not same as the probability of failure.
So, the chart below shows the Bernoulli Distribution of our fight.
Bernoulli Distribution
Here, the probability of success = 0.15 and probability of failure = 0.85. The expected value is exactly
what it sounds. If I punch you, I may expect you to punch me back. Basically expected value of any
distribution is the mean of the distribution. The expected value of a random variable X from a
Bernoulli distribution is found as follows:

E(X) = 1*p + 0*(1-p) = p


The variance of a random variable from a bernoulli distribution is:

V(X) = E(X²) – [E(X)]² = p – p² = p(1-p)


There are many examples of Bernoulli distribution such as whether it’s going to rain tomorrow or not
where rain denotes success and no rain denotes failure and Winning (success) or losing (failure) the
game.
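A minimal sketch of the fight example with scipy.stats.bernoulli; p = 0.15 is taken from the slide above.

from scipy.stats import bernoulli

p = 0.15                       # probability of success from the example

print(bernoulli.pmf(1, p))     # P(success) = 0.15
print(bernoulli.pmf(0, p))     # P(failure) = 0.85
print(bernoulli.mean(p))       # E(X) = p
print(bernoulli.var(p))        # Var(X) = p * (1 - p) = 0.1275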
Binomial Distribution
Suppose that you won the toss today and this indicates a successful event. You toss again but you
lost this time. If you win a toss today, this does not necessitate that you will win the toss tomorrow.
Let’s assign a random variable, say X, to the number of times you won the toss. What can be the
possible value of X? It can be any number depending on the number of times you tossed a coin.

There are only two possible outcomes. Head denoting success and tail denoting failure. Therefore,
probability of getting a head = 0.5 and the probability of failure can be easily computed as: q = 1- p =
0.5.

A distribution where only two outcomes are possible, such as success or failure, gain or loss, win or
lose and where the probability of success and failure is same for all the trials is called a Binomial
Distribution.
The outcomes need not be equally likely. Remember the example of a fight between me and
Undertaker? So, if the probability of success in an experiment is 0.2 then the probability of failure
can be easily computed as q = 1 – 0.2 = 0.8.
Binomial Distribution
Each trial is independent since the outcome of the previous toss doesn’t determine or affect the
outcome of the current toss. An experiment with only two possible outcomes repeated n number of
times is called binomial. The parameters of a binomial distribution are n and p where n is the total
number of trials and p is the probability of success in each trial.

On the basis of the above explanation, the properties of a Binomial Distribution are

1. Each trial is independent.


2. There are only two possible outcomes in a trial- either a success or a failure.
3. A total number of n identical trials are conducted.
4. The probability of success and failure is same for all trials. (Trials are identical.)
Binomial Distribution
The mathematical representation of the binomial distribution is given
by: P(X = x) = nCx * p^x * (1-p)^(n-x), for x = 0, 1, ..., n
Binomial Distribution
A binomial distribution graph where the probability of success does not equal the probability of
failure looks like
Binomial Distribution
Now, when probability of success = probability of failure, in such a situation the graph of binomial
distribution looks like
Binomial Distribution
The mean and variance of a binomial distribution are given by:

Mean -> µ = n*p

Variance -> Var(X) = n*p*q


Geometric Distribution
The geometric distribution represents the number of Bernoulli trials needed to get the first
success (equivalently, the number of failures before the first success, plus that successful trial). This
discrete probability distribution is represented by the probability mass function:

f(x) = p * (1 − p)^(x − 1)

For example, you ask people outside a polling station who they voted for
until you find someone that voted for the independent candidate in a local
election. The geometric distribution would represent the number of people
who you had to poll before you found someone who voted independent. You
would need to get a certain number of failures before you got your first
Geometric Distribution
If you had to ask 3 people, then X=3; if you had to ask 4 people, then X=4
and so on. In other words, there would be X-1 failures before you get your
success.
If X=n, it means you succeeded on the nth try and failed for n-1 tries. The
probability of failing on your first try is 1-p. For example, if p = 0.2 then your
probability of success is .2 and your probability of failure is 1 – 0.2 = 0.8.
Independence (i.e. that the outcome of one trial does not affect the next)
means that you can multiply the probabilities together. So the probability of
failing on your second try is (1-p)(1-p) and your probability of failing on the
first n-1 tries is (1-p)^(n-1). If you succeeded on your 4th try, n = 4, n – 1 = 3, so
the probability of failing up to that point is (1-p)(1-p)(1-p) = (1-p)^3.
Geometric Distribution
Example:-
If your probability of success is 0.2, what is the probability you meet an independent voter on your
third try?
Inserting 0.2 as p and with X = 3, the probability mass function becomes:

f(x) = p * (1 − p)^(x − 1)

P(X=3) = 0.2 * (1 − 0.2)^(3 − 1)

P(X=3) = 0.2 * (0.8)^2 = 0.128
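The same calculation can be checked with scipy.stats.geom (a sketch; scipy's geom counts the trial on
which the first success occurs, which matches the formula used above).

from scipy.stats import geom

p = 0.2
print(geom.pmf(3, p))          # (1 - p)**2 * p = 0.128, matches the hand calculation
print(geom.mean(p))            # expected number of trials until first success = 1 / p = 5.0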


Geometric Distribution

Theoretically, there are an infinite number of geometric distributions. The


value of any specific distribution depends on the value of the probability p.
Assumptions for the Geometric Distribution
The three assumptions are:

● There are two possible outcomes for each trial (success or failure).
● The trials are independent.
● The probability of success is the same for each trial.
Uniform Distribution
When you roll a fair die, the outcomes are 1 to 6. The probabilities of getting these outcomes are
equally likely and that is the basis of a uniform distribution. Unlike Bernoulli Distribution, all the n
number of possible outcomes of a uniform distribution are equally likely.

A variable X is said to be uniformly distributed if the density function is:

f(x) = 1 / (b − a), for a ≤ x ≤ b
The graph of a uniform distribution curve looks like


Uniform Distribution
You can see that the shape of the Uniform distribution curve is rectangular,
the reason why Uniform distribution is called rectangular distribution.

For a Uniform Distribution, a and b are the parameters.


Uniform Distribution
The number of bouquets sold daily at a flower shop is uniformly distributed with a maximum of 40
and a minimum of 10.
Let’s try calculating the probability that the daily sales will fall between 15 and 30.
The probability that daily sales will fall between 15 and 30 is (30-15)*(1/(40-10)) = 0.5
Similarly, the probability that daily sales are greater than 20 is (40-20)*(1/(40-10)) = 0.667
The mean and variance of X following a uniform distribution is:
Mean -> E(X) = (a+b)/2
Variance -> V(X) = (b-a)²/12
The standard uniform density has parameters a = 0 and b = 1, so the PDF for the standard
uniform density is given by: f(x) = 1, for 0 ≤ x ≤ 1.
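A sketch of the bouquet example with scipy.stats.uniform; note that scipy parameterises the distribution
with loc = a and scale = b − a, and the values below are the ones from the example.

from scipy.stats import uniform

a, b = 10, 40                         # daily sales uniformly distributed on [10, 40]
dist = uniform(loc=a, scale=b - a)

print(dist.cdf(30) - dist.cdf(15))    # P(15 < X < 30) = 0.5
print(1 - dist.cdf(20))               # P(X > 20) ≈ 0.667
print(dist.mean(), dist.var())        # (a + b) / 2 = 25.0, (b - a)**2 / 12 = 75.0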
Exponential Distribution
Let’s consider the call center example one more time. What about the interval of time between the
calls ? Here, exponential distribution comes to our rescue. Exponential distribution models the
interval of time between the calls.

Other examples are:

1. Length of time between metro arrivals,

2. Length of time between arrivals at a gas station

3. The life of an Air Conditioner

Exponential distribution is widely used for survival analysis. From the expected life of a machine to
the expected life of a human, exponential distribution successfully delivers the result.
Exponential Distribution
A random variable X is said to have an exponential distribution with PDF:

f(x) = λe^(−λx), for x ≥ 0 (and 0 otherwise),

and parameter λ > 0, which is also called the rate.

For survival analysis, λ is called the failure rate of a device at any time t, given that it has survived up
to t.

Mean and Variance of a random variable X following an exponential distribution:

Mean -> E(X) = 1/λ

Variance -> Var(X) = (1/λ)²


Exponential Distribution
Also, the greater the rate, the faster the curve drops and the lower the rate, flatter the curve. This is
explained better with the graph shown below.
Exponential Distribution
To ease the computation, there are some formulas given below.

P{X ≤ x} = 1 – e^(−λx), corresponds to the area under the density curve to the left of x.

P{X > x} = e^(−λx), corresponds to the area under the density curve to the right of x.

P{x1 < X ≤ x2} = e^(−λx1) – e^(−λx2), corresponds to the area under the density curve between x1 and x2.
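A sketch of these formulas with scipy.stats.expon; the rate λ = 0.5 is an assumed value (for example, 0.5
calls per minute), and scipy parameterises the distribution with scale = 1/λ.

from scipy.stats import expon

lam = 0.5                          # assumed rate
dist = expon(scale=1 / lam)

print(dist.cdf(2))                 # P(X <= 2) = 1 - e**(-lam * 2)
print(dist.sf(2))                  # P(X > 2)  = e**(-lam * 2)
print(dist.mean(), dist.var())     # 1 / lam = 2.0, (1 / lam)**2 = 4.0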
Normal Distribution
Normal distribution represents the behavior of most of the situations in the universe (That is why it’s
called a “normal” distribution. I guess!). The large sum of (small) random variables often turns out to
be normally distributed, contributing to its widespread application. Any distribution is known as
Normal distribution if it has the following characteristics:

1. The mean, median and mode of the distribution coincide.


2. The curve of the distribution is bell-shaped and symmetrical about the line x=μ.
3. The total area under the curve is 1.
4. Exactly half of the values are to the left of the center and the other half to the right.

A normal distribution is highly different from Binomial Distribution. However, if the number of trials
approaches infinity then the shapes will be quite similar.
Normal Distribution
The PDF of a random variable X following a normal distribution is given by:

f(x) = (1 / (σ√(2π))) * e^(−(x − µ)² / (2σ²))

The mean and variance of a random variable X which is said to be normally distributed is given by:

Mean -> E(X) = µ

Variance -> Var(X) = σ^2


Normal Distribution
Here, µ (mean) and σ (standard deviation) are the parameters.

The graph of a random variable X ~ N (µ, σ) is shown below.


Normal Distribution
A standard normal distribution is defined as the distribution with mean 0 and standard deviation 1.
For such a case, the PDF becomes: f(x) = (1 / √(2π)) * e^(−x² / 2)
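A short sketch of the standard normal distribution with scipy.stats.norm.

from scipy.stats import norm

mu, sigma = 0, 1                                                # standard normal

print(norm.cdf(0, mu, sigma))                                   # 0.5 -> half the area lies left of the mean
print(norm.cdf(1.96, mu, sigma) - norm.cdf(-1.96, mu, sigma))   # ~0.95 of the area within ±1.96σ
print(norm.mean(mu, sigma), norm.var(mu, sigma))                # 0.0, 1.0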
Introduction of Inferential Statistics
It is about using data from sample and then making inferences about the larger population from which
the sample is drawn. The goal of the inferential statistics is to draw conclusions from a sample and
generalize them to the population. It determines the probability of the characteristics of the sample
using probability theory. The most common methodologies used are hypothesis tests, Analysis of
variance etc.

For example: Suppose we are interested in the exam marks of all the students in India. But it is not
feasible to measure the exam marks of all the students in India. So now we will measure the marks of a
smaller sample of students, for example 1000 students. This sample will now represent the large
population of Indian students. We would consider this sample for our statistical study for studying the
population from which it’s deduced.
Hypothesis testing
We evaluate two mutually exclusive statements about a population using sample
data.

Null Hypothesis and Alternative Hypothesis

Steps:

1. Make initial Assumption


2. Collecting data
3. Gather evidence to reject or fail to reject the null hypothesis.
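A minimal sketch of these steps using a one-sample t-test from scipy; the marks and the hypothesised
population mean of 70 are made-up values, and the 0.05 significance level is an assumed choice.

from scipy import stats

marks = [68, 72, 75, 66, 71, 74, 69, 73, 70, 67]      # sample data (hypothetical)

# Null hypothesis: the population mean of marks is 70
t_stat, p_value = stats.ttest_1samp(marks, popmean=70)

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")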
Practical :
Hypothesis Testing : click_here

Central Limit Theorem : Click_here


Data Analytics and Machine
Learning
Modules
● Introduction to Data Analytics
● Intro to types of Algorithms
● Regression
● Classification
● Dimensionality Reduction Techniques
● Clustering
● Model Selection and Hyperparameter Tuning
● Ensemble Learning
● Associate Rules Mining and Recommendations (with case study)
● Time Series Forecasting
● Reinforcement Learning
Introduction to Data Analytics
1. What is Data Analytics?
2. Importance of Data Analytics
3. Types of Data Analytics
What is Data Analytics?

A domain focussed on gaining meaningful and valuable insights from the available data in order to
make predictions.
Importance of Data Analytics

● Helps businesses optimize their performances.


● Analysis is done to study purchase patterns.
● This can also help in improving managerial operations
and leverage organisations to next level.
● Data and information are increasing rapidly.
Types of Data Analytics
Descriptive analytics describes what has happened over a
given period of time. Have the number of views gone up?

Diagnostic analytics focuses more on why something happened. Did the weather affect beer
sales? Did that latest marketing campaign impact sales?

Predictive analytics moves to what is likely going to


happen in the near term. What happened to sales the last
time we had a hot summer?

Prescriptive analytics suggests a course of action.


Descriptive analytics
● Descriptive analytics is the interpretation of historical data to draw
better comparisons.
● Takes raw data and parses that data to draw conclusions.
● Return on Invested Capital (ROIC) is a descriptive analytic
created by taking three data points:

net income, dividends, and total capital, and turning those data
points into an easy-to-understand percentage that can be used to
compare one company’s performance to others.
Diagnostic analytics
Diagnostic analytics is a form of advanced analytics that examines
data or content to answer the question, “Why did it happen?”

It is characterized by techniques such as data discovery, data


mining and correlations.
Predictive analytics
Predictive analytics is the use of data, statistical algorithms and
machine learning techniques to identify the likelihood of future
outcomes based on historical data.

Prediction

Forecasting, etc
Prescriptive analytics
● Prescriptive analytics makes use of machine learning to help
businesses decide a course of action based on a computer
program’s predictions.

● Prescriptive analytics works with predictive analytics, which


uses data to determine near-term outcomes
Intro to types of Algorithms
● Supervised
● UnSupervised
● Reinforcement Learning
● Applications
Supervised Machine Learning:
Supervised learning is when the model is getting trained on
a labelled dataset.
Un-Supervised Machine Learning:
Un-Supervised learning is when the model is getting
trained on a Un-labelled dataset.
Reinforcement Machine Learning:

The agent receives a positive reward for a good action and a negative reward for a bad action, given a state
in the environment; the agent then tries to maximize the total reward points.
Machine Learning Regression
Regression Algorithms

● Linear Regression
● Ridge Regression
● Lasso Regression
● Polynomial Regression
Linear Regression
Simple Linear Regression: y = b0 + b1*x

y = dependent variable
x = independent variable

b0 = constant
b1 = determines how a unit change in x will make a unit
change in y.

It is also known as the slope of the line which


determines to which extent the change will inflate or
deflate.
◦ Simple Linear Regression is
used with data having only one
feature and one label.

◦ SLR is more suited for data


visualization as there are only
two axes to plot the variables.
train - test Split

>>>from sklearn.model_selection import train_test_split


x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2)

● Split arrays or matrices into random train and test


subsets.
● “test_size” should be between 0.0 and 1.0 and represent
the proportion of the dataset to include in the test split.
Training Linear Regression
>>> from sklearn.linear_model import LinearRegression

>>>model=LinearRegression()

>>>model.fit(x_train, y_train)
Mean Square Error:
Measures the average of the squares of the errors

● The error is the difference between the actual value and the predicted value.
● Predicted Value = mx + b, where “m” is the slope and “b” is the intercept
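A quick sketch with sklearn's mean_squared_error; the actual and predicted values below are made up
for illustration.

from sklearn.metrics import mean_squared_error

y_actual    = [3.0, 5.0, 7.5]
y_predicted = [2.5, 5.0, 8.0]

mse = mean_squared_error(y_actual, y_predicted)   # average of the squared errors
print(mse)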
r2_Score
>>> from sklearn.metrics import r2_score

r2_score=1 means:
● prediction = actual value
Simple Linear Regression
Practical
Practical link : Click_here
Gradient Descent
Gradient Descent is an optimization algorithm that finds a local minimum of a
loss/error function.
● i - ith Sample
● ŷ - Predicted Value
● y - Actual Value

Gradients

predicted value = mx + b
Learning Rate (alpha)
Learning rate controls how quickly or slowly a model learns a problem.
Updating Parameters

> m = m - alpha*gradient_m
> b = b - alpha*gradient_b
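A minimal sketch of these update rules for a simple line y = m*x + b with mean squared error; the data,
learning rate and iteration count are arbitrary choices made for illustration.

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)        # generated from y = 2x + 1

m, b, alpha = 0.0, 0.0, 0.01                       # initial parameters and learning rate

for _ in range(2000):
    y_pred = m * x + b
    error = y_pred - y
    gradient_m = (2 / len(x)) * np.sum(error * x)  # d(MSE)/dm
    gradient_b = (2 / len(x)) * np.sum(error)      # d(MSE)/db
    m = m - alpha * gradient_m                     # update rule from the slide
    b = b - alpha * gradient_b

print(m, b)                                        # approaches 2 and 1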
Practical For Gradient
Descent
Practical link : click_here
Multiple Linear Regression:
y = b0 + b1*x1 + b2*x2 + … + bn*xn

y = dependent variable

b0 = constant

b1, b2 … bn = coefficients

x1, x2, … xn = independent variables


Multiple Linear Regression
● Uses several explanatory(predictor) variables to predict the outcome of a
response variable.

Necessary Assumption
◦ Linearity

◦ Homoscedasticity

◦ Multivariate Normality

◦ Independence of Error

◦ Lack of Multicollinearity
Practical for Multiple Linear
Regression
Practical link : Click _here
Ridge Regression
Ridge Regression is a technique for analyzing multiple regression data that
suffer from multicollinearity. It is also used when we have a highly steep (overfitted) line and
we want to reduce it using a regularization technique.

It tries to increase bias a little in order to lower the variance.

Linear least squares with l2 regularization

Minimizes the objective function:

||y - Xw||^2 + alpha * ||w||^2


● from sklearn.linear_model import Ridge
● Model = Ridge(alpha=1.0)
Ridge Regression Practical

Practical link : click_here


Lasso Regression
“LASSO” stands for Least Absolute Shrinkage and Selection Operator.

Lasso regression is a type of linear regression that uses shrinkage.

It does the same job as Ridge Regression, but here it is also used for feature selection, because
its coefficients can shrink to exactly zero, whereas Ridge moves them toward zero but never reaches it.

This particular type of regression is well-suited for models showing high levels of
multi-collinearity.

● from sklearn.linear_model import Lasso


● Model = Lasso(alpha=1.0)
Practical Lasso Regression
||y - Xw||^2 + alpha * ||w||

Practical link :click_here


Polynomial Linear Regression

In Linear regression

◦ It is not necessary to have data which can be plotted linearly; sometimes data can be in a
nonlinear form as well, like a parabola as shown in the figure.

◦ In polynomial regression, we have a variable with different powers.

Polynomial Regression Practical
Classification Algorithms

● Logistic Regression
● k-Nearest Neighbors
● Decision Tree Classifier
● Naive bayes Classifier
● SVM Classifier
Logistic Regression

The logistic regression is a predictive analysis.

Logistic Regression, also known as Logit Regression or Logit


Model, is a mathematical model used in statistics to estimate
(guess) the probability of an event occurring having been given
some previous data.
Sigmoid Function
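The sigmoid (logistic) function is σ(z) = 1 / (1 + e^(−z)); a small sketch of it in Python is shown below.

import numpy as np

def sigmoid(z):
    # squashes any real value into the (0, 1) range, interpreted as a probability
    return 1 / (1 + np.exp(-z))

print(sigmoid(0))     # 0.5
print(sigmoid(4))     # close to 1
print(sigmoid(-4))    # close to 0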
Practical For Logistic Regression

Practical link : click_here


K-Nearest Neighbour
● It’s a Classification technique.
How does it work? ● You intend to find out the class of the blue
star.
● The “K” in the KNN algorithm is the number of nearest
neighbors we wish to take the vote from.
Let’s say K = 3.

● Hence, we will now make a circle with BS


as the center just as big as to enclose only
three data points on the plane.

● The three closest points to BS are all RC.


Hence, with a good confidence level, we
can say that the BS should belong to the
class RC.
How to apply?
from sklearn.neighbors import
KNeighborsClassifier

clf=KNeighborsClassifier(n_neighbors=3)
Now you can simply fit this Classifier Algorithm, just like a normal
classifier, using clf.fit() method.
Practical For KNN Classifier

Practical : click_here
Decision Tree
Important Terminology related to Decision
Trees
1. Root Node: It represents the entire population or sample and
this further gets divided into two or more homogeneous sets.

2. Splitting: It is a process of dividing a node into two or more


sub-nodes.
3. Decision Node: When a sub-node splits into further sub-nodes,
then it is called the decision node.

4. Leaf / Terminal Node: Nodes that do not split are called Leaf or


Terminal node.

5. Pruning: When we remove sub-nodes of a decision node, this


process is called pruning. You can say the opposite process of
splitting.
6. Branch / Sub-Tree: A subsection of the entire tree is called
branch or sub-tree.

7. Parent and Child Node: A node, which is divided into sub-nodes


is called a parent node of sub-nodes whereas sub-nodes are the
child of a parent node.
Solving this attribute selection problem

1. Gini Index
2. Entropy
3. Information Gain
Gini Index

Steps to Calculate Gini index for a split


1. Calculate Gini for sub-nodes, using the above formula
for success(p) and failure(q)
2. Calculate the Gini index for split using the weighted
Gini score of each node of that split.

CART (Classification and Regression Tree) uses the Gini


index method to create split points.
What is Entropy and Information Gain?

Entropy:
Entropy is a measure of the randomness in the information being
processed. The higher the entropy, the harder it is to draw any
conclusions from that information.

Where p(x) = fraction of examples in a class.


Flipping a coin is an example of an action that provides information that is random.
Information Gain:

Where IG = Information gain, and WA = Weighted Average.


Decision Tree - Classification
Practical : click_here
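A minimal sketch of training a decision tree classifier on made-up data; the criterion parameter switches between the Gini index and entropy discussed above.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

X = np.array([[25, 40000], [35, 60000], [45, 80000],
              [20, 20000], [52, 110000], [23, 25000]])   # e.g. age, income
y = np.array([0, 1, 1, 0, 1, 0])

clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X, y)

print(clf.predict([[30, 50000]]))
print(clf.feature_importances_)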
Algorithm

The core algorithm for building decision trees called ID3 by J. R. Quinlan
which employs a top-down, greedy search through the space of possible
branches with no backtracking.

ID3 uses Entropy and Information Gain to construct a decision tree.


Entropy
A decision tree is built top-down from a root node and involves partitioning
the data into subsets that contain instances with similar values
(homogenous). ID3 algorithm uses entropy to calculate the homogeneity of a
sample. If the sample is completely homogeneous the entropy is zero and if
the sample is equally divided it has an entropy of one.
Entropy
Entropy is a measure of randomness. In other words, it's a measure of unpredictability.

We will take a moment here to give entropy in case of binary event(like the coin toss,

where output can be either of the two events, head or tail) a mathematical face:

Entropy = -(probability(a) * log2(probability(a))) - (probability(b) * log2(probability(b)))
To build a decision tree, we need to calculate two types of entropy using
frequency tables as follows:

a) Entropy using the frequency table of one attribute:


b) Entropy using the frequency table of two attributes:
Information Gain
The information gain is based on the decrease in entropy after a dataset is
split on an attribute. Constructing a decision tree is all about finding
attribute that returns the highest information gain (i.e., the most
homogeneous branches).

Step 1: Calculate entropy of the target.


Step 2: The dataset is then split on the different attributes. The entropy for
each branch is calculated. Then it is added proportionally, to get total entropy
for the split. The resulting entropy is subtracted from the entropy before the
split. The result is the Information Gain, or decrease in entropy.
Step 3: Choose attribute with the largest information gain as the decision
node, divide the dataset by its branches and repeat the same process on
every branch.
Step 4a: A branch with entropy of 0 is a leaf node.
Step 4b: A branch with entropy more than 0 needs further splitting.

Step 5: The ID3 algorithm is run recursively on the non-leaf branches, until all data is
classified.
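A minimal numpy sketch of the entropy and information-gain calculation described above, on a made-up binary target split into two branches.

import numpy as np

def entropy(labels):
    # p(x) = fraction of examples in each class
    values, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

target = np.array([1, 1, 1, 0, 0, 1, 0, 1, 1, 0])   # the node before the split
left   = np.array([1, 1, 1, 1, 1])                  # branch 1 after the split
right  = np.array([0, 0, 0, 1, 0])                  # branch 2 after the split

weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(target)
info_gain = entropy(target) - weighted

print("entropy before split:", entropy(target))
print("information gain of the split:", info_gain)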
Naive Bayes Classifier
Naive Bayes Classifier
The Naive Bayesian classifier is based on Bayes’ theorem with the
independence assumptions between predictors.

Naive Bayesian model is easy to build, with no complication.

● P(c|x) is the posterior probability of


class (target) given predictor
(attribute).
● P(c) is the prior probability of class.
● P(x|c) is the likelihood which is the
probability of predictor given class.
● P(x) is the prior probability of
predictor.
● The posterior probability can be calculated by first, constructing
a frequency table for each attribute against the target.
● Then, transforming the frequency tables to likelihood tables
and finally use the Naive Bayesian equation to calculate the
posterior probability for each class.
● The class with the highest posterior probability is the outcome
of prediction.
Example 2:
In this example we have 4 inputs (predictors). The final posterior probabilities can be
standardized between 0 and 1.
Naive Bayes Practical
Practical : click_here
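A minimal sketch of a Gaussian Naive Bayes classifier on made-up data; predict_proba exposes the posterior probability for each class.

import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1.0, 2.1], [1.2, 1.9], [0.9, 2.0],
              [3.0, 3.9], [3.2, 4.1], [2.9, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = GaussianNB()
clf.fit(X, y)

print(clf.predict([[1.1, 2.0]]))         # class with the highest posterior
print(clf.predict_proba([[1.1, 2.0]]))   # posterior probability per class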
Support Vector Machine - Classification (SVM)

Subset of training points in the decision function (called support vectors).

A Support Vector Machine (SVM) performs classification by finding the
hyperplane that maximizes the margin between the two classes. The vectors
(cases) that define the hyperplane are the support vectors.

It is used for both classification or regression challenges.


we plot each data item as a point in n-dimensional space (where n is number of
features you have) with the value of each feature being the value of a particular
coordinate.

we perform classification by finding the hyper-plane that differentiates the two


classes very well

The simplest way to separate two groups of data is with a straight line (1 dimension),
flat plane (2 dimensions) or an N-dimensional hyperplane.
However, there are situations where a nonlinear region can separate the groups more
efficiently. SVM handles this by using a kernel function (nonlinear) to map the data into a
different space where a hyperplane (linear) can be used to do the separation.
SVM algorithms use a set of mathematical functions that are defined as the kernel. The function
of kernel is to take data as input and transform it into the required form. Different SVM
algorithms use different types of kernel functions. These functions can be different types. For
example linear, nonlinear, polynomial, radial basis function (RBF), and sigmoid.

Identify the right hyper-plane


z=x^2+y^2

● All values for z would be positive always


because z is the squared sum of both x and y
● In the original plot, red circles appear close to
the origin of x and y axes, leading to lower
value of z and star relatively away from the
origin result to higher value of z.
kernel: We have already discussed about it. Here, we have various options available
with kernel like, “linear”, “rbf”,”poly” and others (default value is “rbf”). Here “rbf” and
“poly” are useful for non-linear hyper-plane. Let’s look at the example, where we’ve used
linear kernel on two feature of iris data set to classify their class.

Support Vector Machine -Practical


Practical : click_here
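A minimal sketch of an SVM classifier on made-up points; the kernel parameter can be "linear", "poly", "rbf", etc., as discussed above.

import numpy as np
from sklearn.svm import SVC

X = np.array([[0.2, 0.1], [0.4, 0.3], [0.3, 0.2],
              [2.0, 2.2], [2.3, 2.1], [1.9, 2.4]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

print(clf.predict([[0.25, 0.2], [2.1, 2.2]]))
print(clf.support_vectors_)    # the training points that define the margin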
Dimensionality
Reduction Techniques
PCA
Principal Component Analysis (PCA) is an unsupervised, linear technique for reducing the
dimensionality of data.

High dimensionality means that the dataset has a large number of features.
Principal Component
The first principal component expresses the most amount of variance.

Each additional component expresses less variance and more noise, so


representing the data with a smaller subset of principal components
preserves the signal and discards the noise.
PCA Limitations
Model performance: PCA can lead to a reduction in model performance on datasets with
no or low feature correlation, or datasets that do not meet the assumptions of linearity.

Classification accuracy: Variance based PCA framework does not consider the

differentiating characteristics of the classes. Also, the information that distinguishes one

class from another might be in the low variance components and may be discarded.
Practical For PCA

Practical link : click_here


Outliers: PCA is also affected by outliers, and normalization of the data needs to be an

essential component of any workflow.

Interpretability: Each principal component is a combination of original features and does

not allow for the individual feature importance to be recognized.
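A minimal PCA sketch on made-up data, including the normalization step that the limitations above call out as essential.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
X[:, 3] = 2 * X[:, 0] + rng.normal(scale=0.1, size=100)   # a correlated feature

X_scaled = StandardScaler().fit_transform(X)   # normalization before PCA
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                  # (100, 2)
print(pca.explained_variance_ratio_)    # variance explained by each component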

LDA (Supervised):

Practical link : Click_here
Clustering

Clustering vs Classification

● Observations (or data points) in a classification task have labels. Each observation is

classified according to some measurements. Classification algorithms try to model the

relationship between measurements (features) on observations and their assigned class.

Then the model predicts the class of new observations.

● Observations (or data points) in clustering do not have labels. We expect the model to find

structures in the dataset so that similar observations can be grouped into clusters. We

basically ask the model to label observations.


Clustering Algorithm

- K-mean Clustering
- Hierarchical
K-mean Clustering
K-means clustering aims to partition data into k clusters in a way that data points in the same
cluster are similar and data points in the different clusters are farther apart.

There are many methods to measure the distance. Euclidean distance (minkowski distance

with p=2) is one of most commonly used distance measurements.

K-means clustering tries to minimize distances within a cluster and maximize the distance

between different clusters. K-means algorithm is not capable of determining the number of

clusters.
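A minimal sketch of k-means on made-up 2-D points; note that the number of clusters has to be supplied, since the algorithm cannot determine it by itself.

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [1.5, 2], [1, 1.5],
              [8, 8], [8.5, 9], [9, 8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)                    # cluster index of each point
print(kmeans.cluster_centers_)   # the centroids found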
Hierarchical Clustering

Hierarchical clustering starts by treating each observation as a separate cluster. Then, it repeatedly executes the

following two steps: (1) Identify the two clusters that are closest together, and (2) Merge the two most similar

clusters. There are two types of hierarchical clustering:

Agglomerative clustering and Divisive

clustering
Hierarchical clustering typically works by sequentially
merging similar clusters, as shown above. This is known as
agglomerative hierarchical clustering. In theory, it can also
be done by initially grouping all the observations into one
cluster, and then successively splitting these clusters. This is
known as Divisive hierarchical clustering. Divisive
clustering is rarely done in practice.
Agglomerative clustering
Agglomerative clustering is kind of a bottom-up
approach. Each data point is assumed to be a separate cluster
at first. Then the similar clusters are iteratively combined.

● Stop after a number of clusters is reached (n_clusters)

● Set a threshold value for linkage (distance_threshold). If the distance
between two clusters is above the threshold, these clusters will not be merged.
The figure above is called dendrogram which is a diagram representing
tree-based approach. In hierarchical clustering, dendrograms are used to
visualize the relationship among clusters.
Math Intuition ( What Is Linkage )
Here’s one way to calculate similarity – Take the distance between the centroids of these
clusters. The points having the least distance are referred to as similar points and we can
merge them. We can refer to this as a distance-based algorithm as well (since we are
calculating the distances between the clusters).

In hierarchical clustering, we have a concept called a proximity matrix. This stores the
distances between each point.

During both the types of hierarchical clustering, the distance between two sub-clusters needs to be computed.
The different types of linkages describe the different approaches to measure the distance between two sub-
clusters of data points.
Perform Hierarchical Clustering
Problem: A teacher wants to separate students according to their marks.
Initialization
Creating a Proximity Matrix

All the Distance Are Calculated By Euclidean Distance Formula


Distance between point 1 and 2:

√(10-7)^2 = √9 = 3

____________________________________________________________
Next, we will look at the smallest distance in the proximity matrix and
merge the points with the smallest distance. We then update the proximity
matrix:

Let’s look at the updated clusters and accordingly update the proximity
matrix:
Updated Proximity Matrix
We will repeat step 2 until only a
single cluster is left

We started with 5 clusters and finally


have a single cluster. This is how
agglomerative hierarchical clustering
works.
How should we Choose the Number of Clusters in Hierarchical Clustering?

The greater the distance of the vertical lines in the dendrogram, the greater the
distance between those clusters.

Now, we can set a threshold distance and draw a horizontal line (generally, we try to
set the threshold in such a way that it cuts the tallest vertical line).

The number of clusters will be the number of vertical lines which are intersected by
the line drawn using the threshold.


Clustering PRACTICAL

Practical link : click_here
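A minimal sketch of agglomerative clustering and a dendrogram for the kind of 1-D marks example above; scipy and matplotlib are assumed here in addition to scikit-learn.

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering

marks = np.array([[10], [7], [28], [20], [35]])   # made-up marks of 5 students

# scikit-learn: cut the merge tree at a chosen number of clusters
model = AgglomerativeClustering(n_clusters=2, linkage="ward")
print(model.fit_predict(marks))

# scipy: build the full merge tree and visualise it as a dendrogram
Z = linkage(marks, method="ward")
dendrogram(Z)
plt.show()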


Cross Validation

1. K-Fold
a. Split dataset into k consecutive folds (without shuffling by
default).
2. LOOCV
a. Provides train/test indices to split data in train/test sets. Each
sample is used once as a test set (singleton) while the remaining
samples form the training set.
K-FOLD: class sklearn.model_selection.KFold(n_splits=5)

LOOCV: class sklearn.model_selection.LeaveOneOut
PRACTICAL

Practical link : click_here
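A minimal sketch of running both k-fold and leave-one-out cross validation with cross_val_score on a made-up classification dataset.

import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.normal(size=(30, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = LogisticRegression()

kfold_scores = cross_val_score(model, X, y, cv=KFold(n_splits=5))
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())

print("5-fold accuracy:", kfold_scores.mean())
print("LOOCV accuracy:", loo_scores.mean())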


BOOTSTRAP
Both cross validation and bootstrapping are resampling methods.
● bootstrap resamples with replacement (and usually produces
new "surrogate" data sets with the same number of cases as the original data set).
Due to the drawing with replacement, a bootstrapped data set may contain
multiple instances of the same original cases, and may completely omit other
original cases.
● This is done by training the model on the sample and evaluating the skill of the
model on those samples not included in the sample. These samples not
included in a given sample are called the out-of-bag samples, or OOB for
short.
# scikit-learn bootstrap
from sklearn.utils import resample
# data sample
data = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
# prepare bootstrap sample (drawn with replacement)
boot = resample(data, replace=True, n_samples=4, random_state=1)
print('Bootstrap Sample: %s' % boot)
# out-of-bag observations (not drawn into the bootstrap sample)
oob = [x for x in data if x not in boot]
print('OOB Sample: %s' % oob)
Grid Search CV
GridSearchCV implements a “fit” and a “score” method.

class
sklearn.model_selection.GridSearchCV(estimator,
param_grid, *, scoring=None, cv=None)

param_grid = [
    {'C': [1, 10, 100, 1000], 'kernel': ['linear']},
    {'C': [1, 10, 100, 1000], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]
Randomized Search CV
RandomizedSearchCV implements a “fit” and a “score”
method.

In contrast to GridSearchCV, not all parameter values are


tried out, but rather a fixed number of parameter settings
is sampled from the specified distributions.

class
sklearn.model_selection.RandomizedSearchCV(
estimator, param_distributions, scoring=None, cv = None)
PRACTICAL
Practical link :click_here
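A minimal sketch of hyperparameter tuning with GridSearchCV, reusing the SVC param_grid idea shown above on a made-up dataset; swapping in RandomizedSearchCV with param_distributions works the same way.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.normal(size=(60, 4))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

param_grid = [
    {'C': [1, 10, 100], 'kernel': ['linear']},
    {'C': [1, 10, 100], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]

search = GridSearchCV(SVC(), param_grid, scoring='accuracy', cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)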
Ensemble Learning
Ensemble learning helps improve machine learning results by combining several models. Ensemble
methods are meta-algorithms that combine several machine learning techniques into one predictive
model in order to decrease variance (bagging) or bias (boosting).

Bagging
■ Random Forest
Boosting
■ AdaBoost
■ XGBoost
Bagging
Bagging (bootstrap aggregating) trains several models in parallel on bootstrap samples of the
training data and combines their predictions by averaging or voting; Random Forest is the
classic bagging example.
Boosting
Adaptive boosting or AdaBoost is one of the simplest boosting algorithms. Usually, decision
trees are used for modelling. Multiple sequential models are created, each correcting the errors
from the last model. AdaBoost assigns weights to the observations which are incorrectly
predicted and the subsequent model works to predict these values correctly.
PRACTICAL
Practical link : click_here
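A minimal sketch of bagging (Random Forest) and boosting (AdaBoost) on a made-up classification dataset.

import numpy as np
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

rng = np.random.RandomState(0)
X = rng.normal(size=(100, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

bagging = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
boosting = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)

print("Random Forest accuracy:", bagging.score(X, y))
print("AdaBoost accuracy:", boosting.score(X, y))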
Time - Series Forecasting
DATA:
TIME SERIES FOR APARTMENT PRICE IN VESU

Time –Series Data:


● Sequence of Data Observed at Regular Time-Interval.
● For ex: Hourly, Daily, Weekly, Monthly, Quarterly, Yearly.
COMPONENTS OF TIME SERIES

1. SEASONAL: REPEAT OVER A SPECIFIC PERIOD.


2. TREND: VARIATIONS THAT MOVE UP OR DOWN
3. CYCLICAL
4. RANDOM VARIATION: UNPREDICTABLE, ALSO CALLED NOISE



STATIONARITY
• STATIONARY PROPERTIES OF TIME SERIES (DON'T CHANGE OVER TIME)
EX: CONSTANT "MEAN" AND "VARIANCE"

STATIONARITY CHECK
- DICKEY-FULLER TEST
- H0 -> DATA IS NOT STATIONARY



Time series forecasting with ARIMA
We are going to apply one of the most commonly used method for time-series forecasting,

known as ARIMA, which stands for Autoregressive Integrated Moving Average. ARIMA

models are denoted with the notation ARIMA(p, d, q). These three parameters
account for the autoregressive (p), trend/differencing (d) and noise/moving-average (q)
parts of the model.


GENERAL TRANSFORMATIONS TO MAKE
DATA STATIONARY:

1.Log
2.Difference
3.Square root, etc



### **Augmented Dickey-Fuller Test**

- This checks whether the data is stationary, an important test before applying
time-series models.

### **Null Hypothesis:**

- The given data is not stationary. Hence, if we consider a 95% confidence interval and the
obtained p-value is <= 0.05, we can reject the null hypothesis, which signifies that our data
is "Stationary".

Note: After transforming you will lose the original data, hence be sure that you have a way
to transform the predictions back to something similar to the original values.
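A minimal sketch of the Augmented Dickey-Fuller test, assuming the statsmodels library (not named in these slides) and a made-up series; a p-value <= 0.05 would let us reject the null hypothesis of non-stationarity.

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.RandomState(0)
series = rng.normal(size=200).cumsum()   # a random walk, i.e. non-stationary

result = adfuller(series)
print("ADF statistic:", result[0])
print("p-value:", result[1])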



- Autocorrelation helps us study how each time series observation is related to its
recent (or not so recent) past.

Forecasting time series using ARIMA

The ARIMA forecast for a stationary time series is nothing but a linear equation (like
linear regression).

- The predictors depend on the (p, d, q) parameters of the ARIMA model.

- ACF and PACF graphically summarize the strength of the relationship between an
observation in a time series and observations at prior time steps.



AUTOCORRELATION:
Autocorrelation helps us study how each time series observation is
related to its past.
ARIMA MODEL:

● p means the number of preceding (“lagged”) Y values that have to
be added/subtracted to Y in the model, so as to make better
predictions based on local periods of growth/decline in our data.
This captures the “autoregressive” nature of ARIMA.

● q represents the number of preceding/lagged values for the error


term that are added/subtracted to Y. This captures the “moving
average” part of ARIMA



d represents the number of times that the data have to be
“differenced” to produce a stationary signal (i.e., a signal that
has a constant mean over time). This captures the “integrated”
nature of ARIMA. If d=0, this means that our data does not tend
to go up/down in the long term (i.e., the model is already
“stationary”).



FITTING AN ARIMA MODEL

FORECASTING THE RESULT



Practical link : click_here
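A minimal sketch of fitting an ARIMA(p, d, q) model and forecasting, assuming statsmodels and a made-up monthly price series (the actual practical uses its own data).

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.RandomState(0)
index = pd.date_range("2015-01-01", periods=60, freq="MS")
prices = pd.Series(1000 + 10 * np.arange(60) + rng.normal(0, 20, 60), index=index)

model = ARIMA(prices, order=(1, 1, 1))    # (p, d, q)
fitted = model.fit()

print(fitted.summary())
print(fitted.forecast(steps=6))           # forecast the next 6 months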



Apriori Algorithm

For finding frequent itemsets in a dataset


Apriori because it uses prior knowledge of frequent itemset
properties.



What is Frequent Itemset?
Frequent itemsets are those items whose support is greater than the
threshold value or user-specified minimum support. It means if A &
B are the frequent itemsets together, then individually A and B
should also be the frequent itemset.

Suppose there are the two transactions: A= {1,2,3,4,5}, and B=


{2,3,7}, in these two transactions, 2 and 3 are the frequent
itemsets.



Steps for Apriori Algorithm

Step-1: Determine the support of the itemsets.

Step-2: Keep all the itemsets whose support value is higher than the minimum support.

Step-3: Find all the rules of these subsets that have a confidence value higher than the threshold.

Step-4: Sort the rules in decreasing order of lift.
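A minimal sketch of these steps using the third-party mlxtend library (an assumption; it is not mentioned in the slides) on made-up transactions.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [['A', 'B', 'C'], ['A', 'B'], ['A', 'C'], ['B', 'C'], ['A', 'B', 'C']]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

frequent = apriori(onehot, min_support=0.4, use_colnames=True)                 # steps 1-2
rules = association_rules(frequent, metric="confidence", min_threshold=0.5)    # step 3
print(rules.sort_values("lift", ascending=False))                              # step 4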



As per the minimum support = 2



Finding the association rules for the subsets?
To generate the association rules, first we will create a new table
with the possible rules from the frequent combination {A, B, C}.
For all the rules, we will calculate the Confidence using the formula
sup(A^B)/sup(A). After calculating the confidence value for all rules,
we will exclude the rules that have less confidence than the
minimum threshold (50%).
Rules | Support | Confidence

A^B → C | 2 | sup{(A^B)^C}/sup(A^B) = 2/4 = 0.5 = 50%

B^C → A | 2 | sup{(B^C)^A}/sup(B^C) = 2/4 = 0.5 = 50%

A^C → B | 2 | sup{(A^C)^B}/sup(A^C) = 2/4 = 0.5 = 50%

C → A^B | 2 | sup{C^(A^B)}/sup(C) = 2/5 = 0.4 = 40%

A → B^C | 2 | sup{A^(B^C)}/sup(A) = 2/6 = 0.33 = 33.33%

B → A^C | 2 | sup{B^(A^C)}/sup(B) = 2/7 = 0.28 = 28%



Recommendation System



Collaborative filtering methods for recommender systems
are methods that are solely based on the past interactions between users and the
target items. Thus, the input to a collaborative filtering system will be all historical
data of user interactions with target items. This data is typically stored in a matrix
where the rows are the users, and the columns are the items.
The core idea behind such systems is that the historical data of the users should be
enough to make a prediction, i.e. we don’t need anything more than that historical
data, no extra push from the user, no presently trending information, etc.
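A minimal sketch of that idea: a made-up user-item rating matrix and cosine similarity between users, which a collaborative filter could use to pick recommendations.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# rows = users, columns = items, 0 = not rated
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
])

user_sim = cosine_similarity(ratings)
print(user_sim)                                 # users 0 and 1 come out most similar

similar_user = np.argsort(user_sim[0])[-2]      # best match other than user 0 itself
print("most similar user to user 0:", similar_user)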



Content-based
In contrast to collaborative filtering, content-based approaches will use
additional information about the user and / or items to make
predictions.
Content-based recommenders treat recommendation as a user-specific classification
problem and learn a classifier for the user's likes and dislikes based on an item's
features. In this system, keywords are used to describe the items and a user profile is
built to indicate the type of item this user likes.



Practical
Click_here



Reinforcement Learning



REINFORCEMENT LEARNING

● In this type of Machine Learning, there is an agent which tries to learn what action it
should take for a given state in the environment in order to maximize the cumulative
Reward.
● In short Learning through Experience.
Actions: Actions are the Agent’s methods which allow it to interact and change its
environment, and thus transfer between states. Every action performed by the
Agent yields a reward from the environment. The decision of which action to
choose is made by the policy.

Agent: The learning and acting part of a Reinforcement Learning problem, which
tries to maximize the rewards it is given by the Environment.

Environment: Everything which isn’t the Agent; everything the Agent can
interact with, either directly or indirectly. The environment changes as the Agent
performs actions; every such change is considered a state-transition. Every action
the Agent performs yields a reward received by the Agent.



Thomson Sampling
Thompson Sampling makes use of Probability Distribution and Bayes Theorem to generate success
rate distributions.

Thompson Sampling is also sometimes referred to as Posterior Sampling or Probability Matching.

Follows exploration and exploitation.

New choices are explored to maximize rewards while exploiting the already explored choices.



Basic Intuition Behind Thompson Sampling
1. To begin with, all machines are assumed to have a uniform distribution of the
probability of success, in this case getting a reward
2. For each observation obtained from a Slot machine, based on the reward a new
distribution is generated with probabilities of success for each slot machine
3. Further observations are made based on these prior probabilities obtained on each
round or observation which then updates the success distributions.
4. After sufficient observations, each slot machine will have a success distribution
associated with it which can help the player in choosing the machines wisely to get the
maximum rewards.
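A minimal sketch of those steps for slot machines with made-up success probabilities, using Beta distributions as the success-rate distributions.

import numpy as np

rng = np.random.RandomState(0)
true_probs = [0.2, 0.5, 0.75]        # unknown to the player
wins = np.zeros(3)
losses = np.zeros(3)

for _ in range(1000):
    # sample a success rate for each machine from its current distribution
    samples = rng.beta(wins + 1, losses + 1)
    arm = int(np.argmax(samples))    # play the machine with the best sample
    reward = rng.rand() < true_probs[arm]
    wins[arm] += reward
    losses[arm] += 1 - reward

print("plays per machine:", wins + losses)   # most plays go to the best machine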



Multi-Arm Bandits
Bandits: Formally named “k-Armed Bandits” after the
nickname “one-armed bandit” given to slot-machines,
these are considered to be the simplest type of
Reinforcement Learning tasks. Bandits have no
different states, but only one — and the reward taken
under consideration is only the immediate one.
Hence, bandits can be thought of as having single-state
episodes. Each of the k-arms is considered an action,
and the objective is to learn the policy which will
maximize the expected reward after each action (or arm-
pulling).



UCB
Step 1: Two values are considered for each round of exploration of a machine

1. The number of times each machine has been selected till round n
2. The sum of rewards collected by each machine till round n

Step 2: At each round, we compute the average reward and the confidence interval
of the machine i up to n rounds as follows:

Average UCB
Confidence Interval
Practical
Click_here



Q-learning



Q-learning is an off policy reinforcement learning algorithm that seeks to find
the best action to take given the current state. It's considered off-policy because
the q-learning function learns from actions that are outside the current policy,
like taking random actions, and therefore a policy isn't needed.

What is Q?

Q - stands for the quality, how useful a given action is in gaining

rewards (Finding Optimal Path)


Create a q-table

When q-learning is performed we create what’s called a q-table or matrix that

follows the shape of [state, action] and we initialize our values to zero.

We then update and store our q-values after an episode. This q-table becomes

a reference table for our agent to select the best action based on the q-value.



Q-learning and making updates

The next step is simply for the agent to interact with

the environment and make updates to the state action

pairs in our q-table Q[state, action].


Here are the 3 basic steps:

1. Agent starts in a state (s1) takes an action (a1) and receives a reward (r1)

2. Agent selects action by referencing Q-table with highest value (max) OR

by random (epsilon, ε)

3. Update q-values



# Update q values
Q[state, action] = Q[state, action] + lr * (reward + gamma * np.max(Q[new_state, :]) - Q[state, action])

Click_here
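A minimal Q-learning sketch on a made-up 1-D world with 5 states, where the agent is rewarded only for reaching the right-most state; it follows the three steps and the update rule above.

import numpy as np

n_states, n_actions = 5, 2              # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))     # the q-table, initialised to zero
lr, gamma, epsilon = 0.1, 0.9, 0.2

for episode in range(500):
    state = 0
    while state != n_states - 1:
        # epsilon-greedy: sometimes explore at random, otherwise exploit the q-table
        if np.random.rand() < epsilon:
            action = np.random.randint(n_actions)
        else:
            action = int(np.argmax(Q[state]))
        new_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1 if new_state == n_states - 1 else 0
        # the q-value update shown above
        Q[state, action] += lr * (reward + gamma * np.max(Q[new_state]) - Q[state, action])
        state = new_state

print(Q)   # the "right" action ends up with the higher q-value in every state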



Intro To Deep Learning
Deep learning is part of a broader family of machine learning methods based on
artificial neural networks with representation learning.

Deep Learning is a subfield of machine learning concerned with algorithms inspired


by the structure and function of the brain called artificial neural networks.



DL VS ML
1. The main difference between deep learning and machine learning is the way data is
presented to the system. Machine learning algorithms almost always require structured
data, while deep learning networks rely on layers of ANN (artificial neural networks) and
do not necessarily require structured data; they can work on both structured and
unstructured data.
2. Machine learning algorithms are designed to “learn” to act by understanding
labeled data and then use it to produce new results with more datasets.



1. Deep learning networks do not require human intervention, as multilevel layers
in neural networks place data in a hierarchy of different concepts, which
ultimately learn from their own mistakes. However, even they can be wrong if
the data quality is not good enough.
2. Data decides everything. It is the quality of the data that ultimately determines
the quality of the result.



DL VS Human Brain

Deep learning consists of artificial neural networks that are modeled on similar
networks present in the human brain. As data travels through this artificial
mesh, each layer processes an aspect of the data, filters outliers, spots familiar
entities, and produces the final output.
The human brain functions in a similar fashion — but only at a highly advanced
level. The human brain is a far more complex web of diverse neurons where
each node performs a separate task. Our understanding of things is far
superior. If we are taught that lions are dangerous, we can deduce that bears
are too.



Single Layer perceptron



Working

Single layer perceptron is the first proposed neural model created. The content of the local memory of the neuron
consists of a vector of weights. The computation of a single layer perceptron is performed as the sum
of the input vector values, each multiplied by the corresponding element of the weight vector. The value
which is displayed in the output will be the input of an activation function.

Taking the activation function as the Sigmoid function

● The weights are initialized with random values at the beginning of the training.
● For each element of the training set, the error is calculated with the difference between desired output and
the actual output. The error calculated is used to adjust the weights.
● The process is repeated until the error made on the entire training set is not less than the specified threshold,
until the maximum number of iterations is reached.
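A minimal from-scratch sketch of that training loop for a single-layer perceptron with a sigmoid activation, learning the AND function on made-up data.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])               # AND truth table

rng = np.random.RandomState(0)
w = rng.uniform(0, 0.3, size=2)          # small random initial weights
b = 0.0
lr = 0.5

for epoch in range(2000):
    out = sigmoid(X @ w + b)             # weighted sum passed through the activation
    error = out - y                      # difference between actual and desired output
    grad = error * out * (1 - out)       # gradient through the sigmoid
    w -= lr * X.T @ grad                 # adjust the weights using the error
    b -= lr * grad.sum()

print(np.round(sigmoid(X @ w + b)))      # approximately [0, 0, 0, 1]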



Multilayer Perceptron
Multi-Layer perceptron defines the most complicated architecture of artificial neural networks. It is substantially
formed from multiple layers of perceptron.



Intro ……….
The field of artificial neural networks is often just called neural networks or
multi-layer perceptrons after perhaps the most useful type of neural
network. A perceptron is a single neuron model that was a precursor to
larger neural networks.
The predictive capability of neural networks comes from the hierarchical or
multi-layered structure of the networks. The data structure can pick out
(learn to represent) features at different scales or resolutions and combine
them into higher-order features. For example from lines, to collections of
lines to shapes.



What is Neurons………..
The building block for neural networks are artificial neurons.

These are simple computational units that have weighted input signals
and produce an output signal using an activation function.



You may be familiar with linear regression, in which case the weights on the
inputs are very much like the coefficients used in a regression equation.

Like linear regression, each neuron also has a bias which can be thought of
as an input that always has the value 1.0 and it too must be weighted.

For example, a neuron may have two inputs in which case it requires three
weights. One for each input and one for the bias.

Weights are often initialized to small random values, such as values in the range
0 to 0.3, although more complex initialization schemes can be used.



Activation Function
The weighted inputs are summed and passed through an activation function,
sometimes called a transfer function.

An activation function is a simple mapping of summed weighted input to the


output of the neuron. It is called an activation function because it governs the
threshold at which the neuron is activated and strength of the output signal.
Historically, simple step activation functions were used: if the summed input was above
a threshold, for example 0.5, then the neuron would output a value of 1.0, otherwise it
would output a 0.0.
Traditionally non-linear activation functions are used. This allows the network to combine the inputs in
more complex ways and in turn provide a richer capability in the functions they can model. Non-linear
functions like the logistic also called the sigmoid function were used that output a value between 0 and 1
with an s-shaped distribution, and the hyperbolic tangent function also called tanh that outputs the
same distribution over the range -1 to +1.

More recently the rectifier activation function has been shown to provide better results.

Sigmoid Function Tanh Function



Layers In DL
Neurons are arranged into networks of neurons.

A row of neurons is called a layer and one network can have multiple layers.
The architecture of the neurons in the network is often called the network
topology.



Input Layer
The bottom layer that takes input from your dataset
is called the visible layer, because it is the exposed
part of the network. Often a neural network is
drawn with a visible layer with one neuron per input
value or column in your dataset. These are not
neurons as described above, but simply pass the
input value through to the next layer.



Hidden Layer

Layers after the input layer are called hidden layers because they are
not directly exposed to the input. The simplest network structure is to
have a single neuron in the hidden layer that directly outputs the
value.

Given increases in computing power and efficient libraries, very deep


neural networks can be constructed. Deep learning can refer to having
many hidden layers in your neural network. They are deep because
they would have been unimaginably slow to train historically, but may
take only seconds or minutes to train using modern techniques.
Output Layer
The final hidden layer is called the output layer and it is responsible for
outputting a value or vector of values that correspond to the format
required for the problem.



Choice Of Activation function
The choice of activation function in The output layer is strongly constrained by the type of
problem that you are modeling. For example:

● A regression problem may have a single output neuron and the neuron may have no
activation function.
● A binary classification problem may have a single output neuron and use a sigmoid
activation function to output a value between 0 and 1 to represent the probability of
predicting a value for the class 1. This can be turned into a crisp class value by using a
threshold of 0.5 and snap values less than the threshold to 0 otherwise to 1.
● A multi-class classification problem may have multiple neurons in the output layer, one
for each class
● In this case a softmax activation function may be used to output a probability of the
network predicting each of the class values. Selecting the output with the highest
probability can be used to produce a crisp class classification value.



Types of Activation function
Sigmoid Function

In an ANN, the sigmoid function is a non-linear AF used primarily in


feedforward neural networks. It is a differentiable real function, defined
for real input values, and containing positive derivatives everywhere with
a specific degree of smoothness. The sigmoid function appears in the
output layer of the deep learning models and is used for predicting
probability-based outputs. The sigmoid function is represented as: f(x) = 1 / (1 + e^(-x))



Relu Activation

ReLU(Z) = max(0, Z): if Z is negative the output is 0, otherwise the output is Z (the input value) itself.



Leaky relu

max(0.01 * Z , Z)



SoftMax

softmax(z_i) = exp(z_i) / Σ_j exp(z_j), which turns a vector of scores into probabilities that sum to 1.
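A minimal numpy sketch of the activation functions discussed in this section.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))          # outputs between 0 and 1

def tanh(z):
    return np.tanh(z)                         # outputs between -1 and +1

def relu(z):
    return np.maximum(0, z)                   # max(0, Z)

def leaky_relu(z, slope=0.01):
    return np.maximum(slope * z, z)           # max(0.01 * Z, Z)

def softmax(z):
    e = np.exp(z - np.max(z))                 # subtract max for numerical stability
    return e / e.sum()                        # probabilities that sum to 1

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), tanh(z), relu(z), leaky_relu(z), softmax(z), sep="\n")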



What Is Learning Rate
the learning rate is a configurable hyperparameter used in the training of neural networks that has a small
positive value, often in the range between 0.0 and 1.0.

The learning rate controls how quickly the model is adapted to the problem. Smaller learning rates require more
training epochs given the smaller changes made to the weights each update, whereas larger learning rates
result in rapid changes and require fewer training epochs.



A learning rate that is too large can cause the model to converge too quickly to a suboptimal solution,
whereas a learning rate that is too small can cause the process to get stuck.



What Is Batch
The batch size is a hyperparameter that defines the number of samples to work through before
updating the internal model parameters.

Think of a batch as a for-loop iterating over one or more samples and making predictions. At the end
of the batch, the predictions are compared to the expected output variables and an error is
calculated. From this error, the update algorithm is used to improve the model, e.g. move down
along the error gradient

When all training samples are used to create one batch, the learning algorithm is called batch
gradient descent. When the batch is the size of one sample, the learning algorithm is called
stochastic gradient descent. When the batch size is more than one sample and less than the size of
the training dataset, the learning algorithm is called mini-batch gradient descent.



Epoch
The number of epochs is a hyperparameter that defines the number of times that the learning algorithm will
work through the entire training dataset.

One epoch means that each sample in the training dataset has had an opportunity to update the internal model
parameters. An epoch is comprised of one or more batches. An epoch that has one batch is called the batch
gradient descent learning algorithm.

It is common to create line plots that show epochs along the x-axis as time and the error or skill of the model on
the y-axis. These plots are sometimes called learning curves.



Loss Function

with neural networks, we seek to minimize the error. As such, the objective function is often referred
to as a cost function or a loss function and the value calculated by the loss function is referred to as
simply “loss.”

The cost or loss function has an important job in that it must faithfully distill all aspects of the model down
into a single number in such a way that improvements in that number are a sign of a better model.



It is important, therefore, that the function faithfully represent our design goals. If we choose a poor
error function and obtain unsatisfactory results, the fault is ours for badly specifying the goal of the
search.



Choice OF Loss Function

The choice of cost function is tightly coupled with the choice of output unit. Most of the time, we
simply use the cross-entropy between the data distribution and the model distribution. The choice of
how to represent the output then determines the form of the cross-entropy function



Regression Problem
● Loss Function: Mean Squared Error (MSE).

Binary Classification Problem


● Loss Function: Cross-Entropy

Multi-Class Classification Problem


● Loss Function: MultiClass Cross-Entropy,



BackPropagation



What is BackPropagation

Back-propagation is the essence of neural net training. It is the method of fine-tuning the weights of a neural
net based on the error rate obtained in the previous epoch (i.e., iteration). Proper tuning of the weights
allows you to reduce error rates and to make the model reliable by increasing its generalization.
It is a standard method of training artificial neural networks. This method helps to calculate the gradient of a
loss function with respect to all the weights in the network.



How Does It Work

1. Inputs X, arrive through the preconnected path


2. Input is modeled using real weights W. The weights are usually randomly selected.
3. Calculate the output for every neuron from the input layer, to the hidden layers, to the output layer.
4. Calculate the error in the outputs
5. Travel back from the output layer to the hidden layer to adjust the weights such that the error is
decreased.

Keep repeating the process until the desired output is achieved



Update the Weight W

Equation: W1 = W1 - alpha * dL/dW1

dL/dW is obtained with the chain rule,
e.g. dL/dW1 = dL/dZ * dZ/dW1, chained back until the required weight W1 is reached.

Same for the bias:

B = B - alpha * dL/dB



Optimizers

Optimizers are algorithms or methods used to change the attributes of your neural
network such as weights and learning rate in order to reduce the losses.
How you should change your weights or learning rates of your neural network to
reduce the losses is defined by the optimizers you use. Optimization algorithms or
strategies are responsible for reducing the losses and to provide the most accurate
results possible.



Types
Gradient Descent
Stochastic Gradient Descent
Mini-batch Gradient Descent
Adagrad
Adam and Rms Prop
Etc.



Gradient descent
All data points at a time: J(theta) = 1/(2m) * sum((h(X) - Y)^2)
Loop {
    theta_j = theta_j - alpha * dJ/d(theta_j)      ( j = 0,...,n )
}
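A minimal numpy sketch of that loop: batch gradient descent for a simple linear model (one weight and one bias) on made-up data.

import numpy as np

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, 50)
Y = 3 * X + 5 + rng.normal(0, 1, 50)          # underlying line: y = 3x + 5

theta0, theta1, alpha, m = 0.0, 0.0, 0.01, len(X)

for _ in range(5000):
    h = theta0 + theta1 * X                            # predictions with current parameters
    theta0 -= alpha * (1 / m) * np.sum(h - Y)          # dJ/d(theta0)
    theta1 -= alpha * (1 / m) * np.sum((h - Y) * X)    # dJ/d(theta1)

print(theta0, theta1)    # should move close to 5 and 3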



Stochastic Gradient Descent
Single Data Point at a Time
1. Randomly shuffle dataset
2. Loop {
For i = 1,...,m {
Theta_ j = Theta_ j - alpha * D ( h(X) - Y )X_j / DTheta
( j = 0,..., n )
}
}
SGD With Momentum

Concept Of Exponential Weighted Average


T1 T2 T3 T4
b1 b2 b3 b4
Then
Vt1 = b1
Vt2 = Y * Vt1 + b2
Vt3 = Y * Vt2 + b3
where Y (gamma) ranges from 0 to 1
Equation for weight update
Old: W = W - alpha * dL/dW
New (with momentum): W = W - [ Y * V_(t-1) + eta * dL/dW_old ]
V_(t-1) = 1*[dL/dW_old] + Y*[dL/dW_old]_(t-1) + Y^2*[dL/dW_old]_(t-2) + …



Mini-Batch Gradient Descent
A small batch of data points at a time
1. Batch = 20 , dataPoint = 100
2. Loop {
For i = 20,..,100 {
Theta_ j = Theta_ j - alpha * 1/20 sum ( h(X) -Y)X
( j = 0,..., n ) ( sum range = i to i + 19 )
}
}
AdaGrad

Equation: W = W - alpha * dL/dW
Here we vary alpha (the learning rate):
alpha = alpha / sqrt(eta_t + sigma)
where sigma = any small positive value (to avoid division by zero)
eta_t = sum over t of (dL/dW_t)^2



AdaDelta And RmsProp

W = W - alpha * dL/dW
alpha = alpha / sqrt(W_avg + sigma)
W_avg = beta * W_avg_(t-1) + (1 - beta) * (dL/dW)^2
Generally beta = 0.95



Practical

NN_From_Skretch : = click_here

Hand_DIgit_Recognition = click_here

Single Layer Perceptron := click_here



Tableau
What is Data Visualization

Data viz is the representation of data or information in a graph, chart or other visual
format. It deals with the graphic representation of the data.

It communicates the relationships in the data with images.

It connects thousands or millions of lines of numbers with a clear visual image so the data
can be analyzed effectively.
Importance of Viz

This is important because it allows trends and patterns to be more easily seen.

With the rise of big data upon us, we need to be able to interpret increasingly larger
batches of data.

Machine learning makes it easier to conduct analyses such as predictive analysis, which
can then serve as helpful visualizations to present.

Data viz is not only for data scientists and data analysts; this skill can be
used in almost every field to analyse data.
Need of Visualization
We need data visualization because a visual summary of information makes it easier to
identify patterns and trends than looking through thousands of rows on a spreadsheet.
It’s the way the human brain works.

Since the purpose of data analysis is to gain insights, data is much more valuable when
it is visualized.

Even if a data analyst can pull insights from data without visualization, it will be more
difficult to communicate the meaning without visualization. Charts and graphs make
communicating data findings easier
Use cases of data Viz

1. Identifying Trends and Spikes


2. Monitoring Goals and Results
3. Getting Notified When Changes Occur
4. Aggregating Diverse Data Sets
5. Accessing Data Displays Remotely
And Many More……………...
Data Visualization Tools
What is data Visualization Tool
It is software that takes data from a specific source and turns it into visual charts,
graphs, dashboards and reports.

There are multiple Data Visualization tools available in market ,

Some are..

1. Visme
2. Tableau
3. Power Bi
4. Infogram
5. Whatagraph
6. Sisense
7. DataBox
And Many More……………….
Tableau
Why tableau
1. Quick And Interactive Visualization

Specializing in beautiful visualizations, it offers
instantaneous insights with a simple drag-and-drop
feature, thus helping you easily analyze key data
and share crucial insights.
2. Easy To use

Compared to other BI tools, Tableau lets you


create rich visualizations in just a few seconds.
It lets you perform complex tasks with simple
drag-and-drop functionalities.
3. Handles Copious amounts of data

Need all the records of the past 8-10 years?

Tableau can handle millions of rows of data without
impacting the performance of the dashboard.

It can connect to live data sources to provide companies with
real-time results on key business metrics.
4. Mobile Friendly Dashboard

Tableau dashboards can be viewed and operated on


several devices such as your laptop, mobile, or even a
tablet! You aren’t required to perform any additional steps
in order to make your dashboards mobile-friendly.
Tableau automatically understands the device that you’re
viewing the report on and makes adjustments
accordingly.
5. Integrates with Scripting Languages

The BI tool lets you integrate with R or Python, thus


helping you amplify data with visual analytics.
Tableau Example dashboard
Installing Tableau
Tableau Public
GOTO : https://public.tableau.com/en-us/s/download and enter the email ID and download the app
Accept the terms
and conditions
Tableau Interface
A = Left pane – Displays the connected data source and other details about your data.

B = Canvas: logical layer - The canvas opens with the logical layer, where you can create
relationships between logical tables

C = Canvas: physical layer – Double-click a table in the logical layer to go to the physical
layer of the canvas, where you can add joins and unions between tables.

D = Data grid – Displays first 1,000 rows of the data contained in the Tableau data source.

E = Metadata grid – Displays the fields in your data source as rows.


A = Workbook name. A workbook contains sheets. A sheet can be a worksheet, a dashboard,
or a story.

B = Drag fields to the cards and shelves in the workspace to add data to your view

C = Use the toolbar to access commands and analysis and navigation tools

D = This is the canvas in the workspace where you create a visualisation (also referred to as a
"viz").

E = Click this icon to go to the Start page, where you can connect to data. For more
information

F = Side Bar - In a worksheet, the side bar area contains the Data pane and the Analytics
pane.

G = Click this tab to go to the Data Source page and view your data.

H = Status bar - Displays information about the current view

I = Sheet tabs - Tabs represent each sheet in your workbook. This can include worksheets,
dashboards and stories.
Data Types
There are primarily seven data types used in Tableau. Tableau automatically detects the
data types of various fields as soon as new data gets uploaded from source to Tableau and
assigns it to the fields. You can also modify these data types after uploading your data into
Tableau .

1. String values
2. Number (Integer) values
3. Date values
4. Date & Time values
5. Boolean values
6. Geographic values
7. Cluster or mixed values
Connecting To Data Source

One of the basic operation we need to learn is to connect to a data source.

Once we establish a successful connection with a data source, we can access all its data,
bring some part of it in Tableau’s repository (extract) and use it for our analysis.

Tableau offers a myriad of data sources such as local text files, MS Excel, PDFs, JSON or
databases and servers like Tableau Server, MySQL Server, Microsoft SQL Server, etc.

Categorically, there are two types of data sources that you can connect to in Tableau;
To a file and To a server.
Connecting To A File
Tableau offers a variety of options to connect and get data from a file in your
system.

The connection to a file section has file options such as MS Excel, MS Access,
JSON, text file, PDF file, spatial file, etc.

In addition to this, with the help of the More option, you can access the data
files residing in your system and connect them with Tableau.
Connecting To A Server
The connection to a server section has countless options for an online data source. Here you
will find connectors to different kinds of online data sources such as,

● Server-based Relational Database: Tableau Server, Microsoft SQL Server, Oracle,


MySQL, Salesforce, IBM DB, Mongo DB, PostgreSQL, Maria DB, etc.
● Cloud-based Data Sources: Cloudera Hadoop, Google Cloud SQL, Amazon
Aurora, etc.
● Web-based Data Sources: Web Data Connector
● Big Data Sources: Google BigQuery
● In-memory Database: SAP HANA
● ODBC and JDBC connections
Connect to DataBase Practically
Data preparation in Tableau
As we all know, data will rarely be in the form we need, and we have to preprocess it
before doing any analysis.

Generally we would have to spend a lot of time in Excel to do this.

Tableau provides better functionality for the same task: we can use Tableau Prep for
cleaning the data.

Tableau Prep is designed to reduce the struggle of common yet complex tasks—such as
joins, unions, pivots, and aggregations—with a drag-and-drop visual experience. No
scripting required.
Joins
In general, there are four types of joins that you can use in Tableau: inner, left, right, and full outer.

Inner :

When you use an inner join to combine tables, the result is a table that contains values that have
matches in both tables.

When a value doesn't match across both tables, it is dropped entirely.


Left :

When you use a left join to combine tables, the result is a table that contains all values from the left
table and corresponding matches from the right table.

When a value in the left table doesn't have a corresponding match in the right table, you see a null
value in the data grid.
Right :

When you use a right join to combine tables, the result is a table that contains all values from the
right table and corresponding matches from the left table.

When a value in the right table doesn't have a corresponding match in the left table, you see a null
value in the data grid.
Full Outer :

When you use a full outer join to combine tables, the result is a table that contains all values from
both tables.

When a value from either table doesn't have a match with the other table, you see a null value in the
data grid.
Union :

Though union is not a type of join, union is another method for combining two or more tables by
appending rows of data from one table to another. Ideally, the tables that you union have the same
number of fields, and those fields have matching names and data types.
Filters
Filters are a smart way to collate and segregate data based on its
dimensions and sets to reduce the overall data frequency for
faster processing.

There are six different types of filters in tableau desktop based on


their various objectives
1. Extract Filter

As understood by its name, the extract filters are used to extract data from the
various sources,
Such methods can help in lowering the tableau queries to the data source.

2. Data Source Filter

Used mainly to restrict sensitive data from the data viewers, the data source filters
are similar to the extract filters in minimizing the data feeds for faster processing.

The data source filter in tableau helps in the direct application of the filter
environment to the source data and quickly uploads data that qualifies the scenario
into the tableau workbook.
3. Context Filter

A context filter is a discrete filter on its own, creating datasets based on the
original datasheet and the presets chosen for compiling the data. Since all
the types of filters in tableau get applied to all rows in the datasheet,
irrespective of any other filters, the context filter would ensure that it is first to
get processed.

Despite being constrained to view all data rows, it can be implemented to


choose sheets as and when required to optimize its performance by
minimizing the data efficiently.
4. Dimension filter

Now that you’ve chosen the data, you can access the values highlighted or
remove them from the selected dimension, represented as strikethrough
values. You can click All or None to select or deselect based on your
operation in case of multiple dimensions.

5. Measure Filter

In this filter, you can apply the various operations like Sum, Avg, Median,
Standard Deviation, and other aggregate functions. In the next stage, you
would be presented with four choices: Range, At least, At most, and Special
for your values. Every time you drag the data you want to filter, you can do
that in a specific setting.
6. Table Filter

The last filter to process is the table calculation that gets executed once the
data view has been rendered. With this filter, you can quickly look into the
data without any filtering of the hidden data.
Charts And Graphs
We can Create many charts in tableau depending upon our requirement

We can have …………..

Bar Chart
Line Chart
Pie Chart
Maps
Density Maps
Scatter Plot
Gantt Plot
Bubble Chart
Tree Map

And many more


Practical
Calculated Field
Calculated fields allow you to create new data from data that already
exists in your data source. When you create a calculated field, you
are essentially creating a new field (or column) in your data source,
the values or members of which are determined by a calculation that
you control.

This new calculated field is saved to your data source in


Tableau, and can be used to create more robust
visualizations. But don't worry: your original data remains
untouched.
You can use calculated fields for many, many reasons. Some examples
might include:

● To segment data
● To convert the data type of a field, such as converting a string to a
date.
● To aggregate data
● To filter results
● To calculate ratios
How to perform Calculation on Tableau .

1. In Tableau, select Analysis > Create Calculated Field.


2. In the Calculation Editor that opens, do the following:
● Enter a name for the calculated field. In this example, the field is called, Discount
Ratio.
● Enter a formula. This example uses the following formula:

IF([sales] != 0 , [discount]/[sales],0)

This formula checks if sales is not equal to zero. If true, it returns the discount
ratio (Discount/Sales); if false, it returns zero.

3. When finished, click OK.


Story Making
Use stories to make your case more compelling by showing how facts are connected, and how
decisions relate to outcomes. You can then publish your story to the web, or present it to an
audience.

Each story point can be based on a different view or dashboard, or the entire story can be based on the
same visualization seen at different stages, with different filters and annotations.
1. Click the New Story tab.
2. In the lower-left corner of the screen, choose a size for
your story. Choose from one of the predefined sizes, or
set a custom size, in pixels
3. By default, your story gets its title from the sheet name.
To edit it, right-click the sheet tab, and choose Rename
Sheet.
4. To start building your story, double-click a sheet on the
left to add it to a story point.
5. Click Add a caption to summarize the story point.
6. To further highlight the main idea of this story point,
you can change a filter or sort on a field in the view.
Then save your changes by clicking Update on the story
toolbar above the navigator box:
Dashboard And Reports
Reports VS Dashboard

A report is a more detailed collection of tables, charts, and graphs and it is used for a
much more detailed, full analysis while a dashboard is used for monitoring what is
going on. The behavior of the pieces that make up dashboards and reports are similar,
but their makeup itself is different.

A dashboard answers a question in a single view and a report provides information.

The report can provide a more detailed view of the information that is presented on a
dashboard.
Practical For report And Dashboard
