Dsbda Ass3

Uploaded by

ngak1214

Data Science and Big Data Analytics Laboratory
Third Year 2019 Course
Assignment No 3

Prof. K. B. Sadafale
Assistant Professor
Computer Dept. GCOEAR, Avasari
Descriptive Statistics - Measures of Central Tendency and Variability

Perform the following operations on any open source dataset (e.g., data.csv):
1. Provide summary statistics (mean, median, minimum,
maximum, standard deviation) for a dataset (age, income etc.)
with numeric variables grouped by one of the qualitative
(categorical) variables. For example, if your categorical variable is
age group and your quantitative variable is income, then provide
summary statistics of income grouped by age group. Create
a list that contains a numeric value for each response to the
categorical variable.
2. Write a Python program to display some basic statistical
details like percentile, mean, standard deviation etc. of the
species 'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica' of the
iris.csv dataset.
Provide the code with outputs and explain everything that you
do in this step.
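For the first task, a minimal sketch of grouped summary statistics might look like the following. The column names `age_group` and `income` and all values are invented for illustration; they are not from the assignment's dataset. The last step reads "a numeric value for each response to the categorical variable" as a numeric code per category, which is only one possible interpretation.

```python
import pandas as pd

# Illustrative data only: column names and values are invented for this sketch.
df = pd.DataFrame({
    "age_group": ["18-30", "18-30", "31-50", "31-50", "51+", "51+"],
    "income": [30000, 40000, 50000, 70000, 45000, 55000],
})

# Summary statistics of the numeric variable, grouped by the categorical one.
summary = df.groupby("age_group")["income"].agg(
    ["mean", "median", "min", "max", "std"]
)
print(summary)

# One numeric code per response to the categorical variable
# (codes follow the sorted order of the category labels).
codes = df["age_group"].astype("category").cat.codes.tolist()
print(codes)
```

Here `std` is the sample standard deviation (pandas divides by n - 1 by default).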
Descriptive Statistics
Create a DataFrame as follows (the code and its output were shown as images).
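A small DataFrame of the kind this section works with could be built as sketched below. The column names (`Name`, `Age`, `Rating`) and the values are assumptions chosen for illustration, since the original slide's code is not available.

```python
import pandas as pd

# Hypothetical sample data: column names and values are assumptions.
df = pd.DataFrame({
    "Name": ["Tom", "James", "Ricky", "Vin"],
    "Age": [25, 26, 25, 23],
    "Rating": [4.23, 3.24, 3.98, 2.56],
})
print(df)
```

The frame mixes one string column with two numeric columns, which is what makes the later notes about non-numeric data relevant.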
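sum()
✔ Returns the sum of the values for the requested axis.
✔ By default, the axis is the index (axis=0), which gives one sum per column.
print(df.sum())
Each column is summed independently (strings are concatenated).
print(df.sum(axis=1, numeric_only=True))
With axis=1 the sum runs across each row. numeric_only=True skips string columns, which cannot be added to numbers.
mean()
✔ Returns the average value of each numeric column.
print(df.mean(numeric_only=True))
std()
✔ Returns the sample standard deviation of the numeric columns.
print(df.std(numeric_only=True))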
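The three functions above can be tried together on a small frame. This is a sketch with invented values; `numeric_only=True` is passed explicitly because recent pandas versions no longer drop string columns silently.

```python
import pandas as pd

# Hypothetical sample data: column names and values are assumptions.
df = pd.DataFrame({
    "Name": ["Tom", "James", "Ricky", "Vin"],
    "Age": [25, 26, 25, 23],
    "Rating": [4.0, 3.0, 3.5, 2.5],
})

col_sums = df.sum(numeric_only=True)          # one sum per numeric column
row_sums = df.sum(axis=1, numeric_only=True)  # one sum per row
means = df.mean(numeric_only=True)            # column averages
stds = df.std(numeric_only=True)              # sample std (divides by n - 1)

print(col_sums, row_sums, means, stds, sep="\n\n")
```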
Functions & Description
✔ Let us now look at the descriptive statistics functions in Python pandas.
✔ The following table lists the important functions.
✔ Note: Because a DataFrame is a heterogeneous data structure, generic
operations do not work with all functions; some fail on non-numeric columns.
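Summarizing Data
✔ describe() computes summary statistics for the DataFrame columns. By default it covers only the numeric columns.
print(df.describe())
✔ include=['object'] summarizes the string columns instead.
print(df.describe(include=['object']))
✔ include='all' summarizes every column.
print(df.describe(include='all'))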
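The three `describe()` calls return differently shaped summaries, as the sketch below shows. The data is invented, and the `Name` column is given `dtype=object` explicitly (an assumption made here so that `include=['object']` selects it regardless of the pandas version's default string type).

```python
import pandas as pd

# Hypothetical sample data: column names and values are assumptions.
df = pd.DataFrame({
    "Name": pd.Series(["Tom", "James", "Ricky", "Vin"], dtype=object),
    "Age": [25, 26, 25, 23],
    "Rating": [4.0, 3.0, 3.5, 2.5],
})

num_summary = df.describe()                    # numeric columns only
obj_summary = df.describe(include=["object"])  # string columns only
all_summary = df.describe(include="all")       # every column

print(num_summary)
print(obj_summary)
print(all_summary)
```

The numeric summary reports count, mean, std, min, the quartiles and max, while the object summary reports count, unique, top and freq.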
Example
Read the "mtcars" CSV file into a DataFrame named mtcars.

# Get the mean of each column
mtcars.mean(numeric_only=True)

# Get the mean of each row
mtcars.mean(axis=1, numeric_only=True)
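To see the difference between column means and row means without the CSV file at hand, the sketch below uses three rows and four columns taken from the classic mtcars dataset. In practice the full file would be loaded with something like `pd.read_csv("mtcars.csv", index_col=0)`, where the file name and index column are assumptions.

```python
import pandas as pd

# A small excerpt of mtcars (three cars, four columns) used as a stand-in
# for the full CSV file.
mtcars = pd.DataFrame(
    {
        "mpg": [21.0, 21.0, 22.8],
        "cyl": [6, 6, 4],
        "disp": [160.0, 160.0, 108.0],
        "hp": [110, 110, 93],
    },
    index=["Mazda RX4", "Mazda RX4 Wag", "Datsun 710"],
)

col_means = mtcars.mean(numeric_only=True)          # one value per column
row_means = mtcars.mean(axis=1, numeric_only=True)  # one value per car

print(col_means)
print(row_means)
```

Row means mix units (miles per gallon, cylinders, horsepower), so they are shown here only to illustrate the `axis` argument.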
Median
✔ The median of a distribution is the value where 50% of the
data lies below it and 50% lies above it.
✔ In essence, the median splits the data in half.
✔ The median is also known as the 50th percentile, since
50% of the observations are found below it.
✔ You can get the median using the df.median() function:

# Get the median of each column
mtcars.median(numeric_only=True)
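The claim that the median is the 50th percentile and splits the data in half can be checked directly. The values below are invented for illustration.

```python
import pandas as pd

# Illustrative values only.
values = pd.Series([2, 4, 4, 5, 7, 9, 30])

median = values.median()
q50 = values.quantile(0.5)  # the 50th percentile

below = int((values < median).sum())  # observations below the median
above = int((values > median).sum())  # observations above the median

print(median, q50, below, above)
```

With an odd number of observations the median is the middle value itself, so the same number of observations falls on each side of it.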
Mode
✔ The mode of a variable is simply the value that appears
most frequently.
✔ Unlike mean and median, you can take the mode of a
categorical variable, and it is possible to have multiple
modes.
✔ Find the mode with df.mode():

mtcars.mode()

✔ Columns with multiple modes (multiple values with the
same highest count) return multiple values as the mode.
✔ In current pandas, a column in which no value appears more than
once returns every value as a mode. NaN appears only as padding
in columns that have fewer modes than the longest column.
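The behaviour of `df.mode()` with one mode, several modes, and no repeated value can be sketched on a toy frame. The column names and values are invented for illustration, and the result reflects current pandas behaviour.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "one_mode": [1, 1, 2, 3],    # 1 appears most often
    "two_modes": [1, 1, 2, 2],   # 1 and 2 tie
    "no_repeat": [1, 2, 3, 4],   # every value appears once
})

modes = df.mode()
print(modes)

# The mode also works on a categorical (string) variable.
colour_mode = pd.Series(["red", "blue", "blue"]).mode().tolist()
print(colour_mode)
```

The result has as many rows as the column with the most modes; shorter columns are padded with NaN.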
Example
• Problem Statement: Write a Python program to display some
basic statistical details like percentile, mean, standard
deviation etc. of the species 'Iris-setosa', 'Iris-versicolor' and
'Iris-virginica' of the iris.csv dataset.
iris.csv
data.describe()
Code:

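import pandas as pd

# Load the dataset; the species column is named 'target' in this file
data = pd.read_csv('iris.csv')

# Boolean mask per species, then describe() on the matching rows
print('Iris-setosa')
setosa = data['target'] == 'Iris-setosa'
print(data[setosa].describe())

print('Iris-virginica')
virginica = data['target'] == 'Iris-virginica'
print(data[virginica].describe())

print('Iris-versicolor')
versicolor = data['target'] == 'Iris-versicolor'
print(data[versicolor].describe())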
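The same per-species summary can also be produced in one step with `groupby`. The sketch below is an alternative, not the slide's method; it uses nine rows copied from the standard iris dataset as a stand-in for iris.csv, and the measurement column names are assumptions (only `target` comes from the code above).

```python
import pandas as pd

# Nine rows from the standard iris dataset, three per species, standing in
# for iris.csv. Measurement column names are assumptions.
data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.2, 3.1, 3.3, 2.7, 3.0],
    "petal_length": [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9],
    "petal_width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1],
    "target": ["Iris-setosa"] * 3
    + ["Iris-versicolor"] * 3
    + ["Iris-virginica"] * 3,
})

# count, mean, std, min, quartiles and max for every column, per species.
per_species = data.groupby("target").describe()
print(per_species["petal_length"])

# Any other percentile can be requested explicitly, here the 90th.
p90 = data.groupby("target")["petal_length"].quantile(0.9)
print(p90)
```

The result of `groupby(...).describe()` has two-level columns (measurement, statistic), so a single measurement is selected with `per_species["petal_length"]`.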
