GRACE COLLEGE OF ENGINEERING

MULLAKKADU, THOOTHUKUDI -628005

DEPARTMENT OF COMPUTER SCIENCE AND

ENGINEERING

CS3361- DATA SCIENCE LABORATORY

II YEAR/III SEMESTER
REGULATION 2021

LAB MANUAL

PREPARED BY

Mrs.P.JOY SUGANTHY BAI,

Assistant professor
CSE Department
Grace College of Engineering, Thoothukudi

List of Experiments
1. Download, install and explore the features of NumPy, SciPy, Jupyter,
Statsmodels and Pandas packages.
2. Working with Numpy arrays
3. Working with Pandas data frames
4. Reading data from text files, Excel and the web and exploring various
commands for doing descriptive analytics on the Iris data set.

5. Use the diabetes data set from UCI and Pima Indians Diabetes data set
for performing the following:
a. Univariate analysis: Frequency, Mean, Median, Mode, Variance,
Standard Deviation, Skewness and Kurtosis.
b. Bivariate analysis: Linear and logistic regression modeling
c. Multiple Regression analysis
d. Also compare the results of the above analysis for the two data sets.

6. Apply and explore various plotting functions on UCI data sets.

a. Normal curves
b. Density and contour plots
c. Correlation and scatter plots
d. Histograms
e. Three dimensional plotting

7. Visualizing Geographic Data with Basemap

Ex.No:1 Download, install and explore the features of
NumPy, SciPy, Jupyter, Statsmodels and Pandas
packages

Anaconda:

Anaconda is a distribution of the Python and R programming languages for scientific
computing (data science, machine learning applications, large-scale data processing, predictive
analytics, etc.) that aims to simplify package management and deployment.

Jupyter NoteBook:
The Jupyter Notebook is an open-source web application that allows you to create and
share documents that contain live code, equations, visualizations, and narrative text. Its uses
include data cleaning and transformation, numerical simulation, statistical modeling, data
visualization, machine learning, and much more.

NumPy:
NumPy is a Python library used for working with arrays. It also has functions for
working in the domains of linear algebra, Fourier transforms, and matrices.

SciPy:
SciPy is a scientific computation library that uses NumPy underneath. SciPy stands
for Scientific Python. It provides more utility functions for optimization, statistics and signal
processing. Like NumPy, SciPy is open source, so we can use it freely.

NumPy, which stands for Numerical Python, is used for manipulating the elements of numerical
arrays. SciPy, which stands for Scientific Python, is used for numerical computations in Python.
Both packages extend the functionality available in Python.

Statsmodels:
Statsmodels is a popular Python library for estimating and analyzing various
statistical models. It is built on numeric and scientific libraries like NumPy and SciPy. It
includes various linear regression models such as ordinary least squares, generalized least
squares, weighted least squares, etc.

Pandas:
Pandas is a powerful library. It provides a large set of commands and
features for analyzing data easily. We can use Pandas for tasks
like filtering data according to certain conditions, or segmenting and segregating the data
according to preference.

Download Anaconda:

1. Search for "Anaconda Download" in a web browser (for example, Google Chrome).

2. On the Anaconda products page, scroll down and click the 64-bit installer.

3. Run the downloaded .exe file and install Anaconda.

4. Select the "Just Me" option.

5. Anaconda is installed under the C:\Users path.

6. After installation, open Anaconda Navigator.

7. Launch Jupyter Notebook for typing programs.

8. Click "New".

9. Click the Run button to get the output.

10. Try the sample programs.

Result :
Thus the NumPy, SciPy, Jupyter, Statsmodels and Pandas packages are downloaded and
installed successfully.
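
One way to confirm the installation from a notebook cell is to import each package and print its version. The sketch below is only an illustration: the `packages` list and the `versions` dictionary are names chosen here, and a package that is missing is reported instead of stopping the cell.

```python
# Try to import each package and report its version.
# `packages` and `versions` are illustrative names, not part of the manual.
import importlib

packages = ["numpy", "scipy", "statsmodels", "pandas"]
versions = {}
for name in packages:
    try:
        module = importlib.import_module(name)
        versions[name] = getattr(module, "__version__", "unknown")
    except ImportError:
        versions[name] = "not installed"

for name, version in versions.items():
    print(name, ":", version)
```

If any line shows "not installed", the package can usually be added from Anaconda Navigator or with `conda install <package>`.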

Ex.No:2 Working with Numpy arrays

Aim:
To work with Numpy array using Jupyter Notebook.

Program 1:
# Python program to demonstrate
# basic array characteristics
import numpy as np

# Creating array object

arr = np.array( [[ 1, 2, 3],
[ 4, 2, 5]] )

# Printing type of arr object

print("Array is of type: ", type(arr))

# Printing array dimensions (axes)

print("No. of dimensions: ", arr.ndim)

# Printing shape of array

print("Shape of array: ", arr.shape)

# Printing size (total number of elements) of array

print("Size of array: ", arr.size)

# Printing type of elements in array

print("Array stores elements of type: ", arr.dtype)

Output:
Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int64
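
The attributes printed above are related to each other: size is the product of the entries of shape, and the memory used is size times the bytes per element. A minimal sketch, assuming the same 2x3 array (the int32 array `arr32` is added here only to show an explicit dtype):

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 2, 5]])

# size is the product of the entries of shape
print(arr.size == arr.shape[0] * arr.shape[1])  # True

# nbytes = number of elements * bytes per element
print("Bytes per element:", arr.itemsize)
print("Total bytes:", arr.nbytes)

# The default integer dtype depends on the platform,
# so request it explicitly when it matters
arr32 = np.array([[1, 2, 3], [4, 2, 5]], dtype=np.int32)
print(arr32.dtype, arr32.nbytes)  # int32 24
```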

Program 2:
# Python program to demonstrate
# array creation techniques
import numpy as np

# Creating array from list with type float

a = np.array([[1, 2, 4], [5, 8, 7]], dtype = 'float')
print ("Array created using passed list:\n", a)

# Creating array from tuple

b = np.array((1 , 3, 2))
print ("\nArray created using passed tuple:\n", b)

# Creating a 3X4 array with all zeros

c = np.zeros((3, 4))
print ("\nAn array initialized with all zeros:\n", c)

# Create a constant value array of complex type

d = np.full((3, 3), 6, dtype = 'complex')
print ("\nAn array initialized with all 6s. "
"Array type is complex:\n", d)

# Create an array with random values

e = np.random.random((2, 2))
print ("\nA random array:\n", e)

# Create a sequence of integers

# from 0 to 30 with steps of 5
f = np.arange(0, 30, 5)
print ("\nA sequential array with steps of 5:\n", f)

# Create a sequence of 10 values in range 0 to 5

g = np.linspace(0, 5, 10)
print ("\nA sequential array with 10 values between "
"0 and 5:\n", g)

# Reshaping 3X4 array to 2X2X3 array

arr = np.array([[1, 2, 3, 4],
[5, 2, 4, 2],
[1, 2, 0, 1]])

newarr = arr.reshape(2, 2, 3)

print ("\nOriginal array:\n", arr)

print ("Reshaped array:\n", newarr)

# Flatten array
arr = np.array([[1, 2, 3], [4, 5, 6]])
flarr = arr.flatten()

print ("\nOriginal array:\n", arr)

print ("Flattened array:\n", flarr)

Output:
Array created using passed list:
[[ 1. 2. 4.]
[ 5. 8. 7.]]

Array created using passed tuple:

[1 3 2]

An array initialized with all zeros:

[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]

An array initialized with all 6s. Array type is complex:

[[ 6.+0.j 6.+0.j 6.+0.j]
[ 6.+0.j 6.+0.j 6.+0.j]
[ 6.+0.j 6.+0.j 6.+0.j]]

A random array:
[[ 0.46829566 0.67079389]
[ 0.09079849 0.95410464]]

A sequential array with steps of 5:

[ 0 5 10 15 20 25]

A sequential array with 10 values between 0 and 5:

[ 0. 0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
3.33333333 3.88888889 4.44444444 5. ]

Original array:
[[1 2 3 4]
[5 2 4 2]
[1 2 0 1]]
Reshaped array:
[[[1 2 3]
[4 5 2]]

[[4 2 1]
[2 0 1]]]

Original array:
[[1 2 3]
[4 5 6]]
Flattened array:
[1 2 3 4 5 6]
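
Two details of reshaping and flattening are worth trying out: reshape() accepts -1 for one dimension that NumPy should work out itself, and flatten() returns a copy while ravel() returns a view where it can. A short sketch using the same 3x4 array (the variable names are illustrative):

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 2, 4, 2],
                [1, 2, 0, 1]])

# -1 asks NumPy to infer that dimension: 12 elements / (2 * 2) = 3
inferred = arr.reshape(2, 2, -1)
print(inferred.shape)  # (2, 2, 3)

# flatten() always returns a copy, so the original is untouched
flat_copy = arr.flatten()
flat_copy[0] = 99
print(arr[0, 0])  # 1

# ravel() returns a view when possible, so writes reach the original
flat_view = arr.ravel()
flat_view[0] = 99
print(arr[0, 0])  # 99
```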

Program 3:
# Python program to demonstrate
# indexing in numpy
import numpy as np

# An exemplar array
arr = np.array([[-1, 2, 0, 4],
[4, -0.5, 6, 0],
[2.6, 0, 7, 8],
[3, -7, 4, 2.0]])

# Slicing array
temp = arr[:2, ::2]
print ("Array with first 2 rows and alternate "
"columns(0 and 2):\n", temp)

# Integer array indexing example

temp = arr[[0, 1, 2, 3], [3, 2, 1, 0]]
print ("\nElements at indices (0, 3), (1, 2), (2, 1), "
"(3, 0):\n", temp)

# boolean array indexing example

cond = arr > 0 # cond is a boolean array
temp = arr[cond]
print ("\nElements greater than 0:\n", temp)

Output:

Array with first 2 rows and alternate columns(0 and 2):

[[-1. 0.]
[ 4. 6.]]

Elements at indices (0, 3), (1, 2), (2, 1), (3, 0):

[ 4. 6. 0. 3.]

Elements greater than 0:

[ 2. 4. 4. 6. 2.6 7. 8. 3. 4. 2. ]
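
Boolean indexing can be taken a step further by combining conditions and by using np.where(), which keeps the shape of the array. The following is a small sketch on the same exemplar array; the names `between` and `clipped` are chosen here for illustration.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

# Combine conditions with & (and) or | (or); each condition needs parentheses
between = arr[(arr > 0) & (arr < 5)]
print(between)

# np.where(condition, x, y) keeps the shape: negative entries become 0
clipped = np.where(arr < 0, 0, arr)
print(clipped)

# Assigning through a boolean mask modifies the array in place
arr[arr < 0] = 0
print(np.array_equal(arr, clipped))  # True
```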

Program 4:

# Python program to demonstrate

# basic operations on single array
import numpy as np

a = np.array([1, 2, 5, 3])

# add 1 to every element

print ("Adding 1 to every element:", a+1)

# subtract 3 from each element

print ("Subtracting 3 from each element:", a-3)

# multiply each element by 10

print ("Multiplying each element by 10:", a*10)

# square each element

print ("Squaring each element:", a**2)

# modify existing array

a *= 2
print ("Doubled each element of original array:", a)

# transpose of array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])

print ("\nOriginal array:\n", a)

print ("Transpose of array:\n", a.T)

Output :
Adding 1 to every element: [2 3 6 4]
Subtracting 3 from each element: [-2 -1 2 0]
Multiplying each element by 10: [10 20 50 30]
Squaring each element: [ 1 4 25 9]
Doubled each element of original array: [ 2 4 10 6]

Original array:
[[1 2 3]
[3 4 5]
[9 6 0]]
Transpose of array:
[[1 3 9]
[2 4 6]
[3 5 0]]
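
Program 4 combines an array with a single number. The same operators also work between two arrays, and a smaller array is broadcast over a larger one when the shapes are compatible. A minimal sketch, assuming the 3x3 array from the transpose example and a hypothetical 1-D array `row`:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [3, 4, 5],
              [9, 6, 0]])
row = np.array([10, 20, 30])

# Arrays of the same shape combine element by element
print(a + a.T)

# A 1-D array of length 3 is broadcast across each row of the 3x3 array
shifted = a + row
print(shifted)

# * is element-wise; @ is the matrix product
elementwise = a * a
matrix_product = a @ a
print(elementwise[0], matrix_product[0])  # [1 4 9] [34 28 13]
```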

Program 5:
# Python program to demonstrate sorting in numpy

import numpy as np

a = np.array([[1, 4, 2],
[3, 4, 6],
[0, -1, 5]])

# sorted array
print ("Array elements in sorted order:\n",
np.sort(a, axis = None))

# sort array row-wise

print ("Row-wise sorted array:\n",
np.sort(a, axis = 1))

# specify sort algorithm

print ("Column wise sort by applying merge-sort:\n",
np.sort(a, axis = 0, kind = 'mergesort'))

# Example to show sorting of structured array

# set alias names for dtypes
# 'U10' stores names as text; with 'S10', Python 3 prints them as bytes (b'Ajay')
dtypes = [('name', 'U10'), ('grad_year', int), ('cgpa', float)]

# Values to be put in array
values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7),
('Pankaj', 2008, 7.9), ('Aakash', 2009, 9.0)]

# Creating array
arr = np.array(values, dtype = dtypes)
print ("\nArray sorted by names:\n",
np.sort(arr, order = 'name'))

print ("Array sorted by graduation year and then cgpa:\n",

np.sort(arr, order = ['grad_year', 'cgpa']))

Output:
Array elements in sorted order:
[-1 0 1 2 3 4 4 5 6]
Row-wise sorted array:
[[ 1 2 4]
[ 3 4 6]
[-1 0 5]]
Column wise sort by applying merge-sort:
[[ 0 -1 2]
[ 1 4 5]
[ 3 4 6]]

Array sorted by names:

[('Aakash', 2009, 9.0) ('Ajay', 2008, 8.7) ('Hrithik', 2009, 8.5)
('Pankaj', 2008, 7.9)]
Array sorted by graduation year and then cgpa:
[('Pankaj', 2008, 7.9) ('Ajay', 2008, 8.7) ('Hrithik', 2009, 8.5)
('Aakash', 2009, 9.0)]
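
When one array has to be ordered by the values of another, np.argsort() is a common alternative to a structured array. It returns the positions that would sort the array. The sketch below reuses the names and CGPA values from the example above as two separate arrays, which is an assumption made here for illustration.

```python
import numpy as np

names = np.array(['Hrithik', 'Ajay', 'Pankaj', 'Aakash'])
cgpa = np.array([8.5, 8.7, 7.9, 9.0])

# argsort gives the positions that would sort cgpa in ascending order
order = np.argsort(cgpa)
print(order)         # [2 0 1 3]
print(names[order])  # names from lowest to highest cgpa

# Reverse the order for a descending ranking
top_first = names[order[::-1]]
print(top_first[0])  # Aakash
```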

Result:
Thus the programs using NumPy were executed successfully.

Ex.No:3 Working with Pandas Data Frames

Aim:
To work with Pandas data frames using Jupyter Notebook.

Pandas Data Frames:

Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". It was
created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze large data sets and draw conclusions based on statistical theories.

Pandas can clean messy data sets and make them readable and relevant.

Relevant data is very important in data science.

Example 1:

import pandas

mydataset = {
'cars': ["BMW", "Volvo", "Ford"],
'passings': [3, 7, 2]
}

myvar = pandas.DataFrame(mydataset)

print(myvar)

Create Labels

import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"])

print(myvar)

What is a DataFrame?

A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array, or a table
with rows and columns.

import pandas as pd

data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}

#load data into a DataFrame object:

df = pd.DataFrame(data)

print(df)
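
Once a DataFrame exists, its columns, rows and single cells can be picked out. The sketch below repeats the calories/duration data and adds hypothetical row labels ("day1" to "day3") so that label-based and position-based access can be compared.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# Hypothetical row labels, added here only for illustration
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# One column is a Series
print(df["calories"])

# loc selects rows by label, iloc by integer position
print(df.loc["day2"])
print(df.iloc[0])

# A single cell: row label, then column name
print(df.loc["day3", "duration"])  # 45
```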

Read CSV Files

A simple way to store big data sets is to use CSV files (comma separated files).

CSV files contain plain text and are a well-known format that can be read by everyone,
including Pandas.

import pandas as pd

# Raw string (r'...') so that the backslashes in the Windows path are not read as escapes
df = pd.read_csv(r'C:\Users\New\Desktop\AD8302\data.csv')

print(df.to_string())
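
To try read_csv() without a file on disk, the CSV text can be supplied from memory through io.StringIO. The content below is hypothetical and only stands in for data.csv; note how the empty field in the last row is read as a missing value.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = """Duration,Pulse,Calories
60,110,409.1
45,117,282.4
30,103,
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                       # (3, 3)
print(df.dtypes)
print(df["Calories"].isnull().sum())  # 1
```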

Result:
Thus the programs using Data Frames were executed successfully.

Ex.No:4 Reading data from text files, Excel and the web and
exploring various commands for doing descriptive analysis
on the Iris data set.

Aim:
To read data from text files, Excel and the web and explore various commands for
doing descriptive analysis on the Iris data set.

Exploratory Data Analysis

Exploratory Data Analysis (EDA) is a technique for analyzing data using visual
techniques. With this technique, we can get detailed information about the statistical summary
of the data.

Iris Dataset
The Iris dataset is considered the "Hello World" of data science. It contains five columns, namely
Petal Length, Petal Width, Sepal Length, Sepal Width, and Species Type. Iris is a flowering
plant; researchers have measured various features of different iris flowers and recorded
them digitally.

Read a web dataset:

import pandas as pd

# Reading the CSV file

df = pd.read_csv("Iris.csv")

# Printing top 5 rows

df.head()

Output:

I. Getting Information about the Dataset

1. Shape: use the shape attribute to get the shape of the dataset

df.shape

Output:

(150, 6)
The dataframe contains 6 columns and 150 rows

2. Info(): the columns and their data types

df.info()
Output:

3. Describe(): The describe() function applies basic statistical computations on the dataset, like
extreme values, count of data points, standard deviation, etc. Any missing value or NaN value
is automatically skipped. The describe() function gives a good picture of the distribution of the data.

df.describe()
Output:

II. Checking Missing Values

1. Isnull(): We will check whether our data contains any missing values. Missing values can
occur when no information is provided for one or more items or for a whole unit.

df.isnull().sum()

Output:

2. Checking Duplicates:
The Pandas drop_duplicates() method removes duplicate rows from the data frame. With
subset="class", rows are compared on the class column only, so one row is kept per class.

data = df.drop_duplicates(subset ="class")

Output:

3. Count:
The value_counts() function returns a Series containing the counts of
unique values.

df.value_counts("class")

Output:
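
The difference between full-row duplicates and duplicates on a single column is easiest to see on a few rows. The frame below is hypothetical; it only borrows the column names used with the Iris file in this exercise.

```python
import pandas as pd

# A small hypothetical frame using the same column names as the Iris file
df = pd.DataFrame({
    "sepallength": [5.1, 4.9, 5.1, 7.0, 6.3],
    "sepalwidth":  [3.5, 3.0, 3.5, 3.2, 3.3],
    "class": ["Iris-setosa", "Iris-setosa", "Iris-setosa",
              "Iris-versicolor", "Iris-virginica"],
})

# Rows that repeat an earlier row in every column
print(df.duplicated().sum())                     # 1

# Dropping full-row duplicates keeps 4 of the 5 rows
print(df.drop_duplicates().shape)                # (4, 3)

# Restricting the comparison to "class" keeps one row per class
print(df.drop_duplicates(subset="class").shape)  # (3, 3)

# Number of rows in each class
counts = df["class"].value_counts()
print(counts["Iris-setosa"])                     # 3
```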

III. Data Visualization: We will use the Matplotlib and Seaborn libraries for data visualization.
Matplotlib is an easy-to-use visualization library in Python. It is built on NumPy
arrays, designed to work with the broader SciPy stack, and offers several plot types such as
line, bar, scatter, histogram, etc.
Seaborn is a library mostly used for statistical plotting in Python. It is built on top of
Matplotlib and provides default styles and color palettes that make statistical plots
more attractive.

Data visualization using the Matplotlib and Seaborn libraries:

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt
sns.countplot(x='class', data=df, )
plt.show()

Output:

Relation between variables

Hue: The hue parameter denotes which column decides the color of the points.
Legend(): A legend is an area describing the elements of the graph.
Bounding Box: bbox_to_anchor=(x0, y0) gives an anchor point in axes coordinates. The corner
of the legend named by the loc parameter is placed at that point, so (1, 1) with loc=2 puts
the legend just outside the upper right corner of the plot.
Loc: The loc argument of legend() specifies the location of the legend. The default
is loc="best", which picks the position that overlaps the plotted data the least; loc=2 means "upper left".

Example 1: Comparing Sepal Length and Sepal Width

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(x='sepallength', y='sepalwidth',
hue='class', data=df, )

# Placing Legend outside the Figure

plt.legend(bbox_to_anchor=(1, 1), loc=2)

plt.show()

Output:

From the above plot, we can infer that:
• Species Setosa has smaller sepal lengths but larger sepal widths.
• Species Versicolor lies in the middle of the other two species in terms of sepal length and
width.
• Species Virginica has larger sepal lengths but smaller sepal widths.

Example 2: Comparing Petal Length and Petal Width

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

sns.scatterplot(x='petallength', y='petalwidth',
hue='class', data=df, )

# Placing Legend outside the Figure

plt.legend(bbox_to_anchor=(1, 1), loc=2)

plt.show()

Output:

From the above plot, we can infer that:

• Species Setosa has the smallest petal lengths and widths.
• Species Versicolor lies in the middle of the other two species in terms of petal length and
width.
• Species Virginica has the largest petal lengths and widths.

Let's plot the relationships between all the columns using a pairplot. It can be used for multivariate
analysis.
Example:
# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

# errors='ignore' skips the drop when the file has no 'Id' column
sns.pairplot(df.drop(columns=['Id'], errors='ignore'),
             hue='class', height=2)

Histograms

Histograms allow seeing the distribution of data for various columns. It can be used for uni as
well as bi-variate analysis.
Example:
import seaborn as sns
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(10,10))

axes[0,0].set_title("Sepal Length")
axes[0,0].hist(df['sepallength'], bins=7)

axes[0,1].set_title("Sepal Width")
axes[0,1].hist(df['sepalwidth'], bins=5);

axes[1,0].set_title("Petal Length")
axes[1,0].hist(df['petallength'], bins=6);

axes[1,1].set_title("Petal Width")
axes[1,1].hist(df['petalwidth'], bins=6);

Output:

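Handling Correlation

Pandas dataframe.corr() is used to find the pairwise correlation of all columns in the
dataframe. Any NA values are automatically excluded. Non-numeric columns have to be
left out, which numeric_only=True does (recent pandas versions raise an error otherwise).

# Note: subset="class" keeps only one row per class, so the correlation
# below is computed on three rows
data = df.drop_duplicates(subset ="class",)

data.corr(method='pearson', numeric_only=True)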

Output:

Box Plots

We can use boxplots to see how a numerical value is distributed across the categories of a
categorical variable.
Example:
# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

def graph(y):
    sns.boxplot(x="class", y=y, data=df)

plt.figure(figsize=(10,10))

# Adding the subplot at the specified

# grid position
plt.subplot(221)
graph('sepallength')

plt.subplot(222)
graph('sepalwidth')

plt.subplot(223)
graph('petallength')

plt.subplot(224)
graph('petalwidth')

plt.show()

Output:

Handling Outliers
An outlier is a data item/object that deviates significantly from the rest of the (so-called
normal) objects. Outliers can be caused by measurement or execution errors. The analysis for outlier
detection is referred to as outlier mining. There are many ways to detect outliers, and
removing them works the same way as removing any other data item from a pandas dataframe.
Let's consider the iris dataset and plot the boxplot for the sepalwidth column.
Example:
# importing packages
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset

df = pd.read_csv('iris_csv.csv')

sns.boxplot(x='sepalwidth', data=df)

Output:

Removing Outliers

To remove outliers, we follow the same process as removing any entry from the
dataset using its exact position, because every detection method ends with
a list of the data items that satisfy the outlier definition of
the method used.
Example: We will detect the outliers using the IQR and then remove them. We will also
draw the boxplot to see whether the outliers have been removed.
import pandas as pd
import seaborn as sns
import numpy as np

# Load the dataset

df = pd.read_csv('iris_csv.csv')

# IQR
# (NumPy 1.22 and later use method=; older versions call this argument interpolation=)
Q1 = np.percentile(df['sepalwidth'], 25,
method = 'midpoint')

Q3 = np.percentile(df['sepalwidth'], 75,
method = 'midpoint')
IQR = Q3 - Q1

print("Old Shape: ", df.shape)

# Upper bound
upper = np.where(df['sepalwidth'] >= (Q3+1.5*IQR))

# Lower bound
lower = np.where(df['sepalwidth'] <= (Q1-1.5*IQR))

# Removing the Outliers

df.drop(upper[0], inplace = True)
df.drop(lower[0], inplace = True)

print("New Shape: ", df.shape)

sns.boxplot(x='sepalwidth', data=df)

Output:
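
The same IQR rule can also be written with a boolean mask, which avoids relying on row positions matching index labels. The sketch below uses a handful of hypothetical sepal widths with two planted outliers, and pandas' quantile() with its default linear interpolation instead of the midpoint method used above, so the fences may differ slightly from those of the program.

```python
import numpy as np
import pandas as pd

# Hypothetical sepal widths; 1.0 and 6.0 are planted as outliers
df = pd.DataFrame({"sepalwidth": [3.0, 3.2, 2.9, 3.1, 3.4, 2.8, 3.0, 1.0, 6.0]})

q1 = df["sepalwidth"].quantile(0.25)
q3 = df["sepalwidth"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep rows strictly inside the fences
inside = (df["sepalwidth"] > lower) & (df["sepalwidth"] < upper)
cleaned = df[inside]

print("Old shape:", df.shape)       # (9, 1)
print("New shape:", cleaned.shape)  # (7, 1)
```

Because `cleaned` is a new DataFrame, the original `df` stays available for comparison.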

Ex.No:4b: Various Commands in Data Frame:

1. pandas.DataFrame

pandas.DataFrame() is used to create a DataFrame in pandas. One way to use this
function is to form a DataFrame column-wise by passing a dictionary into
the pandas.DataFrame() function. Here, each key is a column name, while its list holds the values of that column, one per row:

import pandas
DataFrame = pandas.DataFrame({"A" : [1, 3, 4], "B": [5, 9, 12]})
print(DataFrame)

A B
0 1 5
1 3 9
2 4 12

2. Read From and Write to Excel or CSV in pandas

You can read or write to Excel or CSV files with pandas.

import pandas as pd
df = pd.read_csv("iris_csv.csv")
print(df)

3. Get the Mean, Median, and Mode

We can also compute the central tendencies of each column in a DataFrame using
pandas:

DataFrame.mean()

# numeric_only=True skips text columns such as class
df.median(numeric_only=True)

df.mode()

4. DataFrame.transform

The DataFrame.transform() method applies a function to the values of a DataFrame and returns
the result. It accepts a function as an argument.

data = df.transform(lambda y: y*3)

print(data)

5. DataFrame.isnull

This function returns a Boolean DataFrame in which every null value is flagged as True. Adding sum() counts the nulls in each column:

df.isnull().sum()

sepallength 0
sepalwidth 0
petallength 0
petalwidth 0
class 0
dtype: int64
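
Since the Iris file has no missing values, the effect of isnull() is easier to see on a small frame that does. The data below is hypothetical; the mean-based fill at the end is one common way to handle numeric gaps, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing entries
df = pd.DataFrame({
    "sepallength": [5.1, np.nan, 4.7],
    "class": ["Iris-setosa", "Iris-setosa", None],
})

mask = df.isnull()              # Boolean DataFrame, True where a value is missing
print(mask)

per_column = mask.sum()         # missing values in each column
print(per_column)

rows_with_null = mask.any(axis=1).sum()  # rows containing at least one null
print(rows_with_null)           # 2

# Fill numeric gaps with the column mean instead of dropping rows
filled = df["sepallength"].fillna(df["sepallength"].mean())
print(filled.tolist())
```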

6. DataFrame.info

It returns a summary of the non-missing values for each column:

df.info()

df.describe()

7. DataFrame.loc

loc is used to find the elements at a particular index label. To view all items in the third row (label 2), for
instance:

data=df.loc[2]

print(data)

8. DataFrame.max, min

Getting the maximum and minimum values using pandas is easy:

df.min()

df.max()

9. DataFrame.astype

The astype() function changes the data type of a particular column or DataFrame.

DataFrame.astype(str)

10. DataFrame.insert

The insert() function is used to add a new column to a DataFrame. It accepts three keywords:
the column name, a list of its data, and its location, which is a column index.

DataFrame.insert(column = 'C', value = [3, 4, 6], loc=0)

print(DataFrame)

11. DataFrame.sum

The sum() function in pandas returns the sum of the values in each column, while cumsum()
returns the running (cumulative) total down each column:

DataFrame.sum()

DataFrame.cumsum()
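
The difference between a column total and a running total can be checked on the small A/B frame from item 1. In this sketch the frame is rebuilt under the name `frame`, chosen here so that it does not depend on earlier cells.

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 3, 4], "B": [5, 9, 12]})

# sum() collapses each column to a single total
totals = frame.sum()
print(totals["A"], totals["B"])    # 8 26

# cumsum() keeps the shape and accumulates down the rows
running = frame.cumsum()
print(running["A"].tolist())       # [1, 4, 8]

# axis=1 sums across the columns of each row instead
print(frame.sum(axis=1).tolist())  # [6, 12, 16]
```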

12. Correlation:

Want to find the correlation between integer or float columns? pandas can help you
achieve that using the corr() function.

DataFrame.corr()
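
When a DataFrame also holds a text column, recent pandas versions need numeric_only=True for corr() to skip it. The sketch below uses a few hypothetical measurements with Iris-style column names to show a strong positive and a negative correlation.

```python
import pandas as pd

# Hypothetical measurements: petalwidth grows with petallength,
# sepalwidth shrinks as petallength grows
df = pd.DataFrame({
    "petallength": [1.4, 1.5, 4.5, 4.7, 6.0],
    "petalwidth":  [0.2, 0.2, 1.5, 1.4, 2.5],
    "sepalwidth":  [3.5, 3.4, 3.0, 3.1, 2.7],
    "class": ["setosa", "setosa", "versicolor", "versicolor", "virginica"],
})

# numeric_only=True leaves the text column out of the calculation
corr = df.corr(method="pearson", numeric_only=True)
print(corr.round(2))

# One pair can also be computed directly from two Series
r = df["petallength"].corr(df["petalwidth"])
print(round(r, 2))
```

Values close to +1 or -1 indicate a strong linear relationship; the diagonal is always 1 because every column is perfectly correlated with itself.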

13. DataFrame.add

The add() function is used to add a specific number to each value in a DataFrame or column. It
operates element-wise on every item.

DataFrame['A'].add(20)

14. DataFrame.sub

Like the addition function, you can also subtract a number from each value in a
DataFrame or specific column:

DataFrame['A'].sub(10)

15. DataFrame.mul

This is a multiplication version of the addition function of pandas:

DataFrame['A'].mul(10)

16. DataFrame.div

We can divide each data point in a column or DataFrame by a specific number:

DataFrame['A'].div(2)

17. DataFrame.std

Using the std() function, pandas also lets you compute the standard deviation for each
column in a DataFrame. It works by iterating through each column in a dataset and calculating
the standard deviation for each:

DataFrame.std()

18. DataFrame.melt

The melt() function in pandas turns the columns of a DataFrame into individual rows. It's
like exposing the anatomy of a DataFrame: each row of the result holds a column name
(variable) and one of its values (value).

newDataFrame = DataFrame.melt()

print(newDataFrame)
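
The shape of the melted result shows what happens: every value of every column becomes a row of its own. A small sketch on the A/B frame from item 1 (rebuilt here as `frame`); `id_vars` and `value_vars` are optional melt() parameters that keep one column as an identifier.

```python
import pandas as pd

frame = pd.DataFrame({"A": [1, 3, 4], "B": [5, 9, 12]})

long_form = frame.melt()
print(long_form)

# Two columns with three values each give six rows
print(long_form.shape)             # (6, 2)
print(long_form.columns.tolist())  # ['variable', 'value']

# id_vars keeps a column as an identifier instead of melting it
by_a = frame.melt(id_vars="A", value_vars="B")
print(by_a.shape)                  # (3, 3)
```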

19. DataFrame.pop

This function lets you remove a specified column from a pandas DataFrame. It accepts
an item keyword, returns the popped column, and separates it from the rest of the DataFrame:

DataFrame.pop(item= 'B')

print(DataFrame)

20. DataFrame.dropna

The dropna() method removes all rows containing null values:

DataFrame.dropna(inplace = True)

print(DataFrame)

Result:

Thus the various commands in Data Frame were executed successfully.

Ex.No:5 Univariate analysis: Frequency, Mean, Median, Mode,
Variance, Standard Deviation, Skewness and Kurtosis

Aim :
To find Frequency, Mean, Median, Mode, Variance, Standard Deviation,
Skewness and Kurtosis using diabetes dataset.

Read Diabetes data set:

import pandas as pd
df = pd.read_csv('diabetes.csv')
df.head()
df.shape
Output:
(768, 9)

df.dtypes
Output:
Pregnancies int64
Glucose int64
BloodPressure int64
SkinThickness int64
Insulin int64
BMI float64
DiabetesPedigreeFunction float64
Age int64
Outcome int64
dtype: object

df['Outcome']=df['Outcome'].astype('bool')
df.dtypes['Outcome']
Output:
dtype('bool')

df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns(total 9 columns):
Pregnancies 768 non-null int64
Glucose 768 non-null int64
BloodPressure 768 non-null int64
SkinThickness 768 non-null int64
Insulin 768 non-null int64
BMI 768 non-null float64
DiabetesPedigreeFunction 768 non-null float64
Age 768 non-null int64
Outcome 768 non-null bool
dtypes: bool(1), float64(2), int64(6)
memory usage: 48.9 KB

df.describe().T
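
describe() covers count, mean, standard deviation and quartiles, but not the mode, variance, skewness or kurtosis named in the aim. These are available as separate pandas methods. The sketch below applies them to a short hypothetical Series standing in for one column such as df['Glucose'].

```python
import pandas as pd

# Hypothetical glucose readings standing in for df['Glucose']
glucose = pd.Series([85, 89, 99, 99, 117, 137, 183, 197], name="Glucose")

print("Frequency:\n", glucose.value_counts().head())
print("Mean:", glucose.mean())        # 125.75
print("Median:", glucose.median())    # 108.0
print("Mode:", glucose.mode().tolist())
print("Variance:", glucose.var())     # sample variance (ddof=1)
print("Std dev:", glucose.std())      # sample standard deviation
print("Skewness:", glucose.skew())    # > 0 means a longer right tail
print("Kurtosis:", glucose.kurt())    # excess kurtosis (normal = 0)
```

The same calls work on a whole DataFrame, for example df.skew(numeric_only=True), and return one value per column.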

Pregnancy Proportion:
import numpy as np

# Frequency of each Pregnancies value and its share of all records
preg_proportion = np.array(df['Pregnancies'].value_counts())
preg_month = np.array(df['Pregnancies'].value_counts().index)
preg_proportion_perc = np.array(np.round(preg_proportion/sum(preg_proportion),3)*100,dtype=int)
preg = pd.DataFrame({'month': preg_month,
                     'count_of_preg_prop': preg_proportion,
                     'percentage_proportion': preg_proportion_perc})
preg.set_index(['month'],inplace=True)
preg.head(10)

import seaborn as sns
import matplotlib.pyplot as plt

fig,axes = plt.subplots(nrows=3,ncols=2,dpi=120,figsize = (8,6))

# Count of each Pregnancies value
plot00=sns.countplot(x='Pregnancies',data=df,ax=axes[0][0],color='green')
axes[0][0].set_title('Count',fontdict={'fontsize':8})
axes[0][0].set_xlabel('Month of Preg.',fontdict={'fontsize':7})
axes[0][0].set_ylabel('Count',fontdict={'fontsize':7})
plt.tight_layout()

# Count of each Pregnancies value, split by Outcome
plot01=sns.countplot(x='Pregnancies',data=df,hue='Outcome',ax=axes[0][1])
axes[0][1].set_title('Diab. VS Non-Diab.',fontdict={'fontsize':8})
axes[0][1].set_xlabel('Month of Preg.',fontdict={'fontsize':7})
axes[0][1].set_ylabel('Count',fontdict={'fontsize':7})
plot01.axes.legend(loc=1)
plt.setp(axes[0][1].get_legend().get_texts(), fontsize='6')
plt.setp(axes[0][1].get_legend().get_title(), fontsize='6')
plt.tight_layout()

# Distribution of Pregnancies
# (distplot is deprecated in recent seaborn releases; histplot(..., kde=True) is its suggested replacement)
plot10 = sns.distplot(df['Pregnancies'],ax=axes[1][0])
axes[1][0].set_title('Pregnancies Distribution',fontdict={'fontsize':8})
axes[1][0].set_xlabel('Pregnancy Class',fontdict={'fontsize':7})
axes[1][0].set_ylabel('Freq/Dist',fontdict={'fontsize':7})
plt.tight_layout()

# Histograms of Pregnancies for non-diabetic and diabetic patients
plot11 = df[df['Outcome']==False]['Pregnancies'].plot.hist(ax=axes[1][1],label='Non-Diab.')
plot11_2=df[df['Outcome']==True]['Pregnancies'].plot.hist(ax=axes[1][1],label='Diab.')
axes[1][1].set_title('Diab. VS Non-Diab.',fontdict={'fontsize':8})
axes[1][1].set_xlabel('Pregnancy Class',fontdict={'fontsize':7})
axes[1][1].set_ylabel('Freq/Dist',fontdict={'fontsize':7})
plot11.axes.legend(loc=1)
plt.setp(axes[1][1].get_legend().get_texts(), fontsize='6') # for legend text
plt.setp(axes[1][1].get_legend().get_title(), fontsize='6') # for legend title
plt.tight_layout()

# Box plot (five point summary) of Pregnancies
plot20 = sns.boxplot(y=df['Pregnancies'],ax=axes[2][0])
axes[2][0].set_title('Pregnancies',fontdict={'fontsize':8})
axes[2][0].set_xlabel('Pregnancy',fontdict={'fontsize':7})
axes[2][0].set_ylabel('Five Point Summary',fontdict={'fontsize':7})
plt.tight_layout()

# Box plot of Pregnancies for each Outcome
plot21 = sns.boxplot(x='Outcome',y='Pregnancies',data=df,ax=axes[2][1])
axes[2][1].set_title('Diab. VS Non-Diab.',fontdict={'fontsize':8})
axes[2][1].set_xlabel('Pregnancy',fontdict={'fontsize':7})
axes[2][1].set_ylabel('Five Point Summary',fontdict={'fontsize':7})
plt.xticks(ticks=[0,1],labels=['Non-Diab.','Diab.'],fontsize=7)
plt.tight_layout()
plt.show()

Understanding Distribution
The distribution of Pregnancies in the data is unimodal and skewed to the right, centered at
about 1, with most of the data between 0 and 15, a range of roughly 15, and outliers
present on the higher end.

Glucose Variable
df.Glucose.describe()
Output:
count 768.000000
mean 120.894531
std 31.972618
min 0.000000
25% 99.000000
50% 117.000000
75% 140.250000
max 199.000000
Name: Glucose, dtype: float64

#sns.set_style('darkgrid')
fig,axes = plt.subplots(nrows=2,ncols=2,dpi=120,figsize = (8,6))

# Distribution of Glucose
plot00=sns.distplot(df['Glucose'],ax=axes[0][0],color='green')
#axes[0][0].yaxis.set_major_formatter(FormatStrFormatter('%.3f'))
axes[0][0].set_title('Distribution of Glucose',fontdict={'fontsize':8})
axes[0][0].set_xlabel('Glucose Class',fontdict={'fontsize':7})
axes[0][0].set_ylabel('Count/Dist.',fontdict={'fontsize':7})
plt.tight_layout()

# Distribution of Glucose for non-diabetic and diabetic patients
plot01=sns.distplot(df[df['Outcome']==False]['Glucose'],ax=axes[0][1],color='green',label='Non Diab.')
sns.distplot(df[df.Outcome==True]['Glucose'],ax=axes[0][1],color='red',label='Diab')
axes[0][1].set_title('Distribution of Glucose',fontdict={'fontsize':8})
axes[0][1].set_xlabel('Glucose Class',fontdict={'fontsize':7})
axes[0][1].set_ylabel('Count/Dist.',fontdict={'fontsize':7})
#axes[0][1].yaxis.set_major_formatter(FormatStrFormatter('%.3f'))
plot01.axes.legend(loc=1)
plt.setp(axes[0][1].get_legend().get_texts(), fontsize='6')
plt.setp(axes[0][1].get_legend().get_title(), fontsize='6')
plt.tight_layout()

# Box plot (five point summary) of Glucose
plot10=sns.boxplot(y=df['Glucose'],ax=axes[1][0])
axes[1][0].set_title('Numerical Summary',fontdict={'fontsize':8})
axes[1][0].set_xlabel('Glucose',fontdict={'fontsize':7})
axes[1][0].set_ylabel(r'Five Point Summary(Glucose)',fontdict={'fontsize':7})
plt.tight_layout()

# Box plot of Glucose for each Outcome
plot11=sns.boxplot(x='Outcome',y='Glucose',data=df,ax=axes[1][1])
axes[1][1].set_title(r'Numerical Summary (Outcome)',fontdict={'fontsize':8})
axes[1][1].set_ylabel(r'Five Point Summary(Glucose)',fontdict={'fontsize':7})
plt.xticks(ticks=[0,1],labels=['Non-Diab.','Diab.'],fontsize=7)
axes[1][1].set_xlabel('Category',fontdict={'fontsize':7})
plt.tight_layout()

plt.show()

Understanding Distribution
The distribution of Glucose level among patients is unimodal and roughly bell shaped,
centered at about 115, with most of the data between 90 and 140, a range of roughly 150, and
outliers present on the lower end (Glucose == 0).

Verify the distribution by keeping only the non-zero entries of Glucose:

fig,axes = plt.subplots(nrows=1,ncols=2,dpi=120,figsize = (8,4))

plot0=sns.distplot(df[df['Glucose']!=0]['Glucose'],ax=axes[0],color='green')
#axes[0].yaxis.set_major_formatter(FormatStrFormatter('%.3f'))
axes[0].set_title('Distribution of Glucose',fontdict={'fontsize':8})
axes[0].set_xlabel('Glucose Class',fontdict={'fontsize':7})
axes[0].set_ylabel('Count/Dist.',fontdict={'fontsize':7})
plt.tight_layout()

plot1=sns.boxplot(y=df[df['Glucose']!=0]['Glucose'],ax=axes[1])
axes[1].set_title('Numerical Summary',fontdict={'fontsize':8})
axes[1].set_xlabel('Glucose',fontdict={'fontsize':7})
axes[1].set_ylabel(r'Five Point Summary(Glucose)',fontdict={'fontsize':7})
plt.tight_layout()

Blood Pressure variable
df.BloodPressure.describe()
count 768.000000
mean 69.105469
std 19.355807
min 0.000000
25% 62.000000
50% 72.000000
75% 80.000000
max 122.000000
Name: BloodPressure, dtype: float64

fig,axes = plt.subplots(nrows=2,ncols=2,dpi=120,figsize = (8,6))

plot00=sns.distplot(df['BloodPressure'],ax=axes[0][0],color='green')
axes[0][0].yaxis.set_major_formatter(FormatStrFormatter('%.3f')) axes[0]
[0].set_title('Distribution of BP',fontdict={'fontsize':8}) axes[0]
[0].set_xlabel('BP Class',fontdict={'fontsize':7}) axes[0]
[0].set_ylabel('Count/Dist.',fontdict={'fontsize':7}) plt.tight_layout()

plot01=sns.distplot(df[df['Outcome']==False]['BloodPressure'],ax=axes[0][1],colo
r='green',label='Non Diab.') sns.distplot(df[df.Outcome==True]
['BloodPressure'],ax=axes[0][1],color='red',lab el='Diab')
axes[0][1].set_title('Distribution of BP',fontdict={'fontsize':8}) axes[0]
[1].set_xlabel('BP Class',fontdict={'fontsize':7}) axes[0]
[1].set_ylabel('Count/Dist.',fontdict={'fontsize':7}) axes[0]
[1].yaxis.set_major_formatter(FormatStrFormatter('%.3f'))
plot01.axes.legend(loc=1) plt.setp(axes[0][1].get_legend().get_texts(),
fontsize='6') plt.setp(axes[0][1].get_legend().get_title(), fontsize='6')
plt.tight_layout()

plot10=sns.boxplot(df['BloodPressure'],ax=axes[1][0],orient='v')
axes[1][0].set_title('Numerical Summary',fontdict={'fontsize':8})
axes[1][0].set_xlabel('BP',fontdict={'fontsize':7})
axes[1][0].set_ylabel(r'Five Point Summary(BP)',fontdict={'fontsize':7})
plt.tight_layout()

plot11=sns.boxplot(x='Outcome',y='BloodPressure',data=df,ax=axes[1][1])
axes[1][1].set_title(r'Numerical Summary (Outcome)',fontdict={'fontsize':8})
axes[1][1].set_ylabel(r'Five Point Summary(BP)',fontdict={'fontsize':7})
plt.xticks(ticks=[0,1],labels=['Non-Diab.','Diab.'],fontsize=7)
axes[1][1].set_xlabel('Category',fontdict={'fontsize':7})
plt.tight_layout()

plt.show()
Understanding Distribution
The distribution of BloodPressure among patients is unimodal and bell shaped (it is not treated
as bimodal, because BP = 0 makes no physical sense and those values are outliers). It is centered
at about 70 (mean 69.1, median 72), with most of the data between 60 and 90 and a range of
roughly 100. Outliers are present on the lower end (BP == 0).
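
Because a reading of 0 is really a missing measurement, the summary statistics are more meaningful when those rows are excluded first. The following is a minimal sketch of that step using NumPy only; the array `bp` is a small made-up sample, not values taken from diabetes.csv, and the skewness and kurtosis use the simple moment-based (population) definitions.

```python
import numpy as np

# Hypothetical blood-pressure readings; 0 stands for a missing measurement.
bp = np.array([72, 66, 0, 64, 80, 0, 74, 70, 90, 62], dtype=float)

# Keep only the plausible (non-zero) readings.
valid = bp[bp != 0]

mean = valid.mean()
median = np.median(valid)
std = valid.std(ddof=1)             # sample standard deviation

# Moment-based skewness and excess kurtosis (population form, ddof=0).
z = (valid - mean) / valid.std()
skewness = np.mean(z ** 3)
kurtosis = np.mean(z ** 4) - 3.0

print(f"n={valid.size} mean={mean:.2f} median={median:.1f} std={std:.2f}")
print(f"skewness={skewness:.3f} excess kurtosis={kurtosis:.3f}")
```

On the real data frame the same filter would be df[df['BloodPressure'] != 0]['BloodPressure'], which is the pattern already used for Glucose in the box plot earlier in this experiment.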

5b) Bivariate analysis: Linear and logistic regression modeling

import os
import pandas as pd
import random
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

os.chdir("C:/Users/Administrator/Desktop/DS")
df = pd.read_csv('diabetes.csv')
df.head()
# Recent seaborn releases require x and y to be passed as keywords
sns.scatterplot(x=df.DiabetesPedigreeFunction,y=df.Glucose)
plt.ylim(0,20000)

sns.scatterplot(x=df.BMI,y=df.Age)
plt.ylim(0,20000)

sns.scatterplot(x=df.BloodPressure,y=df.Glucose)
plt.ylim(0,20000)

plt.figure(figsize=(12,8))
sns.kdeplot(data=df,x=df.Glucose,hue=df.Outcome,fill=True)
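
The scatter and density plots above show pairwise relationships but do not fit a model. A minimal sketch of the two models named in the heading, simple linear regression and logistic regression, could look like the following. It uses only NumPy on synthetic data: the names `glucose`, `insulin` and `outcome` and all coefficients are illustrative assumptions, not values from diabetes.csv, and the logistic model is fitted by plain gradient descent rather than by a library routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two numeric columns (not the real data set).
glucose = rng.uniform(70, 200, size=300)
insulin = 0.9 * glucose + 10 + rng.normal(0, 15, size=300)

# Simple linear regression: insulin ~ slope * glucose + intercept.
slope, intercept = np.polyfit(glucose, insulin, deg=1)

# Synthetic binary outcome whose probability rises with glucose.
true_p = 1 / (1 + np.exp(-(glucose - 130) / 15))
outcome = (rng.uniform(size=300) < true_p).astype(float)

# Logistic regression with one standardised predictor,
# fitted by gradient descent on the mean log-loss.
x = (glucose - glucose.mean()) / glucose.std()
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * x + b)))
    w -= 0.1 * np.mean((p - outcome) * x)
    b -= 0.1 * np.mean(p - outcome)

# Classify with a 0.5 probability threshold and measure training accuracy.
pred = (1 / (1 + np.exp(-(w * x + b))) >= 0.5).astype(float)
accuracy = np.mean(pred == outcome)

print(f"linear: slope={slope:.3f}, intercept={intercept:.2f}")
print(f"logistic: w={w:.3f}, b={b:.3f}, accuracy={accuracy:.3f}")
```

With the real data, the predictor would be a column such as df.Glucose and the binary response would be df.Outcome.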

5C) Multiple Regression analysis

df.isnull().values.any()
False

((df.Pregnancies==0).sum(),(df.Glucose==0).sum(),(df.BloodPressure==0).sum(),
 (df.SkinThickness==0).sum(),(df.Insulin==0).sum(),(df.BMI==0).sum(),
 (df.DiabetesPedigreeFunction==0).sum(),(df.Age==0).sum())

## Counting cells with 0 Values for each variable and publishing the counts below

Output:
(111, 5, 35, 227, 374, 11, 0, 0)

drop_Glu=df.index[df.Glucose == 0].tolist()
drop_BP=df.index[df.BloodPressure == 0].tolist()
drop_Skin = df.index[df.SkinThickness==0].tolist()
drop_Ins = df.index[df.Insulin==0].tolist()
drop_BMI = df.index[df.BMI==0].tolist()
c=drop_Glu+drop_BP+drop_Skin+drop_Ins+drop_BMI
dia=df.drop(df.index[c])
dia.info()

Output:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 392 entries, 3 to 765
Data columns (total 9 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Pregnancies               392 non-null    int64
 1   Glucose                   392 non-null    int64
 2   BloodPressure             392 non-null    int64
 3   SkinThickness             392 non-null    int64
 4   Insulin                   392 non-null    int64
 5   BMI                       392 non-null    float64
 6   DiabetesPedigreeFunction  392 non-null    float64
 7   Age                       392 non-null    int64
 8   Outcome                   392 non-null    bool
dtypes: bool(1), float64(2), int64(6)
memory usage: 27.9 KB

cor = dia.corr(method ='pearson')

cor

Output

sns.heatmap(cor)
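
The correlation matrix summarises pairwise linear association, whereas a multiple regression fits one response on several predictors at once. The following is a minimal sketch of that fit by ordinary least squares with NumPy. The data are synthetic and the names `bmi`, `age` and `glucose` only mimic the column names of the data set; the coefficients 40, 1.5 and 0.8 are arbitrary assumptions chosen so the result can be checked.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic predictors standing in for columns such as BMI and Age.
bmi = rng.normal(32, 6, size=n)
age = rng.normal(35, 10, size=n)

# Synthetic response built from known coefficients plus noise.
glucose = 40 + 1.5 * bmi + 0.8 * age + rng.normal(0, 5, size=n)

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones(n), bmi, age])
coef, *_ = np.linalg.lstsq(X, glucose, rcond=None)

# Coefficient of determination (R^2) of the fitted model.
fitted = X @ coef
ss_res = np.sum((glucose - fitted) ** 2)
ss_tot = np.sum((glucose - glucose.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print("intercept, bmi, age coefficients:", np.round(coef, 3))
print("R^2:", round(r_squared, 3))
```

On the cleaned frame, X would be built from the chosen predictor columns of dia and the response would be the column being modelled.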

Result:
Thus the univariate, bivariate, and multivariate analyses were performed successfully.

Ex.No:6 Apply and explore various plotting functions
on UCI data sets

Aim:
Apply and explore various plotting functions on UCI data sets.
The experiment also loads and quickly visualizes the Multiple Features Dataset [1] from the UCI
repository, which is available in mvlearn. This dataset is a good tool for analyzing the
effectiveness of multiview algorithms. It contains 6 views of handwritten digit images, allowing
analysis of multiview algorithms in multiclass or unsupervised tasks.

a. Normal curves
A probability distribution is a statistical function that describes the likelihood of each
possible value that a random variable can take, that is, the range of values a parameter can
take when we randomly pick values from it. Suppose we pick one adult at random and are asked
what his or her height is (assuming gender does not affect height). There is no way to know the
exact height, but if we have the distribution of heights of adults in the city, we can bet on
the most probable outcome. A Normal Distribution is also known as a Gaussian distribution or,
popularly, the Bell Curve. The names are used interchangeably and mean the same thing. It is a
continuous probability distribution.

Code:
import numpy as np
import matplotlib.pyplot as plt

# Creating a series of data in the range 1-50.
x = np.linspace(1,50,200)

# Creating a function that returns the normal probability density.
def normal_dist(x , mean , sd):
    prob_density = 1/(sd*np.sqrt(2*np.pi)) * np.exp(-0.5*((x-mean)/sd)**2)
    return prob_density

# Calculate mean and standard deviation.
mean = np.mean(x)
sd = np.std(x)

# Apply function to the data.
pdf = normal_dist(x,mean,sd)

# Plotting the results
plt.plot(x,pdf , color = 'red')
plt.xlabel('Data points')
plt.ylabel('Probability Density')
plt.show()
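
The density of a normal distribution with mean μ and standard deviation σ is f(x) = exp(-(x-μ)²/(2σ²)) / (σ√(2π)), and the area under it is 1. A quick numerical check of a function written from this formula can be sketched as follows; the values mean = 25 and sd = 5 are arbitrary illustrative choices, and the trapezoidal rule is written out by hand so that only basic NumPy operations are needed.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

mean, sd = 25.0, 5.0                       # illustrative values
x = np.linspace(mean - 8 * sd, mean + 8 * sd, 4001)
pdf = normal_pdf(x, mean, sd)

# Trapezoidal rule: average of neighbouring heights times the step width.
area = np.sum((pdf[1:] + pdf[:-1]) / 2 * np.diff(x))

# The curve peaks at the mean, with height 1 / (sd * sqrt(2 * pi)).
peak = pdf.max()

print("area under the curve:", area)
print("peak height:", peak)
```

An area close to 1 confirms that the function is a proper probability density and not just a bell-shaped curve with an arbitrary scale.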

b. Density and contour plots

Contour plots, also called level plots, are a tool for multivariate analysis and for visualizing
3-D surfaces in 2-D space. If X and Y are the variables we want to plot, the response Z is
plotted as slices on the X-Y plane, which is why contours are sometimes referred to as Z-slices
or iso-response curves.

Contour plots are widely used to visualize density, altitude, or the heights of mountains, as
well as in meteorology. Because of this wide usage, matplotlib.pyplot provides a contour method
that makes it easy to draw contour plots.

Code:
import matplotlib.pyplot as plt
import numpy as np

feature_x = np.arange(0, 50, 2)
feature_y = np.arange(0, 50, 3)

# Creating 2-D grid of features
[X, Y] = np.meshgrid(feature_x, feature_y)

fig, ax = plt.subplots(1, 1)

Z = np.cos(X / 2) + np.sin(Y / 4)

# plots contour lines
ax.contour(X, Y, Z)

ax.set_title('Contour Plot')
ax.set_xlabel('feature_x')
ax.set_ylabel('feature_y')

plt.show()
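
The listing above contours an analytic function. For the density part of this heading, the values to contour have to be estimated from data first. One simple way, sketched here with np.histogram2d on a synthetic sample (the sample and the choice of 30 bins are assumptions for illustration, not a UCI data set), is a normalised two-dimensional histogram.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 2-D sample with some dependence between x and y.
x = rng.normal(0, 1, size=5000)
y = 0.5 * x + rng.normal(0, 1, size=5000)

# density=True scales the counts so that the histogram integrates to 1.
density, x_edges, y_edges = np.histogram2d(x, y, bins=30, density=True)

# Bin centres give the grid on which the density can be contoured.
x_centres = (x_edges[:-1] + x_edges[1:]) / 2
y_centres = (y_edges[:-1] + y_edges[1:]) / 2

# Check: the sum of density times bin area is 1.
bin_area = np.outer(np.diff(x_edges), np.diff(y_edges))
total = np.sum(density * bin_area)

print("grid shape:", density.shape, "total probability:", total)
```

Passing x_centres, y_centres and density.T to ax.contourf then draws filled density contours; the transpose is needed because histogram2d puts x along the first axis while contourf expects rows to follow y.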

c. Correlation and scatter plots

Correlation means association. It is a measure of the extent to which two variables are related.

1. Positive Correlation: Two variables increase together and decrease together. A value of ‘1’ is a perfect
positive correlation. For example, demand and profit are positively correlated: the more demand for the product,
the more profit.

2. Negative Correlation: One variable increases while the other decreases, and vice versa. A value of ‘-1’ is a
perfect negative correlation. For example, as the distance between two magnets increases, their attraction
decreases, and vice versa.

3. Zero Correlation (No Correlation): The two variables don’t seem to be linked at all. A value of ‘0’ means no
correlation. For example, the amount of tea you drink and your level of intelligence.
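
The three cases can be reproduced numerically with the Pearson correlation coefficient. The sketch below uses np.corrcoef on made-up data: two exact linear relationships and one pair of independent random samples.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.arange(1, 21, dtype=float)

# y rises exactly with x: perfect positive correlation.
r_positive = np.corrcoef(x, 2 * x + 3)[0, 1]

# y falls exactly as x rises: perfect negative correlation.
r_negative = np.corrcoef(x, -0.5 * x + 10)[0, 1]

# Two unrelated random samples: correlation close to zero.
r_near_zero = np.corrcoef(rng.normal(size=5000),
                          rng.normal(size=5000))[0, 1]

print(r_positive, r_negative, r_near_zero)
```

np.corrcoef returns the full 2 x 2 correlation matrix, so the off-diagonal entry [0, 1] is the coefficient between the two inputs.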

Code:
import pandas as pd

con = pd.read_csv('concrete.csv')
con
list(con.columns)
con.head()

con['cement'] = con['cement'].astype('category')
con.describe(include='category')

import seaborn as sns

sns.scatterplot(x="water", y="coarseagg", data=con);

# Title and axis label match the plotted columns
ax = sns.scatterplot(x="water", y="coarseagg", data=con)
ax.set_title("Coarse aggregate vs. water")
ax.set_xlabel("water");

sns.lmplot(x="water", y="coarseagg", data=con);

d. Histograms:
A histogram represents data that has been grouped into ranges. It is an accurate method for the
graphical representation of the distribution of numerical data. It is a type of bar plot where
the X-axis represents the bin ranges and the Y-axis gives the frequency.

Creating a Histogram
To create a histogram, first divide the whole range of values into a series of intervals (bins),
then count the values that fall into each interval. Bins are consecutive, non-overlapping
intervals of a variable. The matplotlib.pyplot.hist() function is used to compute and draw the
histogram of x.

Code:

from matplotlib import pyplot as plt
import numpy as np

# Creating dataset
a = np.array([22, 87, 5, 43, 56,
              73, 55, 54, 11,
              20, 51, 5, 79, 31,
              27])

# Creating histogram
fig, ax = plt.subplots(figsize =(10, 7))
ax.hist(a, bins = [0, 25, 50, 75, 100])

# Show plot
plt.show()
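
The bar heights drawn by hist() are simply the number of values in each bin. As a check, the counts can be computed without plotting by np.histogram, here with the same fifteen values and the same bin edges as the example above.

```python
import numpy as np

a = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11,
              20, 51, 5, 79, 31, 27])
edges = [0, 25, 50, 75, 100]

# Each bin is half-open [left, right), except the last, which includes 100.
counts, returned_edges = np.histogram(a, bins=edges)

for left, right, count in zip(returned_edges[:-1], returned_edges[1:], counts):
    print(f"{left:>3} to {right:>3}: {count}")
```

The four counts are 5, 3, 5 and 2, which add up to the 15 values in the array.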

Code:

import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
from matplotlib.ticker import PercentFormatter

# Creating dataset
np.random.seed(23685752)
N_points = 10000
n_bins = 20

# Creating distribution
x = np.random.randn(N_points)
y = .8 ** x + np.random.randn(10000) + 25

# Creating histogram
fig, axs = plt.subplots(1, 1,figsize =(10, 7),tight_layout = True)
axs.hist(x, bins = n_bins)

# Show plot
plt.show()

e. Three dimensional plotting

Matplotlib was originally designed with only two-dimensional plotting in mind. Around the 1.0
release, 3D plotting utilities were built on top of the 2D display, giving the 3D support
available today. 3D plots are enabled by importing the mplot3d toolkit. This section covers 3D
plots using matplotlib.

Code:

from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()

# syntax for 3-D projection
ax = plt.axes(projection ='3d')

# defining axes
z = np.linspace(0, 1, 100)
x = z * np.sin(25 * z)
y = z * np.cos(25 * z)
c = x + y
ax.scatter(x, y, z, c = c)

# syntax for plotting
ax.set_title('3d Scatter plot')
plt.show()

Result:
Thus the various plots are executed and plotted successfully.

Ex.No:7 Visualizing Geographic Data with Basemap

Aim:
To visualize geographic data with Basemap.
One common type of visualization in data science is that of geographic data. Matplotlib's main
tool for this type of visualization is the Basemap toolkit, which is one of several Matplotlib
toolkits which lives under the mpl_toolkits namespace. Admittedly, Basemap feels a bit clunky
to use, and often even simple visualizations take much longer to render than you might hope.
More modern solutions such as leaflet or the Google Maps API may be a better choice for
more intensive map visualizations. Still, Basemap is a useful tool for Python users to have in
their virtual toolbelts. In this section, we'll show several examples of the type of map
visualization that is possible with this toolkit.

Installation of Basemap is straightforward; if you're using conda you can type this and the
package will be downloaded:

conda install basemap

Code:
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None,
            width=8E6, height=8E6,
            lat_0=45, lon_0=-100,)
m.etopo(scale=0.5, alpha=0.5)

# Map (long, lat) to (x, y) for plotting
x, y = m(-122.3, 47.6)
plt.plot(x, y, 'ok', markersize=5)
plt.text(x, y, ' Seattle', fontsize=12);
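
Calling the map object with a longitude and latitude converts them to plot coordinates through the chosen projection. To illustrate what such a conversion involves, the sketch below implements a different and much simpler projection, spherical Mercator, in plain Python. It is not the Lambert conformal conic ('lcc') projection used in the listing, the function name `mercator_xy` is hypothetical, and the Earth radius is an assumed round value.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def mercator_xy(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project (longitude, latitude) in degrees to spherical-Mercator metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

# Seattle lies west of Greenwich and north of the equator,
# so x is negative and y is positive.
seattle_x, seattle_y = mercator_xy(-122.3, 47.6)

# The point (0, 0) maps to the origin of the projected plane.
origin_x, origin_y = mercator_xy(0.0, 0.0)

print(seattle_x, seattle_y)
```

Basemap performs the equivalent computation for whichever projection was requested when the map object was created, so user code only needs the call m(lon, lat).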

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt

fig = plt.figure(figsize = (12,12))
m = Basemap()
m.drawcoastlines()
m.drawcoastlines(linewidth=1.0, linestyle='dashed', color='red')
plt.title("Coastlines", fontsize=20)
plt.show()

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas as gpd
import shapefile as shp
from shapely.geometry import Point

sns.set_style('whitegrid')

fp = r'Maps_with_python\india-polygon.shp'
map_df = gpd.read_file(fp)
map_df_copy = gpd.read_file(fp)

# A GeoDataFrame is drawn with its own plot() method
map_df.plot(markersize=5)
plt.show()

Result:
Thus Basemap was installed and the geographic visualization programs were executed successfully.
