
Experiment No: 1 Introduction To Data Analytics and Python Fundamentals Page-1/11

This document provides an introduction to data analytics and Python fundamentals. It discusses understanding data, different types of data analytics including descriptive, diagnostic, predictive and prescriptive analytics. It also introduces important Python packages for data science like NumPy, Pandas, Matplotlib and Scikit-learn. It demonstrates basic operations in NumPy like creating arrays, unary and binary operators. It also shows how to work with Pandas DataFrames, including creating, viewing, selecting and describing data. Finally, it discusses importing and exporting data in Python using CSV files and pandas.

Uploaded by

Harshali Mane

AMRUTVAHINI COLLEGE OF ENGINEERING, SANGAMNER

Data Analytics

Experiment No: 1 Introduction to data analytics and Python fundamentals Page- 1/11

Aim of Experiment:
Introduction to data analytics and Python fundamentals:
• Understanding the Data
• Python Packages for Data Science
• Importing and Exporting Data in Python
• Getting Started Analyzing Data in Python
• Accessing Databases with Python

Learning Outcomes:

Theory:

Understanding the Data-


Data can be generated by humans, by machines, or by humans and machines in combination. Data
arises wherever information is produced and stored, in structured or
unstructured formats.
• Data helps in making better decisions.
• Data helps in solving problems by finding the reason for underperformance.
• Data helps one evaluate performance.
• Data helps one improve processes.
• Data helps one understand consumers and the market.

Data Analytics-
Analytics is defined as “the scientific process of transforming data into insights for making
better decisions”.
According to James Evans, analytics is the use of data, information technology, statistical
analysis, quantitative methods, and mathematical or computer-based models to help managers
gain improved insight about their business operations and make better, fact-based decisions.

Classification of Data analytics


There are four major types of data analytics.
• Descriptive analytics
• Diagnostic analytics
• Predictive analytics
• Prescriptive analytics

Descriptive Analytics
• Descriptive analytics is the conventional form of data analysis.
• It seeks to provide a depiction or “summary view” of facts and figures in
an understandable format.
• It either informs or prepares data for further analysis.
• Descriptive analysis or statistics can summarize raw data and convert it
into a form that can be easily understood by humans.

PREPARED BY APPROVED BY CONTROLLED COPY STAMP MASTER COPY STAMP


ACAD-R-27B, Rev.: 00 Date: 15-06-2017
• It can describe in detail an event that has occurred in the past.
Common examples of descriptive analytics are company reports that simply provide a
historical review, such as:
• Data queries
• Reports
• Descriptive statistics
• Data visualization
• Data dashboards
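
As a minimal sketch of descriptive analytics, the snippet below summarises a small sales table with pandas. The column names and figures are hypothetical and were invented purely for illustration.

```python
import pandas as pd

# Hypothetical monthly sales figures, invented for illustration only
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "revenue": [120.0, 150.0, 90.0, 110.0],
})

# Descriptive statistics: a summary view of what has already happened
total_revenue = sales["revenue"].sum()
mean_by_region = sales.groupby("region")["revenue"].mean()

print("Total revenue:", total_revenue)
print(mean_by_region)
```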

Diagnostic analytics
• Diagnostic analytics is a form of advanced analytics that examines data
or content to answer the question “Why did it happen?”
• Diagnostic analytical tools help an analyst dig deeper into an issue so
that they can arrive at the source of a problem.
• In a structured business environment, tools for descriptive and
diagnostic analytics go hand in hand.
It uses techniques such as:
• Data Discovery
• Data Mining
• Correlations
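
One of the techniques listed above, correlation, can be sketched as follows. The data set is hypothetical; a high coefficient only suggests where to look further and does not by itself establish a cause.

```python
import pandas as pd

# Hypothetical data: weekly advertising spend and units sold
df = pd.DataFrame({
    "ad_spend": [10, 20, 30, 40, 50],
    "units_sold": [12, 24, 33, 46, 55],
})

# Pearson correlation coefficient between the two columns
r = df["ad_spend"].corr(df["units_sold"])
print("Correlation:", round(r, 3))
```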

Predictive analytics
• Predictive analytics helps to forecast trends based on current events.
• Predictive analytical models can estimate the probability of an event happening in the
future, or the time at which it is likely to happen.
• Many different but co-dependent variables are analysed to predict a trend
in this type of analysis.
It is a set of techniques that use models constructed from past data to predict the future or
to ascertain the impact of one variable on another:
1. Linear regression
2. Time series analysis and forecasting
3. Data mining
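
Linear regression, the first technique in the list, might look like this in NumPy. The observations are hypothetical and were chosen to lie exactly on a straight line so the fit is easy to check by hand.

```python
import numpy as np

# Hypothetical past observations (they follow y = 2x + 1 exactly)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)

# Fit a straight line (degree-1 polynomial) by least squares
slope, intercept = np.polyfit(x, y, 1)

# Use the fitted line to predict the next period
x_next = 6.0
y_pred = slope * x_next + intercept
print("Predicted value for x = 6:", round(y_pred, 2))
```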

Prescriptive analytics
• It is a set of techniques that indicate the best course of action.
• It tells what decision to make to optimize the outcome.
The goal of prescriptive analytics is to enable:
1. Quality improvements
2. Service enhancements
3. Cost reductions
4. Increased productivity
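
A very small sketch of the prescriptive idea: evaluate each candidate decision against a model and recommend the one with the best outcome. The demand model, unit cost and candidate prices below are all assumptions made up for the example.

```python
import numpy as np

# Hypothetical candidate prices and an assumed linear demand model
prices = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
units = 100.0 - 4.0 * prices   # assumed: demand falls as price rises
unit_cost = 6.0                # assumed cost per unit

# Outcome to optimize: profit for each candidate price
profit = (prices - unit_cost) * units

# Recommended decision: the price with the highest profit
best_price = prices[np.argmax(profit)]
print("Recommended price:", best_price)
```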

Elements of data Analytics

Python Packages for Data Science


A package is a collection of Python modules. Packages commonly used for data science include:
• NumPy
• SciPy
• pandas
• statsmodels
• Matplotlib
• seaborn
• Plotly
• Bokeh
• scikit-learn
• Keras

Getting started: Analyzing Data in Python


Introduction to Numpy
NumPy, short for Numerical Python, is a general-purpose array-processing package. It
provides a high-performance multidimensional array object, and tools for working with these
arrays.

# Python program to demonstrate basic array characteristics
import numpy as np

# Creating array object
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

# Printing type of arr object
print("Array is of type: ", type(arr))

# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)

# Printing shape of array
print("Shape of array: ", arr.shape)

# Printing size (total number of elements) of array
print("Size of array: ", arr.size)

# Printing type of elements in array
print("Array stores elements of type: ", arr.dtype)

Output-
Array is of type:  <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int64

# Python program to demonstrate unary operators in numpy
import numpy as np

arr = np.array([[1, 5, 6],
                [4, 7, 2],
                [3, 1, 9]])

# maximum element of array
print("Largest element is:", arr.max())
print("Row-wise maximum elements:", arr.max(axis=1))

# minimum element along each column
print("Column-wise minimum elements:", arr.min(axis=0))

# sum of array elements
print("Sum of all array elements:", arr.sum())

Output-
Largest element is: 9
Row-wise maximum elements: [6 7 9]
Column-wise minimum elements: [1 1 2]
Sum of all array elements: 38

# Python program to demonstrate binary operators in Numpy
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

# add arrays
print("Array sum:\n", a + b)

# multiply arrays (elementwise multiplication)
print("Array multiplication:\n", a * b)

# matrix multiplication
print("Matrix multiplication:\n", a.dot(b))

Output-
Array sum:
 [[5 5]
 [5 5]]
Array multiplication:
 [[4 6]
 [6 4]]
Matrix multiplication:
 [[ 8  5]
 [20 13]]
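
As a brief supplement to the listing above, the sketch below shows that the `@` operator gives the same matrix product as `a.dot(b)`, and that a scalar operand is applied to every element of an array.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

# The @ operator performs the same matrix product as a.dot(b)
product = a @ b
print(product)

# A scalar operand is applied to every element of the array
scaled = a * 10
print(scaled)
```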

Introduction to pandas
Pandas is an open-source library that allows you to perform data manipulation and analysis
in Python.
Pandas DataFrame- A DataFrame is a two-dimensional data structure, i.e., data is aligned
in a tabular fashion in rows and columns. A pandas DataFrame can be created from
various inputs, such as lists, dictionaries, Series, NumPy ndarrays, or another DataFrame.
Example
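import pandas as pd
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
# Convert only the numeric column to float; passing dtype=float to the
# constructor fails on recent pandas versions because 'Name' holds strings
df['Age'] = df['Age'].astype(float)
print(df)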
Output-
     Name   Age
0    Alex  10.0
1     Bob  12.0
2  Clarke  13.0
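
The other input types mentioned above can be used in much the same way. A minimal sketch, assuming small illustrative values, of building a DataFrame from a dictionary and from a NumPy ndarray:

```python
import numpy as np
import pandas as pd

# From a dictionary: the keys become column names
df_dict = pd.DataFrame({"Name": ["Alex", "Bob", "Clarke"],
                        "Age": [10.0, 12.0, 13.0]})

# From a NumPy ndarray: column names are supplied separately
df_arr = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["x", "y"])

print(df_dict)
print(df_arr)
```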
Simple operations on pandas Dataframe
Viewing the first n rows (head() returns the first 5 rows when n is not given)
print(df.head())
Output-
     Name   Age
0    Alex  10.0
1     Bob  12.0
2  Clarke  13.0
Print a concise summary of a DataFrame
df.info()
Output-
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Name    3 non-null      object
 1   Age     3 non-null      float64
dtypes: float64(1), object(1)
memory usage: 176.0+ bytes
Note that info() prints the summary itself and returns None, so wrapping it in print()
adds a trailing "None" line to the output.
Calculating summary statistics (count, mean, standard deviation, minimum, percentiles and
maximum) of the numerical columns of the DataFrame-
print(df.describe())
Output-
             Age
count   3.000000
mean   11.666667
std     1.527525
min    10.000000
25%    11.000000
50%    12.000000
75%    12.500000
max    13.000000
Selecting column
print(df['Name'])
Output-
0 Alex
1 Bob
2 Clarke
Name: Name, dtype: object
Selecting row
print (df.iloc[2])
Output-
Name Clarke
Age 13
Name: 2, dtype: object
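
Beyond selecting by column name or row position, rows can also be selected by a condition or by label. The sketch below rebuilds the same small DataFrame so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alex", "Bob", "Clarke"],
                   "Age": [10.0, 12.0, 13.0]})

# Rows where Age is greater than 11 (boolean mask)
older = df[df["Age"] > 11]
print(older)

# A single value, selected by row label and column name
bob_age = df.loc[1, "Age"]
print(bob_age)
```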

Importing and Exporting Data in Python


Reading and Writing CSV Files in Python With pandas
pandas is an open-source Python library that provides high-performance data analysis tools
and easy-to-use data structures. The examples below assume a file named hrdata.csv with the
following contents:
Name,Hire Date,Salary,Sick Days remaining
Graham Chapman,03/15/14,50000.00,10
John Cleese,06/01/15,65000.00,8
Eric Idle,05/12/14,45000.00,10
Terry Jones,11/01/13,70000.00,3
Terry Gilliam,08/12/14,48000.00,7
Michael Palin,05/23/13,66000.00,8
Reading the CSV into a pandas DataFrame
import pandas
df = pandas.read_csv('hrdata.csv')
print(df)

Output-
             Name  Hire Date   Salary  Sick Days remaining
0  Graham Chapman   03/15/14  50000.0                   10
1     John Cleese   06/01/15  65000.0                    8
2       Eric Idle   05/12/14  45000.0                   10
3     Terry Jones   11/01/13  70000.0                    3
4   Terry Gilliam   08/12/14  48000.0                    7
5   Michael Palin   05/23/13  66000.0                    8
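
By default the Hire Date column is read as plain text. The following sketch, which holds two rows of the sample file in memory so that it runs without hrdata.csv on disk, uses the names as the index and converts the dates explicitly. The date format string is an assumption based on the sample values (month/day/two-digit year).

```python
import io
import pandas as pd

# Two rows of the sample file, held in memory instead of on disk
csv_text = """Name,Hire Date,Salary,Sick Days remaining
Graham Chapman,03/15/14,50000.00,10
John Cleese,06/01/15,65000.00,8
"""

# Use the Name column as the row index
df = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# Convert the text dates to datetime values (assumed format: MM/DD/YY)
df["Hire Date"] = pd.to_datetime(df["Hire Date"], format="%m/%d/%y")

print(df.dtypes)
```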

Writing CSV Files With pandas


Using DataFrame’s to_csv method, we can write the data out to a comma-separated
file:
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}

df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# index=False leaves out the row labels; header=True writes the column names
df.to_csv(r'C:\Users\Ron\Desktop\export_dataframe.csv', index=False, header=True)

print(df)
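
To check that an export worked, the file can be read back in. This sketch writes to a temporary directory rather than the Windows path used above, so it should run on any machine; the shortened data set is for illustration only.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({"Brand": ["Honda Civic", "Audi A4"],
                   "Price": [22000, 35000]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "export_dataframe.csv")
    # Write without the row index, then read the file back
    df.to_csv(path, index=False)
    restored = pd.read_csv(path)

print(restored)
```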

Accessing Databases with Python


The Python standard for database interfaces is the Python DB-API (PEP 249). Modules that
follow it are available for a wide range of database servers, such as MySQL, Microsoft SQL
Server, PostgreSQL and SQLite.
DB-API 2.0 interface for SQLite databases
SQLite is a C library that provides a lightweight disk-based database that doesn’t require a
separate server process and allows accessing the database using the SQL query language.
import sqlite3
conn = sqlite3.connect('example.db')
c = conn.cursor()

# Create table
c.execute('''CREATE TABLE stock
             (date text, trans text, symbol text, qty real, price real)''')

# Insert rows of data
c.execute("INSERT INTO stock VALUES ('2006-01-05','BUY','RHAT',100,35.14)")
c.execute("INSERT INTO stock VALUES ('2006-03-28', 'BUY', 'IBM', 1000, 45.00)")
c.execute("INSERT INTO stock VALUES ('2006-04-05', 'BUY', 'MSFT', 1000, 72.00)")

# Save (commit) the changes
conn.commit()

# Read the rows back, ordered by price
for row in c.execute('SELECT * FROM stock ORDER BY price'):
    print(row)

# We can also close the connection if we are done with it.
# Just be sure any changes have been committed or they will be lost.
conn.close()

Output-
('2006-01-05', 'BUY', 'RHAT', 100.0, 35.14)
('2006-03-28', 'BUY', 'IBM', 1000.0, 45.0)
('2006-04-05', 'BUY', 'MSFT', 1000.0, 72.0)
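
When values come from variables rather than literals, the sqlite3 module accepts `?` placeholders, which avoids building SQL strings by hand. A minimal sketch using an in-memory database, so that no example.db file is created; the price threshold of 40 is arbitrary.

```python
import sqlite3

# An in-memory database exists only for the lifetime of the connection
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE stock "
          "(date text, trans text, symbol text, qty real, price real)")

rows = [
    ("2006-01-05", "BUY", "RHAT", 100, 35.14),
    ("2006-03-28", "BUY", "IBM", 1000, 45.00),
]

# Each ? is filled in from the corresponding tuple element
c.executemany("INSERT INTO stock VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# Placeholders work for query parameters as well
c.execute("SELECT symbol, price FROM stock WHERE price > ?", (40,))
result = c.fetchall()
print(result)

conn.close()
```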

Assignment:

References:

Conclusion:
