01 Introduction to Python
Kailash Singh
Professor, Department of Chemical Engineering
MNIT Jaipur
Why Python?
• Python is an interpreted, high-level, general-purpose programming language.
• One of Python's key design goals is code readability. Ease of use and high
productivity have made Python very popular.
• It has a comprehensive set of core libraries for data analysis and visualization.
• Python’s strong community continuously develops its data science libraries and
keeps them cutting edge.
• It has libraries for linear algebra, statistical analysis, machine learning,
visualization, optimization, stochastic models, etc.
• Python provides an interactive interface for data analysis.
• Python appears to have gained significantly more attention and popularity
than other languages since 2012.
Growth of major programming languages
Integrated Development Environment (IDE)
Main Libraries in Python
Library         Purpose
NumPy           Efficient storage of arrays and matrices. Backbone of all scientific calculations and algorithms.
pandas          High-performance, easy-to-use data structures for data manipulation and analysis. pandas provides the DataFrame, which is very useful in data analytics.
SciPy           Library for scientific computing: linear algebra, statistical computations, optimization algorithms.
matplotlib      Plotting and visualization.
statsmodels     Library for statistical modeling: estimation of statistical models and statistical tests.
scikit-learn    Machine learning library. Collection of ML algorithms: supervised and unsupervised.
seaborn         Data visualization library based on matplotlib.
Jupyter Notebook
In a command prompt (cmd), type: jupyter notebook. Then open a new notebook file.
Google Colab
• Colab is a hosted Jupyter Notebook service that requires no setup to
use and provides free access to computing resources.
• Go to the link:
https://colab.research.google.com/
• Open a new notebook
• Type the following code:
a=5
b=6
c=a+b
print(c)
Variable Declaration
• Python supports the following variable types:
1. int – Integer type.
2. float – Floating point numbers.
3. bool – Booleans are a subtype of integers and are assigned using the
literals True and False.
4. str – Textual data.
• Python automatically infers the variable type from values assigned to
it.
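As a minimal sketch of this type inference, the built-in type() function can be used to check which type Python assigned to a variable. The variable names below are illustrative only.

```python
# Illustrative variable names; no type is declared anywhere
count = 5          # inferred as int
ratio = 2.5        # inferred as float
flag = True        # inferred as bool
name = "Python"    # inferred as str

print(type(count))   # <class 'int'>
print(type(ratio))   # <class 'float'>
print(type(flag))    # <class 'bool'>
print(type(name))    # <class 'str'>

# bool is a subtype of int, so this check succeeds
print(isinstance(flag, int))   # True
```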
Conditional Statements
• Python supports if-elif-else for writing conditional statements.
• Indentation is mandatory: it defines which statements belong to each branch.
• Examples:
# Checking a condition…
x=0
if x>1:
    print("Bigger than 1")
elif x<1:
    print("Less than 1")
else:
    print("Equal to 1")
Control flow statement
#Create sequence of numbers x=0,1,2,3,4,5 using for loop
for x in range(6):
    print(x)
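#addfun is not defined in the source; a simple addition function is assumed here
def addfun(a,b):
    return a+b
a,b=2,3
c=addfun(a,b)
print(c)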
Working with Collections
• List
• Tuple
• Set
• Dictionary
List
• Lists are like arrays, but can contain heterogeneous items, that is, a
single list can contain items of type integer, float, string, or objects.
• Example:
x=[1,2,7,10]
y=[1,2.5,6,8.9]
z=['Orange',2.3,5]
print(z[0:2]) #prints ['Orange',2.3]
a=[] #Empty List
print(z.index(2.3)) #prints 1
Tuple
• A tuple is like a list, but it is immutable: once a tuple has been created,
it cannot be modified.
• Example:
x=('Orange',2.3,5)
print(x[1])
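A small sketch of what immutability means in practice: assigning to an item of a tuple raises a TypeError, while a list built from the same items can be changed. The try/except handling and the replacement value 'Apple' are illustrative additions, not part of the original example.

```python
# Same tuple as in the example above
x = ('Orange', 2.3, 5)

try:
    x[0] = 'Apple'          # item assignment is not allowed on a tuple
    modified = True
except TypeError as err:
    modified = False
    print("Cannot modify a tuple:", err)

# A list holding the same items can be changed in place
y = list(x)
y[0] = 'Apple'
print(y)    # ['Apple', 2.3, 5]
print(x)    # ('Orange', 2.3, 5), the tuple is unchanged
```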
Set
• A set is a collection of unique elements, that is, the values cannot
repeat.
• Example:
x={3,2,7,2}
print(x) #It will print {2,3,7}
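One common use of this property, sketched here with made-up values, is removing duplicates from a list by converting it to a set. Note that a set does not promise any particular order, so sorted() is used when an ordered result is wanted.

```python
# Illustrative list containing repeated values
values = [3, 2, 7, 2, 3]

unique_values = set(values)       # duplicates are discarded
print(len(values), len(unique_values))   # 5 3

unique_values.add(7)              # 7 is already present, so nothing changes
print(sorted(unique_values))      # [2, 3, 7]
```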
Dictionary
• A dictionary is a collection of key-value pairs. All the keys in a dictionary
are unique.
x={'Ram': 45, 'Shyam': 36, 'Mohan': 58}
print(x['Ram']) #prints 45
d={'Ram':{'English':45,'Maths':30},'Shyam':{'English':60,'Maths':70}}
print(d['Ram']['English']) #prints 45
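Because keys are unique, assigning to an existing key replaces its value rather than creating a second entry. The sketch below illustrates this and shows one way to loop over the pairs; the updated value and the key 'Sita' are hypothetical.

```python
x = {'Ram': 45, 'Shyam': 36, 'Mohan': 58}

x['Ram'] = 50     # existing key: the old value 45 is replaced
x['Sita'] = 62    # new key (hypothetical): a new pair is added
print(len(x))     # 4

# Iterate over key-value pairs
for name, marks in x.items():
    print(name, marks)
```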
map
• Example: Create a list of squares of the following list:
List1= [1,2,3,4,5,6]
Solution:
List1=[1,2,3,4,5,6]
List2=[]
for x in List1:
    List2.append(x*x)
print(List2)
Alternatively:
def fun(x):
    return x*x
List1=[1,2,3,4,5,6]
m=map(fun,List1)
List2=list(m)
print(List2)
lambda
• Example: Create a list of squares of the following list:
List1= [1,2,3,4,5,6]
f=lambda x: x*x
List1=[1,2,3,4,5,6]
m=map(f,List1)
List2=list(m)
print(List2)
Filter
• Example: Filter only even values from List1=[1,2,3,4,5,6]
f=lambda x: x%2==0
List1=[1,2,3,4,5,6]
y=filter(f, List1)
List2=list(y)
print(List2)
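map and filter can also be chained. As a brief sketch (the task of squaring only the even values is an illustrative choice, not from the original slides):

```python
List1 = [1, 2, 3, 4, 5, 6]

# Keep the even values, then square each of them
evens = filter(lambda x: x % 2 == 0, List1)
even_squares = list(map(lambda x: x * x, evens))

print(even_squares)   # [4, 16, 36]
```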
Pandas
Dataframes in Python
• The primary objective of descriptive analytics is comprehension of data
using data summarization, basic statistical measures and visualization.
• Data visualization is an integral component of business intelligence (BI).
• Data scientists often work with structured data; one of the most common
forms is the Structured Query Language (SQL) table.
• SQL tables represent data in rows and columns and make it convenient to
explore and apply transformations.
• A similar structure for presenting data is supported in Python through
DataFrames.
• DataFrames are made available in Python by the pandas library.
Pandas library
• The pandas library supports methods to explore, analyze, and prepare data.
• It can be used to load, filter, sort, group, and join datasets, and also to
deal with missing data.
• Example: Create a dataframe for the following data:
Name    Age  City
Mahesh  25   Delhi
Suresh  30   Mumbai
Paresh  40   Kolkata

import pandas as pd
a=['Mahesh', 'Suresh', 'Paresh']
b=[25,30,40]
c=['Delhi', 'Mumbai', 'Kolkata']
data = {'Name': a, 'Age': b, 'City': c}
df = pd.DataFrame(data)
print(df)
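The filter, sort, and group activities mentioned earlier can be sketched on the same three-row data. The threshold of 28 and the result variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Mahesh', 'Suresh', 'Paresh'],
    'Age': [25, 30, 40],
    'City': ['Delhi', 'Mumbai', 'Kolkata'],
})

# Filter: keep rows where Age is above an illustrative threshold
older = df[df['Age'] > 28]
print(older)

# Sort: order rows by Age, largest first (returns a new DataFrame)
by_age_desc = df.sort_values('Age', ascending=False)
print(by_age_desc)

# Group: mean Age per City (each city has one row in this small example)
mean_age_by_city = df.groupby('City')['Age'].mean()
print(mean_age_by_city)
```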
Contd…
• Locate row
print(df.loc[0]) #prints the first row of data
print(df.loc[[0,1]]) #prints the first and second rows
print(df.head(2)) #prints the first two rows of data
print(df.columns) #prints column names
• Import csv
df = pd.read_csv('data.csv')
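Since 'data.csv' may not exist on the reader's machine, the following sketch feeds read_csv an in-memory text buffer instead of a file. The CSV contents are made up for illustration; with a real file, the file path would be passed in place of the buffer.

```python
import io
import pandas as pd

# Made-up CSV text standing in for the contents of a file
csv_text = "Name,Age,City\nMahesh,25,Delhi\nSuresh,30,Mumbai\n"

# read_csv accepts a file path or a file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # (2, 3): two rows, three columns
print(df.head())
```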
Dropping null values in DataFrame
• Sometimes the data has null values (None in Python, which pandas stores
as NaN). How to remove such records?
• Example:
#Dropping null values
d = {'col1': [1, 2, None], 'col2': [4, 5, 6]}
df=pd.DataFrame(d)
df2=df.dropna() #Returns a new DataFrame; the original is not changed
print(df2)
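Before dropping records it can help to count the missing values per column. A minimal sketch on the same data (the name missing_per_column is illustrative), which also confirms that the original DataFrame keeps all of its rows:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, None], 'col2': [4, 5, 6]})

# Count missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# dropna() returns a new DataFrame without the incomplete rows
df2 = df.dropna()
print(len(df), len(df2))   # 3 2
```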
NumPy
import numpy as np
x = np.array([1, 2, 3, 4])
print(x.shape) #prints (4,)
Sorting
• Example: sort x but do not change original array
import numpy as np
x = np.array([3,2,0,1])
y=np.sort(x)
print(x)
print(y)
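For contrast, a short sketch of the two sorting styles: np.sort returns a sorted copy, while the array's own sort() method sorts in place and changes the original. The name x_before is illustrative.

```python
import numpy as np

x = np.array([3, 2, 0, 1])
x_before = x.copy()        # keep a copy for comparison

y = np.sort(x)             # sorted copy; x is left unchanged
print(x.tolist())          # [3, 2, 0, 1]
print(y.tolist())          # [0, 1, 2, 3]

x.sort()                   # in-place sort; x itself is now sorted
print(x.tolist())          # [0, 1, 2, 3]
```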
import matplotlib.pyplot as plt
plt.hist(x) #plots a histogram of the values in x
plt.show()