pandas_workshop - Jupyter Notebook

PANDAS LIBRARY CODE

Uploaded by

Arundhathi

Introduction to the pandas DataFrame

The DataFrame is the main object in pandas. It represents data as rows and columns.

A DataFrame is a data structure that holds tabular data, much like an Excel spreadsheet.

Creating a DataFrame:

In [1]:

import pandas as pd
df = pd.read_csv("weather_data.csv")  # read weather_data.csv into a DataFrame
df

Out[1]:

        day  temperature  windspeed  event
0  1/1/2017           32          6   Rain
1  1/2/2017           35          7  Sunny
2  1/3/2017           28          2   Snow
3  1/4/2017           24          7   Snow
4  1/5/2017           32          4   Rain
5  1/6/2017           31          2  Sunny
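
If weather_data.csv is not at hand, the same table can be reproduced from an in-memory string. This is a minimal sketch, assuming the file has a header row and the four columns shown in Out[1]; the variable csv_text is a hypothetical stand-in for the file contents.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of weather_data.csv (assumed layout)
csv_text = """day,temperature,windspeed,event
1/1/2017,32,6,Rain
1/2/2017,35,7,Sunny
1/3/2017,28,2,Snow
1/4/2017,24,7,Snow
1/5/2017,32,4,Rain
1/6/2017,31,2,Sunny
"""

# read_csv accepts a file-like object as well as a file path
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

Because read_csv infers column types, temperature and windspeed come back as integers while day and event stay as text.
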

In [2]:

# build the same table from a list of tuples (one tuple per row)

weather_data = [('1/1/2017', 32, 6, 'Rain'),
                ('1/2/2017', 35, 7, 'Sunny'),
                ('1/3/2017', 28, 2, 'Snow'),
                ('1/4/2017', 24, 7, 'Snow'),
                ('1/5/2017', 32, 4, 'Rain'),
                ('1/6/2017', 31, 2, 'Sunny')
                ]
df = pd.DataFrame(weather_data, columns=['day', 'temperature', 'windspeed', 'event'])
df

Out[2]:

        day  temperature  windspeed  event
0  1/1/2017           32          6   Rain
1  1/2/2017           35          7  Sunny
2  1/3/2017           28          2   Snow
3  1/4/2017           24          7   Snow
4  1/5/2017           32          4   Rain
5  1/6/2017           31          2  Sunny

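
A list of tuples describes the table row by row. The same table can also be described column by column with a dictionary of lists. The sketch below is an illustration, not part of the original notebook; the names weather_columns and df_from_dict are hypothetical.

```python
import pandas as pd

# Same weather data, organised column by column (names are illustrative)
weather_columns = {
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
}

# Dictionary keys become the column names, so no columns= argument is needed
df_from_dict = pd.DataFrame(weather_columns)
print(df_from_dict)
```
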
In [3]:

# get the dimensions of the table
df.shape  # (number of rows, number of columns)

Out[3]:

(6, 4)
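
Since shape is an ordinary tuple, it can be unpacked into separate row and column counts. A small sketch using the weather table from above; the variable names n_rows and n_cols are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
})

# shape is a (rows, columns) tuple, so it unpacks like any other tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# len() on a DataFrame counts rows only
print(len(df))
```
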

In [4]:

# to see the first few rows, use head() (default: first 5 rows)
df.head()

Out[4]:

        day  temperature  windspeed event
0  1/1/2017           32          6  Rain
1  1/2/2017           35          7 Sunny
2  1/3/2017           28          2  Snow
3  1/4/2017           24          7  Snow
4  1/5/2017           32          4  Rain

In [5]:

# to see the last few rows, use tail() (default: last 5 rows)
df.tail()

Out[5]:

        day  temperature  windspeed  event
1  1/2/2017           35          7  Sunny
2  1/3/2017           28          2   Snow
3  1/4/2017           24          7   Snow
4  1/5/2017           32          4   Rain
5  1/6/2017           31          2  Sunny
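
Both head() and tail() take an optional row count when the default of 5 is not what you want. A short sketch on the same weather table:

```python
import pandas as pd

df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
})

first_two = df.head(2)   # rows with index labels 0 and 1
last_three = df.tail(3)  # rows with index labels 3, 4 and 5
print(first_two)
print(last_three)
```
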

In [6]:

# slicing: rows at positions 2, 3 and 4 (the end position 5 is excluded)
df[2:5]

Out[6]:

        day  temperature  windspeed event
2  1/3/2017           28          2  Snow
3  1/4/2017           24          7  Snow
4  1/5/2017           32          4  Rain

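
The slice df[2:5] selects rows by position. The same rows can be requested more explicitly with iloc (position-based, end excluded) or loc (label-based, end included). A minimal sketch, assuming the default 0, 1, 2, ... index used in this notebook:

```python
import pandas as pd

df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
})

by_position = df.iloc[2:5]  # positions 2, 3, 4 (5 is excluded)
by_label = df.loc[2:4]      # labels 2 through 4 (4 is included)
print(by_position)
print(by_label)
```

With the default index both calls return the same three rows; they differ once the index labels are no longer consecutive integers starting at 0.
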
In [0]:

df.columns  # list the column labels of the table

Out[21]:

Index(['day', 'temperature', 'windspeed', 'event'], dtype='object')

In [7]:

df.day  # select a single column by attribute access

Out[7]:

0 1/1/2017
1 1/2/2017
2 1/3/2017
3 1/4/2017
4 1/5/2017
5 1/6/2017
Name: day, dtype: object

In [0]:

# another way of accessing a column
df['day']  # equivalent to df.day

Out[24]:

0 1/1/2017
1 1/2/2017
2 1/3/2017
3 1/4/2017
4 1/5/2017
5 1/6/2017
Name: day, dtype: object
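
The two forms are interchangeable for a name like day, but attribute access stops working when a column name contains a space or matches an existing DataFrame method. The sketch below uses two hypothetical columns, "wind speed" and "count", that are not part of the weather data, purely to show the difference.

```python
import pandas as pd

# Hypothetical columns chosen to show where attribute access breaks down
df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017"],
    "wind speed": [6, 7],   # contains a space, so df.wind speed is not valid Python
    "count": [3, 5],        # same name as the DataFrame.count method
})

print(df["wind speed"])     # bracket access works for any column name
print(df["count"])          # returns the column
print(callable(df.count))   # df.count is still the method, not the column
```

Bracket access is therefore the safer default in scripts, with attribute access as a convenience for simple names.
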

In [0]:

# get 2 or more columns by passing a list of column names
df[['day', 'event']]

Out[26]:

        day  event
0  1/1/2017   Rain
1  1/2/2017  Sunny
2  1/3/2017   Snow
3  1/4/2017   Snow
4  1/5/2017   Rain
5  1/6/2017  Sunny

In [0]:

# get all temperatures
df['temperature']

Out[28]:

0 32
1 35
2 28
3 24
4 32
5 31
Name: temperature, dtype: int64

In [0]:

# print max temperature
df['temperature'].max()

Out[29]:

35

In [0]:

# print min temperature
df['temperature'].min()

Out[30]:

24

In [0]:

# print summary statistics for the temperature column
df['temperature'].describe()

Out[31]:

count     6.000000
mean     30.333333
std       3.829708
min      24.000000
25%      28.750000
50%      31.500000
75%      32.000000
max      35.000000
Name: temperature, dtype: float64

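
Each figure reported by describe() can also be computed on its own. A minimal sketch using the six temperatures from the table; the variable name temperature is illustrative.

```python
import pandas as pd

temperature = pd.Series([32, 35, 28, 24, 32, 31], name="temperature")

print(temperature.count())         # number of non-missing values
print(temperature.mean())          # arithmetic mean
print(temperature.std())           # sample standard deviation (divides by n - 1)
print(temperature.quantile(0.25))  # the "25%" row of describe()
print(temperature.median())        # the "50%" row of describe()
```
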
In [10]:

# select the rows that have the maximum temperature
df[df.temperature == df.temperature.max()]

Out[10]:

        day  temperature  windspeed  event
1  1/2/2017           35          7  Sunny

In [11]:

# select the rows that have the maximum temperature (bracket notation)
df[df['temperature'] == df['temperature'].max()]

Out[11]:

        day  temperature  windspeed  event
1  1/2/2017           35          7  Sunny
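
The expression inside the brackets is a boolean Series, one True or False per row, and several such conditions can be combined. The sketch below is an illustration beyond the original cells: it combines two conditions with &, and each condition needs its own parentheses. The threshold of 30 is an arbitrary choice for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
})

# One True/False value per row
is_hottest = df["temperature"] == df["temperature"].max()
print(is_hottest)

# Combine conditions with & (and) or | (or); parentheses are required
warm_and_rainy = df[(df["temperature"] > 30) & (df["event"] == "Rain")]
print(warm_and_rainy)
```
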

In [0]:

# select only the day value(s) for the row(s) with the maximum temperature
df.day[df.temperature == df.temperature.max()]

Out[33]:

1 1/2/2017
Name: day, dtype: object
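
The same lookup can be written with loc, which takes the row condition and the column name in a single step, or with idxmax(), which returns the index label of the maximum. A minimal sketch; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "day": ["1/1/2017", "1/2/2017", "1/3/2017", "1/4/2017", "1/5/2017", "1/6/2017"],
    "temperature": [32, 35, 28, 24, 32, 31],
    "windspeed": [6, 7, 2, 7, 4, 2],
    "event": ["Rain", "Sunny", "Snow", "Snow", "Rain", "Sunny"],
})

# Row condition and column name together; returns a Series with every match
mask = df["temperature"] == df["temperature"].max()
hottest_days = df.loc[mask, "day"]
print(hottest_days)

# idxmax() gives the index label of the first maximum; loc then reads one cell
hottest_label = df["temperature"].idxmax()
hottest_day = df.loc[hottest_label, "day"]
print(hottest_day)
```

The mask form keeps every row that ties for the maximum, while idxmax() reports only the first one.
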
