Python Pandas Series

Pandas is a Python package that provides data structures called Series and DataFrames to make working with labeled or relational data easy. Series are essentially columns of data, while DataFrames are collections of Series that form a multi-dimensional table. Pandas aims to be the fundamental high-level building block for practical real-world data analysis in Python.


Pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to
make working with 'relational' or 'labeled' data both easy and intuitive. It aims to be the
fundamental high-level building block for doing practical, real world data analysis in Python.

pandas is well suited for many different kinds of data:

• Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet.
• Ordered and unordered (not necessarily fixed-frequency) time series data.
• Arbitrary matrix data with row and column labels.
• Any other form of observational / statistical data sets.

To install pandas, run the following in a terminal:

pip install pandas

Alternatively, if you're currently viewing this article in a Jupyter notebook, you can run this cell:

!pip install pandas

The ! at the beginning runs the rest of the line as a shell command, as if it were typed in a terminal.
To import pandas we usually import it with a shorter name since it's used so much:

import pandas as pd
Now to the basic components of pandas.

Core components of pandas: Series and DataFrames


The primary two components of pandas are the Series and DataFrame.
A Series is essentially a column, and a DataFrame is a multi-dimensional table
made up of a collection of Series.

DataFrames and Series are quite similar in that many operations that you can do
with one you can do with the other, such as filling in null values and calculating
the mean.
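
As a minimal sketch of that overlap, the snippet below calls mean() and fillna() on both a Series and a DataFrame. The sample values and the names s_filled and df_filled are made up for illustration.

```python
import numpy as np
import pandas as pd

# A Series with one missing value
s = pd.Series([1.0, np.nan, 3.0])

# A DataFrame whose columns each contain one missing value
df = pd.DataFrame({"X": [1.0, np.nan, 3.0], "Y": [4.0, 5.0, np.nan]})

# mean() skips NaN by default on both objects
print(s.mean())    # a single number
print(df.mean())   # one mean per column, returned as a Series

# fillna() has the same call shape on both objects
s_filled = s.fillna(0)
df_filled = df.fillna(0)
print(s_filled)
print(df_filled)
```

The DataFrame version simply applies the operation column by column, which is why the two objects feel so similar in day-to-day use.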

Run the following imports to start:

import pandas as pd
import numpy as np

To check the installed pandas version:

import pandas as pd
print(pd.__version__)

Key used in the examples:

df    a pandas DataFrame object
s     a pandas Series object

Create a Series:

import pandas as pd

L = [2, 4, 6, 8, 10]

# Pass the list literal directly...
s = pd.Series([2, 4, 6, 8, 10])

# ...or pass a list variable; both give the same Series
s = pd.Series(L)

print(s)

Sample Output:

0     2
1     4
2     6
3     8
4    10
dtype: int64

Create a DataFrame:

import pandas as pd

df = pd.DataFrame({'X': [78, 85, 96, 80, 86],
                   'Y': [84, 94, 89, 83, 86],
                   'Z': [86, 97, 96, 72, 83]})

print(df)

Sample Output:

    X   Y   Z
0  78  84  86
1  85  94  97
2  96  89  96
3  80  83  72
4  86  86  83
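
To illustrate the idea that a DataFrame is a collection of Series, the sketch below pulls one column out of a small DataFrame and checks its type. The shortened data and the names col and rebuilt are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"X": [78, 85, 96], "Y": [84, 94, 89]})

# Selecting a single column returns a Series
col = df["X"]
print(type(col).__name__)

# The Series remembers its column name and shares the row index
print(col.name)
print(col.index.equals(df.index))

# Putting the columns back together gives an equal DataFrame
rebuilt = pd.DataFrame({"X": df["X"], "Y": df["Y"]})
print(rebuilt.equals(df))
```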
Create a Series in Python – pandas

A Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). There are different ways to create a Series in pandas: an empty Series, a Series from an array without an index, a Series from an array with an index, a Series from a dictionary, and a Series from a scalar value. The axis labels are collectively called the index.

Create an Empty Series:

The most basic Series is an empty one. The example below creates an empty Series. The dtype is passed explicitly because the default dtype of an empty Series differs between pandas versions.

# Example: create an empty Series

import pandas as pd
s = pd.Series(dtype='float64')
print(s)

output:

Series([], dtype: float64)

Create a Series from an array without an index:

Let's see an example of how to create a Series from an array.

# Example: create a Series from an array
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data)
print(s)

output:

0    a
1    b
2    c
3    d
4    e
5    f
dtype: object

Create a Series from an array with an index:

This example shows how to create a Series with an explicit index. An index starting from 1000 is used in the example below.

# Example: create a Series from an array with a specified index

import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data, index=[1000, 1001, 1002, 1003, 1004, 1005])
print(s)

output:

1000    a
1001    b
1002    c
1003    d
1004    e
1005    f
dtype: object

Create a Series from a dictionary:

This example shows how to create a Series from a dictionary. The dictionary keys are used to construct the index.

# Example: create a Series from a dictionary

import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}

# Without an index, the dictionary keys become the index
s = pd.Series(data)
print(s)

# With an index, values are looked up by label
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)

In the second Series the index order is maintained and the missing element 'd' is filled with NaN (Not a Number). So the output of the second print will be

output:

b    1.0
c    2.0
d    NaN
a    0.0
dtype: float64

Create a Series from a scalar value:

This example shows how to create a Series from a scalar value. If data is a scalar value, an index must be provided. The value will be repeated to match the length of the index.

# Example: create a Series from a scalar

import pandas as pd
s = pd.Series(7, index=[0, 1, 2, 3])
print(s)

output:

0    7
1    7
2    7
3    7
dtype: int64

How to Access the Elements of a Series in Python – pandas

• Accessing data from a Series by position
• Retrieving data using a label (index)

Accessing data from a Series by position:

Accessing or retrieving the first element:

Retrieve the first element. Counting starts from zero, which means the first element is stored at position zero, and so on.

# create a Series
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data)

# retrieve the first element
# (the default index labels are 0 to 5, so label 0 is also the first position)
print(s[0])

output:

a

Access or retrieve the first three elements in the Series:

# create a Series
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data)

# retrieve the first three elements
print(s[:3])

output:

0    a
1    b
2    c
dtype: object

Access or retrieve the last three elements in the Series:

# create a Series
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data)

# retrieve the last three elements
print(s[-3:])

output:

3    d
4    e
5    f
dtype: object

Accessing data from a Series with labels (index):

A Series is like a fixed-size dictionary in that you can get and set values by index label.

Retrieve a single element using an index label:

# create a Series
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data, index=[100, 101, 102, 103, 104, 105])

# retrieve the element whose label is 102
print(s[102])

output:

c

Retrieve multiple elements using index labels:

# create a Series
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
s = pd.Series(data, index=[100, 101, 102, 103, 104, 105])

# retrieve multiple elements by passing a list of labels
print(s[[102, 103, 104]])

output:

102    c
103    d
104    e
dtype: object

Note: If a label is not present in the index, a KeyError exception will be raised.

http://www.datasciencemadesimple.com/access-elements-series-python-pandas/

Python Pandas - Series

Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index.

pandas.Series

A pandas Series can be created using the following constructor −

pandas.Series(data, index, dtype, copy)

The parameters of the constructor are as follows −

Sr.No   Parameter   Description
1       data        Takes various forms such as ndarray, list, dict, or a constant.
2       index       Index values must be hashable and the same length as data; they need not be unique. Defaults to 0, 1, ..., n-1 (as from np.arange(n)) if no index is passed.
3       dtype       The data type. If None, the data type is inferred.
4       copy        Copy the input data. Default False.
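
The four parameters can be seen together in a small sketch. The array contents and the variable names s1, s2 and s3 are illustrative assumptions; only the constructor arguments listed above are used.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30])

# data + index: one label per element
s1 = pd.Series(arr, index=["x", "y", "z"])

# dtype: force float64 instead of letting pandas infer an integer type
s2 = pd.Series(arr, index=["x", "y", "z"], dtype="float64")

# copy=True: the Series gets its own data, so later edits to arr do not show up
s3 = pd.Series(arr, copy=True)
arr[0] = 99

print(s1.index.tolist())
print(s2.dtype)
print(s3.iloc[0])   # still the original first value
```

Whether a Series built without copy=True shares memory with the source array depends on the pandas version, so passing copy=True is the safer choice when the source array will be modified later.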

A Series can be created using various inputs like −

• Array
• Dict
• Scalar value or constant

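Create an Empty Series

The most basic Series that can be created is an empty Series. The dtype is given explicitly because the default for an empty Series depends on the pandas version.

Example
#import the pandas library and aliasing as pd
import pandas as pd
s = pd.Series(dtype='float64')
print(s)

Its output is as follows −
Series([], dtype: float64)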

Create a Series from ndarray

If data is an ndarray, then the index passed must be of the same length. If no index is passed, then by default the index will be range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array) - 1].

Example 1

#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data)
print(s)

Its output is as follows −
0    a
1    b
2    c
3    d
dtype: object

We did not pass any index, so by default it assigned index values ranging from 0 to len(data) - 1, i.e., 0 to 3.

Example 2

#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data, index=[100, 101, 102, 103])
print(s)

Its output is as follows −
100    a
101    b
102    c
103    d
dtype: object

We passed the index values here, so the customized index values appear in the output.

Create a Series from dict

A dict can be passed as input. If no index is specified, the dictionary keys are used to construct the index, in the dictionary's insertion order (older pandas versions sorted the keys). If an index is passed, the values in data corresponding to the labels in the index will be pulled out.

Example 1

#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data)
print(s)

Its output is as follows −
a    0.0
b    1.0
c    2.0
dtype: float64

Observe − Dictionary keys are used to construct the index.

Example 2

#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)

Its output is as follows −
b    1.0
c    2.0
d    NaN
a    0.0
dtype: float64

Observe − The order of the passed index is preserved, and the missing element is filled with NaN (Not a Number).
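
The label matching described above can be checked directly. The sketch below assumes a recent pandas version, where a dict's insertion order is kept when no index is given; the deliberately unsorted dictionary is made up for illustration.

```python
import pandas as pd

data = {"c": 2.0, "a": 0.0, "b": 1.0}

# No index: labels follow the dict's insertion order
s = pd.Series(data)

# Explicit index: values are matched by label; 'd' has no entry, so it becomes NaN
s2 = pd.Series(data, index=["b", "c", "d", "a"])

print(s.index.tolist())
print(s2.isna())        # True only for the missing label
print(s.sort_index())   # sort explicitly when alphabetical order is wanted
```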

Create a Series from Scalar

If data is a scalar value, an index must be provided. The value will be repeated to match the length of the index.

#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)

Its output is as follows −
0    5
1    5
2    5
3    5
dtype: int64

Accessing Data from Series with Position

Data in a Series can be accessed by position, much like in an ndarray. The examples use .iloc, which always selects by position; plain s[0] on a Series with string labels relies on a positional fallback that recent pandas versions deprecate.

Example 1

Retrieve the first element. Counting starts from zero, which means the first element is stored at position zero, and so on.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# retrieve the first element
print(s.iloc[0])

Its output is as follows −
1

Example 2

Retrieve the first three elements in the Series. If a : is placed in front of a position, all items before that position are extracted. If two positions are given with : between them, the items from the first position up to, but not including, the second are extracted.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# retrieve the first three elements
print(s.iloc[:3])

Its output is as follows −
a    1
b    2
c    3
dtype: int64

Example 3

Retrieve the last three elements.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# retrieve the last three elements
print(s.iloc[-3:])

Its output is as follows −
c    3
d    4
e    5
dtype: int64
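
Position and label can disagree when the index itself holds integers. The following sketch, using a made-up Series whose integer labels are deliberately out of order, shows .iloc selecting by position and .loc selecting by label.

```python
import pandas as pd

# Integer labels that do not match the positions
s = pd.Series(["a", "b", "c"], index=[2, 0, 1])

first_by_position = s.iloc[0]   # element stored first
value_at_label_0 = s.loc[0]     # element whose label is 0
last_by_position = s.iloc[-1]   # negative positions count from the end

print(first_by_position, value_at_label_0, last_by_position)
```

Using .iloc and .loc explicitly avoids having to remember which rule plain square brackets apply for a given index type.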

Retrieve Data Using Label (Index)

A Series is like a fixed-size dict in that you can get and set values by index label.

Example 1

Retrieve a single element using an index label value.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# retrieve a single element
print(s['a'])

Its output is as follows −
1

Example 2

Retrieve multiple elements using a list of index label values.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# retrieve multiple elements
print(s[['a', 'c', 'd']])

Its output is as follows −
a    1
c    3
d    4
dtype: int64

Example 3

If a label is not contained in the index, an exception is raised.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# look up a label that does not exist
print(s['f'])

Its output is as follows −

KeyError: 'f'
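
When a label may be absent, the KeyError can be avoided or handled. The sketch below shows three common options on the same Series; the variable names are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

# 'in' tests the index labels, not the values
has_f = "f" in s

# get() returns a default instead of raising when the label is missing
value_or_none = s.get("f")
value_or_zero = s.get("f", 0)

# Catch the exception when direct lookup is preferred
try:
    s["f"]
    raised = False
except KeyError:
    raised = True

# Existing labels can be read and overwritten like dict keys
s["a"] = 10

print(has_f, value_or_none, value_or_zero, raised, s["a"])
```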

Python Programs

# 1. Creating a Series from a list

import pandas as pd
import numpy as np
S1 = pd.Series([101, 102, 103, 104, 105])
print(S1)

>>>
0    101
1    102
2    103
3    104
4    105
dtype: int64

# 2. Assigning index labels to elements of a Series

import pandas as pd

S1 = pd.Series([101, 102, 103, 104, 105], index=['A1', 'B1', 'C1', 'D1', 'E1'])

print(S1)

>>>
A1    101
B1    102
C1    103
D1    104
E1    105
dtype: int64

# 3. Create a Series using the range() function

S2 = pd.Series(range(10, 21))

print(S2)

>>>
0     10
1     11
2     12
3     13
4     14
5     15
6     16
7     17
8     18
9     19
10    20
dtype: int64

# 4. Create a Series using the range() function and changing the data type

S2 = pd.Series(range(10), dtype='float32')
print(S2)

>>>
0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
5    5.0
6    6.0
7    7.0
8    8.0
9    9.0
dtype: float32

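# 5. Printing Series elements, index and attributes

# np.nan is used because the np.NaN alias was removed in NumPy 2.0
S3 = pd.Series([20, np.nan, np.nan, 45, 67, 89, 54, 45, 23],
               index=['Anil', 'BN', 'BM', 'Ankit', 'Ram', 'Vishal', 'Ankita', 'Lokesh', 'Venkat'])

print(S3)
print(S3.index)            # index labels
print(S3.values)           # underlying NumPy array
print(S3.dtype)            # data type of the values
print(S3.shape)            # (number of elements,)
print(S3.nbytes)           # bytes used by the values
print(S3.ndim)             # number of dimensions, always 1 for a Series
print(S3.dtype.itemsize)   # bytes per element (Series.itemsize is no longer available in current pandas)
print(S3.size)             # number of elements
print(S3.hasnans)          # True if any value is NaN

>>>
Anil      20.0
BN         NaN
BM         NaN
Ankit     45.0
Ram       67.0
Vishal    89.0
Ankita    54.0
Lokesh    45.0
Venkat    23.0
dtype: float64
Index(['Anil', 'BN', 'BM', 'Ankit', 'Ram', 'Vishal', 'Ankita', 'Lokesh',
       'Venkat'],
      dtype='object')
[20. nan nan 45. 67. 89. 54. 45. 23.]
float64
(9,)
72
1
8
9
True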
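
The attribute values above are related to each other. As a rough sketch on a shorter, made-up Series, the byte count is the number of elements times the bytes per element, and count() reports how many values are not NaN.

```python
import numpy as np
import pandas as pd

s = pd.Series([20, np.nan, np.nan, 45, 67], index=list("abcde"))

# nbytes = number of elements * bytes per element
bytes_match = s.size * s.dtype.itemsize == s.nbytes

# count() ignores NaN, while size includes it
non_missing = s.count()
missing = s.isna().sum()

print(bytes_match, s.size, non_missing, missing)
```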
# 6. Accessing elements of a Series

print(S3)

>>>
Anil      20.0
BN         NaN
BM         NaN
Ankit     45.0
Ram       67.0
Vishal    89.0
Ankita    54.0
Lokesh    45.0
Venkat    23.0
dtype: float64

# S3 has string labels, so .iloc is used for positional access
print(S3.iloc[6])

>>>54.0

print(S3.iloc[:2])

>>>
Anil    20.0
BN       NaN
dtype: float64

print(S3.iloc[1:4])

>>>
BN        NaN
BM        NaN
Ankit    45.0
dtype: float64

# 7. Series from two different lists

dayno = [1, 2, 3, 4, 5, 6, 7]
dayname = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
           "Saturday", "Sunday"]

ser_week = pd.Series(dayname, index=dayno)
print(ser_week)

>>>
1       Monday
2      Tuesday
3    Wednesday
4     Thursday
5       Friday
6     Saturday
7       Sunday
dtype: object

# 8. Creating a Series with integer, NaN and float values
# Look at the change of data type of the Series: the integers become floats

# import numpy as np
S1 = pd.Series([101, 102, 103, 104, np.nan, 90.7])
print(S1)

>>>
0    101.0
1    102.0
2    103.0
3    104.0
4      NaN
5     90.7
dtype: float64
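
The switch to float64 happens because NaN is itself a float. If the integers should stay integers, pandas offers a nullable integer dtype, named "Int64" with a capital I. The sketch below contrasts the two; the values are made up.

```python
import pandas as pd

# Default behaviour: a missing value forces the integers to float64
s_float = pd.Series([101, 102, None])

# Nullable integer dtype: integers stay integers, the gap is shown as <NA>
s_int = pd.Series([101, 102, None], dtype="Int64")

print(s_float.dtype)
print(s_int.dtype)
print(s_int)
```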

# 9. Creating a Series from a dictionary
# Keys become the index labels and values become the Series data
# Check the data type: string values give dtype object

D1 = {'1': 'Monday', '2': 'Tuesday', '3': 'Wednesday', '4': 'Thursday',
      '5': 'Friday', '6': 'Saturday', '7': 'Sunday'}
print(D1)
S5 = pd.Series(D1)
print(S5)

>>>
{'1': 'Monday', '2': 'Tuesday', '3': 'Wednesday', '4': 'Thursday',
'5': 'Friday', '6': 'Saturday', '7': 'Sunday'}
1       Monday
2      Tuesday
3    Wednesday
4     Thursday
5       Friday
6     Saturday
7       Sunday
dtype: object

# 10. Creating a Series using a scalar/constant value

S9 = pd.Series(90.7, index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])
print(S9)

>>>
a    90.7
b    90.7
c    90.7
d    90.7
e    90.7
f    90.7
g    90.7
dtype: float64

S7 = pd.Series(90)
print(S7)

>>>
0    90
dtype: int64

S8 = pd.Series(90, index=[1])
print(S8)

>>>
1    90
dtype: int64

# 11. Specifying the range() function in the index argument to generate a Series with a constant/scalar value

S90 = pd.Series(95, index=range(5))
print(S90)

>>>
0    95
1    95
2    95
3    95
4    95
dtype: int64

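# 12. iloc indexer (selection by position; used with square brackets, not called like a method)

S8 = pd.Series([1, 2, 3, 4, 5, 6, 7], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# positions 1 to 4; the stop position 5 is excluded
print(S8.iloc[1:5])

>>>
b    2
c    3
d    4
e    5
dtype: int64

# 13. loc indexer (selection by label)

# labels 'b' to 'e'; the stop label 'e' is included
print(S8.loc['b':'e'])

>>>
b    2
c    3
d    4
e    5
dtype: int64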
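
The two slices above return the same rows for different reasons. A short sketch on the same data makes the difference explicit: a positional slice leaves out its stop position, while a label slice keeps its stop label.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 6, 7], index=list("abcdefg"))

by_pos = s.iloc[1:5]        # positions 1, 2, 3, 4 (5 is excluded)
by_label = s.loc["b":"e"]   # labels b, c, d, e ('e' is included)

print(by_pos.equals(by_label))
print(s.iloc[1:4].tolist())       # stops before position 4
print(s.loc["b":"d"].tolist())    # stops at label 'd'
```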
# 14. Extract the values at specified index positions - take() method

dayno = [91, 92, 93, 94, 95, 96, 97]
dayname = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
           "Saturday", "Sunday"]

ser_week = pd.Series(dayname, index=dayno)

print(ser_week)

>>>
91       Monday
92      Tuesday
93    Wednesday
94     Thursday
95       Friday
96     Saturday
97       Sunday
dtype: object

# take() works with positions, not labels
pos = [0, 2, 5]
print(ser_week.take(pos))

>>>
91       Monday
93    Wednesday
96     Saturday
dtype: object

# square brackets on an integer index look up the label 91
print(ser_week[91])

>>>
Monday
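
Since take() is positional, the same rows can also be reached with .iloc, or with .loc if the labels are known. The sketch below uses a shortened, made-up version of the week Series to compare the three.

```python
import pandas as pd

ser = pd.Series(["Monday", "Tuesday", "Wednesday"], index=[91, 92, 93])

picked = ser.take([0, 2])        # positions 0 and 2
same = ser.iloc[[0, 2]]          # the same positions through .iloc
by_label = ser.loc[[91, 93]]     # the same rows through their labels

print(picked)
print(picked.equals(same), picked.equals(by_label))
```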
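# 15. Stack 2 Series vertically (end to end)

# Series.append() was removed in pandas 2.0; pd.concat() does the same job
ss1 = pd.Series([1, 2, 3, 4, 5], index=[11, 12, 13, 14, 15])

ss2 = pd.Series(['a', 'b', 'c', 'd', 'e'])

print(pd.concat([ss1, ss2]))

>>>
11    1
12    2
13    3
14    4
15    5
0     a
1     b
2     c
3     d
4     e
dtype: object

# The original index labels of both Series are kept; nothing is renumbered

print(ss1)
>>>
11    1
12    2
13    3
14    4
15    5
dtype: int64

print(ss2)
>>>
0    a
1    b
2    c
3    d
4    e
dtype: object

# The result can also be stored in a new Series
ss3 = pd.concat([ss1, ss2])

print(ss3)

>>>
11    1
12    2
13    3
14    4
15    5
0     a
1     b
2     c
3     d
4     e
dtype: object

# Again, the original index labels are kept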
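
If the carried-over labels are not wanted, the combined Series can be given a fresh index. The sketch below uses pd.concat, which is available across pandas versions, with and without ignore_index=True; the shortened data and the names kept and renumbered are illustrative.

```python
import pandas as pd

ss1 = pd.Series([1, 2, 3], index=[11, 12, 13])
ss2 = pd.Series(["a", "b", "c"])

# Default: each Series keeps its own labels
kept = pd.concat([ss1, ss2])

# ignore_index=True: labels are replaced with 0, 1, 2, ...
renumbered = pd.concat([ss1, ss2], ignore_index=True)

print(kept.index.tolist())
print(renumbered.index.tolist())
```

A fresh index is usually the better choice when the original labels carry no meaning, because duplicate or unordered labels make later label-based lookups harder to reason about.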

head() and tail() methods

head() Function in Python (Get First N Rows):

head() with no arguments returns the first five rows of the Series; head(n) returns the first n rows.

tail() Function in Python (Get Last N Rows):

tail() with no arguments returns the last five rows of the Series; tail(n) returns the last n rows.

import pandas as pd

S8 = pd.Series([1, 2, 3, 4, 5, 6, 7], index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])
print(S8.head())     # first five rows: a to e
print(S8.tail())     # last five rows: c to g
print(S8.head(7))    # all seven rows
print(S8.tail(6))    # last six rows: b to g
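
Both methods also accept a negative n, which is easy to overlook. As a small sketch on the same data, head(-2) returns everything except the last two rows and tail(-2) returns everything except the first two.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 6, 7], index=list("abcdefg"))

first_three = s.head(3).tolist()       # first three rows
last_two = s.tail(2).tolist()          # last two rows
all_but_last_two = s.head(-2).tolist() # drop the last two rows
all_but_first_two = s.tail(-2).tolist()# drop the first two rows

print(first_three, last_two)
print(all_but_last_two, all_but_first_two)
```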
