Python Processing CSV Data
Reading data from CSV (comma-separated values) files is a fundamental necessity in data science. Data often
arrives from various sources that export to CSV format so that other systems can use it. The
pandas library can read a CSV file in full, or in part for only a selected
group of columns and rows.
You can create the sample file in Windows Notepad by copying and pasting the data below. Save the file as input.csv using
the Save As, All files (*.*) option in Notepad.
id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
import pandas as pd
# Read the whole CSV file into a DataFrame
data = pd.read_csv('path/input.csv')
print(data)
When we execute the above code, it prints the full contents of the file as a table. Note that the function
has created an additional index column that starts at zero.
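The index behaviour noted above can be checked with a small sketch. It inlines two rows of the sample data through io.StringIO (an assumption made here so the example runs without a file on disk) and compares the default index with the optional index_col parameter of read_csv, which uses an existing column as the index instead.

```python
import io
import pandas as pd

# Two rows of the sample data, inlined so the sketch needs no file on disk.
csv_text = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
"""

# Default behaviour: pandas adds its own integer index starting at 0.
default_df = pd.read_csv(io.StringIO(csv_text))
print(default_df.index.tolist())   # [0, 1]

# Alternative: reuse the existing id column as the index.
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col="id")
print(indexed_df.index.tolist())   # [1, 2]
```

With index_col="id", the id values become row labels and no longer appear among the regular columns.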
import pandas as pd
data = pd.read_csv('path/input.csv')
# Slice the first five rows of the salary column
print(data[0:5]['salary'])
0 623.30
1 515.20
2 611.00
3 729.00
4 843.25
Name: salary, dtype: float64
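The output above holds only the first five salary values. When only the leading rows of a file are needed, one option is to stop parsing early with the nrows parameter of read_csv instead of loading everything and slicing afterwards. The sketch below is an illustration under that assumption, not the tutorial's own method; it inlines the sample data through io.StringIO so it runs on its own.

```python
import io
import pandas as pd

# Sample data inlined so the sketch needs no file on disk.
csv_text = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

# Parse only the first five data rows (the header line is not counted).
first_five = pd.read_csv(io.StringIO(csv_text), nrows=5)
print(first_five["salary"])

# For comparison: read everything, then slice the first five rows.
full = pd.read_csv(io.StringIO(csv_text))
same = full[0:5]["salary"].equals(first_five["salary"])
print(same)
```

For a file this small the two approaches are interchangeable; nrows mainly helps when the file is large and only a preview is wanted.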
import pandas as pd
data = pd.read_csv('path/input.csv')
# Select all rows for the salary and name columns
print(data.loc[:, ['salary', 'name']])
salary name
0 623.30 Rick
1 515.20 Dan
2 611.00 Tusar
3 729.00 Ryan
4 843.25 Gary
5 578.00 Rasmi
6 632.80 Pranab
7 722.50 Guru
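The output above keeps just the salary and name columns. A possible alternative, sketched here as an assumption rather than the tutorial's method, is to have read_csv parse only those columns through its usecols parameter. The sample rows are inlined through io.StringIO so the example is self-contained.

```python
import io
import pandas as pd

# Three rows of the sample data, inlined so the sketch needs no file on disk.
csv_text = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
"""

# Parse only the two columns of interest; the others are skipped while reading.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["salary", "name"])

# usecols ignores the order of the list, so columns come back in file order.
print(subset.columns.tolist())      # ['name', 'salary']

# Reorder explicitly if salary should come first.
reordered = subset[["salary", "name"]]
print(reordered.columns.tolist())   # ['salary', 'name']
```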
import pandas as pd
data = pd.read_csv('path/input.csv')
# Use the multi-axes indexing function loc to pick rows 1, 3 and 5
print(data.loc[[1, 3, 5], ['salary', 'name']])
salary name
1 515.2 Dan
3 729.0 Ryan
5 578.0 Rasmi
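import pandas as pd
data = pd.read_csv('path/input.csv')
# Select the salary and name columns for the row labels 2 through 6
print(data.loc[2:6, ['salary', 'name']])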
salary name
2 611.00 Tusar
3 729.00 Ryan
4 843.25 Gary
5 578.00 Rasmi
6 632.80 Pranab
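The output above includes the row labelled 6, which can surprise readers used to ordinary Python slices. Label-based slicing with loc includes both endpoints, while position-based slicing with iloc excludes the stop position. The sketch below illustrates the difference; it inlines the sample data through io.StringIO (an assumption for self-containment).

```python
import io
import pandas as pd

# Sample data inlined so the sketch needs no file on disk.
csv_text = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""
data = pd.read_csv(io.StringIO(csv_text))

# loc slices by label and includes both ends: labels 2, 3, 4, 5, 6.
by_label = data.loc[2:6, ["salary", "name"]]

# iloc slices by position and excludes the stop: positions 2, 3, 4, 5.
by_position = data.iloc[2:6][["salary", "name"]]

print(len(by_label), len(by_position))   # 5 4
```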