Python Pandas - Home
Python Pandas - Introduction
Python Pandas - Environment Setup
Python Pandas - Basics
Python Pandas - Introduction to Data Structures
Python Pandas - Index Objects
Python Pandas - Panel
Python Pandas - Basic Functionality
Python Pandas - Indexing & Selecting Data
Python Pandas - Series
Python Pandas - Series
Python Pandas - Slicing a Series Object
Python Pandas - Attributes of a Series Object
Python Pandas - Arithmetic Operations on Series Object
Python Pandas - Converting Series to Other Objects
Python Pandas - DataFrame
Python Pandas - DataFrame
Python Pandas - Accessing DataFrame
Python Pandas - Slicing a DataFrame Object
Python Pandas - Modifying DataFrame
Python Pandas - Removing Rows from a DataFrame
Python Pandas - Arithmetic Operations on DataFrame
Python Pandas - IO Tools
Python Pandas - IO Tools
Python Pandas - Working with CSV Format
Python Pandas - Reading & Writing JSON Files
Python Pandas - Reading Data from an Excel File
Python Pandas - Writing Data to Excel Files
Python Pandas - Working with HTML Data
Python Pandas - Clipboard
Python Pandas - Working with HDF5 Format
Python Pandas - Comparison with SQL
Python Pandas - Data Handling
Python Pandas - Sorting
Python Pandas - Reindexing
Python Pandas - Iteration
Python Pandas - Concatenation
Python Pandas - Statistical Functions
Python Pandas - Descriptive Statistics
Python Pandas - Working with Text Data
Python Pandas - Function Application
Python Pandas - Options & Customization
Python Pandas - Window Functions
Python Pandas - Aggregations
Python Pandas - Merging/Joining
Python Pandas - MultiIndex
Python Pandas - Basics of MultiIndex
Python Pandas - Indexing with MultiIndex
Python Pandas - Advanced Reindexing with MultiIndex
Python Pandas - Renaming MultiIndex Labels
Python Pandas - Sorting a MultiIndex
Python Pandas - Binary Operations
Python Pandas - Binary Comparison Operations
Python Pandas - Boolean Indexing
Python Pandas - Boolean Masking
Python Pandas - Data Reshaping & Pivoting
Python Pandas - Pivoting
Python Pandas - Stacking & Unstacking
Python Pandas - Melting
Python Pandas - Computing Dummy Variables
Python Pandas - Categorical Data
Python Pandas - Categorical Data
Python Pandas - Ordering & Sorting Categorical Data
Python Pandas - Comparing Categorical Data
Python Pandas - Handling Missing Data
Python Pandas - Missing Data
Python Pandas - Filling Missing Data
Python Pandas - Interpolation of Missing Values
Python Pandas - Dropping Missing Data
Python Pandas - Calculations with Missing Data
Python Pandas - Handling Duplicates
Python Pandas - Duplicated Data
Python Pandas - Counting & Retrieving Unique Elements
Python Pandas - Duplicated Labels
Python Pandas - Grouping & Aggregation
Python Pandas - GroupBy
Python Pandas - Time-series Data
Python Pandas - Date Functionality
Python Pandas - Timedelta
Python Pandas - Sparse Data Structures
Python Pandas - Sparse Data
Python Pandas - Visualization
Python Pandas - Visualization
Python Pandas - Additional Concepts
Python Pandas - Caveats & Gotchas

Python Pandas - Reading Tabular Data

Quiz

The Pandas library in Python provides a wide variety of functions to read tabular data from different sources, including CSV, Excel, SQL databases, JSON files, and more. Depending on your data format, you can choose an appropriate function (read_table(), read_csv(), read_excel(), etc.) to load the data into a Pandas DataFrame for further analysis.

In this tutorial, we will learn about commonly used Pandas methods for reading tabular data, including examples for each format. Whether you're working with CSV files, Excel spreadsheets, or other delimited text files, Pandas makes it easy to load and analyze your tabular data in Pandas.

Common Methods in Pandas for Reading Tabular Data

Pandas provides several functions to load tabular data into Pandas objects. You can choose an appropriate method depending on the file format. The most commonly used methods are −

read_table(): Used for reading tab-separated values (TSV) or other delimited text files.
read_csv(): Used for reading CSV (comma-separated values) files.
read_excel(): Used for reading Excel files (.xls, .xlsx).
read_sql(): Used for reading data from a SQL database.
read_json(): Used for reading JSON files.
read_html(): Used for reading HTML tables.

Reading Tab-Delimited Files

The read_table() function is used to read tab-separated values (TSV) or other delimited files. This method assumes that the data provided in the text file is separated by tabs (\t) by default.

Example

This example demonstrates reading tabular data from a tab-delimited text file to Pandas DataFrame using the read_table() method. In this example, we explicitly specified a column name to set it as the DataFrame index using the index_col parameter.

import pandas as pd
# Import StringIO to load a file-like object
from io import StringIO

# Create a tab-delimited data
data = """
Sr.no\tName\tGender\tAge
1\tBraund\tmale\t22
2\tCumings\tfemale\t38
3\tHeikkinen\tfemale\t26
4\tFutrelle\tfemale\t35
"""

# Use StringIO to convert the string data into a file-like object
obj = StringIO(data)

# Reading tab-delimited data
df = pd.read_table(obj, index_col="Sr.no")

print("DataFrame from Tab-Delimited File:")
print(df)

Following is an output of the above code −

DataFrame from Tab-Delimited File:

	Name	Gender	Age
Sr.no
1	Braund	male	22
2	Cumings	female	38
3	Heikkinen	female	26
4	Futrelle	female	35

Reading CSV Files

The most common format for tabular data is CSV (comma-separated values). Pandas provides the read_csv() method to read tabular data from CSV files into a DataFrame. This method provides various customization options through its parameters, like skiprows to skip rows, dtype to specify column data types, and na_values to treat specific values as NaN, and more.

Example

This example shows loading CSV data into Pandas DataFrame using the read_csv() method.

import pandas as pd

url ="https://raw.githubusercontent.com/Opensourcefordatascience/Data-sets/master/blood_pressure.csv"

# Read a CSV file into a DataFrame
df = pd.read_csv(url)

# Display the first few rows
print("Loaded CSV Tabular Data:")
print(df.head())

When we run the above program, it produces the following result −

Loaded CSV Tabular Data:

	patient	sex	agegrp	bp_before	bp_after
0	1	Male	30-45	143	153
1	2	Male	30-45	163	170
2	3	Male	30-45	153	168
3	4	Male	30-45	153	142
4	5	Male	30-45	146	141

Reading Excel Files

Excel is one of the most popular formats for storing tabular data. Pandas provides the read_excel() function to load data from Excel files (.xls, .xlsx).

Example

This example shows how to read tabular data from an excel file to Pandas DataFrame using the read_excel() method.

import pandas as pd

# Read an Excel file 
df = pd.read_excel('data.xlsx')

# Print the DataFrame
print("DataFrame from Excel File:")
print(df)

While executing the above code we get the following output −

DataFrame from Excel File:

	Car	Date_of_purchase
0	BMW	10-10-2024
1	Lexus	12-10-2024
2	Audi	17-10-2024
3	Mercedes	16-10-2024
4	Jaguar	19-10-2024
5	Bentley	22-10-2024

Reading Data from SQL Databases

Pandas provides the read_sql() method to load SQL table, execute SQL queries and load the result into a DataFrame.

Example

This example shows creating an in-memory SQL table from a DataFrame and loading back to new DataFrame using the to_sql() and read_sql() methods respectively.

import pandas as pd
from sqlite3 import connect

# Create a database connection
conn = connect(':memory:')

# Create a sample DataFrame
df = pd.DataFrame({
"Col_1": list("abc"),
"Col_2": list(range(1, 4)),
"Col_3": pd.date_range("20240101", periods=3)
})

# Save the DataFrame in a SQL table with 'test_data' name
df.to_sql(name='test_data', con=conn)

# Read SQL table into a DataFrame
result = pd.read_sql('SELECT Col_1, Col_2, Col_3 FROM test_data', conn)

# Display the retrieved data
print("DataFrame from SQL Table:")
print(result.head())

When we run the above program, it produces the following result −

DataFrame from SQL Table:

	Col_1	Col_2	Col_3
0	a	1	2024-01-01 00:00:00
1	b	2	2024-01-02 00:00:00
2	c	3	2024-01-03 00:00:00

Reading JSON Files

JSON (JavaScript Object Notation) is a popular data interchange format. You can use read_json() to read data from JSON files into a DataFrame.

Example

This example shows reading JSON file data into Pandas DataFrame using the read_json() method.

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
"Col_1": list("abc"),
"Col_2": list(range(1, 4)),
"Col_3": pd.date_range("20240101", periods=3)
})

# Write DataFrame to a JSON file
df.to_json("output_json_file.json")

# Read JSON data into a DataFrame
result = pd.read_json("output_json_file.json")

# Display the retrieved data
print('DataFrame from JSON file:')
print(result.head())

When we run the above program, it produces the following result −

DataFrame from JSON file:

	Col_1	Col_2	Col_3
0	a	1	2024-01-01 00:00:00
1	b	2	2024-01-02 00:00:00
2	c	3	2024-01-03 00:00:00

Reading HTML Tables

If you need to read tables directly from an HTML file or webpage, read_html() can scrape and parse HTML tables into a Pandas DataFrames.

Example

This example demonstrates reading tabular data from a HTML file using the read_html() method.

import pandas as pd

# Create a HTML string
html_str = """
<table>
   <tr><th>Col_1</th><th>Col_2</th><th>Col_3</th></tr>
   <tr><td>a</td><td>b</td><td>c</td></tr>
   <tr><td>x</td><td>y</td><td>z</td></tr>
</table>
"""

# Save to a temporary file and read
with open("temp.html", "w") as f:
    f.write(html_str)

df = pd.read_html("temp.html")[0]

# Display the output
print("DataFrame from HTML File:")
print(df)

Following is an output of the above code −

DataFrame from HTML File:

	Col_1	Col_2	Col_3
0	a	b	c
1	x	y	z

Print Page