How To Work With Excel Spreadsheets Using Python

The document discusses how to work with Excel spreadsheets using the Python programming language. It describes how to open and read Excel files using the openpyxl module in Python. Some key use cases covered include importing new product data from an Excel file into a database, exporting database data to an Excel spreadsheet, and appending additional information to an existing spreadsheet. The document provides code examples for reading cell values, iterating through rows and columns, and slicing ranges of cells from a spreadsheet.

Uploaded by

sai prashanth
Copyright
© © All Rights Reserved

7/10/2021 How to Work with Excel Spreadsheets using Python?

Python Programming

How to Work with Excel Spreadsheets using Python 



by
Priyankur Sarkar (https://www.knowledgehut.com/blog/author/priyankur-sarkar)
31st Oct, 2019
Last updated on 07th Apr, 2021
16 mins read

Excel, developed by Microsoft, is one of the most popular and widely used spreadsheet applications. It lets you organize, analyze and store your
data in tabular sheets. From analysts and sales managers to CEOs, professionals in every field use Excel for quick statistics
and data crunching.

Spreadsheets are widely used because they are intuitive and can handle large datasets. Most importantly, they can
be used without any prior technical background.


Finding ways to work with Excel through code is worthwhile, since working with data in Python has some serious advantages over Excel’s
UI. The Python community has developed libraries to read, write and manipulate Excel documents.

You can check the quality of your spreadsheet by going over the checklist below:

• Does the spreadsheet represent static data?
• Does the spreadsheet mix data, calculations, and reports?
• Is the data in your spreadsheet complete and consistent?
• Does the spreadsheet have an organized worksheet structure?

This checklist will help you verify the quality of the spreadsheet you are going to work on.
https://www.knowledgehut.com/blog/programming/how-to-work-with-excel-using-python 1/21

Practical Applications  
In this article, we will use openpyxl to work with data. With this module, you can export data from a database into an Excel spreadsheet, or
convert an Excel spreadsheet into data structures your program can work with. There are many situations where a
package like openpyxl is useful. Let us discuss a few of them.

Importing New Products Into a Database 


Suppose you work at an online store. When the company wants to add new products to the store, someone prepares an Excel spreadsheet with a few
hundred rows containing the product name, description, price and a few other basic details, and hands it to you.

Now, to import this data, you need to iterate over each row of the spreadsheet and add each product to the database of the
online store.

Exporting Database Data Into a Spreadsheet


Suppose you have a database table that holds information about all your users, including their name, contact number, email
address, and so forth. The Marketing Team wants to contact all the users to promote a new product of the company. However, they have
neither access to the database nor any knowledge of using SQL to extract the information.

In this situation, openpyxl comes into play. You can use it to iterate over each user record and write the required information to an Excel
spreadsheet.

Appending Information to an Existing Spreadsheet


Consider the same online store example we discussed above. You have an Excel spreadsheet with a list of users and your job is to append to each row the total
amount they have spent in your store.

In order to perform this, you have to read the spreadsheet first and then iterate through each row and fetch the total amount spent from the Database. Finally,
you need to write it back to the spreadsheet.

Starting openpyxl
You can install the openpyxl package using pip. Open your terminal and write the following command: 

$ pip install openpyxl

After you have installed the package, you can create a simple spreadsheet of your own:

from openpyxl import Workbook

workbook = Workbook()

spreadsheet = workbook.active

spreadsheet["A1"] = "Hello"

spreadsheet["B1"] = "World!"

workbook.save(filename="HelloWorld.xlsx")

How to Read Excel Spreadsheets with openpyxl 


Let us start with the most important thing you can do with a spreadsheet, i.e. read it. We will use a Watch Sample Dataset
(https://github.com/realpython/materials/raw/master/openpyxl-excel-spreadsheets-python/reviews-sample.xlsx) which contains a list of 100 watches with
information like product name, product ID, review and so forth.

A Simple Way to Read an Excel Spreadsheet 


Let us start with opening our sample spreadsheet:

>>> from openpyxl import load_workbook

>>> workbook = load_workbook(filename="sample.xlsx")

>>> workbook.sheetnames

['Sheet 1']

>>> spreadsheet = workbook.active

>>> spreadsheet

<Worksheet "Sheet 1">

>>> spreadsheet.title

'Sheet 1'

In the example code above, we open the spreadsheet using load_workbook and then we check all the sheets that are available to work with using
workbook.sheetnames . Then Sheet 1 is automatically selected using workbook.active since it is the first sheet available. This is the most common way of
opening a spreadsheet.  

Now, let us see the code to retrieve data from the spreadsheet: 

>>> spreadsheet["A1"]

<Cell 'Sheet 1'.A1>

>>> spreadsheet["A1"].value

'marketplace'

>>> spreadsheet["F10"].value

"G-Shock Men's Grey Sport Watch"

You can retrieve both the cell object and the value it holds. Indexing the sheet returns the cell, and .value returns its contents. You can also get a cell by its row and column numbers using .cell() :

>>> spreadsheet.cell(row=10, column=6)

<Cell 'Sheet 1'.F10>

>>> spreadsheet.cell(row=10, column=6).value

"G-Shock Men's Grey Sport Watch"

Importing Data from a Spreadsheet 


In this section, we will discuss how to iterate through the data and convert it into a more useful format using Python.

Let us start with iterating through the data. There are several ways to iterate, and the right choice depends on what you need.

You can slice the data with a combination of rows and columns:

>>> spreadsheet["A1:C2"]

((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>),

 (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>))

You can also get whole rows or columns, or ranges of them:

>>> # Get all cells from column A 

>>> spreadsheet["A"]

(<Cell 'Sheet 1'.A1>,

 <Cell 'Sheet 1'.A2>, 

 ... 

 <Cell 'Sheet 1'.A99>, 

 <Cell 'Sheet 1'.A100>)

>>> # Get all cells for a range of columns

>>> spreadsheet["A:B"] 

((<Cell 'Sheet 1'.A1>, 

  <Cell 'Sheet 1'.A2>, 

  ... 

  <Cell 'Sheet 1'.A99>, 

  <Cell 'Sheet 1'.A100>), 

 (<Cell 'Sheet 1'.B1>, 

  <Cell 'Sheet 1'.B2>, 

  ... 

  <Cell 'Sheet 1'.B99>, 

  <Cell 'Sheet 1'.B100>)) 

>>> # Get all cells from row 5

>>> spreadsheet[5]

(<Cell 'Sheet 1'.A5>,

 <Cell 'Sheet 1'.B5>,

 ... 

 <Cell 'Sheet 1'.N5>,

 <Cell 'Sheet 1'.O5>)

>>> # Get all cells for a range of rows

>>> spreadsheet[5:6]

((<Cell 'Sheet 1'.A5>,

  <Cell 'Sheet 1'.B5>, 

  ... 

  <Cell 'Sheet 1'.N5>, 

  <Cell 'Sheet 1'.O5>), 

 (<Cell 'Sheet 1'.A6>, 

  <Cell 'Sheet 1'.B6>, 

  ... 

  <Cell 'Sheet 1'.N6>, 

  <Cell 'Sheet 1'.O6>))

openpyxl also provides the generators .iter_rows() and .iter_cols() , whose arguments let you limit the iteration to a range of rows and columns:

>>> for row in spreadsheet.iter_rows(min_row=1,
...                                  max_row=2,
...                                  min_col=1,
...                                  max_col=3):
...     print(row)

(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>)


(<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)

>>> for column in spreadsheet.iter_cols(min_row=1,
...                                     max_row=2,
...                                     min_col=1,
...                                     max_col=3):
...     print(column)

(<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.A2>)

(<Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.B2>) 

(<Cell 'Sheet 1'.C1>, <Cell 'Sheet 1'.C2>)

You can also pass the Boolean argument values_only and set it to True to get the cell values instead of the cell objects:

>>> for value in spreadsheet.iter_rows(min_row=1,
...                                    max_row=2,
...                                    min_col=1,
...                                    max_col=3,
...                                    values_only=True):
...     print(value)

('marketplace', 'customer_id', 'review_id')

('US', 3653882, 'R3O9SGZBVQBV76')

Since we are now done with iterating the data, let us now manipulate data using Python’s primitive data structures. 

Consider a situation where you want to extract information of a product from the sample spreadsheet and then store it into the dictionary. The key to the
dictionary would be the product ID.   
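The dictionary described above could be built along the following lines. This is a minimal sketch rather than the article's own code: the helper name rows_to_products, the stored fields, and the sample rows are made up for illustration, and the column positions assume the header order of the sample dataset (product_id is the fourth column, product_title the sixth). With openpyxl, the rows argument would be spreadsheet.iter_rows(min_row=2, values_only=True).

```python
# 0-based positions inside each row tuple (assumed from the sample header order).
PRODUCT_ID = 3
PRODUCT_PARENT = 4
PRODUCT_TITLE = 5
PRODUCT_CATEGORY = 6


def rows_to_products(rows):
    """Return a dictionary keyed by product ID.

    `rows` is any iterable of row tuples, for example the result of
    spreadsheet.iter_rows(min_row=2, values_only=True).
    """
    products = {}
    for row in rows:
        product_id = row[PRODUCT_ID]
        products[product_id] = {
            "parent": row[PRODUCT_PARENT],
            "title": row[PRODUCT_TITLE],
            "category": row[PRODUCT_CATEGORY],
        }
    return products


# Made-up rows standing in for real spreadsheet data.
sample_rows = [
    ("US", 1, "R-EXAMPLE-1", "P-EXAMPLE-1", 100, "Example Watch A", "Watches"),
    ("US", 2, "R-EXAMPLE-2", "P-EXAMPLE-2", 200, "Example Watch B", "Watches"),
]

products = rows_to_products(sample_rows)
print(products["P-EXAMPLE-1"]["title"])  # Example Watch A
```

If the same product ID appears in several rows, later rows overwrite earlier ones, which is usually acceptable when the product details are identical across reviews.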

Convert Data into Python classes


To convert data into Python data classes (https://realpython.com/python-data-classes/), let us first decide what we want to store and how to store it.  

The two essential elements that can be extracted from the data are as follows:

1. Products
   • ID
   • Title
   • Parent
   • Category

2. Review
   • ID
   • Customer ID
   • Headline
   • Body
   • Date

Let us implement the two elements: 

import datetime
from dataclasses import dataclass

@dataclass
class Product:
    id: str
    parent: str
    title: str
    category: str

@dataclass
class Review:
    id: str
    customer_id: str
    stars: int
    headline: str
    body: str
    date: datetime.datetime

The next step is to create a mapping between columns and the required fields: 

>>> for value in spreadsheet.iter_rows(min_row=1,
...                                    max_row=1,
...                                    values_only=True):
...     print(value)

('marketplace', 'customer_id', 'review_id', 'product_id', ...)

>>> # Or an alternative
>>> for cell in spreadsheet[1]:
...     print(cell.value)

marketplace
customer_id
review_id
product_id
product_parent
...
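The parsing script in this section imports column constants from a mapping module whose contents are not shown. One way such constants could be derived is sketched below. The helper build_column_index is hypothetical, the header tuple is typed in by hand to match the column names of the sample dataset, and only some of the constants are shown; with openpyxl the header could be read with next(spreadsheet.iter_rows(min_row=1, max_row=1, values_only=True)).

```python
def build_column_index(header):
    """Map each column name to its 0-based position in a row tuple."""
    return {name: position for position, name in enumerate(header)}


# Header row of the sample dataset, written out by hand for this sketch.
header = (
    "marketplace", "customer_id", "review_id", "product_id",
    "product_parent", "product_title", "product_category", "star_rating",
    "helpful_votes", "total_votes", "vine", "verified_purchase",
    "review_headline", "review_body", "review_date",
)

columns = build_column_index(header)

# Constants in the style expected by the parsing script (names assumed).
PRODUCT_ID = columns["product_id"]
PRODUCT_PARENT = columns["product_parent"]
PRODUCT_TITLE = columns["product_title"]
PRODUCT_CATEGORY = columns["product_category"]
REVIEW_ID = columns["review_id"]
REVIEW_CUSTOMER = columns["customer_id"]
REVIEW_STARS = columns["star_rating"]
REVIEW_DATE = columns["review_date"]

print(PRODUCT_ID, REVIEW_DATE)  # 3 14
```

Looking the positions up by name keeps the script working if columns are reordered, at the cost of failing with a KeyError when a column is renamed.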

Finally, let us parse the data in the spreadsheet into a list of Product and Review objects:

from datetime import datetime
from openpyxl import load_workbook
from classes import Product, Review
from mapping import PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE, \
    PRODUCT_CATEGORY, REVIEW_DATE, REVIEW_ID, REVIEW_CUSTOMER, \
    REVIEW_STARS, REVIEW_HEADLINE, REVIEW_BODY

# Open in read-only mode since you are not going to edit the spreadsheet
workbook = load_workbook(filename="watch_sample.xlsx", read_only=True)
spreadsheet = workbook.active

products = []
reviews = []

# Use values_only because you only need the cell values
for row in spreadsheet.iter_rows(min_row=2, values_only=True):
    product = Product(id=row[PRODUCT_ID],
                      parent=row[PRODUCT_PARENT],
                      title=row[PRODUCT_TITLE],
                      category=row[PRODUCT_CATEGORY])
    products.append(product)

    # You need to parse the date from the spreadsheet into a datetime format
    spread_date = row[REVIEW_DATE]
    parsed_date = datetime.strptime(spread_date, "%Y-%m-%d")

    review = Review(id=row[REVIEW_ID],
                    customer_id=row[REVIEW_CUSTOMER],
                    stars=row[REVIEW_STARS],
                    headline=row[REVIEW_HEADLINE],
                    body=row[REVIEW_BODY],
                    date=parsed_date)
    reviews.append(review)

print(products[0])
print(reviews[0])

After you execute the code, you will get an output that looks like this:

Product(id='A90FALZ1ZC',parent=937111370,...)

Review(id='D3O9OGZVVQBV76',customer_id=3903882,...)
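One caveat about the date parsing above: if the date column is stored as a real Excel date rather than as text, openpyxl returns a datetime object, and passing that to strptime raises a TypeError. A small defensive helper, sketched here under the assumption that text dates use the "%Y-%m-%d" format from the script (the name parse_review_date is made up), handles both cases:

```python
from datetime import date, datetime


def parse_review_date(value, fmt="%Y-%m-%d"):
    """Return a datetime for a cell value that may be text or already a date."""
    if isinstance(value, datetime):
        # Date-formatted cells usually arrive as datetime objects already.
        return value
    if isinstance(value, date):
        # Plain dates are promoted to midnight of the same day.
        return datetime(value.year, value.month, value.day)
    # Anything else is assumed to be text in the expected format.
    return datetime.strptime(value, fmt)


print(parse_review_date("2015-08-31"))
print(parse_review_date(datetime(2015, 8, 31, 12, 30)))
```

In the parsing loop, parsed_date = parse_review_date(row[REVIEW_DATE]) would then replace the direct strptime call.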

Appending Data 
To understand how to append data, let us go back to the first sample spreadsheet. We will open the document and append some data to it:

from openpyxl import load_workbook

# Start by opening the spreadsheet created earlier and selecting the main sheet
workbook = load_workbook(filename="HelloWorld.xlsx")
spreadsheet = workbook.active

# Write what you want into a specific cell
spreadsheet["C1"] = "Manipulating_Data ;)"

# Save the spreadsheet under a new name
workbook.save(filename="hello_world_append.xlsx")

If you open the new Excel file, you will see the additional text Manipulating_Data ;) in cell C1, next to the existing values.

Writing Excel Spreadsheets With openpyxl 


A spreadsheet is a file that stores data in rows and columns. We can store numerical data and perform computations on it using
formulas.

So, let’s begin with a simple spreadsheet and understand what each line means.

Creating our first simple Spreadsheet


 1 from openpyxl import Workbook
 2
 3 filename = "first_program.xlsx"
 4
 5 workbook = Workbook()
 6 spreadsheet = workbook.active
 7
 8 spreadsheet["A1"] = "first"
 9 spreadsheet["B1"] = "program!"
10
11 workbook.save(filename=filename)

Line 5: To make a spreadsheet, we first have to create an empty workbook.

Lines 8 and 9: We can add data to specific cells. In this example, the two values “first” and “program!” are written to
cells A1 and B1.

Line 11: This line saves the workbook to a file after all the operations are done.

Basic Spreadsheet Operations 


Before moving on to more advanced code, we need some building blocks: how to add and update values, how to manage rows and columns,
and how to add filters, styles or formulas to a spreadsheet.

We have already seen the following code, which adds a value to a spreadsheet:

>>> spreadsheet["A1"] = "the_value_we_want_to_add"

There is another way to add values to a spreadsheet:

>>> cell = spreadsheet["A1"]
>>> cell
<Cell 'Sheet'.A1>
>>> cell.value
'first'
>>> cell.value = "second"
>>> cell.value
'second'

Here we first get the cell object for A1. Its value is “first” because the first program assigned “first” to that cell. We then update the cell by assigning "second" to cell.value and read the updated value back.

Remember that these changes are written to the file only when you call workbook.save() .

If the cell does not exist when you add a value, openpyxl creates it:

>>> # Before, our spreadsheet has only 1 row

>>> print_rows()

('first', 'program!')

>>> # Try adding a value to row 10

>>> spreadsheet["B10"] = "test"

>>> print_rows()

('first', 'program!')

(None, None) 

(None, None) 

(None, None) 

(None, None) 

(None, None) 

(None, None) 

(None, None) 

(None, None) 

(None, 'test')
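The print_rows() helper used in these examples is not defined in the article. A plausible definition, assuming it simply prints every row of the active sheet as a tuple of values, might look like this:

```python
from openpyxl import Workbook

workbook = Workbook()
spreadsheet = workbook.active
spreadsheet["A1"] = "first"
spreadsheet["B1"] = "program!"


def print_rows():
    """Print each row of the sheet as a tuple of cell values (assumed helper)."""
    for row in spreadsheet.iter_rows(values_only=True):
        print(row)


# Writing to B10 extends the sheet to 10 rows; the rows in between are empty.
spreadsheet["B10"] = "test"
print_rows()
```

The helper reads the module-level spreadsheet variable, which matches the way it is called without arguments in the examples.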

Managing Rows and Columns in Spreadsheet 


Inserting and deleting rows and columns is one of the most basic spreadsheet operations. In openpyxl, we can perform
these operations by calling the following methods with the appropriate arguments:

.insert_rows()

.delete_rows()

.insert_cols()

.delete_cols()

We can pass two arguments to these methods:

1. idx
2. amount

idx is the index position at which rows or columns are inserted or deleted, and amount is the number of rows or columns to insert or delete (1 by default).

Building on the first simple program, let’s see how to use these methods:

>>> print_rows()

('first', 'program!')

>>> # Insert a column at the first position before column 1 ("A")

>>> spreadsheet.insert_cols(idx=1)

>>> print_rows()

(None, 'first', 'program!')

>>> # Insert 5 columns in  between column 2 ("B") and 3 ("C")

>>> spreadsheet.insert_cols(idx=3,amount=5)

>>> print_rows()

(None, 'first', None, None, None, None, None, 'program!')

>>> # Delete the created columns

>>> spreadsheet.delete_cols(idx=3,amount=5)

>>> spreadsheet.delete_cols(idx=1)

>>> print_rows()

('first', 'program!')

>>> # Insert a new row in the beginning

>>> spreadsheet.insert_rows(idx=1)

>>> print_rows()

(None, None)

('first', 'program!')

>>> # Insert 3 new rows in the beginning 

>>> spreadsheet.insert_rows(idx=1,amount=3)

>>> print_rows()

(None, None)

(None, None) 

(None, None) 

(None, None) 

('first', 'program!')

>>> # Delete the first 4 rows 

>>> spreadsheet.delete_rows(idx=1,amount=4) 

>>> print_rows() 

('first', 'program!')

Managing Sheets
We have seen the following recurring piece of code in the previous examples. It selects the default (active) sheet of the workbook:

spreadsheet = workbook.active

However, if the workbook has multiple sheets, you can select a specific one by its title:

>>> # Let's say you have two sheets: "Products" and "Company Sales"

>>> workbook.sheetnames

['Products', 'Company Sales']

>>> # You can select a sheet using its title

>>> Products_Sheet = workbook["Products"]

>>> Sales_sheet = workbook["Company Sales"]

To change the title of a sheet, assign a new value to its title attribute:

>>> workbook.sheetnames

['Products', 'Company Sales']

>>> Products_Sheet = workbook["Products"]

>>> Products_Sheet.title = "New Products"

>>> workbook.sheetnames

['New Products', 'Company Sales']

We can also create and delete sheets with the help of two methods, .create_sheet() and .remove() :

>>> #To print the available sheet names

>>> workbook.sheetnames 

['Products', 'Company Sales']

>>> #To create a new Sheet named "Operations"

>>> Operations_Sheet = workbook.create_sheet("Operations")

>>> #To print the updated available sheet names

>>> workbook.sheetnames

['Products', 'Company Sales', 'Operations']

>>> # To choose where the new sheet is created, pass an index (here the "HR" sheet is created at the first position; index 0 represents the first position)

>>> HR_Sheet = workbook.create_sheet("HR",0)

>>> #To again  print the updated available sheet names

>>> workbook.sheetnames

['HR', 'Products', 'Company Sales', 'Operations']

>>> # To remove a sheet, pass the sheet object you want to delete to the method .remove()

>>> workbook.remove(Operations_Sheet)

>>> workbook.sheetnames

['HR', 'Products', 'Company Sales']

>>> # To delete HR_Sheet

>>> workbook.remove(HR_Sheet)

>>> workbook.sheetnames

['Products', 'Company Sales']

Adding Filters to the Spreadsheet 


We can use openpyxl to add filters and sorts to our spreadsheet. However, when we open the spreadsheet, the data will not be rearranged according to these sorts and filters; they only take effect once they are applied in an application such as Excel.

When you’re programmatically creating a spreadsheet that will be sent to and used by someone else, it is still good practice to add filters so that
people can use them afterward.

In the code below there is a simple example which shows how to add a simple filter to your spreadsheet: 

>>> # Check the used spreadsheet space using the attribute "dimensions"

>>> spreadsheet.dimensions

'A1:O100'

>>> spreadsheet.auto_filter.ref="A1:O100"

>>> workbook.save(filename="watch_sample_with_filters.xlsx")
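Beyond setting the filter range, openpyxl can also record which values to keep and how to sort. The sketch below uses a small made-up sheet rather than the watch dataset. As noted above, openpyxl only stores these settings in the file; the rows are not hidden or reordered until the filter is applied in Excel.

```python
from openpyxl import Workbook

workbook = Workbook()
spreadsheet = workbook.active

# Made-up sample data for illustration.
for row in [["Product", "Sales"], ["A", 30], ["B", 45], ["C", 10]]:
    spreadsheet.append(row)

# Cover the whole used range with the filter.
spreadsheet.auto_filter.ref = spreadsheet.dimensions

# Keep only products "A" and "B" in the first filter column (index 0).
spreadsheet.auto_filter.add_filter_column(0, ["A", "B"])

# Record a sort on the Sales values.
spreadsheet.auto_filter.add_sort_condition("B2:B4")

print(spreadsheet.auto_filter.ref)  # A1:B4
```

Calling workbook.save() afterwards writes these filter settings into the file.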

Adding Formulas to the Spreadsheet 


Formulas are one of the most commonly used and powerful features of spreadsheets. With openpyxl, adding a formula is
as simple as editing a specific cell’s value.

You can list the formula names that openpyxl recognizes:

>>> from openpyxl.utils import FORMULAE

>>> FORMULAE

frozenset({'ABS',

           'AMORLINC',

           'ACCRINT', 

           'ACOS', 

           'ACCRINTM', 

           'ACOSH', 

            ...,       

           'AND',

           'YEARFRAC', 

           'YIELDDISC', 

           'AMORDEGRC', 

           'YIELDMAT', 

           'YIELD', 

           'ZTEST'})

Let’s add some formulas to our spreadsheet. 

Let’s check the average star rating of  the 99 reviews within the spreadsheet: 

>>> # Star rating is in column "H" 

>>> spreadsheet["P2"] = "=AVERAGE(H2:H100)"

>>> workbook.save(filename = "first_example.xlsx")

Now, if you open the spreadsheet and go to cell P2, you will see the value 4.18181818181818.

Similarly, we can use this approach to add any formula we need to our spreadsheet. For example, to count the number of
helpful reviews:

>>> # The helpful votes are counted in column "I"

>>> spreadsheet["P3"] = '=COUNTIF(I2:I100, ">0")'

>>> workbook.save(filename = "first_example.xlsx")
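Note that openpyxl stores a formula as text and does not evaluate it; the result only appears once a spreadsheet application calculates it. The following sketch, which uses a tiny made-up sheet and an in-memory buffer instead of a file on disk, shows what reading such a cell back looks like:

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

workbook = Workbook()
spreadsheet = workbook.active

# Three made-up star ratings in A1:A3 and a formula in B1.
for stars in (5, 4, 3):
    spreadsheet.append([stars])
spreadsheet["B1"] = "=AVERAGE(A1:A3)"

# Save to memory so the example leaves no file behind.
buffer = BytesIO()
workbook.save(buffer)
data = buffer.getvalue()

# Default loading returns the formula text.
formula_text = load_workbook(BytesIO(data)).active["B1"].value

# data_only=True returns the value cached by Excel; a file written by
# openpyxl alone has no cached value yet, so this is None.
cached_value = load_workbook(BytesIO(data), data_only=True).active["B1"].value

print(formula_text)
print(cached_value)
```

After the file has been opened and saved in Excel, loading it with data_only=True would return the calculated result instead.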

Adding Styles to the Spreadsheet


Styling is not something we use in everyday code, but for the sake of completeness, let us look at it as well.

With openpyxl, we get multiple styling options, including fonts, colors, borders, and so on.

Let’s have a look at an example:

>>> # Import necessary style classes
>>> from openpyxl.styles import Font, Alignment, Border, Side

>>> # Create a few styles
>>> Bold_Font = Font(bold=True)
>>> # "FF0000" is the hex code for red; the colors.RED constant may not be available in recent openpyxl releases
>>> Big_Red_Text = Font(color="FF0000", size=20)
>>> Center_Aligned_Text = Alignment(horizontal="center")
>>> Double_Border_Side = Side(border_style="double")
>>> Square_Border = Border(top=Double_Border_Side,
...                        right=Double_Border_Side,
...                        bottom=Double_Border_Side,
...                        left=Double_Border_Side)

>>> # Style some cells!
>>> spreadsheet["A2"].font = Bold_Font
>>> spreadsheet["A3"].font = Big_Red_Text
>>> spreadsheet["A4"].alignment = Center_Aligned_Text
>>> spreadsheet["A5"].border = Square_Border
>>> workbook.save(filename="sample_styles.xlsx")

If you want to apply multiple styles to one or several cells in your spreadsheet, you can use the NamedStyle class:

>>> from openpyxl.styles import NamedStyle

>>> # Let's create a style template for the header row
>>> header = NamedStyle(name="header")
>>> header.font = Font(bold=True)
>>> header.border = Border(bottom=Side(border_style="thin"))
>>> header.alignment = Alignment(horizontal="center", vertical="center")

>>> # Now let's apply this to all first row (header) cells
>>> header_row = spreadsheet[1]
>>> for cell in header_row:
...     cell.style = header

>>> workbook.save(filename="sample_styles.xlsx")

Adding Charts to our Spreadsheet


Charts are a good way to visualize and understand large amounts of data quickly and easily. There are many chart types, such as bar charts, pie charts, line charts, and
so on.

Let us start by creating a new workbook with some data:

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
spreadsheet = workbook.active

# Let's create some sample sales data
rows = [
    ["Product", "Online", "Store"],
    [1, 30, 45],
    [2, 40, 30],
    [3, 40, 25],
    [4, 50, 30],
    [5, 30, 25],
    [6, 25, 35],
    [7, 20, 40],
]

for row in rows:
    spreadsheet.append(row)

Now let us create a bar chart that will show the total number of sales per product: 

chart = BarChart()
data = Reference(worksheet=spreadsheet,
                 min_row=1,
                 max_row=8,
                 min_col=2,
                 max_col=3)

chart.add_data(data, titles_from_data=True)
spreadsheet.add_chart(chart, "E2")

workbook.save("chart.xlsx")

You can also create a line chart by simply making some changes to the data:

import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
     "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

# Fill the 12 month columns of each product row with random sales figures
for row in sheet.iter_rows(min_row=2,
                           max_row=4,
                           min_col=2,
                           max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)
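The listing above only prepares the data and stops before the chart itself is created. A self-contained sketch of the remaining steps could look like the following; the anchor cell "C6" and the output file name are arbitrary choices, and the file is written to the system's temporary directory so that the example does not clutter the working directory.

```python
import os
import random
import tempfile

from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

months = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Header row, then one row of random monthly sales per product.
sheet.append([""] + months)
for product_id in (1, 2, 3):
    sheet.append([product_id] + [random.randrange(5, 100) for _ in months])

chart = LineChart()

# Each product row becomes one line; column 1 holds the series titles.
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=4,
                 min_col=1,
                 max_col=13)
chart.add_data(data, from_rows=True, titles_from_data=True)
sheet.add_chart(chart, "C6")

output_path = os.path.join(tempfile.gettempdir(), "line_chart.xlsx")
workbook.save(output_path)
```

Passing from_rows=True tells openpyxl to treat each row, rather than each column, as one data series, which is why the chart ends up with one line per product.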

There are numerous types of charts and various types of customizations you can apply to your spreadsheet to make it more attractive.

Convert Python Classes to Excel Spreadsheet


Let us now learn how to convert Python classes into an Excel spreadsheet.

Assume we have a database and use an object-relational mapper (ORM) to map its tables to Python classes. We then want to export those objects to a
spreadsheet:

from dataclasses import dataclass
from typing import List

@dataclass
class Sale:
    id: str
    quantity: int

@dataclass
class Product:
    id: str
    name: str
    sales: List[Sale]

Now, let’s generate some random data. Assuming the above classes are stored in a db_classes.py file:

import random

# Ignore these for now. You'll use them in a sec ;)
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

from db_classes import Product, Sale

products_range = []

# Let's create 5 products
for idx in range(1, 6):
    sales = []

    # Create 5 months of sales
    for month in range(1, 6):
        # Sale requires an id; this "product-month" format is only illustrative
        sale = Sale(id=f"{idx}-{month}",
                    quantity=random.randrange(5, 100))
        sales.append(sale)

    product = Product(id=str(idx),
                      name="Product %s" % idx,
                      sales=sales)
    products_range.append(product)

Running this code gives us 5 products, each with 5 months of sales and a random quantity for each month.

Now, to convert this into a spreadsheet, we need to iterate over the data:

workbook = Workbook()
spreadsheet = workbook.active

# Append column names first
spreadsheet.append(["Product ID", "Product Name", "Month 1",
                    "Month 2", "Month 3", "Month 4", "Month 5"])

# Append the data
for product in products_range:
    data = [product.id, product.name]
    for sale in product.sales:
        data.append(sale.quantity)
    spreadsheet.append(data)

This will create a spreadsheet with some data coming from your database. 

How to work with pandas to handle Spreadsheets?

Subscribe to our newsletter. Enter Your E-mail SUBSCRIBE 


https://www.knowledgehut.com/blog/programming/how-to-work-with-excel-using-python 16/21
7/10/2021 How to Work with Excel Spreadsheets using Python?

We have learned to work with Excel in Python because Excel is one of the most popular tools for handling data, so finding a way to work with it programmatically is critical. pandas is a great
tool for this: it provides methods to read data from an Excel file into a DataFrame and to export a DataFrame back to Excel.

To use it, first install the pandas package:

$ pip install pandas 

Then, let’s create a simple DataFrame: 

import pandas as pd

data = {
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
}
dataframe = pd.DataFrame(data)

Now we have some data. To convert it from a DataFrame into a worksheet, we generally use dataframe_to_rows() from openpyxl.utils.dataframe:

from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

workbook = Workbook()
spreadsheet = workbook.active

# Append the header row, then one row per DataFrame record
for row in dataframe_to_rows(dataframe, index=False, header=True):
    spreadsheet.append(row)

workbook.save("pandas_spreadsheet.xlsx")
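The conversion also works in the opposite direction. As a minimal sketch, assuming the same two-product data as above, the worksheet values can be read back into a DataFrame by treating the first row as the header. The names sheet and round_trip are illustrative only, and nothing is written to disk here.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame({
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
})

# DataFrame -> worksheet (in memory)
workbook = Workbook()
sheet = workbook.active
for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

# Worksheet -> DataFrame: the first row holds the column names
rows = list(sheet.values)
header, body = rows[0], rows[1:]
round_trip = pd.DataFrame(body, columns=list(header))

print(round_trip)
```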

To go the other way and read data from an Excel file into a pandas DataFrame object, use the read_excel() method (reading the legacy .xls format requires the xlrd package):

excel_file = "movies.xls"

movies = pd.read_excel(excel_file)

We can also use the ExcelFile class to work with multiple sheets from the same Excel file:

xlsx = pd.ExcelFile(excel_file)

movies_sheets = []

for sheet in xlsx.sheet_names:
    movies_sheets.append(xlsx.parse(sheet))

# Combine all sheets once the loop has finished
movies = pd.concat(movies_sheets)
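An alternative to looping over ExcelFile.sheet_names is to pass sheet_name=None to read_excel(), which returns a dictionary mapping each sheet name to its DataFrame. The sketch below first writes a small two-sheet workbook to a temporary directory so that it is runnable on its own. The file name, sheet names, and movie rows are made up for illustration, and writing .xlsx files assumes openpyxl is installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data: one DataFrame per sheet
sheets_to_write = {
    "1990s": pd.DataFrame({"Title": ["Movie A", "Movie B"], "Year": [1994, 1999]}),
    "2000s": pd.DataFrame({"Title": ["Movie C"], "Year": [2004]}),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "movies_demo.xlsx"

    # Write each DataFrame to its own sheet
    with pd.ExcelWriter(path) as writer:
        for sheet_name, frame in sheets_to_write.items():
            frame.to_excel(writer, sheet_name=sheet_name, index=False)

    # sheet_name=None loads every sheet into a dict of DataFrames
    sheets = pd.read_excel(path, sheet_name=None)

# Stack all sheets into a single DataFrame with a fresh 0..n-1 index
movies = pd.concat(list(sheets.values()), ignore_index=True)
print(movies)
```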

Indexes and columns allow you to access data from your DataFrame easily:


>>> df.columns 

Index(['marketplace', 'customer_id', 'review_id', 'product_id',
       'product_parent', 'product_title', 'product_category', 'star_rating',
       'helpful_votes', 'total_votes', 'vine', 'verified_purchase',
       'review_headline', 'review_body', 'review_date'],
      dtype='object')

>>> # Get first 10 reviews' star rating 

>>> df["star_rating"][:10]

R3O9SGZBVQBV76    5
RKH8BNC3L5DLF     5
R2HLE8WKZSU3NL    2
R31U3UH5AZ42LL    5
R2SV659OUJ945Y    4
RA51CP8TR5A2L     5
RB2Q7DLDN6TH6     5
R2RHFJV0UYBK3Y    1
R2Z6JOQ94LFHEP    5
RX27XIIWY5JPB     4
Name: star_rating, dtype: int64

>>> # Grab review with id "R2EQL1V1L6E0C9", using the index

>>> df.loc["R2EQL1V1L6E0C9"]

marketplace               US
customer_id         15305006
review_id     R2EQL1V1L6E0C9
product_id        B004LURNO6
product_parent     892860326
review_headline   Five Stars
review_body          Love it
review_date       2015-08-31
Name: R2EQL1V1L6E0C9, dtype: object
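The session above relies on the review ID being the DataFrame's index. A minimal sketch of the same access pattern on a tiny, made-up DataFrame follows. The IDs R1 to R3 and the review text are hypothetical, and .iloc is used to make the positional slice explicit.

```python
import pandas as pd

# Hypothetical miniature version of the reviews data
reviews = pd.DataFrame({
    "review_id": ["R1", "R2", "R3"],
    "star_rating": [5, 2, 4],
    "review_headline": ["Five Stars", "Not great", "Good value"],
})

# Use the review ID as the index so rows can be looked up by label
reviews = reviews.set_index("review_id")

# Select one column, then take the first two entries by position
first_two_ratings = reviews["star_rating"].iloc[:2]

# Look up a single row by its index label
one_review = reviews.loc["R2"]

print(first_two_ratings)
print(one_review)
```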

Summary 
In this article we have covered: 

How to extract information from spreadsheets
How to create spreadsheets in different ways
How to customize a spreadsheet by adding filters, styles, or charts and so on
How to use pandas to work with spreadsheets

Now you are familiar with the main ways of working with spreadsheets in Python. If you want more information on this topic,
you can rely on the official documentation (https://openpyxl.readthedocs.io/en/stable/index.html) of openpyxl. To learn more
Python tips and tricks, check out our Python tutorial (https://www.knowledgehut.com/tutorials/python-tutorial). To gain mastery
over Python coding, join our Python certification course (https://www.knowledgehut.com/programming/python-programming-certification-training).

Priyankur Sarkar

Data Science Enthusiast

Priyankur Sarkar loves to play with data and get insightful results out of it, then turn those insights into business growth. He is an electronics engineer
with versatile experience as an individual contributor and team lead, and has actively worked towards building Machine Learning capabilities for organizations.


