Using Python To Read and Save Your Outlook Emails! - by Alex Thines - Python in Plain English
Photo by Brett Jordan on Unsplash
Use Case:
Downloading emails from Outlook and storing them so that you can trigger
other processes or securely back up emails for auditing purposes.
Getting started:
Python3
Pip3
Setup:
Install Python along with PyAutoGUI and pywin32. This specific use case and
program can only be used on Windows, as we are accessing Outlook via the
Windows Component Object Model (COM). This article assumes you’ve
read my previous article here and would like to expand upon that idea (or
you were curious what emailReader.py was).
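Before going further, it can help to confirm that the machine meets these requirements. The following is a minimal sketch and not part of the original programs: the helper names `missing_modules` and `is_windows` are hypothetical, and the check only uses the standard library to report which of the packages imported later cannot be found.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names from `names` that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def is_windows():
    """True when running on Windows, where the Outlook COM interface is available."""
    return sys.platform == "win32"


# pywin32 provides the win32com package; keyboard is imported by the click code.
missing = missing_modules(["pyautogui", "win32com", "keyboard"])
if not is_windows():
    print("This program needs Windows to talk to Outlook.")
if missing:
    print("Missing modules:", ", ".join(missing))
```

If anything is listed as missing, install it with pip before continuing.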
Part 1. PyAutoGUI:
The first portion of this is getting your clicking set up. This is going to be
extremely important, as Windows' COM does not easily allow you to interact with it
via a scheduled task. Using baseBot.py found here, set up a series of
clicks to
1. Open a terminal
To save time and to reuse perfectly good code, I will use the code at the end
of the PyAutoGui article as a starting point for this code.
import pyautogui
import logging
import keyboard
import time
import argparse
import sys

# Set up logging
logging.basicConfig(level=logging.INFO)

def get_arg():
    """ Takes nothing
    Purpose: Gets arguments from command line
    Returns: Argument's values
    """
    parser = argparse.ArgumentParser()
    # Information
    parser.add_argument("-d", "--debug", dest="debug", action="store_true", help="Turn o...")  # help text is cut off in the source
    # Functionality
    parser.add_argument("-f", "--find", dest="find", action="store_true", help="Turn on ...")  # help text is cut off in the source
    options = parser.parse_args()
    if options.debug:
        # basicConfig() was already called above, so a second call would be
        # ignored; change the level on the root logger instead
        logging.getLogger().setLevel(logging.DEBUG)
        global DEBUG
        DEBUG = True
    else:
        logging.getLogger().setLevel(logging.INFO)
    return options

def finder():
    """ Takes nothing
    Purpose: Finds the mouse position and color
    Returns: Nothing
    """
    # Hold 'q' to quit, press 'c' to capture the pixel under the mouse
    while not keyboard.is_pressed('q'):
        if keyboard.is_pressed('c'):
            x, y = pyautogui.position()
            r, g, b = pyautogui.pixel(x, y)
            # Assumed: log the values so they can be copied into the click
            # code (no output line appears in the source)
            logging.info("Position: (%s, %s) Color: (%s, %s, %s)", x, y, r, g, b)

def typeWriter(text):
    """ Takes text
    Purpose: Types out the text
    Returns: Nothing
    """
    if text == "ENTER":
        pyautogui.press('enter')
    else:
        pyautogui.typewrite(text)
        pyautogui.press('enter')

def clicker(x, y):
    """ Takes x and y coordinates
    Purpose: Clicks the location
    Returns: Nothing
    """
    pyautogui.click(x, y)

def main():
    options = get_arg()
    logging.info("Starting program")
    if options.find:
        finder()
        sys.exit(1)

if __name__ == "__main__":
    main()
The above code is going to be the program that we call via a scheduled task.
While I am going to separate the code so that I can keep things more
organized, there is nothing wrong with adding an if statement to the above
code and making another argument call the win32com functionality we are
about to code.
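As a rough illustration of that single-file alternative, the sketch below adds one more argument and dispatches on it. The `-r/--read` flag, the `read_emails` placeholder and the `dispatch` helper are hypothetical names chosen for this example; they are not part of the original program.

```python
import argparse


def build_parser():
    """Build a parser with the two original flags plus a hypothetical --read flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--debug", dest="debug", action="store_true")
    parser.add_argument("-f", "--find", dest="find", action="store_true")
    # Hypothetical extra flag that would trigger the win32com functionality.
    parser.add_argument("-r", "--read", dest="read", action="store_true")
    return parser


def read_emails():
    """Placeholder standing in for the win32com code from Part 2."""
    return "reading emails"


def dispatch(options):
    """Pick the action to run based on the parsed options."""
    if options.read:
        return read_emails()
    return "running clicks"


# Parse an explicit argument list so the example does not depend on sys.argv.
options = build_parser().parse_args(["--read"])
print(dispatch(options))
```

Keeping the two programs separate, as done in this article, avoids importing win32com in the process that only performs clicks.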
Part 2. win32com:
This is where the more interesting part happens (and where we actually get
to exploit bypassing win32com’s restrictions).
import win32com.client
import win32com
import re

EMAILADDRESS = ""
IGNOREDSENDER = [""]
raw_emails = {}

def main():
    accounts, outlook = init()
    emails = getEmails(accounts, outlook)
    print(emails)

if __name__ == "__main__":
    main()
EMAILADDRESS is the address of the account you want to read. This is helpful if
there are multiple accounts on your system but you only want to scrape one of
them. IGNOREDSENDER is great if you have an automated system that sends emails
that you do not want this program to interact with. Monitor.txt is extremely
useful if you only care about emails with a certain subject line.
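To make the role of these settings concrete, here is a small sketch of how such filters could be applied to a message. The function names, and the assumption that Monitor.txt holds one subject keyword per line, are mine for illustration only; the article does not show how the original program reads that file.

```python
def load_monitored_subjects(path):
    """Read one subject keyword per line (assumed layout of Monitor.txt)."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def should_process(sender, subject, ignored_senders, monitored_subjects):
    """Return True when a message passes both the sender and the subject filter."""
    # Empty entries such as the default [""] are skipped.
    if sender.lower() in (s.lower() for s in ignored_senders if s):
        return False
    if not monitored_subjects:
        # No keywords configured: accept every subject.
        return True
    return any(keyword.lower() in subject.lower() for keyword in monitored_subjects)


# Made-up example values.
print(should_process("bot@example.com", "Nightly build", ["bot@example.com"], []))
print(should_process("boss@example.com", "Weekly report", [""], ["report"]))
```

Both comparisons are case-insensitive, in the same spirit as the `.lower()` comparison the program uses for EMAILADDRESS.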
The next portion is the initialization portion. This is extremely short for this
use case but can potentially get more complex if you use different COM
systems.
def init():
    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    accounts = win32com.client.Dispatch("Outlook.Application").Session.Accounts
    # main() unpacks the result as "accounts, outlook"
    return accounts, outlook
The “final” part is the actual meat and potatoes of this program. Since there
are many portions that do a lot of things and it can get extremely
confusing, I have made a lot of comments that explain what the line does
under it if I do not think it is completely obvious.
def getEmails(accounts, outlook):
    """Takes accounts and outlook
    Purpose: Gets emails from outlook
    Returns: The raw_emails dictionary
    """
    # Counter used for counting the amount of emails per subject.
    count = 0
    # This is used if there is more than one account in Outlook.
    for account in accounts:
        if str(account).lower() == EMAILADDRESS.lower():
            print("Account: {}".format(account))
            folders = outlook.Folders(account.DeliveryStore.DisplayName)
            specific_folder = folders.Folders
            # NOTE: the lines between here and the print statements are missing
            # from the source. The two loops below are an assumed reconstruction
            # that walks every mail item in every top-level folder. The code
            # that filled raw_emails is also missing, so it stays empty here.
            for folder in specific_folder:
                for single in folder.Items:
                    # Skip anything that is not a mail item (olMail == 43)
                    if single.Class != 43:
                        continue
                    # Prints subject
                    print("Subject: {}".format(single.Subject))
                    # Prints when the email was received
                    print("Received Time: {}".format(single.ReceivedTime))
                    # Prints if the email is unread or not
                    print("Unread: {}".format(single.Unread))
    # Converts the dictionary to a JSON-style string by swapping the single and double quotes
    tmpEmails = raw_emails
    tmpEmails = str(tmpEmails).replace('"', '|')
    tmpEmails = str(tmpEmails).replace("'", '"')
    tmpEmails = str(tmpEmails).replace("|", "'")
    # Uncomment if you want it saved as a JSON file.
    # with open("emails.json", "w") as f:
    #     f.write(tmpEmails)
    print("Finished Successfully")
    return raw_emails
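The quote swapping above can break when an email itself contains quotes or a `|` character. As a hedged alternative, the standard library's `json` module can serialize the dictionary directly. The sketch below assumes the dictionary holds only strings, numbers, booleans, lists and nested dictionaries, with anything else (such as a received time) converted via `str`. The helper name `emails_to_json` and the sample data are made up for illustration.

```python
import json
from datetime import datetime


def emails_to_json(emails):
    """Serialize the emails dictionary; unsupported values fall back to str()."""
    return json.dumps(emails, default=str, indent=2)


# Made-up sample shaped like a dictionary of emails keyed by subject.
sample = {
    "It's \"urgent\" | really": {
        "received": datetime(2023, 1, 2, 3, 4, 5),
        "unread": True,
    }
}
text = emails_to_json(sample)
# Loading the text back shows that the quotes and the pipe survived intact.
restored = json.loads(text)
print(restored)
```

The resulting string can be written to emails.json in the same way as `tmpEmails`.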
Limitations:
If you need real-time updates, you will need another system that you are not
actively using, since the clicks take over the mouse and keyboard while the
program runs. A dedicated Windows server or a Windows virtual machine works
for this.
Another interesting limitation is saving emails that contain characters
outside basic Latin. This sidetracked my original program for roughly 2 hours
while I was trying to sanitize a kanji email signature…
In the end, I opted to have the entire email encoded as UTF-8. In theory, you
can spend time calculating where the non-Latin characters start and where
they end. After that, you can encode just those characters and have the rest
of the email saved in its native format.
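A minimal sketch of the "encode everything as UTF-8" approach: pass `encoding="utf-8"` when opening the file so that non-Latin characters are written as they are. The helper names, the file location and the sample signature are made up for illustration and are not taken from the original program.

```python
import json
import os
import tempfile


def save_emails(emails, path):
    """Write the emails as UTF-8 JSON, keeping non-Latin characters readable."""
    with open(path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False stores the characters themselves instead of \uXXXX escapes.
        json.dump(emails, handle, ensure_ascii=False)


def load_emails(path):
    """Read the emails back using the same encoding."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Made-up sample with a kanji signature, saved to a temporary directory.
sample = {"signature": "山田太郎"}
path = os.path.join(tempfile.mkdtemp(), "emails.json")
save_emails(sample, path)
print(load_emails(path))
```

Stating the encoding explicitly matters on Windows, where `open()` otherwise falls back to the system's locale encoding.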
If you have gotten this far, I have to commend you! When I originally talked
to my team and family about this idea, I was instantly questioned about it,
since reading an email isn’t that hard. I always had to explain to them the
potential use cases for something like this.
Do you want to upload every email to Jira so that you can have the
information in a ticket for other analysts/testers/managers to see the entire
chain?
Do you want to parse every email into a database so that you have a more
in-depth knowledge base for a chatbot to respond with?
Do you want to send an email through a Jira mail server, or would you
prefer to send an email from a bot as if it were yourself?
These reasons (along with a few more client specific reasons) are why I
spent way too much time trying to figure out how to do everything listed in
the two programs above. Below I have included the code in their final forms.
If you get this far, thank you so much for taking the time to read this article
on “Using Python to read and save your Outlook emails!”
Code:
winBypass.py
import pyautogui
import logging
import keyboard
import time
import argparse
import sys

# Set up logging
logging.basicConfig(level=logging.INFO)

def get_arg():
    """ Takes nothing
    Purpose: Gets arguments from command line
    Returns: Argument's values
    """
    parser = argparse.ArgumentParser()
    # Information
    parser.add_argument("-d", "--debug", dest="debug", action="store_true", help="Turn o...")  # help text is cut off in the source
    # Functionality
    parser.add_argument("-f", "--find", dest="find", action="store_true", help="Turn on ...")  # help text is cut off in the source
    options = parser.parse_args()
    if options.debug:
        # basicConfig() was already called above, so a second call would be
        # ignored; change the level on the root logger instead
        logging.getLogger().setLevel(logging.DEBUG)
        global DEBUG
        DEBUG = True
    else:
        logging.getLogger().setLevel(logging.INFO)
    return options

def finder():
    """ Takes nothing
    Purpose: Finds the mouse position and color
    Returns: Nothing
    """
    # Hold 'q' to quit, press 'c' to capture the pixel under the mouse
    while not keyboard.is_pressed('q'):
        if keyboard.is_pressed('c'):
            x, y = pyautogui.position()
            r, g, b = pyautogui.pixel(x, y)
            # Assumed: log the values so they can be copied into the click
            # code (no output line appears in the source)
            logging.info("Position: (%s, %s) Color: (%s, %s, %s)", x, y, r, g, b)

def typeWriter(text):
    """ Takes text
    Purpose: Types out the text
    Returns: Nothing
    """
    if text == "ENTER":
        pyautogui.press('enter')
    else:
        pyautogui.typewrite(text)
        pyautogui.press('enter')

def clicker(x, y):
    """ Takes x and y coordinates
    Purpose: Clicks the location
    Returns: Nothing
    """
    pyautogui.click(x, y)

def main():
    options = get_arg()
    logging.info("Starting program")
    if options.find:
        finder()
        sys.exit(1)
    # Only click when the pixel at the target has the expected color.
    # The source line is cut off after the first channel check, so only
    # that check is shown here.
    if pyautogui.pixel(1496, 1434)[0] in range(40, 60):
        clicker(1496, 1434) # Clicks the location
        time.sleep(3) # Wait for the program to load
        typeWriter("cd testLocation") # Change to a different location
        typeWriter("ENTER") # Press Enter
        typeWriter("python emailReader.py") # Run emailReader.py program
        typeWriter("ENTER") # Press Enter
        time.sleep(60) # Wait 60 seconds
        typeWriter("exit") # Close terminal
        typeWriter("ENTER") # Press Enter
    else:
        logging.fatal("Color is not in range!") # Let the user know that the color isn't in range

if __name__ == "__main__":
    main()
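The color check in main() guards against clicking when the screen does not look as expected. One way to keep such a check readable is a small helper that tests every channel against its own range. This is only a sketch: the helper name `color_in_range` is hypothetical, and only the red range of 40 to 60 comes from the program above, while the green and blue ranges are placeholders that accept any value.

```python
def color_in_range(rgb, ranges):
    """Return True when every channel of `rgb` falls inside its (low, high) range.

    `high` is exclusive, matching range(low, high) in the program above.
    """
    return all(low <= channel < high for channel, (low, high) in zip(rgb, ranges))


# Only the red range (40, 60) comes from the program; the others are placeholders.
expected = [(40, 60), (0, 256), (0, 256)]
print(color_in_range((45, 120, 200), expected))
print(color_in_range((70, 120, 200), expected))
```

In winBypass.py the first argument would be the value returned by `pyautogui.pixel(x, y)`.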
emailReader.py
import win32com.client
import win32com
import re

EMAILADDRESS = ""
IGNOREDSENDER = [""]
raw_emails = {}

def init():
    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    accounts = win32com.client.Dispatch("Outlook.Application").Session.Accounts
    # main() unpacks the result as "accounts, outlook"
    return accounts, outlook

def getEmails(accounts, outlook):
    """Takes accounts and outlook
    Purpose: Gets emails from outlook
    Returns: The raw_emails dictionary
    """
    # This is used if there is more than one account in Outlook.
    for account in accounts:
        if str(account).lower() == EMAILADDRESS.lower():
            print("Account: {}".format(account))
            folders = outlook.Folders(account.DeliveryStore.DisplayName)
            specific_folder = folders.Folders
            # NOTE: the lines between init() and the print statements are
            # incomplete in the source. The function header and the loops are
            # an assumed reconstruction that walks every mail item in every
            # top-level folder. The code that filled raw_emails is also
            # missing, so it stays empty here.
            for folder in specific_folder:
                for single in folder.Items:
                    # Skip anything that is not a mail item (olMail == 43)
                    if single.Class != 43:
                        continue
                    # Prints subject
                    print("Subject: {}".format(single.Subject))
                    # Prints when the email was received
                    print("Received Time: {}".format(single.ReceivedTime))
                    # Prints if the email is unread or not
                    print("Unread: {}".format(single.Unread))
    # Converts the dictionary to a JSON-style string by swapping the single and double quotes
    tmpEmails = raw_emails
    tmpEmails = str(tmpEmails).replace('"', '|')
    tmpEmails = str(tmpEmails).replace("'", '"')
    tmpEmails = str(tmpEmails).replace("|", "'")
    # Uncomment if you want it saved as a JSON file.
    # with open("emails.json", "w") as f:
    #     f.write(tmpEmails)
    print("Finished Successfully")
    return raw_emails

def main():
    accounts, outlook = init()
    emails = getEmails(accounts, outlook)
    print(emails)

if __name__ == "__main__":
    main()