
Python 07 Files


Python : File I/O

reading and writing files

CT108-3-1 Programming With Python (PYP)
Topic & Structure of the lesson

• Reading and writing files


– Creating a text file
– Opening files in different modes
– Writing data into a file
– Reading from a file
– Searching through a file

CT010-3-1 Fundamentals of Software Development Python Files I/O


Learning outcomes

• At the end of this lecture you should be able to:
– Develop a problem-based strategy for creating and applying programmed solutions
– Create, edit, compile, run, debug and test programs using an appropriate development environment



Key terms you must be able to use
• If you have mastered this topic, you should
be able to use the following terms correctly
in your assignments and exams:

– open



File Processing

• A text file can be thought of as a sequence of lines

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: source@collab.sakaiproject.org
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

http://www.py4inf.com/code/mbox-short.txt
Opening a File

• Before we read the contents of a file we must tell Python which file we are going to work with and what we will be doing with the file
• This is done with the open() function
• open() returns a “file handle” - a variable used to perform operations on the file
• Kind of like “File -> Open” in a Word Processor



Using open()

handle = open(filename, mode)


• returns a handle, used to manipulate the file
• filename is a string
• mode is optional and should be 'r' if we are planning to read the file and 'w' if we are going to write to the file. If it is omitted, the file is opened for reading.

fhand = open('mbox.txt', 'r')

http://docs.python.org/lib/built-in-funcs.html
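
A minimal sketch of the two modes named above. The file name demo.txt is hypothetical (it is not from the slides) and is created in a temporary directory so that nothing on disk is overwritten. Mode 'w' creates the file, or empties it if it already exists; mode 'r' opens it for reading.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory (not from the slides).
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# 'w' opens for writing: the file is created, or emptied if it already exists.
fout = open(path, 'w')
fout.write('first line\n')
fout.write('second line\n')
fout.close()  # closing makes sure the data is written out

# 'r' opens for reading; it is also the default when mode is omitted.
fhand = open(path, 'r')
text = fhand.read()
fhand.close()

print(text)
```

Note that write() does not add a newline by itself, which is why each string above ends in \n.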
What is a Handle?

>>> fhand = open('mbox.txt')
>>> print(fhand)
<_io.TextIOWrapper name='mbox.txt' mode='r' encoding='UTF-8'>

(The encoding shown depends on your platform.)



When Files are Missing

>>> fhand = open('stuff.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'



The newline Character

• We use a special character to indicate when a line ends called the "newline"
• We represent it as \n in strings
• Newline is still one character - not two

>>> stuff = 'Hello\nWorld!'
>>> stuff
'Hello\nWorld!'
>>> print(stuff)
Hello
World!
>>> stuff = 'X\nY'
>>> print(stuff)
X
Y
>>> len(stuff)
3





File Processing

• A text file has newlines at the end of each line

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n
Return-Path: <postmaster@collab.sakaiproject.org>\n
Date: Sat, 5 Jan 2008 09:12:18 -0500\n
To: source@collab.sakaiproject.org\n
From: stephen.marquard@uct.ac.za\n
Subject: [sakai] svn commit: r39772 - content/branches/\n
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772\n



File Handle as a Sequence
• A file handle open for read can be treated as a sequence of strings where each line in the file is a string in the sequence
• We can use the for statement to iterate through a sequence
• Remember - a sequence is an ordered set

xfile = open('mbox.txt')
for cheese in xfile:
    print(cheese)
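
The same loop can also be sketched with a with statement, which closes the file automatically when the block ends. The slides do not use with; it is shown here only as an optional variation, and sample.txt is a hypothetical file created just so the example runs on its own.

```python
import os
import tempfile

# Build a small hypothetical sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as fout:
    fout.write('alpha\nbeta\ngamma\n')

lines = []
with open(path) as xfile:      # the file is closed automatically after the block
    for cheese in xfile:       # each iteration yields one line, newline included
        lines.append(cheese)

print(lines)
```

Each string in the list still ends with its newline character, because iterating over a file handle does not remove it.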



Counting Lines in a File

• Open a file read-only
• Use a for loop to read each line
• Count the lines and print out the number of lines

fhand = open('mbox.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)

Output:
Line Count: 132045
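
As a sketch, the counting loop can be wrapped in a function so it can be reused for any file. The function name count_lines and the three-line file tiny.txt are hypothetical; tiny.txt stands in for mbox.txt so the example runs without the course data.

```python
import os
import tempfile

def count_lines(filename):
    """Return the number of lines in a text file."""
    count = 0
    with open(filename) as fhand:
        for line in fhand:
            count = count + 1
    return count

# Hypothetical three-line file standing in for mbox.txt.
path = os.path.join(tempfile.mkdtemp(), 'tiny.txt')
with open(path, 'w') as fout:
    fout.write('one\ntwo\nthree\n')

print('Line Count:', count_lines(path))
```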



Searching Through a File
• We can put an if statement in our for loop to only print lines that meet some criteria

fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:'):
        print(line)



OOPS!

What are all these blank lines doing here?

From: stephen.marquard@uct.ac.za

From: louis@media.berkeley.edu

From: zqian@umich.edu

From: rjlowe@iupui.edu

...



OOPS!

What are all these blank lines doing here?

• Each line from the file has a newline at the end.
• The print() function adds a newline to each line.

From: stephen.marquard@uct.ac.za\n
\n
From: louis@media.berkeley.edu\n
\n
From: zqian@umich.edu\n
\n
From: rjlowe@iupui.edu\n
\n
...
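
To see where the extra blank lines come from, the sketch below prints the same two lines twice: once with the default behaviour, and once with end='' so that print() does not add its own newline. The two sample lines are copied from the output above. As an assumption for the sake of a self-contained example, io.StringIO stands in for an open file, and the output is collected in memory (via the file= argument) so it can be inspected with repr().

```python
import io

# Two lines taken from the output above; io.StringIO acts like an open file.
sample = 'From: stephen.marquard@uct.ac.za\nFrom: louis@media.berkeley.edu\n'

# Default print(): the line's own '\n' plus print's '\n' produces a blank line.
doubled = io.StringIO()
for line in io.StringIO(sample):
    print(line, file=doubled)

# end='' tells print() not to append anything, so only the file's '\n' remains.
single = io.StringIO()
for line in io.StringIO(sample):
    print(line, end='', file=single)

print(repr(doubled.getvalue()))
print(repr(single.getvalue()))
```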



Searching Through a File (fixed)

• We can strip the whitespace from the right hand side of the string using the string method rstrip()
• The newline is considered "white space" and is stripped

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:'):
        print(line)

From: stephen.marquard@uct.ac.za
From: louis@media.berkeley.edu
From: zqian@umich.edu
From: rjlowe@iupui.edu
....
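
A short sketch of what rstrip() removes. The sample line is adapted from the output above, with two trailing spaces added on purpose (an assumption made for illustration) to show the difference between stripping all trailing whitespace and stripping only the newline.

```python
line = 'From: zqian@umich.edu  \n'

# rstrip() with no argument removes all trailing whitespace:
# spaces, tabs and newlines.
stripped_all = line.rstrip()

# rstrip('\n') removes only trailing newline characters and keeps the spaces.
stripped_newline = line.rstrip('\n')

print(repr(stripped_all))
print(repr(stripped_newline))
```

Strings are immutable, so rstrip() returns a new string; this is why the slide assigns the result back with line = line.rstrip().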



Skipping with continue

• We can conveniently skip a line by using the continue statement

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From:'):
        continue
    print(line)



Using in to select lines

• We can look for a string anywhere in a line as our selection criteria

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if '@uct.ac.za' not in line:
        continue
    print(line)

From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
X-Authentication-Warning: set sender to stephen.marquard@uct.ac.za using -f
From: stephen.marquard@uct.ac.za
Author: stephen.marquard@uct.ac.za
From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008
X-Authentication-Warning: set sender to david.horwitz@uct.ac.za using -f
...
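
The selection above can be sketched as a small reusable function. The name lines_containing is hypothetical, and a list of strings stands in for the open file (any sequence of lines works in a for loop); the sample lines are taken from the slides.

```python
def lines_containing(lines, text):
    """Return the right-stripped lines that contain text anywhere."""
    matches = []
    for line in lines:
        line = line.rstrip()
        if text not in line:   # skip lines that do not match
            continue
        matches.append(line)
    return matches

# A list of strings stands in for the file handle in this sketch.
sample = [
    'From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n',
    'From: louis@media.berkeley.edu\n',
    'From: zqian@umich.edu\n',
    'From david.horwitz@uct.ac.za Fri Jan 4 07:02:32 2008\n',
]

for match in lines_containing(sample, '@uct.ac.za'):
    print(match)
```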



Prompt for File Name

fname = input('Enter the file name: ')
fhand = open(fname)
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: mbox-short.txt
There were 27 subject lines in mbox-short.txt



Bad File Names
fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    exit()
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: na na boo boo
File cannot be opened: na na boo boo
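
A bare except: catches every kind of error, including ones unrelated to the file. As a hedged variation (not the version used in the slides), the sketch below catches only OSError, the family of errors that open() raises for missing or unreadable files. The function name count_subject_lines is hypothetical, and returning None instead of calling exit() is a design choice made here so the function can be reused.

```python
def count_subject_lines(fname):
    """Return the number of lines starting with 'Subject:',
    or None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:            # includes FileNotFoundError and PermissionError
        return None
    count = 0
    with fhand:                # close the file when the loop is done
        for line in fhand:
            if line.startswith('Subject:'):
                count = count + 1
    return count

fname = 'na na boo boo'        # the bad file name from the example above
result = count_subject_lines(fname)
if result is None:
    print('File cannot be opened:', fname)
else:
    print('There were', result, 'subject lines in', fname)
```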



Summary

• Secondary storage
• Opening a file - file handle
• File structure - newline character
• Reading a file line-by-line with a for loop
• Searching for lines
• Reading file names
• Dealing with bad files
