Lesson 5 File Handling Text Files
Data files
The data files are the files that store data pertaining to a specific application, for
later use. The data files can be stored in 2 ways:
Text files
Binary files
Text files
A text file consists of a sequence of lines. A line is a sequence of characters
(ASCII or Unicode) stored on permanent storage media. In Python 3 every string
is a Unicode string, so the ‘u’ prefix that Python 2 needed is optional. In a
text file each line is terminated by a special character known as End of Line
(EOL). By default this EOL character is the newline character (‘\n’). At the
lowest level, a text file is a collection of bytes. Text files are stored in
human readable form and can be created using any text editor.
Some internal translations take place when this EOL character is read or
written.
We use text files to store character data. For example, test.txt
Binary files
A binary file is just a file that contains information in the same format in which
the information is held in the memory. That is the file content is raw, with no
translation or encoding. In a binary file there is no delimiter for a line. Also no
translations occur in binary files. As a result, binary files are faster and easier
for a program to read and write than are text files.
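To make the distinction concrete, the sketch below writes one line of text and then reads the same file twice, once in text mode and once in binary mode. The file name demo.txt and the temporary folder are assumptions made only for this illustration; newline="\n" is passed while writing so that no EOL translation takes place and the stored bytes are the same on every platform.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# newline="\n" switches off EOL translation while writing.
f = open(path, "w", encoding="utf-8", newline="\n")
f.write("Café\n")
f.close()

# Text mode: the bytes are decoded and returned as a string.
f = open(path, "r", encoding="utf-8")
as_text = f.read()
f.close()

# Binary mode: the raw bytes are returned without any translation.
f = open(path, "rb")
as_bytes = f.read()
f.close()

print(repr(as_text))   # 'Café\n'
print(as_bytes)        # b'Caf\xc3\xa9\n'
```

The accented letter is one character in the string but occupies two bytes in the file, which is why the text view and the byte view differ in length.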
What is the difference between a text file and a binary file?
1 Text file: stores data in the form of alphabets, digits and other special symbols
  by storing their ASCII values, in a human readable format.
  Binary file: contains information in the same format in which it is held in memory.
2 Text file: each line of text is terminated with a special character known as EOL
  (End of Line).
  Binary file: no delimiters are used for a line.
3 Text file: some internal translations take place when this EOL character is
  read or written.
  Binary file: no translation occurs in binary files.
4 Text file: slower than binary files.
  Binary file: faster and easier for a program to read and write than text files.
Creating a copy of the file
Updating data in a file etc.
In order to work with a file from within a Python program, we need to open it in
a specific mode as per the file manipulation task we have to perform. Python
provides built in functions to perform the above mentioned tasks. To handle
data files in Python we need to have a file variable or file object or file handle.
File Handle
To work with a file (read or write), the first thing we have to do is to open the file.
This is done using the built-in function open(). Opening the file communicates with
the operating system, which knows where the data for each file is stored. When
we open the file, we are asking the operating system to find the file by name
and make sure the file exists.
open() function takes the name of the file as the first argument. The second
argument indicates the mode of accessing the file.
< file variable > / < file object or handle > = open(file_name,access_mode)
Here the first argument specifies the name of the file to be opened and the
second argument describes the mode, i.e., how the file will be used throughout
the program. The mode is an optional parameter; the default is the read mode
(‘r’).
Modes for opening a file
The object of file type is returned using which we will manipulate the file in our
program. When we work with files, a buffer (area in memory where data is
temporarily stored before being written to the file) is automatically associated
with the file when we open it. While writing the content to the file, first it goes
to buffer, and once the buffer is full, data is written to the file. Also when the
file is closed, any unsaved data is transferred to the file. flush() function is used
to force transfer of data to the file.
If the opening is successful, the OS returns us a file handle. The file handle is
not the actual data contained in the file. Instead it is the handle that we can use
to read the data. We are given a handle if the requested file exists and we
have proper permission to read the file.
Example
>>>f=open("test.txt")
OR
f=open("test.txt","r")
The above statement opens the file “test.txt” in read mode and attaches it to the
file object f.
When we open a file it opens in read mode by default. Here f is the file object
(also called file variable or file handle) returned by open(). The file object is a
reference to a file on the disk; through it the file is made available for a number
of different tasks. A file object is a stream of data that can be read either
character by character (byte by byte in binary mode) or line by line.
When only a file name is given, Python looks for the file in the current working
directory. To open a file stored in another folder, give its complete path:
f=open("e:\\main\\test.txt","r")
Python will open the file test.txt in the folder main on drive E and attach it to the object f.
When the file does not exist open will fail with a trace back and we will not get
a handle to access the contents of the file.
f=open("test1.txt")
FileNotFoundError: [Errno 2] No such file or directory: 'test1.txt'
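A program does not have to stop with a traceback when the file is missing. The sketch below is one possible way to handle the situation with try/except; the temporary folder and the variable names are assumptions made for this illustration only.

```python
import os
import tempfile

# Hypothetical path to a file that does not exist (the folder is new and empty).
missing = os.path.join(tempfile.mkdtemp(), "test1.txt")

try:
    f = open(missing)              # read mode: the file must already exist
except FileNotFoundError as err:
    handle_obtained = False
    print("Could not open the file:", err.strerror)
else:
    handle_obtained = True         # reached only when open() succeeded
    f.close()

print("Handle obtained:", handle_obtained)
```

The else branch runs only when no exception was raised, so close() is never called on a handle that was not created.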
f=open("test.txt","w")
>>> print(f)
<_io.TextIOWrapper name='test.txt' mode='w' encoding='cp1252'>
Note
f=open("c:\\temp\\data.txt","r")
f=open(r"c:\temp\data.txt","r")
The prefix r in front of the string makes it a raw string, which means that
the backslash has no special meaning inside it. In that case a single
backslash can be used in pathnames.
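The effect of the r prefix can be checked without opening any file. The short sketch below only compares strings; the variable names are chosen for this illustration.

```python
escaped = "c:\\temp\\data.txt"    # every backslash doubled
raw = r"c:\temp\data.txt"         # raw string, single backslashes

print(escaped == raw)             # True: both spell the same path

# Without the r prefix, "\t" is read as a single tab character.
broken = "c:\temp"
print("\t" in broken)                  # True
print(len(broken), len(r"c:\temp"))    # 6 7
```

The last line shows why an unescaped path can silently point to the wrong file: the two characters \t have collapsed into one tab.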
The file modes will decide how the file will be accessed. The second parameter
of the open() function corresponds to a mode which is either read (‘r’) ,
write(‘w’) or append(‘a’). The following are the file modes supported by
Python:
1. Mode – r - Text file
Description
Opens a file for reading only. The file pointer is placed at the beginning
of the file. This is the default mode. If the specified file does not exist, it
will generate FileNotFoundError.
2. Mode – rb - Binary file
Description
Opens a file for reading only in binary format. The file pointer is placed
at the beginning of the file. If the specified file does not exist, it will
generate FileNotFoundError.
3. Mode – w - Text file
Description
Opens a file for writing only. Overwrites the file if the file exists. If the
file does not exist, it creates a new file for writing.
4. Mode – wb - Binary file
Description
Opens a file for writing only in binary format. Overwrites the file if the
file exists. If the file does not exist, it creates a new file for writing.
5. Mode – r+ - Text file
Description
Opens a file for both reading and writing(+). The file pointer is at the
beginning of the file. File must exist, else an exception is raised.
6. Mode – rb+ - Binary file
Description
Opens a file for both reading and writing in binary format. The file
pointer is at the beginning of the file. File must exist, else an exception
is raised.
7. Mode – w+ - Text file
Description
Opens a file for both reading and writing(+). Overwrites the file if the file
exists. If the file does not exist, creates a new file for reading and
writing.
8. Mode – wb+ - Binary file
Description
Opens a file for both reading and writing(+) in binary format. Overwrites
the file if the file exists. If the file does not exist, creates a new file for
reading and writing.
Note
rb+ does not create the file from scratch
wb+ does create the file from scratch
9. Mode – a - Text file
Description
Opens a file for appending. File is in the write only mode. The file pointer
is at the end of the file if the file exists and the data is retained and new
data is appended to the file. If the file does not exist, it creates a new file
for writing.
10. Mode – ab - Binary file
Description
Opens a file for appending in binary format. File is in the write only
mode. The file pointer is at the end of the file if the file exists and the data
is retained and new data is appended to the file. That is the file is in the
append mode. If the file does not exist, it creates a new file for writing.
11. Mode – a+ - Text file
Description
Opens a file for appending and reading. The file pointer is at the end of
the file if the file exists and the data is retained and new data is appended
to the file. That is the file is in the append mode. If the file does not exist,
it creates a new file for writing and reading.
12. Mode – ab+ - Binary file
Description
Opens a file for appending and reading in binary format. The file pointer
is at the end of the file if the file exists and the data is retained and new
data is appended to the file. That is the file is in the append mode. If the
file does not exist, it creates a new file for writing and reading.
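The behaviour of a few of these modes can be observed with a small experiment. The sketch below is only an illustration; the file name modes.txt, the temporary folder and the helper function contents() are assumptions and not part of the lesson's programs.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")   # hypothetical file name

def contents(p):
    """Return the whole text of file p (hypothetical helper)."""
    f = open(p)              # default mode 'r'
    data = f.read()
    f.close()
    return data

# 'r+' does not create a file: opening a missing file fails.
try:
    open(path, "r+")
    rplus_failed = False
except FileNotFoundError:
    rplus_failed = True

# 'w+' creates the file.
f = open(path, "w+")
f.write("first\n")
f.close()

# 'a' keeps the old data and writes at the end.
f = open(path, "a")
f.write("second\n")
f.close()
after_append = contents(path)

# 'w' erases the old data as soon as the file is opened.
f = open(path, "w")
f.close()
after_w = contents(path)

print(rplus_failed)          # True
print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_w))         # ''
```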
Examples
>>>file=open("test.txt","r+")
will open the file test.txt for reading and writing. Here the name of the
file (by which it exists on secondary storage media) is given as a
constant; we can use a variable also. If the file is not in the folder
where we are working now, we have to give the complete path. It is not
mandatory for the file name to have an extension. By convention, text
files use the extension .txt and binary files .dat.
f=open("test.txt","w")
>>> print(f)
<_io.TextIOWrapper name='test.txt' mode='w' encoding='cp1252'>
In Python 2, the function file() could also be used for creation of a file,
with the same syntax and usage as open(). It was removed in Python 3, so
open() should be used.
An open file is closed by calling the close method of its file object. Files
are automatically closed at the end of the program. But it’s a good
programming practice to close the files explicitly, because the OS may
not write the data out to the file until it is closed.
Syntax
<file handle>.close()
The close() function breaks the link of the file object and the file on the
disk. After close(), no tasks can be performed on that file through the file
object.
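What "no tasks can be performed" means in practice can be seen by trying to write through a closed file object. The following is a minimal sketch; the file name closed_demo.txt and the temporary folder are assumptions for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "closed_demo.txt")   # hypothetical name

f = open(path, "w")
f.write("some data\n")
f.close()
print(f.closed)                  # True

closed_error = None
try:
    f.write("more data\n")       # the link to the file is already broken
except ValueError as err:
    closed_error = str(err)
    print("Error:", closed_error)
```

Python raises ValueError here, so a write after close() cannot go unnoticed.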
Example
f=open("test.txt")
print("The name of the file is",f.name)
The name of the file is test.txt
f.close()
Various properties of the file object:
Once open() is successful and the file object gets created, we can retrieve
various details related to that file using its associated properties.
Example
f=open("test.txt")
print("The name of the file is ",f.name)
print("The file mode is ",f.mode)
print("Is the file readable ",f.readable())
print("Is the file closed ",f.closed)
f.close()
print("Is the file closed ",f.closed)
Output
The name of the file is test.txt
The file mode is r
Is the file readable True
Is the file closed False
Is the file closed True
Writing to a file
To write character data to the file the following methods are used:
1. write(string)
2. writelines(sequence of lines/sequence)
1. write(string)
write() method takes a string as parameter and writes it to the file. For
storing data with end of line character, we will have to add ‘\n’ character
to the end of string. As the argument to the function has to be string, for
storing numeric value, we have to convert it to a string.
Syntax
<filehandle>.write(string)
Examples
#file creation
f=open("test4.txt","w")
f.write("We are writing\n")
f.write("data to a\n")
f.write("text file \n")
print("Data written to the file successfully\n")
f.close()
Output
In the above program data will be overwritten every time we run the
program.
While writing data to a file we must provide line separator (‘\n’) else the
data will be written in a single line.
#file creation
f=open("test2.txt","w")
f.write("We are writing names to a file\n")
for i in range(5):
    name=input("Enter name")
    f.write(name)
f.close()
Output
Enter nameAneesh
Enter nameAkash
Enter nameAbi
Enter nameAnisa
Enter nameArul
#file creation
f=open("test3.txt","w")
f.write("We are writing names to a file\n")
for i in range(5):
    name=input("Enter name")
    f.write(name)
    f.write('\n')
f.close()
Output
Enter nameshena
Enter nameBora
Enter nameKamala
Enter nameKala
Enter nameMeera
4. Write a program to get roll numbers, names and marks of the students of
a class( get input from the user ) and store these details in a file called
marks.det.
#file creation
n=int(input("Enter total number of students :"))
f=open("marks.det","w")
for i in range(n):
    print("Enter details of the student:",i+1)
    rno=int(input("Enter the roll number:"))
    name=input("Enter name:")
    marks=float(input("Enter marks:"))
    rec=str(rno)+","+name+","+str(marks)+'\n'
    f.write(rec)
f.close()
Output
2. writelines(sequence of lines/sequence)
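The write() method writes one string at a time. It can’t be used for
writing a list, tuple etc. into a file. A sequence of strings (such as a list
or tuple of lines) can be written using the writelines() method.
writelines() does not add newline characters, so each string must already
end with ‘\n’ if it is to form a separate line.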
Syntax
<filehandle>.writelines(sequence of lines/sequence)
Examples
#file creation
f=open("test4.txt","w")
lst=["Computer Science\n","Physics\n","Chemistry\n","Maths\n"]
f.writelines(lst)
f.close()
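Because writelines() writes the strings exactly as given, a list without ‘\n’ characters ends up on a single line. The sketch below shows this and one possible way to add the line endings while writing; the list names, the file name and the temporary folder are assumptions for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "subjects.txt")   # hypothetical name
names = ["Physics", "Chemistry", "Maths"]                 # no '\n' at the end

f = open(path, "w")
f.writelines(names)                            # written as one run of characters
f.write("\n")
f.writelines(name + "\n" for name in names)    # one name per line
f.close()

f = open(path)
lines = f.readlines()
f.close()
print(lines)
# ['PhysicsChemistryMaths\n', 'Physics\n', 'Chemistry\n', 'Maths\n']
```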
#file creation
f=open("test5.txt","w")
f.writelines("We are writing\n")
f.writelines("data to a\n")
f.writelines("text file \n")
print("Data written to the file successfully\n")
f.close()
Output
Note
with statement
We have used the close() method in all the programs to close the file at the end.
If we do not want to close the file explicitly using close(), there is an
alternative: the with statement. The with statement is used together with
open(). It groups the file operation statements within a block and closes the
file automatically when the block ends. Using with ensures that all the
resources allocated to the file object get released once we stop using the
file. Even if an exception occurs inside the block, we are not required to
close the file explicitly.
Syntax
with open(<file name>,<access mode>) as <file handle>:
    <file manipulation statements>
Write a program to create a file with a few lines in it.
#file creation
with open("test6.txt","w") as f:
    f.writelines("We are writing\n")
    f.writelines("data to a\n")
    f.writelines("text file \n")
    print("Data written to the file successfully\n")
print("Is the file closed ",f.closed)
print("Is file closed",f.close())
Output
Data written to the file successfully
Is the file closed True
Is file closed None
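The claim that the file is closed even when an exception occurs can be checked directly. In the sketch below the error is raised on purpose; the RuntimeError, the file name and the temporary folder are assumptions made for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")   # hypothetical name

try:
    with open(path, "w") as f:
        f.write("partial data\n")
        raise RuntimeError("simulated failure")   # hypothetical error
except RuntimeError:
    pass

closed_after_error = f.closed
print("Is the file closed", closed_after_error)   # True

# The data written before the error was saved when the file was closed.
with open(path) as f:
    saved = f.read()
print(repr(saved))                                # 'partial data\n'
```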
Python provides various methods for reading data from a file. We can read
character data from the text file by using the following read methods:
1. read() – To read the entire data from the file. Starts reading from the
cursor up to the end of the file.
2. read(n) – To read ‘n’ characters from the file, starting from the cursor. If
the file holds fewer than ‘n’ characters, it will read until the end of file.
3. readline() – To read only one line from the file. Starts reading from the
cursor up to the end of the current line and returns the line as a string.
4. readlines() – To read all the lines from the file into a list. Starts reading
from the cursor up to the end of the file and returns a list of lines.
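The four methods listed above can be compared on one small file. The sketch below is an illustration only; the file content, the file name and the temporary folder are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")   # hypothetical name
f = open(path, "w")
f.write("one\ntwo\nthree\n")
f.close()

f = open(path)
whole = f.read()              # everything from the cursor to the end
at_end = f.read()             # cursor is at the end, so nothing is left
f.close()

f = open(path)
first_two = f.read(2)         # two characters from the cursor
rest_of_line = f.readline()   # remainder of the current line
remaining = f.readlines()     # all lines that are still unread
f.close()

print(repr(whole))            # 'one\ntwo\nthree\n'
print(repr(at_end))           # ''
print(repr(first_two))        # 'on'
print(repr(rest_of_line))     # 'e\n'
print(remaining)              # ['two\n', 'three\n']
```

Each call continues from where the previous one stopped, which is why readline() returns only the rest of the first line.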
1. read() – read() can be used to read entire string from the file. This
function also returns the string read from the file. At the end of the file,
again an empty string will be returned.
Syntax
fileObject.read()
Examples
f=open("test.txt")
data=f.read()
print(data)
OR
f=open("test.txt")
print(f.read())
Output
We are writing
data to a
text file
f=open("test1.txt",'w')
f.write("We are writing")
f.write("data to a")
f.write("text file")
print("Data written to the file successfully\n")
f.close()
f=open("test1.txt")
print(f.read())
f.close()
Output
3. Write a program to read the contents of the file marks.det and display the
contents on the screen.
#file creation
n=int(input("Enter total number of students :"))
f=open("marks.det","w")
for i in range(n):
    print("Enter details of the student:",i+1)
    rno=int(input("Enter the roll number:"))
    name=input("Enter name:")
    marks=float(input("Enter marks:"))
    rec=str(rno)+","+name+","+str(marks)+'\n'
    f.write(rec)
f.close()
#file reading
f=open("marks.det")
print(f.read())
f.close()
Input given:
Output
1,Sasi,80.0
2,Anesh,45.0
2. read(n) – read(size) reads at most size characters from the file, starting
from the cursor, and returns them as a string. If size is omitted, the
entire content is read. At the end of the file, an empty string is
returned. Here size specifies the number of characters to be read in text
mode (the number of bytes in binary mode). One must take care of the
memory size before reading the entire content from the file.
Syntax
fileObject.read([size])
Examples
f=open("test1.txt",'w')
f.write("We are writing")
f.write("data to a")
f.write("text file")
print("Data written to the file successfully\n")
f.close()
f=open("test1.txt")
print(f.read(10))
f.close()
Output
We are wri
f=open("test1.txt")
print(f.read(20))
f.close()
Output
We are writingdata t
3. readline() – This function will return a line read, as a string from the file.
First call to the function will return the first line, second call to the second
line and so on. The file object keeps track from where reading / writing of
data should happen. For readline() line is terminated by ‘\n’. The newline
character is also read from the file and post fixed in the string. When end
of file is reached, readline() will return an empty string. A line is
considered till a newline character (EOL) is encountered in the data file.
Syntax
fileObject.readline()
Since this function returns a string, the returned value should be stored in
a variable or printed, as shown:
x=file.readline()
OR
print(file.readline())
For reading an entire file using readline(), we will have to call it in a
loop. Reading one line at a time is a memory efficient, simple and fast way
of reading the file. A single call reads just the first line:
f=open("marks.det")
print(f.readline())
f.close()
Output
1,Sasi,80.0
2. Write a program to read all the lines from the file marks.det
#file creation
n=int(input("Enter total number of students :"))
f=open("marks.det","w")
for i in range(n):
    print("Enter details of the student:",i+1)
    rno=int(input("Enter the roll number:"))
    name=input("Enter name:")
    marks=float(input("Enter marks:"))
    rec=str(rno)+","+name+","+str(marks)+'\n'
    f.write(rec)
f.close()
f=open("marks.det")
print(f.read())
f.close()
Output
1,Akash,80.0
2,Gina,45.0
3,Gigi,56.0
f=open("marks.det")
line1=f.readline()
print(line1,end='')
line2=f.readline()
print(line2,end='')
line3=f.readline()
print(line3,end='')
f.close()
Output
1,Akash,80.0
2,Gina,45.0
3,Gigi,56.0
f=open("marks.det")
line1=f.readline()
print(line1,end='')
line2=f.readline()
print(line2,end='')
line3=f.readline()
print(line3,end='')
line4=f.readline()
print(line4,end='')
f.close()
Output
1,Akash,80.0
2,Gina,45.0
3,Gigi,56.0
The argument end='' in the print() statement ensures that the output
shows the exact contents of the data file, with no extra newline
characters inserted by print().
f=open("marks.det")
line=" "
while line:
    line=f.readline()
    print(line,end='')
f.close()
To read line 2
f=open("marks.det")
line1=f.readline()
line2=f.readline()
print(line2)
f.close()
Note
The readline() function reads the leading and trailing spaces (if any)
along with trailing newline character(‘\n’) also while reading the line. We
can remove these leading and trailing white spaces or tabs or newlines
using strip() without any argument. strip() without any argument removes
leading and trailing white spaces.
Syntax
Example
f=open("marks.det")
for line in f:
    print(line)
f.close()
Output
1,Akash,80.0
2,Gina,45.0
3,Gigi,56.0
The output produced is the same as the previous one. The reason is that
when we iterate over a file handle using a for loop, then the for loop’s
variable moves through the file, line by line where a line of a file is
considered as a sequence of characters up to and including a special
character called a newline character(‘\n’). So the for loop’s variable starts
with first line and with each iteration, it moves to the next line. As the for
loop iterates through each line of the file the loop variable will contain the
current line of the file as a string of characters.
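Iterating over the file object combines naturally with strip() and split() when each line holds a comma separated record. The sketch below assumes records in the roll number, name, marks layout used for marks.det; the sample data is written by the program itself into a temporary folder so that it can run without keyboard input.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "marks.det")   # temporary copy for the demo
f = open(path, "w")
f.write("1,Akash,80.0\n2,Gina,45.0\n3,Gigi,56.0\n")    # sample records
f.close()

names = []
total = 0.0
f = open(path)
for line in f:
    # strip() removes the trailing newline, split(",") separates the fields
    rno, name, marks = line.strip().split(",")
    names.append(name)
    total += float(marks)
    print(rno, name, marks)
f.close()

print("Number of students:", len(names))
print("Average marks:", total / len(names))
```

Every field comes back as a string, so marks has to be converted with float() before it can be added.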
4. readlines() – can be used to read the entire content of the file. This
method will return a list of strings, one per line, each ending with ‘\n’
(the last line may lack it).
Syntax
fileObject.readlines()
Examples
1. Write a program to read all the lines from the file test.txt
#file creation
f=open("test.txt","w")
f.write("This is \n")
f.write("a test program\n")
f.write("in Python")
f.close()
Output
This is
a test program
in Python
2. Write a program to display the contents from 2nd character into the list.
Output
Th
['is is \n', 'a test program\n', 'in Python']
Remaining data
Explanation
Note
While reading from or writing to the file, the cursor starts from the
beginning of the file, except in append mode, where writing starts at the
end of the file.
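The program for the second exercise is not reproduced in these notes. The sketch below is one possible version that is consistent with the first two output lines shown above, assuming the first two characters were read with read(2) and the rest with readlines(); the temporary folder is an assumption so that the sketch does not overwrite an existing test.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")   # temporary copy for the demo

#file creation
f = open(path, "w")
f.write("This is \n")
f.write("a test program\n")
f.write("in Python")
f.close()

#file reading
f = open(path)
first = f.read(2)        # the cursor moves past the first two characters
rest = f.readlines()     # reading continues from the 3rd character
f.close()

print(first)             # Th
print(rest)              # ['is is \n', 'a test program\n', 'in Python']
```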
Appending to a file
When the file is opened in write mode ‘w’, Python overwrites an existing file or
creates a non-existing file. This means that for an existing file with the same
name, the earlier data gets lost. If we have to add data to an existing file,
the file has to be opened in append mode. Append means ‘to add to’, that is:
‘Open for writing and, if the file exists, append data to the end of the file.’
Therefore, in Python, ‘a’ mode is used to open an output file in append mode. This
means:
If the file already exists, it will not be erased. If the file does not exist, it
will be created.
When data is written to the file, it will be written at the end of the file’s
current contents.
Syntax
<fileObject>=open(<filename>,'a')
Here, ‘a’ stands for append mode, which allows us to add data to the
existing data instead of overwriting the file.
Example
f=open("test1.txt","a")
1. Write a program to add data to an existing file.
f=open("test6.txt",'a')
f.write("This is\n")
f.write("test program\n")
f.write("in Python\n")
f.close()
Output
We are writing
data to a
text file
We are writing
data to a
text file
This is
test program
in Python
2. Write a program to enter the name and age and store the data in a text
file.
name=input("Enter name:")
age=int(input("Enter age:"))
f=open("user_details",'a')
f.write(name)
f.write(str(age))
f.close()
f=open("user_details")
print(f.read())
f.close()
Output
Akash44Akil34
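In the output above the records run together because neither a separator nor a newline is written. The sketch below is one possible variant that stores each record on its own line; the comma separator is an assumption borrowed from the marks.det layout, and a fixed list of sample records stands in for input() so that the sketch runs unattended.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "user_details")   # temporary copy for the demo
records = [("Akash", 44), ("Akil", 34)]                   # sample data instead of input()

for name, age in records:
    f = open(path, "a")                   # append mode keeps the earlier records
    f.write(name + "," + str(age) + "\n")
    f.close()

f = open(path)
data = f.read()
f.close()
print(data, end="")
# Akash,44
# Akil,34
```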
We use file objects to work with data files. Similarly, input and output from
standard I/O devices is also performed using standard I/O stream objects. Since
we use high level functions such as eval(), input() and print() for performing
input/output through keyboard and monitor, it is not required to use the I/O
stream objects explicitly.
The standard streams available in Python are:
Standard input stream (sys.stdin)
Standard output stream (sys.stdout)
Standard error stream (sys.stderr)
These are nothing but file objects, which get automatically connected to our
program’s standard devices when we start Python. We need to import the sys
module in order to work with the standard I/O streams. The methods available for
I/O operations on them are read() for reading from the keyboard and write()
for writing data on the console, i.e., the monitor.
Example
import sys
f=open(r"Book.txt")
line1=f.readline()
line2=f.readline()
line3=f.readline()
sys.stdout.write(line1)
sys.stdout.write(line2)
sys.stdout.write(line3)
sys.stderr.write("\nNo errors occurred\n")
f.close()
Output
We are writing
data to a
text file
No errors occurred
When we write onto a file using any of the write functions, Python holds
everything to be written in a buffer and pushes it to the actual file storage
device at a later time. If we want to force Python to write the contents of the
buffer onto storage, the flush() function can be used.
Python automatically flushes the file buffers when closing them, i.e., this
function is implicitly called by the close() function. But we may want to flush()
the data before closing a file.
Syntax
<fileobject>.flush()
Example
f=open("out.log","w+")
f.write("The output is\n")
f.write("Python\n")
f.flush()
s='ok'
f.write(s)
f.write('\n')
f.write("Over\n")
f.flush()
f.close()
Therefore the flush() function forces the writing of data on disc still pending in
output buffer.
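The effect of flush() can be observed by checking the size of the file on disk before and after the call. This is a sketch under the assumption of default buffering; the size before the flush is usually 0 for such a small write, but that depends on the buffer and is therefore only printed. The file name and the temporary folder are assumptions for this illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.log")   # temporary copy for the demo

f = open(path, "w")
f.write("hello")
size_before = os.path.getsize(path)   # data is normally still in the buffer
f.flush()                             # force the buffer out to the file
size_after = os.path.getsize(path)    # all 5 characters are now in the file
f.close()

print("Size before flush():", size_before)
print("Size after flush():", size_after)
```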
The read() and readline() functions read data from the file and return it in string
form and the readlines() function returns the entire file content in a list where
each line is one item of the list. All these read functions also read the leading
and trailing white spaces, tabs and newline characters.
To remove these use the following functions:
The strip() function removes the given character from both ends.
The rstrip() function removes the given character from trailing end(right
end)
The lstrip() function removes the given character from leading end(left
end)
f=open("file.txt")
line=f.read()
line=line.rstrip('\n') # will remove the newline character
print(line)
f.close()
Output
A file pointer tells the current position in the file where writing or reading will
take place.
Example
#file creation
f=open("file.txt","w")
f.write("This is")
f.write("test program")
f.write("in Python")
f.close()
#file reading
f=open("file.txt")
print(f.read())
f.close()
Output
Output
Th
is i
stest prog
ramin Py
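The program that produced the last output is not reproduced in these notes. The sketch below is one possible version, assuming successive read() calls of 2, 4, 10 and 8 characters on the file created above; it shows how the file pointer advances after every read and how seek(0) moves it back to the beginning. The temporary folder is an assumption so that the sketch does not overwrite an existing file.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.txt")   # temporary copy for the demo

#file creation
f = open(path, "w")
f.write("This is")
f.write("test program")
f.write("in Python")
f.close()

#file reading: every read() starts where the previous one stopped
f = open(path)
parts = [f.read(2), f.read(4), f.read(10), f.read(8)]
for part in parts:
    print(part)
f.seek(0)              # move the file pointer back to the beginning
again = f.read(4)
print(again)           # This
f.close()
```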
Standard file streams
The keyboard is the basic device used for giving input. The output is displayed
on the monitor. Therefore the standard input device is the keyboard and the
standard output device is the monitor. If any error occurs, it is also displayed
on the monitor, so the monitor is the standard error device as well.
These standard streams are nothing but file objects, which automatically get
connected to our program’s standard device(s) when we start Python. In order
to work with the standard I/O streams, we need to import the sys module. The
methods available for I/O operations on them are read() for reading from the
keyboard and write() for writing data on the console, i.e., the monitor.
Examples
import sys
f=open(r"test1.txt")
l1=f.readline()
l2=f.readline()
l3=f.readline()
sys.stdout.write(l1)
sys.stdout.write(l2)
sys.stdout.write(l3)
Output
Hello User
You are working with Python files
Simple syntax of the language
import sys
f=open(r"test1.txt")
l1=f.readline()
l2=f.readline()
l3=f.readline()
sys.stdout.write(l1)
sys.stdout.write(l2)
sys.stdout.write(l3)
sys.stderr.write("No errors occurred")
Output
Hello User
You are working with Python files
Simple syntax of the language
No errors occurred
The lines containing the method stdout.write() shall write the respective lines
from the file on device/file associated with sys.stdout, which is the monitor.
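Unlike print(), the write() method of a stream does not add a newline by itself, and it returns the number of characters written. The short sketch below illustrates this without reading any file; the variable name count is chosen for this illustration.

```python
import sys

# write() sends the string as it is; the newline has to be supplied.
count = sys.stdout.write("Hello User\n")
sys.stdout.write("Characters written: " + str(count) + "\n")   # 11

# print() can be directed to the standard error stream as well.
print("No errors occurred", file=sys.stderr)
```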
Files are organized into directories called folders. Every running program has a
“current directory” which is the default directory for most operations. For
example, while opening a file for reading, Python looks for it in the current
directory.
The os (operating system) module provides functions for working with files and
directories. os.getcwd() returns the name of the current directory.
>>> import os
>>> cwd=os.getcwd()
>>> print(cwd)
C:\Users\Udhaya Khumari\AppData\Local\Programs\Python\Python36-32
Files are always stored in the current folder/directory by default. The os module
of Python provides various methods to work with file and folder/directories.
A string like cwd that identifies a file is called a path. A relative path starts from
the current directory, whereas an absolute path starts from the topmost directory
in the file system.
The relative paths are relative to the current working directory denoted as a
dot(.) while its parent directory is denoted with two dots(..).
Example
Drive-letter:\directory[\directory…]
The first back slash refers to the root directory.
E:\SALES\YEARLY\BACKUP
In the drive E, under the root directory ( first \ ), under SALES subdirectory of
root, under YEARLY subdirectory of SALES, there lies BACKUP directory.
E:\ACCOUNTS\HISTORY\CASH.ACT
This is an example for absolute path names as they mention the paths from the
top most level of the directory structure.
With PROJ2 as the current working folder, the path name of TWO.CPP will be
..\CL.DAT ( This is in the parent folder ) which means that in the parent folder
of current folder, there lies a file CL.DAT. ( PROJ2 is the working folder )
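The os.path functions can be used to inspect relative and absolute paths from within a program. The sketch below only builds and converts path strings; no file named CL.DAT needs to exist, and the variable names are chosen for this illustration.

```python
import os

cwd = os.getcwd()                           # current working directory
relative = os.path.join("..", "CL.DAT")     # a file in the parent folder
absolute = os.path.abspath(relative)        # the same file, from the top level

print(cwd)
print(relative, os.path.isabs(relative))    # relative path: isabs() is False
print(absolute, os.path.isabs(absolute))    # absolute path: isabs() is True
```

os.path.join() inserts the separator that is correct for the operating system, so the same program works with \ on Windows and / elsewhere.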