Python File Handling

The document provides an overview of file handling in Python, covering both text and binary files, including their characteristics and access modes. It details methods for opening, writing, reading, and closing files, as well as the use of the Pickle module for object serialization. Additionally, it explains how to manage file offsets using tell() and seek() functions.

Python File Handling

Samir V
Two types of files can be handled in Python: normal text files and binary files
(data stored as raw bytes rather than as readable characters).

Text files: In this type of file, each line of text is terminated with a special
character called EOL (End of Line), which is the newline character ('\n') in
Python by default.
A regular text file can be understood as a sequence of characters consisting of
letters, digits and other special symbols. These files have extensions
like .txt, .py etc.
In some text files the values are separated by a comma (,) or a tab (\t). These
files are called CSV (comma-separated values) or TSV (tab-separated values)
files respectively.

Binary files: In this type of file, there is no terminator for a line, and the data is
stored as raw bytes in a format defined by the application, not as readable text.
Binary files are stored in a computer as a sequence of bytes. Even a single bit
change can corrupt the file and make it unreadable to the supporting
application.
Binary file operations are generally faster than text file operations because
there are no delimiters to handle and no translation between bytes and
characters is required.
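
The difference between the two kinds of access can be seen by reading the same file in text mode and in binary mode. The sketch below is only an illustration; the file name sample.txt and the temporary folder are assumptions, not part of the original slides.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# newline="" switches off newline translation, so the bytes on disk are the
# same on every platform.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write("Hi\nPython\n")

# Text mode ('r') decodes the bytes and returns a str.
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode ('rb') returns the raw bytes without decoding them.
with open(path, "rb") as f:
    as_bytes = f.read()

print(type(as_text), repr(as_text))
print(type(as_bytes), list(as_bytes[:3]))  # numeric values of the first 3 bytes
```

Text mode hands back characters, while binary mode hands back the byte values stored on disk.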
In Python, there are six commonly used access modes for text files, which are:

Read Only ('r'): This mode opens the text file for reading only. It raises
FileNotFoundError (an I/O error) if the file does not exist. This is also the
default mode for opening files.

Read and Write ('r+'): This mode opens the file for both reading and writing. If
the file does not exist, FileNotFoundError is raised.

Write Only ('w'): This mode opens the file for writing only. The data in an existing
file is erased. If the file does not already exist in the folder, a new one gets
created.

Write and Read ('w+'): This mode opens the file for both reading and writing. The
existing content of the file is erased. If the file does not exist, a new one gets
created.

Append Only ('a'): This mode allows the file to be opened for writing. If the file
doesn't yet exist, a new one gets created. The newly written data will be added at
the end, following the previously written data.

Append and Read ('a+'): Using this mode, you can read and write in the file. If the
file doesn't already exist, one gets created. The newly written text will be added at
the end.

Adding 'b' to any of these modes (for example 'rb' or 'wb') opens the file in
binary mode.
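
The behaviour of the modes can be checked with a small experiment. This is a minimal sketch, assuming a scratch file called modes_demo.txt in a temporary folder (both names are made up for the illustration).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file

# 'w' creates the file, or erases it if it already exists.
with open(path, "w") as f:
    f.write("first\n")

# 'a' keeps the existing data and writes at the end.
with open(path, "a") as f:
    f.write("second\n")
with open(path, "r") as f:
    after_append = f.read()

# 'r+' also keeps the data, but starts at the beginning, so a write
# replaces the characters that are already there.
with open(path, "r+") as f:
    f.write("FIRST")
with open(path, "r") as f:
    after_rplus = f.read()

# Opening with 'w' again wipes the previous content.
with open(path, "w") as f:
    f.write("third\n")
with open(path, "r") as f:
    after_rewrite = f.read()

# 'r' on a file that does not exist raises FileNotFoundError.
try:
    open(path + ".missing", "r")
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

print(repr(after_append), repr(after_rplus), repr(after_rewrite), missing_raised)
```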
Working with Text Files

Opening a file

To open a file in Python, we use the open() function. The syntax of open() is
as follows:
file_object= open(file_name, access_mode)

This function returns a file object, also called a file handle, which is stored in
the variable file_object. This file handle is used to transfer data to and from the
file (read and write).
The file handle has certain attributes that tell us basic information about
the file, such as its name (file_object.name), the mode in which it was opened
(file_object.mode) and whether it has been closed (file_object.closed).

If the file does not exist and it is opened in a write or append mode ('w', 'w+',
'a' or 'a+'), the above statement creates a new empty file and assigns it the name
we specify in the statement. In 'r' or 'r+' mode, a missing file raises an error
instead.
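
A short sketch of inspecting a file handle is given below. It assumes a scratch copy of MyFile1.txt inside a temporary folder so that nothing in the current directory is touched; name, mode and closed are standard attributes of Python file objects.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "MyFile1.txt")  # hypothetical location

file1 = open(path, "a")        # 'a' creates the file because it does not exist yet
created = os.path.exists(path)

print(file1.name)              # the path that was passed to open()
print(file1.mode)              # the access mode
print(file1.closed)            # False while the file is open

file1.close()
print(file1.closed)            # True once close() has been called
```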

# To open the file "MyFile1.txt" present in the current directory
file1 = open("MyFile1.txt","a")

# To open a file at a specific location
file2 = open("D:\\Text\\MyFile2.txt","w+")
Notice the double backslashes in the file path "D:\\Text\\MyFile2.txt".
They prevent the backslash from starting an escape sequence. For example, \t
inserts a tab.

If you do not want to double the backslashes, Python provides an alternative:

file2 = open(r"D:\Text\MyFile2.txt","w+")

The path is prefixed with r (with no space before the opening quote). This tells
the interpreter to treat the path as a raw string.
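
The effect of escape sequences on a path can be seen without opening any file. In this sketch the path D:\temp\notes.txt is a made-up example, chosen because it contains both \t and \n.

```python
# A plain string: \t and \n are turned into a tab and a newline.
escaped = "D:\temp\notes.txt"

# Doubled backslashes and a raw string both keep the backslashes.
doubled = "D:\\temp\\notes.txt"
raw = r"D:\temp\notes.txt"

print(repr(escaped))           # shows the tab and newline inside the string
print(doubled == raw)          # the two safe spellings are the same string
print(len(escaped), len(raw))  # the plain string is shorter
```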

Closing a file

Once we are done with the read/write operations on a file, it is good
practice to close the file. Python provides a close() method to do so.
While closing a file, the system frees the memory and other resources
allocated to it. The syntax of close() is:
file_object.close()

Here, file_object is the object that was returned while opening the file.
Python makes sure that any unwritten or unsaved data is flushed
(written) to the file before it is closed. Hence, it is always advised to
close the file once our work is done.

# Opening and Closing a file "MyFile.txt"

file1 = open("MyFile.txt","a")
….
….
file1.close()

WRITING TO A TEXT FILE

For writing to a file, we first need to open it in write or append mode. If we open
an existing file in write mode, the previous data will be erased, and the file
object will be positioned at the beginning of the file.

On the other hand, in append mode, new data will be added at the end of the
previous data as the file object is at the end of the file.

After opening the file, we can use the following methods to write data to the file.
Only string data can be written to a text file.

• write() - for writing a single string

Syntax - file_object.write(str1)

text = "This is new content"

# writing new content to the file
fp = open("write_demo.txt", 'w')
fp.write(text)
fp.close()
If numeric data is to be written to a text file, it needs to be converted
into a string before writing to the file.

myobject=open("myfile.txt",'w')
marks=58
#number 58 is converted to a string using str()
myobject.write(str(marks))
myobject.close()

On execution, write() returns the number of characters written to the file.

file = open("C:\\samir\\online classes\\Python\\mytxt.txt", 'w')
n = file.write(str(3000))
file.close()
print(n)

Output:
4

• writelines() - This method is used to write multiple strings to a file. We need to
pass an iterable object such as a list or a tuple containing strings.

Syntax - file_object.writelines(L) for L = [str1, str2, str3]

file1 = open("Employees.txt", "w")

lst = []
for i in range(3):
    name = input("Enter the name of the employee: ")
    lst.append(name + '\n')

file1.writelines(lst)
file1.close()
print("Data is written into the file.")

Unlike write(), the writelines() method does not return the number of characters
written to the file. It also does not add newline characters, which is why '\n' is
appended to each name above.
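
The following sketch compares writelines() with and without explicit newlines. The employee names and the temporary file are assumptions made for the illustration, so that the example runs without keyboard input.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Employees.txt")  # hypothetical file
names = ["Asha", "Ravi", "Meena"]                          # sample data

# Without '\n' the strings are written one after another on a single line.
with open(path, "w") as f:
    f.writelines(names)
with open(path, "r") as f:
    joined = f.read()

# Any iterable of strings is accepted, including a generator expression.
with open(path, "w") as f:
    result = f.writelines(name + "\n" for name in names)
with open(path, "r") as f:
    separate_lines = f.read()

print(repr(joined))
print(repr(separate_lines))
print(result)   # writelines() returns None, not a character count
```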

The flush() method flushes the internal buffer. This internal buffer is
maintained to speed up file operations.

Whenever a write operation is performed on a file, the contents are first
written to the internal buffer, which is then transferred to the destination
file once the buffer is full. This is done to prevent frequent system calls for
every write operation. Once the file is closed, Python automatically
flushes the buffer, but you may still want to flush the data before closing a
file.

# Open a file
fo = open("foo.txt", "wb")
fo.write(b"sample data")   # the data goes to the internal buffer first
fo.flush()                 # force the buffered data out to the file

# Close opened file
fo.close()

READING FROM A TEXT FILE

We can write a program to read the contents of a file. Before reading a file,
we must make sure that the file is opened in "r", "r+", "w+" or "a+" mode.
There are three ways to read the contents of a file:

The read() method

This method is used to read a specified number of characters from a text file
(or bytes, if the file is opened in binary mode). The syntax of read() method is:
file_object.read(n)

myobject=open("myfile.txt",'r')
print(myobject.read(10))
myobject.close()

If no argument or a negative number is specified in read(), the entire file
content is read. For example,
myobject=open("myfile.txt",'r')
print(myobject.read())
myobject.close()
The readline([n]) method

This method reads one complete line from a file where each line terminates
with a newline (\n) character. It can also be used to read a specified number
(n) of characters from a file, but at most up to the newline
character (\n).
In the following example, the second statement reads the first ten
characters of the first line of the text file and displays them on the screen.

myobject=open("myfile.txt",'r')
print(myobject.readline(10))
myobject.close()

If no argument or a negative number is specified, it reads a complete line and
returns it as a string.

myobject=open("myfile.txt",'r')
print(myobject.readline())
myobject.close()

To read the entire file line by line using readline(), we can use a loop.
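
One way to write such a loop is sketched below. It relies on the fact that readline() returns an empty string only at the end of the file; the sample file and its three lines are assumptions made for the illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "myfile.txt")  # hypothetical file
with open(path, "w") as f:
    f.write("line one\nline two\nline three\n")

lines_seen = []
myobject = open(path, "r")
line = myobject.readline()
while line != "":                          # '' means end of file
    lines_seen.append(line.rstrip("\n"))   # drop the trailing newline
    line = myobject.readline()
myobject.close()

print(lines_seen)
```

Iterating over the file object directly (for line in myobject:) visits the same lines and is the more common idiom.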
The readlines() method
This method reads all the lines and returns them, each with its newline
character, as a list of strings. The following example uses readlines() to read
data from the text file myfile.txt.

myobject=open("myfile.txt", 'r')
print(myobject.readlines())
myobject.close()

Exercise -
Write a program that accepts a conversation between two users and writes it to a
text file. Thereafter, the same program reads the text file and displays it on
the screen.

Opening a file using with clause:

In Python, we can also open a file using the with clause.
The syntax of the with clause is:
with open(file_name, access_mode) as file_object:

The advantage of using the with clause is that any file that is opened using this
clause is closed automatically once the control comes outside the with
clause. In case the user forgets to close the file explicitly or if an exception
occurs, the file is closed automatically.

with open("myfile.txt","r+") as myObject:
    content = myObject.read()

Here, we don't have to close the file explicitly using the close() method.
Python will automatically close the file.
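
The automatic closing can be verified even when an error occurs inside the with block. This is a minimal sketch; the temporary file and the deliberately raised ValueError are assumptions used only to simulate a failure.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "myfile.txt")  # hypothetical file
with open(path, "w") as f:
    f.write("hello")

try:
    with open(path, "r") as myObject:
        content = myObject.read()
        raise ValueError("simulated failure")   # deliberate error
except ValueError:
    pass

print(content)
print(myObject.closed)   # the file was closed although an exception occurred
```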

Working with Binary Files

Python considers everything as an object. So, all data types including list,
tuple, dictionary, etc. are also considered as objects.
During execution of a program, we may need to store the current state of
variables so that we can restore them later. Suppose you are playing a video
game, and after some time, you want to close it. The program should be able to
store the current state of the game, including the current level/stage, your
score, etc. as a Python object.

To save any object structure along with its data, Python provides a module
called pickle. The pickle module is used for serializing and de-serializing
Python object structures.

Serialization is the process of transforming data or an object in memory
(RAM) into a stream of bytes, called a byte stream. A byte stream can then be
stored in a binary file on disk, stored in a database or sent through a
network. The serialization process is also called pickling.

De-serialization or unpickling is the inverse of the pickling process, where a
byte stream is converted back to a Python object.

Why Do We Need Object Serialization?

Let's understand why object serialization is so important.
You might be wondering why we can't just save data
structures into a text file and access them again when
required instead of having to serialize them.

Here is a nested dictionary containing student information like
name, age, and grade:

students = {
    'Student 1': {'Name': 'Alice', 'Age': 10, 'Grade': 4},
    'Student 2': {'Name': 'Bob', 'Age': 11, 'Grade': 5},
    'Student 3': {'Name': 'Elena', 'Age': 14, 'Grade': 8}
}

Let's proceed to write it to a text file without serialization:

with open('student_info.txt','w') as data:
    data.write(str(students))

Since we can only write string objects to text files, we have
converted the dictionary to a string using the str() function.
This means that the original state of our dictionary is lost.

Now we will read the data from the same file.

with open("student_info.txt", 'r') as f:
    for line in f:
        print(line)

The nested dictionary is now read back as a plain string. Trying to
access its keys or values, for example with line['Student 1'], raises
a TypeError because a string has no keys.

This is where serialization comes in. When dealing with more
complex data types like dictionaries, data frames, and nested
lists, serialization allows the user to preserve the object's
original state without losing any relevant information.
The pickle module deals with binary files. Here, data are not written but
dumped and similarly, data are not read but loaded.
The pickle module must be imported to load and dump data.

The pickle module provides two functions, dump() and load(), to work with
binary files for pickling and unpickling, respectively.
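
The contrast between saving with str() and saving with pickle can be sketched as follows. The file names and the temporary folder are assumptions; the dictionary is a shortened version of the students example.

```python
import os
import pickle
import tempfile

students = {
    "Student 1": {"Name": "Alice", "Age": 10, "Grade": 4},
    "Student 2": {"Name": "Bob", "Age": 11, "Grade": 5},
}

folder = tempfile.mkdtemp()                         # hypothetical scratch folder
txt_path = os.path.join(folder, "student_info.txt")
bin_path = os.path.join(folder, "student_info.dat")

# Text file: the dictionary has to be flattened into a string first.
with open(txt_path, "w") as f:
    f.write(str(students))
with open(txt_path, "r") as f:
    from_text = f.read()

# Binary file: pickle stores the object and gives it back unchanged.
with open(bin_path, "wb") as f:
    pickle.dump(students, f)
with open(bin_path, "rb") as f:
    from_pickle = pickle.load(f)

print(type(from_text))                    # a str, keys cannot be looked up
print(type(from_pickle))                  # a dict again
print(from_pickle["Student 2"]["Name"])
```

Files from untrusted sources should not be unpickled, because loading a pickle can execute arbitrary code.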

The dump() method
This method is used to convert (pickle) Python objects for writing data to a
binary file. The file to which data are to be dumped needs to be opened in
binary write mode (wb).

Syntax of dump() is as follows:
pickle.dump(data_object, file_object)

where data_object is the object that has to be dumped to the file with the
file handle named file_object.
The following program writes the record of a student (roll_no, name, gender and
marks) to the binary file named mybinary.dat using dump(). We need to
close the file after pickling.

import pickle
listvalues=[1,"Geetika",'F', 26]
fileobject=open("mybinary.dat", "wb")
pickle.dump(listvalues , fileobject)
fileobject.close()

Try opening the file that you created with Notepad. Its contents are stored as
bytes and will not be readable as plain text.

The load() method (Reading data from a binary file)

This method is used to load (unpickle) data from a binary file.
The file to be loaded is opened in binary read (rb) mode. Syntax of load() is
as follows:
store_object = pickle.load(file_object)

The load() method raises EOFError when it is called after all the data in the
file has been read. Hence the load() call is usually enclosed in a
try … except block.

import pickle

fileobject=open("mybinary.dat","rb")
try:
    objectvar=pickle.load(fileobject)
    #perform other operations
except EOFError:
    print("No more data in the file")
finally:
    fileobject.close()
import pickle

# list0 is used instead of list so that the built-in list type is not hidden
list0= [1,2,3,4]
list1= [11,2,3,4]
list2= [12,2,3,4]
list3= [13,2,3,4]

file = open("myfile.dat", 'wb')
pickle.dump(list0, file)
pickle.dump(list1, file)
pickle.dump(list2, file)
pickle.dump(list3, file)
file.close()

file =open("myfile.dat", 'rb')
try:
    while True:
        l1 = pickle.load(file)   # one record per call
        print(l1)
except EOFError:
    file.close()                 # all records have been read
Appending data to binary file

Appending data to a binary file is the same as for a text file, with the
difference that the file needs to be opened using the 'ab' mode.
Once opened, we can use the normal dump() function to append data to the file.
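
A minimal sketch of appending a record is shown below. The second student record and the temporary location of mybinary.dat are made up for the illustration.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mybinary.dat")  # hypothetical location

# Create the file with one record.
with open(path, "wb") as f:
    pickle.dump([1, "Geetika", "F", 26], f)

# 'ab' keeps the existing record and adds the new one after it.
with open(path, "ab") as f:
    pickle.dump([2, "Rohan", "M", 31], f)   # sample record

# Read every record back until load() signals the end of the file.
records = []
with open(path, "rb") as f:
    try:
        while True:
            records.append(pickle.load(f))
    except EOFError:
        pass

print(records)
```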

Exercise – Write a program to enter an employee's name, age, gender, salary and
date of joining.
The program should get the data from the user and write it to a file (Emp.dat) in
binary mode for as long as the user wishes to continue. Display all the data by
reading the file.

SETTING OFFSETS IN A FILE

If we want to access data in a random fashion, then Python gives us the seek()
and tell() methods to do so.

tell() method

This method returns an integer that specifies the current position of the file
object in the file. The position specified is the byte position from the
beginning of the file.
The syntax of using tell() is:

position = file_object.tell()
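
How the position reported by tell() changes while reading can be sketched as follows. The ten-byte sample file is an assumption, and binary mode is used so that positions are plain byte counts.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "tell_demo.bin")  # hypothetical file
with open(path, "wb") as f:
    f.write(b"0123456789")          # ten bytes of sample data

with open(path, "rb") as f:
    start = f.tell()                # position right after opening
    f.read(4)
    after_read = f.tell()           # position after reading four bytes
    f.read()
    at_end = f.tell()               # position after reading the rest

print(start, after_read, at_end)
```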

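seek() method

This method is used to position the file object at a particular position in a file.
The syntax of seek() is:

file_object.seek(offset [,mode])

offset is the number of bytes by which the file object is to be moved.
mode can have any of the following values:
0 - beginning of the file
1 - current position of the file
2 - end of file

By default, the value of mode is 0.
For files opened in text mode, only seeking from the beginning of the file
(mode 0) is allowed, apart from seek(0, 2) to jump to the end. Modes 1 and 2
with a non-zero offset need a file opened in binary mode.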
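
The three modes can be tried on a small binary file. This is only a sketch; the file name and its ten bytes of content are assumptions made for the illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")  # hypothetical file
with open(path, "wb") as f:
    f.write(b"ABCDEFGHIJ")

with open(path, "rb") as f:
    f.seek(3)                 # mode 0 (default): 3 bytes from the beginning
    from_start = f.read(2)

    f.seek(2, 1)              # mode 1: 2 bytes ahead of the current position
    from_current = f.read(1)

    f.seek(-3, 2)             # mode 2: 3 bytes before the end of the file
    from_end = f.read()

print(from_start, from_current, from_end)
```

A negative offset moves backwards, which is how the last few bytes of a file can be reached without reading everything before them.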

print("Learning to move the file object")

# Binary mode ("rb+") is used because text mode does not allow seeking
# relative to the current position. read() therefore returns bytes.
fileobject=open("testfile.txt","rb+")
data=fileobject.read()
print(data)
print("After reading the whole file, the position of the file object is: ",fileobject.tell())

fileobject.seek(0)
print("Now the file object is at the beginning of the file: ",fileobject.tell())

fileobject.seek(10) # fileobject.seek(10,0)
print("We are moving to 10th byte position from the beginning of file")
print("The position of the file object is at", fileobject.tell())

data=fileobject.read(5)
print(data)

fileobject.seek(20,1)
print("We are moving 20 bytes ahead of the current position")
print("The position of the file object is at", fileobject.tell())
fileobject.close()
Searching in a file - There is no predefined function in Python for searching
records in a binary file. We will write our own logic to search the records.

import pickle

records= [[1,2,3,4],[11,22,33,44],[21,32,43,54],[91,92,93,94]]

file = open("myfile.dat", 'wb')
pickle.dump(records, file)   # the whole nested list is stored as one object
file.close()

file =open("myfile.dat", 'rb')
found = False
num=int(input("Enter number to search"))

try:
    while True:
        data = pickle.load(file)   # returns the nested list that was dumped
        for l1 in data:            # each inner list is one record
            for j in l1:
                if j == num:
                    found = True
except EOFError:
    file.close()

if (found):
    print("number Present")
else:
    print("number not Present")

Binary file update – The following program writes a few records to a binary
file, then searches for a number and replaces it with a new value.

First write all the data to the file. Make sure you have at least 3-4 entries.
The file is then opened in "rb+" mode, which is for reading as well as for writing.
We will use the seek() and tell() methods to move the file pointer.

import pickle

list1= [1111,2222,33333,4444]
list2 = [1122,2233,3344,4455]
list3 = [2121,3232,4343,5454]
list4 = [9192,9293,9394,9495]

file = open("myfile.dat", 'wb')

pickle.dump(list1, file)
pickle.dump(list2, file)
pickle.dump(list3, file)
pickle.dump(list4, file)

file.close()
file =open("myfile.dat", 'rb+')
found = False

num=int(input("Enter number to search"))

try:
    while True:
        pos= file.tell()            # start of the record about to be read
        record = pickle.load(file)
        # print(record)
        for l1 in record:
            if l1 == num:
                found = True
                ind= record.index(l1)
                num1= int(input("enter number to replace"))
                record[ind] =num1
                #print(record)

                # Go back to the start of this record and overwrite it.
                # This is safe only when the new record pickles to the same
                # number of bytes as the old one.
                file.seek(pos,0)
                pickle.dump(record,file)
                break
except EOFError:
    file.close()

if (found):
    print("record updated")
else:
    print("number not Present")

To verify that the file is written properly, you may read it back.

import pickle

file1 =open("myfile.dat", 'rb')
try:
    while True:
        l1 = pickle.load(file1)
        print(l1)
except EOFError:
    file1.close()
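
Overwriting a record in place with seek() only works cleanly when the replacement occupies the same number of bytes, and pickled records are not guaranteed to have a fixed size. A more cautious alternative, sketched below under that assumption, is to load all records, change them in memory and write the whole file again. The helper load_all(), the sample values and the temporary file are hypothetical names introduced for this illustration.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "myfile.dat")  # hypothetical location

# Sample records, written one dump() call per record.
with open(path, "wb") as f:
    for rec in ([1111, 2222, 3333, 4444], [1122, 2233, 3344, 4455]):
        pickle.dump(rec, f)


def load_all(file_path):
    """Hypothetical helper: return every pickled record in the file as a list."""
    items = []
    with open(file_path, "rb") as f:
        try:
            while True:
                items.append(pickle.load(f))
        except EOFError:
            pass
    return items


old_value, new_value = 2233, 7   # sample values; the pickled size may change

# Change the records in memory.
records = load_all(path)
for rec in records:
    for i, value in enumerate(rec):
        if value == old_value:
            rec[i] = new_value

# Rewrite the whole file so that record sizes no longer matter.
with open(path, "wb") as f:
    for rec in records:
        pickle.dump(rec, f)

updated = load_all(path)
print(updated)
```

This trades some extra I/O for safety, which is usually acceptable for small data files like the ones used here.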
Exercise – A company wants to give a bonus to its employees.
Write a program to distribute the bonus using the following data.

Years Completed          Bonus %
0 to 5                   No bonus
More than 5, up to 8     10% of salary
More than 8              15% of salary

The program should use the Emp.dat file created earlier. Update the records
appropriately.
Display all the data by reading the file.

Summary –

• A file is a named location on a secondary storage medium where data are
permanently stored for later access.
• A text file contains only textual information consisting of letters,
digits and other special symbols. Such files are stored with extensions
like .txt, .py, .c, .csv, .html, etc. In an ASCII-encoded text file, each
byte represents a character.
• Each line of a text file is stored as a sequence of the encoded (for example
ASCII) values of its characters and is terminated by a special character,
called the End of Line (EOL).
• A binary file consists of data stored as a stream of bytes.
• open() is used to open a file in Python and it returns a file object
called a file handle. The file handle is used to transfer data to and from the
file by calling the functions defined in Python's io module.
• close() method is used to close the file. While closing a file, the system
frees up the resources, such as memory, allocated to it.
• write() method takes a string as an argument and writes it to the text file.
writelines() method is used to write multiple strings to a file. We need to
pass an iterable object such as a list or a tuple containing strings to the
writelines() method.
• read([n]) method is used to read a specified number (n) of characters or
bytes of data from a file.
• readline([n]) method reads one complete line from a file where lines
end with a newline (\n). It can also be used to read a specified number
(n) of characters from a file, but at most up to the newline character
(\n).
• readlines() method reads all the lines and returns them, each with its
newline character, as a list of strings.
• tell() method returns an integer that specifies the current position of the
file object. The position so specified is the byte position from the
beginning of the file up to the current position of the file object.
• seek() method is used to position the file object at a particular position in
a file.
