Chapter-4 Data File Handling (Notes)
NEED:
In a general program, neither the input nor the output is saved for future use after execution. Hence, in order to store the output of a program in a file and to perform various operations on it, we require Data File Handling.
Programs
Programs are of two types: Transient and Persistent.
Transient:
Run for a short period and produce the output, but when they end the data disappears, as it is held only in RAM (volatile, temporary memory).
Persistent:
Run for a longer period and save some data in permanent storage. If closed, they resume from the same point.
E.g.: Operating Systems
Data File Handling In PYTHON:
Python allows us to read and save data to external files permanently in secondary storage.
The script with .py extension is also
6. Are stored in human readable form and can be created using any text editor.
7. Examples:
Document Files: .txt, .rtf
Tabular Files: .csv, .tsv
Source Code Files: .py, .js, .c, .app, .java
Web Standard Files: .html, .xml, .css, .json
Configuration Files: .ini, .cfg, .reg
While it is possible to interpret a stream of bytes (originally a string) as a numeric value, the results are often incorrect and do not give the desired output after file processing.
4. Python provides special module(s) for encoding and decoding of data for binary files.
5. Binary files are stored in a non-human-readable form and need programs to access their contents.
6. Used to store binary data such as images, video files, audio files, etc.
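A minimal sketch of the difference between the two kinds of access, assuming a hypothetical file sample.txt created in a temporary folder. The same file is read once in text mode (giving a str) and once in binary mode (giving the raw bytes).

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Text mode: Python encodes the str to bytes when writing.
with open(path, "w", encoding="utf-8") as f:
    f.write("Roll 12\n")

# Text mode read returns a str (human readable).
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode read returns the raw bytes stored on disk.
with open(path, "rb") as f:
    as_bytes = f.read()

print(type(as_text).__name__, repr(as_text))
print(type(as_bytes).__name__, as_bytes)
```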
CSV FILES
(comma-separated values)
1. A CSV file is just like a text file: it is in human readable format, except that it is used to store data in tabular form, with each line of the file treated as a record.
2. It is the most preferred import and export format for databases and spreadsheets.
3. The separator character of CSV files is called a delimiter. Common delimiters are:
comma (,)
tab ('\t')
colon (:)
pipe (|)
semicolon (;)
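A minimal sketch of a delimiter other than the comma, assuming Python's built-in csv module and a made-up pipe-separated record held in a string instead of a real file.

```python
import csv
import io

# Hypothetical data: two records separated by a pipe (|) instead of a comma.
data = "rollno|name|marks\n1|Asha|91\n"

# io.StringIO lets us treat the string like an open text file.
reader = csv.reader(io.StringIO(data), delimiter="|")
rows = list(reader)

for row in rows:
    print(row)
```

Each record comes back as a list of strings, split wherever the delimiter occurs.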
OPENING A FILE
open()
1. open() function takes the name of the file as the first argument
2. Syntax: <file variable>/<file object or handle> = open(file_name, access_mode)
f=open('abc')
3. In the given syntax we notice the following elements:
i) file_object
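It establishes a link between the program and the data file. (Also referred to as a file handle or file variable.)
ii) access_mode
It defines the kind of operation allowed on the file (reading, writing or appending) and, with it, the initial location of the file pointer (from where data is read or written).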
4. Modes for opening a file:
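A minimal sketch of opening a file with and without an access mode, assuming a hypothetical file abc.txt in a temporary folder. When no mode is given, Python uses read mode ("r").

```python
import os
import tempfile

# Hypothetical file name in a temporary folder, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "abc.txt")

f = open(path, "w")      # "w": write mode, creates the file
f.write("hello")
f.close()

f = open(path)           # no mode given, so the default "r" (read) is used
mode_used = f.mode
content = f.read()
f.close()

print(mode_used, content)
```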
CLOSING A FILE
close()
1. The close() function flushes any unwritten information and closes the file object.
2. Syntax: <file variable>/<file object or handle>.close()
f.close()
3. f.closed tells us whether the file object is closed in the form of True and False.
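A minimal sketch of close() and f.closed, assuming a hypothetical file notes.txt in a temporary folder.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical file

f = open(path, "w")
f.write("saved on close")
before = f.closed    # file is still open here
f.close()            # flushes buffered data and releases the file
after = f.closed

print(before, after)
```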
FILE MODES
Reading Mode:
Writing Mode:
If we do not close the file, data still held in the buffer may never be written to the file and can be lost; close() flushes the buffer before closing.
WRITING TO A FILE
write()
• Syntax: x=file.write(<string>), where x receives the number of characters written.
• Takes a single string and writes it to the file.
• To store with EOL, it needs to be specified at the end of the string.
• The entire argument must be a string.
• In 'w' mode, a new file is created when it doesn't exist.
• In 'w' mode, an existing file gets overwritten each time it is opened (old data is lost).
writelines()
• Syntax: file.writelines(<sequence of strings>)
• Writes every string in a sequence (such as a list or tuple of strings) to the text file.
• To store with EOL, it needs to be specified at the end of each string in the argument.
• Accepts any sequence data type whose items are strings.
• In 'w' mode, a new file is created when it doesn't exist.
• In 'w' mode, an existing file gets overwritten each time it is opened (old data is lost).
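A minimal sketch comparing write() and writelines(), assuming a hypothetical file lines.txt in a temporary folder. Note that neither function adds the EOL character on its own.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical file

f = open(path, "w")
chars_written = f.write("first line\n")          # one string, EOL added by hand
f.writelines(["second line\n", "third line\n"])  # a list of strings, EOL in each
f.close()

f = open(path)
lines = f.readlines()
f.close()

print(chars_written, lines)
```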
with statement
• with open("filename", "filemode") as fileobj:
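A minimal sketch of the with statement, assuming a hypothetical file demo.txt in a temporary folder. The file is closed automatically when the block ends, so no explicit close() call is needed.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file

with open(path, "w") as fileobj:
    fileobj.write("written inside with\n")
    inside = fileobj.closed   # still open inside the block

# The file is closed automatically once the block ends.
outside = fileobj.closed
print(inside, outside)
```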
APPENDING A FILE
fileobj=open("filename", "a")
Append means 'to add'; hence the data written in 'a' mode is added to the file, unlike 'w' mode, which overwrites the file.
We can deduce that:
• If the file exists, it will not be erased, and if it doesn't exist, it will be created.
• When data is written, it gets added to the end of the file, which means that the file pointer is at the end of the file.
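The two points above can be sketched as follows, assuming a hypothetical file log.txt in a temporary folder that does not exist before the first open().

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file

with open(path, "a") as f:   # file does not exist yet, so it is created
    f.write("first\n")

with open(path, "a") as f:   # file exists, old data is kept
    f.write("second\n")

with open(path) as f:
    content = f.read()

print(content)
```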
BINARY FILES
Serialising or pickling is the transformation of data/objects in RAM into byte streams for storage on disk or in a database, or for sending through a network.
It refers to the process of converting the structure to a byte stream before writing to the file.
whereas
Unpickling refers to converting the byte stream back to the original data structure.
• As we know, in Python writing and reading work with str parameters, so conversion is necessary.
• Hence the pickle module can be used to store most kinds of Python objects in a binary file, as it preserves the objects along with their structure.
• Steps:
Import the pickle module.
Open the binary file in the required mode and assign it to a file object.
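A minimal sketch of pickling and unpickling with pickle.dump() and pickle.load(), assuming a hypothetical file student.dat in a temporary folder and a made-up student record.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "student.dat")  # hypothetical file
student = {"rollno": 1, "name": "Asha", "marks": [91, 88]}  # made-up record

# Pickling: write the object to a binary file opened in "wb" mode.
with open(path, "wb") as f:
    pickle.dump(student, f)

# Unpickling: read the byte stream back into a Python object.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

The restored object is a new dictionary with the same structure and values as the original.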
EXCEPTION HANDLING
try block:
• Contains the code that Python attempts to run.
• Includes statements that might generate some error or exception.
except block:
• Runs when an error or exception occurs.
Exception is the base class of almost all built-in exception types in Python, so "except Exception" catches nearly every ordinary error.
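A minimal sketch of try and except around file handling, assuming a hypothetical path to a file that does not exist.

```python
import os
import tempfile

# Hypothetical path to a file that does not exist.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")

try:
    f = open(missing, "r")      # may raise an exception
    data = f.read()
    f.close()
    result = "read ok"
except FileNotFoundError:       # runs only if the file is missing
    result = "file not found"
except Exception as err:        # catches any other ordinary exception
    result = "other error: " + str(err)

print(result)
```

Placing the specific exception first lets the program give a precise message, while the general block acts as a fallback.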
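e.g.- f.seek(20) places the file pointer at the 20th byte from the beginning of the file.
e.g.- f.seek(20,1) places the file pointer 20 bytes forward from the current position (second argument 1).
f.seek(-10,1) places the file pointer 10 bytes backward from the current position.
The second argument is normally 0 (beginning of the file, the default), 1 (current position) or 2 (end of the file). In text mode, only offsets measured from the beginning are generally allowed, so relative seeks such as these need a file opened in binary mode.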
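A minimal sketch of seek() from the beginning, from the current position and from the end, assuming a hypothetical 36-byte binary file bytes.dat in a temporary folder. Binary mode is used because text mode restricts relative seeks.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bytes.dat")  # hypothetical file

with open(path, "wb") as f:
    f.write(b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")  # 36 bytes

with open(path, "rb") as f:
    f.seek(20)            # 20 bytes from the beginning (second argument defaults to 0)
    from_start = f.read(1)
    f.seek(2, 1)          # 2 bytes forward from the current position
    from_current = f.read(1)
    f.seek(-10, 2)        # 10 bytes before the end of the file
    from_end = f.read(1)

print(from_start, from_current, from_end)
```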
tell()
• Syntax: f.tell()
• Returns current position of the file pointer.
• In reading or writing mode, the file pointer is initially at byte 0.
• In append mode, the file pointer is initially at the end of the file.
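A minimal sketch of tell() in the three modes, assuming a hypothetical file pos.txt in a temporary folder that holds only plain ASCII text.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pos.txt")  # hypothetical file

with open(path, "w") as f:
    start_w = f.tell()        # pointer at the start in write mode
    f.write("hello")
    after_write = f.tell()    # pointer moved past the 5 characters written

with open(path, "r") as f:
    start_r = f.tell()        # pointer at the start in read mode

with open(path, "a") as f:
    start_a = f.tell()        # pointer at the end of the file in append mode

print(start_w, after_write, start_r, start_a)
```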
CSV FILES
i) faster
Reading
Syntax: csv.reader(fileobject)
reader() returns an iterable object that reads the CSV file line by line.
Step1: import csv
Step4: Iterate
Step5: Close the file.
fileobject.close()
Writing
Syntax: csv.writer(fileobject)
The writer() object converts the user's data into a delimited string.
Step1: import csv
Step4: fileobject=open("filename", "w")
Step7: Write the data row-wise. This can be done two ways.
<writer object>.writerow(i)
fileobject.close()
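A minimal sketch of writing and then reading a CSV file, assuming a hypothetical file marks.csv in a temporary folder and made-up records. It uses writerow() for a single row and writerows() for several rows at once, which is presumably what the two ways of writing row-wise refer to.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "marks.csv")  # hypothetical file

# Writing: newline="" stops the csv module from adding blank lines on Windows.
with open(path, "w", newline="") as fileobject:
    writer = csv.writer(fileobject)
    writer.writerow(["rollno", "name"])           # one row at a time
    writer.writerows([[1, "Asha"], [2, "Ravi"]])  # several rows at once

# Reading: the reader object yields each row as a list of strings.
with open(path, "r", newline="") as fileobject:
    rows = [row for row in csv.reader(fileobject)]

print(rows)
```

Numbers written to the file come back as strings, so they need converting with int() or float() before any calculation.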