Chapter-4 Data File Handling (Notes)

Data File Handling involves opening files, performing operations like reading and writing, and then closing the files. There are two main types of files: text files, which are human-readable, and binary files, which contain arbitrary data. Common text file types include .txt, .csv and .py files, while binary file types include images, videos and audio. In Python, the open() function is used to open a file for reading, writing or appending, and the close() function closes the file. Various modes like 'r', 'w' and 'a' determine if the file can be read, written, or have content appended.

Uploaded by

Himanshi Tomer

Data File Handling

Friday, May 20, 2022  12:48 PM

PROGRAMS

Programs are of two kinds: Transient and Persistent.

Transient:
Run for a short period and produce the output, but when they end the data disappears because it is held only in RAM (volatile, temporary memory).

Persistent:
Run for a longer period and save some data in permanent storage. If closed, they resume from the same point when started again.
E.g.: Operating Systems

NEED:
In a general program, neither the input nor the output is saved for future use after execution. Hence, in order to store the output of a program in a file and to perform various operations on it, we require Data File Handling.

In PYTHON:
Python allows us to read and save data to external files permanently in secondary storage.
The script with the .py extension is also saved permanently to secondary storage.

It INVOLVES:
  I) Opening the file
 II) Performing operations
III) Closing the file

FILES

Data structures where data is packaged for storage on devices.
• A file is a stream of bytes, comprising data of interest.
• The data maintained inside a file is termed persistent.
• Files provide a means of communication between the program and the outside world.

TEXT FILES

1. A text file is usually considered as a sequence of lines.
2. It is a simple ASCII/UNICODE sequence.
3. A line is a sequence of characters stored on permanent storage media.
4. Each line is terminated by a special character, known as End of Line (EOL). In Python, the EOL character is '\n'.
5. At the lowest level, a text file is a collection of bytes.
6. Text files are stored in human-readable form and can be created using any text editor.
7. Examples:
Document Files:  .txt, .rtf
Tabular Files:  .csv, .tsv
Source Code Files:  .py, .js, .c, .app, .java
Web Standard Files:  .html, .xml, .css, .json
Configuration Files:  .ini, .cfg, .reg

BINARY FILES

1. A binary file contains arbitrary binary data, i.e. numbers stored in the file can be used directly for numerical operation(s).
2. Hence, working on a binary file means interpreting the raw bit pattern(s) read from it as the correct data type in the program.
3. In the case of a binary file it is extremely important to interpret the correct data type while reading the file. It is possible to interpret a stream of bytes (originally a string) as a numeric value, but the result is often incorrect and does not give the desired output after file processing.
4. Python provides special module(s) for encoding and decoding the data of a binary file.
5. Binary files are stored in non-human-readable form and need programs to access their content.
6. Used to store binary data such as images, video files, audio files, etc.
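
The point about interpreting raw bytes as the correct data type can be illustrated with a small sketch. It is not part of the original notes: the file name and the choice of a 4-byte integer are assumptions made only for the illustration.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "number.bin")

number = 1000
f = open(path, "wb")
f.write(number.to_bytes(4, "little"))    # store the number as 4 raw bytes
f.close()

f = open(path, "rb")
raw = f.read()                           # a bytes object, not a string
f.close()

correct = int.from_bytes(raw, "little")  # same byte order as when written
wrong = int.from_bytes(raw, "big")       # same bytes, wrong interpretation

print(raw, correct, wrong)
```

The same four bytes give 1000 only when they are read back with the layout that was used to write them.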

 
CSV FILES
(comma-separated values)

1. A CSV file is a human-readable text file used to store data in tabular form; each line of a CSV file is treated as a record.
2. It is a widely used import and export format for databases and spreadsheets.
3. The separator character of a CSV file is called a delimiter. Common delimiters:

comma  (,)
tab  ('\t')
colon  (:)
pipe  (|)
semicolon  (;)

OPENING A FILE 

open() 
1. The open() function takes the name of the file as the first argument.
2. Syntax: <file object or handle> = open(file_name, access_mode)

For a file 'abc',

                 f=open('abc')
3. In the given syntax we notice the following elements:

i)  file_object

     It establishes a link between the program and the data file. (Also referred to as the file handle.)

ii) access_mode

     It defines what can be done with the file (read, write or append) and where the file pointer is placed when the file is opened. If it is omitted, the default mode 'r' is used.

4. Modes for opening a file:

(r)  Read Mode    to read the file.
(w)  Write Mode   to write to the file.
(a)  Append Mode  to write at the end of the file.

5. When a file opened in a read mode is not found, "FileNotFoundError" is generated.
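
A minimal sketch of open() follows. The file names are hypothetical and are placed in a temporary folder only so that the example can run anywhere.

```python
import os
import tempfile

# Hypothetical file names inside a temporary folder, used only for this sketch.
folder = tempfile.mkdtemp()
existing = os.path.join(folder, "abc.txt")
missing = os.path.join(folder, "no_such_file.txt")

# 'w' creates the file because it does not exist yet.
f = open(existing, "w")
f.write("hello\n")
f.close()

# With no access mode given, open() uses the default mode 'r'.
f = open(existing)
default_mode = f.mode
f.close()

# Opening a missing file for reading raises FileNotFoundError.
try:
    open(missing, "r")
    error_raised = False
except FileNotFoundError:
    error_raised = True

print(default_mode, error_raised)
```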


 

CLOSING A FILE 

close() 
1. close() function flushes any unwritten info and closes the file object 
2. Syntax: <file variable>/<file object or handle>.close() 

For a file 'abc' open under filehandle 'f', 

                 f.close() 
3. f.closed tells us whether the file object is closed: it is True if the file is closed and False otherwise.
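
As a short illustration (the file name is an assumption), f.closed can be checked before and after close():

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "abc.txt")

f = open(path, "w")
f.write("some text\n")
before = f.closed     # the file is still open here
f.close()             # flushes unwritten data and closes the file
after = f.closed      # the file is closed now

print(before, after)
```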
 

FILE MODES

Reading Mode:

r    • Reading only.
     • Default mode.
     • File pointer at beginning.
r+   • Both reading and writing.
     • File pointer at beginning.
rb   • Reading only in binary format.
     • File pointer at beginning.
rb+  • Both reading and writing in binary format.
     • File pointer at beginning.

The file must already exist for every reading mode.

Writing Mode:

w    • Writing only.
     • Overwrites existing file.
     • File pointer at beginning (the existing contents are erased on opening).
w+   • Both writing and reading.
     • Overwrites existing file.
     • File pointer at beginning.
wb   • Writing only in binary format.
     • Overwrites existing file.
     • File pointer at beginning.
wb+  • Both writing and reading in binary format.
     • Overwrites existing file.
     • File pointer at beginning.

Creates a new file for the particular mode if it doesn't exist.

Appending Mode:

a    • Appending only.
     • File pointer at end.
a+   • Both appending and reading.
     • File pointer at end.
ab   • Appending only in binary format.
     • File pointer at end.
ab+  • Both appending and reading in binary format.
     • File pointer at end.

Creates a new file for the particular mode if it doesn't exist.
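
The difference between a mode that keeps the existing data ('r+') and one that erases it ('w+') can be sketched as below. The file name and the sample text are assumptions made for the illustration.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

f = open(path, "w")        # the file does not exist yet, so 'w' creates it
f.write("old data")
f.close()

f = open(path, "r+")       # read and write; the existing data is kept
kept = f.read()
f.close()

f = open(path, "w+")       # write and read; the existing data is erased
after_w_plus = f.read()    # nothing is left to read
f.write("new data")
f.seek(0)                  # go back to the beginning to read what was written
written = f.read()
f.close()

print(repr(kept), repr(after_w_plus), repr(written))
```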


READING A FILE

read() or read(n)
   x=file.read()
   x=file.read(n)
• read() reads the entire file; read(n) reads the next 'n' characters (bytes in binary mode).
• Reading starts at the current position of the file pointer; read() leaves the pointer at the end of the file.
• Returns a string.
• When n is negative or omitted, the entire file is read.

readline()
   x=file.readline()
• Reads only one line at a time.
• File pointer goes from its current position (the beginning, for a newly opened file) to the EOL.
• Returns a string.
• Stops after the EOL; the EOL character is also read and kept at the end of the string.
• The end of the file is signalled by an empty string.

readlines()
   x=file.readlines()
• Reads all lines in the text file.
• Returns a list of strings, one per line, each ending with '\n'.
• At the end of the file it returns an empty list.
• As every element of the list is a string, it can be manipulated.

If close() is not called after writing, data still held in the buffer may not reach the file; close() flushes the buffer before closing the file.
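
The three reading methods can be compared on one small file. This is only a sketch; the file name and its three lines are made up for the example.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "poem.txt")
f = open(path, "w")
f.write("line one\nline two\nline three\n")
f.close()

f = open(path, "r")
first_four = f.read(4)     # the first 4 characters
rest = f.read()            # everything from the pointer to the end
at_end = f.read()          # nothing left, so an empty string
f.close()

f = open(path, "r")
line1 = f.readline()       # one line, including its '\n'
line2 = f.readline()       # the next line
f.close()

f = open(path, "r")
all_lines = f.readlines()  # list with one string per line
f.close()

print(repr(first_four), repr(line1), all_lines)
```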
 

WRITING TO A FILE 
 

write()
   x=file.write(string)
• Takes a string and writes it to the file; x receives the number of characters written.
• To store with EOL, '\n' needs to be specified at the end of the string.
• The entire argument must be a string.

writelines()
   file.writelines(sequence)
• Writes every string of a sequence (such as a list or tuple of strings) to the text file.
• To store with EOL, '\n' needs to be specified at the end of each string.
• Accepts any sequence, but every element must be a string.

For both methods:
• A new file is created when it doesn't exist (file opened in 'w' or 'a' mode).
• In 'w' mode an existing file is overwritten each time it is opened (old data is lost).
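
A short sketch of both writing methods, assuming a hypothetical file name and sample lines:

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

f = open(path, "w")
count = f.write("first line\n")                  # number of characters written
f.writelines(["second line\n", "third line\n"])  # '\n' must be given explicitly
f.close()

f = open(path, "r")
content = f.read()
f.close()

print(count)
print(content)
```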

 
with statement 
• with open("filename", "filemode") as f:
       f.write("argument1 EOLchar")
       f.write("argument2 EOLchar")
• Used to group file operation statements within a block to make code more compact and readable.
• Ensures that all resources allocated to the file object are released automatically once the block ends; the file is closed even if an error occurs inside the block.
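
A runnable version of the pattern above might look like this. The file name and the two strings are placeholders chosen for the sketch.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(path, "w") as f:
    f.write("argument1\n")
    f.write("argument2\n")

# No f.close() is needed: the file is closed when the block ends.
closed_after_block = f.closed

with open(path, "r") as f:
    lines = f.readlines()

print(closed_after_block, lines)
```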
 

APPENDING A FILE 
 
fileobj=open("filename", "a") 
Append means 'to add'; hence the data written under the 'a' mode is added to the file, unlike the 'w' mode, which overwrites the file.
We can deduce that:
• If the file exists, it will not be erased, and if it doesn't exist, it will be created.
• When data is written, it gets added to the end of the file, which means that the file pointer is at the end of the file.
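
The contrast between 'a' and 'w' can be sketched as follows, with a hypothetical file name and made-up entries:

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

with open(path, "a") as f:     # the file does not exist, so 'a' creates it
    f.write("entry 1\n")

with open(path, "a") as f:     # the old data is kept, new data goes to the end
    f.write("entry 2\n")

with open(path, "r") as f:
    after_append = f.read()

with open(path, "w") as f:     # 'w' erases everything written so far
    f.write("entry 3\n")

with open(path, "r") as f:
    after_write = f.read()

print(repr(after_append), repr(after_write))
```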
BINARY FILES 

(READING AND WRITING A FILE) 


The Pickle Module:
• It is used to read and write structures such as lists and dictionaries.
• Used for serialising and deserialising.

Serialising or pickling is the transformation of data/objects in RAM into a byte stream for storage on disk or in a database, or for sending through a network. It refers to the process of converting the structure to a byte stream before writing it to the file,

whereas

unpickling refers to converting the byte stream back to the original data structure.
• write() and read() work only with str (text files) or bytes (binary files), so structures such as lists and dictionaries must be converted before they are stored.
• Hence the module can be used to store almost any kind of Python object in a binary file, as it keeps the object together with its structure.
• Steps:
  1. Import the pickle module.
  2. Open the binary file in a file object, in the required mode.
  3. To read ("rb"):
          Use pickle.load(fileobject)
     To write ("wb"):
          Take the desired input in a variable e.g.- x.
          Now use pickle.dump(x, fileobject)
  4. Close the file using fileobject.close()
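
The steps above can be sketched as below. The file name and the sample dictionary are assumptions made for the illustration.

```python
import os
import pickle
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "student.dat")

# Made-up sample record.
x = {"roll": 1, "name": "Asha", "marks": [78, 91]}

f = open(path, "wb")     # binary write mode
pickle.dump(x, f)        # pickling: object -> byte stream -> file
f.close()

f = open(path, "rb")     # binary read mode
y = pickle.load(f)       # unpickling: byte stream -> object
f.close()

print(y)
```

The loaded object is equal to the original but is a separate copy rebuilt from the bytes in the file.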


 

OPERATIONS IN A BINARY FILE 


  

Inserting/Appending
Step1: Import pickle module.
Step2: Add record using dump() method.
File mode: wb (use ab to keep the records already in the file)

Reading
Step1: Import pickle module.
Step2: Print using load() method.
File mode: rb

Searching
Step1: Import pickle module.
Step2: load() method takes all the data in a variable e.g.- r.
Step3: The element to be searched is taken in another variable- x.
Step4: A for loop over r is initiated with the test
             if loop_var[0]==x
Step5: If the element is found, the record is printed.
File mode: rb

Updating
Step1: Import pickle module.
Step2: load() method is used and the element requiring change is searched.
Step3: If found, the new data is taken in a variable and written back using dump(); if not found, an error is reported.
File mode: rb+
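
One possible way to carry out the four operations is sketched below. It assumes, purely for illustration, that the file holds a single pickled list of records of the form [roll, name, marks]; the file name and the data are made up.

```python
import os
import pickle
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "records.dat")

# Inserting: write the list of records with dump() (mode wb).
records = [[1, "Asha", 78], [2, "Ravi", 64]]
with open(path, "wb") as f:
    pickle.dump(records, f)

# Reading: load() brings all the data back into r (mode rb).
with open(path, "rb") as f:
    r = pickle.load(f)

# Searching: compare the first field of every record with x.
x = 2
found = None
for rec in r:
    if rec[0] == x:
        found = rec
        break

# Updating: load, change the matching record, write back (mode rb+).
with open(path, "rb+") as f:
    r = pickle.load(f)
    for rec in r:
        if rec[0] == 2:
            rec[2] = 70
    f.seek(0)            # return to the beginning before rewriting
    pickle.dump(r, f)
    f.truncate()         # drop leftover bytes if the new data is shorter

with open(path, "rb") as f:
    updated = pickle.load(f)

print(found, updated)
```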

  

 
 
 
EXCEPTION HANDLING 
try block:
• Contains the code that is to be run.
• Includes statements that might generate some error or exception.

except block:
• Runs when an error or exception occurs in the try block.

Exception is the base class of almost all built-in exception types in Python.

Writing "except Exception:" therefore catches them, so it can be used when the exact error is not known.
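
A small sketch of a try/except pair that relies on the Exception base class is given below. The missing file name is hypothetical.

```python
import os
import tempfile

# Hypothetical path to a file that does not exist.
missing = os.path.join(tempfile.mkdtemp(), "missing.txt")

try:
    # Statements that might generate an error go inside the try block.
    f = open(missing, "r")
    data = f.read()
    f.close()
    error_name = None
except Exception as e:
    # Runs only when an error occurs; Exception also covers FileNotFoundError.
    error_name = type(e).__name__

print(error_name)
```

Catching a specific class such as FileNotFoundError is usually preferable when the kind of error is known in advance.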


 

RANDOM ACCESS IN FILES 


seek()
• Changes the position of the file pointer (handle, cursor) to a given, specific position.
• Syntax:  f.seek(offset, from_what)
• from_what is the reference point from which the offset is counted:

0= Beginning of the file, the default positioning.

1= Current position of the file pointer.

2= End of the file.

• seek() can be used in two ways:

            i) Absolute Positioning

                The offset is counted from the beginning of the file.

                Syntax:  f.seek(file_location)

                           e.g.- f.seek(20) places the file pointer 20 bytes from the beginning of the file.

           ii) Relative Positioning

                The offset is counted from the current position or from the end of the file.

                Syntax:  f.seek(offset, from_what)

                e.g.- f.seek(20,1) places the file pointer 20 bytes forward from the current position, because from_what is 1.

                         f.seek(-10,2) places the file pointer 10 bytes before the end of the file, because from_what is 2.

                Only 0, 1 and 2 are valid values of from_what. A file opened in text mode allows only seeks counted from the beginning (apart from f.seek(0,2) to jump to the end), so relative positioning needs a binary mode such as "rb".
tell()
• Syntax:  f.tell()
• Returns the current position of the file pointer.
• In reading or writing mode, the file pointer is at byte 0 when the file is opened.
• In append mode, the file pointer is at the end of the file when it is opened.
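
seek() and tell() can be tried on a small binary file, as in this sketch. The file name and its 20 bytes of content are made up for the example; binary mode is used so that relative positioning is allowed.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")

with open(path, "wb") as f:
    f.write(b"0123456789ABCDEFGHIJ")   # 20 bytes

with open(path, "rb") as f:
    start = f.tell()      # pointer position right after opening
    f.seek(5)             # absolute: 5 bytes from the beginning
    a = f.read(3)         # reads bytes 5, 6, 7; the pointer is now at 8
    f.seek(2, 1)          # relative: 2 bytes forward from the current position
    b = f.read(2)         # reads bytes 10, 11
    f.seek(-4, 2)         # relative: 4 bytes before the end of the file
    c = f.read()          # reads the last 4 bytes
    end = f.tell()        # pointer is at the end of the file

with open(path, "ab") as f:
    append_pos = f.tell() # append mode starts at the end of the file

print(start, a, b, c, end, append_pos)
```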
 

CSV FILES 

(COMMA SEPARATED VALUES) 


 
• Each line of the file is called a record. 
• Each record consists of fields separated by commas (delimiter). 
• Used for storing tabular data in spreadsheet or database. 
• The tabular data is stored as text. 
• Advantages: 

             i) faster 

            ii) smaller in size 

           iii) easy to generate and import into a spreadsheet or database 

           iv) human readable and easy to edit 

            v) simple to interpret and parse 

           vi) processed by almost all existing applications. 


 

Reading
Syntax: csv.reader(fileobject)
csv.reader() returns an iterable object which reads the CSV file line by line; each line is returned as a list of strings.
Step1: import csv
Step2: fileobject=open("filename", "r")
Step3: variable=csv.reader(fileobject)
Step4: Iterate
             for row in variable:
                    print(row)
Step5: Close the file.
             fileobject.close()

Writing
Syntax: csv.writer(fileobject)
csv.writer() returns a writer object which converts the user's data into delimited strings.
Step1: import csv
Step2: In a variable fields, take the field names in list form.
Step3: In a variable rows, take the field data in the form of a list of lists.
Step4: fileobject=open("filename", "w", newline="")
             (newline="" avoids blank lines between rows on some systems)
Step5: variable=csv.writer(fileobject, delimiter=',')
Step6: Write the fields to the csv file.
             variable.writerow(fields)
Step7: Write the data row-wise. This can be done in two ways: by iterating with writerow(), or directly, all at once, with writerows().
             for i in rows:
                    variable.writerow(i)
       or
             variable.writerows(rows)
Step8: Close the file.
             fileobject.close()
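
The reading and writing steps can be combined into one runnable sketch. The file name, field names and rows are made-up sample data.

```python
import csv
import os
import tempfile

# Hypothetical file inside a temporary folder, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "marks.csv")

fields = ["roll", "name", "marks"]           # field names
rows = [[1, "Asha", 78], [2, "Ravi", 64]]    # field data, one list per record

# Writing: header first, then all the rows at once.
with open(path, "w", newline="") as f:
    writer = csv.writer(f, delimiter=",")
    writer.writerow(fields)
    writer.writerows(rows)

# Reading: every record comes back as a list of strings.
with open(path, "r", newline="") as f:
    reader = csv.reader(f)
    data = [row for row in reader]

for row in data:
    print(row)
```

Numbers written to a CSV file come back as strings, so they need to be converted with int() or float() before calculations.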
