Transaction and Master Files
Theory Notes
Introduction
Data is stored in files on storage devices, for example, on floppy disks, hard drives and magnetic
tape. There is a choice of storage devices and there is also a choice in the way that files are
organized on the storage device.
Files can be organized so that a piece of data in the file can be accessed directly, without going
through other pieces of data to get to it. This is a fast method of data access.
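As a rough illustration of direct access, the sketch below stores fixed-length records and jumps
straight to one of them by position. The record layout (a 4-byte ID and a 20-byte name), the helper
names and the sample data are assumptions made for this example only, and an in-memory stream
stands in for a file on a direct-access device.

```python
import io
import struct

# Hypothetical fixed-length record: 4-byte integer ID followed by a 20-byte name.
RECORD = struct.Struct("<i20s")

def write_records(stream, records):
    """Write each (id, name) pair as one fixed-length record."""
    for rec_id, name in records:
        stream.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(stream, position):
    """Jump straight to record number `position` (0-based) and read it."""
    stream.seek(position * RECORD.size)  # no need to read the earlier records
    rec_id, raw_name = RECORD.unpack(stream.read(RECORD.size))
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii")

store = io.BytesIO()  # stands in for a file on a hard drive
write_records(store, [(101, "Adams"), (102, "Baker"), (103, "Clark")])
print(read_record(store, 2))  # (103, 'Clark')
```

Because every record has the same length, the position of any record can be calculated, which is
what makes the jump possible.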
Files could also be organized so that they are accessed serially or sequentially. This means that
to find a record, you start at the beginning of the file and go through all the records in turn until you
get to the one you want. This is a slow method of data access but has other benefits. When
deciding what file organization to use and what storage device to use, the key questions to ask
are:
How quickly do I need to get back a particular piece of data?
What do I want to do with the data?
IT (9626)
Serial files
Data stored in a serial file is stored in time order only. When a new piece of data is added to the
file, it is simply added on to the end of the file. If you had a serial file with 10,000 items in it, they
would be held on the storage device in the order in which they were received. This makes getting back
any individual piece of data a rather slow process. The only way you can retrieve a particular
record in a serial file is to:
Check the file is not empty.
If it is empty, report 'Empty' and stop.
If it is not empty, start at the first record in the file
Check if it is the record you want.
If it is, report 'Record found' and stop.
If it is not, get the next record and repeat, until you either find the record you want or
you get to the end of the file.
If you get to the end of the file, report 'Record not found'.
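The steps above can be sketched in a few lines. This is only an illustration: the serial file is
modelled as a Python list of (key, data) pairs, and the function name and sample records are
assumptions, not part of the notes.

```python
def serial_search(records, wanted_key):
    """Search a serial file record by record, starting at the first one."""
    if not records:
        return "Empty"
    for key, _data in records:       # visit each record in turn
        if key == wanted_key:
            return "Record found"
    return "Record not found"        # reached the end of the file

# Records are held in the order they arrived, not in key order.
serial_file = [(307, "first sale"), (112, "second sale"), (451, "third sale")]
print(serial_search(serial_file, 451))  # Record found
print(serial_search(serial_file, 999))  # Record not found
print(serial_search([], 451))           # Empty
```

In the worst case every record is read before the answer is known, which is why serial access is
slow for large files.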
Data access with serial files is slow and is therefore not suitable for applications that require fast
access to data. You wouldn't want to use this system of organization, for example, for a driving
license database used to deal with customer enquiries. If a driver rang up DVLC (the driving
license center) and wanted to query their details, it would take a very long time to search
the whole database serially! Some applications, however, don't require fast access to data. Here
are some examples.
Back‐ups
Most networks back up the users' data onto magnetic tape during the night. The data will
generally not be needed, although occasionally a user will need to get back a copy of a file that
they have accidentally deleted, and very occasionally the whole network will need to be
recovered after a system crash.
A shop
A shopkeeper might keep a record of each transaction made in the shop in time order (a serial
file) over a 24‐hour period. This file will hold details of each sale; what was sold and how many
of each item. This file (also known as a ‘transaction file’) can then be used at the end of the 24‐
hour period to update the ‘master file’ of products. The master file is a record of the products
and what is in stock. It doesn't change except when it is updated using the transaction file. Once
updated, the stock control system can then be used to automatically re‐order items.
It is also worth noting that updating the master file using the transaction file is done using
batch processing. In batch processing systems, all the data is collected together in one place (in
the transaction file) before being processed. Once a transaction file has been used to update the
master file, the transaction file can be cleared out. It will then be ready for a new transaction
period. Master files are permanent. Transaction files exist only for the transaction period, and
then they are cleared. It is also worth noting here that while a serial file can be used for this
application, a more efficient way to process the transaction file would be to re‐organize it into a
sequential file first.
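A minimal sketch of the shop example follows, assuming the master file is a dictionary of stock
levels keyed by product ID and the transaction file is a list of (product ID, quantity sold)
pairs. The names, the sample figures and the re-order level of 5 are all assumptions made for
illustration.

```python
def batch_update(master, transactions, reorder_level=5):
    """Apply a period's sales to the master file, then list products to re-order."""
    for product_id, quantity_sold in transactions:
        master[product_id] -= quantity_sold
    transactions.clear()  # the transaction file is emptied for the next period
    # Assumed rule: re-order anything at or below the re-order level.
    return sorted(pid for pid, stock in master.items() if stock <= reorder_level)

master_file = {"P01": 20, "P02": 8, "P03": 15}
transaction_file = [("P02", 2), ("P01", 3), ("P02", 4), ("P03", 1)]  # time order

to_reorder = batch_update(master_file, transaction_file)
print(master_file)       # {'P01': 17, 'P02': 2, 'P03': 14}
print(to_reorder)        # ['P02']
print(transaction_file)  # []
```

Nothing happens to the master file while sales are being recorded; all of the changes are applied
together in one batch run.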
A payroll system
When a company's employees 'clock on' or 'clock off', a record of this event is stored in a serial
file set up for this purpose. The entry in the file will record the employee number and the time
and date they clocked on or off. At the end of the week this serial file (note that it is also
known as a 'transaction file') is sorted into a sequential file. Sorting collects together each
employee's entries, which are scattered throughout the serial file, and puts them into the same
order as the employee records on the master file, probably employee number order. The sequential
file is then batch processed with the master file. (The master file contains an up‐to‐date record
of each employee, their hourly rate and a running total of their pay and deductions for the
current tax year.) The processing in this example means that the total number of hours each
employee has worked in the last week is calculated. The pay due is then worked out and the
deductions (tax, NI, pension contributions and so on) made. Pay slips are printed out. The serial
file is then cleared, ready for the next period of record collection.
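The payroll run described above might be sketched as follows. The record layout (employee
number, "on" or "off", date and time), the field names in the master file and the sample data
are assumptions, every "off" is assumed to follow a matching "on", and deductions are left out
to keep the example short.

```python
from datetime import datetime

def weekly_hours(clock_events):
    """Sort the serial file into employee order, then total each employee's hours."""
    sequential = sorted(clock_events, key=lambda e: (e[0], e[2]))
    hours, clocked_on = {}, {}
    for emp_no, event, when in sequential:
        if event == "on":
            clocked_on[emp_no] = when
        else:  # "off": add the time worked since the matching "on"
            worked = (when - clocked_on.pop(emp_no)).total_seconds() / 3600
            hours[emp_no] = hours.get(emp_no, 0.0) + worked
    return hours

def run_payroll(master, clock_events):
    """Batch process the week's events against the master file (gross pay only)."""
    payslips = {}
    for emp_no, hours in weekly_hours(clock_events).items():
        payslips[emp_no] = hours * master[emp_no]["hourly_rate"]
        master[emp_no]["pay_to_date"] += payslips[emp_no]
    clock_events.clear()  # ready for the next period of record collection
    return payslips

master_file = {1: {"hourly_rate": 10.0, "pay_to_date": 100.0},
               2: {"hourly_rate": 12.0, "pay_to_date": 0.0}}
events = [(2, "on", datetime(2024, 1, 1, 9, 0)),
          (1, "on", datetime(2024, 1, 1, 9, 30)),
          (2, "off", datetime(2024, 1, 1, 13, 0)),
          (1, "off", datetime(2024, 1, 1, 17, 30)),
          (1, "on", datetime(2024, 1, 2, 9, 0)),
          (1, "off", datetime(2024, 1, 2, 12, 0))]
payslips = run_payroll(master_file, events)
print(payslips)  # {1: 110.0, 2: 48.0}
```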
Sequential files
Serial files are organized by time. There are situations where this may not be the best way to
organize data. Consider the above example involving a shopkeeper. Over a 24‐hour period, the
same products will be bought many times yet the details of each transaction involving the same
product will be scattered throughout the serial file. When the serial file is processed, the
program will need to return to a product's details in the master file every time it comes across
a transaction for that product. Having to constantly re‐visit master file entries isn't very
efficient! It would be better if, before the master file was updated, the serial file were
processed to put all of the transactions for each product together. It would also be useful if
the records were in the same order as the records in the master file. Then a product could be
accessed once on the master file and all of the transactions that have taken place involving
that product could be processed in one go. And since the sequential file and the master file
records are in the same order, the records will be processed together in the most efficient way.
You won't get a record from the master file and then have to hunt through the sequential file.
A file that is in some kind of order other than time order is known as a sequential file. In the
above example, the master file might be a file of product details held in product ID order. We
should therefore create a sequential file organized by product ID by processing the serial file,
ensuring that all the product transactions for each product are together. We would then batch
process the sequential file to update the master file.
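The idea can be sketched like this, assuming the master file is a list of [product ID, stock]
records in product ID order and the serial file is a list of (product ID, quantity sold) pairs.
The names and figures are illustrative assumptions.

```python
def sequential_update(master, serial_transactions):
    """Sort the transactions by product ID, then update the master file in one pass."""
    sequential = sorted(serial_transactions, key=lambda t: t[0])
    i = 0  # current position in the sequential file
    for record in master:  # each master record is visited exactly once
        while i < len(sequential) and sequential[i][0] <= record[0]:
            if sequential[i][0] == record[0]:
                record[1] -= sequential[i][1]
            i += 1  # IDs with no master record are skipped (an assumption)
    return master

master_file = [["P01", 20], ["P02", 8], ["P03", 15]]               # product ID order
serial_file = [("P02", 2), ("P01", 3), ("P02", 4), ("P03", 1)]     # time order

print(sequential_update(master_file, serial_file))
# [['P01', 17], ['P02', 2], ['P03', 14]]
```

Both files are read from start to finish once and neither is ever re-read, which is the
efficiency gain the sorting step buys.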
If you store files on a magnetic tape, for example, to back up a hard drive, then the files on the
tape will be in either serial or sequential order. You cannot go directly to a data item but have to
go through all the data items before it to get to it. This means data access can be slow, especially if
there are lots of records. There are many occasions when you need fast access to some data.
First of all, you need to select a storage medium that allows direct access to areas of data,
such as a floppy disk, hard disk or CD‐RW (but not magnetic tape), and then you need a file
structure that allows you to go straight to some data.
An enquiry system
Index sequential (also called indexed sequential) is a method that speeds up access to data. It
does this by taking a sequential file and splitting it up into areas. Each block of data is stored
in its own area. An index is then provided that points to each area. For example, suppose you had
to design a database that allowed you to retrieve details about authors. You would take your file
of authors and put the records in sequential order by surname. At the beginning of the file, you
would create an index, like this:
For authors beginning with A go to address 23000.
For authors beginning with B go to address 24000.
For authors beginning with C go to address 25000 and so on.
When someone wants to get back details of an author:
They type in the author's surname.
The first letter of the surname is extracted.
The letter is looked up in the index.
The computer jumps to the address that corresponds to the letter.
A sequential search begins from that address, until the author is found or the end of the
file is reached.
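The lookup steps above can be sketched as follows. Here the file is a list of (surname, details)
records in surname order, list positions stand in for storage addresses such as 23000, and the
function names and sample records are assumptions for illustration. Surnames are assumed to be
non-empty.

```python
def build_index(authors):
    """Map each first letter to the position of the first author with that letter."""
    index = {}
    for position, (surname, _details) in enumerate(authors):
        index.setdefault(surname[0].upper(), position)
    return index

def find_author(authors, index, surname):
    """Jump to the area for the surname's first letter, then search sequentially."""
    letter = surname[0].upper()
    if letter not in index:
        return None  # no area exists for this letter
    for position in range(index[letter], len(authors)):
        name, details = authors[position]
        if name == surname:
            return details
    return None  # reached the end of the file

# Sequential file: records held in surname order (placeholder details).
authors = [("Adams", "details A1"), ("Austen", "details A2"),
           ("Bronte", "details B1"), ("Christie", "details C1"),
           ("Conrad", "details C2")]
index = build_index(authors)
print(index)                                   # {'A': 0, 'B': 2, 'C': 3}
print(find_author(authors, index, "Conrad"))   # details C2
print(find_author(authors, index, "Dickens"))  # None
```

Only the records from the start of the matching area onwards are examined, so most of the file
is never read.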
Acknowledgements:
Resource by: theteacher/info