Computing Notes - A Level
The systems analyst now needs to examine whether the said problem is real by carrying out an in-depth study,
after getting permission to conduct a feasibility study.
- Can be shown using the Waterfall Model or the Stepping Stone Model, which encompasses the following
stages, in their order;
Feasibility Study
Data Collection
Analysis of the problem
System Design
System Development and Testing
System Implementation
System Maintenance
1. Feasibility Study:
- This is a preliminary investigation conducted to determine whether there is a need for a new system or for
modification of the existing one.
- The analyst examines whether a new system is feasible or not.
The analyst assesses the magnitude of the problem and decides the scope of the project.
The analyst examines the problems of the current system and what will be required of the new system.
- It involves evaluating system requests from users to determine whether it is feasible to construct a new system.
Feasibility can be measured by its:
Economic feasibility: determining whether the benefits of the new system will outweigh the
estimated costs of developing, purchasing, installing and maintaining it. The
cost-benefit analysis is important. Benefits can be tangible and quantifiable, e.g. profits in
monetary terms, fewer processing errors, increased production, faster response times, etc. Other
benefits are intangible, e.g. improved customer goodwill, employee morale, job satisfaction, better
service to the community, etc.
Technical feasibility: determines if the organisation can obtain software, equipment, technology and
personnel to develop, install and operate the system effectively.
Schedule feasibility: a measure of how long the system will take to develop, considering the desired
time frame.
Social feasibility: Will the system be acceptable to the local people, considering the values and
norms of their society? This also looks at impacts such as loss of jobs.
Legal feasibility: determines whether the new system will violate the legal requirements of the
state, for instance, laws outlined in the Data Protection Act.
Operational feasibility: determines whether the current work practices and procedures are adequate
to support the system, e.g. effects on social lives of those affected by the system.
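As a rough illustration of the cost-benefit analysis behind economic feasibility, the sketch below computes the net benefit of a proposed system and its payback period. The function names and all the figures are hypothetical and are not taken from these notes; a real study would also have to weigh the intangible benefits.

```python
def net_benefit(development_cost, annual_running_cost, annual_benefit, years):
    """Total tangible benefit minus total cost over the given number of years."""
    total_cost = development_cost + annual_running_cost * years
    total_benefit = annual_benefit * years
    return total_benefit - total_cost


def payback_period(development_cost, annual_running_cost, annual_benefit):
    """Years needed for the net yearly saving to repay the development cost.

    Returns None when the system never pays for itself.
    """
    net_yearly_saving = annual_benefit - annual_running_cost
    if net_yearly_saving <= 0:
        return None
    return development_cost / net_yearly_saving


# Hypothetical figures, for illustration only.
print(net_benefit(50_000, 5_000, 20_000, years=5))   # 25000
print(payback_period(10_000, 1_000, 3_000))          # 5.0
```

A positive net benefit within the desired time frame suggests that the project is economically feasible.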
Thus the analyst must consider the following questions when producing a feasibility study:
- Is the solution technically possible?
- Is the solution economically possible to produce?
- Is the solution economic to run?
- Will the solution be socially acceptable?
- Is a skilled workforce available? If not, are training requirements feasible?
- How will the system affect employees?
- Will profits increase?
- How long will it take to produce the system?
- etc
After carrying out the feasibility study, a feasibility study report must be produced and it contains the
following information:
A brief description of the business.
Advantages and problems of the existing system.
Objectives of the new system.
Evaluation of the alternative solutions.
Development timetable.
Management summary
Terms of reference.
Contents page.
Proposed solution, its advantages and disadvantages
Compiled By Kapondeni T.
2. DATA COLLECTION
The systems analyst collects data about the system. The fact finding methods that can be used include:
interviews, record inspection, questionnaire, observations, etc.
i. Interview:
This refers to the face-to-face communication between two or more people in order to obtain information.
Interviews can also be done over the phone but the most common ones are face to face. Interviews are done
when you want to collect information from a very small population sample.
Advantages of Interviews
Effective when gathering information about a system
The researcher can ask for clarification on some points that may not be clear.
Encourages good rapport between the researcher and the respondent.
Non-verbal gestures like facial expressions can help the researcher to determine if the respondent is
telling the truth.
Information can be collected even from the illiterate since the respondent's language could be used.
First-hand information is collected.
The researcher can probe to get more information.
Disadvantages of Interviews
It is expensive since the researcher has to travel to the interview venue.
Difficult to remain anonymous
It is time consuming as more time is spent travelling and carrying out the interview.
Good interview techniques are required as failure may lead to disappointments.
Biased information can be given since the respondent may not tell the truth.
ii. Record inspection:
A fact finding method which involves scrutinising system documents in order to solicit information. Record
inspection has the following Advantages:
Accurate information is collected from system records.
Shows how data is collected within the system
Shows the exact data that is collected
Shows information that must be produced by the system
First-hand information is obtained.
Gives a good idea of the ways things are actually done rather than how they are supposed to be done.
Data type
Length
Validation criteria
Amount of storage required for each item
Who owns the data
Who accesses the data
Programs which use the data
etc
The analysis stage determines whether computerisation will take place or not. The analysis stage also specifies
the hardware and software requirements and whether they will use in-house software or outsource the
program. The analysis stage looks at the following aspects:
Understanding the current system
Produce data flow diagrams
Identify the user requirements
Interpret the user requirements
Agree on the objectives with the user
Collect data from the current system
Dataflow Diagrams
These are diagrams that show how data moves between external sources, through processes and data stores of
a particular system. Dataflow diagrams use the following symbols:
PROTOTYPING
Involves building a working but limited model of a new system that will be tested, evaluated and improved if
necessary before building the actual system. It involves construction of a simple version of a program which is
used as part of the design to demonstrate how the system will work.
It is a mock-up of parts of the system for early evaluation
Reasons for prototyping:
- Gives an idea of the system before development
- enables clear identification of requirements
- allows revision and adjustments before full system is developed
The prototype will have a working interface but may not actually process data
Special software will be used to design input screens and to run the system.
The prototype can then be discarded and the real system designed using other or the same software (throw
away prototype).
Prototyping can be used at any stage of the SDLC. The prototype can be further refined until the user is
satisfied with it and then it is implemented as it is (Evolutionary prototype).
Benefits of prototypes:
Cheaper to set up than alternative methods that could be used to predict what will happen in a system
Faster to design a system model
Gives user the chance to experience the look and feel of the input process and make suggestions where
necessary.
System is more likely to have fewer or no errors
More acceptable to users of the system since they are also involved in the design
Disadvantages of prototyping
prototypes can be very expensive to design
takes too long to finish system design, especially if the prototype is thrown away
SYSTEM DOCUMENTATION
Documentation refers to the careful and disciplined recording of information on the development, operation
and maintenance of a system. Documentation is in two main types: user documentation and technical
documentation
(a) User Documentation: It is a manual that guides system users on how to load, operate, navigate and exit a
program (system). User documentation contains the following:
System/program name.
Storage location.
System password.
program using a specific programming language, testing of the coded program, user training (users are trained
on how to enter data, search records, edit fields, produce reports, etc).
Testing strategies
First step involves testing of the programs and various modules individually, e.g.
- Top-Down testing: program is tested with limited functionality. Most functions are replaced with
stubs (placeholders that contain little or no real code). Functions are gradually added to the program
until the complete program is tested.
- Bottom up testing:
Each function is tested individually and then combined to test the complete program.
- Black-box testing:
Program is regarded as a black box and is tested according to its specification.
No account is taken of the way the program is written
Different values are entered for variables to determine whether the program can cope with
them. This includes standard (typical/normal), extreme (borderline) and abnormal data values.
testing will include:
Use of extreme, standard and abnormal data
Inputting error free data into the system to see if error free outputs can be produced.
Inputting data that contains errors into the system to see if the validation procedures
will identify the errors.
Inputting large quantities of data into the system to test whether or not the system can
cope with it.
Testing all the regular and occasional processing procedures.
Inputting data that contains extreme ranges of information to check that the validation
procedures can cope with it.
- White-box testing:
Each path through the program is tested to ensure that all lines of code work perfectly.
Involves testing the program to determine whether all possible paths through the program
produce desired results
Mostly appropriate if the program has different routes through it, i.e. uses selection control
structure and loops
Involves testing of logical paths through the code
Involves testing of the structure and logic of the program (if it has logical errors)
Involves desk checking (dry running)
- Alpha testing:
The first testing done within the developer's company (at the owner's laboratory).
Testing is done by members of the software company
Some errors may still be in existence after alpha testing as the testers are programmers not
users.
The software version will be unfinished
Testers have knowledge of the software and of programming
- Beta testing: System testing done after alpha testing, in which the program version is released to a
number of privileged customers in exchange for their constructive comments. The version is mostly similar
to the finally released version.
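The choice of normal, extreme and abnormal test data for black-box testing can be sketched as follows. The specification used here (an exam mark must be a whole number from 0 to 100) and the function name is_valid_mark are assumptions made purely for illustration.

```python
def is_valid_mark(value):
    """Range check: a mark must be a whole number from 0 to 100 inclusive."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 0 <= value <= 100


# Test cases are chosen from the specification only, not from the code.
test_cases = [
    (55, True, "normal"),
    (0, True, "extreme (lower boundary)"),
    (100, True, "extreme (upper boundary)"),
    (-1, False, "abnormal (below range)"),
    (101, False, "abnormal (above range)"),
    ("abc", False, "abnormal (wrong data type)"),
]


def run_tests():
    """Return the test cases whose actual result differs from the expected one."""
    failures = []
    for value, expected, kind in test_cases:
        if is_valid_mark(value) != expected:
            failures.append((value, kind))
    return failures


print(run_tests())   # an empty list means every case passed
```

The tester never looks inside is_valid_mark; only the inputs and the expected outputs matter, which is what makes this black-box rather than white-box testing.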
Once a program is tested, it is installed and the analyst can now test it. A very large program must be tested
using the following types of tests:
1. Unit testing: the process of testing each program unit (sub-routine/module in a suite) singly to determine
if it produces expected results.
2. Integration Testing: testing to see if modules can combine with each other and work as expected. The
whole program is tested to determine if its modules integrate perfectly.
3. System testing: the testing of the whole program after joining the modules to determine if it runs
perfectly.
4. User acceptance testing: determining if users of the new system are prepared to use it. Usually the final
step. It enables identification of some bugs related to usability. Users gain confidence that the program
being delivered meets their requirements.
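The difference between unit testing and integration testing can be sketched with Python's standard unittest module. The two functions below (calculate_tax and format_bill) and their figures are hypothetical examples, not part of any system described in these notes.

```python
import unittest


def calculate_tax(amount, rate):
    """Unit under test: tax due on an amount at a given percentage rate."""
    return round(amount * rate / 100, 2)


def format_bill(name, amount, rate):
    """A second module that depends on calculate_tax."""
    tax = calculate_tax(amount, rate)
    return f"{name}: {amount + tax:.2f}"


class UnitTests(unittest.TestCase):
    def test_calculate_tax_alone(self):
        # Unit test: one module is tested singly.
        self.assertEqual(calculate_tax(200, 15), 30.0)


class IntegrationTests(unittest.TestCase):
    def test_modules_work_together(self):
        # Integration test: the modules are combined and tested together.
        self.assertEqual(format_bill("Moyo", 200, 15), "Moyo: 230.00")


def run_all():
    """Run both test classes and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(UnitTests))
    suite.addTests(loader.loadTestsFromTestCase(IntegrationTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_all()
print(result.wasSuccessful())
```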
Ergonomics: the design and functionality of the computer environment, including furniture setup,
ventilation, security, space, noise, etc. Some of the ergonomic concerns include:
Incorrect positioning of the computer facing the window can lead to eyestrain from screen glare. Incorrect
sitting positioning can lead to backache. Constant typing with inadequate breaks can lead to RSI. Printer noise
can lead to stress. Badly designed software can cause stress. Trailing electricity cables are a safety hazard.
6. System Implementation/ Conversion (Installation/Changeover)
This also involves putting the new computer system into operation, that is, changing from the old system to the
new one. It involves file conversion, which is the changing of old data files into the current format. Different
changeover methods can be used, and these include:
a. Parallel Run: This involves operating the old and new systems concurrently until the new system proves
to be efficient and management is confident that it will perform satisfactorily. Some workers use
the old system while others use the new system, but all do the same type of job.
Both the old and new system run concurrently until the new system proves to be
efficient.
Used for very important applications.
Costly (expensive) but the costs are worth paying for.
can correct the new system while the old system is running
allows workers to get familiar with the new system
Output from new system is compared with output from existing system.
iii. Phased / Partial conversion: This is whereby the old system is gradually removed while the new system
is gradually moved in at the same time. This can be done by computerising only one department in an
organisation this month, then the next department in two months' time, and so on until the whole system is
computerised.
Advantages of phased conversion
Avoids the risk of system failure.
Saves costs since the new system is applied in phases.
It could be easier to revert to the old system if the new system fails since only one department will be
affected.
Disadvantages of phased conversion
It could be very expensive since the organisation will be running two systems but in different
departments.
iv. Pilot conversion: one area of the organisation is converted to the new system while the remainder
continues to use the old system.
This is whereby a program is tested in one organisation (or department), and is applied to the whole
organisation if it passes the pilot stage. It serves as a model for other departments. A pilot program can then be
applied in phases, directly or using the parallel run method.
7. Maintenance/review/evaluation Stage:
This stage is concerned with making upgrades and repairs to an already existing system. Certain sections of
the system will be modified with time.
Maintenance can be
(1) Perfective Maintenance:
Implies that there is room for improving the system even if it is running effectively. For example,
improving report generation speed to improve response time. May also include adding more
management information into the system.
(2) Corrective Maintenance
Involves correcting errors that may emerge later, for example, wrong totals, wrong headings on
reports, etc. Such errors may only be noticed after the system has been in use for a short period of time.
(3) Adaptive Maintenance
Involves making the system adapt to changing needs within the organization. For example, changing
the system from being a standalone to a multi-user system. May be caused by purchasing of new
hardware, changes in software, new government legislation and new tax bands.
NB
Criteria used to evaluate a computer-based solution include the following:
- Were the objectives met? (Successes of the system are compared with set objectives)
- Does it carry out all the required tasks?
- Ease of use (user friendliness)
- Maintainability
- Compatibility with existing systems and hardware
- Offering better performance than the previous one
Different paths of the project using lines of different value according to length of task
Longest journey along the arrows must show the shortest time of completion
Critical path:
All activities that are critical and are used to determine how long it will take to complete the project; they must
not be delayed if the project is to finish on time.
Critical Path Analysis:
Analysis of a set of project activities so that delays and conflicts are resolved to reduce project development
time to a minimum. It is used to control the sequences of tasks within the project.
Critical Path Method (CPM).
A mathematical algorithm for scheduling and displaying a set of project activities. It includes the following:
- List of all activities required to complete the project
- Time (duration) that each activity will take
- Dependencies between activities (activities that must be done before others)
CPM calculates the earliest STARTING and latest FINISHING time for each activity.
-It reveals critical activities
- Float time (for less critical activities): the length of time an activity can be delayed or can overrun without
the whole project being affected, e.g. setting up a printer can be done at the same time as installing the
computer but will not take as long.
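The forward and backward passes that CPM performs can be sketched as follows. This is a minimal illustration rather than a full scheduling tool: the function name critical_path, the task names and the durations (in days) are all hypothetical, and the task list is assumed to contain no circular dependencies.

```python
def critical_path(tasks):
    """tasks maps name -> (duration, [tasks that must finish first]).

    Returns (project_length, schedule), where schedule maps each task to
    (earliest start, latest finish, float time).
    """
    # Order the tasks so that every task comes after its dependencies.
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in tasks[name][1]:
            visit(dep)
        order.append(name)

    for name in tasks:
        visit(name)

    # Forward pass: a task starts when its slowest dependency has finished.
    earliest_start = {}
    for name in order:
        deps = tasks[name][1]
        earliest_start[name] = max(
            (earliest_start[d] + tasks[d][0] for d in deps), default=0
        )
    project_length = max(earliest_start[n] + tasks[n][0] for n in tasks)

    # Backward pass: a task must finish before its successors' latest starts.
    latest_finish = {name: project_length for name in tasks}
    for name in reversed(order):
        duration, deps = tasks[name]
        latest_start = latest_finish[name] - duration
        for d in deps:
            latest_finish[d] = min(latest_finish[d], latest_start)

    schedule = {}
    for name, (duration, _) in tasks.items():
        float_time = latest_finish[name] - earliest_start[name] - duration
        schedule[name] = (earliest_start[name], latest_finish[name], float_time)
    return project_length, schedule


# Hypothetical project, durations in days.
project = {
    "install computer": (3, []),
    "set up printer": (1, []),
    "install software": (2, ["install computer"]),
    "train users": (2, ["install software", "set up printer"]),
}
length, schedule = critical_path(project)
critical = [name for name, (_, _, fl) in schedule.items() if fl == 0]
print(length)     # 7
print(critical)   # ['install computer', 'install software', 'train users']
```

Activities with a float of zero form the critical path. In this hypothetical project, setting up the printer has a float of 4 days, so it can be delayed without delaying the whole project.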
GANTT CHART
Software for producing Gantt Charts should have some of the following features:
Should be able to show individual components of tasks
Should show the earliest starting time
Should indicate latest ending times
Should be able to show relationships between components
Should show the shortest time to finish
Every aspect should be diagrammatic
It should be simple to follow
Should contain review milestones
Should show percentage of chart finished
Should automatically generate reports on costs
SYSTEM SECURITY
It is important to keep data secure so that it may not be damaged or get lost. The risks and their solutions are as
follows:
Risk                    Solution
Hardware failure        - Frequent back-up of data, with at least one copy kept at a different location
                          on a daily basis
                        - Log files to be kept for all transactions
Fire                    Keep backup files in a fireproof safe or in storage at an alternative location
Theft                   Physical security measures like locking rooms, security cameras, guards,
                        electric fences, screen gates, etc.
Disgruntled employees   Employee checks (ID cards to check workers, careful vetting during
                        employment, instant removal of access rights from sacked workers, separation of
                        tasks for workers, educating workers to be aware of security breaches)
Hackers                 Usernames and passwords, firewalls
Viruses                 Latest and updated antivirus software, firewalls
Floods                  Building rooms on higher ground, waterproof safes for backups
If a hard disc fails, files can be recovered by using the last backup, which is copied on to another hard disc.
The log file should be used to update the master file.
During the recovering process, the master file will not be available but the system could be maintained at a
lower level of services. Any transaction could be logged and used to update the master file when the system is
up and running.
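The recovery procedure described above (restore the last backup, then apply the logged transactions) can be sketched as follows. The record layout, the account numbers and the function name recover_master_file are hypothetical and serve only as an illustration.

```python
def recover_master_file(backup, transaction_log):
    """Rebuild the master file from the last backup plus the logged transactions.

    backup: dict of record id -> balance as it stood when the backup was taken.
    transaction_log: list of (record id, amount) pairs recorded since then.
    """
    master = dict(backup)   # copy the backup on to the replacement disc
    for record_id, amount in transaction_log:
        # Apply each logged transaction in the order it was recorded.
        master[record_id] = master.get(record_id, 0) + amount
    return master


# Hypothetical data, for illustration only.
last_backup = {"A001": 100, "A002": 250}
log_file = [("A001", -30), ("A003", 75), ("A002", 10)]

print(recover_master_file(last_backup, log_file))
# {'A001': 70, 'A002': 260, 'A003': 75}
```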
Employee resistance: When a new system is introduced, some employees may resist the change and this
could be catastrophic if not handled appropriately. Some fear losing their jobs, being demoted or
transferred, or having their job description changed. Resistance can be in the following forms:
Through strikes and demonstrations.
Giving false information during system investigation.
Entering wrong and inappropriate data so that wrong results are produced, etc.
User training:
Once a new system is put in place, existing employees are trained on how to operate the new system,
otherwise new employees are recruited. User training can be in the following forms:
i. On the job training: Users are trained at their organisation by hired trainers. This has the following
advantages:
Learners practice with actual equipment and the environment of the job.
Learners can engage in productive practices while on training.
This is cheaper for the organisation.
Enough practice is gained on how to operate the system.
Disadvantages of on the job training
Distractions occur in a noisy office.
Instructional methods are often poor.
[2]
State an application for which a command-line interface would be suitable. Justify your choice
[2]
[2]
(ii) State an application for which a form-based interface would be suitable. Justify your choice
[2]
2.
The offices of a government department deal with local taxes in a city. It is decided to develop new
software for dealing with the calculation of tax bills. A systems analyst is employed to develop the
software.
(a) Explain why care must be taken in defining the problem to be solved
[2]
(b) State the methods that the systems analyst can use to find out more information about the problem,
giving an advantage of each.
[4]
(c) Explain the importance of evaluating the system against the original specifications.
[2]
3.
4. The analyst needs to collect information about the current system. State one advantage and one
disadvantage of each of the following methods of information collection.
(a) Questionnaires
(b) Interviews
(c) Document Collection
(d) Observation
[8]
5. (a) The analyst has to decide whether to use off-the-shelf or custom written software. Explain what
is meant by:
(i) Off-the-shelf software
(ii) Custom written software
[2]
(c) List three advantages and one disadvantage of using off-the-shelf software rather than custom-written
software
[4]
6. (a) What is meant by
i. User documentation
ii. Technical documentation
[2]
(b) State two items of documentation which would be included in each of the following:
i. User documentation
ii. technical documentation
[4]
CHAPTER 2: DATABASES
A database is a single organised collection of structured and related data,
stored with minimum duplication of data items so as to provide consistent
and controlled data within an organisation.
Databases can be accessed using different application software.
Data stored in databases can be accessed by all system users, but with
different access rights.
Databases are designed to meet the information needs of an organisation.
Database operations may include addition of new records, deletion of
unwanted records, amendments to existing records, creation of relationships
between files and removal or addition of fields of files.
Databases ensure controlled redundancy since redundancy cannot be wholly
eliminated.
Database terms:
Entity: physical objects like person, patient or events on which information or
data is being collected. It can also be an abstract object like a patient record.
Attribute: individual data item within an entity; e.g. date of birth, surname.
Relationship: links between two different entities or relations (tables). E.g. A
student stays at St. Augustine's High School. The entities here are student
and St. Augustine's High School, and the relationship is stays.
Data Dictionary:
It is a table holding information about a database
A file (table) with description of the structure of data held in a database
Used by managers when they modify the database.
Not visible to (used by) general users.
It maps logical database to physical storage
Allows existence checks on data to be carried out.
Stores details of data used, including the following
Name of data item (fields or variables)
Data type
Length
Validation criteria
Amount of storage required for each item
Who owns the data
Who accesses the data
Programs which use the data
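A data dictionary of this kind can be sketched as a simple lookup structure. The two data items, their owners and the function name check_item below are hypothetical; the sketch only shows how the stored details (data type, length and so on) can drive existence and validation checks.

```python
# Hypothetical data dictionary for a student file; every entry is illustrative.
data_dictionary = {
    "surname": {
        "data_type": str,
        "length": 30,
        "validation": "presence check",
        "owner": "admissions office",
        "accessed_by": ["clerks", "teachers"],
        "used_by": ["enrolment program", "report program"],
    },
    "year_of_birth": {
        "data_type": int,
        "length": 4,
        "validation": "range check",
        "owner": "admissions office",
        "accessed_by": ["clerks"],
        "used_by": ["enrolment program"],
    },
}


def check_item(field_name, value):
    """Existence, type and length checks driven by the data dictionary."""
    entry = data_dictionary.get(field_name)
    if entry is None:
        return "unknown data item"   # existence check failed
    if not isinstance(value, entry["data_type"]):
        return "wrong data type"
    if len(str(value)) > entry["length"]:
        return "too long"
    return "ok"


print(check_item("surname", "Moyo"))         # ok
print(check_item("year_of_birth", "2005"))   # wrong data type
print(check_item("address", "12 Main St"))   # unknown data item
```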
Flat file: Data stored in a single file (table), allowing simple structuring, e.g.
spreadsheet database file of student records. Data is stored in rows representing
records while columns represent fields. Thus data is stored in a two dimensional
format.
Building Block of Computerised Databases
Disadvantages
o Substantial hardware and system software overhead
o May promote islands of information problems
o However, it may be difficult to come up with relationships.
Database Keys:
A simple key contains a single attribute.
A composite key is a key that contains more than one attribute.
A candidate key is an attribute (or set of attributes) that uniquely identifies a
row. A candidate key must possess the following properties:
o Unique identification - For every row the value of the key must
uniquely identify that row.
o Non redundancy - No attribute in the key can be discarded without
destroying the property of unique identification.
Super key: An attribute or a set of attributes that uniquely identifies a tuple
within a relation. A super key is any set of attributes that uniquely identifies a
row. A super key differs from a candidate key in that it does not require the
non-redundancy property.
Primary key: It is a candidate key that is used to identify a unique (one)
record from a relation. A primary key is the candidate key which is selected
as the principal unique identifier. Every relation must contain a primary key.
The primary key is usually the key selected to identify a row when the
database is physically implemented. For example, a part number is selected
instead of a part description.
Foreign key: A primary key in one file that is used/found in another file.
A foreign key is a set of fields or attributes in one relation that is used to refer
to a tuple in another relation. Thus it is a field in one table that is also used as the
primary key in another table.
Secondary Key: A field used to identify more than one record at a time, e.g. a
surname.
*NB: Attribute: A characteristic of a record, e.g. its surname, date of birth.
Entity: any object or event about which data can be collected, e.g. a patient,
student, football match, etc.
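The key types above can be illustrated with Python's built-in sqlite3 module. The tables, columns and sample names below are hypothetical; SQLite is used only as a convenient stand-in for a relational DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,         -- primary key (a simple key)
    surname    TEXT NOT NULL                -- usable as a secondary key
);
CREATE TABLE subject (
    subject_code TEXT PRIMARY KEY,
    title        TEXT NOT NULL
);
CREATE TABLE enrolment (
    student_id   INTEGER REFERENCES student(student_id),    -- foreign key
    subject_code TEXT    REFERENCES subject(subject_code),  -- foreign key
    PRIMARY KEY (student_id, subject_code)                   -- composite key
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Moyo')")
conn.execute("INSERT INTO subject VALUES ('CS', 'Computing')")
conn.execute("INSERT INTO enrolment VALUES (1, 'CS')")


def try_insert(sql):
    """Return True if the row was accepted, False if a key constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("INSERT INTO student VALUES (1, 'Ncube')"))   # False: duplicate primary key
print(try_insert("INSERT INTO enrolment VALUES (9, 'CS')"))    # False: no such student
print(try_insert("INSERT INTO enrolment VALUES (1, 'CS')"))    # False: duplicate composite key
```

A second student with the surname Moyo would be accepted, because a secondary key such as a surname may identify more than one record.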
2. Network (Distributed) Databases
It is whereby several computers on the network each hold part of the data which
can be shared among the networked computers (users). If data is not available on
a user's computer, the user communicates with others on the network to obtain
it. Each computer creates its own backup of data resident on it.
These databases have links that are used to express relationships between
different data items.
They are based on the principle of linked lists
Data is maintained by a single input.
There is little duplication of data.
There is no duplication of inputs.
Linkages are more flexible.
Many to many relationships to records are limited
Benefits
OODBMS are faster than relational DBMS because data isn't stored in
relational rows and columns but as objects. Objects have a many to many
relationship and are accessed by the use of pointers, which will be faster.
An OODBMS can be reprogrammed without affecting the entire
system.
Can handle complex data models and is therefore superior to
RDBMS
Disadvantages
Pointer-based techniques will tend to be slower and more difficult to
formulate than relational.
Object databases lack a formal mathematical foundation, unlike the
relational model, and this in turn leads to weaknesses in their query
support.
Database Management System (DBMS)
- It is a complex layer of software used to develop and to maintain the database
and provides interface between the database and application programs.
- It allocates storage to data.
- It also allows a number of users to access the database concurrently.
- The DBMS maintains data by:
o adding new records,
o deleting unwanted records,
o amending records.
- Data in databases can be accessed using different programming languages.
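The three maintenance operations listed above (adding, deleting and amending records) can be sketched from a program by using Python's built-in sqlite3 module as a stand-in DBMS. The patient table and its contents are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, surname TEXT)")

# Adding new records
db.executemany(
    "INSERT INTO patient VALUES (?, ?)",
    [(1, "Moyo"), (2, "Dube"), (3, "Sibanda")],
)
# Amending an existing record
db.execute("UPDATE patient SET surname = ? WHERE patient_id = ?", ("Ncube", 2))
# Deleting an unwanted record
db.execute("DELETE FROM patient WHERE patient_id = ?", (3,))
db.commit()

rows = db.execute(
    "SELECT patient_id, surname FROM patient ORDER BY patient_id"
).fetchall()
print(rows)   # [(1, 'Moyo'), (2, 'Ncube')]
```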
Hardware
Can range from a Personal Computer to a network of computers where the
database is run.
Software
DBMS software, operating system, network software (if necessary) and also the
application programs.
Data
Includes data used by the organization and a description of this data called the
schema. The data in the database is persistent, integrated, structured, and
shared.
Procedures
Instructions and rules that should be applied to the design and use of the
database and DBMS. Procedures are the rules that govern the design and the use
of database. The procedure may contain information on how to log on to the
DBMS, start and stop the DBMS, procedure on how to identify the failed
component, how to recover the database, change the structure of the table, and
improve the performance.
People:
Users or people who operate the database, including those who manage
and design the database
DBMS Software components
2.
3.
4.
5.
DISPLAY RECORD
DELETE RECORD
EDIT RECORD
MY OPTION IS: __
The user has to enter 1, 2, 3 or 4 and then press enter on the keyboard.
Advantages:
It is fast in carrying out tasks.
The user does not need to remember the commands by heart.
It is very easy to learn and to use.
Disadvantages:
The user is restricted to those few options available and thus is not flexible to use.
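A menu-driven interface of this kind can be sketched in a few lines. The option numbers and labels below are hypothetical and are not meant to reproduce the exact menu from these notes.

```python
# Hypothetical menu options, for illustration only.
OPTIONS = {
    "1": "ADD RECORD",
    "2": "DISPLAY RECORD",
    "3": "DELETE RECORD",
    "4": "EDIT RECORD",
}


def show_menu(options):
    """Build the menu text that is shown to the user."""
    lines = [f"{number}. {label}" for number, label in options.items()]
    lines.append("MY OPTION IS: ")
    return "\n".join(lines)


def choose(options, entered):
    """Return the chosen label, or None if the input is not on the menu."""
    return options.get(entered.strip())


print(show_menu(OPTIONS))
print(choose(OPTIONS, "2"))   # DISPLAY RECORD
print(choose(OPTIONS, "9"))   # None, because 9 is not on the menu
```

The lookup makes the main disadvantage visible: anything that is not one of the listed options is simply rejected.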
Form-Based Interfaces
Advantages
- It saves disk storage space since there are no icons and less graphics
involved.
- It is very fast in executing the commands given once the user mastered the
commands.
- It saves time if the user knows the commands by heart.
Disadvantages
- It takes too long for the user to master all the commands by heart.
- It is less user friendly.
- More suited to experienced users like programmers.
- Commands for different software packages are rarely the same and this will
lead to mix-up of commands by the user.
Disadvantages of GUI
- The icons occupy a lot of disk storage space that might be used for storage of
data.
- Occupy more main memory than command driven interfaces.
- Run slowly in complex graphics and when many windows are open.
- Irritating to use for simple tasks due to the greater number of operations needed
DBMS structure (views/schema)
View: Refers to how a user sees data stored in a database. Each user has his/her
own view of data, e.g. a standard database user can be restricted from seeing
(viewing) sensitive data that only managers can view. These two have thus
different views of the database.
A relation that does not necessarily actually exist in the database but is
produced upon request, at time of request.
Contents of a view are defined as a query on one or more base relations.
Views are dynamic, meaning that changes made to base relations that affect
view attributes are immediately reflected in the view.
Provides powerful and flexible security mechanism by hiding parts of
database from certain users.
Permits users to access data in a customized way, so that same data can be
seen by different users in different ways, at same time.
Can simplify complex operations on base relations.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
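Two of the properties of views listed above (hiding sensitive data and reflecting changes to the base relation immediately) can be sketched with Python's built-in sqlite3 module. The employee table, the view name staff_directory and the sample data are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
con.execute("INSERT INTO employee VALUES (1, 'Moyo', 900.0)")

# The view is defined as a query on the base relation and hides the salary column.
con.execute("CREATE VIEW staff_directory AS SELECT emp_id, name FROM employee")

before = con.execute("SELECT * FROM staff_directory ORDER BY emp_id").fetchall()

# A change to the base relation is reflected the next time the view is queried.
con.execute("INSERT INTO employee VALUES (2, 'Dube', 1200.0)")
cursor = con.execute("SELECT * FROM staff_directory ORDER BY emp_id")
columns = [column[0] for column in cursor.description]
after = cursor.fetchall()

print(before)    # [(1, 'Moyo')]
print(after)     # [(1, 'Moyo'), (2, 'Dube')]
print(columns)   # ['emp_id', 'name']
```

A standard user who is given access only to staff_directory never sees the salary column, while a manager can query the employee table directly.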
Schema: Refers to the overall design of the database. It can be a collection of
named objects. Schemas provide a logical classification of objects in the database.
A schema can contain tables, views, functions, packages, and other objects.
Sub-schema: describes different views of the database.
It consists of three levels/abstractions: External, conceptual and internal levels
Data Independence
Data independence means that programs are isolated from changes in the way
the data are structured and stored. Data independence is the immunity of
application programs to changes in storage structures and access techniques. For
example, if we add a new attribute or change an index structure in a traditional file
processing system, the applications are affected. But in a DBMS environment
these changes are reflected in the catalogue; as a result the applications are not
affected. Data independence renders application programs immune to changes in
the logical and physical organization of data in the system.
Logical organization refers to changes in the schema. For example, adding a column or
tuples does not stop queries from working.
Physical organization refers to changes in indices, file organizations, etc
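Data independence can be demonstrated with a small sketch that uses Python's built-in sqlite3 module. The part table, the index name and the function find_description are hypothetical; the point is that the application query is left untouched while the logical and physical organization change underneath it.

```python
import sqlite3

store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE part (part_no INTEGER PRIMARY KEY, description TEXT)")
store.execute("INSERT INTO part VALUES (10, 'Bolt')")


def find_description(part_no):
    """Application query; it is never edited in this example."""
    row = store.execute(
        "SELECT description FROM part WHERE part_no = ?", (part_no,)
    ).fetchone()
    return row[0] if row else None


result_before = find_description(10)

# Logical change: a new attribute (column) is added to the schema.
store.execute("ALTER TABLE part ADD COLUMN supplier TEXT")
# Physical change: a new index alters how the data is accessed.
store.execute("CREATE INDEX idx_part_description ON part (description)")

result_after = find_description(10)
print(result_before, result_after)   # Bolt Bolt
```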
Files can be linked together making file updating easier and faster. Reduces
data redundancy. Redundancy means duplication of data. Data redundancy
will occupy more space hence it is not desirable as it will be more expensive to
the organisation.
Data can be secured from unauthorised access by use of passwords.
Users can share data if the database is networked. Duplication of records is
eliminated.
Ad hoc reports can be created easily.
Improves data integrity: integrity refers to the correctness of data stored in
databases. Data accessed will be the same for all users, removing contradictions
caused by duplicates of records with different data values. This is because
most of the information is stored only once. Integrity is also enhanced as data
is protected from wrong/inappropriate processing, so users can
trust the correctness of the data.
Sorting of records in any order is very fast
Removes data inconsistency: inconsistency means different copies of the same
record will have data with different values.
Disadvantages of databases
If the computer breaks down, you may not be able to access the data.
It is costly to initially setup the database.
Computer data can be easily copied illegally and therefore should be
password protected.
It takes time and money to train users of the system.
Expensive to employ a database administrator who will manage the database
Individuals are concerned (worried) with their data held in Computers. This is
because of:
Some people do not want others to see their details (personal data)
Individuals may be targeted because of their property or wealth
May lead to comparison with other people's details, which may harm
relationships with friends and colleagues
May lead to blackmail if the data stored is wrong
Some of the data may be wrong
Some of the data may be used for other purposes against the owner
May lead to identity theft
Relational Database Vs Flat File
Relational database:
- Less duplication of data, as data does not need to be repeated in every table.
- Offers greater data integrity, as there is little chance of getting duplicates of data.
Flat file:
- Too much duplication of data, since every table repeats data.
- No data integrity is guaranteed, due to too much duplication of data.
To find the students who paid $24 and registered for 5 subjects, one
enters the following in the design view of the table query:
The above can be written in SQL as given below in order to produce the same
result:
The general form of a SELECT statement is:
SELECT [ALL | DISTINCT] expr1 [AS col1], expr2 [AS col2]
FROM tablename WHERE condition
For the query above, this becomes:
SELECT tblExams.[STUDENT NUMBER], tblExams.[AMOUNT PAID], tblExams.[DATE PAID],
tblExams.[NO OF SUBJECTS], tblExams.[RECEIPT NUMBER]
FROM tblExams
WHERE (((tblExams.[AMOUNT PAID])=24) AND ((tblExams.[NO OF SUBJECTS])=5));
The SQL will produce the following result:
Normalization stages
The record now has a compound key made up of Num and ProdID, which is
illustrated as:
DELNOTE(Num, CustName, City, Country, ProdID, Description)
The key fields (Num and ProdID, underlined in the notation) form a key that is a combination of two attributes. In this
situation, we have identified a key which uniquely identifies a record.
2NF - Second normal form
The table in 1NF will lead to too much duplication of data, especially on Num,
CustName, City, and Country.
A table is in second normal form (2NF) if and only if it is in 1NF and every
non-key attribute is fully dependent on the primary key. There should be no
partial dependencies; that is, all the incomplete dependencies have been removed.
In our example, using the data supplied, CustName, City and Country
depend only on Num and not on ProdID. Description depends only on
ProdID; it does not depend on Num. We say that
Num determines CustName, City, Country
ProdID determines Description
which can be written as follows:
Num -> CustName, City, Country
ProdID -> Description
To solve this problem, we introduce a dummy functional dependency for
the primary keys of the above. That is:
Num, ProdID -> 0 (dummy functional dependency / link table)
We now get three relations, which are:
DELNOTE(Num, CustName, City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
DEL_PROD needs a compound key because a delivery note may contain
several parts and similar parts may be on several delivery notes. We now
have the relations in 2NF, which will appear as follows:
However, in the table DELNOTE, Country depends on City not directly on the
primary key Num. We need to make sure that all non-key fields in all tables are
fully dependent on the primary key and not on other non-key fields. We now need
to normalize this into 3NF.
3NF - Third normal form
To be in Third Normal Form (3NF) the relation must be in 2NF and no
transitive dependencies may exist within the relation. A transitive
dependency is when an attribute is indirectly functionally dependent on the
key (that is, the dependency is through another non-key attribute).
A relation that is in 1NF and 2NF, and in which no non-primary key
attribute is transitively dependent on the primary key is in 3NF. That is, all
non-key elements are fully dependent on the primary key.
Num -> City -> Country (a transitive dependency: Num transitively
determines Country)
This transitive relationship must be removed in order to have 3NF. Thus we
can have the following:
DELNOTE(Num, CustName, City)
CITY_COUNTRY(City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
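As a rough illustration of why nothing is lost by moving Country into its own relation, the sketch below holds DELNOTE and CITY_COUNTRY as Java maps and recovers the country of a delivery note by following Num -> City -> Country. The class name, the method countryOf and the sample rows are assumptions made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class ThirdNormalFormSketch {
    // DELNOTE(Num, CustName, City): key Num maps to {CustName, City}.
    static final Map<Integer, String[]> delNote = new HashMap<>();
    // CITY_COUNTRY(City, Country): key City maps to Country.
    static final Map<String, String> cityCountry = new HashMap<>();

    static {
        // Made-up sample rows, for illustration only.
        delNote.put(1, new String[] {"Customer A", "Harare"});
        delNote.put(2, new String[] {"Customer B", "Harare"});
        cityCountry.put("Harare", "Zimbabwe"); // the country is stored once per city
    }

    // Follows Num -> City -> Country, i.e. joins DELNOTE with CITY_COUNTRY.
    static String countryOf(int num) {
        String city = delNote.get(num)[1];
        return cityCountry.get(city);
    }

    public static void main(String[] args) {
        System.out.println(countryOf(1)); // Zimbabwe
        System.out.println(countryOf(2)); // Zimbabwe
    }
}
```

Both delivery notes share one CITY_COUNTRY row, so a correction to the country of a city is made in exactly one place.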
DATABASE RELATIONSHIPS
Attributes
This is a property or characteristic of an entity. Attributes are properties of
entities. In other words, entities are described in a database by a set of attributes.
The following are examples of attributes:
Brand, cost, and weight are the attributes of CELLPHONE.
Student number, name, and grade are the attributes of STUDENT
Entity
An entity is something of interest to an organisation about which data is to be
held. It could be a person, place, object, event or concept about which data is to be
maintained.
An entity is an object that exists and is distinguishable from other objects.
In other words, the entity can be uniquely identified.
Types of Relationship
One-to-one
E.g. Products in a supermarket each have a unique barcode number.
A department in a school is led by a Head of Department (HOD), and this person only leads one
department.
One-to-many
E.g. A video club member may hire out a number of videos.
The head of department may be in charge of many staff, but these staff
members only have one head of department.
Many to One
Many videos can be hired by one member.
Many-to-many
Teachers and pupils in a school. Each teacher teaches many pupils and each
pupil has many teachers.
A teacher may order many books, but each book could be ordered by many
teachers.
Entity-Relationship Diagram
An entity-relationship diagram is a diagrammatic way of representing the
relationships between the entities in a database.
Employee --- drives --- Company car (One-to-one)
Example
A hospital is organised into a number of wards.
Each ward has a ward number and a name recorded, along with a number
of beds in that ward.
Each ward is staffed by nurses.
Nurses have their staff number and name recorded, and are assigned to a
single ward.
Each patient in the hospital has a patient identification number, and their
name, address and date of birth are recorded.
Each patient is under the care of a single consultant and is assigned to a
single ward.
Each consultant is responsible for a number of patients. Consultants have
their staff number, name and specialism recorded.
Many-to-many relationships are not encouraged in E-R diagrams since they
cannot be represented directly in tables that are in 3NF. To remove M-N relationships, a link entity is used
to link entities with an M-N relationship as illustrated below:
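A link entity can be pictured as a table whose rows each pair one record from either side. The Java sketch below uses the teacher/pupil example from the notes; the link entity name TEACHES, the identifiers T1, P1, etc. and the helper methods are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class LinkEntitySketch {
    // Link entity TEACHES(TeacherId, PupilId): each row pairs one teacher with one pupil,
    // turning one M-N relationship into two one-to-many relationships.
    static final List<String[]> teaches = new ArrayList<>();

    static void link(String teacherId, String pupilId) {
        teaches.add(new String[] {teacherId, pupilId});
    }

    // All pupils taught by the given teacher.
    static List<String> pupilsOf(String teacherId) {
        List<String> result = new ArrayList<>();
        for (String[] row : teaches) {
            if (row[0].equals(teacherId)) result.add(row[1]);
        }
        return result;
    }

    // All teachers who teach the given pupil.
    static List<String> teachersOf(String pupilId) {
        List<String> result = new ArrayList<>();
        for (String[] row : teaches) {
            if (row[1].equals(pupilId)) result.add(row[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        link("T1", "P1");
        link("T1", "P2");
        link("T2", "P1");
        System.out.println(pupilsOf("T1"));   // [P1, P2]
        System.out.println(teachersOf("P1")); // [T1, T2]
    }
}
```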
Data security
Refers to methods of keeping data safe from various hazards and from
unauthorized access and this includes:
- Natural hazards like fire, floods, etc
- Deliberate destruction/corruption by former employees or by terrorists.
- Illegal access to data by hackers, who may steal, amend or destroy the data
- Accidental loss of data due to hardware failure, software failure, etc.
(Refer to Heathcote pages 105 - 109 for more on measures of ensuring data
security. Pupils must be able to describe the following in detail:
Keeping data secure from fraudulent or malicious damage;
Password protection
User IDs and passwords
Encryption
Access rights and user permissions
Different user views
Biometric measures
Periodic backups
Antiviruses and protection measures
Audit trails
System restore and Rollback facilities
Record locking
Pupils should describe/explain concepts above.)
REVIEW QUESTIONS:
1. A garden design company keeps records of its customers. Each customer has
had a design produced for them which will be one of a library of design types
stored by the company. Each design type uses plants. Each customer is sent an
account based on the number of plants in the design.
(a) Draw an E-R (entity-relationship) diagram in third normal form, based on this
information. [10]
(b) Each delivery of plants to the garden design company is identified by a batch
number. Explain how customers who received eucalyptus trees from batch 12 can
be contacted. [4]
2. A sports club runs a number of sports teams.
Each team is made up of a number of members of the club and each member may
play for more than one team. Each team has a number of coaches, but the coach's
job is so time consuming that each coach can only coach one team.
Represent the above information on an entity relation (ER) diagram, in 3rd
normal form, stating the primary key for each entity. [13]
3. (a) In relation to databases, describe what is meant by each of the following
terms.
(i) Primary key. [1]
(ii) Secondary key. [1]
(iii) Foreign key. [1]
(b) Using, as an example, the database of student records in a school,
(i) Explain why different users should be given different access rights; [4]
(ii) Describe how these access rights can be implemented. [4]
4. A landscape garden company services a number of gardens. Each GARDEN is
owned by an OWNER. Each owner may have more than one garden. Each garden
has a number of PLANTS in it and each plant may be in a number of gardens.
Draw an entity relationship (E-R) diagram to represent this data model in third
normal form and label the relationships. [10]
5. A health centre employs doctors, nurses and receptionists.
The data that is stored about the patients includes their medical history and
personal information about them.
Explain the need for maintaining privacy of the data and describe methods by
which the database management system (DBMS) can help to achieve this. [6]
6. (a) The structure of a database management system (DBMS) consists of three
levels;
External level,
Conceptual level,
Internal level.
State the meaning of each of these levels. [3]
(b) Describe the purpose of the following:
(i) the data description language (DDL), [2]
(ii) the data manipulation language (DML). [2]
7. (a) Describe the function and purpose of the following parts of a database
management system (DBMS):
(i) data dictionary, [2]
(ii) data description language, [2]
(iii) data manipulation language. [2]
(b) Three advantages of using a relational database rather than flat files are:
- A programming language is a set of symbols in computer language that are used in coding
computer programs.
- A programming language is a specially written code used for writing application programs, e.g.
C, Pascal, COBOL, BASIC, Visual Basic, C++ and Java (originally for intelligent consumer-electronic
devices such as cell phones, then used for creating Web pages with dynamic content, and now also
used for developing large-scale enterprise applications).
- Program: a set of detailed and unambiguous instructions that instructs a computer to perform a
specific task, for example, to add a set of numbers.
- Programming: A process of designing, coding and testing computer programs
- Programmer: A person who specialises in designing, coding and testing computer programs
- Problem: any question or matter involving difficulty or uncertainty and is proposed for a computer
solution.
TYPES OF PROGRAMMING LANGUAGES
1. Low Level Languages (LLL):
- These are programming languages used to write programs in machine code (i.e in 1s and 0s) or in
mnemonic codes.
- Low level languages are machine oriented (machine specific).
- Low level language is in two forms:
(a) Machine Language and
(b) Assembly Language.
a. Machine code (language)
- Is the language used to write programs in binary form (1s and 0s).
- Machine code executes without translation.
- Machine language has the following advantages:
Programs run faster since they are already in computer language. There is no need for
conversion as programs are in machine language.
Programs occupy very small disc storage space by storing just 1s and 0s.
Disadvantages of Machine language:
- They are very difficult to learn.
- They are difficult to understand.
- Very difficult to use and therefore very few programmers use them these days.
- Programs are prone to errors and take too long to debug.
- It takes too long to develop working programs.
- They are machine dependent (they only work on the type of computer they were designed for and do not
work on other computers).
b. Assembly Language:
- These are programming languages that use mnemonic codes in coding programs.
- Mnemonic codes are abbreviations used to represent instructions when coding assembly language
programs, for example, LDA for Load, ADD for Addition, etc.
- One assembly language statement is equivalent to one machine code instruction and therefore
programming is lengthy and time consuming.
- However, assembly language programs are efficient.
- Programs also run faster as they are closer to machine language and therefore are used in
designing programs that need efficient timing, e.g. games like chess, operating systems, etc.
- Assembly language is used when there is need to access registers and memory addresses directly.
- Assembly language instructions also occupy very little disc storage space.
- Mnemonic codes are very close to machine code, hence assembly language is a low level
language.
- They however run on specific computer architecture since they are hardware aligned.
- They also contain different forms of instruction, e.g. jump, control, arithmetic, etc.
- Assembly language allows immediate, direct and other forms of memory addressing.
Application: Assembly language is used in:
- Coding operating systems
- Coding device drivers
- Coding programs for embedded systems like DVD players, decoders, etc.
- Coding encryption software
Advantages of Assembly language:
- One assembly language instruction corresponds to one machine code instruction and therefore
translation is easier and faster.
- Programs run faster since they are close to machine code.
- They occupy very small disk storage space hence are economical to use.
- Easier for a programmer to use than machine language.
Disadvantages of Assembly Language
- They are very difficult to learn.
- They are very difficult to understand.
- Takes too long to develop working programs.
- They can be machine dependent (machine oriented) unless the machines use the same
processor chip.
2. High Level Languages (HLL):
- These are programming languages that use English-like statements in coding programs, for
example COBOL, Pascal, BASIC, etc.
- High Level languages are mostly used for developing user applications like stock control systems,
personnel records, etc.
- There are so many high level languages because of competition from designers who want to
outpace each other.
- It can also be due to the fact that we have so many application areas in real life so each high level
language is designed for a specific problem (problem oriented/problem specific) to be solved in
our daily lives, for example BASIC was designed for learning purposes, COBOL for business
applications, FORTRAN for scientific purposes, etc.
- High Level languages are independent of the architecture of the computer.
- One statement is translated into several equivalent machine code instructions before it is executed.
- Below is an example of a BASIC program that accepts two numbers entered through the
keyboard, adds them and display the result on the screen:
INPUT "ENTER FIRST NUMBER.", A
INPUT "ENTER SECOND NUMBER.", B
SUM = A + B
PRINT SUM
END
- Programs written in High Level Language are first converted to machine code before running.
High level languages have the following features:
Problem oriented (machine independent): they are designed to solve an application problem and
therefore run on any machine that has a suitable translator.
They are portable: they can be transferred from one machine to another and run without problem.
Instructions are written in English statements which are easier to understand.
State:
These are attributes/characteristics/fields for an object, e.g. a car has colour, registration
number, model, etc.
Bicycles have state of current gear, current speed, etc.
Each object stores its own state.
Dogs have state (name, colour, breed, hungry).
Behaviour (operations):
Refers to methods that can be used on each object state, e.g. changing gear of a car, applying
brakes, etc.
Each class has its source code that is associated with it and defines the fields and methods.
Dogs have behaviour (barking, fetching, wagging tail).
Class:
- A set of objects which share a common data structure and a common behaviour.
- In coding a program, a class is taken as an abstract data type that describes the fields and methods
of the class.
- Members of a class can have different access levels, which can be private, protected or public.
- Example of class declarations in Java are as follows:
class Bicycle {
    int speed = 0;
    int gear = 1;
    void changeGear(int newValue) { gear = newValue; }
    void speedUp(int increment) { speed = speed + increment; }
    void applyBrakes(int decrement) { speed = speed - decrement; }
    void printStates() { System.out.println("speed:" + speed + " gear:" + gear); }
}
bike2.speedUp(10);
bike2.changeGear(3);
bike2.printStates();
}
}
Encapsulation
- This is a technique of combining operations (methods) and data (fields) into one unit as in classes.
- Encapsulation can be in two forms:
(a) Data encapsulation: Hiding states internally and requiring all interaction to be performed
through an object's methods. It involves restricting access to the states (fields).
(b) Procedural encapsulation: users do not need to know how the behaviour happens, that is
hiding operations from the user.
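Both forms of encapsulation can be sketched with a variant of the bicycle class. The class name EncapsulatedBicycle, the getter getSpeed and the rule that speed never drops below zero are assumptions added for this illustration.

```java
class EncapsulatedBicycle {
    // Data encapsulation: the state is private, so code outside the class
    // cannot read or change it directly.
    private int speed = 0;

    // Procedural encapsulation: callers use these methods without knowing
    // how the checks inside them work.
    public void speedUp(int increment) {
        if (increment > 0) {
            speed = speed + increment;
        }
    }

    public void applyBrakes(int decrement) {
        speed = Math.max(0, speed - decrement); // speed is never allowed below zero
    }

    public int getSpeed() {
        return speed;
    }

    public static void main(String[] args) {
        EncapsulatedBicycle bike = new EncapsulatedBicycle();
        bike.speedUp(10);
        bike.applyBrakes(25);
        System.out.println(bike.getSpeed()); // 0
    }
}
```

Because every change to speed goes through a method, the class can enforce its own rules on the state.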
Inheritance
- In Object-oriented programming, inheritance is whereby classes can re-use (assume) commonly
used state and behaviour from their parent (base) class.
- Inheritance is the ability of a class to use the variables and methods of a class from which the
new class is derived (parent class)
- Inheritance allows a new class to be derived from an existing class.
- Inheritance therefore is a relationship among classes, where a sub-class (derived class) shares all
the fields and methods of the base (parent) class, plus its own methods and states.
- Consider the inheritance diagram below:
- Base Class: This is the parent class or the first class to be created from which other class can
inherit states and methods.
- Derived class: These are new classes that are created from the base class, and therefore have
methods and states of the base class plus their own methods and states.
- The syntax for creating a subclass is to use the extends keyword, followed by the name of the class
to inherit from:
class MountainBike extends Bicycle
{
// new fields and methods defining a mountain bike
// would go here
}
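A minimal runnable sketch of the same idea follows. The field seatHeight and the method setHeight are hypothetical additions used to show a derived class adding its own state and behaviour; Bicycle is a cut-down version of the class given earlier.

```java
class InheritanceDemo {
    public static void main(String[] args) {
        MountainBike bike = new MountainBike();
        bike.speedUp(10);   // inherited from Bicycle
        bike.changeGear(3); // inherited from Bicycle
        bike.setHeight(20); // defined only in MountainBike
        System.out.println("speed:" + bike.speed + " gear:" + bike.gear
                + " seatHeight:" + bike.seatHeight);
    }
}

// Base (parent) class.
class Bicycle {
    int speed = 0;
    int gear = 1;
    void changeGear(int newValue) { gear = newValue; }
    void speedUp(int increment) { speed = speed + increment; }
}

// Derived class: re-uses the fields and methods of Bicycle and adds its own.
class MountainBike extends Bicycle {
    int seatHeight = 0;
    void setHeight(int newValue) { seatHeight = newValue; }
}
```

Running the demo prints speed:10 gear:3 seatHeight:20, even though MountainBike itself declares neither speedUp nor changeGear.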
Polymorphism
- In general, polymorphism is the ability to have the same operation performing differently in
different circumstances.
- Polymorphism allows an operation to perform differently depending on the parameters that are
passed.
- This is the ability of classes to use the same name in the class hierarchy for a method but each
class implementing the method differently.
- In polymorphism, derived classes are able to re-define some of the base (super class) methods.
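The two senses of polymorphism described above can be sketched as follows. The classes Animal, Dog and Cat and the method names sound and area are hypothetical, chosen only for illustration.

```java
class PolymorphismDemo {
    // Same operation name, behaviour chosen by the parameters passed (overloading).
    static int area(int side) { return side * side; }
    static int area(int width, int height) { return width * height; }

    public static void main(String[] args) {
        Animal[] animals = { new Animal(), new Dog(), new Cat() };
        for (Animal a : animals) {
            // The class of the actual object decides which sound() runs (overriding).
            System.out.println(a.sound());
        }
        System.out.println(area(4));    // 16
        System.out.println(area(4, 5)); // 20
    }
}

class Animal {
    String sound() { return "Some sound"; }
}

class Dog extends Animal {
    @Override
    String sound() { return "Woof"; } // re-defines the base class method
}

class Cat extends Animal {
    @Override
    String sound() { return "Meow"; } // re-defines the base class method
}
```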
Containment/aggregation/composition
- These are links/associations between objects that allow them to communicate.
- For instance: a form on a screen is an object. On the object form, there are other objects, e.g.
delete button, display button, exit button, etc.
- These buttons (which are objects) communicate with the form (another object).
- Thus the linkage between the form and the buttons is called the containment.
Event driven programming
- This is whereby the sequence of code execution is determined (triggered) by external events or by
user actions, e.g. clicking a button on a form is an event that can cause a program to execute certain
sections of the code.
- Thus clicking a button or menu item is an event.
- This is important because the programmer cannot predict the exact sequence a user can perform
his/her tasks.
Event handlers: small program codes which are invoked (called) in response to external events.
Dispatcher: small program codes that call event handlers so that events can be processed.
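The relationship between events, event handlers and the dispatcher can be sketched in a few lines of Java. This is a simplified illustration, not how any particular GUI toolkit is implemented; the class EventDispatcherSketch and the event names are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EventDispatcherSketch {
    // For each event name, the handlers to call when that event occurs.
    private final Map<String, List<Runnable>> handlers = new HashMap<>();

    // Attach an event handler to an event name.
    void register(String eventName, Runnable handler) {
        handlers.computeIfAbsent(eventName, k -> new ArrayList<>()).add(handler);
    }

    // The dispatcher: calls every handler registered for the event
    // and returns how many handlers were called.
    int dispatch(String eventName) {
        List<Runnable> registered = handlers.getOrDefault(eventName, new ArrayList<>());
        for (Runnable handler : registered) {
            handler.run();
        }
        return registered.size();
    }

    public static void main(String[] args) {
        EventDispatcherSketch dispatcher = new EventDispatcherSketch();
        dispatcher.register("deleteClicked", () -> System.out.println("Deleting record"));
        dispatcher.register("exitClicked", () -> System.out.println("Closing form"));
        // The order of execution follows the order of the events, not the order of the code.
        dispatcher.dispatch("exitClicked");
        dispatcher.dispatch("deleteClicked");
    }
}
```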
Advantages of OOP
Grouping code into individual software objects provides a number of benefits, including:
- Modularity: The source code for an object can be written and maintained independently of the
source code for other objects. Once created, an object can be easily passed around inside the
system.
- Information-hiding: By interacting only with an object's methods, the details of its internal
implementation remain hidden from the outside world.
- Code re-use: If an object already exists, you can use that object in your program.
- Easy debugging: If a particular object turns out to be problematic, you can simply remove it from
your application and plug in a different object as its replacement.
- Reliability: if code is designed by specialists, it is more likely to be free from errors due to
intensive testing.
- Time saving: re-use of existing methods and states means less time needed to code programs.
- Decreased maintenance costs: programmers have less code to maintain.
- Smaller program codes: since the states and methods of classes are re-used, the code of the
program is smaller, taking less disk storage space.
- Storage structures of an object may be altered without affecting the programs that make use of it.
- They do not produce the machine code version (object code) of a program; hence translation is
repeated every time the program is executed.
- If the program is run 100 times, translation for each instruction is also carried out 100 times.
Functions of Interpreters
- They translate each instruction in turn into machine language and run it.
- They allocate storage space to variables.
- They check for syntax errors in each program statement.
- They give error messages to the user.
- They detect misspelt or wrongly used reserved words in the program.
- They detect wrong use of variables.
Advantages of interpreters
- It is easy to find and correct syntax errors in interpreted programs.
- There is no need for lengthy recompilation each time an error is discovered.
- It is very fast to run programs for the first time.
- Allows development of small program segments that can be tested on their own without
writing the whole program.
- It is easier to partially test and debug programs, especially during the programming stage.
- It is very fast to run small programs.
- Individual segments can be run, allowing errors to be isolated.
- Re-running is needed after even very minor changes, so
continual compilation of the whole code would be wasteful and time consuming.
Disadvantages of interpreters
- They are very slow in running very large programs.
- They do not produce object code from the source code, so
translation takes place every time the program is run.
3. Compilers
- These are programs that convert a high level language program into its machine code equivalent at
one go (at once), producing object code that can then be run, e.g. the COBOL compiler.
- Compiler must be present for compiling the program only and NOT during the running process.
- Creates an object code version of the source code
- Once compiled, the program no longer needs conversion since the machine code version is the one
that will be run, until some changes are made to the program code.
- Compiled programs run faster when called and therefore may be held as library routines.
- Once compiled, the program can then be run even on a computer without the compiler since the
program will already be in machine code.
- The compilation process involves many complex stages which will be looked at later in this
course.
Functions of Compilers
- They check syntax errors in program statements.
- They allocate storage space to variables.
- Translate the whole program into machine code at one go.
- Produce the object code of the program, which can then be run.
- Produce a program listing which indicates the position of errors in a program.
- Give error messages to the user.
- Detect misspelt or wrongly used reserved words in the program.
- Detect wrong use of variables.
Advantages of Compilers
- The object code can be saved on the disc and run when needed without the need for
compilation.
- Compiled programs run faster since only the object code is run.
- The object code can run on any compatible computer, even those without the compiler. Therefore
compiled programs can be distributed to many users and used without any problems.
- The object code is more secure since it cannot be read without reverse
engineering.
- Compilers indicate the line numbers with syntax errors and therefore assist programmers in
debugging programs.
- They are appropriate even for very large programs.
Disadvantages of Compilers
- Slower than interpreters for running programs for the first time.
- They can cause the computer to crash.
- Difficult to find errors in compiled program.
- There is need for lengthy recompilation each time an error is discovered.
Difference between High Level Languages and Low Level Languages
4. Fourth Generation Languages (1971): Very High Level Languages (4GLs). This saw the
development of non-procedural languages like SQL, PARADOX, etc.
5. Fifth Generation Languages (1981): natural language, artificial intelligence and expert system
languages like PROLOG and LISP.
Features of High Level Programming Languages
1. Programming Constructs
These are the basis from which high level languages are built. Programming constructs
include:
(a) Control Structures
(i) If Then Else construct
(ii) Case statement
(b) Looping Structures
(i) For Next construct
(ii) Repeat Until construct
(iii) While Endwhile construct
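The constructs listed above have close equivalents in Java, sketched below. The method names and the pass mark of 50 are assumptions; Java's do ... while loop is used to stand in for Repeat Until, since both test the condition after the loop body has run.

```java
class ControlStructuresSketch {
    // If Then Else construct
    static String grade(int mark) {
        if (mark >= 50) {
            return "Pass";
        } else {
            return "Fail";
        }
    }

    // Case statement (switch in Java)
    static String dayType(int day) {
        String type;
        switch (day) {
            case 6:
            case 7:
                type = "Weekend";
                break;
            default:
                type = "Weekday";
        }
        return type;
    }

    // For Next construct: the number of repetitions is known in advance.
    static int sumFor(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Repeat Until construct: the body always runs at least once.
    static int sumRepeatUntil(int n) {
        int total = 0;
        int i = 1;
        do {
            total += i;
            i++;
        } while (i <= n); // equivalent to "Until i > n"
        return total;
    }

    // While Endwhile construct: the body may run zero times.
    static int sumWhile(int n) {
        int total = 0;
        int i = 1;
        while (i <= n) {
            total += i;
            i++;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(grade(65));         // Pass
        System.out.println(dayType(7));        // Weekend
        System.out.println(sumFor(5));         // 15
        System.out.println(sumRepeatUntil(0)); // 1, because the body ran once
        System.out.println(sumWhile(0));       // 0, because the body never ran
    }
}
```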
2. Operators
Operators are used to manipulate data and they can be:
(a) Arithmetic operators, e.g. b+c-d*e/f
(b) Logical operators: And, Or, Not
(c) Assignment operators: =
(d) Comparison/relational operators: >, <, <=, >=, <>, Is
3. Identifiers
- An identifier is a unique label of a data item or element of a program and comprises one or
more characters.
- Identifiers include variable names, procedure names, constants, etc. Identifiers must not be
reserved words.
- Identifiers can be user defined as long as they are not reserved words.
4. Constants
- A constant is a data item (variable) whose value does not change during program execution.
- Its value is fixed.
- Constants are used to represent data items with fixed values, e.g. the value of pi. In VB 6.0, a
constant is declared as shown below:
Public Sub compute_interest()
Const pi=3.14159
Dim rs As New Recordset
Dim rs_loanpay As New Recordset
...
End Sub
If the constant is a string value, it must be enclosed in quotes, e.g.
Const Name = "Kapondeni"
In this case, the value of name will never change (when Name is declared as a constant)
5. Variables
- A variable is a name given to a memory location that stores a certain value, which may
change during program execution.
- Variables can be field identifiers, e.g. surname is a valid variable name.
- Variables must not be reserved words.
- Variables must be unique in the procedure or program (if all are global).
- Variables are declared at the beginning or at some point inside the program code. Every
variable must be declared before use, otherwise an error is generated.
- Variable names, like all identifiers, start with an alphabetic character.
- They can be one character or a string of characters.
- Variable names can be alphanumeric (a combination of letters and digits).
- They must be one word and must be related to the data stored in them so that the programmer
is not confused, e.g. Surname should be a variable that stores a surname.
- If two words are used as a variable name, they must be joined by an underscore ( _ ), with no
spaces between the words, e.g. Student_Surname, NOT Student-Surname.
- Alternatively, one may join the words as follows: StudentSurname.
- Variable names should not be too long; 8 characters are fine, although VB supports longer
variable names of up to 255 characters.
- Variables can store numeric, character or string values and must be declared appropriately.
- In Visual Basic 6.0, variables must be declared before they are used. The keyword Dim is
used to declare variables, and each variable should have a data type.
a. Global variables
These are variables that are accessed and can be used by any procedure or function within the
same program.
- They are public variables
- The value of the variable exists throughout the program.
- Global variables are declared outside the procedure.
- In VB 6.0, global variables are declared as follows:
Public Sname As String
The word Public implies that it is a global variable.
b. Local variables
- These are variables that are defined within a procedure and are accessible only within the
procedure in which they are declared.
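The difference in scope can be sketched in Java, where a static field of a class plays roughly the role of a global variable and a variable declared inside a method is local. The names ScopeSketch, sname, setName and doubleIt are assumptions made for this illustration.

```java
class ScopeSketch {
    // "Global" variable: declared outside every method, so all methods
    // of the class can read and change it, and its value persists between calls.
    static String sname = "none";

    static void setName(String value) {
        sname = value; // changes the shared variable
    }

    static int doubleIt(int x) {
        int result = x * 2; // local variable: exists only while doubleIt runs
        return result;
    }

    public static void main(String[] args) {
        setName("Moyo");
        System.out.println(sname);       // Moyo
        System.out.println(doubleIt(4)); // 8
        // System.out.println(result);   // would not compile: result is local to doubleIt
    }
}
```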
The above diagram shows that every variable name starts with a letter, followed by any
mixture of letters, digits or underscores ( _ ) in any order, as long as the
first character is a letter.
Using the diagram above, a variable name like 3_Name is invalid since it starts with a digit.
- Variable names must follow the rules of the language.
- The translator checks the variable names used against these rules and reports any errors.
- The contents of a variable must be of a specific type, otherwise an error is created by the attempted
use of anything else.
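The naming rule described above (a letter first, then any mixture of letters, digits and underscores) can be written as a regular expression. The sketch below checks only that rule; it does not check reserved words or length limits, and the class and method names are assumptions.

```java
import java.util.regex.Pattern;

class VariableNameCheck {
    // A letter first, then zero or more letters, digits or underscores.
    private static final Pattern RULE = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    static boolean isValid(String name) {
        return RULE.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Student_Surname")); // true
        System.out.println(isValid("StudentSurname"));  // true
        System.out.println(isValid("3_Name"));          // false: starts with a digit
        System.out.println(isValid("Student-Surname")); // false: hyphen is not allowed
    }
}
```

A translator performs a similar check on every identifier it meets and reports an error when the rule is broken.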
6. Reserved words
- Reserved (key) words are identifiers with a pre-defined meaning in a specific programming
language, for example Dim, If, End, Integer, As, etc. in Visual Basic.
- Reserved words must not be used as variable names.
- Each programming language has its own reserved words, which may differ from other
languages.
- The translator program maintains a dictionary of reserved words.
- If a word used as a reserved word is not in this dictionary, an error has been made, and a message
may be given which suggests a reserved word with a similar spelling.
7. Expressions
- An expression is a construct made up of variables and operators that evaluates to a single
value.
Example:
NumberA = a+b-c*d
- The above is a statement. However, a+b-c*d is an expression found in a statement.
- Expressions can be arithmetic (as given above), Boolean or string expressions. For example,
(a>b) and (a>=c)
- Operator precedence is very important in evaluating expressions and therefore it is important to
enclose expressions is brackets where possible. Operator precedence is as follows, starting from
the highest to the lowest:
( ), Not, ^, {*,/,}\,Mod, {+, -,}{=,<,>,<=, >= },
NB: Operators is set braces indicates that they are in the same level.
- Arithmetic expressions are evaluated first, followed by comparisons and lastly logical
expressions.
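The effect of precedence and brackets can be illustrated with a short Python sketch. Python is used here only for illustration (its full precedence table is not identical to Visual Basic's), and the values of a, b, c and d are assumed sample values.

```python
# Hypothetical sample values, chosen only for illustration.
a, b, c, d = 2, 3, 4, 5

# Multiplication binds tighter than + and -, so c * d is evaluated first.
number_a = a + b - c * d          # 2 + 3 - 20

# Brackets force a different evaluation order.
bracketed = (a + b - c) * d       # (2 + 3 - 4) * 5

# Comparisons are evaluated before the logical operator "and".
is_largest = (a > b) and (a >= c)
```

With these values the unbracketed expression and the bracketed one give different results, which is why brackets are recommended whenever the intended order is not obvious.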
8. Statements
- A statement is a single instruction in a program which can be converted into machine code and
executed.
- A statement is often just one line of program code, but in some cases a statement may span more
than one line.
For example: Name = "Marian" is a statement.
Example 1
NumberA = a + b
This is an assignment statement, that is, variable NumberA is assigned the sum of the values of
variables a and b. Thus if a = 2 and b = 3, NumberA is assigned the value 5.
An assignment is an instruction in a program that places a value into a variable, e.g. total = a + b
The above is a one-line statement.
Example 2
If a > b Then MsgBox "a is bigger than b.", vbExclamation
The above is a one-line statement made up of an If statement.
Example 3
If b < 0 Then
    MsgBox "b is less than zero. Command cannot be executed", vbExclamation
    Exit Sub
End If
This is a single statement, starting at If and ending at End If.
It contains other statements within it.
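9. Block structure
A block is a group of zero or more statements that is treated as a unit and can be used
anywhere a single statement is allowed. Some languages mark blocks with balanced braces; Visual
Basic marks them with keywords. For instance:
If (condition) Then
    ...        (block 1: begins after Then and ends before Else)
Else
    ...        (block 2: begins after Else and ends before End If)
End If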
10. Functions
- A function is a self-contained module that returns a value to the part of the program which
calls it every time it is called/executed.
- A function can be just an expression that returns a value when it is called.
- A function performs a single and special task, e.g. generate a student number.
- Because they return a value, functions have a return data type, e.g. integer, real, etc.
- Functions can be in-built functions or user-defined functions.
- In-built functions are pre-defined routines of a programming language that return a value,
e.g. Val (returns the numeric value of a string), MsgBox (displays a message box on the screen
and returns the button that was clicked), Abs (returns the absolute value of a number), etc.
- Visual Basic has in-built date functions, string functions, conversion functions, etc.
- A user-defined function is a procedure (module) that returns a value whenever it is called. The
structure of a user defined function is as follows:
Public Function count_rec(ByVal rs As Recordset) As Boolean
    If rs.RecordCount <= 0 Then
        MsgBox "There is no record in the table.", vbExclamation
        count_rec = True
    Else
        count_rec = False
    End If
End Function
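- Note that a function starts with the word Function and ends with the statement End Function.
This function returns a Boolean value (either True or False): True when the recordset is empty
and False otherwise. The function name comes just after the word Function, i.e. count_rec in the
case above, and the return value is set by assigning to the function name.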
11. Procedures
- A procedure is a self-contained module that does not return a value.
- In pseudocode, a procedure usually starts with the keyword Procedure followed by the procedure
name, e.g. Procedure FindTotal. In Visual Basic 6.0 the keywords are Sub and End Sub. Procedures
are user-defined.
- The name of the procedure should be related to its task.
- Each procedure name must be unique within the same program.
- A procedure can be called from the main program or by other modules.
- A procedure is called by stating its name.
- Parameters are usually passed when calling procedures.
- Parameters/arguments are values that are passed from one procedure to another and can be
the actual values or variable names. They are therefore values given to a procedure or function
by statements from other modules.
- Parameters can be formal or actual parameters.
Actual parameters: these are the arguments found in the calling module/statement; they could be
variables or actual data like 30, 40.
Formal parameters: these are the variables in the called procedure that receive data from the
calling module or statement.
When processing of the called procedure is finished, control returns to the statement after
the calling statement.
Parameters can be passed by value or by reference.
Passing Parameters by Value
Passing by value is whereby the actual value (or the value of a variable, e.g. 20) is
copied and the copy is passed to the called module.
In this case, the variables whose values are passed will NOT be altered in the calling procedure,
even if the called procedure or function changes its copy.
Thus a copy of the value of the variable is passed, not the variable itself.
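When a parameter is passed by reference (ByRef), the called procedure works on the original
variable, so any change it makes is seen by the caller.
If the programmer does not specify ByVal or ByRef, Visual Basic 6.0 assumes ByRef by default.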
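The difference between the two ways of passing parameters can be sketched in Python. This is only an illustration: Python has no ByVal or ByRef keywords, so a one-element list is used here as a stand-in for "the variable itself", and the function names and values are assumptions made for this example.

```python
def add_bonus_by_value(salary):
    # 'salary' is the formal parameter; changing it only changes the local copy.
    salary = salary + 100
    return salary


def add_bonus_by_reference(salary_box):
    # The one-element list stands in for the caller's variable:
    # changing its contents is visible to the caller.
    salary_box[0] = salary_box[0] + 100


pay = 500                          # actual parameter
new_pay = add_bonus_by_value(pay)  # pay itself is not altered

pay_box = [500]
add_bonus_by_reference(pay_box)    # the caller's data is altered
```

After these calls, pay still holds its original value while the contents of pay_box have changed, which mirrors the by-value and by-reference behaviour described above.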
12. Semantics
The meaning attached to statements and the way they are used in a certain language.
13. Syntax
- These are grammatical rules and regulations governing sentence construction and layout of
different programming languages.
- For example, Pascal uses a semi-colon(;) at the end of each instruction, it also uses the reserved
(key) word writeln to display items on the screen, etc.
- Each programming language has its own syntax.
- A program with syntax errors does not run.
14. Literals
- A literal is a fixed value written directly in the code of the program; it is not a variable.
- A literal is the source code representation of a fixed value.
- Literals are represented directly in your code without requiring computation, as shown:
Dim Result As Integer
Dim Name As String
Result = 20
Name = "Kapondeni"
Here 20 is an integer literal and "Kapondeni" is a string literal.
15. Input operations
This refers to statements that prompt the user to enter data into the computer, e.g.
ID = InputBox("Enter Member Data to Search")
The above displays an input box with the message in brackets, which prompts the user to enter the
data that needs to be searched.
16. Output operations
Refers to statements that display or print results on the screen or on the printer, e.g.
frmPayments.Show
displays the payment form on the screen.
MsgBox ("Record not Found")
displays a message box with the message in brackets.
17. File handling operations
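These allow the program to manipulate files and database records, e.g. in Visual Basic:
rs.Close            (closes the recordset rs)
While Not .EOF      (repeats while the end of the file or recordset has not been reached)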
DATA TYPES
- Data types describe the nature of data handled by programs.
- These are important as they enhance program readability and maintenance.
- Data types can be simple (integers, string, etc.) or complex (arrays, lists, records, etc.).
- Data types are declared at the beginning or at some point inside the program code.
- Data types for variables must be properly declared.
a. Numeric data types
1. Integer
- Used to represent whole numbers, positive or negative, and occupy 2 bytes of memory.
- Decimal values will be rounded to the nearest whole number.
- Values accepted range from -32 768 to + 32 767. Default value is 0.
2. Byte
- Stored in a single 8-bit unsigned data value ranging from 0-255.
- Used to store binary data.
- Default value is a 0.
3. Real
- A data type that is used to represent values (numbers) with decimal point values, e.g 23.56 is
a real number.
- Can be used to represent averages.
- Some languages take this to be float data type.
4. Currency
- Occupies 8 bytes with 0 as default value.
- Used for handling monetary values and supports up to 15 digits to the left and 4 digits to the
right of the decimal point.
5. Date
- Represents date and time values and occupies 8 bytes.
6. Double
- Stores a double precision floating point number, with more significant digits than Single.
- Default is 0 and occupies 8 bytes.
7. Single
- Stores a single precision floating point number, with decimal places.
- Default is 0 and occupies 4 bytes.
8. Long
- Short for long integer; default value is 0.
- It occupies 4 bytes and stores values from -2 147 483 648 to +2 147 483 647.
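The ranges quoted for Byte, Integer and Long follow directly from the number of bytes each type occupies. A minimal Python sketch of the arithmetic is given below; the function names are made up for this illustration.

```python
def signed_range(num_bytes):
    # One bit is used for the sign, so n bits give -2^(n-1) .. 2^(n-1) - 1.
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(num_bytes):
    # With no sign bit, n bits give 0 .. 2^n - 1.
    return 0, 2 ** (num_bytes * 8) - 1
```

Calling signed_range(2) gives the Integer limits, signed_range(4) gives the Long limits, and unsigned_range(1) gives the Byte limits stated in these notes.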
b. String data types
1. String
- A string is a data type that stores zero or more characters, which can be alphabetic or
alphanumeric.
- If a variable is declared as a string, it must not be used directly in arithmetic calculations
because strings are not numbers.
- Strings are declared as follows:
Dim Name As String
- If the value of a string variable is assigned within the program code, it must be enclosed in
quotes, e.g.
Name = "Tungamirai"
Name = ""
- The second example is an empty (null) string. It is different from 0 and from a space
character.
- Strings can be of fixed length or variable length.
User-defined data:
These are defined by the user depending on the method of solution, e.g. classes, when users define
their own classes.
Enumerated data:
These are data types with a list of items that are pre-defined, e.g. days of the week, months of the
year, etc.
Word: a computer word is a group of bits that can be handled or transferred by the processor as a
single unit (word length).
Performance: the program must be efficient and fast in performing whatever it was designed
to perform.
Storage saving: program must occupy as little storage space as possible.
To stop this we use the fact that, eventually, <integer> is a single digit and write
<integer> ::= <digit>|<digit><integer>
We now have the full definition of an unsigned integer which, in BNF, is
<unsigned integer> ::= <digit>|<digit><unsigned integer>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
The above can be shown using a syntax diagram as:
Variables in BNF
Valid variable names start with a letter (upper or lower case), followed by any number of
characters (each of which must be a letter, digit or underscore), up to any length. This can be
defined in BNF as:
<variable> ::= <letter>|<variable><character>
<character> ::= <letter>|<digit>|<under-score>
<letter> ::= <uppercase>|<lowercase>
<uppercase> ::= A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z
<lowercase> ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
<digit> ::= 0|1|2|3|4|5|6|7|8|9
<under-score> ::= _
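The recursive BNF rules for unsigned integers and variable names can be turned almost line by line into recursive checks. The following Python sketch is one possible way to do this, not the method a real translator necessarily uses; the function names are assumptions for this example.

```python
import string

LETTERS = set(string.ascii_letters)   # <uppercase> | <lowercase>
DIGITS = set(string.digits)           # 0|1|2|...|9


def is_unsigned_integer(text):
    # <unsigned integer> ::= <digit> | <digit><unsigned integer>
    if len(text) == 0:
        return False
    if text[0] not in DIGITS:
        return False
    if len(text) == 1:
        return True                    # first alternative: a single digit
    return is_unsigned_integer(text[1:])


def is_variable(text):
    # <variable> ::= <letter> | <variable><character>
    if len(text) == 0:
        return False
    if len(text) == 1:
        return text in LETTERS         # first alternative: a single letter
    last = text[-1]
    is_character = last in LETTERS or last in DIGITS or last == "_"
    return is_character and is_variable(text[:-1])
```

For instance, is_variable("3_Name") is rejected because stripping characters from the right eventually leaves "3", which is not a letter.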
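[Diagram: stages of compilation, from Source Code through the Lexical Analyser, Syntax Analyser, Semantic Analyser, Intermediate Code Generator, Code Optimiser and Code Generator to Object Code, with the Symbol Table Manager and Error Handler connected to the stages.]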
Analysis Phases
Consist of the lexical analyser, syntax analyser, semantic analyser and the intermediate code
generator.
Lexical Analysis
This is where scanning of the source program is done. Each sequence of characters that has
an atomic meaning is recognised and represented by a token.
A token represents a class of valid character sequences, for example numbers, variable names
and function names.
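A very small lexical analyser can be sketched in Python with regular expressions. The token classes, the list of reserved words and the function name below are all assumptions chosen for illustration; a real compiler recognises many more token types.

```python
import re

# Each named group is one token class; UNKNOWN catches any other character.
TOKEN_PATTERN = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<IDENTIFIER>[A-Za-z][A-Za-z0-9_]*)
  | (?P<OPERATOR>[-+*/=<>()])
  | (?P<SPACE>\s+)
  | (?P<UNKNOWN>.)
""", re.VERBOSE)

# A tiny, assumed subset of reserved words.
RESERVED = {"Dim", "If", "Then", "End", "As"}


def tokenise(source):
    tokens = []
    for match in TOKEN_PATTERN.finditer(source):
        kind = match.lastgroup          # name of the group that matched
        text = match.group()
        if kind == "SPACE":
            continue                    # white space is discarded
        if kind == "IDENTIFIER" and text in RESERVED:
            kind = "RESERVED"
        tokens.append((kind, text))
    return tokens
```

For example, tokenise("NumberA = a + 12") yields an identifier, an operator, an identifier, an operator and a number, which the syntax analyser could then group into an assignment statement.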
Syntax Analysis
This is the stage where tokens are grouped into large structures, for example, assignment
statements.
Semantic Analysis
This is where the meaning of the grouped tokens is checked before code generation, and more
complex errors (for example, using a variable with the wrong data type) are detected.
Intermediate Code Generator
This is where the source program is translated into an intermediate form, from which the target
program is later produced.
Synthesis Phase
This phase consists of the Code Optimiser and Code Generator.
Code Generation
The target code is generated from the intermediate code; this includes static scheduling and
register allocation.
Code Optimiser
The number of instructions is reduced in order to allow the program to run faster.
Compiler Dealings with variables
During lexical analysis
Characters in the variable name are tokenised
Variable name is added to symbol table
Data type added
scope is added/block(s) in which variable is valid
During syntax analysis
variable checked against syntax of the language, e.g. syntax diagram.
Variable names which do not match the rules are reported in error diagnostics
Statements containing variables are checked for syntax
Position in table is hashed from the name
Variable declarations are checked/also variable use
During code generation
Address of variable calculated
added to symbol table
Compiler Dealings with syntax errors
- The reserved word in the statement is isolated.
- The translator keeps a list of all reserved words in its dictionary.
- If the word is not in the list of reserved words, an error message is issued.
- If the reserved word is identified, the syntax table is checked for the expected form of the
statement.
- If the form of the statement does not match, an error is issued.
- Variable names are checked against the rules for variable names.
- Variable declarations are checked to determine if they are present.
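The dictionary look-up for reserved words, including the suggestion of a similar spelling, can be sketched in Python using the standard difflib module. The word list and function name are assumptions for this example, and the comparison here is case-sensitive, which is a simplification.

```python
import difflib

# A small, assumed dictionary of reserved words.
RESERVED_WORDS = ["Dim", "If", "Then", "Else", "End", "As", "Integer", "String"]


def check_reserved_word(word):
    if word in RESERVED_WORDS:
        return "ok"
    # Look for the closest spelling in the dictionary, if there is one.
    suggestions = difflib.get_close_matches(word, RESERVED_WORDS, n=1, cutoff=0.6)
    if suggestions:
        return "error: unknown word '" + word + "', did you mean '" + suggestions[0] + "'?"
    return "error: unknown word '" + word + "'"
```

A misspelling such as "Integr" is reported as an error together with the suggestion "Integer", in the same spirit as the diagnostics described above.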
Loading programs and linking modules
Loaders
A loader is a program that loads compiled programs/modules into the computer's memory. Some
functions of loaders are:
- Loads the individual modules into the computer's memory
- Decides where modules are to be placed in memory
- Adjusts memory addresses for each module according to where it is placed
- Loads library routines
- Copies the program from its storage location into memory
Loaders are in two forms: absolute and relocating loaders.
(a) Absolute loader: refers to a program that loads a program into a fixed area of memory each
time the program is run. The loaded program can only work when it is loaded at that single
address in memory.
(b) Relocating loader: refers to a program that loads a program anywhere in memory, and the
loaded program works without any problem. Thus the addresses of a loaded program are
recalculated every time the program is loaded into memory.
Relocation can be in two forms:
Static relocation: once the program is loaded at a memory address, its relocatability is lost,
that is, it cannot be moved to any other address during execution.
Dynamic relocation: the program can be moved to another address even during execution.
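The address adjustment carried out by a relocating loader can be sketched as follows. This Python example assumes a made-up instruction format of (opcode, operand) pairs in which JUMP, LOAD and STORE operands are addresses relative to the start of the module; it is an illustration only, not how any particular loader is written.

```python
# Hypothetical opcodes whose operand is a memory address.
ADDRESS_OPCODES = ("JUMP", "LOAD", "STORE")


def relocate(module, base_address):
    # Add the load address to every address operand in the module.
    relocated = []
    for opcode, operand in module:
        if opcode in ADDRESS_OPCODES:
            relocated.append((opcode, operand + base_address))
        else:
            relocated.append((opcode, operand))   # plain data is left alone
    return relocated
```

Loading the same module at a different base address simply produces a different set of adjusted addresses, which is why the program works wherever it is placed.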
Linkers
A linker is a program that combines separately compiled modules and creates linkages between
them. Other functions of linkers are:
- Joins the modules that have been compiled correctly
- Calculates addresses of the separate modules
- Allows library routines to be linked to several programs
- Ensures jump instructions from module to module are properly addressed
- Produces an executable file
- Matches up address references between modules
Editors
These are programs that are used to key in and amend source code. They are also used to display
and edit text before compilation.
REVIEW QUESTIONS
1. Explain the terms
(i) data encapsulation
(ii) inheritance
(iii) Polymorphism
when applied to programming in an object oriented programming language. (4)
2. Distinguish between procedural languages and declarative languages. (4)
3. Explain the passing of parameters by reference and by value. (4)
4. (a) Explain the difference between the translation techniques of interpretation and compilation [2]
(b) Give two advantages of each of the two translation techniques. [4]
5. An amount of money can be defined as
A $ sign followed by either
A positive integer or
A positive integer, a point, and a two digit number or
A point and a two digit number
A positive integer has been defined as <INTEGER>
A digit is defined as <DIGIT>::= 0/1/2/3/4/5/6/7/8/9.
a) Define, using Backus Naur form, the variable <AMOUNT OF MONEY>
b) Using the previously defined values of INTEGER and DIGIT, draw a syntax diagram to define
AMOUNT OF MONEY.
6. State the three stages of compilation and briefly describe the purpose of each. [6]
7. Explain in detail, the stage of lexical analysis. [6]
8. Explain the role of
(i) linkers,
(ii) loaders
in the running of programs. [4]
9. Two of the stages which a high level language program undergoes during compilation are lexical
analysis and syntax analysis.
Discuss how errors are discovered during each of these two stages. [5]
10. (a) (i) Describe what is meant by source code. [2]
(ii) Explain why source code needs to be translated into object code. [2]
(b) State what is meant by the following types of programming error:
(i) syntax error [1]
(ii) arithmetic error [1]
11. (a) Explain how the translator program prepares the programmers code into a program that the
machine can run. [2]
(b) (i) Explain what is meant by a procedure. [2]
(ii) Describe how procedures and the programming construct selection can be used to code
a simple menu system for a user. [3]
12. (a) Explain the difference between interpretation and compilation of a program written in a high
level language. [2]
(b) Explain what happens during the lexical analysis stage of compilation. [5]
(c) Describe two things that happen during code generation. [4]
13. (a) Programs can be designed in modular form.
Discuss the advantages and disadvantages of designing programs in modular form. [5]
(b) A program is to be written which will update the records in a sequential file and then produce a
backup copy.
Describe, using a diagram, the way that this problem can be split into modules to prepare it for
coding. [5]
14.(a) Explain why a program, written in a high level language, needs to be translated before it is run
on a computer. [2]
(b) Describe the difference between interpretation and the compilation of a high level language
program. [4]
(c) Explain how errors in the
(i) reserved words,
(ii) variables
Used in high level language instructions are recognised by the translator program. [4]
15. Describe each of the following programming paradigms
(i) Object-oriented, [2]
(ii) Declarative. [2]
16. A name is passed as a parameter to a function. The function uses a loop structure to search for the
name in an array. It returns the details found to the calling program.
(a) The name to be searched can be passed either
(i) by value, or
(ii) by reference.
Using this example, explain what is meant by a parameter being passed by value and by
reference. [2]
(b) Using examples from this function, explain what is meant by a
(i) local variable,
(ii) global variable. [4]
(c) Two types of translator are interpreters and compilers.
Describe the difference between an interpreter and a compiler and state why both would be
used with this function. [4]
(d) Explain the purpose of a loader in the running of the final program. [2]
17. (a) Explain the differences between the lexical analysis stage and the syntax analysis stage in the
compilation of a high level language program. [6]
(b) One phase of compilation is the code generation phase.
Describe the code generation phase. [3]
(c) Explain the purpose of a loader. [2]
(d) A program has been written using a top-down technique. The individual modules in the program
have been fully tested and there are no errors in any of them.
Explain why the program may fail to run or may produce incorrect results, despite the testing that has
been done. [2]
(e) When a computer runs a program, the program may fail to run successfully because there are
errors in the code.
Describe two types of error that may be present, giving an example of each. [6]
18. Explain why an interpreter would be preferred to a compiler as a translator when writing a high
level language program. [5]
19. In a particular object oriented programming language, the following classes are defined
CHAPTER 4: ALGORITHMS
- A set of instructions describing the steps followed in performing a specific task, for example,
calculating change.
- They are a sequence of instructions for solving a problem.
- Algorithms can be illustrated using the following:
Descriptions, Flowcharts, Pseudocodes, Structure diagrams
a. Descriptions:
These are general statements that are followed in order to complete a specific task.
They are not governed by any programming language. An example is as follows:
Enter temperature in °C
Store the value in box C
Calculate the equivalent temperature in °F
Store the value in box F
Print the value of box C and F
End the program.
b. Pseudocodes:
- These are English-like statements, close to a programming language, that indicate the steps
followed in performing a specific task.
- They are a means of expressing algorithms without worrying about the syntax of the programming
language.
- There are no strict rules on how pseudocode statements should be written.
- Indentation is very important in writing pseudocode since it clearly indicates the extent of
loops and conditional statements.
- Pseudocode is, however, independent of any programming language.
- An example is as follows:
Enter centigrade temperature, C
If C = 0, then stop.
Set F to 32 + (9C/5)
Print C and F
End
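The conversion step of the pseudocode above can be written in Python as a small function. This is only a sketch: the keyboard input and the "If C = 0, then stop" check are left out, and the function name is an assumption.

```python
def centigrade_to_fahrenheit(c):
    # Set F to 32 + (9C/5)
    return 32 + (9 * c / 5)
```

For example, a centigrade temperature of 100 converts to 212 degrees Fahrenheit.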
Control Structures/Programming Constructs/building blocks of a structured program
- A number of control structures are used in designing pseudocode.
- These include: simple sequence, selection and iteration.
NB: GO TO statements (which lead to so-called spaghetti code) must be avoided as the programs
will be difficult to follow, difficult to debug, and difficult to maintain.
i. Simple sequence:
- This is whereby instructions are executed in the order they appear in a program, without skipping
any of them, up to the end of the program.
- Statements are executed one after another in the order in which they are written.
- It is simple and avoids confusion.
- Example:
Enter first number, A
Enter second number, B
C=A+B
Print C
Stop
The above three pseudocodes produce the same result.
Cascaded/Nested If Statements
This is whereby If statements are found inside other If statements (nested Ifs) as shown below:
Start
Enter First Number, A
Enter Second Number, B
Enter Third Number, C
If A > B Then
    If B > C Then
        Print "A is the biggest number"
    End If
End If
End.
CASE Statement: This is an alternative to the IF...THEN...ELSE statement and is shorter when one
variable is tested against several values. For example:
Enter first number, A
Enter second number, B
Enter operator (+, -, *, /)
CASE operator OF:
    +: C = A + B
    -: C = A - B
    *: C = A * B
    /: C = A / B
ENDCASE
Print C
END
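The multi-way selection in the CASE example can be sketched in Python with a chain of if/elif tests, one per case. The function name is an assumption, and the final line (rejecting an unknown symbol) is an addition not present in the pseudocode.

```python
def calculate(a, b, operator):
    # One branch per case label in the pseudocode.
    if operator == "+":
        return a + b
    elif operator == "-":
        return a - b
    elif operator == "*":
        return a * b
    elif operator == "/":
        return a / b
    # Added for safety: the pseudocode does not say what happens otherwise.
    raise ValueError("unknown operator: " + operator)
```

For example, calculate(6, 3, "*") selects the multiplication branch and ignores the others.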
iii. Repetition/Iteration/looping:
A control structure that repeatedly executes part of a program or the whole program until a certain
condition is satisfied.
Iteration is in the following forms: FOR...NEXT LOOP, REPEAT... UNTIL Loop and the
WHILE...ENDWHILE Loop.
a. For...Next Loop: A looping structure that repeatedly executes the loop body for a specified
number of times. The syntax of the For...Next loop is as follows:
FOR {variable} = {starting value} TO {ending value} DO
    Statement 1
    Statement 2        (loop body)
    ................
NEXT {variable}
The group of statements between FOR and NEXT is called the loop body and is the part that is
repeatedly executed.
The For...Next loop is appropriate when the number of repetitions is known well in advance, e.g.
five times. An example of a program that uses the For...Next loop is as follows:
Sum = 0, Average = 0
FOR I = 1 TO 5 DO
    Enter Number
    Sum = Sum + Number
NEXT I
Average = Sum / 5
Display Sum, Average
End
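A count-controlled loop like the one above can be sketched in Python. To keep the example self-contained, the numbers are passed in as a list instead of being typed at the keyboard; that choice and the function name are assumptions.

```python
def sum_and_average(numbers):
    total = 0
    # The number of repetitions is known in advance: once per number.
    for i in range(len(numbers)):
        total = total + numbers[i]
    average = total / len(numbers)
    return total, average
```

Calling sum_and_average with five numbers repeats the loop body exactly five times, just as FOR I = 1 TO 5 does.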
b. Repeat...Until Structure: This is a looping structure that repeatedly executes the loop body
while the condition set is FALSE, until it becomes TRUE. The number of repetitions may not be
known in advance and the loop body is executed at least once. The syntax is as follows:
Repeat
    Statement 1
    Statement 2        (loop body)
    ................
Until {Condition}
For example
Sum = 0, Average = 0, Count = 0
Repeat
    Enter Number (999 to end)
    If Number <> 999 Then
        Sum = Sum + Number
        Count = Count + 1
    End If
Until Number = 999
Average = Sum / Count
Print Sum, Count, Average
End
In the above program:
- Count records how many numbers have been entered (the value 999 is not counted and is not added
to Sum).
- 999 is used to stop further data entry through the keyboard, thereby ending the loop. Such
a value that stops further data entry through the keyboard, thereby terminating a loop, is called
a rogue value or sentinel.
- The condition here is {Number = 999}. The loop exits when the number 999 is entered. If
999 is itself one of the values to be entered in this program, then the user has to split it into
two numbers, that is 999 = 990 + 9, and enter them separately as 990 and 9.
- A flag can also be used to control the loop. In this case 999 also acts as a flag.
NB. In the Repeat...Until loop, the condition is tested after the loop body has run, so the body
is executed at least once, even when the condition is true from the start. This can be misleading.
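Python has no Repeat...Until statement, but the same behaviour (body first, test afterwards) can be sketched with a loop that breaks at the end of the body. In this sketch the entries come from a list that is assumed to contain the rogue value, and the rogue value is deliberately left out of the total and the count; the function name is an assumption.

```python
def sum_until_sentinel(entries, sentinel=999):
    total = 0
    count = 0
    position = 0
    while True:                       # the body always runs at least once
        number = entries[position]    # stands in for "Enter Number"
        position += 1
        if number != sentinel:        # the rogue value is not real data
            total += number
            count += 1
        if number == sentinel:        # "Until Number = 999", tested after the body
            break
    average = total / count if count > 0 else 0
    return total, count, average
```

If the very first entry is the rogue value, the body still runs once, but nothing is added and the average is reported as 0 to avoid dividing by zero.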
c. While ... Do Structure
A looping structure in which the loop body is repeatedly executed while the condition set is TRUE,
until it becomes FALSE. It is used when the number of repetitions is not known in advance. The
condition set is tested first, before execution of the loop body. Therefore the loop body may not
be executed at all if the condition set is FALSE from the start. The syntax of the
WHILE...ENDWHILE structure is as follows:
WHILE {condition}
    Statement 1
    Statement 2        (loop body)
    ................
ENDWHILE
c. Flowcharts
It is a diagram used to give details on how programs and procedures are executed. Flowcharts are
drawn using specific symbols, each with its own meaning, as given below:
Symbol names (the symbol drawings are not reproduced here):
Process symbol: indicates where some form of processing occurs
Arrow
Input/output
Terminal
Connector
Pre-defined process
Decision
Flowchart
Start
Enter number, A
Enter number, B
Sum = A + B
Display Sum
Stop
3. Using Iteration
(a) Repeat ... Until Structure
Flowchart
Pseudocode equivalent
Sum, Average, Count = 0
Repeat
Enter Number
Sum = Sum + Number
Count = count + 1
Until Count > 10
Average = Sum / count
Display Sum, count, Average
End
Pseudocode equivalent
Sum, Average, Count = 0
WHILE Count <=10
Enter Number
Sum = Sum + Number
Count = count + 1
WEND
Average = Sum / count
Display Sum, count, Average
END
Flowchart (a) above indicates modules named Accept Numbers, Add Numbers, Multiply Numbers and
Display Results. Flowcharts for individual modules can then be designed as given in diagram (b)
above, where only the first module is shown. Can you do the rest?
d. Structure Diagrams/Structure Charts: These are diagrams that show relationships between
different modules, thereby giving the structure of a program. They also illustrate the top-down
approach to programming. It is useful as a documentation of a complex program once it is completed.
It resembles a family tree as given below.
Start
Sum, Product = 0
Enter First Number, A
Enter Second Number, B
Sum = A + B
Product = A * B
Display Sum, Product
End
The structure diagram above indicates five sub-programs of the program Process Numbers,
namely Initialise, Accept Numbers, Process Numbers, Display Results and Exit.
The module Process Numbers has its own sub-programs, which are Add Numbers and
Multiply Numbers.
Modules are appropriate for very large programs.
If the module is repeatedly executed (loop), then an asterisk (*) must be placed at the top right
corner of the module (inside).
All the boxes at the same level indicate selection.
Boxes below others indicate sequence.
The program can be written as a continuous single program as indicated on the right side of
the diagram.
RECURSION
A recursive function or procedure is one that calls itself rather than calling another
procedure.
Recursion can be used when finding factorial of a number. For example
Function Factorial (n)
If n=1 then
Return 1
Else
Return n * Factorial (n-1)
End if
End Function
A recursive structure has two important features:
- It calls itself
- It must have a terminating condition (n=1 in the above example), otherwise it will not stop
calling itself (runs forever). It uses the if condition (not a while) to specify the terminating
condition.
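The factorial function can be sketched in Python as follows. One small change is assumed here: the terminating condition is written as n <= 1 instead of n = 1, so that a call with 0 also stops.

```python
def factorial(n):
    if n <= 1:                        # terminating condition (base case)
        return 1
    return n * factorial(n - 1)       # the function calls itself
```

A call such as factorial(5) leads to calls with 4, 3, 2 and 1; the last of these hits the terminating condition and the results are multiplied on the way back.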
The following is a recursive procedure:
Procedure Squares (Low, High)
    If Low <= High Then
        Print (Low * Low)
        Squares (Low + 1, High)
    End If
End Procedure
If a recursive method is called with a base case, the method returns a result. If a method is
called with a more complex problem, the method divides the problem into two or more
conceptual pieces: a piece that the method knows how to do and a slightly smaller version of
the original problem. Because this new problem looks like the original problem, the method
launches a recursive call to work on the smaller problem.
For recursion to terminate, each time the recursive method calls itself with a slightly simpler
version of the original problem, the sequence of smaller and smaller problems must converge
on the base case. When the method recognizes the base case, the result is returned to the
previous method call and a sequence of returns ensues all the way up the line until the
original call of the method eventually returns the final result.
Both iteration and recursion are based on a control structure: Iteration uses a repetition
structure; recursion uses a selection structure.
Both iteration and recursion involve repetition: Iteration explicitly uses a repetition structure;
recursion achieves repetition through repeated method calls.
Iteration and recursion each involve a termination test: Iteration terminates when the
loop-continuation condition fails; recursion terminates when a base case is recognized.
Iteration and recursion can occur infinitely: An infinite loop occurs with iteration if the
loop-continuation test never becomes false; infinite recursion occurs if the recursion step does
not reduce the problem in a manner that converges on the base case.
Recursion repeatedly invokes the mechanism, and consequently the overhead, of method
calls. This can be expensive in both processor time and memory space.
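The comparison between iteration and recursion can be made concrete by solving one small task both ways. The Python sketch below lists the squares of the numbers from low to high; the function names are assumptions, and the results are returned as a list instead of being printed.

```python
def squares_recursive(low, high):
    # Selection structure: the base case is reached when low passes high.
    if low > high:
        return []
    return [low * low] + squares_recursive(low + 1, high)


def squares_iterative(low, high):
    # Repetition structure: the loop stops when the continuation test fails.
    result = []
    while low <= high:
        result.append(low * low)
        low += 1
    return result
```

Both functions produce the same list, but the recursive one makes one extra function call per number, which is the overhead referred to above.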
PROGRAMMING ERRORS
Programming errors are grouped into:
i. Syntax error:
- this is an error of violating the grammatical rules governing sentence construction in a certain
programming language, for example, misspelled reserved words or leaving out a semi-colon
at the end of each line in Pascal.
- Syntax errors are detected by the computer. A program cannot run with syntax errors.
ii. Logic error (Semantic error):
- refers to an error in the sequencing of instructions, modules and specifying wrong formulae
that will produce undesirable results.
- For example, specifying a jump instruction to the wrong procedure or instructing the
computer to display result before any processing has been done.
- Logic errors cannot be detected by the computer.
- The user just finds wrong and unintended results of a process.
- For example:
NetSalary = GrossSalary + Deductions + AidsLevy
The above formula should have been correctly written as
NetSalary = GrossSalary - Deductions - AidsLevy
- An error can also be generated by entering the wrong data type during program execution, for
example, entering a text value where a numeric value is needed.
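The difference between these kinds of error can be demonstrated with a short Python sketch. All function names are assumptions; the built-in compile function is used only to show that the translator rejects code with a syntax error before it runs.

```python
def has_syntax_error(source):
    # The translator refuses code that breaks the grammar of the language.
    try:
        compile(source, "<example>", "exec")
        return False
    except SyntaxError:
        return True


def net_salary_wrong(gross, deductions, aids_levy):
    # Logic error: the code runs, but the formula is wrong.
    return gross + deductions + aids_levy


def net_salary(gross, deductions, aids_levy):
    return gross - deductions - aids_levy


def to_number(text):
    # Wrong data type at run time: text where a number is needed.
    try:
        return float(text)
    except ValueError:
        return None
```

Note that net_salary_wrong raises no error at all; only a person comparing the output with the expected result can notice that it is wrong.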
1. Translator diagnostics.
Each of the commands that are in the original program is looked at separately by the computer
translator to execute it. Each command will have a special word which says what sort of command it
is. The translator looks at the special word in the command and then goes to its dictionary to look it
up. The dictionary tells the translator program what the rules are for that particular special word. If
the word has been typed in wrongly, the translator will not be able to find it in the dictionary and will
know that something is wrong. If the word is there, but the rules governing how it should be used
have not been followed properly, the translator will know that there is something wrong. Either way,
the translator program knows that a mistake has been made, it knows where the mistake is and, often,
it also knows what mistake has been made. A message detailing all this can be sent to the
programmer to give hints as to what to do. These messages are called translator diagnostics.
2. Debugging tools.
These are part of the software which help the user to identify where the errors are. The techniques
available include:
a) Cross-referencing.
This software checks the program that has been written and finds places where particular variables
have been used. This lets the programmer check to make sure that the same variable has not been
used twice for different things.
b) Traces.
A trace is where the program is run and the values of all the relevant variables are printed out, as are
the individual instructions, as each instruction is executed. In this way, the values can be checked to
see where they suddenly change or take on an unexpected value.
c) Variable dumps (check/watch).
At specified parts of the program, the values of all the variables are displayed to enable the user to
compare them with the expected results.
3. Desk checking (dry run.)
The user works through the program instructions manually, keeping track of the values of the
variables. Most computer programs require a very large number of instructions to be carried out, so it
is usual to only dry run small segments of code that the programmer suspects of harbouring an error.
Test strategies are important to establish before the start of testing to ensure that all the elements of a
solution are tested, and that unnecessary duplication of tests is avoided.
Using VB 6.0, if you reach a point in your code that calls another procedure (a function, subroutine,
or the script associated with an object or applet), you can enter (step into) the procedure or run (step
over) it and stop at the next line. At any point, you can jump to the end (step out) of the current
procedure and carry on with the rest of the application.
Break points can be set within program code so that the program stops temporarily to check that it is
operating correctly to that point.
Step Into: Traces through each line of code and steps into procedures. This allows you to view the
effect of each statement on variables.
Step Over: Executes each procedure as if it were a single statement. Use this instead of Step Into to
step across procedure calls rather than into the called procedure.
Step Out: Executes all remaining code in a procedure as if it were a single statement, and exits to the
next statement in the procedure that caused the procedure to be called initially.
DATA TESTING
After a program has been coded, it must be tested with different types of test data to determine if the
intended results are produced. The types of test data that can be used include:
i. Extreme Data: Refers to the minimum and the maximum values in a given range. For example, a
computer program requires the user to enter any number from (between) 1 to 20. 1 and 20 are
extreme data and the computer must accept these. Thus extreme data is accepted by the computer.
ii. Standard (normal) Data: This refers to data that lies within (in-between) a given range. In our
example above, the numbers from 2 to 19 are standard data and are accepted by the computer.
iii. Abnormal Data: This refers to data outside a given range. In our example above, the numbers
0, -1, -50 and all numbers from 21 upwards are abnormal data and must be rejected by the computer.
iv. Valid data: refers to data of the correct data type. Invalid data is data of the wrong data type.
Thus if the user enters the value Terrence instead of a number, this is referred to as a wrong
(invalid) data type. Only numbers are needed, not text.
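The four categories of test data can be illustrated with a minimal Python sketch, assuming the same program that accepts whole numbers from 1 to 20. The helper name `classify_input` is hypothetical and is not part of any library.

```python
def classify_input(raw, low=1, high=20):
    """Classify one piece of test data for a program that accepts low..high."""
    try:
        value = int(raw)
    except ValueError:
        return "invalid"      # wrong data type, e.g. text instead of a number
    if value < low or value > high:
        return "abnormal"     # outside the range, so it should be rejected
    if value in (low, high):
        return "extreme"      # on the boundary, so it should be accepted
    return "standard"         # inside the range, so it should be accepted


for item in ["1", "20", "7", "0", "-50", "21", "Terrence"]:
    print(item, "->", classify_input(item))
```

A test plan would normally include at least one value from each category.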
PROGRAM TESTING
Can be done using the following testing methods:
Unit testing, Integration Testing, User acceptance testing, black box testing, white box testing,
bottom-up testing, top-down testing, etc.
Character
A character is any single digit, letter or symbol that can be represented in a computer, for example, 2,
t, G, %, &, M, space, etc. Each character is represented using binary digits, which the computer can
understand, and therefore takes up a single unit of storage on the computer. Some programming languages
refer to this as a Char. It can be used to represent coded data, e.g. M for Male, F for Female.
Date
Used to represent dates, e.g. date of birth. Dates can be long or short, e.g. dd/mm/yy, dd/mm/yyyy
or dd-MonthName-yyyy. Dates usually take 8 bytes of storage.
In general memory requirements for different data types are as follows:
To use a UDT, you must define a variable "As" the name following the keyword "Type" (in this case,
"EmployeeRecord"). For example:
Dim udtEmpRec As EmployeeRecord
The above defines a variable called "udtEmpRec" which has the attributes defined by the structure
"EmployeeRecord". Thus, it is "udtEmpRec" which you refer to in your procedural statements, NOT
"EmployeeRecord". The following code places data in the individual elements of udtEmpRec:
udtEmpRec.strEmpName = "JOE SMITH"
udtEmpRec.dtmHireDate = #1/15/2001#
udtEmpRec.sngHrlyRate = 25.50
Benefits of defined data-types
The use of data-types (Intrinsic and user-defined) within a programming language has the following
benefits:
- enable the compiler to reserve the correct amount of memory for the data, e.g. 4 bytes for an
integer;
- trap errors that a programmer has made and errors that a user of a program can make, e.g. a
variable defined as an integer cannot be given a fractional value;
- restrict the values that can be given to the data, e.g. a Boolean cannot be given the value
"maybe";
- restrict the operations that can be performed on the data, e.g. a string cannot be divided by 10.
Units Of Data Storage
In general, the units of data storage are as follows:
1 Bit           = 1 or 0
1 Nibble        = 4 Bits (1/2 a Byte)
1 Byte          = 8 Bits
1 Kilobyte (KB) = 1024 Bytes     = 2^10 Bytes
1 Megabyte (MB) = 1024 Kilobytes = 2^20 Bytes
1 Gigabyte (GB) = 1024 Megabytes = 2^30 Bytes
1 Terabyte (TB) = 1024 Gigabytes = 2^40 Bytes
1. Bit
Bit is short for BInary digiT. It is a single digit in base 2, that is, either 1 or 0. A bit is the smallest
unit of data that the computer can process. Therefore a binary number is composed of these two
values only, that is 1 and 0. Bit represents two states, ON or OFF, true or false, or yes or no
2. Byte
A byte is a group of 8 bits representing a character. A character is any digit, letter or symbol that
can be represented in a computer, for example, 2, t, G, %, &, M, space, etc. Each character is
represented using binary digits, which the computer can understand, and therefore takes up a single
unit of storage on the computer. With 8 bits in a byte, you can represent 256 values ranging from
0 to 255:
0 = 00000000
1 = 00000001
2 = 00000010
...
254 = 11111110
255 = 11111111
NB: However, the unit of storage may differ with the architecture of the computer. Some computers have
used bytes of other sizes, and many handle data in larger units of 16, 32 or 64 bits. However, for this
course, we will assume a byte is a group of 8 bits representing a character.
3. Word
A word is a fixed-size group of bits that can be handled as a unit by the processor.
Word size refers to the number of bits that the CPU can simultaneously process, which could be
8 bits, 16 bits, 32 bits or 64 bits. The bits are processed as a unit during input and output. A 64
bit processor can process data faster than a 32 bit processor, thus word size affects processor
speed.
DATA REPRESENTATION
Data is represented in a computer according to its character set. All the characters that a system can
recognise, which often equate to the characters on the keyboard, are called its character set.
A character set (or data representation) can be described as follows:
- Each character is represented using a unique set of bits which are equivalent to 1 or 2 bytes.
- The character set of a computer is represented as binary codes such as ASCII, UNICODE and EBCDIC;
ASCII and EBCDIC use 7 or 8 bits per character.
1. American Standard Code for Information Interchange (ASCII)
It is a set of codes that a computer understands and is represented in a single byte of 7 or 8 bits per
character, which allows communication between systems. ASCII uses 7 bits which gives 128
combinations. However the extended ASCII now uses 8 bits so there are 256 different codes that can
be used and hence 256 different characters. However, this is not quite true, as some of the bits can be
used for parity checks.
The American Standard Code for Information Interchange (ASCII) is widely used in computers of all
types.
ASCII codes are of two types: ASCII-7 and ASCII-8.
ASCII-7 is a 7-bit standard ASCII code. In ASCII-7, the first 3 bits are the zone bits and the next 4
bits are for the digits. ASCII-7 allows 2^7 = 128 combinations. 128 unique symbols are represented
using ASCII-7. ASCII-7 has been modified by IBM to ASCII-8.
ASCII-8 is an extended version of ASCII-7. ASCII-8 is an 8-bit code having 4 bits for zone and 4
bits for the digit. ASCII-8 allows 2^8 = 256 combinations. ASCII-8 represents 256 unique symbols.
ASCII is used widely to represent data in computers.
The ASCII-8 code represents 256 symbols.
Codes 0 to 31 represent control characters (non-printable), because they are used for actions
like, Carriage return (CR), Bell (BEL) etc.
Codes 48 to 57 stand for numeric 0-9.
Codes 65 to 90 stand for uppercase letters A-Z.
Codes 97 to 122 stand for lowercase letters a-z.
Codes 128-255 are the extended ASCII codes.
In the ASCII character set, each binary value between 0 and 127 is given a specific character. Most
computers extend the ASCII character set to use the full range of 256 characters available in a byte.
The upper 128 characters handle special things like accented characters from common foreign
languages.
Standard ASCII defines 128 codes (0 to 127). Computers store text documents, both on disk and
in memory, using these codes. For example, if you use Notepad in Windows OS to create a text file
containing the words, "Four score and seven years ago," Notepad would use 1 byte of memory per
character (including 1 byte for each space character between the words -- ASCII character 32). When
Notepad stores the sentence in a file on disk, the file will also contain 1 byte per character and per
space.
Try this: Open up a new file in Notepad and insert the sentence, "Four score and seven years ago" in
it. Save the file to disk under the name getty.txt. Then use the explorer and look at the size of the file.
You will find that the file has a size of 30 bytes on disk: 1 byte for each character. If you add another
word to the end of the sentence and re-save it, the file size will jump to the appropriate number of
bytes. Each character consumes a byte.
If you were to look at the file as a computer looks at it, you would find that each byte contains not a
letter but a number -- the number is the ASCII code corresponding to the character (see below). So on
disk, the numbers for the file look like this:
Character:   F   o    u    r    (space)  a   n    d    (space)  s    e    v    e    n
ASCII code:  70  111  117  114  32       97  110  100  32       115  101  118  101  110
By looking in the ASCII table, you can see a one-to-one correspondence between each character and
the ASCII code used. Note the use of 32 for a space -- 32 is the ASCII code for a space. We could
expand these decimal numbers out to binary numbers (so 32 = 00100000) if we wanted to be
technically correct -- that is how the computer really deals with things.
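The correspondence between characters and ASCII codes can be checked with a short Python sketch. It uses only the built-in functions `ord`, `format` and `len`; the sample text is the fragment shown in the table above.

```python
text = "Four and seven"

# ord() gives the numeric code stored for each character
codes = [ord(ch) for ch in text]
print(codes)

# The same code written out as an 8-bit binary pattern, e.g. the space character
print(format(ord(" "), "08b"))   # 00100000

# One byte per character: the full sentence occupies 30 bytes in ASCII
sentence = "Four score and seven years ago"
print(len(sentence.encode("ascii")))   # 30
```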
The first 32 values (0 through 31) are codes for things like carriage return and line feed. The space
character is the 33rd value, followed by punctuation, digits, uppercase characters and lowercase
characters.
ASCII codes can just be used for representing characters and not for arithmetic calculations.
ASCII codes also occupy a lot of disc storage space.
2. Binary System
Data is represented in 0s and 1s, thus in base 2. It is obtained by dividing the denary number by 2,
taking the remainders only. The number of bits in the answer does not matter unless specified.
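The repeated-division method can be sketched in Python as follows. The function name `denary_to_binary` is an assumption made for this illustration; the remainders are collected and then read from the last one back to the first.

```python
def denary_to_binary(n):
    """Convert a non-negative denary integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        remainders.append(str(n % 2))   # keep the remainder only
        n //= 2                         # carry on with the quotient
    # The remainders are read from the last one to the first
    return "".join(reversed(remainders))


print(denary_to_binary(78))            # 1001110
print(denary_to_binary(78).zfill(8))   # 01001110 (padded to an 8-bit byte)
```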
3. BCD
Each decimal digit is represented by its own 4-bit binary code as follows:
Decimal digit   BCD code
0               0000
1               0001
2               0010
3               0011
4               0100
5               0101
6               0110
7               0111
8               1000
9               1001
The number 3765 is thus coded as 0011 0111 0110 0101 in BCD.
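A minimal Python sketch of this digit-by-digit coding is given below. The function name `to_bcd` is hypothetical; each decimal digit is converted separately into its own 4-bit group.

```python
def to_bcd(number):
    """Code each decimal digit of a non-negative integer as its own 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(3765))   # 0011 0111 0110 0101
```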
BCD is used to represent some numbers that are not proper numbers (numbers that don't behave
like numbers). A barcode looks like a number, but if the barcodes are added together the result is not
a barcode for any product. The arithmetic does not give a sensible answer. Values like this that look
like numbers but do not behave like them are often stored in binary coded decimal (BCD). Each digit
is simply changed into a four-bit binary number, and these are then placed after one another in order.
- This has the advantage that it is easy to convert a number from BCD to decimal form and vice
versa.
- There is no rounding off numbers when computing fractional numbers, thus no errors due to
rounding off.
- Used in business applications where every significant digit needs to be retained.
However:
- As compared to pure binary, more bits are needed to store a number, thus more memory is
needed.
- Calculations with such numbers are more complex than in pure binary numbers, e.g.
adding 1 and 19:
    0000 0001
  + 0001 1001
  = 0001 1010
The first digit, 1, is wrong and 1010 does not exist in BCD.
The error is caused by the range of codes used for representing data in BCD. BCD uses 4 bits per
digit, which gives 2^4 = 16 combinations. However, the largest digit represented is 9, so six codes are
unused. 6 (0110) has to be added to a 4-bit group if its sum is greater than 9. Thus adding 6 (0110)
to the result above, 0001 1010, gives us 0010 0000, which is 20 in BCD.
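The add-6 correction can be sketched in Python as below. This is an illustrative model only: the function name `bcd_add` and the representation of a number as a list of decimal digits are assumptions, not how any particular processor implements BCD.

```python
def bcd_add(a_digits, b_digits):
    """Add two equal-length lists of BCD digits, most significant digit first.

    Each pair of 4-bit groups is added in binary; 6 (0110) is added
    whenever a group's sum is greater than 9, which skips the six unused codes.
    """
    result = []
    carry = 0
    for a, b in zip(reversed(a_digits), reversed(b_digits)):
        total = a + b + carry
        if total > 9:
            total += 6              # the BCD correction
        carry = total >> 4          # anything above 4 bits carries to the next group
        result.append(total & 0b1111)
    if carry:
        result.append(carry)
    return list(reversed(result))


answer = bcd_add([0, 1], [1, 9])    # 01 + 19
print(" ".join(format(d, "04b") for d in answer))   # 0010 0000
```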
4. EBCDIC
The Extended Binary Coded Decimal Interchange Code (EBCDIC) uses 8 bits (4 bits for zone, 4 bits
for digit) to represent a symbol in the data.
EBCDIC allows 2^8 = 256 combinations of bits.
256 unique symbols are represented using EBCDIC code. It represents decimal numbers (0-9), lower
case letters (a-z), uppercase letters (A-Z), special characters, and control characters
(printable and non-printable, e.g. for cursor movement, printer vertical spacing etc.).
EBCDIC codes are used, mainly, in the mainframe computers.
5. UNICODE
Unicode is a universal character encoding standard for the representation of text which includes
letters, numbers and symbols in multi-lingual environments. In its original form it is an international
16-bit data coding method which represents 2^16 = 65536 different characters (later versions of the
standard allow more). It is enough to represent characters in any language, even Chinese and
hieroglyphics.
A problem arises when the computer retrieves a piece of data from its memory. Imagine that
the data is 01000001. Is this the number 65, or is it A?
They are both stored in the same way, so how can it tell the difference?
The answer is that characters and numbers are stored in different memory locations so it knows
which one it is by knowing whereabouts it was stored.
Signed and Unsigned Numbers
A binary number may be positive or negative. In daily life we use symbols + and - to represent
positive and negative numbers, respectively. However, binary numbers use 0 (for positive) and 1 (for
negative) in the computer. An n-bit signed binary number consists of two parts: sign bit and
magnitude. The leftmost bit (Most Significant Bit (MSB)) is the sign bit. The remaining n-1 bits
denote the magnitude of the number, giving us a sign and magnitude as given below.
In an n-bit unsigned binary number, the magnitude of the number is stored in all n bits. An 8-bit
unsigned number can represent data in the range 0 to 255 (2^8 = 256 values).
Sign and Magnitude representation
01100011 is a positive number since its sign bit is 0
11001011 is a negative number since its sign bit is 1.
An 8-bit signed number in two's complement can represent data in the range -128 to +127 (-2^7 to
+2^7 - 1). In sign and magnitude, 8 bits give the range -127 to +127.
The magnitude of a number is its natural value, regardless of the sign. Thus the magnitude of -25 and
+25 is 25 (not considering the sign in this situation). In binary form, 25 = 11001. Thus using Sign and
Magnitude representation:
+25 = 0 11001
-25 = 1 11001
Only the sign bit differs, not the magnitude.
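A minimal Python sketch of sign and magnitude coding follows. The function name `sign_magnitude` is hypothetical, and the sketch pads the magnitude to a full 8-bit byte, so +25 appears as 0 0011001 rather than the shorter 0 11001 written above.

```python
def sign_magnitude(value, bits=8):
    """Return the sign and magnitude bit pattern of a denary integer."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit in the available bits")
    sign = "1" if value < 0 else "0"                 # 0 = positive, 1 = negative
    return sign + format(magnitude, "0{}b".format(bits - 1))


print(sign_magnitude(25))    # 00011001
print(sign_magnitude(-25))   # 10011001
```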
Representation of Negative Numbers
Negative numbers are mostly represented using the complement of a number. The complement of a number
can be in 1's complement or 2's complement. The complement of a number behaves like the negative
of the original number.
1's Complement
One's complement of a binary number is obtained by simply converting 1s to 0s and 0s to 1s. For
example, given the four-bit binary number 1010₂, its 1's complement becomes 0101₂. The
inverting of bits only applies to negative numbers; positive numbers do not change. For example:
+6 = 00000110₂
-6 = 11111001₂
-6 is the complement (negative) of +6. Just convert 1s to 0s and 0s to 1s to obtain -6 in 1's
complement.
Given the 1's complement we can find the magnitude of the number by taking its 1's complement.
The range of numbers that can be represented in 1's complement is found by the formula:
-(2^(n-1) - 1) to +(2^(n-1) - 1)
If the binary number has 8 bits (n = 8), the range of numbers will be from 10000000₂ (-127) to
01111111₂ (+127).
Therefore the largest number that can be represented in 8-bit 1's complement is 127. The smallest is
-127.
However, 1's complement has the problem that it has two different representations for zero:
00000000₂ and 11111111₂ both represent zero.
When adding binary numbers using 1's complement, the carry bit is added back to the sum in the
rightmost position. There is no overflow as long as the magnitude of the result is not greater than
2^(n-1) - 1. We do not throw away the carry bit.
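The one's complement rules above, including adding the carry bit back into the rightmost position, can be sketched in Python. The names `ones_complement` and `add_ones_complement` are assumptions made for this example, and the sketch is fixed at 8 bits.

```python
BITS = 8
MASK = (1 << BITS) - 1          # 11111111


def ones_complement(value):
    """Return the 8-bit one's complement pattern of a denary integer."""
    if value >= 0:
        return format(value, "08b")             # positive numbers do not change
    return format(~abs(value) & MASK, "08b")    # flip every bit of the magnitude


def add_ones_complement(a_bits, b_bits):
    """Add two 8-bit patterns, adding any carry back in at the right."""
    total = int(a_bits, 2) + int(b_bits, 2)
    if total > MASK:                            # a carry came out of the leftmost bit
        total = (total & MASK) + 1              # end-around carry
    return format(total, "08b")


print(ones_complement(6))     # 00000110
print(ones_complement(-6))    # 11111001
print(add_ones_complement(ones_complement(-6), ones_complement(7)))   # 00000001
```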
2's Complement
Two's complement of a number is obtained as follows:
a) Positive numbers remain the same.
b) Negative numbers: - Change the number to its 1's complement.
- Add 1 to the result and the number will be in 2's complement.
OR
- Working from the right-hand side, copy all the 0s and the first 1
encountered exactly as they are. Then invert every remaining bit to
the left (1 to 0 and 0 to 1) and your number will be in 2's
complement.
For example, in 2's complement,
+6 = 00000110₂
-6 = 11111010₂
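The first method (take the 1's complement, then add 1) can be sketched in Python as below. The function name `twos_complement` is hypothetical and the default width of 8 bits is an assumption for illustration.

```python
def twos_complement(value, bits=8):
    """Return the two's complement bit pattern of a denary integer."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the available bits")
    mask = (1 << bits) - 1
    if value >= 0:
        return format(value, "0{}b".format(bits))    # positive numbers remain the same
    flipped = ~abs(value) & mask                     # step 1: one's complement
    return format((flipped + 1) & mask, "0{}b".format(bits))   # step 2: add 1


print(twos_complement(6))     # 00000110
print(twos_complement(-6))    # 11111010
print(twos_complement(-63))   # 11000001
```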
We can also find the magnitude of a 2's complement number. The largest number that can be
represented in 8-bit 2's complement is 01111111₂ = 127. The smallest is 10000000₂ = -128. The
formula used for the range is
-(2^(n-1)) to +(2^(n-1) - 1)
2's Complement representation of 4-bit number
One way to detect overflow is to check the sign bit of the sum. If x and y have the same sign bit
and the sign bit of the sum does not match it, then there is overflow.
Suppose x and y both have sign bits with value 1. That means both representations represent
negative numbers. If the sum has sign bit 0, then the result of adding two negative numbers
has resulted in a non-negative result, which is clearly wrong. Overflow has occurred.
Suppose x and y both have sign bits with value 0. That means, both representations represent
non-negative numbers. If the sum has sign bit 1, then the result of adding two non-negative
numbers has resulted in a negative result, which is clearly wrong. Overflow has occurred.
This suggests that one way to detect overflow is to compare the sign bits (the most significant
bits) of the two numbers being added with the sign bit of the sum.
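The sign-bit test for overflow can be written as a short Python sketch. The function name `add_with_overflow_check` is an assumption; x and y are passed as 8-bit two's complement patterns held in ordinary integers.

```python
def add_with_overflow_check(x, y, bits=8):
    """Add two two's complement bit patterns and report whether overflow occurred."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)                  # position of the sign bit
    total = (x + y) & mask                  # any carry out of the top bit is discarded
    same_sign_inputs = (x & sign) == (y & sign)
    overflow = same_sign_inputs and (total & sign) != (x & sign)
    return total, overflow


# 100 + 50 = 150, which is too large for 8-bit two's complement
print(add_with_overflow_check(0b01100100, 0b00110010))   # (150, True)
# 6 + (-6) = 0, no overflow even though a carry is thrown away
print(add_with_overflow_check(0b00000110, 0b11111010))   # (0, False)
```

In the first call the returned pattern 10010110 has sign bit 1, so two positive numbers have produced a negative-looking result.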
Number to convert (2's complement):       11111011
Step-1  Complement the number.            00000100
Step-2  Add one and prefix a minus sign.  -00000101
Step-3  Convert binary to decimal.        -5
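The three steps in the table can be followed in a minimal Python sketch. The function name `from_twos_complement` is hypothetical; the bit pattern is passed as a string.

```python
def from_twos_complement(bits):
    """Convert a two's complement bit string to a denary integer."""
    if bits[0] == "0":
        return int(bits, 2)                       # positive numbers are read directly
    # Step 1: complement the number
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    # Step 2: add one and prefix a minus sign; Step 3: int() gives the denary value
    return -(int(flipped, 2) + 1)


print(from_twos_complement("11111011"))   # -5
print(from_twos_complement("00000110"))   # 6
```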
When the addition of two values results in a carry, the carry bit is ignored and is thrown away. There
is no overflow as long as the result is not greater than 2^(n-1) - 1 nor less than -(2^(n-1)).
The two's-complement system has the advantage that the fundamental arithmetic operations of
addition, subtraction, and multiplication are identical to those for unsigned binary numbers (as long
as the inputs are represented in the same number of bits and any overflow beyond those bits is
discarded from the result). This property makes the system both simpler to implement and capable of
easily handling higher precision arithmetic. Also, zero has only a single representation, unlike in
one's complement where it has two.
Binary arithmetic
The arithmetic operations (addition, subtraction, multiplication and division) performed on
binary numbers are called binary arithmetic. The basic operations covered here are:
- Binary conversion,
- Binary addition, and
- Binary subtraction.
Binary Conversion
This involves converting a number between binary form and denary (base 10), octal (base 8) or
hexadecimal (base 16).
(a) Conversion from decimal (denary) to Binary
Divide the denary number by 2, listing the remainders until the answer is 0 remainder 1.
Take the remainders only from the last one until the first.
For example:
= 116₈
(d) Octal to decimal
Power:          8^2   8^1   8^0
Equivalent to:  64    8     1
Octal digits:   1     1     6
= (64 x 1) + (8 x 1) + (1 x 6) = 78
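The place-value calculation can be sketched in Python as follows. The function name `octal_to_denary` is an assumption; each digit is multiplied by the power of 8 for its column and the products are added.

```python
def octal_to_denary(octal_digits):
    """Convert a string of octal digits to denary using place values."""
    total = 0
    # The rightmost digit is in the 8^0 column, the next in 8^1, and so on
    for position, digit in enumerate(reversed(octal_digits)):
        total += int(digit) * 8 ** position
    return total


print(octal_to_denary("116"))   # 78
```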
(e) Decimal to hexadecimal
Hexadecimal means base 16.
A hexadecimal number uses digits with values from 0 to 15.
However, the values 10 to 15 are represented by the uppercase alphabetic characters A to F
respectively.
The table below illustrates this:
Decimal Number:          0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
Hexadecimal Equivalent:  0  1  2  3  4  5  6  7  8  9  A   B   C   D   E   F
As with binary, take the number, divide it by 16 and take the remainders only, e.g.
209 to hexadecimal will be expressed as:
Power:          16^1     16^0
Equivalent to:  16       1
Hex digits:     13 (D)   1
so 209 = (16 x 13) + (1 x 1) = D1 in hexadecimal.
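The divide-by-16 method can be sketched in Python. The function name `denary_to_hex` and the lookup string `HEX_DIGITS` are assumptions made for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"   # values 10 to 15 are written as A to F


def denary_to_hex(n):
    """Convert a non-negative denary integer to hexadecimal by repeated division by 16."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(HEX_DIGITS[n % 16])   # the remainder gives the next digit
        n //= 16
    return "".join(reversed(digits))        # read the remainders from last to first


print(denary_to_hex(209))   # D1
```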
Take the binary digits and join them into one binary number.
NB: Pupils should be able to add and subtract hexadecimal numbers, which were left out in this
module. Care should be taken with the carry if the answer after adding exceeds 15. Bear in mind
also that the decimal numbers 10, 11 to 15 are represented by the letters A, B to F respectively.
Binary Addition
The table below illustrates procedure for binary addition, just like the addition of normal figures.
Binary subtraction
Subtraction of binary numbers follows the principles laid down in the following table:
Hexadecimal   Binary (4-bit)
0             0000
1             0001
2             0010
3             0011
4             0100
5             0101
6             0110
7             0111
e.g
Express the denary number 78 as:
(i) a binary number stored in an 8 bit byte,
- Divide 78 by 2 repeatedly and take the remainders, to get 1001110. This answer has 7 bits but you
are required to give your answer in 8 bits. Add a 0 to the leftmost side to produce
01001110₂, which is the correct answer.
To express -63 in 2's complement: 63 in binary form gives 111111, which has 6 bits instead of 8.
Leading 0s are added to the left to make 8 bits, giving 00111111. Change 0s to 1s and 1s to 0s,
which gives 11000000. Add 1 to the number to give 11000001, which is the 2's
complement representation of -63.
There is no memory space for the binary point itself. However, computers represent a finite number of
digits. This limitation allows us to evaluate the maximum and minimum possible numbers that can be
represented. These include:
- Smallest Magnitude Negative Number (integer): negative numbers start with a 1. This gives us
11111111 = -1.
- Smallest Magnitude Negative Number (fraction): negative numbers start with a 1. This gives us
1.1111111 = -1/2^7 = -0.0078125.
The binary point is fixed at one position and therefore does not move. In binary we can have
fractional column headings.
Column heading:  2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0   .   2^-1   2^-2
Value:           128   64    32    16    8     4     2     1     .   0.5    0.25
Bit:             1     0     1     0     0     1     0     0     .   1      1
= 164.75

Binary Fraction   Fraction   Decimal
0.1               1/2        0.5
0.01              1/4        0.25
0.001             1/8        0.125
0.0001            1/16       0.0625
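The column headings can be applied in a minimal Python sketch. The function name `fixed_point_value` is hypothetical; it handles unsigned fixed point numbers written with a binary point.

```python
def fixed_point_value(bits):
    """Evaluate an unsigned binary number written with a binary point, e.g. '101.11'."""
    whole, _, fraction = bits.partition(".")
    value = int(whole, 2) if whole else 0
    # Columns after the point are worth 1/2, 1/4, 1/8, ...
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -position
    return value


print(fixed_point_value("10100100.11"))   # 164.75
print(fixed_point_value("0.0001"))        # 0.0625
```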
Step-1  Calculate the binary equivalent of the positive number.
Step-2  Change 0s to 1s and 1s to 0s (complement).
Step-3  Add 1 to the result.
The floating point representation of a number has two parts: mantissa and exponent. The mantissa is
a signed fixed point number. The exponent shows the position of the binary point in the mantissa.
For example, the binary number +11001.11 with an 8-bit mantissa and 6-bit exponent is represented
as follows: Mantissa is 01100111. The leftmost 0 indicates that the number is positive.
Exponent is 000101. This is the binary equivalent of decimal number +5.
The floating point number is Mantissa x 2^exponent, i.e. +(.1100111) x 2^(+5).
Example: consider the binary number 10111. This could be represented by 0.10111 x 2^5 or, with the
exponent written in binary, 0.10111 x 2^101. Here 0.10111 is the mantissa and 101 is the exponent.
Thus, in binary, 0.00010101 can be written as 0.10101 x 2^(-11), where 0.10101 is the mantissa and -11
(binary, i.e. -3) is the exponent.
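A floating point pattern can be decoded with the Python sketch below. It assumes, as the later examples in these notes do, that both mantissa and exponent are held in two's complement and that the binary point sits immediately after the sign bit of the mantissa. The names `twos` and `decode_float` are hypothetical.

```python
def twos(bits):
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def decode_float(mantissa_bits, exponent_bits):
    """Return mantissa x 2^exponent for the given bit patterns."""
    # Dividing by 2^(n-1) places the binary point just after the sign bit
    mantissa = twos(mantissa_bits) / 2 ** (len(mantissa_bits) - 1)
    exponent = twos(exponent_bits)
    return mantissa * 2 ** exponent


print(decode_float("01100111", "000101"))      # 25.75, i.e. 11001.11 in binary
print(decode_float("01000000", "00000000"))    # 0.5
```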
It is now clear that we need to be able to store two numbers, the mantissa and the exponent. This
form of representation is called floating point form. Numbers that involve a fractional part, like
2.467₁₀ and 101.0101₂, are called real numbers.
Give the denary number which would have 01000000 00000000 as its binary, floating point
representation in this computer.
The answer is 0.5 (or 1/2) because it will be 0.1 x 2^0.
It is not possible to represent zero as a normalised floating point number because a normalised value
must have the first two bits of the mantissa different. Therefore one of them must be a 1, which must
represent either -1 or +1/2, but not zero.
- For a positive number, there must be NO leading 0s to the left of the MSB, excluding the sign bit.
With positive numbers, the binary point in the mantissa was always placed immediately
before the first non-zero digit because it allows us to use the maximum number of digits.
Suppose we use 8 bits to hold the mantissa and 8 bits to hold the exponent. The binary number
10.11011 becomes 0.1011011 x 2^10 (the exponent 10 is in binary, i.e. +2) and can be held as
mantissa 01011011 and exponent 00000010.
The first digit of the mantissa is zero and the second is one. The mantissa is said to be normalised if
the first two digits are different. Thus, for a positive number, the first digit is always zero and the
second is always one. The exponent is always an integer and is held in two's complement form.
Now consider the binary number 0.00000101011 which is 0.101011 x 2^(-101). Thus the mantissa is
0.101011 and the exponent is -101 (binary, i.e. -5). Again, using 8 bits for the mantissa and 8 bits
for the exponent, we have mantissa 01010110 and exponent 11111011.
Benefits of Normalisation
- Ensures that a single representation of a number is maintained (standardisation).
- Ensures that the maximum possible accuracy with a given number of bits is maintained.
- Can be used to detect error conditions such as underflow and overflow.
- Tries to maximise the range of numbers that can be represented
(range and accuracy are limited in fixed point representation).
Range and Precision/ Accuracy and Range
The size of the exponent determines the range of numbers that can be represented
The range of numbers is expanded by increasing the number of bits that are used to represent
the exponent. This will however decrease precision.
Reducing the number of bits in the exponent will reduce the range because power of two
which the mantissa is multiplying by is decreased.
At the same time decreasing the exponents bits will increase accuracy because more digits
are represented after the binary point.
The size of the significand (mantissa) determines the precision of the numbers that can be represented.
Precision can be increased by increasing the number of bits that are used to represent the
significand.
This will decrease the range.
The only way to increase both range and precision is to use more bits,
that is, the use of single-precision numbers, double-precision numbers, etc.
If we use more bits for the mantissa we will have to use fewer bits for the exponent. Let us start off
by using 8 bits for the mantissa and 8 bits for the exponent for explanations below:
The largest positive value we can have for the mantissa is 0.1111111 and
the largest positive number we can have for the exponent is 01111111.
This means that we have 0.1111111 x 2^01111111 = 0.1111111 x 2^127.
This means that the largest positive number is almost 1 x 2^127.
Also:
The smallest positive mantissa is 0.1000000 and
the smallest exponent is 10000000.
This represents 0.1000000 x 2^10000000 = 0.1000000 x 2^(-128), which is very close to zero; in fact
it is 2^(-129).
On the other hand:
The largest negative number (i.e. the negative number closest to zero) is 1.0111111 x 2^10000000
= -0.1000001 x 2^(-128).
We cannot use 1.1111111 for the mantissa because it is not normalised. The first two digits
must be different.
Furthermore:
The smallest negative number (i.e. the negative number furthest from zero) is 1.0000000 x
2^01111111 = -1.0000000 x 2^127 = -2^127.
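These four limits can be checked exactly with a short Python sketch using the standard `fractions` module. It assumes the 8-bit two's complement mantissa and 8-bit two's complement exponent described above; the function names `twos` and `float_value` are hypothetical.

```python
from fractions import Fraction


def twos(bits):
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def float_value(mantissa_bits, exponent_bits):
    """Exact value of mantissa x 2^exponent, binary point after the sign bit."""
    mantissa = Fraction(twos(mantissa_bits), 2 ** (len(mantissa_bits) - 1))
    return mantissa * Fraction(2) ** twos(exponent_bits)


largest_positive = float_value("01111111", "01111111")
smallest_positive = float_value("01000000", "10000000")
largest_negative = float_value("10111111", "10000000")
smallest_negative = float_value("10000000", "01111111")

print(smallest_positive == Fraction(1, 2 ** 129))   # True
print(smallest_negative == -(2 ** 127))             # True
print(largest_positive < 2 ** 127)                  # True
```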
Zero cannot be represented in normalised form. This is because 0.0000000 is not normalised because
the first two digits are the same. A normalised value must have the first two bits of the mantissa
different. Therefore one of them must be a 1, which must represent either -1 or +1/2, but not zero.
Usually, the computer uses the smallest positive number to represent zero.
Also, the size of a number refers to its position on the number line: the further to the right, the
bigger the number, so -1 is a bigger number than -2.
Whereas, if we talk about the largest magnitude negative number, then -2 has a greater magnitude than -1
because the integer value is greater (not considering the sign).
For a positive number n,
2^(-129) ≤ n < 2^127
For a negative number n,
-2^127 ≤ n < -2^(-129)
Range: Allowable values for a register representation of data, starting from the lowest (minimum)
allowable to the highest (maximum) allowable values. It can also be defined as the difference
between the lowest and the highest acceptable values.
Accuracy and Errors
Floating and fixed point numbers will be accurate to the smallest number they can represent.
Accuracy: is the degree of closeness (nearness) of measurements of a quantity to that quantity's
actual (true) value.
Precision: is the degree to which further measurements or calculations show the same or
similar result. Precision is a measure of reliability (repeatedly giving the same value).
More bits for the mantissa means the number will be more precise, i.e. it will have more significant
figures.
To increase accuracy within a fixed number of bits, you can:
- increase the number of bits used for the mantissa,
- and so reduce the number of bits for the exponent.
However, the range of numbers represented is reduced because the size of the index of the power of
two is reduced.
Round-Off Errors
Rounding: Expressing a number to the nearest whole/decimal/binary number. For example 3.9569
rounded to 3 decimal places is 3.957. If rounded to the nearest whole number, it becomes 4.
Often we cannot represent a denary fraction exactly even if we allow many bits in memory.
Therefore the number stored is "rounded off" to the closest possible binary equivalent.
In rounding, the least significant bit kept may be increased depending on the digits removed. The result
should represent the value that is nearest to the original value, e.g. (in binary)
100.1 = 100 (2 s.f.)
10101 = 11000 (2 s.f.)
11011 = 11000 (2 s.f.)
11.101 = 100 (2 s.f.)
1100 = 1100 (3 s.f.)
1101 = 1110 (3 s.f.)
111.1101 = 111.111 (6 s.f.)
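The earlier point that a denary fraction often cannot be stored exactly can be seen in a minimal Python sketch. It assumes 8 bits after the binary point and a value between 0 and 1 that does not round up to 1; the function name `round_to_bits` is hypothetical.

```python
from fractions import Fraction


def round_to_bits(x, bits):
    """Round a fraction 0 <= x < 1 to the nearest multiple of 2^-bits.

    Returns the bits after the binary point and the exact value stored.
    """
    steps = round(x * 2 ** bits)            # nearest whole number of 2^-bits steps
    return format(steps, "0{}b".format(bits)), Fraction(steps, 2 ** bits)


pattern, stored = round_to_bits(Fraction(1, 10), 8)
print("0." + pattern)                         # 0.00011010
print(float(stored))                          # 0.1015625
print(float(stored - Fraction(1, 10)))        # 0.0015625 (the round-off error)
```

The denary value 0.1 has no exact binary equivalent, so the closest 8-bit pattern is stored and a small round-off error remains.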
Overflow
Underflow
An underflow is produced when a result is smaller in magnitude than the smallest number
that can be represented.
It occurs when a small number is divided by a large number or
when small numbers are multiplied together.
There exists the largest and smallest positive value. Around 0, there exists a range of values
that cannot be represented (stored) and this is underflow.
NB: Overflow and underflow occur when a result of calculations falls outside the range of
values permitted by the representation of the number.
Practice Questions
1. (a) Describe how characters are stored in a computer. (3)
b) Explain what is meant by an integer data type. (2)
c) State what is meant by Boolean data. (1)
2. a) Express the number 113 (denary) in
(i) binary
(ii) in BCD
using an appropriate number of bytes in each case. (4)
b) Using the answer obtained in part (a) show how 113 (denary) can be expressed in
(i) octal
(ii) hexadecimal. (4)
3. Explain how the denary number 27 can be represented in binary in
(i) sign and magnitude
(ii) twos complement
notation, using a single byte for each answer. (4)
4. (a) Add together the binary equivalents of 34 and 83, using single byte arithmetic, showing your
working. (3)
(b). Describe a floating point representation for real numbers using two bytes. (4)
5. a) Explain how the fraction part of a real number can be normalised. (2)
b) State the benefit obtained by storing real numbers using normalised form. (1)
6. a) A floating point number is represented in a certain computer system in a single 8 bit byte. 5 bits
are used for the mantissa and 3 bits for the exponent. Both are stored in twos complement form and
the mantissa is normalised.
(i) State the smallest positive value,
(ii) state the most negative value
that can be stored. Give each answer as an 8 bit binary value and as a decimal equivalent. (4)
b) Explain the relationship between accuracy and range when storing floating point representations of
real numbers. (4)
7. (a) Express the denary number -95 as a twos complement integer in an eight-bit byte. [2]
(b) Add together the following binary numbers. Show your working.
0 1 1 0 0 1 1 0 and 0 0 1 0 0 1 0 1
[2]
8. Part of the information stored in the data dictionary describes the type of data which is being
stored.
A particular piece of data is 10010110.
State what the data stands for if the data dictionary describes it as:
14. A computer stores fractional numbers in floating point binary representation. Five bits are used
for the mantissa and three bits for the exponent. All values are stored in twos complement form.
(a) By using a diagram of this representation, state the value of each of the bits. [4]
(b) By using 2 as an example, explain how real numbers can be shown in normalised form in this
representation. [3]
(c) State the floating point binary value of - in this representation. [2]
15. A computer stores numbers in floating point form, using 8 bits for the mantissa and 8 bits for the
exponent. Both the mantissa and the exponent are stored in twos complement form.
(a) Explain the effect on the
range
accuracy
of the numbers that can be stored if the number of bits in the exponent is reduced. [4]
(b) Give the denary number which would have 01000000 00000000 as its binary, floating point
representation in this computer. [2]
(c) Explain why it is not possible to represent zero as a normalised floating point number. [2]
(c) Registers:
- This is a high-speed storage area in the CPU used to temporarily hold small units of program
instructions and data immediately before, during and after execution by the CPU.
- It is a small amount of storage available on the CPU whose contents can be accessed more
quickly than storage available elsewhere
- Registers are special memory cells that operate at very high speed. They provide the fastest
way for a CPU to access data.
- The CPU contains a number of registers and each has a predefined function
- Most modern computer architectures operate by moving data from main memory into registers,
operating on it, then moving the result back into main memory
- Register size determines how much information it can store
- The size of a register is given in bytes, i.e. a register can be one, two, four or eight bytes wide
- The processor contains a number of special purpose registers (which have dedicated uses) and
general purpose registers (which may be used for arithmetic function and are a sort of working
area)
- The main types of registers (special purpose registers) found in the Von Neumann Machine are as
given below:
program counter
memory address register
memory data register/memory buffer register
current instruction register
index register
Index register
It is a register used for modifying operand addresses during program execution.
Used in performing vector/array operations.
Used for indexed addressing, where an immediate constant (i.e. one which is part of the instruction
itself) is added to the contents of the index register to form the address of the actual operand
or data
How and why is the index register (IR) used
- used in indexed addressing
- stores a number used to modify an address which is given in an instruction
- allows efficient access to a range of memory locations by incrementing the value in the IR
e.g. used to access an array
- It stores an integer value
- The integer value is added to the base address in the instruction
- Used for the successive reading of values from memory locations e.g. in an array
- Can be incremented after use
General Purpose Registers
Accumulator
- A general purpose register used to accumulate results of processing
- it is where the results from other operations are stored temporarily before being used by
other processes.
- Available to the programmer and referenced in assembly language programs
- Used for performing arithmetic functions
Flags Register: Used to record the effect of the last ALU operation
Decode
The instruction in the CIR can now be split into two parts, the address and the operation.
The address part can be placed in the MAR and the data fetched and put in the MDR.
Execute
The contents of both the memory address register and the memory data register are sent
together to the central processor. The central processor contains all the parts that do the
calculations, the main parts being the CU (control unit) and the ALU (arithmetic logic unit);
there are other parts of the central processor which have specific purposes as well.
The ALU will keep referring back to where the data and instructions are stored while it is
executing them. The MDR acts like a buffer, storing the data until it is needed.
The CU will then follow the instructions, which tell it where to fetch the data from. It will
read the data and send the necessary signals to other parts of the computer.
1. Load the address that is in the program counter (PC) into the memory address register
(MAR).
2. Increment the PC by 1.
3. Load the instruction that is in the memory address given by the MAR into the MDR.
4. Load the instruction that is now in the MDR into the current instruction register (CIR).
5. Decode the instruction that is in the CIR.
6. If the instruction is a jump instruction then
a) Load the address part of the instruction into the PC
b) Reset by going to step 1.
7. Execute the instruction.
8. Reset by going to step 1.
The first step simply places the address of the next instruction into the memory Address Register so
that the control unit can fetch the instruction from the correct part of the memory. The program
counter is then incremented by 1 so that it contains the address of the next instruction, assuming that
the instructions are in consecutive locations.
The memory data register is used whenever anything is to go from the central processing unit to main
memory, or vice versa. Thus the next instruction is copied from memory into the MDR and is then
copied into the current instruction register.
Now that the instruction has been fetched the control unit can decode it and decide what has to be
done. This is the execute part of the cycle. If it is an arithmetic instruction, this can be executed and
the cycle restarted as the PC contains the address of the next instruction in order. However, if the
instruction involves jumping to an instruction that is not the next one in order, the PC has to be
loaded with the address of the instruction that is to be executed next. This address is in the address
part of the current instruction, hence the address part is loaded into the PC before the cycle is reset
and starts all over again.
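The cycle described above can be sketched as a small simulation. This is a toy machine invented for illustration, not a real instruction set: each memory cell holds either an (opcode, address) pair or a plain number, and the opcodes LDA, ADD, STO, JMP and HLT are assumptions (HLT in particular is added only so that the loop can stop).

```python
def run(memory, max_cycles=100):
    """Run a toy program and return (accumulator, memory)."""
    pc = 0    # program counter
    acc = 0   # accumulator
    for _ in range(max_cycles):
        mar = pc                 # 1. address in the PC is copied to the MAR
        pc += 1                  # 2. the PC is incremented
        mdr = memory[mar]        # 3. word at address MAR is loaded into the MDR
        cir = mdr                # 4. the MDR is copied into the CIR
        opcode, address = cir    # 5. decode: split into operation and address
        if opcode == "JMP":      # 6. jump: the address part goes into the PC
            pc = address
            continue             #    reset: back to step 1
        if opcode == "HLT":      # stop the toy machine (not part of the notes)
            break
        # 7. execute: operands travel through the MAR and MDR
        mar = address
        if opcode == "LDA":
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":
            mdr = memory[mar]
            acc += mdr
        elif opcode == "STO":
            mdr = acc
            memory[mar] = mdr
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return acc, memory


program = [
    ("LDA", 6),   # 0: load the value at address 6
    ("ADD", 7),   # 1: add the value at address 7
    ("JMP", 4),   # 2: jump over the next instruction
    ("ADD", 7),   # 3: never executed
    ("STO", 8),   # 4: store the accumulator at address 8
    ("HLT", 0),   # 5: halt
    20,           # 6: data
    22,           # 7: data
    0,            # 8: the result is stored here
]
acc, final_memory = run(program)
print(acc, final_memory[8])   # 42 42
```

The jump at address 2 shows why the PC must be overwritten: without it, the incremented PC would point at address 3 and the second ADD would run.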
Memory Unit
Is the computer memory that temporarily stores the operating system, application programs and data
currently in use.
It is used to store the following:
Program instructions in current use;
Data in current use;
Parts of Operating System that are currently in use.
Some architectures have a Memory Unit (Main memory) which has two types: RAM and ROM.
Buses
A bus is a pathway through which data and signals are transferred from one device to another.
They are a set of parallel wires connecting two or more components of the computer.
Buses can be internal or external.
Buses can be generally referred to as the system bus, which connects the CPU, memory and I/O
devices.
Each bus is a shared transmission medium, so that only one device can transmit along a bus at
any one time.
Multiple devices can be connected to the same bus
Diagram
Data and control signals travel in both directions between the processor, memory and I/O
controllers.
Addresses, on the other hand, travel only one way along the address bus: the processor sends
the address of an instruction, or of data to be stored or retrieved, to memory or to an I/O
controller.
Address bus:
Used for transferring memory addresses from the processor when it is accessing main
memory
It is used to access memory during the read or write process
The width of the address bus determines the maximum possible memory capacity of the
computer.
This is a uni-directional bus (one way). The address is sent from the CPU to memory and I/O
ports only.
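The link between address bus width and memory capacity is a power of two: a bus with n lines can carry 2^n different addresses. A minimal sketch, with an illustrative function name:

```python
def addressable_locations(address_bus_width):
    """Number of distinct addresses an address bus of this width can carry."""
    return 2 ** address_bus_width


for width in (16, 20, 32):
    print(width, addressable_locations(width))
# 16 65536
# 20 1048576
# 32 4294967296
```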
Control bus:
The purpose of the control bus is to transmit command, timing and specific status
information between system components. Timing signals indicate the validity of data and
address information. Command signals specify operations to be performed. Specific
status signals indicate the state of a data transfer request, or the status of a request by a
component to gain control of the system bus
This is a bi-directional bus used for carrying control signals (Signals can be transferred in
both directions).
They carry signals to enable outputs of addressed port and memory devices
Control signals regulate activities on the bus.
Control buses transmit command, timing and status information between computer
components.
Typical control signals are:
Memory Read
Memory Write
I/O Read
I/O Write
Interrupt Request
Interrupt Grant
Reset
Ready hold
etc
Timing signals: indicate validity of data and information.
Command signals: Specify operations to be performed
Status signals: Indicate state of data transfer request or status of a request.
INTERRUPTS
Interrupt: it is a signal generated by a device or software, which may cause a break in the execution
of the current routine
An interrupt is a signal sent to the processor by a peripheral or software so that attention is turned to
that peripheral/software, thereby causing a break in the execution of a program, e.g. printer out of
paper.
Control is transferred to another routine and the original routine is resumed after the interrupt has been serviced.
Interrupt Service Routine (Handler): a small subprogram that is called when an interrupt occurs and
handles the interrupt.
Interrupt priorities
Interrupts have different priorities
This is important if two interrupts are received simultaneously, so that the processor can decide which
one is more important to service first.
There are four levels of priority, which are (highest priority first):
- Hardware Failure: can be caused by power failure or memory parity error.
- Program Interrupts: Arithmetic overflow, division by zero, etc
- Timer Interrupts: generated by the internal clock
- I/O Interrupts: generated by the I/O devices
Interrupt Handling
At the end of each Fetch-Execute cycle, the contents of the interrupt registers are checked.
Should there be an interrupt, the following steps will typically be taken:
a) The current fetch-decode-execute cycle is completed
b) The operating system halts current task
c) The contents of the PC and other registers will be stored safely in a stack.
d) The highest priority interrupt is identified. Interrupts with a lower priority are disabled.
e) The source of the interrupt is identified.
f) The start address of the interrupt handler is loaded into the PC.
g) The interrupt handler is executed.
h) Interrupts are enabled again, and the cycle will restart with any further interrupts.
i) The PC and other registers are popped from the stack and restored.
j) The user's program resumes with the next step in its cycle.
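The save, service and restore pattern in these steps can be sketched as follows. This is a simplified model, not how any particular operating system does it: the handler start addresses are hypothetical, and "running" a handler is represented by recording which one would be called.

```python
# Lower number = higher priority, in the order given in the notes.
PRIORITY = {"hardware_failure": 0, "program": 1, "timer": 2, "io": 3}

# Hypothetical start addresses of the handler routines.
HANDLERS = {"hardware_failure": 9000, "program": 9100, "timer": 9200, "io": 9300}


def handle_interrupts(pc, registers, pending):
    """Service all pending interrupts, then return the restored state."""
    stack = []
    serviced = []
    stack.append((pc, dict(registers)))        # save the PC and registers
    for source in sorted(pending, key=lambda s: PRIORITY[s]):
        pc = HANDLERS[source]                  # load the handler address into the PC
        serviced.append((source, pc))          # stand-in for executing the handler
    pc, registers = stack.pop()                # restore the PC and registers
    return pc, registers, serviced


pc, registers, serviced = handle_interrupts(
    120, {"ACC": 7}, ["io", "timer", "hardware_failure"]
)
print(pc, registers)   # 120 {'ACC': 7}
print(serviced)        # hardware_failure first, then timer, then io
```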
- When dealing with an interrupt, the computer has to know which interrupt handler to call for
which interrupt.
- One method of doing this is known as the vectored interrupt mechanism.
- In this approach a complete list of interrupts and the memory address of their handler is stored in a
table called the interrupt vector table.
- The interrupt supplies an offset number, which identifies the interrupt uniquely.
- This offset is added to a base term, and the resultant number is the memory address of a pointer to
the memory location of the handler routine.
- This is explained in the example below:
- If the interrupt 002 is received, the base number 5000 is added to it, which allows the processor to
know that the handler can be found by opening the data stored at address 5002.
- The address 5002 simply stores a pointer to another memory location, 6280, where the actual
handler routine begins.
- The advantage of this approach is that each interrupt only needs to give the processor an offset
number, such as 002, and the processor can determine from that the correct memory location to
use. This is more efficient than the interrupt sending the full memory address itself. This approach
also allows the interrupt routines to be stored anywhere in the memory, with the pointer table
updated to reflect if a handler routine is moved
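The lookup in this example can be written out in a few lines. The base 5000, offset 002 and handler address 6280 come from the example above; the other two table entries are made up to fill out the table.

```python
BASE = 5000

# Interrupt vector table: each address holds a pointer to a handler routine.
# Only the entry at 5002 is taken from the notes; the others are hypothetical.
vector_table = {5000: 6100, 5001: 6200, 5002: 6280}


def handler_start(offset):
    """Return the start address of the handler for the given interrupt offset."""
    pointer_address = BASE + offset        # e.g. 5000 + 2 = 5002
    return vector_table[pointer_address]   # the pointer stored there, e.g. 6280


print(handler_start(2))   # 6280
```

If a handler routine is moved, only its entry in vector_table changes; the offset supplied by the interrupt stays the same.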
Types of interrupts
Input / output interrupts: generated by the I/O devices, e.g. disk full, printer out of paper, etc.
Interrupts generated by a running process: the process may need more storage or need to communicate with
the operator
Timer interrupts: generated by the processor clock, e.g. control being transferred to another user in
a time sharing system
Program check interrupts: caused by errors like division by zero
Machine check interrupts: caused by malfunctioning hardware.
Clock interrupts: occur normally in time sharing systems, where the clock transfers control from one
user to another.
Why interrupts are used in a computer system
- to obtain processor time for a higher priority task
- to avoid delays
- to avoid loss of data
- as an indicator to the processor that a device needs to be serviced
- allows computer to shut down if the power off interrupt predicts loss of power, saving data in
time
Sources of interrupts
- power failure/system failure
- peripheral e.g. printer (buffer empty)/hardware
- clock interrupt
- user interrupt e.g. new user log on request
- software
Vectored Interrupts
A specific number assigned to each interrupt is called an interrupt vector. Each interrupt is numbered.
The interrupt vector is used to call the interrupt handler.
Addresses of interrupt service routines are stored in an array (known as the interrupt dispatch table) and the
interrupt vector is used as a subscript to this array.
Buffer
- Buffer: This is a temporary memory store for data awaiting processing or output,
compensating for the difference in speed at which devices operate, for example a printer buffer.
- A buffer is memory in the interface between two devices which temporarily stores data
that is being transmitted from one device to another
- A buffer is a small amount of fast memory outside the processor that allows the processor to
get on with other work instead of being held up by the secondary device.
- The buffer is necessary if the two devices work at different speeds
- Buffering is appropriate where an output device processes data more slowly than the processor. For
example, the processor sends data to the printer, which prints much more slowly, and the processor
does not need to wait for the printer to finish printing in order to carry out the next task.
- It therefore saves the data in a buffer, from where it will be retrieved by the printer.
- Buffering usually matches devices that work at different speeds, e.g. processor and disk.
- Sometimes a device is already busy executing some instructions.
- Example: there are three printing jobs, but the printer can print only one job at a time. The OS
places the next two jobs in a buffer, a process which is also known as spooling
- Buffers are a main component of Memory
- The printer buffer is one of the most common types of buffer.
Reasons for using printer buffers:
Stores data or information being sent to the printer temporarily.
Compensates for difference in speed of CPU and printer.
Allows CPU to carry out other tasks whilst printer is printing.
Benefits of increasing size of buffer in a printer:
Reduces the number of data transfers to the printer.
Ensures a more efficient use of the CPU.
Larger files can be sent to the printer without problems
Use of buffers and interrupts in the transfer of data between primary memory and hard disk.
Buffer is temporary storage area for data
Data transferred from primary memory to buffer (or vice versa)
When buffer full, processor can carry on with other tasks
Buffer is emptied to the hard disk
When buffer empty, interrupt sent to processor requesting more data to be sent to buffer.
Works according to priorities
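The interaction between the buffer and interrupts can be modelled with a simple loop. This is a sketch under simplifying assumptions: the buffer is filled completely, then emptied completely, and an interrupt is counted each time the empty buffer needs more data. The function and variable names are illustrative.

```python
from collections import deque


def transfer(data, buffer_size):
    """Copy data to 'disk' through a buffer; return (disk, interrupt_count)."""
    buffer = deque()
    disk = []
    interrupts = 0
    position = 0
    while position < len(data) or buffer:
        # The processor fills the buffer, then is free to do other tasks.
        while position < len(data) and len(buffer) < buffer_size:
            buffer.append(data[position])
            position += 1
        # The buffer is emptied to the hard disk.
        while buffer:
            disk.append(buffer.popleft())
        # Buffer empty: an interrupt asks the processor for more data.
        if position < len(data):
            interrupts += 1
    return disk, interrupts


items = list(range(10))
print(transfer(items, 4)[1])   # 2 interrupts with a small buffer
print(transfer(items, 8)[1])   # 1 interrupt with a larger buffer
```

The two runs illustrate why a larger buffer reduces the number of data transfers and the number of times the processor is interrupted.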
Cache Memory
- A cache is a small and very high speed memory used to speed up the transfer of data and
instructions, doubling the speed of the computer in some cases.
- It can be located inside or close to the CPU chip
- it is placed between the CPU and the main memory.
- It stores frequently or most recently used instructions and data
- It is faster than RAM
- The data and instructions that are most recently or most frequently used by CPU are
stored in cache memory.
- it is used to increase the speed of processing by making current programs and data
available to the CPU at a rapid rate
- CPU processes data faster than main memory access time, thus processing speed is
limited primarily by the speed of main memory.
- It compensates the speed difference between the main memory access time and processor
logic.
- The cache is thus used for storing segments of programs currently being executed in the
CPU and temporary data frequently needed in the present calculations
- The amount of cache memory is generally between 1 KB and 512 KB
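The effect of keeping recently used items in a small fast store can be sketched with a toy cache. The replacement rule used here (discard the least recently used item) is an assumption chosen for simplicity; real caches use a variety of schemes.

```python
from collections import OrderedDict


def count_hits(addresses, cache_size):
    """Return (hits, misses) for a sequence of memory accesses."""
    cache = OrderedDict()
    hits = misses = 0
    for address in addresses:
        if address in cache:
            hits += 1
            cache.move_to_end(address)       # mark as most recently used
        else:
            misses += 1                      # must go to main memory
            cache[address] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)    # discard the least recently used
    return hits, misses


# A loop that keeps reusing the same three addresses.
accesses = [1, 2, 3, 1, 2, 3, 1, 2, 3]
print(count_hits(accesses, 3))   # (6, 3)
```

After the first pass, every access is found in the cache, so main memory is needed only three times out of nine.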
Diagram B
In the first clock cycle the processor gets the instruction from memory and decodes it. In the
next clock cycle the required data is taken from memory. For each instruction this cycle
repeats and hence needs two cycles to complete an instruction
Pipelining the instructions is not possible with this architecture.
A stored-program digital computer is one that keeps its programmed instructions, as well as
its data, in read-write, random access memory (RAM), that is the Von Neumann computer.
This makes the machines much more flexible.
By treating those instructions in the same way as data, a stored-program machine can easily
change the program, and can do so under program control.
Once in the computer's memory, a program will be executed one instruction at a time by
repeatedly going through the fetch-decode-execute cycle.
In the vast majority of modern computers, the same memory is used for both data and
program instructions.
Advantages
- Almost all data can be processed by the von Neumann computer
- Cheaper than alternative types of processing
- Its design is very simple
Disadvantages
- Slower than other architectures
- Limited by bus transfer rate
- Does not maximise CPU utilisation
- Poorly written programs can have their data mixed up as both data and instructions share
the same memory
Harvard Architecture.
Diagram A
Diagram B
Stores instructions and data in separate memory, thus has physically separate storage for data
and instructions
Has separate data and instruction buses, allowing transfers to be performed simultaneously on
both buses.
Data and instructions are treated separately
May employ pipelining. Efficient Pipelining - Operand Fetch and Instruction Fetch can be
overlapped.
A Harvard Architecture system would have separate caches for each bus.
Using a simple, unified memory system together with a Harvard architecture is highly inefficient.
Unless it is possible to feed data into both buses at the same time, it might be better to use a von
Neumann architecture processor.
Disadvantages
Not widely used.
More difficult to implement.
More pins needed for buses.
System clock
It is an electronic component that generates clock pulses to step the control unit through its
operation.
This sends out a sequence of timing pulses or signals, which are used to step the control unit
through its operations.
It generates electric signals at a fast speed
It controls all functions of computer using clock ticks
These ticks of the system clock are known as clock cycles, and they set the speed of the CPU
The speed at which the CPU executes instructions is called clock speed or clock rate.
It generates a continuous sequence of clock pulse to step the control unit through its
operations
Serial Processing
Each instruction is executed in turn until the end of the program.
Advantages
- Nearly all programs can run on serial processing and therefore no additional complex
code needs to be written.
- All data types are suitable for serial processing
- Program can use the previous result in the next operation
- Data sets do not need to be independent of each other
- Cheaper to handle than parallel
Disadvantages
- Slows data processing especially in the Von Neumann architecture (bottleneck)
- Too much thrashing especially with poorly designed programs
Parallel Processing
- Parallel processing is the ability of a computer system to divide a job into many tasks which
are executed simultaneously, using more than one processor, thus allowing multiple
processing.
- Multiple CPUs can be used to carry out different parts of the fetch-execute cycle.
- The computer is able to perform concurrent data processing to achieve faster execution time.
- The system may have two or more ALUs and be able to execute two or more instructions at
the same time.
- It may also have two or more processors operating concurrently
- The objective is to increase throughput
- Mostly applies to Single Instruction Multiple Data (SIMD) and Multiple Instruction Multiple Data (MIMD) computers
- Supercomputers utilizing parallel processing are used to maintain the safety.
- Scientists are using parallel processing to design computer-generated models of vehicles.
- Airlines use parallel processing to process customer information, forecast demand and decide
what fares to charge.
- The medical community uses parallel processing supercomputers.
NB:- Instruction Stream:-the sequence of instructions read from memory
- Data stream: operations performed on the data in the processor
Parallel processing occurs in the instruction stream, the data stream or both.
(a) Single Instruction Stream, Single Data Stream (SISD) Instructions are executed
sequentially and parallel processing can be achieved by multiple functional units or by
pipelining.
(b) Single Instruction Stream, Multiple Data Stream (SIMD)- includes multiple processing
units with a single control unit. All processors receive the same instruction but operate on
different data
(c) Multiple Instruction Stream, Single Data Stream (MISD) Involves parallel computing
where many functional units perform different operations by executing different instructions on
the same data set
(d) Multiple Instruction Stream, Multiple Data Stream (MIMD) processor capable of
processing several programs at the same time
Advantages of parallel processing
- allows faster processing especially when handling large amounts of data
- more than one instruction (of a program) is processed at the same time
- Not limited (affected) by bus transfer rate
- Can make maximum CPU utilisation as long as it is kept full
- different processors can handle different tasks/parts of same job
- Memory is scalable with number of processors. Increase the number of processors and the
size of memory increases proportionately.
- Each processor can rapidly access its own memory without interference and without the
overhead incurred with trying to maintain cache
Disadvantages of parallel processing:
- Only certain types of data is appropriate for parallel processing
- Data that relies on previous operation cannot be made parallel.
- Each data set must be independent of each other
- Usually more expensive
- The programmer is responsible for the details associated with data communication.
- operating system is more complex to ensure synchronisation
- program has to be written in a suitable format
- Program is more difficult to test/write/debug
- It may be difficult to map existing data structures, based on global memory, to this memory
organization.
Parallel processing includes Vector (Array) Processing and Pipeline Processing
Involves the use of several processors to perform a single job.
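The idea of dividing one job into independent parts, processing them at the same time and combining the results can be sketched with Python's standard concurrent.futures module. This shows the structure only: the helper names are illustrative, and in the standard Python interpreter threads do not actually speed up arithmetic like this.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide the job into roughly equal, independent chunks."""
    size = max(1, (len(data) + parts - 1) // parts)   # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data, workers=4):
    """Sum each chunk on its own worker, then combine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


print(parallel_sum(list(range(1, 101))))   # 5050
```

The job can be split this way only because each chunk is independent of the others, which is the condition listed under the disadvantages above.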
2. Pipeline Processing
It is a technique which allows the overlapping of the fetch-decode-execute cycle for different
instructions.
A parallel processing architecture in which several processors are used, each one doing a
different part of the fetch, decode, execute cycle, so the fetch-decode-execute cycle is
staggered.
The processor is split up into three parts (fetch, decode, execute), each of which handles one
of the three stages.
Each part is called a line, where each single line is a pipeline.
This can be best illustrated with the diagram below.
As long as the pipeline can be kept full, it is making best use of the CPU. This is an example of a
single instruction single data (SISD) processor, since each instruction being processed operates on a
single item of data.
In pipelining, three instructions are dealt with at the same time. This reduces the execution time
considerably.
However, this would only be true for a very linear program.
Once jump instructions are introduced the problem arises that the wrong instructions are in the
pipeline waiting to be executed, so every time the sequence of instructions changes, the pipeline has
to be cleared and the process started again.
A non-pipeline architecture is inefficient because some CPU components (modules) are idle while
another module is active during the instruction cycle. Pipelining does not completely cancel out idle
time in a CPU but making those modules work in parallel improves program execution significantly.
Processors with pipelining are organized inside into stages which can semi-independently work on
separate jobs. Each stage is organized and linked into a 'chain' so each stage's output is fed to another
stage until the job is done. This organization of the processor allows overall processing time to be
significantly reduced
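The overlap of fetch, decode and execute can be tabulated with a short sketch. It assumes each stage takes exactly one clock cycle and that the program contains no jump instructions, so the pipeline never has to be cleared.

```python
STAGES = ["fetch", "decode", "execute"]


def pipeline_schedule(instruction_count):
    """Return one dict per clock cycle, mapping stage -> instruction number."""
    total_cycles = instruction_count + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for stage_index, stage in enumerate(STAGES):
            instruction = cycle - stage_index      # 0-based instruction in this stage
            if 0 <= instruction < instruction_count:
                row[stage] = instruction + 1       # report as 1-based
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(4)
for cycle, row in enumerate(schedule, start=1):
    print(cycle, row)
print("pipelined cycles:", len(schedule))          # 6
print("non-pipelined cycles:", 4 * len(STAGES))    # 12
```

In cycle 3 the pipeline is full: instruction 3 is being fetched while instruction 2 is decoded and instruction 1 is executed.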
ADDRESSING MODES
Each instruction specifies an operation on certain data.
The different ways in which a computer calculates the addresses holding the source and/or destination of
the data being processed in a particular instruction are called addressing modes.
Addressing modes are mostly found in assembly language for microprocessors
Each assembly language instruction has the following structure:
Op-code (operator):
- is the part that represents the operation that the computer can understand and carry out. It is
the part of the instruction that indicates what it is to do (the code for the operation).
Op-codes can be represented by mnemonics, which are short names given to the different
operations to make them easier to remember, e.g. ADD.
Operand:
- it is the address field in an instruction that holds data to be used by the operation given in the
opcode, e.g. in ADD 12, 12 is the operand
- is the data to be manipulated; there is no point telling the computer what to ADD if there is no
data to apply it to. It can hold the address of the data, or just the data.
The data is what the operation is being applied to. There are a number of different ways in which this
data can be specified, and this is known as addressing.
Symbolic addressing: the use of characters to represent the address of a store location
Effective Address: the actual address of operand to be used by the instruction.
The most common addressing modes are: direct, indirect, indexed, relative and immediate
addressing.
1. Immediate Addressing
This is where the value to be used is stored in the instruction.
This is when the value in the instruction is not an address at all but the actual data (constant to
be used in the program).
The data to be operated on is held as part of the instruction format.
The data to be used is stored immediately after the op code for the instruction. Thus the
operand field actually contains the data
e.g. LDA #&80: means load the hexadecimal value 80 into the accumulator register.
MOVE #8, R1: moves the value 8 into register R1
Immediate addressing uses the # symbol.
This is very simple, although not often used because the program parameters cannot be
changed.
This means that the data being operated on can't be adjusted, so only constants can be used.
Can be used to initialize constants.
2. Direct Addressing
The address in the instruction is the address to be used to get to the operand.
The operand gives the address of the data to be used in the program.
It requires one memory reference to read the operand from the given location
The address given in the instruction is the one that contains the data to be used in the
operation without any modification.
It is also called memory addressing
e.g. in the instruction ADD 23,
we go to memory address 23, which stores the data to be used in the operation.
It provides only a limited address space
It is very simple, although does not make best use of memory
It is slow as too much memory is used
3. Indirect Addressing
- In this mode of addressing, the address given in the instruction holds the address of where the
data is stored.
- This is whereby the real address is stored in the memory so the value in the address part of the
instruction is pointing to the address of the data.
- The address of data in memory is held in another memory location and the operand of the
instruction holds the address of this memory location.
It is MOSTLY used when accessing areas of memory that are not accessible using the space
available for the address in the instruction code
using our example, ADD 23
we go to memory address 23
there we are given another memory address, e.g. 32, where the actual data is stored
This method is useful because the amount of space in a location is much bigger than the space
in the address part of the instruction.
It gives flexibility as the original program does not need to be altered if the position of the
routines (sub-programs) change.
Therefore we can store larger addresses and use more memory.
It is used where memory larger than can be accessed by address in instruction
It is also used when one wants to allow full size of register to be used for address
It is used if memory locations are 32 bits wide, thus allowing more memory to be accessed
Without it, some areas of memory cannot be addressed because the size of the memory
address is larger than the space available in the instruction
Indirect addressing solves this problem, as the memory address will fit in a memory location
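The difference between immediate, direct and indirect addressing lies in how many memory lookups are needed to reach the data. The sketch below reuses the ADD 23 example, where address 23 holds 32; the value 7 stored at address 32 is made up for illustration, as is the function name.

```python
# Address 23 holds 32 (from the example); the value at 32 is hypothetical.
memory = {23: 32, 32: 7}


def operand_value(mode, operand, memory):
    """Return the data an instruction would use under each addressing mode."""
    if mode == "immediate":
        return operand                    # the operand is the data itself
    if mode == "direct":
        return memory[operand]            # one lookup: the operand is the address
    if mode == "indirect":
        return memory[memory[operand]]    # two lookups: address of the address
    raise ValueError(f"unknown mode {mode!r}")


print(operand_value("immediate", 23, memory))   # 23
print(operand_value("direct", 23, memory))      # 32
print(operand_value("indirect", 23, memory))    # 7
```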
Relative Addressing
- The same as Indexed Addressng except that the PC replces the Index Register.
- E.g Load Ri, X (PC)
- This loads register Ri with the contents of the memory location whose address is the sum of
the contents of the PC and the value X.
This is direct addressing that does not commence from the start of the address of the memory.
It begins from a fixed point, and all addresses are relative to that point.
It allows a real address to be calculated from a base address by adding the relative address.
The relative address is an offset and can be used for arrays.
It can be used for branching instructions.
It adds the offset to the contents of the PC to get the effective address.
It is appropriate when the code is going to be loaded at an arbitrary place in memory to be
executed.
It is used for jump instructions.
Indexed Addressing
- The address part of the instruction is added to a value held in the index register.
- It is where the actual address is found by adding a displacement to the base address.
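The effective-address calculation for indexed and relative addressing can be sketched in Python as follows. The register contents and addresses used (100, 200, -3) are assumed values chosen only to illustrate the arithmetic.

```python
def indexed_address(operand, index_register):
    """Indexed addressing: address part of the instruction + index register."""
    return operand + index_register

def relative_address(offset, program_counter):
    """Relative addressing: offset X + contents of the PC."""
    return program_counter + offset

# Stepping through a 4-element array that starts at address 100
# by incrementing the index register each time.
addresses = [indexed_address(100, index) for index in range(4)]
print(addresses)                   # prints [100, 101, 102, 103]

# A backward jump of 3 locations when the PC holds 200.
print(relative_address(-3, 200))   # prints 197
```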
Questions
1. The Program Counter (Sequence Control Register) is a special register in the processor of a
computer.
a) Describe the function of the program counter. (2)
b) Describe two ways in which the program counter can change during the normal execution of a
program, explaining, in each case, how this change is initiated. (4)
c) Describe the initial state of the program counter before the running of the program. (2)
2. Explain what is meant by the term Von Neumann Architecture. (2)
3. Describe the fetch/decode part of the fetch/decode/execute/reset cycle, explaining the purpose of
any special registers that you have mentioned. (7)
4. a) Describe how pipelining normally speeds up the processing done by a computer. (2)
b) State one type of instruction that would cause the pipeline system to be reset, explaining why such
a reset is necessary. (3)
5. Give 3 differences between the Von Neumann and the Harvard Computer architectures.
Names[1]
Names[2]
Names[3]
Names[4]
The five individual locations are Names (0), Names (1), Names (2), Names (3) and Names (4).
Each data item is called an element of the array. To reference a particular element one must use the
appropriate index.
NB: However, most programming languages differ from Microsoft Visual Basic in handling
arrays, especially in the amount of memory allocated. For example, in Java, the following
declaration:
int[] Names = new int[4];
creates exactly 4 memory spaces for the array Names. The indices of the
array range from 0 to 3, which are
Names[0], Names[1], Names[2] and Names[3]
Initialising an array in Visual Basic 6.0
The procedure of initializing an array in the computer memory is as follows:
- Size of array is calculated
- Location of array is decided according to data type and size
- Locations are reserved for the array
- Size of array is stored in a table
- Lower bound of the array is stored in a table
- Upper bound of array is stored in a table
Names(4):
Theresa
Lameck
Johanne
Laurence
Fadzai
Two-dimensional arrays
A two-dimensional array is a data structure in which the array is declared using two indices and can be
visually represented as a table.
Indices   0            1          2    3
0         Makombe      Tinashe    M    4A
1         Vheremu      Alex       M    4B
2         Mununi       Mary       F    3C
3         Chirongera   Salpicio   M    2C
4         Mutero       Violet     F    4C
The diagram above shows the visual representation of a 2 dimensional array Names(4,3)- 5 rows and 4
columns:
Each individual element can be referenced by its row and column indices. For example:
Names(0,0) is the data item Makombe
Names(2,1) is the item Mary
Names(1,2) is the item M
Initialising an array
Initialising an array is a procedure in which every element in the array is set to a starting value. This
starting value would typically be "" (an empty string) for a string array, or 0 for a numeric array.
Initialisation is important to ensure that the array does not contain results from a previous use,
elsewhere in the program.
Algorithm for initialising a one-dimensional numeric array:
DIM TestScores(9) As Integer
DIM Index As Integer
FOR Index = 0 TO 9
TestScores(Index) = 0
NEXT Index
Algorithm for initialising a two-dimensional string array:
DIM Students(4,3) As String
DIM RowIndex As Integer, ColumnIndex As Integer
FOR RowIndex = 0 TO 4
    FOR ColumnIndex = 0 TO 3
        Students(RowIndex,ColumnIndex) = ""
    NEXT ColumnIndex
NEXT RowIndex
Serial search on an array
The following pseudo-code can be used to search an array to see if an item X exists:
01 DIM Index As Integer
02 DIM Flag As Boolean
03 Index = 0
04 Flag = False
05 Input X
06 REPEAT
07     IF TheArray(Index) = X THEN
08         Output Index
09         Flag = True
10     END IF
11     Index = Index + 1
12 UNTIL Flag = True OR Index > Maximum Size Of TheArray
13 IF Flag = False THEN
14     Show Message "Item not found"
15 END IF
Note that the variable Flag (line 04 and 09) is used to indicate when the item has been found and stop
the loop repeating unnecessarily (line 12 ends the loop if Flag has been set to True).
The Actual Visual Basic Code will be as follows:
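Dim Index As Integer
Dim Flag As Boolean
Dim x As String    ' String, because the Names array holds text
Index = 0
Flag = False
x = InputBox("Enter item to Search")
Do
    If Names(Index) = x Then
        MsgBox (Names(Index) & " Has Been Found")
        Flag = True
    End If
    Index = Index + 1
Loop Until Flag = True Or Index > UBound(Names)
If Flag = False Then
    MsgBox ("Item not found")
End If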
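For comparison, the same serial search can be sketched in Python. This is an illustrative version, not part of the Visual Basic program: the function name serial_search, the return value of -1 for "not found" and the search item "Tendai" are assumptions.

```python
def serial_search(the_array, x):
    """Return the index of the first element equal to x, or -1 if absent."""
    index = 0
    found = False
    # Stop as soon as the item is found or the end of the array is reached.
    while not found and index < len(the_array):
        if the_array[index] == x:
            found = True
        else:
            index += 1
    return index if found else -1

names = ["Theresa", "Lameck", "Johanne", "Laurence", "Fadzai"]
print(serial_search(names, "Johanne"))   # prints 2
print(serial_search(names, "Tendai"))    # prints -1
```

The flag plays the same role as in the pseudo-code: it stops the loop repeating once the item has been found.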
NB:
If the item is a string, it is replaced with empty spaces. However, if it is numeric, it is replaced with
0.
Deleting from an array is often difficult as elements need to be shifted after deletion.
It is an effective method for deleting elements.
This allows the user to create the array when he/she actually needs it, using a ReDim statement:
Dynamic arrays can be re-created at will, each time with a different number of items. When you recreate a dynamic array, its contents are reset to 0 (or to an empty string) and you lose the data it
contains. If you want to resize an array without losing its contents, use the ReDim Preserve
command:
ReDim Preserve Names(20) As String
For example, given the following numbers: 20, 30, 5, 2, 7, 6, 17, 58, 41
Placing them in the binary tree is as follows:
- The first element becomes the root node, i.e. 20
- For other numbers, the bigger number goes to the right and the smaller one to the left of a
node. Every time start from the root node, until you get to an empty space to place the new
node.
- For example, 30, is bigger than 20, therefore is placed to the right hand side of 20. There is
nothing on this side and therefore a new node is created and 30 placed inside.
- Next is 5, which is smaller than 20 (root node) and therefore goes to the left. There is an
empty space therefore a new node is created and 5 is placed inside.
- Then 2 is smaller than 20 (root node) and therefore goes to the left. On the left there is 5. 2 is
smaller than 5, therefore we go to the left and place 2 there.
- Next is 7, which is smaller than 20, we go to the left where there is 5. Seven (7) is bigger than
5, therefore we place it to the right of 5.
- Finish the remaining numbers (6, 17, 58 and 41) on your own.
traversefrom(Right)
endif
endProcedure
C. Post-Order Traversal
The order of traversal is:
- Traverse the Left sub-tree
- Traverse the Right sub-tree.
- Visit the Node
This is generally given as LRN
For the diagram above, the post-order traversal will be as follows:
2, 6, 17, 7, 5, 41, 58, 30, 20.
The algorithm for post-order traversal is as follows:
Procedure traversefrom(p)
    If Tree[p].Left <> 0 Then
        traversefrom(Tree[p].Left)
    EndIf
    If Tree[p].Right <> 0 Then
        traversefrom(Tree[p].Right)
    EndIf
    Print(Tree[p].Data)
EndProcedure
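The post-order procedure can be tried out with a short Python sketch. It assumes the tree is held in an array of nodes with left and right pointers, where a pointer of 0 means "no child", and that the tree was built from the numbers 20, 30, 5, 2, 7, 6, 17, 58, 41 used earlier. The node numbering is an assumption for illustration.

```python
# Index 0 is unused so that a pointer value of 0 can mean "no child".
tree = [
    None,
    {"data": 20, "left": 3, "right": 2},   # node 1 is the root
    {"data": 30, "left": 0, "right": 8},
    {"data": 5,  "left": 4, "right": 5},
    {"data": 2,  "left": 0, "right": 0},
    {"data": 7,  "left": 6, "right": 7},
    {"data": 6,  "left": 0, "right": 0},
    {"data": 17, "left": 0, "right": 0},
    {"data": 58, "left": 9, "right": 0},
    {"data": 41, "left": 0, "right": 0},
]

def traverse_from(p, visited):
    """Post-order (LRN): left sub-tree, right sub-tree, then the node."""
    node = tree[p]
    if node["left"] != 0:
        traverse_from(node["left"], visited)
    if node["right"] != 0:
        traverse_from(node["right"], visited)
    visited.append(node["data"])
    return visited

print(traverse_from(1, []))   # prints [2, 6, 17, 7, 5, 41, 58, 30, 20]
```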
Inserting Data into a Binary Tree
Look at each node starting from the root node
If the root is empty, create the node and place the value
If the new value is less than the value of the node, move left, otherwise move right
Repeat this for each node arrived until there is no node
Then create a new node and insert the data.
Perform until no new item need to be added
This can be written algorithmically as:
1. If tree is empty, enter data item at root and stop.
2. Current node = root.
3. Repeat steps 4 and 5 until current node is null.
4. If new data item is less than value at current node go left else go right.
5. Current node = node reached (null if no node).
6. if node reached is null, create new node and enter data.
This can also be written as:
Repeat
Compare new value with root value
If new value > root value then
Follow right sub-tree
Else
Follow left sub-tree
Endif
Until no sub-tree
Insert new value as root of new sub-tree.
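The insertion steps can be sketched in Python as below. The Node class and the function names are illustrative assumptions. An in-order (left, node, right) traversal is included only as a check, since it should list the values of a correctly built tree in ascending order.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert value into the tree and return the (possibly new) root."""
    if root is None:                      # tree is empty: new node is the root
        return Node(value)
    current = root
    while True:
        if value < current.value:         # smaller values go left
            if current.left is None:
                current.left = Node(value)
                return root
            current = current.left
        else:                             # bigger (or equal) values go right
            if current.right is None:
                current.right = Node(value)
                return root
            current = current.right

def in_order(node):
    """Return the values in left, node, right order."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)

root = None
for number in [20, 30, 5, 2, 7, 6, 17, 58, 41]:
    root = insert(root, number)

print(in_order(root))   # prints [2, 5, 6, 7, 17, 20, 30, 41, 58]
```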
Binary trees are also important in the evaluation of postfix expressions (Reverse Polish Notation)
Infix Expressions
Normal mathematical expressions are written as follows:
A+B
This method is called Infix Notation, because the operator is found between the operands to be acted
upon. Infix notation involves the use of brackets and observes operator precedence, e.g.
BODMAS/BOMDAS.
For example, the expression below is an Infix:
(A+B)*C+(D-A)
If there are no brackets, the expression will give a different answer. It is not easy for the computer to
evaluate infix expressions.
Prefix Expressions
The Polish Notation is also called the prefix. In Prefix (Polish) notation, the operator precedes the
operands. For example, the infix expression
A + B is given as: +AB
This has an advantage that it removes ambiguity and avoids use of brackets
Postfix Expressions
Reverse Polish Notation (Postfix) is a way of writing mathematical expressions without using
parenthesis and brackets, and in which the operands precede the operator. Reverse Polish Notation is
also called Postfix. For example, the Infix expression A + B is written as follows in Postfix:
AB+
Likewise, the infix expression (A+B)*C+(D-A) is written as:
AB+C*DA-+
There is no need for brackets in this situation.
Reverse Polish Notation has the following benefits:
can be processed directly by reading the expression from left to right
is free of ambiguities
does not require brackets
does not require rules of precedence as in BODMAS/BOMDAS
can be processed using a stack
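How a stack processes a postfix expression from left to right can be sketched in Python. The sketch assumes single-letter operands and the four operators + - * /. The values given to A, B, C and D are arbitrary illustrations.

```python
def evaluate_postfix(expression, values):
    """Evaluate a postfix expression of single-letter operands using a stack."""
    stack = []
    for symbol in expression:
        if symbol in "+-*/":
            right = stack.pop()           # the second operand is on top
            left = stack.pop()
            if symbol == "+":
                stack.append(left + right)
            elif symbol == "-":
                stack.append(left - right)
            elif symbol == "*":
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            stack.append(values[symbol])  # operand: push its value
    return stack.pop()

values = {"A": 2, "B": 3, "C": 4, "D": 10}
# AB+C*DA-+ is the postfix form of (A+B)*C+(D-A)
print(evaluate_postfix("AB+C*DA-+", values))   # prints 28
```

No brackets or precedence rules are needed: each operator simply acts on the two most recent items on the stack.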
NB: If the original expression to be converted contains brackets or parentheses, ignore them, that is,
don't put them in the tree, but use them in finding the weakest operator. Items in brackets have a
higher priority and therefore are inserted in the tree later than those NOT in brackets.
Given the following infix notation: (a+b) c*(d-e)
This can be represented using a tree as follows:
Using the above tree, the reverse polish notation will be as follows;
ab+cde-*
The above can be shown diagrammatically. Can you please draw this on your own?
Questions
1. (a) State the difference between dynamic and static data structures giving an example of each. (3)
b) Show how a binary tree can be used to store the data items Feddi, Eda, Joh, Sean, Dav, Gali in
alphabetic order. (4)
c) Explain why problems may arise if Joh is deleted from the tree and how such problems may be
overcome.
2. An array is to be used to store information. State three parameters that need to be given about the
array before it can be used, explaining the reason why each is necessary.
3. (a) Explain the difference between static and dynamic data structures. [2]
(b) Give an example of a
(i) static,
(ii) dynamic
data structure, giving an advantage of each. [4]
(c) The details of a car part are stored in a binary tree according to this algorithm
READ VALUE NEW_PART
START AT ROOT NODE
WHILE NODE NOT EMPTY, DO
IF NEW_PART < VALUE AT NODE
THEN FOLLOW LEFT SUBTREE
ELSE FOLLOW RIGHT SUBTREE
ENDIF
ENDWHILE
INSERT NEW_PART AT NODE
END
(i) Show the binary tree after the following values have been input
Radio Visor Brakes Tyres Alternator Windscreen [3]
(ii) Explain how Clutch is added to the tree in (i). [5]
(iii) Describe an algorithm that can be applied to the binary tree of car parts, so that the tree is read in
alphabetic order.
4.
The following binary tree diagram contains a number of integers. In each case the right pointer
indicates the condition higher number and the left pointer indicates the condition lower or equal
number.
- A logic gate is a device that produces signals of 1 or 0 when the input logic requirements are met;
logic gates are used in manipulating binary information.
- A logic gate is a device (or electrical circuit) that performs one or more logical operations on one
or more input signals.
- Its output represents Boolean (T or F) or binary values (1 or 0) as voltages.
- Logic gates are the building blocks of digital technology.
- They can be used in applications like:
Building computer chips
Programming traffic signals
Chips for automatic alarm systems
Chips for automated control systems
- Electronic circuits operate using binary logic gates.
- Logic gates process signals which represent TRUE or FALSE, ON or OFF , 1 or 0
Main Logic Gates
The main logic gates are:
(a) OR gate
(b) AND gate
(c) NOT gate
(d) NOR gate
(e) NAND gate
(f) Exclusive OR gate (XOR)
(g) Exclusive NOR gate (XNOR)
Logic gates are used with truth tables.
A truth table is a table which shows how a logic circuit's output responds to various
combinations of the inputs, using logic 1 for true and logic 0 for false.
A truth table is a table that describes the behaviour of a logic gate.
It lists the value of the output for every possible combination of the inputs
Truth tables contain 1s and 0s and are an integral part of logic gate functionality.
Truth table and logic gates use the following:
- 1 (True, ON, Not False)
- 0 (False, OFF, Not True)
The number of rows in a truth table shows the number of combinations of the inputs of a
particular circuit. The number of rows for each gate is found using the following formula: rows
= 2^n, n being the number of inputs in the gate or circuit. For example, a gate or circuit has the
following rows corresponding to the number of inputs (excluding column headings):
- 1 input = 2^1 = 2 rows
- 2 inputs = 2^2 = 4 rows
- 3 inputs = 2^3 = 8 rows
- ...
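The rule that n inputs give 2 to the power n rows can be checked with a short Python sketch that lists every input combination. The function name input_combinations is an assumption for illustration.

```python
from itertools import product

def input_combinations(n):
    """Return every combination of n binary inputs, one tuple per row."""
    return list(product([0, 1], repeat=n))

for n in (1, 2, 3):
    rows = input_combinations(n)
    print(n, "input(s):", len(rows), "rows")

# The four rows of a two-input truth table, in the usual order.
print(input_combinations(2))   # prints [(0, 0), (0, 1), (1, 0), (1, 1)]
```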
Graphical Representation of Gates and their Truth Tables
Each logic gate has its own unique graphical representation, which can be in general form or in
standard form.
(1) General form
Each logic gate has a circle and the name of the gate to differentiate it from the rest as given
below:
The name inside the gate gives us the type of the gate
(2) Standard Representation
In standard form, each logic gate has its own unique diagram. Even if the name of the gate is
not written, one knows what it stands for because of the shape. The following are the logic
gates and their shapes in standard form.
(a) OR gate
This represents two inputs entering the gate and one output from the gate. The inputs can be
represented by any alphabetic characters, e.g. A and B, while the output can be X, given as
follows:
Logic Gate Diagram
Standard Form
Truth table
General Form
X= A OR B
The output (X) is true if the INPUT A OR INPUT B are true.
Thus if any one of the inputs is 1, the output is automatically 1
Output only becomes 0 if all inputs are 0
(b) AND gate
Truth table
General Form
The output (X) is only true if the INPUT A AND INPUT B are both true. If any one of the inputs is
0, then the output becomes 0 also.
Thus X = A AND B.
(c) NOT gate
Logic Gate Diagram
Standard Form
Truth table
General Form
The NOT gate has only one input and one output. The input is negated. Thus if input is 1, output is 0,
and vice versa.
The output (X) is true when the INPUT A is NOT TRUE.
The output (X) is False when the INPUT A is TRUE.
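The behaviour of the OR, AND and NOT gates described so far can be reproduced with a minimal Python sketch that prints their truth tables. The function names are assumptions chosen for illustration, and inputs are taken to be the integers 0 and 1.

```python
def or_gate(a, b):
    """Output is 1 if any input is 1; 0 only if all inputs are 0."""
    return a | b

def and_gate(a, b):
    """Output is 1 only if both inputs are 1."""
    return a & b

def not_gate(a):
    """Output is the negation of the single input."""
    return 1 - a

print("A B | A OR B | A AND B")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|  ", or_gate(a, b), "   |  ", and_gate(a, b))

print("A | NOT A")
for a in (0, 1):
    print(a, "|", not_gate(a))
```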
(d) NOR gate
Logic Gate Diagram
Standard Form
Truth table
General Form
- One can now draw the truth table, basing from the logic statement in Step 1.
Questions
1. A computer will only operate if three switches R, S and T are correctly set. An output signal (X =
1) will occur if R and S are both ON or if R is OFF and S and T are ON. Design a logic network and
draw the truth table for this network.
2. A traffic signal system will only operate if it receives an output signal (D = 1).
This can only occur if:
Either (a) signal A is red (i.e. A = 0)
Or
(b) signal A is green (i.e. A = 1) and signals B and C are both red (i.e. B and C are
both 0)
Design a logic network and draw a truth table for the above system.
3. A chemical plant gives out a warning signal (W = 1) when the process goes
wrong. A logic network is used to provide input and to decide whether or not
W=1
However, coaxial cable is expensive to buy and is stiff, making it difficult to handle.
They are suitable for short distance communication on a LAN
Application: Used for TV distribution (connecting decoders with the antenna on the satellite
dish); long distance telephone transmission; short run computer system links, Local area
networks
3. Fibre optic: A medium that uses light to transmit data. Used in WAN and MAN networks. Its benefits
are: It has less attenuation and therefore fewer repeaters are needed,
has very high bandwidth and cannot corrode (not affected by corrosion),
it is thin and therefore has less weight.
It allows very fast data transfer,
has no electromagnetic interference,
is physically secure.
Fibre optic cable comes in two forms, multimode and monomode. Multimode fibre optic cable carries 2 or
more signals at a time, each at a slightly different reflection angle. This is used over short distances.
Monomode (single mode) cable carries one signal at a time and is appropriate for long distance
communication.
However, fibre optic cable is very expensive to buy and is uni-directional (data travels in one direction only).
The cable cannot bend around tight corners. It is also difficult to interface with computers.
Unguided Transmission
Wireless Transmission media
1. Bluetooth (Refer to presentations)
2. Radio (refer to presentations)
3. WIFI (Wireless Fidelity)
It is a wireless LAN (local area network) technology that uses high frequency radio signals to transmit
and receive data over distances of a few hundred feet, using the Ethernet protocol. It is a set of
standards that set forth the specifications for transmitting data over a wireless network. There
must be a wireless router which enables wireless devices to connect to the network and to the
internet.
Range: Wi-Fi provides local network access for around a few hundred metres
Speed: maximum of 54 Mbps,
Provides local area network
Limited to one subscriber
Can be used where cables cannot run
Wireless network adaptors are built into most devices like laptops, and are therefore cheaper
and easier to get.
Tend to be slower if more devices are added to the network
- Users can send corporate e-mail while out of office - even behind a firewall on mobile.
- Users can use wireless internet connection from chat rooms for discussions with colleagues while
on the move.
Disadvantages of Wireless Technology:
- Wireless LAN speeds are slower than Net access at work due to narrow bandwidth.
- Anyone within the Wireless LAN nodes range with an appropriate device can use your Wireless
LAN and broad band link.
- Anyone who walks past your house or WLAN linked into a corporate system can access sensitive
information like credit card details.
- 3G phones are not compatible with 2G phones.
- Bluetooth has limited range.
- Signals can be blocked, distorted or will be weak.
- Can lead to health problems from microwaves
Synchronous and asynchronous Transmission
Synchronous Transmission:
- This is whereby data is sent in blocks (packets) at any given time, and uses control characters.
- This method is faster in transmitting data.
- Data transfer is timed by the clock pulse
- There is no need for start and stop bits since the timing signals are used to synchronise
transmission at sending and receiving ends.
- Mostly used in local area networks
- Many transmission errors are bound to occur.
Asynchronous Transmission:
- This is whereby data is sent character by character over a transmission channel.
- This is much slower as compared to synchronous transmission.
- A start bit and two stop bits mark the beginning and ending of a character respectively.
- The start and the stop bits are always different.
- The start bit alerts the receiving end and synchronises its clock, ready to receive the character.
The baud rate of the two devices is set to be the same so as to correctly receive the data.
- A parity bit is included to check against incorrect transmission.
- Each character is sent as soon as it becomes available rather than waiting for the clock pulse.
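The framing of one character for asynchronous transmission can be sketched in Python. The details below are assumptions following a common serial convention and are not stated in these notes: a start bit of 0, stop bits of 1, 7-bit ASCII data sent least significant bit first, and even parity.

```python
def frame_character(char, stop_bits=2):
    """Build the bit sequence sent for one character."""
    code = ord(char)
    data = [(code >> i) & 1 for i in range(7)]   # 7 data bits, LSB first
    parity = sum(data) % 2                       # even parity: total 1s is even
    return [0] + data + [parity] + [1] * stop_bits

def parity_ok(frame):
    """Check the 7 data bits plus parity bit for an even number of 1s."""
    return sum(frame[1:9]) % 2 == 0

frame = frame_character("A")
print(frame)              # prints [0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1]
print(parity_ok(frame))   # prints True

frame[3] ^= 1             # simulate one bit being corrupted in transit
print(parity_ok(frame))   # prints False
```

The start bit (0) and stop bits (1) differ, so the receiver can always detect where a new character begins.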
Serial data transmission
- Data is sent one bit at a time over a single wire from source to destination, until all data has
been sent.
- A system in which one bit is sent at a time along the same data line until the whole data has
been sent.
- Suitable for long distance communication since less cabling is needed.
- Very reliable form of transmission
- Can be fast using fibre optic cable
Parallel data transmission
- All the bits (byte) of a character are sent simultaneously along separate data lines.
- Several bits are sent simultaneously over a number of parallel lines linking the sender and the
receiver.
- This is faster in data transmission
- Suitable for short distance communication, i.e, inside the computer system using buses, e.g.
from processor to hard drive, processor to printer, etc.
- A parallel port is used to link external devices like printer to the processor.
Bits can arrive at the destination at different times since the cables may have different speeds
if the distance is too long. This is called skew. This is prevented by using short distance
transmission.
It becomes expensive in long distance communication systems since too much cabling
is required.
It is more reliable over very short distances
Half Duplex: This is a transmission mode in which data travels in both directions but not
simultaneously. The receiver waits until the sender has finished sending data before
responding. Examples include police radios, where saying "Over" signals that the other party may transmit.
Transmission impairments
This refers to change in signal form as it propagates through the transmission channel. Transmission
impairments include:
Attenuation: The loss of signal power as it moves through the transmission channel.
Noise: Occurs when an unwanted signal from other sources than the transmitter enters the
transmission channel.
Distortion: The signal is deformed into a more or less different signal as it propagates
through the medium.
Multiplexing
This is a method of allowing multiple signals to share the same channel, reducing too much cabling,
as shown below:
A multiplexer is used in multiplexing. A multiplexer is a device that joins two or more channels into
one channel, while the de-multiplexer is responsible for splitting a channel into a number of them
for easy transmission to the intended destination.
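The joining and splitting of channels can be sketched in Python. The notes do not say which multiplexing technique is meant, so this sketch assumes simple time-division multiplexing, in which each channel takes turns to place one item on the shared channel. The function names and the sample data are illustrative only.

```python
def multiplex(channels):
    """Interleave equal-length channels: one item per channel per time slot."""
    return [item for slot in zip(*channels) for item in slot]

def demultiplex(stream, channel_count):
    """Split the shared stream back into its original channels."""
    return [stream[i::channel_count] for i in range(channel_count)]

channels = [list("AAAA"), list("BBBB"), list("CCCC")]
shared = multiplex(channels)
print("".join(shared))                      # prints ABCABCABCABC
print(demultiplex(shared, 3) == channels)   # prints True
```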
Bandwidth
Refers to the carrying capacity of a transmission channel. It is generally the volume of data that a
communication channel can carry at a given time. It is the difference between the lowest and the
highest (range) amount of data that a channel can transmit. It determines the amount of data a
channel can transmit at a given period of time. Fibre optic cables have high bandwidth and therefore
transmits data faster than coaxial cables, which have low bandwidth.
Baud rate: the number of signal changes (symbols) sent over a channel per second. It is a key measure of data
transfer rate. One baud equals one bit per second when each signal change carries exactly one bit.
Sender and receiver must send and receive data at the same rate
The transmission path remains open (connected) until transmission is complete.
After transmission, the path can now be released for others to use.
If no path is established, transmission cannot occur
Similar to normal telephone systems whereby a specific line is routed from point A to point B
and is dedicated but not necessarily used all the time.
Data is not necessarily split, thus it is sent as it is.
Data signals are received in the same order they are sent, therefore there is no need for processing at
the receiving end.
Advantages of circuit switching (CS)
o no congestion.
o dedicated transmission channel with guaranteed data rate.
o More effective transmission
o Fewer transmission errors
Disadvantages of circuit switching (CS)
o Channel reservation for duration of connection even if no data are being transferred is
an inefficient media use process.
o Long delays in call setup.
o Designed for voice traffic (analog).
Packet switching:
- Data is first split into smaller chunks called packets (or datagrams) which may take different
routes and are then reassembled into the original order at their destination.
- Packets are routed to the next (intermediate) node along an appropriate route, which can store
and transmit the packet until the destination.
- Each packet takes its own convenient path and is then re-assembled at the receiving end.
- Packets do not necessarily arrive at the same time or in correct order.
- At the destination, packets are re-grouped to the original message.
- Packets can be of fixed size
- Each packet has the following data: source address, destination address, error control signal,
packet size, packet sequence number, etc.
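Splitting a message into packets and re-grouping them at the destination can be sketched in Python. The packet fields follow the list above, while the packet size of 4 characters, the addresses "A" and "B" and the function names are assumptions for illustration. Shuffling stands in for packets arriving out of order after taking different routes.

```python
import random

def split_into_packets(message, size, source, destination):
    """Split a message into fixed-size packets carrying sequence numbers."""
    packets = []
    for number, start in enumerate(range(0, len(message), size)):
        packets.append({
            "source": source,
            "destination": destination,
            "sequence": number,
            "data": message[start:start + size],
        })
    return packets

def reassemble(packets):
    """Re-group packets into the original message using sequence numbers."""
    ordered = sorted(packets, key=lambda packet: packet["sequence"])
    return "".join(packet["data"] for packet in ordered)

packets = split_into_packets("PACKET SWITCHING", 4, "A", "B")
random.shuffle(packets)        # packets may arrive in any order
print(reassemble(packets))     # prints PACKET SWITCHING
```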
Benefits of packet switching:
- Makes more efficient use of lines
- Cheap as cost depends on number of packets sent, not distance, so all data can be transmitted
at local call rates
- Less likely to be affected by network failure since an alternative route is used from each node.
- Security is better since packets follow different routes
- No call set-up is required.
- Fast and suitable for interactive applications
NB: A virtual circuit must be established between the sender and the receiving end. A virtual circuit is
a temporary 'dedicated' pathway between two communicating points on a packet switched system, set up
before sending of packets. Bandwidth is allocated for a specific transmission pathway.
Message Switching
This is whereby the whole message may be routed by any convenient route.
No physical/dedicated path is established in advance between sender and receiver
Data is stored at a hop (which may be router) then forwarded one hop later.
Each block is received in its entirety and inspected for errors
Data is not transmitted in real time.
Blocking cannot occur
Delays are very common
Sender and receiver need not be compatible since sending will be done by routers, which can
change data format, bit rate and then revert it back to original format on receiving or submit it
in different form.
Storing data solves congested networks since data can be stored in queue and forwarded later
when channel becomes free
Priorities can be used to manage networks
Very slow if the number of nodes is many since each node stores before forwarding the data
In message switching, whole message is routed in its entirety, one hop at a time.
Now implemented over packet or circuit switched data networks.
Each message is treated as a separate entity.
Each message contains addressing information, which is used by switch for transfer to the next
destination.
Also called a store and forward network
Used in e-mails and in telex forwarding
There is often no real limit on the message / block size.
Advantages
more devices can share network bandwidth
reduced traffic congestion
one message can be sent to many destinations through broadcast addresses
Disadvantages
often costly, as it must have large storage devices to hold potentially long messages
not compatible with most real time applications
Transmission protocols
A protocol is a set of rules that govern how data is transferred in a network. It defines the rules on
how network devices communicate, e.g the TCP/IP. This includes:
A network communication protocol: a standard method for transmitting data from one computer to
another across a network. Some of the protocols are:
i. HTTP (HyperText Transfer Protocol)
This is a protocol that defines the process of identifying, requesting and transferring
multimedia web pages over the internet. It is used for transferring data across the internet,
usually between servers and computers on the internet. It is based on the client server
relationship. It uses TCP/IP to transmit data and messages
ii. FTP (File Transfer Protocol)
It is a protocol used to transfer data from one computer to another. It is often used to
download software from the internet, and it uses the TCP/IP protocol in doing this. However,
FTP has no security to data as the data is not encrypted prior to its transmission.
iii. TELNET
This is a network protocol that allows a computer user to gain access to another computer and
use its software and data, usually on a LAN and on the Internet. It allows users to access data
stored on servers from their terminals. Telnet allows computers to connect to each other and
allows sharing of data and files. Telnet has security problems especially on the internet.
iv. VoIP (Voice Over Internet Protocol)
It is a method of using the internet to make ordinary voice telephone calls. Thus it is a way of
having phone conversations using the internet as a way of communication. By VoIP,
international and long distance calls are of the same price as local calls and sometimes are for
free. However, the system does not offer emergency calls. An example of VoIP is Skype.
1. TCP/IP (Transmission Control Protocol/Internet Protocol)
At each level, additional information is added to allow service to be provided. This layered model is
also called protocol stack
NETWORKING
Types of networks
i. LAN (Local Area Network)
A LAN is a privately owned connection of computers on a very small geographical area for
sharing of data and files by users of the network, for example, within a single room. Usually
connected using cables or radio connections.
Hardware Requirements for a LAN
Network Interface Card (NIC):- Each computer on the network must have this as it allows
computers to be linked and to be uniquely identified on the network.
Server:- to store software that controls the network, software and files and also data that can
be shared by all users of the network
Hub or alternatively a Switch:
A hub is a device that connects workstations together in order to make a LAN. It receives
signal/data from workstations, regenerates it and then sends it to all ports on it. Thus all
workstations connected to it will get the signal or data packets. Hubs are less intelligent, they
do not determine the exact computer the data is addressed to and so they broadcast the signal.
This is a security risk. It is usually used on a star network or on a hybrid network. A hub has
many ports on which cables to all computers on the network are connected.
A switch is a networking device that allows multiple devices and workstations to be
connected to each other on a LAN just as a hub does. However, a switch is more intelligent
than a hub. A switch directs traffic across a LAN, enabling computers to talk to each other
and share resources. It joins computers on a LAN and is found at layer 2 of the OSI reference
model. It allows different nodes on the network to directly communicate with each other. A
switch runs in full duplex mode. It can recognise different devices on the network using their
MAC address so that data and signals can be sent to exact/intended devices. This is more
secure than a hub. Switches can be LAN switches or ATM switches which are used on WANs
and MANs.
Terminals:- these are computers that are connected to each other through a server and cannot
work without the server. Terminals can be dumb or intelligent. A dumb terminal has
neither processing nor storage capabilities and thus wholly depends on the host
computer for it to work. An intelligent terminal has limited processing and/or storage
capabilities.
Workstation:- these are the computers connected to the server and are less powerful than the
server
Cables: - connects computers together and acts as pathway for data moving from one
workstation to another.
Bridge: - this is a device that connects networks using the same communication protocols. It
is used to connect different parts of a LAN, thus is used to connect different LAN segments
together. However, it cannot handle multiple paths for data. In general a bridge is used for:
Routers:- a router is a network device that connects different types of networks together, for
example, it connects a school LAN to the internet (which is a WAN). It can route packets of the
same protocol (e.g. TCP/IP) over networks with dissimilar architectures (e.g. Ethernet to
token ring). It receives transmitted messages and forwards them to their correct destinations
over the most efficient available route. A router is used to form complex networks with
multiple paths between network segments (subnets), each subnet and each node on each
subnet is assigned a network address.
Compiled By Kapondeni T.
A router is very intelligent. It uses network addresses and IP addresses of other routers to
create routes between two networks. They keep tables of addresses that will be used in
routing information. Routers are thus used for:
Determining the path of data packets using destination addresses of the packets.
Used for packet switching
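As an illustration of how a table of addresses can be used to pick a route, here is a minimal sketch
using Python's standard ipaddress module. The networks, the next-hop names and the rule of
preferring the most specific matching network are assumptions made for the example; they are not
taken from these notes.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop) pairs.
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "LAN port"),
    (ipaddress.ip_network("10.0.0.0/8"), "router B"),
    (ipaddress.ip_network("0.0.0.0/0"), "ISP link"),   # default route
]

def next_hop(destination):
    """Return the next hop for the most specific network containing the address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    best_net, best_hop = max(matches, key=lambda pair: pair[0].prefixlen)
    return best_hop

print(next_hop("192.168.1.20"))
print(next_hop("8.8.8.8"))
```

The default route (0.0.0.0/0) matches every address, so a packet for an unknown network is passed
on towards the ISP rather than dropped.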
Gateway: - a device used to connect different kinds of networks. They act as a link to different
WANs. A gateway is a device that connects networks with different architectures and
different protocols. When packets arrive at a gateway, the software strips all networking
information from the packet, leaving only the raw data. The gateway translates the data into
the new format and sends it on using the networking protocols of the destination system. Thus
it becomes a protocol converter.
Modem (MOdulator DEModulator):- This is a device that converts digital signal received
from a computer into an analogue signal that can be sent along ordinary telephone lines, and
back to digital at the receiving end. Mostly used to connect to the internet using the ordinary
telephone line. The speed of modems is measured in bits per second e.g. 56K bps. The
following parameters must be specified when a modem is installed:
the telephone number of the ISP
baud rate of modem
number of data bits per block
number of stop bits
whether odd or even parity is used
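The parity setting can be illustrated with a short sketch. The function names parity_bit and check,
and the 7-bit example data, are chosen for illustration only.

```python
def parity_bit(data_bits, even=True):
    """Return the parity bit for a string of '0'/'1' data bits."""
    ones = data_bits.count("1")
    if even:
        return ones % 2          # make the total number of 1s even
    return (ones + 1) % 2        # make the total number of 1s odd

def check(frame, even=True):
    """Check a received frame (data bits followed by one parity bit)."""
    ones = frame.count("1")
    return ones % 2 == 0 if even else ones % 2 == 1

data = "1011001"                       # 7 data bits, four 1s
frame = data + str(parity_bit(data))   # even parity
print(frame, check(frame))
```

If a single bit is changed during transmission, the count of 1s no longer matches the agreed parity
and check returns False. Both ends must therefore be set to the same parity type.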
Cable modems - employ broadband transmission across regular cable television wires
Integrated Services Digital Network (ISDN) line: a digital telephone service that
provides fast, accurate data transmission over existing copper telephone wiring, for internet
connection. It is a set of communication standards for simultaneous digital transmission of
voice, video, data, and other network services over the traditional circuits of the public
switched telephone network. ISDN is a line that allows the transmission of digital signals
without them being changed into analogue which leads to improved quality for the user. It
requires a network adapter and a network termination device (no modem required)
Asymmetric Digital Subscriber Line (ADSL) - offers Internet connection up to 30 times faster
than dial-up modems, still using traditional copper wires but allocating more bandwidth to the
data flow from the ISP to the PC than is allocated from the PC to the ISP.
Dial-up networking: the user pays for the amount of time spent using the telephone link. It is less
expensive and more appropriate for low-volume applications requiring only occasional
transmission.
Dedicated/leased line: the line is continually available for transmission and the user pays a
flat rate for total access to the line. It transmits data at higher speeds. It is more appropriate
for high volume transmission
Value-added network (VAN):- This is a private, multipath, data-only, third-party managed
network. It is used by multiple organisations. It may use ISDN lines, satellite links etc. It is set
up by a firm in charge of managing the network. Its subscribers pay a subscription fee and for
data transmission time. The cost of using the network is shared among many users. Subscribers
do not have to invest in network equipment or perform their own error checking, routing and
protocol conversion.
Electronic data interchange (EDI):- the virtually instantaneous electronic transmission of business
data from one firm's computerised information system to that of another firm, e.g. transmitting
A level results to schools using BT's CampusConnect. It increases accuracy and eliminates
delays.
Internetwork:- This is created when two or more independent networks are connected but
continue to function separately e.g. Internet. In larger networks it is common to supply
multiple paths through the network to provide fault tolerance
Token passing
a small packet called a token is passed around the ring to each computer in turn
to send information, a computer modifies the token, adds address information and sends it
down the ring
information travels around the ring until it reaches its destination or returns to the sender
when a packet is received by the destination computer, it returns a message to the sender
indicating its arrival
2. Star Network:
Computers form a star shape with host computer at the centre.
The Server (host computer) manages all other computers/terminals on the network.
If the terminals are not intelligent, they have to rely on the host computer for everything.
This network is as shown below:
- No central device oversees a mesh network, and no set route is used to pass data back and forth
between computers.
- Thus, if any one computer is damaged or temporarily unavailable, information is dynamically
rerouted to other computers, a process known as self-healing.
all components are connected via a backbone (a single cable segment connecting all the
computers in a line)
entire network will be brought down by a single cable break
terminator at the end of the line absorbs all signals that reach it to clear the network for new
communication
data is sent in packets across the network and received by all connected computers; only the
computer with the packet destination address accepts the data
only one computer can send information at a time
Ethernet uses a collision system - carrier sense multiple access with collision detection
(CSMA-CD) - if transmitted messages collide, both stations abort and wait a random time
period before trying again.
network performance degrades under heavy load
Definition of Terms
(a) Bus/Backbone: the dedicated and main cable that connects all workstations and other computer
devices like printers.
(b) Nodes: these are connection points for workstations and the bus.
(c) Terminator: a device at each end of the bus that prevents data from bouncing back along the
bus, which would cause noise and loss of data.
Advantages of Bus network
- If one workstation breaks down, others will remain functional.
- If one workstation breaks down, the network remains working.
- All computers have processing and storage capabilities.
- It is cheap to install due to less cabling.
- Easy to add workstation without disrupting the network.
- Requires less cabling than a star network.
- Less expensive network than the other systems
Disadvantages of Bus Network
- Computers cannot send data at the same time nor while there is data being transferred in the bus.
- Can cause collision of data during transmission.
- It is slow in transferring data.
- Its requirements are expensive, that is computers with their own processors and storage facilities.
- The system will be down if the main cable (bus) is disrupted at any point.
- Less secure.
- Performance worsens as new stations added
5. Hybrid
This topology is a combination of two or more different network topologies into one. When different
topologies are connected to one another, they do not display characteristics of any one specific
topology.
Advantages of hybrid topology
1. Flexibility:- adding / removing other peripheral connections is easy.
2. More reliable: it is easier to isolate the different topologies connected to each other and find the
fault with the hybrid topology.
3. Speed: Speed is consistent, combines strengths of each topology and eliminates weaknesses
4. Effective: The weaknesses of the different topologies connected are neglected and only the
strengths are taken into consideration.
5. Scalable: It is easy to increase the size of network by adding new components, without disturbing
existing architecture.
Weaknesses of hybrid topology
1. Since different topologies come together in a hybrid topology, managing the topology becomes
difficult.
2. It is also very expensive to maintain. The cost of this topology is higher as compared to the other
topologies. Cost factor can be attributed to the cost of the hub/switch, which is higher, as it has to
continue to work in the network even when any one of the nodes goes down.
3. Costly Infrastructure: The cost of cabling also increases, as a lot of cabling has to be carried out
in this topology.
4. Installation and configuration of the topology is difficult since there are different topologies,
which have to be connected to one another.
NB: Point-to-Point Connection: Point-to-point topology is the simplest connection, consisting of
two connected computers.
Media Access Methods
A. Carrier Sense Multiple Access (CSMA): CSMA is a contention access method in which each
station first listens to the line before transmitting data. It first of all checks if there is data in the
transmission channel before transmitting. Thus it cannot transmit while another device is
transmitting.
a) CSMA/CD (Carrier Sense Multiple Access with Collision Detection)
This is an access method in which a station transmits whenever the transmission medium is available
and retransmits when collision occurs. A device first listens before transmitting, and if the channel is
idle, it sends the data. If the channel is busy, it continues listening until the channel is no longer busy.
However, two devices (stations) may be listening at the same time and then transmit simultaneously
when they detect that the channel is idle. This causes a collision. The transmitting devices detect that
a collision has occurred, cancel all the data in transmission and broadcast a message to the other
stations that a collision has occurred. The stations then each wait a random period of time before
they start listening again in order to re-transmit.
CSMA/CD control software is relatively simple and produces little overhead. CSMA/CD network
works best on a bus topology with burst transmission
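The random waiting period after a collision can be sketched as below. This is a simplified model
under stated assumptions: the function names, the fixed range of 8 time slots and the seed are
invented for the example. Real Ethernet uses binary exponential back-off, where the range of
possible waits grows after each repeated collision.

```python
import random

def csma_cd_round(stations, rng, max_wait=8):
    """Resolve one collision: each station waits a random number of slots.

    Returns the station that transmits first, or None if the shortest
    wait is shared (a fresh collision, so another round is needed).
    """
    waits = {s: rng.randrange(max_wait) for s in stations}
    shortest = min(waits.values())
    winners = [s for s, w in waits.items() if w == shortest]
    return winners[0] if len(winners) == 1 else None

def resolve(stations, seed=0):
    """Repeat back-off rounds until exactly one station gets the channel."""
    rng = random.Random(seed)
    rounds = 0
    while True:
        rounds += 1
        winner = csma_cd_round(stations, rng)
        if winner is not None:
            return winner, rounds

winner, rounds = resolve(["A", "B"])
print(winner, rounds)
```

The number of rounds needed is not fixed in advance, which is what is meant by calling CSMA/CD a
probabilistic access method.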
Disadvantages
CSMA/CD protocols are probabilistic and depend on the network (cable) loading.
Considered unsuitable for channels controlling automated equipment that must have certain
control over channel access. (This could be OK for different channel access).
We cannot set priorities to give faster access to some devices (This is, probably, not an issue in
some applications)
A token is a special authorising message that temporarily gives control of the channel to the
device holding the token.
Passing the token around distributes access control among the channel's devices.
Each device knows from which device it receives the token and to which device it passes the
token.(see fig.)
Each device periodically gets control of the token, performs its duties, and then retransmits
the token for the next device to use.
System rules limit how long each device can control the token.
Whenever the network is unoccupied, it circulates a simple three-byte token.
This token is passed from NIC to NIC in sequence until it encounters a station with data to
send.
That station waits for the token to enter its network board. If the token is free the station
may send a data frame.
This data frame proceeds around the ring regenerated by each station.
Each intermediate station examines the destination address, if the frame is addressed to
another station, the station relays it to its neighbor.
If the station recognizes its own address, it copies the message, checks for errors, and
changes four bits in the last byte of the frame to indicate address recognized and frame
copied.
The full packet then continues around the ring until it returns to the station that sent it.
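The journey of one data frame around the ring can be traced with a small sketch. The station names
and the function send_around_ring are hypothetical, and the single "copied" flag stands in for the
bits that the destination changes in the last byte of the frame.

```python
def send_around_ring(ring, sender, destination, message):
    """Trace one data frame around the ring, starting after the sender."""
    frame = {"to": destination, "data": message, "copied": False}
    delivered = {}
    path = []
    start = ring.index(sender)
    # Visit each station in ring order until the frame is back at the sender.
    for step in range(1, len(ring) + 1):
        station = ring[(start + step) % len(ring)]
        path.append(station)
        if station == sender:
            break                          # frame has returned to its sender
        if station == frame["to"]:
            delivered[station] = frame["data"]
            frame["copied"] = True         # destination marks the frame as copied
    return path, delivered, frame["copied"]

ring = ["A", "B", "C", "D"]
path, delivered, copied = send_around_ring(ring, "B", "D", "hello")
print(path, delivered, copied)
```

When the frame arrives back at the sender with the copied flag set, the sender knows the message
was received.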
Advantages
Even though there is more overhead using tokens than using CSMA/CD, performance
differences are not noticeable with light traffic and are considerably better with heavy loads
because CSMA/CD will spend a lot of time resolving collisions.
A deterministic access method such as Token Ring guarantees that every node will get access
to the network within a given length of time. In probabilistic access method (such as
CSMA/CD) nodes have to check for network activity when they want to access the network.
Disadvantages
Components are more expensive than for Ethernet or ARCnet.
Token Ring architecture is not very easy to extend to wide-area networks (WANs).
Token Ring network is much more expensive than Ethernet. This is due to the complex token
passing protocol.
d) Contention
With contention systems, network devices may transmit whenever they want.
No referee mandates when a device may or may not use the channel.
Stations simply transmit whenever they are ready, without considering what other stations
are doing.
Unfortunately, the "transmit whenever ready" strategy has one important shortcoming: two or
more stations may transmit at the same time.
When this happens, the resulting co-mingling of signals usually damages both to the point
that a frame's information is lost.
This unhappy event is called a "collision."
Polling Access Method
Polling is an access method that designates one device (called a "controller", "primary", or
"master") as a channel access administrator.
This device (Master) queries each of the other devices (secondaries) in some
predetermined order to see whether they have information to transmit.
If so, they transmit (usually through the master).
Secondaries may be linked to the master in many different configurations.
One of the most common polling topologies is a star, where the points of the star are
secondaries and the master is the hub.
To get data from a secondary, the master addresses a request for data to the secondary, and
then receives the data that the secondary sends (if the secondary sends any).
The primary then polls another secondary and receives the data from the secondary, and so
forth.
System limits how long each secondary can transmit on each poll.
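The polling cycle described above can be sketched as follows. The station names, the message
queues and the limit of two messages per poll are assumptions made for illustration.

```python
from collections import deque

def poll(secondaries, max_per_poll=2):
    """Master polls each secondary in a fixed order until all queues are empty.

    `secondaries` maps a name to a deque of waiting messages. At most
    `max_per_poll` messages are taken from a secondary on each poll.
    """
    received = []
    while any(secondaries.values()):
        for name, queue in secondaries.items():      # predetermined order
            for _ in range(max_per_poll):
                if not queue:
                    break                            # nothing (more) to send
                received.append((name, queue.popleft()))
    return received

stations = {
    "S1": deque(["a1", "a2", "a3"]),
    "S2": deque(),
    "S3": deque(["c1"]),
}
print(poll(stations))
```

Because S1 may send only two messages per poll, S3 gets its turn before S1 sends its third
message. The master still spends time polling S2 even though it has nothing to send, which is the
overhead mentioned under the disadvantages.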
Advantages
Maximum and minimum access times and data rates on the channel are predictable and
fixed.
Polling is deterministic and is considered suitable for channels controlling some kinds of
automated equipment.
Disadvantages
Polling systems often use a lot of bandwidth sending notices and acknowledgments or
listening for messages.
Line turnaround time on a half- duplex line further increases time overhead.
This overhead reduces both the channel's data rate under low loads and its throughput.
FILE ORGANISATION
Refers to the way in which records in a file are stored, retrieved and updated. This affects the number
of records stored, access speed and updating speed. The most common methods of file organisation
are: Serial File Organisation, Sequential File organisation, indexed sequential file organisation
and random (direct) file organisation.
1. Serial File Organisation: This is whereby records are stored one after another as they occur,
without any definite order as on magnetic tapes. Data is not stored in any particular sequence. Data is
read from the first record until the needed data is found. New records are added to the end of the
file. Serial file organisation is not appropriate for master files since records are not sorted and
therefore are difficult to access and to update. Suitable for temporary/ transaction files since records
are not sorted.
To delete records:
Deletion is more complex than adding a record.
Read the key of the record to be deleted, then search the file from the 1st record until the record is found.
Re-write the whole file to a new disk, omitting the unwanted record.
To delete a record, the whole file is copied over to a new sequential file, omitting the record to be
deleted.
Processing of records is faster than that of serial files.
Hit rate: the proportion or percentage of records being accessed on any one run. In payroll systems, the
hit rate is mostly 100% since every employee will be paid. Hit rate is calculated by dividing the
number of records accessed by total number of records in the file and then multiplying by 100. For
example, if 270 records are accessed out of 300 records, the hit rate is 270/300 x 100 = 90%
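The hit rate calculation above can be written as a one-line function. The name hit_rate is chosen
for illustration.

```python
def hit_rate(records_accessed, total_records):
    """Percentage of the file's records accessed in one run."""
    return records_accessed * 100 / total_records

print(hit_rate(270, 300))   # the worked example above
print(hit_rate(300, 300))   # a payroll run that touches every record
```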
3. Indexed-Sequential Files: This is whereby records are ordered in sequence based on the value of
the index or disk address as supported by hard disks. It supports batch processing. It is also used for
creating master file since the records are ordered. It is also suitable for real time processing
applications like stock control as it is fast in accessing records and in updating them. It provides
direct access to data as on hard disks, diskettes and compact disks. It ensures that data is accessed in
some order. It ensures that no data is missed during accessing. Can provide direct access if requests
are sent online.
Indexed sequential files consist of 3 basic parts:
the index
The home area
Overflow area
The index:
Contains record keys and disk addresses. The record key can be one or more fields that uniquely
identify a record. Each record key is associated with a disk address (which can be surface, track and
sector number) to identify the specific sector of the home area. Thus the index points to the home
area.
The Home Area
This contains the data records stored in record key sequence. The home area is in sequence and can
be accessed sequentially. In some situations, it can be accessed randomly using the index.
It allows data to be stored in blocks that contain several records. A block may be one or more sectors
of the disk.
Each block may be partially filled in order to allow new records to be added later. For example, if a
block can accommodate 12 records, 8 records may be saved in each block, allowing new records to
be added during execution. This is called packing density, which is usually 70% or more. Thus if the
computer is using 70% packing density, it means, data is stored in 70% of each block in the home
area. The packing density is always less than 100% to allow insertion of additional records later.
The home area also points to the overflow area.
Overflow area
The home area may become full and unable to accommodate all records. In this case, the remaining
part of the home area just stores pointers that indicate the position in the overflow area of any
additional records.
NB: However, it may then take longer to process the records, because some records will have been
placed in the overflow area. After reading the index, it takes a single disc access to read a record in
the home area. Each time a record in the overflow area is needed, it takes at least two disc accesses:
one to read the home area and one to read the overflow area. This problem can be solved by
re-organising the file using a housekeeping program, which copies the file to a new file, placing all
the overflow records into the home area and re-writing the indices.
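The way the index, home area and overflow area work together can be sketched as below. The
layout is hypothetical: three small home blocks, an index holding the highest key of each block,
and a single overflow list. The count of disc accesses follows the NB above and ignores the cost of
reading the index itself.

```python
import bisect

# Hypothetical file layout: three home blocks holding records in key order.
home_area = [
    [(101, "Ann"), (105, "Ben")],
    [(210, "Cara"), (260, "Dan")],
    [(330, "Eve"), (390, "Farai")],
]
# The index holds the highest key in each block; position = block number.
index = [105, 260, 390]
# Records that did not fit in their home block.
overflow_area = [(220, "Gift")]

def find(key):
    """Return (name, disc_accesses) or (None, disc_accesses) if absent."""
    block_no = bisect.bisect_left(index, key)
    if block_no == len(index):
        return None, 0                       # key is beyond the last block
    accesses = 1                             # one access to read the home block
    for record_key, name in home_area[block_no]:
        if record_key == key:
            return name, accesses
    accesses += 1                            # a second access for the overflow area
    for record_key, name in overflow_area:
        if record_key == key:
            return name, accesses
    return None, accesses

print(find(260), find(220))
```

Record 260 is found with one access, while record 220 needs two because it sits in the overflow
area. Re-organising the file would move it back into a home block.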
4. Random (Direct/Hash/Relative) File Organisation: This is whereby records are not in any order
but stored and accessed according to their disk address or relative position, calculated from the
primary key of the record, as supported by hard disks and compact disks. Records are stored and
retrieved according to their disk address / relative position within file. The hashing algorithm/formula
translates the primary key into an address, using the modulo method.
To add a new record, use the hashing algorithm to work out the appropriate memory location. If the
location is empty, the record is inserted/written, otherwise the next block is examined until an empty
space is found.
To search/access a record, its address is calculated from the record key using the hashing algorithm
and the record at that address is then read. If it is not the required record, the next record is read and
examined until either the record is found or an empty space is encountered. Suitable for online
systems where fast response is required.
To delete a record, set flag to zero but leave the value there, therefore space can be reused but is not
actually empty. The record is not physically deleted but just marked as deleted.
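The add, search and delete steps above can be sketched in Python. This is a minimal model, not a
real disk file: the file size of 7 locations, the list called slots and the marker "deleted" are
assumptions made for the example.

```python
FILE_SIZE = 7                      # hypothetical number of record locations
EMPTY = None
DELETED = "deleted"                # flag left behind when a record is removed
slots = [EMPTY] * FILE_SIZE

def hash_address(key):
    return key % FILE_SIZE         # modulo method

def add(key, data):
    addr = hash_address(key)
    for _ in range(FILE_SIZE):
        if slots[addr] is EMPTY or slots[addr] == DELETED:
            slots[addr] = (key, data)
            return addr
        addr = (addr + 1) % FILE_SIZE      # try the next location, wrapping round
    raise RuntimeError("file is full")

def search(key):
    addr = hash_address(key)
    for _ in range(FILE_SIZE):
        if slots[addr] is EMPTY:
            return None                    # truly empty space: record is absent
        if slots[addr] != DELETED and slots[addr][0] == key:
            return addr
        addr = (addr + 1) % FILE_SIZE
    return None

def delete(key):
    addr = search(key)
    if addr is not None:
        slots[addr] = DELETED              # mark only; searches still probe past it

add(10, "x")    # 10 % 7 = 3
add(17, "y")    # 17 % 7 = 3, clashes, goes to 4
delete(10)
print(search(17), search(10))
```

Record 17 is still found after record 10 is deleted because location 3 is only flagged, not emptied.
If it were really emptied, the search for 17 would stop at location 3 and wrongly report that the
record is missing.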
Structure of random files
Records are stored in blocks which are not necessarily in sequence. The position of the record is
determined by a hashing algorithm or randomising function. When a record is to be stored in a
file, a hashing algorithm is applied to the record key to determine the block that is to be used. For
example, the blocks may range from 0 to 499, and the hash algorithm generates the number within
this range. The records are stored in this format:
Record Key    Block
22387         300
13495         201
58905         104
48676         349
68798         34
It is appropriate where extremely fast access to data is required as in airline reservation. Updating of
records is in situ, very simple and very fast. Hard disks, compact disks and diskettes support
random file organisation.
When records are deleted, they are just marked as deleted but are not removed from the file. These
deleted records take up space and may slow down processing. This can be solved by saving the
records on a different file, leaving out the deleted records.
Overflow
If there is no space on the block, collision is said to have occurred and the record must be stored
elsewhere.
A re-hashing algorithm is carried out on the block that is full in order to give another block that is not
full. If the given block is full again, the hashing algorithm is applied again until an empty block is
found. The overflow area can be used just as in the indexed sequential files.
NB:- If no further information is given, assume that overflow records are stored in the next block
Hashing algorithm - used to translate record key into an address. However, synonyms may occur,
i.e. two record keys generate the same address (use overflow area and flag)
To solve problems of clashes of blocks after applying the hashing algorithm:
(a) Subsequent locations are read until empty location is found. The record is inserted in the
empty location. If the maximum address is reached, it loops back to the first address, i.e.
position 000
(b) A bucket (area of memory) can be set aside for overflow. Any clashing record is inserted in
the bucket or in next location in serial form.
(c) Another method is to use the existing record as the head of list. Pointers are then used to point
to records with the same hash value. New values are inserted in free location.
FILE PROCESSING
Refers to any form of activity that can be done using files. This includes: file referencing, sorting,
maintenance and updating.
1. File Referencing/Interrogation: This involves searching of record and displaying it on the screen
in order to gain certain information, leaving it unchanged. The record can also be printed.
2. Sorting: Refers to a process of arranging (organising) records in a specific ordered sequence, like
in ascending or descending order of the key field.
3. Merging Files: This is the process of combining two or more files/records of the same structure
into one. Below is an example of how records can be merged:
Record A (sorted):   12, 34, 71, 78, 101
Record B (unsorted): 103, 67, 3, 90, 12
Record C (Merged and sorted for records A and Record B): 3, 12, 34, 67, 71, 78, 90, 101, 103
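The merge in the example above can be sketched as follows. The function name merge_sorted is
hypothetical. The sketch assumes that Record B is sorted first and that a key appearing in both
inputs (here 12) is written only once, which matches Record C in the example.

```python
def merge_sorted(a, b):
    """Merge two sorted lists of keys into one sorted list without duplicates."""
    merged = []
    i = j = 0
    while i < len(a) or j < len(b):
        if j == len(b) or (i < len(a) and a[i] <= b[j]):
            key = a[i]
            i += 1
        else:
            key = b[j]
            j += 1
        if not merged or merged[-1] != key:    # skip a key already written
            merged.append(key)
    return merged

record_a = [12, 34, 71, 78, 101]               # already sorted
record_b = sorted([103, 67, 3, 90, 12])        # must be sorted before merging
print(merge_sorted(record_a, record_b))
# prints [3, 12, 34, 67, 71, 78, 90, 101, 103]
```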
4. File maintenance: This is the process of reorganising the structure of records and changing
(adding, removing or editing) fields. It may also involve updating the more permanent fields on
each record, and adding or deleting records.
5. File Updating: Updating is the process of making necessary changes to files and records, entering
recent information. Only master files are updated and they must be up-to-date. For updating to
occur, any one of the following must have occurred:
A new record has been entered.
An unwanted record has been deleted.
An amendment (change) to the existing data has been made, e.g. change in date of birth only.
The most common methods of file updating are:
Updating in situ and Updating by copying.
a. Updating by copying
This happens in sequential file updating. The transaction file must be sorted in the same order as
the master file records. This is done through the following steps:
- A record is read from master file into memory.
- A record is then read from transaction file into memory.
- Record keys from each file are compared.
- If record keys are the same, the master file is updated by moving fields from transaction file
to the new master file.
In sequential file updating, it is recommended to keep at least three master file versions that will be
used for data recovery in case of a system failure or accidental loss of data. The first master file is
called the Grandfather file, the second master file is called the father file and the third master file is
the son file. This relationship is called the grandfather-father-son version of files. The process of
keeping three versions of master files (grandfather-father-son) as a result of sequential file updating
is called File Generations. Thus the first master file (grandfather file) is called the first generation
file, the second master file (father file) is called the second generation file and the third master file
(son file) is the third generation file. The following diagram illustrates the sequential file updating
process:
*NB: - Always create data backups on compact disk or hard disks and re-run the old master file with
the transaction file if the computer system fails or if data is lost. This is a data recovery method that
works well.
*NB:- A backup is a copy of file(s) on an alternative medium like CD-ROM, kept in case the original
file is damaged or lost, and used for recovery purposes. The original files could be deleted
accidentally, deleted by hackers, corrupted by system failure or corrupted by hackers.
Algorithm for sequential file updating
Open master file for reading
Open transaction file for reading
Open new master file for writing
Repeat
    Read next transaction file record
    While master file record key < transaction file record key
        Write master file record to new master file
        Read next master file record
    End While
    Update record
Until EOF (Transaction file)
While not EOF Master File
    Read next record
    Write master record to new master file
End While
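The algorithm above can be sketched in Python using lists in place of files. The function name
update_master and the (key, value) record format are assumptions for illustration. The sketch also
assumes that a transaction with no matching master record is added as a new record. The old
master list is left unchanged, so it can serve as the father file.

```python
def update_master(master, transactions):
    """Build a new master file from a sorted master file and transaction file.

    Both inputs are lists of (key, value) pairs sorted by key. A transaction
    replaces the value of the master record with the same key; a transaction
    with no matching master record is added as a new record.
    """
    new_master = []
    m = 0
    for t_key, t_value in transactions:
        # Copy unchanged master records that come before this transaction.
        while m < len(master) and master[m][0] < t_key:
            new_master.append(master[m])
            m += 1
        if m < len(master) and master[m][0] == t_key:
            m += 1                              # old version is superseded
        new_master.append((t_key, t_value))     # write updated or new record
    new_master.extend(master[m:])               # copy the rest of the master file
    return new_master

old_master = [(1, 100), (2, 200), (4, 400)]
transactions = [(2, 250), (3, 300)]
print(update_master(old_master, transactions))
# prints [(1, 100), (2, 250), (3, 300), (4, 400)]
```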