Computer Programming For Beginners
A Step-by-Step Guide
Murali Chemuturi
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made
to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all
materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all
material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future
reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized
in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying,
microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.
copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-
8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that
have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
This book is aptly christened as Programming for Beginners: A Step-by-Step Guide. The
author, Murali Chemuturi, has truly “aimed at making you a star programmer, a pro-
fessional programmer …” The perspective of the author has been to hand-hold a person
desirous of entering this profession or consolidate upon previously acquired skills in order
to carve out and tailor a programmer to meet the requirements of the present-day IT and
ICT industry.
Introducing the basics of computers, data and data types, their storage, and retrieval
including DBMS as well as computer programs/programming, Murali has elaborated the
execution of the program by the computer. Very few authors would dare to talk about the
operating systems as well as their operations in a book on Computer Programming. In
the next few chapters, the audience is subjected to the rigors of algorithms and flowcharts
and handling data in real-life programs while emphasizing and discussing their standard
techniques threadbare. This is followed by a detailed discussion of expressions, control
statements, I/O, and other related statements explaining the concepts of using advanced
aspects of programming.
Murali points out that “The actual implementation of these aspects differs from OS to OS
and one programming language to another and all programming languages have not imple-
mented all these aspects.” Hence the need to use these concepts in programming necessarily
requires that the programmer must thoroughly read the relevant parts of the program-
ming language manual, understand it fully, and experiment before implementation.
The subsequent chapters discuss methods of error handling with facilities provided in
the OS, interprogram communication coding, debugging, and performance tuning sub-
routines with best practices and pitfalls as well as building and using libraries and pro-
gramming device drivers.
Murali has dwelt upon programming multilanguage software and making the software
amenable for use in multiple languages which would help ease of programming in the
various languages in the world.
Finally, Murali Chemuturi has used all his experience to educate the not-so-
knowledgeable entrants to the evolution of programming languages right from the
days of COBOL and FORTRAN to the present-day 4GLs (4th Generation Languages).
Programming standards and guidelines, as well as the scope of these guidelines with
specific coding guidelines for each of the programming languages, have also been
included for ease of understanding and maintenance. Personal Software Process for
monitoring productivity as a metric for quality and efficiency forms the ultimate chapter
and includes peer review, coding, and testing methodology.
This book is strongly recommended to stalwarts in coding as a quick refresher
for a longer innings as professional programmers, and to beginners who wish to become
“star programmers.”
Preface
programmers to quickly master a new language. The result is that the programmer looks
for another trade after two or three new languages.
The learning of newer languages becomes extremely difficult because the initial learn-
ing was defective. If we give the comprehensive concepts about programming, rather than
giving a few keywords, then learning a new programming language will be easy because
it is applying the same concepts with new keywords. If I can, I wish to prevent programmers
from falling by the wayside because they are not able to learn a new language quickly
enough to retain their jobs. This book is my attempt to do just that.
I did quite a bit of teaching and programming myself, and all the books I could find
on computer programming were tied to some language or the other. The titles were like
Programming with C, Programming with Visual Basic, Java Programming, and so on. Most of
those books are devoted to getting the reader off the ground to write a simple program and
execute it successfully. The concepts and the constructs are not completely covered. I see
this as a gap, and I want to bridge it. This book is not tightly coupled with any
programming language, nor is it aimed at getting you to write a simple program
and execute it to make you feel good about your programming skills. This book is aimed
at making you a star programmer, a professional programmer, a Philippe Kahn and a Bill
Gates! Perhaps it is an exaggeration, but my intention was to write a comprehensive book
to give you a complete idea of all the concepts and constructs of computer programming.
Once you finish and master these concepts, I am sure you will be ready to conquer any new
programming language that may come on the horizon. I wish you all the best.
Feel free to email me at murali@chemuturi.com and I promise to reply.
Acknowledgments
When I look back, I find that there are so many people to whom I should be grateful. Be it
because of their commissions or omissions, they made me a stronger and better person,
and both directly and indirectly helped to make this book possible. It would be difficult
to acknowledge everyone’s contributions here, so to those whose names may not appear,
I wish to thank you all just the same. I will have failed in my duty if I do not explicitly
and gratefully acknowledge the following persons:
• My parents, Appa Rao and Vijaya Lakshmi, the reason for my existence—especially
my father, a rustic agrarian, who by personal example taught me the virtue of hard
work and the value of the aroma of perspiration from the brow
• My family, who stood by me like a rock in times of struggle—especially my wife
of 44 years, Udaya Sundari, who gave me the confidence and the belief that I can,
and my two sons, Dr. Nagendra and Vijay, who provided me the motive to excel
• My two uncles, Raju and Ramana, who by personal example taught me what
integrity and excellence mean
To all of you, I humbly bow my head in respect, and salute you in acknowledgement of
your contribution.
Murali Chemuturi
About the Author
1
Introduction to Computers
What Is a Computer?
People may say that perhaps 20 years ago we might have needed this chapter, but not
now! Perhaps they are right, but in my interactions with people, I have seen misunder-
standings or different understandings about computers and what they can do. I have
seen computer science graduates look at me incredulously when I asked them this ques-
tion, but when pressed, they could not provide me with a credible answer. So, by defin-
ing and explaining what a computer is, I, the author, and you, the reader, would be on
the same page.
One definition is, “one who computes is a computer.” Yes, a computer does compute.
And some time back, organizations had employees with the designation of “computer.”
He/she was usually in the finance and accounts department checking all the important
computations carried out by other employees in the department. A computer was originally
developed to solve mathematical problems, but today’s computers do more than just solve
mathematical problems. So, this definition, while it is correct, does not adequately define
the computers of today.
Another definition of computer is, “A computer is a data processing tool.”
The key words in this definition are:
1. Computer: This describes the hardware part of the computer. It is a fast, electronic
machine that is versatile and assists human beings in a variety of applications. It
works accurately and diligently. By diligence, I mean not being subject to fatigue
or monotony. It performs the computations a million times with the same accu-
racy and speed as it would the first time.
2. Data: Data is facts and figures about entities. An entity is a person, place, thing,
or transaction. Data comprises attributes describing the entities. A person
is described by his/her personal attributes, such as height, weight, educational
qualifications, vocation, address, age, place of work, and so on. A place like a
building is described by its address, location, purpose of its usage, number of
people using it, number of rooms, other facilities, number of floors, and so on.
A town is described by its location, county, state, country, size of its population,
important landmarks, famous personalities, its history, and so on. A business
can be described by its date of incorporation, its location, address, its outputs,
its customers, its vendors, and so on. A thing can be an object, a machine, a
tool, an appliance, or something like that. It is described by its make, model,
alphabets in addition to numbers. A set of 8 bits was settled upon by the company IBM
to codify all characters for its mainframe computers, and it was called the EBCDIC
(Extended Binary Coded Decimal Interchange Code). IBM continues to use this code in
its mainframe computers to this day. Then the ASA (American Standards Association,
now the American National Standards Institute [ANSI]) developed another system of
codifying characters for use in computers that is called ASCII (American Standard
Code for Information Interchange). It originally used a 7-bit code that allowed 128 (2^7)
characters to be codified. Presently, ASCII code is extended to 8 bits, allowing 128 more
characters to be codified. ASCII is used in all personal computers and those computers
built using VLSI chips for the processor. There can be exceptions, but most computers,
other than mainframe computers, presently use ASCII code.
The set of 8 bits used for codifying characters is called a “byte.” A byte can store one
character inside the computer. So, a bit is the short form of “binary digit,” and byte is a set
of 8 bits and can store one character.
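The idea of a character being stored as a pattern of 8 bits can be tried out with a minimal sketch. Python is used here only for illustration, as this book is not tied to any language. The helper name to_bits and the sample letter "A" are my own choices, and "cp037" is one of the EBCDIC code pages that ships with Python, picked as an example.

```python
# Illustrative sketch: one character occupies one byte (8 bits).
# "cp037" is an EBCDIC code page bundled with Python; it is used
# here only as an example of an EBCDIC encoding.

def to_bits(value: int, width: int = 8) -> str:
    """Return the number as a string of binary digits, padded to `width` bits."""
    return format(value, f"0{width}b")

ascii_code = ord("A")                 # code of the letter A in ASCII
ebcdic_code = "A".encode("cp037")[0]  # code of the same letter in EBCDIC

print("ASCII :", ascii_code, to_bits(ascii_code))
print("EBCDIC:", ebcdic_code, to_bits(ebcdic_code))
print("7 bits give", 2 ** 7, "codes; 8 bits give", 2 ** 8, "codes")
```

The same letter gets a different bit pattern under each coding system, which is why data moved between a mainframe and a personal computer may need conversion.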
Components of a Computer
A computer is made up of two main components, namely, the hardware and the software.
Let us first look at hardware and then move on to software.
CPU
At the heart of present-day computers is the transistor, but the transistors are integrated
into a chip called the integrated circuit (IC). Depending on the number of transistors integrated
into the chip, they are called LSIC (Large Scale Integrated Circuits) and VLSIC (Very Large
Scale Integrated Circuits). Manufacturers can now pack well over a million transistors
onto a single chip. The base of the chip is a silicon crystal
on which the electronic circuits and transistors are etched. A VLSIC chip
is nothing less than a miracle. Because of this integration, not only did the price of the
computer come down, but its size also was brought down significantly. While the first
computer occupied a large room, its equivalent or even a more powerful computer is
accommodated in a handheld tablet today.
A computer consists of a processor unit, Random Access Memory (RAM), a control unit,
input devices, and output devices. Strictly speaking, the processor unit with its RAM is the
computer, and the rest are peripheral devices to the computer.
The processor unit is referred to as the Central Processing Unit, or simply, CPU. It is
the heart of the computer system. The CPU carries out all the data processing opera-
tions. All the data necessary for the CPU to process needs to be stored in the RAM.
The CPU and the RAM are very tightly coupled. The CPU contains what are referred
to as “registers.” Each register is a block of memory. Some of the important registers of
a processor are:
The CPU performs all the processing by moving data and instructions into and out of
the registers. How all that movement and processing happens is beyond the scope of this
book. I have included this information to give you an insight into the CPU. The size of the
register is specified as the number of bits it can store. The size of a register is generally a
whole number of bytes; that is, a register is 8, 16, 32, or 64 bits wide.
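To get a feel for what the register size means, here is a small sketch in Python (chosen only for illustration). It assumes the register holds an unsigned whole number; the function name max_unsigned is hypothetical.

```python
# Sketch: the largest unsigned whole number an n-bit register can hold.
# Assumption: every bit is used for the value (no sign bit).

def max_unsigned(bits: int) -> int:
    """Largest value that fits in a register of the given width."""
    return 2 ** bits - 1

for bits in (8, 16, 32, 64):
    print(f"{bits:2d}-bit register = {bits // 8} byte(s), "
          f"largest unsigned value = {max_unsigned(bits):,}")
```

Each additional bit doubles the number of values the register can represent, so a 64-bit register holds vastly more than a 32-bit one, not merely twice as much.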
Bus
The other component of the processor is the bus. Bus is in fact the short form of bus-bar.
“Bus-bar” is a phrase used in electrical engineering. Inside electricity-generating stations
and distribution stations, bus-bars are used to allow electricity to flow from one terminal
to another. Other electrical equipment can place taps on the bus-bars to draw electric cur-
rent as required. The electrical bus-bars are insulated or uninsulated heavy metal rods,
depending on the current that flows inside them. For a three-wire system, three bus-bars
would be needed. That is, the number of bus-bars needed would be determined by the
number of parallel circuits required.
The bus in the computer is built on the same premise. A bus is used to move data and
instructions between the CPU and RAM. Each line in the bus allows transmission of one
bit of information. The capacity of the bus is defined in the number of lines in the bus. It is
generally expressed as a 16-bit bus, 32-bit bus, or 64-bit bus, and so on.
Now the capacity of a processor in the computer is expressed in terms of the size of
the registers inside the CPU and the capacity of the bus. If the size of the register in the
CPU is 32 bits and the bus has 16 lines, the processor is referred to as 32/16 bit.
Usually, the sizes of the registers and the bus are maintained at the same level, but differ-
ence in their sizes is also possible.
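One consequence of a bus narrower than the registers can be sketched as follows. This is a simplified illustration in Python that assumes the bus moves one bit per line in each transfer and ignores everything else that happens on a real bus; the function name transfers_needed is hypothetical.

```python
# Sketch: how many bus transfers it takes to move one register's worth of data.
# Assumption: each transfer carries exactly `bus_lines` bits.

def transfers_needed(register_bits: int, bus_lines: int) -> int:
    """Number of transfers, rounded up to a whole transfer."""
    return -(-register_bits // bus_lines)  # ceiling division on integers

print(transfers_needed(32, 16))  # a 32/16-bit processor
print(transfers_needed(32, 32))  # registers and bus of the same size
```

Under these assumptions, a 32/16-bit processor needs two transfers to fill one register, while a processor with matching register and bus sizes needs only one.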
RAM
Then the other important component of the processor is the primary memory, or RAM.
While RAM is physically outside the processor, it is considered part of the CPU. Magnetic
cores of annular-ring shape were used for RAM in earlier computers but are now replaced
by silicon chips, commonly referred to as the memory chips. Each element of RAM has
only two states, which is a true or false state; in other words, it stores a value of zero or
one. Computers in the earlier days had very limited RAM, but now computers have a large
amount of RAM. Having a large amount of RAM shortens the response time of the
computer and increases its overall throughput.
The capacity of the RAM is expressed in terms of bytes: kilobytes (KB), megabytes (MB),
and gigabytes (GB). Presently, even tablets have gigabytes of RAM. A kilobyte has 1024
(2^10) bytes, a megabyte has 1,048,576 (2^20) bytes, and a gigabyte has 1,073,741,824 (2^30) bytes.
In common parlance, a KB is taken as a thousand bytes, an MB is taken as a million bytes,
and a GB is taken as a billion bytes.
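The gap between the exact sizes and the round figures of common parlance can be worked out with a short sketch. Python is used only for illustration, and the constant names KB, MB, and GB are my own.

```python
# Sketch: exact (power-of-two) sizes versus the round figures of common parlance.

KB = 2 ** 10   # 1,024 bytes
MB = 2 ** 20   # 1,048,576 bytes
GB = 2 ** 30   # 1,073,741,824 bytes

for name, exact, rounded in (("KB", KB, 10 ** 3),
                             ("MB", MB, 10 ** 6),
                             ("GB", GB, 10 ** 9)):
    extra = exact - rounded
    print(f"1 {name} = {exact:,} bytes, "
          f"{extra:,} more than the round figure ({extra / rounded:.1%})")
```

The difference is small for a kilobyte but grows with each step, which is worth remembering when a capacity quoted in round figures looks smaller than expected.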
How much RAM do we need if we want to have a computer that does not make us wait
endlessly for a response? It should be more than the sum of the RAM required for the operating
system and all the programs we wish to run concurrently. If you are using a Windows-based
PC or laptop, these details are given in the Task Manager, which you can open by
right-clicking the taskbar and selecting Task Manager.
System Clock
System clock is another component of the CPU. The system clock releases a pulse at regu-
lar intervals of time. The CPU performs one operation with every pulse released by the
system clock. Therefore, the number of operations performed by the CPU depends on the
number of pulses released by the system clock. The speed of the system clock is expressed
in hertz (Hz): kilohertz (kHz), megahertz (MHz), and gigahertz (GHz). Most present-day
PC processors run at speeds greater than 1 GHz. A hertz is the unit of measure for frequency.
Frequency is the number of cycles taking place per unit time. In the present case, a hertz is
equal to one cycle per second. A system clock speed of 1.6 GHz means the clock releases 1.6
billion pulses per second and, in this simplified picture, the CPU performs 1.6 billion operations per second.
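The arithmetic behind the clock speed can be sketched in a few lines. Python is used only for illustration; the sketch keeps the simplification that one operation is performed with every pulse, and the function names are hypothetical.

```python
# Sketch: what a clock speed means in pulses and time per pulse.
# Assumption (a simplification): one operation per clock pulse.

def pulse_interval_ns(clock_hz: int) -> float:
    """Time between two pulses, in nanoseconds."""
    return 1_000_000_000 / clock_hz

def operations_in(seconds: float, clock_hz: int) -> float:
    """Operations performed in the given time, at one operation per pulse."""
    return seconds * clock_hz

clock_hz = 1_600_000_000  # 1.6 GHz expressed in hertz
print(f"{clock_hz:,} pulses per second")
print(f"{pulse_interval_ns(clock_hz)} ns between pulses")
print(f"{operations_in(0.001, clock_hz):,.0f} operations in one millisecond")
```

At 1.6 GHz the pulses are less than a nanosecond apart, which gives an idea of why even long computations appear instantaneous to the user.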
Input/Output Devices
It is through the I/O (Input/Output) devices that we interact with the computer. The keyboard,
mouse, and display screen are the most commonly used I/O devices and are familiar to
any computer user. We can safely say that a computer is useless without the I/O devices.
The development of versatile I/O devices is the reason for the popularity of computers.
There are many other I/O devices for a computer and some of them are enumerated here:
1. Keyboard and mouse: I am sure that everyone is familiar with these two devices that
provide input to the computers.
2. Display screen: I am once again sure that everyone is familiar with this output
device. There are, of course, a variety of display screens, including cathode ray
tubes, which are on the way out, LCD (Liquid Crystal Display) screens, plasma
screens, and LED (Light Emitting Diode) screens. Earlier the screens were able to
display only character data in one color. The current display screens display stun-
ning colors as well as graphic data.
3. Printers: There are a variety of printers available on the market for different applications.
Heavy-duty printers are used for bulk printing needs. These include
line printers or line matrix printers. Inkjet printers and laser printers are used
for low-volume personal printing purposes. Photo printers are used for printing
photo-quality outputs either on paper or polyester films. Photo printers and inkjet
printers can print in color. Usually, the heavy-duty printers print only in black and
white. Printers are only output devices.
4. Magnetic tapes: Once upon a time, magnetic tapes (shortened as “mag tapes”) were
the primary storage device driving the computers. In those days, in movies, com-
puters meant whirring magnetic tape drives. The tape drives have gone out of use
more or less, except as backup devices. Presently, data cartridges, which are basi-
cally magnetic tapes of a much smaller size, having capacity to store a hundred GB
of data are used to take backups of corporate data.
5. Plotters: Plotters are used to generate engineering drawings. They come in five
sizes, namely, A4 (8.3ʺ × 11.7ʺ, 0.0625 sq m), A3 (11.7ʺ × 16.5ʺ, 0.125 sq m), A2
(16.5ʺ × 23.4ʺ, 0.25 sq m), A1 (23.4ʺ × 33.1ʺ, 0.5 sq m), and A0 (33.1ʺ × 46.8ʺ, 1 sq m).
These are standard sizes for engineering drawings. A larger size plotter would be
able to produce its size and all the lesser sizes of drawings. Thus, an A0 plotter can
be used to produce not only A0 size drawings, it can also produce A1, A2, A3, and
A4 size drawings. Plotters usually come with color capability, but they would not
be able to produce photo-quality outputs. These are basically used for producing
engineering drawings.
6. Computer networks: As computer networks have come of age and a large part of
the world is now connected, most computers have Internet access at a minimum.
Computers output information onto the network and receive input from the net-
work. Networks are both input and output devices.
7. Machines: Now, a variety of machines are controlled by computers. Of course, the
computers used for machine control are slightly different from commercial com-
puters in their ability to withstand extremes of climate and the strict response
times needed by the machines. Today, not just airplanes, rockets, and machin-
ing centers in factories, but also mundane devices including cars, refrigerators,
and washing machines are controlled by computers. Machines are mostly output
devices, except when they send back responses about the successful or unsuccessful execution
of computer instructions and about error conditions.
8. CD/DVD drives: These are used for storing information for future use. These act as
both input as well as output devices.
9. In the past, we were using punched card machines for input and output from
computers, floppy drives and floppy disks for backup storage, and teleprinters for
both input and output. Paper tapes and paper tape readers were also used for driving
NC (numerically controlled) machines in factories. All these devices are more or less
obsolete now. They could be in use in some remote places, but not in large numbers.
10. Robots: Robots are versatile automatic machines that can be programmed for a
variety of tasks. These are used in factories for manufacturing. Robots can be
solely output devices receiving computer instructions to perform assigned opera-
tions, or they can be solely input devices providing input like sounds and pictures
to the computers. The Mars rovers that are collecting enormous data from the
planet Mars are, in fact, robots of a special kind. They even analyze Martian soil!
Computers are also used in process control in factories, such as petroleum refining, fertil-
izer production, chemical manufacturing, drug/pharmaceutical production, and so on.
In fact, computers are used in any automation where the parameters are programmable
using numbers. The I/O devices listed earlier are by no means a comprehensive listing.
There are many devices that can be interfaced with computers. In fact, machine designers
are designing in such a way that many machines can be interfaced with computers so that
the processing capabilities of computers can be profitably utilized.
Software
While hardware is the visible component of the computer, software is the invisible compo-
nent. Software is the set of computer programs that run the computer system and process
data. Hardware performs the operations, and software directs the hardware operations.
While hardware components are like the individual musicians in a concert, the software is
akin to the conductor. The melody is achieved by the conductor and the musicians obeying
the conductor. The software has primarily three components, namely, the firmware, the
operating system, and the application software.
Firmware
Firmware is the basic software that is usually stored on a ROM (Read Only Memory) chip.
This software is generally supplied by the hardware manufacturer along with the com-
puter. As soon as the computer is switched on, this software takes over and performs the
following functions:
1. It stores the information about the configuration of the system and applies it at the
time of switching on.
2. It checks the configuration of the system to ensure that it is not changed in an
unauthorized manner since the last time it was checked.
3. In some cases, it also stores the hardware password, then prompts for entering the
password, ensures that the entered password is correct, and allows access to the
rest of the operations only upon entering the right password.
4. It checks all the hardware components to ensure that all are working as they
should be.
5. When anything is found to be out of place, it displays error messages, allows for
correction of the defects, and prevents switching on a faulty computer system.
6. It then loads the computer operating system and passes on the control to it for
further operations.
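The sequence of firmware checks listed above can be pictured with a minimal sketch. It is written in Python only for illustration and is not how any real firmware is built; the Machine class, its fields, and the power_on function are all hypothetical names invented for this example.

```python
# Sketch of the power-on sequence: apply the stored configuration, verify it,
# check the password, check the hardware, and only then load the OS.
# Every name here is hypothetical and stands in for real firmware logic.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Machine:
    stored_config: dict                 # configuration saved by the firmware
    detected_config: dict               # configuration found at switch-on
    faulty_parts: List[str] = field(default_factory=list)
    password: Optional[str] = None      # hardware password, if one is set


def power_on(machine: Machine, entered_password: Optional[str] = None) -> List[str]:
    """Return a log of the steps taken, stopping at the first problem found."""
    log = ["apply stored configuration"]
    if machine.detected_config != machine.stored_config:
        log.append("error: configuration changed")
        return log
    if machine.password is not None and entered_password != machine.password:
        log.append("error: wrong password")
        return log
    if machine.faulty_parts:
        log.append("error: faulty hardware: " + ", ".join(machine.faulty_parts))
        return log
    log.append("load operating system and hand over control")
    return log


healthy = Machine({"ram_gb": 8}, {"ram_gb": 8})
print(power_on(healthy))
```

The point of the sketch is the ordering: the operating system is loaded only after every earlier check has passed, so a faulty or tampered system never gets as far as normal operation.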
Operating System
The second component of the software is the OS or the Operating System of the computer.
The OS controls and governs the hardware resources of the computer system. The OS
interfaces between the users and the hardware resources of the computer system.
An OS consists of four components apart from the user interface:
1. Processor management
2. Memory management
3. Device management
4. Information management
The OS comes in two flavors, namely the single-tasking OS and the multi-tasking OS.
In a single-tasking OS, the computer system performs only one process at a time. The
old MS-DOS (Microsoft Disk Operating System) and its predecessor, CP/M (Control
Program for Microcomputers), were both single-tasking OS. Present-day Windows OS is
a multi-tasking OS.
Multi-tasking OS comes in two varieties, namely, the single-user OS and multi-user OS. In
single-user OS, only one user can use the computer at any given time. Windows OS, as it is
typically installed on a PC, works as a single-user OS: only one user works at the computer at any
given time. Multi-user OS permits multiple users to use the computer system
concurrently. UNIX and its variants, as well as mainframe computer OS, are multi-user OS.
Processor Management
The processor management module of the OS manages the processor of the computer sys-
tem. The processor executes processes. A process is a program in execution. In all multi-
tasking OS, there are multiple processes waiting for the processor to execute them. The
functions of the processor management module are:
In multi-tasking OS, multiple processes are waiting for the time of the processor. The time
of the processor is shared between the processes. That is why those OS are called
time-sharing systems. The method of allocating the processor to processes varies from OS to OS.
In most business computer systems, a round-robin method is used. In this system, each
process gets the processor for a fixed amount of time before it is moved to the end of the
queue. The process will get the processor once again after all the waiting processes in the
queue ahead of it get their time slice with the processor. So, a waiting process is allocated
to the processor a number of times before its execution is completed. A slight variant of
the round-robin method is to assign priorities to each of the processes. A process with a
higher priority gets more time slices per round than the processes with a lesser priority.
Generally, OS processes would have a higher priority than processes of application soft-
ware. Some OS allow the system administrator to manually assign a higher priority to a
process. In some OS where the response time is specified, as in the case of machine-control
computers, the processor allocation method would be based solely on priority.
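The plain round-robin method described above can be sketched as follows. This is a minimal illustration in Python and not the scheduler of any particular OS; the process names, the time units, and the function name round_robin are assumptions made for the example, and priorities are left out.

```python
# Sketch of round-robin allocation: each process gets the processor for a
# fixed time slice and, if unfinished, goes to the end of the queue.
from collections import deque


def round_robin(time_needed: dict, time_slice: int) -> list:
    """Return the order of (process, time units run) until all processes finish."""
    queue = deque(time_needed.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        schedule.append((name, run))
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # back to the end of the queue
    return schedule


print(round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2))
```

With a time slice of 2 units, process A needs three turns at the processor, C needs two, and B finishes in its first turn, which matches the description of a waiting process being allocated the processor a number of times before it completes.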
Memory Management
In SPA (Stored Program Architecture) computers of the present day, the program under
execution needs to reside in the primary memory. A multi-tasking OS has multiple programs
in its memory. Each of the programs will have these components in the RAM until the
execution of the program is completed:
RAM has to be allocated for these spaces for each program under execution along with
the space required for the OS. It is often the case that the available RAM is insufficient to
hold the details of all programs that are under execution. In such cases, this module swaps
some programs on to the disk to a specified area called the “swap” area and retrieves it
whenever it is once again required by the processor. Usually, the programs requiring an
I/O operation are swapped. For this purpose, RAM is organized into segments. Each segment
is divided into pages or blocks. A page or block of RAM is usually a few kilobytes (4 KB is
common on present-day PCs), but it can vary from computer to computer. In modern computers,
demand-paged memory management is used. In this method, the pages that are not demanded by the processor are
swapped to the disk and the pages demanded by the processor are brought from the disk
to the RAM. When the RAM is too little, the swapping in and out becomes excessively high
and the OS would only be doing the swapping activity. This is referred to as “thrashing.”
The system, as a result of thrashing, becomes too slow.
The memory management module of the OS performs the following functions:
1. Monitor the RAM available, ensure that all bits are in working condition, and raise
alerts whenever any part of the RAM is not working.
2. Maintain a table of all the pages of memory available and their allocation details.
3. Allocate RAM to programs in execution as required and available.
4. Manage the process of swapping in and out smoothly and automatically.
5. Ensure that the RAM allocated to a program is not violated by any other program
and that the integrity of the program data is not violated.
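The effect of too little RAM on swapping can be seen in a small simulation. This is only a sketch in Python: it assumes that the oldest page in RAM is the one swapped out (a first-in, first-out rule, chosen for simplicity, which real operating systems may not use), and the page numbers and the function name count_page_swaps are made up for the example.

```python
# Sketch of demand paging: a page is brought into RAM only when it is
# demanded, and the oldest page is swapped out when RAM is full.
# Assumption: first-in, first-out replacement, chosen for simplicity.
from collections import deque


def count_page_swaps(pages_demanded: list, pages_in_ram: int) -> int:
    """Count how many times a page has to be brought in from the disk."""
    ram = deque()
    swaps = 0
    for page in pages_demanded:
        if page in ram:
            continue              # already in RAM, no disk access needed
        swaps += 1
        if len(ram) == pages_in_ram:
            ram.popleft()         # swap the oldest page out to the disk
        ram.append(page)
    return swaps


demands = [1, 2, 3, 1, 2, 3, 1, 2, 3]
print("RAM holds 3 pages:", count_page_swaps(demands, 3), "swaps")
print("RAM holds 2 pages:", count_page_swaps(demands, 2), "swaps")
```

When RAM can hold all three pages, each page is brought in once. With room for only two, every single demand forces a swap, which is the thrashing described earlier in miniature.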
Device Management
This module is also referred to as the I/O (Input/Output) management, as every
device connected to a computer is either an input device or an output device. Some
devices, like the disk, are both input and output devices. Usually, each device con-
nected to a computer system is controlled by its control circuit, which is connected to
the system bus. Each device comes with its own device driver software. The OS of the
computer would have sockets to plug in the device driver software and whenever a
request is placed for the use of the device by a program, or the user of the device itself,
the OS hands over the request to the concerned device driver software and monitors
the device.
The device management module of the OS would perform the following functions:
1. Provide sockets on the OS to enable plugging in the device driver software for any
compatible device.
2. Maintain a list of devices connected to the computer system and monitor their
functioning.
3. Communicate with the device, including:
a. Receive communication from the device.
b. Decipher the device from which the communication is received.
c. Take the action requested by the device.
d. Communicate the result back to the device.
e. Raise alerts whenever the device is malfunctioning or not functioning.
4. Interface the processes with the devices:
a. Receive a request for the device from the process.
b. Decipher to which device the request should be routed.
c. Transmit the information to the concerned device.
d. Receive the response from the device and communicate it back to the process.
e. Raise alerts if the device is either not functioning, malfunctioning, or busy
servicing a request from another process.
5. Maintain a buffer memory to bridge the gap in speeds of the computer and the
device.
6. Manage the communication protocol between the computer and the device.
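The way the OS routes a request to the concerned device driver can be pictured with a minimal sketch. It is written in Python for illustration and is far simpler than any real device management module; the DeviceManager class, its methods, and the sample printer driver are all hypothetical.

```python
# Sketch of request routing: drivers are "plugged in" under a device name,
# and each request is handed to the driver of the device it names.
# All names are hypothetical; real device drivers are far more involved.


class DeviceManager:
    def __init__(self):
        self._drivers = {}    # device name -> driver function (the "socket")
        self._busy = set()    # devices currently servicing a request

    def plug_in(self, device: str, driver) -> None:
        """Register the driver software for a device."""
        self._drivers[device] = driver

    def request(self, device: str, data: str) -> str:
        """Route a request to the device and return its response or an alert."""
        if device not in self._drivers:
            return f"alert: no such device: {device}"
        if device in self._busy:
            return f"alert: device busy: {device}"
        self._busy.add(device)
        try:
            return self._drivers[device](data)
        except Exception as error:
            return f"alert: device malfunction: {error}"
        finally:
            self._busy.discard(device)


manager = DeviceManager()
manager.plug_in("printer", lambda text: f"printed {len(text)} characters")
print(manager.request("printer", "hello"))
print(manager.request("plotter", "drawing"))
```

The program making the request never talks to the printer directly; it only names the device, and the manager decides which driver should handle the request and what alert to raise if something is wrong.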
Information Management
In the present day, magnetic disks are used as the main secondary memory or secondary
storage of the computer information. RAM inside the computer is volatile and all its infor-
mation is lost when the power is switched off. Therefore, the data is stored on a medium
that can retain information even when the power is switched off. CDs (Compact Discs),
DVDs (Digital Versatile Discs), flash memory (also referred to as solid state disk/memory),
and magnetic disks are used for this purpose. The disks are organized into tracks, and
each track is divided into sectors or blocks.
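The organization into tracks and sectors determines how much a disk can hold, as this small sketch shows. It is written in Python only for illustration; the function name is hypothetical, and the figures used are those of the once-common 3.5-inch high-density floppy disk (2 sides, 80 tracks per side, 18 sectors per track, 512 bytes per sector), chosen simply because the numbers are small.

```python
# Sketch: capacity of a disk from its organization into tracks and sectors.
# The sample figures are those of a 3.5-inch high-density floppy disk.

def disk_capacity_bytes(sides: int, tracks_per_side: int,
                        sectors_per_track: int, bytes_per_sector: int) -> int:
    """Total number of bytes the disk can hold."""
    return sides * tracks_per_side * sectors_per_track * bytes_per_sector


capacity = disk_capacity_bytes(sides=2, tracks_per_side=80,
                               sectors_per_track=18, bytes_per_sector=512)
print(f"{capacity:,} bytes = {capacity // 1024:,} KB")
```

Present-day disks hold vastly more, but the principle of multiplying out the tracks, sectors, and bytes per sector is the same.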
When the disk is formatted, the computer performs the following functions:
Now the disk is ready for use inside the computer. Now the information management
module of the OS uses this information to load information into this disk. The information
management module of the OS handles the following functions:
Network Management
With the coming of age of the computer networks and the Internet, each computer OS is
equipped with a module for managing the network. A server that controls the network
would have network management software as a large and special module. Other comput-
ers which are not servers would have a module for managing network communications. It
would perform the following functions:
Miscellaneous Utilities
Utilities are miscellaneous programs usually bundled along with the OS by the sup-
plier. These include file copy, file delete, list the contents of the disk, display the
contents of a data file, back up data files, restore data files, system administration
utilities, and so on. Each supplier of OS supplies a different set of utilities needed for
the OS. The purpose of these utilities is to make it easier for the users to use the OS
efficiently.
Application Software
The software that is neither firmware nor the OS is application software. It works
above the OS, utilizing the services provided by the OS. The office suites, database
management systems, media players, and all such packaged software supplied on a
CD or DVD also fall under application software. Any software developed specifically
for our purpose is also application software. Ticket reservation software, gaming
software, supply chain management software, customer relationship management
software, accounting software, and all other software that we use comes under appli-
cation software.
While computer hardware, firmware, and OS facilitate use of computers, applica-
tion software is the one that carries out real work that results either in revenue or in
reduced costs for the organizations. Most of the software development work that is
carried out in the world is for developing application software.
Using these two functions, and the capability of the I/O devices, we achieve a variety of
functionalities. We use the graphics hardware and display screen in engineering designs
to produce drawings; we use the number crunching capability to evaluate engineering
designs; and with number crunching capability and programmed decision-making capa-
bility, we use computers in airplanes and rockets, too.
Where is the intelligence that computers are much touted for? It is there in the computer
programs and the storage media as data. Programs make use of the data, the standard data
in the secondary media and the input data coming from the input devices. Computers
crunch the numbers, make comparisons complying with the rules in the programs, and
give out intelligent decisions.
Computers are diligent and are free from the monotony and fatigue that keep human
beings from doing certain kinds of work. Human beings can crunch large numbers using
complex formulas, but, being prone to error, they are likely to make mistakes. Once a
computer is programmed correctly and a sample of its results is checked, it can repeat
the operation a million times without a mistake and without suffering from monotony
or fatigue. Human beings, barring a few exceptions, are also much slower than computers,
especially when a large number of computations is involved. As a result, computers have
come to perform many operations that the builders of the first computers never had in mind.
In short, computers can crunch numbers and make programmed decisions diligently
and very quickly. As programmers, we need to be aware of this fact.
computer is to increment the number 4 five times with an increment value of 1. Supposing
you wished to subtract 3 from 5, the computer would decrement 5 three times with a dec-
rement value of 1. If you wished to multiply 3 by 4, the computer would perform it in three
installments; in each installment, it increments the number three times. Initially, it starts
with 3 and increments it three times, obtaining 6 as the result. In the second installment,
it takes 6 and increments it 3 times getting 9. In the final installment, it begins with 9 and
increments it 3 times getting the result of 12. Why did it select 3 installments? It selected
3 installments because, 3 × 4 is the same as (3 + 3 + 3 + 3), and the first 3 is already avail-
able. So, it needs to add 3 only three times. Division is just the reverse of the multiplication
process.
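The increment-based procedure described above can be sketched in Python. This is only
an illustration of the idea, assuming non-negative whole numbers; the function names are
hypothetical, and the sketch does not describe how any particular processor is built.

```python
def increment(n):
    """Add 1 to n: the single primitive step the text describes."""
    return n + 1


def decrement(n):
    """Subtract 1 from n."""
    return n - 1


def add(a, b):
    """Add b to a by incrementing a, b times."""
    result = a
    for _ in range(b):
        result = increment(result)
    return result


def subtract(a, b):
    """Subtract b from a by decrementing a, b times."""
    result = a
    for _ in range(b):
        result = decrement(result)
    return result


def multiply(a, b):
    """Multiply a by b: start with a, then add a in (b - 1) installments."""
    if b == 0:
        return 0
    result = a
    for _ in range(b - 1):
        result = add(result, a)
    return result


def divide(a, b):
    """Whole-number division: count how many times b can be taken away from a."""
    if b == 0:
        raise ZeroDivisionError("cannot divide by zero")
    quotient, remainder = 0, a
    while remainder >= b:
        remainder = subtract(remainder, b)
        quotient = increment(quotient)
    return quotient, remainder


print(add(4, 5))        # 9
print(subtract(5, 3))   # 2
print(multiply(3, 4))   # 12, reached through 6 and 9 as in the text
print(divide(12, 3))    # (4, 0)
```

The multiplication follows the three installments of the example: 3 becomes 6, then 9,
then 12.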
Extending this procedure, we can solve all mathematical equations. Solving long, complex
mathematical equations in this manner requires a very large number of increment/decrement
operations. But, as the computer is capable of performing millions of operations per
second diligently, we get the answers very quickly.
Basically, the computer is simple at heart but very diligent, and it is free from fatigue
and monotony. So, we are able to use it to solve a wide variety of problems.
Who would tell the computer to perform all these operations? The computer programs
pass on the instructions to the computer. Some procedures are built into the OS; some are
built into the compilers; some are supplied as libraries; and some need to be programmed
by the developers of application software.
Final Words
These days, there is no one in the developed countries who has not heard of computers. But
most people understand the computer the way they understand a car, an airplane, or a ship.
We may have seen one, used one, and suffered with one, but most of us do not know
how a car functions, how an airplane flies, or how a ship floats, let alone their internal
components and functioning. Fields such as aeronautical, mechanical, marine, or electrical
engineering are not open to accountants, lawyers, or computer programmers, but computer
programming is a field in which many people without a formal degree in computer science
work. You can find people from all fields working as computer programmers. Why? Because
it is not necessary to know the internal workings of a computer to be able to write computer
programs. It is adequate if you are expert in a programming language and can understand
the user requirements.
That is why I have included only a rudimentary chapter introducing computers.
This information is by no means comprehensive. Perhaps I abridged about 1500 pages
of information into the pages of this short introduction to arouse your interest so that you
would read further and become proficient in computers some time later. I believe that it is
essential to understand computers to be a good computer programmer.
2
Introduction to Data and Data Types
What Is Data?
Data, as we defined in Chapter 1, is facts about entities. An entity can be:
1. A person
2. A location
3. An object
4. A system
5. A piece of equipment
6. An item of material
7. A transaction
8. A town
9. It can be anything that has some attributes by which it is described!
Facts can be in figures or be descriptive. For example, a person has a name that is descrip-
tive in nature; has a weight that is expressed in numbers; has an educational qualification
that is descriptive; has an income that is described in figures; has a date of birth that is a
special type of number; and has a title that is, again, descriptive. In this manner, all entities
have some facts associated with them.
While each entity has facts associated with it, we are not interested in all entities. We
are interested in those entities whose data needs to be stored in our computer system,
processed, and reported as required.
As programmers, we perhaps need to handle both types of data. Then the data used by
human beings is of two types, namely:
1. Character data: The data that contains alphabets and perhaps numbers, too.
2. Numeric data: The data that is expressed in numbers.
Character Data
Character data includes any character that can be input to the computer. What are
characters in the context of computers? They can be anything accepted by the computers.
IBM uses the EBCDIC (Extended Binary Coded Decimal Interchange Code) codification of
characters for its mainframe computers. It is an 8-bit code that allows 256 characters
to be codified. The American Standards Association finalized a standard for the
codification of characters for use in computers and called it ASCII (American Standard
Code for Information Interchange). It was originally a 7-bit code allowing 128 characters,
and it has since been extended to 8 bits, allowing 256 characters to be codified. It
codifies letters (uppercase and lowercase), numerals (0–9), and other special characters
that are humanly readable (like the space, parentheses, square brackets, and so on), as
well as characters that are not humanly readable (like the carriage return, tab, escape,
and so on). Character data can therefore contain characters that are not humanly
readable in addition to humanly readable ones.
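As a small illustration of character codes, the Python sketch below prints the numeric
code behind a few characters. Python strings use Unicode, whose first 128 codes coincide
with ASCII, so the values shown for these characters are their ASCII codes. The helper
names and the choice of 32 to 126 as the humanly readable range are assumptions made for
this sketch.

```python
def code_capacity(bits):
    """Number of distinct characters a code of the given width can represent."""
    return 2 ** bits


def is_humanly_readable(ch):
    """True when the character's code lies in the printable ASCII range 32 to 126."""
    return 32 <= ord(ch) <= 126


# ord() gives the numeric code of a character; chr() goes the other way.
for ch in ["A", "a", "0", " ", "("]:
    print(repr(ch), ord(ch))

print(code_capacity(7))               # 128 characters in a 7-bit code
print(code_capacity(8))               # 256 characters in an 8-bit code
print(is_humanly_readable("A"))       # True
print(is_humanly_readable(chr(13)))   # False: code 13 is the carriage return
```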
How do computers treat character data? The characters that are not humanly readable
provide special input to computers. They are generally used in combination with other
characters to give special commands to computers. Humanly readable characters are used
to denote names of entities to be stored, retrieved, and searched in the computers. The
treatment of character data depends on the compiler’s definition of character data of the
specific programming language.
In the earlier days, COBOL defined character data into two classes, namely, alpha-
betic data and alphanumeric data. Alphabetic data allowed only alphabets and a blank
space. Alphanumeric data allowed alphabets, numbers, and other humanly readable
characters.
BASIC and C languages allowed for the definition of a single character, which allowed
any character to be input to the computers.
Character data is usually defined as a string of words. A word is a contiguous set of
characters. A word is separated from another by a delimiter, which usually is a blank
space. Character data that is humanly readable is usually stored inside the computer.
It is used to search for a specific set of data inside a large group. It is also used to
make a logical decision based on the word or a group of words. It can also be added;
that is, two words can be added (or concatenated) to result in a single word. It can
also be used to pick parts of words and make a new word. The only arithmetic opera-
tion possible on character data is addition (or concatenation). Other arithmetic opera-
tions including multiplication, division, subtraction, and so on are not possible on
character data. True, some languages permit performance of all arithmetic operations
on character data, but they do not promise reliable results or any practical purpose.
Character data is generally used as:
1. Strings: A string can consist of multiple words. When we use strings in programs,
we need to enclose the string between quote marks (“....”). Often the length of a
string is restricted to 255 characters or, in other words, about one-fourth of a kilobyte.
2. Memos: A memo, or long textual matter, is used to store documents or long explanations
inside a database. It can contain lines in addition to words and strings, numbers,
and other humanly readable special characters such as parentheses, full stops,
commas, and so on.
3. Special strings: This data type has come into existence recently. URLs
(Uniform Resource Locators) used for website addresses, and email addresses,
form part of this data type. These are used for navigating to a website or
sending an email.
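The operations on character data mentioned above (splitting into words, searching,
deciding, concatenating, and picking parts of words) might look as follows in Python.
The sample text and the helper fits_limit are hypothetical; the 255-character limit is
the figure quoted in the text, not a limit imposed by Python.

```python
sentence = "a string can consist of multiple words"

# A word is separated from the next by a delimiter, here a blank space.
words = sentence.split(" ")
print(len(words))                      # 7

# Search for a specific word and make a logical decision on the result.
if "multiple" in words:
    print("found the word")

# Addition of character data is concatenation.
full_name = "John" + " " + "Harold"
print(full_name)                       # John Harold

# Pick parts of words and make a new word.
new_word = "sunflower"[:3] + "moonlight"[4:]
print(new_word)                        # sunlight


def fits_limit(text, limit=255):
    """Check a string against the 255-character limit quoted in the text."""
    return len(text) <= limit


print(fits_limit(sentence))            # True
```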
Numeric Data
Numeric data is numbers expressing the value of some attribute of an entity. As you
perhaps know, there are real numbers and imaginary numbers. A computer can han-
dle only real numbers on its own. What is an imaginary number? The square root of
any negative number is imaginary. However, if you can come up with a procedure
for handling imaginary numbers, then, perhaps, it can handle them. If during the
execution of a program, the computer comes across an imaginary number, an error is
thrown up.
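Python happens to show both behaviors described above. Its real-number square root
function raises an error for a negative number, while a separate standard module
supplies the "procedure for handling imaginary numbers". The wrapper names below are
hypothetical; math.sqrt and cmath.sqrt are standard library functions.

```python
import cmath
import math


def real_square_root(x):
    """Square root restricted to real numbers; raises ValueError when x is negative."""
    return math.sqrt(x)


def general_square_root(x):
    """Square root that may return a complex (real plus imaginary) result."""
    return cmath.sqrt(x)


try:
    real_square_root(-4)
except ValueError as error:
    print("error:", error)

print(general_square_root(-4))   # 2j, that is, 2 times the square root of -1
```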
But some numbers are associated with a unit. The unit of money can be dollars and
cents; the unit of weight can be pounds and ounces; the unit of length can be feet and
inches; and so on. The units have to be handled by the programmer. A computer cannot
handle units unless it is programmed to do so.
Computers subject numeric data to arithmetic manipulation, and it is possible to per-
form all arithmetic operations on numeric data. Numeric data is further classified into the
following types:
1. Integers: Integers are whole numbers without any fractional part. Except in
counting, the real world uses fractional numbers. Integers are used in repre-
senting age, income, addressing memory locations both in RAM as well as on
secondary storage, and so on. In reality, the age of a person remains an integer
just for a day, on the birthday. After that, the age would have a fractional part.
When integers are subjected to the arithmetic operation of division, the result
may be a number with a fractional part. Some programming languages allow
for two types of integers, namely, the short integer and the long integer. Two
bytes are allocated for short integers and four bytes are allocated for long inte-
gers. The first bit of the integer is reserved for sign, positive or negative, and
the remaining bits are used to store the number. A short integer would have a
maximum value of 32,767 (2^15 - 1). A long integer would have a maximum value
of 2,147,483,647 (2^31 - 1). When we use integers, we ought to be sure that the
number would never result in fractional part.
2. Real numbers: Real numbers are those that have a fractional part. Examples are
10.5, 21.25, 100.35, and so on. The real numbers are stored inside the computer in
a special manner. They are stored as the mantissa and the exponent along with a
sign. For example, the number 100.25 is stored as (+0.10025, 3). That is, the number
is 0.10025 multiplied by 10^3, resulting in 100.25. (Actual hardware keeps the mantissa
and the exponent in binary, but the principle is the same.) The real numbers are also
referred to as floating point numbers. Real numbers are usually stored as single precision
and double precision numbers. The precision denotes the number of significant
digits the number can store. We should be careful in using double precision numbers,
as they take up twice the RAM and storage space of a single precision number.
a. Single precision numbers are those that are allocated a minimum of 4 bytes.
Under the IEEE 754 standard followed by most present-day computers, a single
precision number occupies 4 bytes and can handle about 7 significant decimal digits.
Which precision is the default when none is specified depends on the programming
language.
b. Double precision numbers are those which are allocated a minimum of 8 bytes.
Under IEEE 754, a double precision number occupies 8 bytes and can handle about
15 to 16 significant decimal digits. Some computers and languages used in scientific
and mathematical applications also offer a 16-byte (quadruple precision) format that
can handle about 34 significant digits.
3. Dates: Dates are a special category of numeric data. They contain three distinct
parts, namely the day of the month, the month, and the year, with rules that make the
maximum value of the day depend on the month (and, for February, on the year).
Dates are of two types, namely, the short date and the long date. The short date
consists of only the day, the month, and the year. The long date consists of the
time in hours, minutes, and seconds in addition to the day, the month, and the year.
The actual storage of a date depends on the computer, but many store the date as
the number of seconds from a reference date such as 1900-01-01 00 hours 00 minutes
00 seconds. When displaying the date, they convert it to the date format that
human beings are used to seeing.
4. Time: Time is also a special type of numeric data. It contains three numbers for
the hour, the minute, and the second. The minute and the second have a
maximum value of 59 and then roll over to 00. The hour has a maximum value of
23 in the 24-hour format. Often the time needs to be displayed with a suffix/prefix of
AM or PM in a 12-hour format, or without one in a 24-hour format.
5. Currency: Currency is also one special number whose fractional part is restricted
to 2 digits after the decimal point. Financial applications use this data type, as the
digits after the decimal point are always two.
6. Counters: This data type consists of integers and is used to count the number of
times a set of program statements is executed inside a program.
7. Auto incremented numbers: These are used in database tables to form a primary key
for the table when a data item cannot be unique in the table. We will explain the
primary key in the next chapter on data storage. The feature of this data type is that
it will be automatically incremented whenever a new record is inserted in the table.
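Several of the numeric types in the list above can be tried out with Python's standard
library. The sketch below is illustrative only: the function names are hypothetical, the
decimal mantissa/exponent split mirrors the (+0.10025, 3) example rather than the binary
form used by hardware, and the reference date of 1900-01-01 is the one assumed in the text.

```python
import struct
from datetime import datetime, timedelta
from decimal import Decimal


def max_signed(num_bytes):
    """Largest value when one bit holds the sign and the rest hold the number."""
    return 2 ** (num_bytes * 8 - 1) - 1


def mantissa_and_exponent(text):
    """Split a non-zero decimal number into a mantissa below 1 and a power of 10."""
    sign, digits, exponent = Decimal(text).as_tuple()
    mantissa = Decimal((sign, digits, -len(digits)))
    return mantissa, len(digits) + exponent


def through_single_precision(value):
    """Store a number in a 4-byte floating point format and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


REFERENCE_DATE = datetime(1900, 1, 1)   # assumed reference point


def date_to_seconds(moment):
    """Seconds elapsed between the reference date and the given moment."""
    return int((moment - REFERENCE_DATE).total_seconds())


def seconds_to_date(seconds):
    """Convert a stored count of seconds back into a date and time."""
    return REFERENCE_DATE + timedelta(seconds=seconds)


def as_currency(text):
    """Keep exactly two digits after the decimal point."""
    return Decimal(text).quantize(Decimal("0.01"))


print(max_signed(2))                                   # 32767
print(max_signed(4))                                   # 2147483647
print(mantissa_and_exponent("100.25"))                 # (Decimal('0.10025'), 3)
print(struct.calcsize("<f"), struct.calcsize("<d"))    # 4 8
print(through_single_precision(100.25) == 100.25)      # True: fits in 4 bytes exactly
print(through_single_precision(100.35) == 100.35)      # False: digits are lost
print(date_to_seconds(datetime(1900, 1, 2)))           # 86400
print(as_currency("10.5"))                             # 10.50
```

The two round trips through 4 bytes show why precision matters: 100.25 survives
unchanged, while 100.35 comes back slightly altered.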
We need to note that numeric data is the main data, as the main purpose of computers
is to process numeric data. Most common numeric types of data used in computers are
detailed earlier. Some computers can have additional types to make life easier for the
programmers.
1. Arrays: Arrays are tables of data. Usually, arrays are used in solving mathematical
problems in matrix algebra. Arrays can be single dimensional, that is, there is
only one row of data, or two-dimensional, containing both rows and columns.
Most programming languages restrict the use of arrays to numeric data, but
some programming languages do allow use of character-type data in arrays.
In computers, an array is a contiguous chunk of RAM. The amount of RAM
allocated to an array depends on the data type used in the array. If we declare
a single dimensional array of six integers, then the computer would allocate a
contiguous chunk of RAM that can accommodate six integers. If we declare a
two-dimensional array, say, a 3 × 4 array of integers, then the computer would allocate a
contiguous chunk of RAM to accommodate 12 integers. Arrays are used in both
commercial software as well as system software development. Arrays have to be
declared separately like other data types, but they need the following additional
information besides the name of the array:
a. Type of data used by the array
b. Dimension of the array; that is, the number of rows and columns contained in
the array
2. Pointers: Pointers are a special type of integer data. A pointer can hold the address
of any location inside the RAM. So, the size of a pointer type variable is fixed so that it
can hold the maximum address of any location in the RAM. Usually, the size of a
pointer is 4 bytes on 32-bit computers and 8 bytes on 64-bit computers. Using pointers,
we can access any location in the RAM and manipulate its contents. Application
programs are not allowed to access arbitrary RAM locations, as that would violate the
security of the execution environment. In application programming, pointers are used
for handling arrays, especially in the C family of programming languages. In system
software programming, we need to handle allocation, deallocation, and manipulation of
all the RAM, and pointers are used to manipulate RAM as desired by the programmer.
3. Union: Union is a special data type used in the C family of programming languages.
It allows the same area of RAM to be addressed using different variable
names as well as different data types. It is perhaps the only data type that
allows storing both numeric as well as character data and allows the data to be
manipulated as both numeric and character data. Of course, it is incumbent on
the programmers to store numeric data in a union before handling it as a numeric
variable and to store character data in it before manipulating it as a character
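The array, pointer, and union ideas in the list above can be explored from Python through
its standard ctypes module, which exposes C-style data types. This is a rough sketch under
stated assumptions: the names cells, offset, and NumberOrText are hypothetical, and storing
the rows one after another (row-major order) is the C convention, assumed here because the
text does not name an order.

```python
import ctypes
import sys

# A 3 x 4 array of 4-byte integers kept in one contiguous run of 12 cells.
ROWS, COLUMNS = 3, 4
cells = (ctypes.c_int32 * (ROWS * COLUMNS))()


def offset(row, column):
    """Position of element (row, column) when rows are stored one after another."""
    return row * COLUMNS + column


cells[offset(2, 1)] = 42
total_bytes = ctypes.sizeof(cells)               # 12 integers x 4 bytes each
print(total_bytes)                               # 48

# The size of a pointer depends on the machine: 4 bytes (32-bit) or 8 bytes (64-bit).
pointer_size = ctypes.sizeof(ctypes.c_void_p)
print(pointer_size)


class NumberOrText(ctypes.Union):
    """The same 4 bytes of RAM viewed as a number or as 4 characters."""
    _fields_ = [("number", ctypes.c_uint32), ("text", ctypes.c_char * 4)]


value = NumberOrText()
value.text = b"ABCD"                             # store character data ...
print(value.text)                                # ... and read it back as characters
print(value.number)                              # the same bytes read as a number
print(value.number == int.from_bytes(b"ABCD", sys.byteorder))   # True
```

Reading value.number after storing text gives a number that depends on the byte order
of the machine, which is why the programmer must keep track of what was stored last.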
Data Classes
Basically, there are two data classes, namely local (dynamic) and global (static). The
distinction surfaces when the program contains subprograms.
Local Data
Data declared in a subprogram is usually local data. That is, the value stored in a local
variable can be used only by the subprogram in which it is declared. It is not available to
be manipulated by the main program calling the subprogram. The RAM allocated to a
variable declared as local to a subprogram is released to the operating system for
allocation to other programs when the execution of that subprogram ends. If we want the
value stored in the variable to be available to the main program or other subprograms, we
must declare it as a global variable. A variable declared in a subprogram is by default a
local variable unless it is declared as a global variable.
Global Data
A variable declared in the main program is available to all the subprograms called by the
main program for manipulation. It need not be declared as global. But if we declare a
variable in a subprogram and want it to be available for manipulation by other subprograms
or the main program, we need to declare it as global. A variable contained in a higher-level
program is available for manipulation by a lower-level program, but the reverse is
not true. A global variable is not released, and so ties up RAM, until all the subprograms
and the main program are closed. Therefore, it is better to declare variables as local to
reduce the burden on RAM. If too much RAM is tied up by global variables and other
programs need RAM, virtual RAM has to be utilized with all its attendant
disadvantages.
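The difference between the two data classes can be seen in a short Python sketch, where
functions play the role of subprograms. The variable and function names are hypothetical,
and the exact rules differ from one language to another: in Python, a subprogram can read
a main-program variable freely but must declare it global before assigning to it.

```python
total = 0                      # declared in the main program


def doubled(amount):
    """subtotal is local: it exists only while this subprogram runs."""
    subtotal = amount * 2
    return subtotal


def add_to_total(amount):
    """Declaring total as global lets the subprogram change the main program's variable."""
    global total
    total = total + amount


def read_total():
    """A subprogram can read a main-program variable without declaring it global."""
    return total


print(doubled(5))              # 10
add_to_total(5)
add_to_total(7)
print(read_total())            # 12
# print(subtotal) here would fail: the local variable was released when doubled() ended.
```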
1. A variable is a name for our reference representing the data we propose to use
inside the program.
2. The rules for naming a variable vary from one programming language to another.
We will discuss more about naming variables in the coming chapters.
3. A variable is associated with a specific set of data to be used in the program.
4. Depending on the data type, the computer allocates the amount of memory (RAM)
when the program is loaded for execution into the computer.
5. Whenever we use that variable in the program, it refers to the allocated RAM.
6. The allocated amount of RAM remains allocated as long as the program is under
execution in the computer.
7. When the program completes execution and is removed from the RAM, the space
allocated for the data will also be released for allocation to other programs.
Thus, to use data in the programs, we declare variables for all the data items we propose
to use in the program. Then we keep reading data into the specified locations in the RAM
and process it as long as the data exists.
Next, we will see how data is stored in computers and retrieved for use.
3
Data Storage and Retrieval
Storage of Data
To process data, we need to supply it to the computer. We can do so item by item when
asked by the computer. We get a prompt on the screen and we input the data using our
keyboard. This procedure is all right when we process small amounts of data and in
classroom hands-on sessions, but not in the real world of business and government. We need
to process bulk amounts of data in the real world. It is just not possible to input bulk data
using only keyboards. Therefore, we need to store data.
We will be storing the data inside the RAM when the programs for processing the data
are being executed. During program execution, only one set of data is stored inside the
RAM. Bulk data is stored on secondary memory or storage. Secondary storage comprises
magnetic media, optical media, and solid-state memory.
In the beginning, we used to store the data on punched cards. Then we moved on to
magnetic tapes, and then to magnetic disks. Now we have optical disks,
referred to as CDs (Compact Disks) and DVDs (Digital Video Disks). These optical
disks use laser (Light Amplification by Stimulated Emission of Radiation) technology.
While optical disks have made giant strides in technology, they are yet to replace
magnetic disks as the primary media for data storage. First, the read and write cycles on
optical disks have not yet achieved the reliability of magnetic disks. Second, the capacity
of optical disks is not large enough to replace magnetic disks. Third, the number
of times the read and write can be performed on optical disks is not high enough to
match magnetic disks.
Solid-state memory (flash drives, pen drives, USB disks) is looking very promising, and
it can very well replace magnetic disks as the primary medium for storing data for
the long term. Solid-state devices are now capable of holding up to 250 GB of data, which
makes them feasible for use in personal computers to begin with. Solid-state memory is the
primary reason behind having handheld computers, which are referred to as tablet computers.
Magnetic tapes were the primary choice for storing data until the 1970s but were
superseded by magnetic disks. Since then, tapes have been used as
backup media to keep backups of data and programs. Even today, large-capacity magnetic
tapes are used to store 70 GB or more on data cartridges, which are basically
quarter-inch-wide magnetic tapes kept securely inside a cartridge.
Magnetic Disks
Magnetic disks have been the primary choice for the storage of data for some time
now. Magnetic disks are now hermetically sealed and mounted inside the computer to
store data and supply data and programs to the computer. Magnetic disks are capable
of holding terabytes (one terabyte or TB is equal to 1000 GB) of data. The time to access
any location on the disk is also the lowest next only to the RAM. Solid-state drives are
competing with magnetic disks in terms of access time but not in the capacity to hold
data. Magnetic disks are now referred to as hard disks or Winchester disks. They are
called hard disks as they were used at the same time floppy disks were used. While
floppy disks were made with polyester film, these disks were made on metal platters.
So, polyester disks were referred to as floppy disks and these, being on metallic platters,
were referred to as hard disks. Though floppy disks are passé, the name “hard disk”
continues.
The magnetic disk is divided into many tracks for holding data. Each track is then divided
into a number of sectors. The first track is numbered 0 (zero) and the numbers continue
upwards. The sector is the smallest element of the disk that is addressable. In some
computers, the sectors are numbered from 0 (zero) upwards within each track. That
is, there will be as many sectors numbered zero as there are tracks. This necessitates
supplying two addresses, namely the track number and the sector number, to locate data
on the disk. In some computers, the sectors are numbered from zero upwards without
reference to the tracks. In this method, we need to supply only one address, the sector
number, to locate data on the disk. Presently, most computers use the latter practice
of numbering the sectors; the first sector of the disk is numbered 0 (zero) and the last
sector would have whatever number it comes to depending on the capacity of the disk.
Each sector holds one block of data. Most computers use 1 KB (1024 bytes) of data as one
block or one sector.
The block is the minimum amount of data allocable to files whether we use it entirely or
not. When we have more data than can be allocated in one block, the disk space is allocated
in multiples of blocks.
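The two numbering schemes and the block-wise allocation described above reduce to a few
lines of arithmetic. The Python sketch below is a simplification: it assumes every track
has the same number of sectors (a hypothetical figure of 63 is used in the example), and
it takes the 1 KB block size from the text.

```python
BLOCK_SIZE = 1024   # bytes per block, the figure quoted in the text


def linear_sector(track, sector, sectors_per_track):
    """Convert a (track, sector) pair into a single disk-wide sector number."""
    return track * sectors_per_track + sector


def track_and_sector(linear, sectors_per_track):
    """Convert a disk-wide sector number back into a (track, sector) pair."""
    return divmod(linear, sectors_per_track)


def blocks_needed(file_size):
    """Whole blocks allocated to a file; a partly used block still counts in full."""
    return (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE


def wasted_bytes(file_size):
    """Allocated space that the file does not actually use."""
    return blocks_needed(file_size) * BLOCK_SIZE - file_size


print(linear_sector(2, 5, 63))      # 131
print(track_and_sector(131, 63))    # (2, 5)
print(blocks_needed(1))             # 1: even one byte takes a whole block
print(blocks_needed(1025))          # 2
print(wasted_bytes(1025))           # 1023
```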
When we format a disk to make it usable in the computer, the computer writes a table on
the disk. This is referred to as VTOC (Volume Table of Contents), or in PCs, it is called the
FAT (File Allocation Table) or a similar name. This table usually contains:
1. The file identification number that usually starts at zero and is numbered upwards.
2. The name of the file.
3. The identification number of the block where the first block of data of the file is
stored.
4. It may contain additional information about the size of the file, the addresses of
the other blocks of data of the file, the address of the last block of information, and
so on.
The disk would also have one more table in which each block address and the file to which
it is allocated are written. Initially, the allocation information will be blank. As the disk
gets filled up, the allocation information is filled in at every allocation.
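A toy model of the two tables may make the bookkeeping easier to picture. The Python
sketch below is not the actual layout of a VTOC or a FAT; the class ToyDisk, its field
names, and the rule of taking the lowest-numbered free blocks are all assumptions made
for illustration.

```python
class ToyDisk:
    """A toy model of the file table and the block allocation table."""

    def __init__(self, block_count, block_size=1024):
        self.block_size = block_size
        self.file_table = []                       # one entry per file
        self.allocation = [None] * block_count     # block number -> file id (None = free)

    def create_file(self, name, size_in_bytes):
        """Allocate whole blocks to a new file and record it in both tables."""
        needed = max(1, -(-size_in_bytes // self.block_size))   # round up
        free = [n for n, owner in enumerate(self.allocation) if owner is None]
        if len(free) < needed:
            raise OSError("not enough free blocks")
        file_id = len(self.file_table)             # ids start at zero and count upwards
        blocks = free[:needed]
        for block in blocks:
            self.allocation[block] = file_id
        self.file_table.append({
            "id": file_id,
            "name": name,
            "first_block": blocks[0],
            "last_block": blocks[-1],
            "blocks": blocks,
            "size": size_in_bytes,
        })
        return file_id


disk = ToyDisk(block_count=8)
payroll = disk.create_file("PAYROLL.DAT", 2500)    # needs 3 blocks of 1 KB
notes = disk.create_file("NOTES.TXT", 100)         # needs 1 block
print(disk.allocation)          # [0, 0, 0, 1, None, None, None, None]
print(disk.file_table[notes]["first_block"])       # 3
```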
Records
Before we move on to data files, we need to understand records. A related set of data
items is one record. For example, let us assume a payroll situation. An employee has an
ID, a name, a basic pay rate, a number of payable hours, other allowances, and deductions. In
this manner, all the employees in the organization have similar information. Now, the
data would be organized for each employee, for all the employees. If we assume a table,
each row will contain the information of one employee and all the rows put together
would hold the information of all employees. Each row can be viewed as a record. To
summarize:
Data Files
Data is organized in files in the computers. Why were they called files? Initially, decks
of cards were used to store data. When you punch data onto the punch cards, and the
cards are bunched into a deck, it would resemble a paper file with papers. Then the
data moved on to magnetic tapes, with each tape holding one class of data or, in other
words, one file. Each tape had one file of data. Though the data moved onto magnetic
disks, the name “file” stuck, as people are familiar and more comfortable with that
name. Even though IBM began using the name “data set,” which I think is more appro-
priate to describe a set of data, “file” is definitely more popular in the computer and
data processing world.
Although the advent of Database Management System (DBMS) has moved the main-
stream data storage from plain data files into the tables of the databases, data files are still
used in a significant manner. Therefore, we need to learn about data files before we move
on to databases.
Traditionally, files were organized as sequential-access data files and with the introduc-
tion of magnetic disks, random access data files came into existence. Then, to combine the
advantages of both file types, indexed sequential access data files were introduced. We
will discuss these files in greater detail now.
cards. On tape too, the tape runs only in one direction and to get to the desired data, we
had to pass over the preceding data. With disks, it became possible to access any desired
data directly, as the disk spins and any track and sector can be reached
within almost the same amount of time. In sequential access data files, there were two
kinds of organization.
1. Variable record length sequential-access data files: In these files, the length of each
record can vary. It can happen when the fields of the record have variable length.
For example, the names of persons would not have the same number of characters.
For example, the name John has four characters, while Harold has six characters.
Similarly, the pay rate for an employee can be less than 100, having at most two digits,
whereas another employee can have a pay rate of more than 1000, having four or more
digits. When the fields have different lengths, to distinguish one field from
other in the same record, a delimiter is inserted after each data item in the record.
Usually, a comma, a space, or a semicolon is used to delimit one field from the
other in the record. The advantage is that the record length for each record, and
thereby the size of the file, is maintained at its minimum. While this was not a
great advantage during the punched card days, it was a great advantage during
the initial days of magnetic disks, as the disks did not have the large capacity
that is available in the present day. The flip side is that an additional character in
the form of a delimiter has to be inserted between adjacent fields in the record,
increasing the overall record and file sizes.
2. Fixed record length sequential-access data files: In these files, each field has the
same length in every record. If a specific data item is shorter than the allocated length, the extra
length is wasted. In numeric fields, leading zeroes are inserted to pad the field
to its full length, and in character fields, blank spaces are suffixed to the field as
needed to pad the field to its full length. With fixed length fields, the need for
delimiters is eliminated. This was more or less the practice in the days of punched
cards and magnetic tapes. With the advent of the disks, variable record length
sequential access files have become the norm. Still, fixed record length is used in
mainframe computers even in current times.
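The difference between the two organizations can be sketched in a few lines of Python. This is only an illustration: the field widths (10 characters for the name, 6 for the pay rate), the comma delimiter, and the function names are assumptions made for the example, not part of any standard.

```python
# Hypothetical payroll records: (name, pay rate).
records = [("John", 95), ("Harold", 1250)]

NAME_WIDTH = 10   # assumed width of the character field
RATE_WIDTH = 6    # assumed width of the numeric field


def to_variable(name, rate, delimiter=","):
    """Variable-length record: fields separated by a delimiter."""
    return f"{name}{delimiter}{rate}"


def to_fixed(name, rate):
    """Fixed-length record: name padded with trailing blanks, rate with leading zeroes."""
    return name.ljust(NAME_WIDTH) + str(rate).zfill(RATE_WIDTH)


def from_fixed(record):
    """Recover the fields by position alone; no delimiter is needed."""
    name = record[:NAME_WIDTH].rstrip()
    rate = int(record[NAME_WIDTH:NAME_WIDTH + RATE_WIDTH])
    return name, rate


variable_records = [to_variable(n, r) for n, r in records]
fixed_records = [to_fixed(n, r) for n, r in records]

# Show each record in both forms together with its length.
for v, f in zip(variable_records, fixed_records):
    print(repr(v), len(v), "|", repr(f), len(f))
```

The variable-length records differ in size, while every fixed-length record occupies the same 16 characters, part of which is padding.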
There is one other variant of sequential access data files, referred to as the
line sequential file. This is mostly used in PC (Personal Computer) applications. In fixed
record length files, no delimiter is used between adjacent records. The record sequence is
determined by counting the number of characters and dividing the count by the record
length. In variable record length files, the record sequence is determined by counting the
number of field delimiters and dividing the count by the number of fields in the record.
It works perfectly with files on magnetic tapes, but the files on the disk have to use a
mechanism to distinguish records from one another. They use the carriage return (ASCII
character #13) character and/or line feed (ASCII character #10) character. Windows-based
PCs use both characters, while UNIX-based computers use only the line feed. These files
are referred to as line sequential files.
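As a rough sketch of these two ideas, the Python below locates a record in a fixed record length file by arithmetic and splits line sequential data on its end-of-line characters. The record length of 16 and the function names are assumptions chosen for the illustration.

```python
RECORD_LENGTH = 16  # assumed fixed record length, in characters


def record_number(char_offset, record_length=RECORD_LENGTH):
    """1-based number of the record containing the character at char_offset (0-based)."""
    return char_offset // record_length + 1


def split_line_sequential(data):
    """Split line sequential data written with CR+LF or with LF alone."""
    return data.replace("\r\n", "\n").split("\n")


windows_style = "John,95\r\nHarold,1250"   # carriage return followed by line feed
unix_style = "John,95\nHarold,1250"        # line feed only

print(record_number(0), record_number(15), record_number(16))
print(split_line_sequential(windows_style))
print(split_line_sequential(unix_style))
```

Both styles of file yield the same two records once the end-of-line characters are normalized.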
Sequential-access data files offer the best economy in terms of disk space and are still
being used. In disk-based sequential access data files, the disk space is allocated contigu-
ously as far as possible and the file is continued in the free blocks available on disk. In
this mechanism, a record may span two or more blocks depending on the record length.
Sequential-access data files are still used in a big way, especially in mainframe computers
and where bulk data needs to be processed.
Data Storage and Retrieval 27
applications, flat files are the ones used to store data. In machine-control applications, such
as the software used in cars, airplanes, rockets, and so on, flat files are used to store data.
In firmware, that is, software etched onto a silicon chip, it is the flat files that are used. It
is only in commercial applications that databases handle the bulk of the data, but even in
them, a few flat files are used. Therefore, serious computer programmers need to learn
about and understand flat files and their organization and usage.
Databases have been adopted by the industry quickly, and in present-day commercial
applications and in applications that use large volumes of data, DBMS has become
an inseparable part of the applications. With DBMS, the programmers need not define
the file structure inside the program. They need not even recompile the programs, even if
the file structure changed. Another major advantage of DBMS technology is that no pro-
grams need development for entering data. DBMS provides data entry facilities. Another
major disadvantage of flat files that prevented multiple users from concurrently using
the same file was overcome by DBMS. With DBMS, the entire file need not be dedicated
to a program. It is adequate to lock one record to the program and other programs can
concurrently continue to use the other parts of the table or the database in their applica-
tions. This facilitated online data processing by end users of the computers, and the need
for specialist data entry operators was eliminated. As users are now entering the data
directly, the need for data verification was also eliminated.
DBMS has facilitated storing large volumes of data simultaneously by multiple users,
and efficient and quick data retrieval. This has facilitated moving business transactions
from registers and papers to computers and paved the way for paperless offices.
Now, DBMS has four levels of organization:
1. At the bottom-most level is the field or domain: It is the smallest unit of data and holds
data of one specific attribute of the entity. For example, a name is a field and holds
the values of names; similarly, pay rate is a field that holds the values of pay rates
for different people. A field is like a column in a two-dimensional table.
2. At the next level is a record or tuple: It holds the complete information for an entity. For
example, in pay roll data, a record contains the pay information for one employee.
It would contain information like the employee identification number, name, pay
rate, other allowances, deductions, payable number of hours, and so on. A record
consists of multiple fields. A record is akin to a row in a two-dimensional table.
3. A table or relation: It is related information for an application. It consists of a num-
ber of records. In a pay roll application, a master table would consist of master data
of employees such as employee id, name, designation, pay rate, allowances appli-
cable to the specific employee, normal deductions, and so on. A pay roll transac-
tion table would consist of employee id, number of payable hours, other earnings
during the pay period, deductions during the pay period, and so on. In this man-
ner, there can be other tables in the pay roll application.
4. A database: It is a collection of tables that are related together. For example, a pay
roll database may consist of employee master table, transaction table, union mem-
bership table, income tax table, and other such tables. An organization may have
multiple databases such as a marketing database, material management database,
finance database, personnel database, and so on.
In the programs, a connection needs to be made to the database and then all the data in the
database can be used: any record can be retrieved, any field can be updated, any record can
be added or inserted, any record can be deleted. In this manner, any data can be manipu-
lated as required by the organization. DBMS technology has removed the dependency on
data for the programs. Now, programs and data are independent of each other and can be
maintained separately without worrying about their impact on each other.
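To make this concrete, here is a minimal sketch that uses SQLite through Python's standard sqlite3 module. SQLite is chosen only because it needs no installation; the table and field names are assumptions made for the example, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# Make the connection to the database (held in memory for this sketch).
conn = sqlite3.connect(":memory:")

# Define a table; the names below are illustrative assumptions.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, pay_rate REAL)"
)

# Add records.
conn.executemany(
    "INSERT INTO employee (emp_id, name, pay_rate) VALUES (?, ?, ?)",
    [(1, "John", 95.0), (2, "Harold", 1250.0)],
)

# Update one field of one record.
conn.execute("UPDATE employee SET pay_rate = ? WHERE emp_id = ?", (105.0, 1))

# Delete a record.
conn.execute("DELETE FROM employee WHERE emp_id = ?", (2,))

# Retrieve what is left.
rows = conn.execute("SELECT emp_id, name, pay_rate FROM employee").fetchall()
print(rows)
conn.close()
```

Notice that the program never describes how the records are laid out on disk; it names only the table and the fields it needs.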
Presently, most commercial applications make use of databases to handle the data of the
applications. Databases evolved from hierarchical databases and network databases to
relational database technology. In hierarchical databases, relations could be set from one
table to many tables. In network databases, relations could be set from many tables to
many tables. In relational databases, each relation is set between one table and just one
other table, through a field that is common to both.
However, by using interface tables, we can achieve many-to-many relationships in rela-
tional technology. Relational databases utilize disk space very efficiently and hence are
preferred over the other two technologies. Oracle, SQL Server, Ingres, Informix, MySQL,
and Progress are examples of commercially available relational DBMS software. A stan-
dard scripting language of SQL (Structured Query Language) was developed for data
definition, retrieval, and maintenance. Programmers can call the SQL routines in their
programs directly and make use of them for data maintenance and retrieval.
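The interface-table idea mentioned above can be sketched with SQL issued from Python's standard sqlite3 module. The project table, the assignment table, and all field names are hypothetical, introduced only for this illustration: one employee can work on many projects and one project can have many employees, and the interface table records each pairing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
    -- The interface table holds one row per (employee, project) pairing.
    CREATE TABLE assignment (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );
    INSERT INTO employee VALUES (1, 'John'), (2, 'Harold');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Inventory');
    INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 10);
    """
)

# Follow the interface table in both directions to list every pairing.
pairs = conn.execute(
    """
    SELECT e.name, p.title
    FROM assignment AS a
    JOIN employee AS e ON e.emp_id = a.emp_id
    JOIN project  AS p ON p.proj_id = a.proj_id
    ORDER BY e.name, p.title
    """
).fetchall()
print(pairs)
conn.close()
```

Here the interface table uses a combination key made of two fields, which SQLite permits.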
The design of databases and the definition of tables can be performed independently
of the software development. Let us now understand a few basic requirements of DBMS
packages so that we can effectively use them in software development.
Each table needs to have a primary key. A primary key is a field in a table in which data
values are not duplicated. Each data value of the primary key is unique. Without a primary
key, a table cannot be defined in a truly relational database. In some relational database
systems, a combination key with the values of two or more fields can be combined to form
the primary key, but most systems require one field to be defined as a primary key. In cer-
tain cases, it may not be feasible to set aside any one field as a primary key. In such cases,
we usually define an additional field and fill it with auto incrementing numbers in order
to have a primary key.
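A short sketch, again using SQLite through Python's sqlite3 module, shows both points: a duplicate primary key value is rejected, and an auto-incrementing field can serve as the key when no natural one exists. The note table and its fields are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (emp_id, name) VALUES (1, 'John')")

# A second record with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO employee (emp_id, name) VALUES (1, 'Harold')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A table with no natural key: an extra auto-numbered field serves as the key.
conn.execute(
    "CREATE TABLE note (note_id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
)
conn.executemany("INSERT INTO note (body) VALUES (?)", [("first",), ("second",)])
note_ids = [row[0] for row in conn.execute("SELECT note_id FROM note ORDER BY note_id")]

print(duplicate_rejected, note_ids)
conn.close()
```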
Contrary to what some perceive, DBMS does not completely eliminate the duplication and
redundancy of data across multiple tables. What it really does is control the redundancy. To set a
relationship between two tables, a field must be common between those tables. It is referred
to as the secondary key (more commonly called a foreign key). While the primary key cannot have duplicate values in the table,
secondary keys can have duplicate values. Incidentally, the secondary key in a table ought
to be the primary key in another table with which the current table is related. A relation is
set between the primary key of a table and the secondary key of a different table. A table
can have only one primary key, but it can have multiple secondary keys.
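The relation between a primary key and a secondary key can be sketched as follows, using SQLite through Python's sqlite3 module. The pay_transaction table and its fields are hypothetical names chosen for the illustration. The employee id is unique in the employee table but repeats in the transaction table, and the two tables are related through it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    -- emp_id is the secondary key here: it may repeat, and it refers to employee.emp_id.
    CREATE TABLE pay_transaction (
        txn_id INTEGER PRIMARY KEY,
        emp_id INTEGER REFERENCES employee(emp_id),
        hours  REAL
    );
    INSERT INTO employee VALUES (1, 'John'), (2, 'Harold');
    INSERT INTO pay_transaction VALUES (100, 1, 40), (101, 1, 38), (102, 2, 42);
    """
)

# Relate the two tables through the common field and total the hours per employee.
totals = conn.execute(
    """
    SELECT e.name, SUM(t.hours)
    FROM employee AS e
    JOIN pay_transaction AS t ON t.emp_id = e.emp_id
    GROUP BY e.emp_id
    ORDER BY e.emp_id
    """
).fetchall()
print(totals)
conn.close()
```

The employee's name is stored once, in the employee table; only the key value is repeated in the transactions, which is the controlled redundancy described above.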
The design of a database is a large enough subject to merit a book in itself. Here I am
including a brief explanation for your understanding of the basics. Usually, in organiza-
tions specializing in software development, it is common to dedicate a database specialist
to design the database and develop routines for efficient data manipulation that are used
by the programmers in their programs. Here are the steps in the design of the database:
Each package would have different rules for naming the tables and fields, as well as data
types for defining data, but most DBMS packages would support the data types described
earlier in this chapter. All the DBMS packages implement the standard SQL language with
some extensions and a few adaptations of their own.
Depending on the type of application you are developing, the type of data storage and
retrieval needs to be determined and used in the development. Chapter 2 introduces you
to the two basic types of data storage and retrieval, namely, the flat files and the DBMS
packages. Now we are ready to understand what computer programs are.
4
Introduction to Computer Programs
Introduction
What is a program? Merriam Webster’s dictionary defines “program” for contexts other
than computers as “a plan of things that are done in order to achieve a specific result,”
and as “a thin book or a piece of paper that gives information about a concert, play, sports,
games, etc.” If you look at a program sheet, you will notice that it lists all the items
to be performed, arranged in the chronological
order of their performance. Usually, the program is adhered to in the performance of
activities to achieve the desired result.
Merriam Webster’s dictionary also gives another definition that is pertinent to computer
programs: “a set of instructions that give information that tell a computer what to do.”
Standard 610, Standard Glossary of Software Engineering Terminology of IEEE (Institute
of Electrical and Electronics Engineers) gives this definition of computer program: “a com-
bination of computer instructions and data definitions that enable computer hardware to
perform computational or control functions.” It is also pertinent to note IEEE’s definition of
software: “computer programs, procedures, and possibly associated documentation and
data pertaining to the operation of a computer system.”
Wikipedia defines a computer program as “a sequence of instructions written to perform
a specified task with a computer.”
Oxford dictionary defines program as “a series of coded software instructions, to control
the operation of a computer or other machine.”
I think these definitions are adequate for us to begin understanding what a program is
in the context of computers and software. One interesting aspect to note is in the definition
from the Oxford dictionary. It included the phrase, “or other machine.” Now we have soft-
ware controlling rockets, airplanes, cars, washing machines, and even children’s toys! But,
these machines have a processor in them to execute the software and then pass on appro-
priate signals to the machine. The processor may not be as powerful as the ones in com-
puters, but they are processors. So, in order to determine realistically if these processors
inside machines are computers or not, we need to look at a universally and undisputedly
accepted definition of “computer.” Unfortunately, such a definition eludes the world. That
is why, perhaps, Oxford dictionary included “machine” in its definition, as it perceived
the processor inside the machines not as computers. Others perceived that the processors
inside machines are computers and did not include “machine” in their definitions.
I prefer the definition of IEEE and I wish to extend it thus: “a combination of computer
instructions and data definitions arranged in their chronological sequence of execution that
enable computer hardware to perform input/output, computational or control functions.”
The phrases “arranged in their chronological sequence of execution” and “input/output”
are my insertions. Now, adopting this definition of a computer program, we can move forward.
Computer programs exist in three forms, namely, the source code, the object code, and
the executable code. Source code is the program written by programmers in a computer
programming language. It is humanly readable and understandable. Object code is the
compiled source code program. It is in a machine readable and understandable form.
Executable code is the object program linked with code libraries and is ready to be exe-
cuted by the computer. Basically, the object code and executable code are similar. Object
code is the compiled program that is then transformed into executable code by linking it
with the applicable libraries.
Many synonyms are used to refer to computer programs. The most common synonyms
are macro, script, agent, trigger, procedure, function, applet, servlet, app, bot, and routine.
Perhaps there could be more, and more would be on the way. While each of these terms
has a definition, they are all basically programs. Programming covers all these types of
programs.
Components of a Program
A program would have the following components, normally:
1. Headers: These are statements placed above the program code and can be used in
the entire program.
2. Program beginning and ending: Every program has an identifiable beginning and an
identifiable ending. These contain specific sets of actions to be performed when
the program begins execution and when it ends.
3. Data definitions: These are definitions of the data proposed to be used during the
program execution. Each definition results in the allocation of RAM as required.
4. Input operations: These operations bring in data from the outside world into the
RAM for processing.
5. Output operations: These operations deliver the processed information to the out-
side world.
6. Computational operations: These operations perform mathematical operations and
produce results.
7. Decision making operations: These operations make programmed decisions and
control the program flow.
8. Program documentation: These statements explain the logic of the program state-
ments, which assist other programmers in maintaining the program.
9. System calls: These operations call the services provided by the operating system
and utilize them in the program.
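The components listed above can be pointed out in even a very small program. The Python sketch below is only an illustration: the payroll rule, the constants, and the function names are assumptions, and the input is passed in as text rather than typed at a keyboard so that the example runs unattended.

```python
"""Program documentation: computes gross pay for one employee (illustrative only)."""

# Header: makes library facilities available to the whole program.
import os

# Data definitions.
OVERTIME_THRESHOLD = 40.0   # assumed hours per week before overtime applies
OVERTIME_FACTOR = 1.5       # assumed overtime multiplier


def gross_pay(hours, rate):
    # Decision-making operation: controls which computation runs.
    if hours > OVERTIME_THRESHOLD:
        overtime = hours - OVERTIME_THRESHOLD
        # Computational operation.
        return OVERTIME_THRESHOLD * rate + overtime * rate * OVERTIME_FACTOR
    return hours * rate


def main(raw_hours, raw_rate):
    # Program beginning. Input operation: raw text is converted to numbers.
    hours = float(raw_hours)
    rate = float(raw_rate)
    pay = gross_pay(hours, rate)
    # System call: ask the operating system for the current process id.
    pid = os.getpid()
    # Output operation.
    print(f"process {pid}: gross pay = {pay:.2f}")
    # Program ending.
    return pay


result = main("45", "20")
```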
Program Statements
The program consists of program statements. In the earlier days, each statement was
punched on one card and, as a punched card could accommodate 80 characters, a statement was
restricted to 80 characters. Continuation of the statement on another card was permitted in
special conditions but was used sparingly. Later, when the VDU (Visual Display Unit) was
introduced, the programming shifted to VDUs. They also were designed to support 80
characters per line, so the same rules continued for the line and statement. But modern
computer screens can display more than 80 characters per line and can even scroll to the
right side to accommodate more characters. That is why many modern programming languages
accept longer lines, up to 255 characters in some cases, but good programming practice is to
put no more characters in a line than the display unit can accommodate without the
need for scrolling the screen horizontally.
A line ends with an LF (Line Feed—ASCII character #10) character and/or CR (Carriage
Return—ASCII character #13) character. It is easy to count the number of lines in a pro-
gram as we just need to count the LF/CR characters.
In today’s programming environment, a statement can span multiple lines. Some
languages need some indication that the line is a continued statement, but many develop-
ment environments do not need this. Some languages use a statement terminator character
such as a semicolon (;). We can write any number of lines as required between two succes-
sive semicolons and all those lines would be treated as a single statement by the computer.
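As an illustration of one language's rules, Python does not require a statement terminator, but it lets a statement continue over several lines inside parentheses or after a trailing backslash, and it accepts a semicolon as a separator between statements on one line. The variable names below are arbitrary.

```python
# One statement written on several lines: the open parenthesis tells Python
# that the statement continues until the matching close parenthesis.
total = (
    100
    + 200
    + 300
)

# One statement continued explicitly with a backslash at the end of the line.
average = total \
    / 3

# Two statements on one line, separated by a semicolon (legal, rarely advisable).
a = 1; b = 2

print(total, average, a + b)
```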
When the source code is translated, it is translated into object program, or object code.
It simply consists of computer instructions and the data definitions. So, object code is the
source code translated into computer instructions.
Now, the data definitions need to be allocated RAM. As we noted earlier, the data is of
different types. So, each of the defined data items would be allocated RAM in appropriate
amounts. An integer would be allocated two bytes, a single precision real number would be
allocated four bytes, and a character string would be allocated the number of bytes equal to
the number of characters it needs to hold. In this manner, RAM would be allocated to all data
items defined in the programs. But, at this point in time, it would not be known where the
RAM would be allocated. So, the compiler would record the total amount of RAM needed,
the number of items, and the address of each item counted from the first item. Actual
locations would be decided when the program is submitted to the computer for execution.
This is referred to as “relative addressing.” Each address is relative to the first address.
Let us assume that the program needed 100 bytes for four data items. The first item begins
at location 1 and extends to 4, the second item begins at 5 and extends to 60; the third item
begins at 61 and extends till 92; and the last item begins at 93 and ends at 100. Now, sup-
pose the allocation of RAM begins at location 102,401. Then the address of the first data
item is 102,401; the address of the second data item begins at 102,405; the address of the
third data item begins at 102,461; and the address of the fourth data item begins at 102,493.
This is just an example to give you an idea about memory allocation and relative address-
ing. Now, all data items are allocated.
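The arithmetic in this example can be checked with a short Python sketch. The item sizes (4, 56, 32, and 8 bytes) are derived from the ranges given above; the function names are assumptions made for the illustration.

```python
# Sizes in bytes of the four data items from the example above.
sizes = [4, 56, 32, 8]


def relative_addresses(sizes):
    """Starting position of each item, counted from 1 for the first item."""
    addresses = []
    position = 1
    for size in sizes:
        addresses.append(position)
        position += size
    return addresses


def absolute_addresses(relative, base):
    """Actual addresses once the location of the first byte is decided."""
    return [base + r - 1 for r in relative]


relative = relative_addresses(sizes)
absolute = absolute_addresses(relative, base=102_401)
print(relative)   # relative positions recorded at compile time
print(absolute)   # actual locations fixed at execution time
```

Only the base address changes from one execution to another; the relative positions stay the same.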
Every programming language makes use of the services of the operating system of the
computer. These services consist of receiving inputs and delivering outputs, using RAM,
and using the CPU. Apart from those services of the operating system, certain other fea-
tures like the mathematical operations and commonly used functions are provided by the
programming language. Now the object program is equipped with memory locations and
linked to the services and common routines, making the program ready for execution by
the computer. Now, this code is referred to as the executable code.
Summarizing our discussion:
A program library is a set of programs made available by the manufacturer of the com-
puter, the developers of the operating system, and the developer of the programming lan-
guage used for writing the computer program.
Here we need to understand the term “compiler.” A compiler is a computer program in
itself that performs the functions as follows:
1. It checks the program for adherence to syntax rules specified by the specific pro-
gramming language. When errors are detected, it lists out the errors so that they
can be corrected.
2. When the program is completely free of syntax errors, it translates the program
into object code.
3. It computes the amount of RAM required for all defined data items and allocates
them using relative addressing.
4. The final output of a compiler is the object program that can be used to link to its
concerned libraries to prepare the executable code.
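The syntax-checking function of a compiler can be observed with Python's built-in compile function. This is only an analogy under stated assumptions: Python translates source code into its own intermediate form rather than into machine-level object code, and it reports the first syntax error it meets rather than a full list. The helper name check_syntax is hypothetical.

```python
def check_syntax(source, filename="<example>"):
    """Return None if the source compiles, otherwise a short error description."""
    try:
        compile(source, filename, "exec")   # translate without executing
    except SyntaxError as err:
        return f"line {err.lineno}: {err.msg}"
    return None


good_source = "pay = hours * rate\n"
bad_source = "pay = hours * (rate\n"    # the closing parenthesis is missing

print(check_syntax(good_source))
print(check_syntax(bad_source))
```

The first call reports nothing because the statement obeys the syntax rules, even though hours and rate are not yet defined; the second call reports the error. The exact wording of the message depends on the Python version.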
Computer Programming
What is computer programming? Computer programming is a set of activities including writ-
ing a program in a programming language, then compiling it and linking it to the relevant libraries
to prepare the executable code, and then testing the program to remove all errors including syntax
errors, logical errors, and computational errors lurking inside the program.
Before the advent of the IDEs (Integrated Development Environments), these were the
steps in writing and preparing the executable programs and then executing them:
1. Use a text editor program to enter the program: This is the first step of writing a computer
program. In the earlier days, programmers used to write the program on
graph paper with one character per square. Then that was punched onto
punched cards by specialist data entry operators using card punch machines. But,
with powerful editors becoming available, the programmers themselves began typing
the program into the computers using text editors. Editors allow entering
the required text, making corrections, deleting, copying, and pasting, and
other such text-editing facilities. The programmers just need to adhere to the syntax
rules prescribed by the specific programming language in which the program is
being written. The programmer also needs to write the program in such a way
that the desired results are obtained when the program is executed with relevant
data on the intended computer.
2. Compile and debug for syntax errors: Once the program is written, the programmer
submits the program to the compiler. The compiler checks the syntax and
lists the errors on a printer or a screen. Then all the syntax errors
listed by the compiler need to be rectified by the programmer using the text
editor. The process of removing errors from a program is called “debugging.”
Once all the errors are removed, the program needs to be resubmitted to the
compiler. The program may go through multiple iterations of compiling
and debugging until all syntax errors are eliminated from the program and
the object code is generated by the compiler without any errors.
3. Link the object code to the required code libraries: This step is achieved using a link-
ing program that scans the program and extracts the names of the code libraries
the program invoked and then links the program to the invoked libraries. In the
earlier days, the code of the code libraries invoked in the program was appended
to the program’s object code. At present, the code libraries are often DLLs
(Dynamic-Link Libraries). They need not be appended to the program code.
They will be loaded dynamically at the time they are called during the execution
of the program. The linking program ensures that all the invoked libraries are
indeed present in the specified directories. With the successful completion of this
step, the executable code is ready.
4. Execute to ensure that the program is working: Now we need to execute the program.
This involves invoking the program. In the earlier days, it involved typing the
name of the program at the command line along with any arguments that are
necessary to run the program. In the current times, we select the program from a
drop-down menu or double click an icon on the desktop of the computer screen.
When we invoke the program, the program is loaded into the RAM and its execu-
tion begins. It will display a screen, a message, print, or whatever the program is
expected to do. Sometimes, the program may contain such logical errors that the
program itself cannot run. This step ensures that the program is indeed running
and can be tested to ensure that it is doing what is expected of it.
5. Test, test data, accuracy of computation, retrieval of right data: Every programmer is
expected to test the program written by him/her thoroughly to ensure that the
program is doing what it is expected to do and is not doing what it is not expected
to do. This is referred to as “self-testing.” Self-testing may not be as thorough as
the testing performed by a specialist tester. Still, the programmer is expected to
test thoroughly so that no errors can be uncovered by the specialist tester. After
all, the programmer is the one who did all the programming and is the best per-
son to know what it can or cannot do. Testing forms part of software quality assur-
ance and is important enough to merit a separate chapter. We will discuss more
about it in the subsequent chapters.
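A minimal sketch of self-testing is given below. The routine net_pay and its rules are hypothetical, invented for the illustration. The test checks a normal case, a boundary case, and a case in which the routine is expected to refuse its input.

```python
def net_pay(gross, deductions):
    """Hypothetical routine under test: net pay may never go below zero."""
    if gross < 0 or deductions < 0:
        raise ValueError("amounts must not be negative")
    return max(gross - deductions, 0)


def self_test():
    """Checks both what the routine should do and what it should refuse to do."""
    assert net_pay(1000, 200) == 800          # normal case
    assert net_pay(100, 250) == 0             # boundary: deductions exceed gross
    try:
        net_pay(-1, 0)                        # invalid input must be rejected
    except ValueError:
        pass
    else:
        raise AssertionError("negative gross was accepted")
    return True


print(self_test())
```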
Now, all these need not be separate steps. With the advent of IDEs, all these can be performed
from within the IDE. An IDE allows for entering the program code, editing it, compiling it,
running it, testing, and debugging it as well. Each supplier of the programming language
supplies an IDE tailored to their programming language. Commercial off-the-shelf IDEs
are also available for use in the software development.
Debugging has become a very popular term in software development circles. The
story usually told about its origin is as follows. In 1947, the Harvard Mark II, an early
computer built with many relays, stopped working properly, and the technicians worked
to find the problem and rectify it. They found a moth stuck
between the contacts of a relay. The dead moth was removed, and the computer began
working again. The moth was taped into the log book with a note describing it as the
first actual case of a bug being found. From then on, the term
“debug” became very popular and entered the dictionaries. Debug has come to be understood
as removing defects from anything.
When we invoke a program for execution on a computer, the following activities are
performed:
10. If the process is deallocated only because its time slice expired, then the state of
the process would be set to “ready” when it is deallocated.
11. All the processes in the “ready” state would be in the queue for the allocation of
CPU. CPU time would be allocated to each of the processes in the ready state in turn, so
that they are executed process by process in a round-robin manner.
12. When all the instructions in the program are executed and all the data is processed,
the program would be removed from the RAM, all data files and databases opened
for the program are closed, and the process is removed from the process table.
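The round-robin allocation described in these steps can be simulated in a few lines of Python. This is a simplified sketch under stated assumptions: each process is represented only by a name and a count of instructions remaining, the time slice is measured in instructions, and waiting for input or output is ignored.

```python
from collections import deque


def round_robin(processes, time_slice):
    """processes: dict of name -> instructions remaining. Returns the completion order."""
    ready = deque(processes.items())     # the queue of processes in the "ready" state
    finished = []
    while ready:
        name, remaining = ready.popleft()        # the CPU is allocated to this process
        remaining -= min(time_slice, remaining)  # it runs for at most one time slice
        if remaining == 0:
            finished.append(name)                # done: removed from the process table
        else:
            ready.append((name, remaining))      # slice expired: back to "ready"
    return finished


order = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2)
print(order)
```

The shortest process finishes first even though it was not first in the queue, because no process is allowed to hold the CPU beyond its time slice.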
Programming Styles
We have noted that a computer program is a sequence of statements arranged in their chrono-
logical sequence. The computer executes the first instruction in the program and then it comes
to the second, then the third, and so on. The computer execution goes like a waterfall from the
top of the program to the bottom of the program, executing one instruction after the other. Of
course, by inserting control statements, we can make the computer depart from this order
and execute a different set of instructions. These statements are referred to as branching
statements (meaning that these instructions divert the execution to another branch of the
program) and control statements (as these instructions control the flow of execution).
When we write programs in such a way as to flow freely from top of the program to
the bottom of the program, the style is referred to as waterfall programming. We have to
note that no program can be written completely avoiding some sort of decision making
and branching based on the outcome of the decisions. Experience has shown that the
branching statements in this style of programming can divert the execution to undesirable
locations and produce unexpected, inaccurate, and undesirable output. This has
been especially true in the case of very long programs exceeding a thousand lines of code.
So, this sort of programming is more or less shunned. We use this style in a limited man-
ner, especially in very short programs like macros, scripts, functions, and subprograms.
This style is superseded by what is popularly referred to as “structured programming.”
The following guidelines define structured programming:
Now with GUI (Graphical User Interface) becoming the norm for all software, structured
programming has been more or less implemented across the applications. In the GUI, each of
the controls has a number of events associated with it. They include gotfocus, lostfocus, click,
doubleclick, and so on. Each event needs to be programmed separately. Thus, each event is
controlled by a subprogram. Structured programming is inbuilt in the GUI applications.
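The idea that each event is handled by its own subprogram can be sketched without any GUI toolkit. In the Python below, the handler functions, the handlers table, and the dispatch function are all assumptions made for the illustration; a real GUI library would supply its own way of registering handlers.

```python
log = []


def on_click(control):
    log.append(f"{control} clicked")


def on_gotfocus(control):
    log.append(f"{control} got focus")


def on_lostfocus(control):
    log.append(f"{control} lost focus")


# Each event name is mapped to the subprogram that handles it.
handlers = {
    "click": on_click,
    "gotfocus": on_gotfocus,
    "lostfocus": on_lostfocus,
}


def dispatch(event, control):
    """Run the handler registered for the event; ignore events with no handler."""
    handler = handlers.get(event)
    if handler is not None:
        handler(control)


# Simulate a user moving to a button, clicking it, and moving away.
for event in ["gotfocus", "click", "doubleclick", "lostfocus"]:
    dispatch(event, "OK button")
print(log)
```

No handler was written for doubleclick in this sketch, so that event is simply ignored.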
Readability of Programs
When we write programs, we need to keep in mind that the program
could be running for years and would need maintenance. Some programs written in the 1960s
and 1970s are running even today. The programmers who wrote them retired long ago,
and the programmers who are maintaining them are today’s youngsters. Therefore, while
we write programs, we need to ensure that the programs are written in such a way that it
is easy for other people to read and understand them. The following guidelines help us in
writing programs that are easily understandable to others:
1. We write each line such that the reader need not scroll horizontally.
2. When a line is subordinate to the preceding line, then we begin it with an offset.
That is, if the principal line is starting at position 1, we begin the subordinate line
at an offset of one tab-space. Here is an example:
Here is the first principal line. It begins at position 1
    Here is the first subordinate line. It is offset by one-tab space
    Here is the second subordinate line
Now here begins another principal line
3. Most programming languages provide keywords to reduce the program length.
Such keywords combine a few lines of code into one line. They allow us to write
short programs. But, when something does not work as expected, it becomes very
difficult to debug those statements that combine multiple lines of code into a sin-
gle line. When the program is compiled, both programs (with one line or multiple
lines for achieving the same functionality) would translate into the same number
of computer instructions in the executable code. Therefore, it is better to write
longer programs than short programs that use complex keywords. Of course, it
takes more time to type in more lines, but it reduces the time spent in debugging,
and the total time taken is almost the same.
4. When a statement has to be longer than what the screen can horizontally accom-
modate in one line, it is better to break the statement into shorter statements.
Alternatively, we can make use of the statement continuation facility to break it
into multiple lines. When we continue a statement on a second line, it is better
to offset the next line by a tab space so that it is clear that it is subordinate to the
previous line.
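The third guideline can be illustrated by writing the same computation twice in Python, once as a single compact statement and once step by step. The payroll rule and the data are made up for the example. Both forms give the same result, but in the expanded form each intermediate value has a name and can be inspected when something goes wrong.

```python
employees = [("John", 45, 20.0), ("Harold", 38, 30.0)]  # (name, hours, rate), made-up data

# Compact form: everything in one statement.
compact = {n: (40 * r + (h - 40) * r * 1.5 if h > 40 else h * r) for n, h, r in employees}

# Expanded form: the same computation, one step per line.
expanded = {}
for name, hours, rate in employees:
    if hours > 40:
        regular = 40 * rate
        overtime = (hours - 40) * rate * 1.5
        pay = regular + overtime
    else:
        pay = hours * rate
    expanded[name] = pay

print(compact == expanded, expanded)
```

Note that the compact statement is also too long to read without horizontal scrolling on a narrow screen, which runs against the first guideline as well.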
We need to note that the hallmark of a good program is that it is maintainable in addition
to the main functionality of doing what is expected: produce expected results diligently
and accurately.
1. Open: Open something, such as a database, a table, a port, or a device. This state-
ment tells the computer to begin communication with the specified object. Once
the operating system encounters this keyword, it sends a command to the speci-
fied object to begin the communication. If the specified object is available and
ready to reciprocate communication with the calling program, it responds with an
acknowledgement, which means that the object is ready to exchange information
with the calling program. Sometimes, the device may send a busy response, which
indicates to the calling program that it needs to wait and keep calling for the object
until it receives a ready response. Sometimes, when the specified object is absent,
no response is received. In such cases, the operating system generates an appropri-
ate error message and sends it to the calling program. Then the calling program
needs to interpret the error message and take appropriate action as specified by
the programmer. When the proper acknowledgement is received from the object,
then the operating system makes an entry in the open objects table and stores all
required information about the object so that further exchange of information can
take place between the object and the calling program.
2. Close: This keyword tells the operating system to close the communication with the
specified object. The specified object can be a database, a table, a file, or an I/O device.
When this statement is encountered by the operating system, it tells the object that
it is now free to interact with other programs, its state is set as “free” in the relevant
table, and its entry is deleted from the open objects table of the calling program.
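The open and close sequence can be sketched in Python, where the built-in open() function and the close() method of the file object play the roles of the Open and Close keywords. This is only an illustration, assuming that the object being opened is an ordinary disk file; the file names are made up for the example.

```python
import os
import tempfile

# Hypothetical file name, created in a temporary folder for the demo.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

# Open: ask the operating system to begin communication with the file.
handle = open(path, "w")
handle.write("hello")
# Close: tell the operating system that we are finished with the file.
handle.close()
print(handle.closed)  # True once the file has been closed

# Opening an object that is absent produces an error that the
# program must interpret and act upon.
missing = os.path.join(folder, "no_such_file.txt")
try:
    open(missing, "r")
    outcome = "opened"
except FileNotFoundError:
    outcome = "absent"
print(outcome)
```

The try and except lines show the calling program interpreting the error and taking the action specified by the programmer, here simply recording that the object is absent.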
In this manner, there are many keywords for each of the programming languages. How does
a programmer make use of these keywords? The programming language defines a set of
rules generally referred to as the syntax for the language. The syntax specifies the following:
1. The number and type of data items to be used along with the keyword.
2. Other subordinate keywords that are expected to be used along with the main
keyword. For example, in many programming languages, the “If” keyword needs
to have the keyword “Then,” as well as “Else,” following it.
3. The exact arrangement of keywords and data items, including the order in which
they need to be placed in the statement.
4. The rules for continuing a statement on the next line.
5. The rules for terminating a statement.
6. The rules to group a set of statements into one block.
7. The rules for calling a subprogram or the system services offered by the operating
system of the computer.
8. The rules for handling the devices connected to the computer, such as the printer
or scanner.
The syntax can have other rules that are specific to the programming language. The rules
of the syntax must be adhered to in toto. Any laxity in adhering to the rules of syntax
results in an error during compilation. When the compiler encounters a syntax
error, it generates an error message and does not generate the
object code.
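A small Python sketch can make this behavior visible. Python's built-in compile() function checks a piece of source text and raises SyntaxError when a rule is broken, in which case no code object is produced. The helper name check_syntax and the two sample statements are assumptions made for this illustration; other languages report syntax errors in their own ways.

```python
def check_syntax(source):
    """Return "ok" if the source text follows Python's syntax rules,
    otherwise return "syntax error"."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError:
        return "syntax error"


# In Python, "if" must be followed by a condition and a colon.
good = "if 1 < 2:\n    x = 1\n"
bad = "if 1 < 2\n    x = 1\n"   # the colon is missing

print(check_syntax(good))
print(check_syntax(bad))
```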
To be able to write good-quality, efficient programs in a programming language,
one must master the syntax rules of the language besides learning as many keywords as
possible.
5
Algorithms and Flowcharts
Introduction
Two questions every beginner to computer programming faces are, “How do I know what
the sequence of statements should be? How do I solve the problem at hand by writing a
computer program?” The answer to these questions is algorithms and flowcharts. First, let
us see what an algorithm is and then understand flowcharting. In order to solve problems
our minds use a complex mechanism to arrive at solutions. We really do not have to think
systematically. For example, if you ask a fourth grader, “What is the sum of 2 and 4?” he
would immediately answer, “6.” How did he arrive at it? Perhaps he would tell you that
he knows. Did he think about it systematically? No! How exactly our minds arrive at solu-
tions to problems is still an area of research, but computers are not gifted as human beings
are with a brain. Therefore, we have to understand automated reasoning. Automated rea-
soning is building a mechanism to reason in a systematic manner and to arrive at accurate
conclusions. Algorithms are one of the most popular tools of automated reasoning.
Algorithm
Merriam Webster’s dictionary defines “algorithm” as “a step by step procedure for solving
a problem or accomplishing some end especially by a computer.” Those who have studied
mathematics in some depth would readily understand algorithms. Most mathematical
problems are solved using algorithms.
We need to understand that a computer is a sincere and diligent servant who would
carry out all instructions “as specified” but would not correct the defects in our instruc-
tions. Your brain has intelligence, but there just is no intelligence in a computer. It cannot
apply its “common sense” as it does not have any. Everything needs to be programmed
and all instructions need to be provided to it. Let us understand an algorithm with an
example.
Let us take two numbers and add them. It is a very simple thing for human beings. If I
say, “Add 2 and 4.” You would answer, “6,” in no time at all. How did you do this? Perhaps
you are not really aware how this is done, but surely:
Right? Your brain performs these steps so fast that you are not even aware that those
steps were indeed performed by your brain. Now if this is to be performed by a computer,
similar steps would need to be performed, though their order would change
slightly. Here are the steps:
1. Receive the input of the first number and store it in the RAM: Computers have no eyes
or ears to automatically know when we supply data. The first number is a data
item of numeric type. We store data in our brain and the computer stores the data
in its RAM. The supplied data remains in the RAM until the program is closed.
The CPU needs all data items to be in the RAM for processing. The CPU does not
access input devices directly. Receiving the input from a specified device is carried
out by a program that is usually part of the operating system.
2. Receive the input of the second number and store it at a different location in the RAM: It
needs to receive the second number, just as it received the first number, and store it
at a different location. In this manner, the computer needs to receive as many data
items as we supply to it and keep storing them at different locations until they are
processed and the program is closed.
3. Receive the instruction of what to do with those two numbers: Once all the data items to
be processed are received and stored in their respective locations, we need to supply
the instructions to do the operation with the supplied data items. Notice here
the contrast between human beings and the computer. We give human beings the
instructions first and the data items next, but for computers, it is the
reverse. We supply the data items first and then the instructions to process the
data items.
4. Perform the addition using the program for solving mathematical operations existing inside
the computer: The knowledge to solve a mathematical problem is stored inside our
brain. For the computers, it is supplied by the program libraries included with the
programming language. When an instruction is given to us, we consider if we
already have the knowledge to perform it. If we have it, then we do it. Otherwise,
we respond by saying, “I do not know how to do it.” When the computer comes
across an instruction, it searches the included libraries to see if the procedure to
solve it exists. If it exists, it performs the operation; otherwise, it generates an error
message that it cannot perform the operation and gives it as output.
5. Store the result at a third location in the RAM: Once the operation is performed, it
needs to store the result inside the RAM. The operation is performed by the CPU,
and the CPU is not a storage area. So, the CPU transfers the result to a location
inside the RAM. All outputs are taken from the RAM onto the specified devices.
The CPU does not interact with the output devices directly.
6. Give the output on an output device: Once the result is inside the RAM, the program
to deliver the output would begin execution to deliver the result to the specified
output device. This program for delivering the outputs to the specified device is
usually part of the operating system.
From the earlier explanation we can draw some conclusions for our future use:
1. All data items need to be supplied to the computer through an input device. We
saw in Chapter 1 what the possible input devices are. However, when we
are developing the algorithms, we need not be concerned with the specific input
device that would be used. We simply say “Read” or “Input.” Each input device
comes with software called the “device driver” that is installed and becomes
part of the operating system.
2. All data items supplied by the computer to the outside world are delivered through
an output device. Each output device also comes with a device driver and becomes
part of the operating system on installation. At the time of developing the algo-
rithm, we need not concern ourselves with the specific output device to be used.
We can simply say “Write” or “Print.”
3. All data, either input data items or output data items, needs to be stored inside the
RAM. The CPU can access data items only from the RAM.
We stated that the data items and the results of operations are stored at different
locations inside the RAM. Who specifies the location for each of the data items? We
do, of course! We do it symbolically though. We use a name for the location and the
operating system converts it to a location and uses it. For example, we do not say,
“Read 3.” We instead say, “Read a number A.” That is, we are specifying the address
of a memory location symbolically referred to as A into which the input needs to be
stored.
Similarly, we specify “C = A + B.” We are telling the computer to add the contents of
memory location A and the contents of memory location B and store the result in memory
location C. Now, we can develop a generic algorithm to add a series of pairs of numbers, as
follows:
1. Read A
2. Read B
3. C = A + B
4. Print C
5. Prompt to enquire if there are more numbers
6. Receive the answer from the user
7. If the answer is “Yes,” then go to step 1
8. If the answer is “No,” then stop
When this algorithm is fed to the computer and the computer executes it, the following
actions are taken by the computer:
1. The first statement causes the computer to receive a number from the specified
input device. Of course, in the earlier algorithm we did not specify the input
device. It would be specified in the program. In case no input device is speci-
fied, it would expect the input from the default input device, which is usually the
keyboard. The number received from the input device is then stored in a memory
location symbolically addressed as A.
2. The second instruction causes the computer to receive the second number from
the input device and stores it at the memory location addressed as B.
3. The third statement causes the computer to take the contents of memory locations A
and B, add them together, and store the result in a memory location addressed as C.
4. The fourth statement causes the computer to print the contents of the memory
location addressed as C on the specified output device. If the output device is not
specified, then it would be printed on the default output device, which is usually
the computer screen.
5. The fifth statement causes the computer to display a message on the computer
screen asking if the user wants to perform more additions. The actual message to
be displayed needs to be included in the program. Of course, we can include it in
the algorithm also.
6. The sixth statement causes the computer to wait for the user to enter an answer
and receive it. The answer can be “Yes” or “No.” We may specify that his/her
answer be stored in a memory location or directly be taken to the CPU. We need
to specify the alternative in the program.
7. The seventh statement causes the computer to make a decision based on the
answer supplied by the user.
8. Once the decision is made, the computer begins executing the program all over
once again if the user-supplied answer was “Yes.”
9. If the answer was “No,” the computer would stop running the program. This
would include the following steps:
a. Remove the program from the list of programs under execution.
b. Release the RAM that was allocated for storing the program and the data items
used by the program.
c. If any data files were opened for use by the program, they would be closed.
d. It would release the input and output devices used by the program for use by
others.
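The eight-step algorithm for adding pairs of numbers can be sketched in Python as follows. This is a minimal sketch and not the only possible program: the function name add_pairs and its read and write parameters are assumptions introduced so that the example can run with canned answers instead of a keyboard, and any answer other than “Yes” is treated as “No.”

```python
def add_pairs(read=input, write=print):
    """Read two numbers, add them, print the sum, and repeat
    for as long as the user answers "Yes"."""
    while True:
        a = float(read("Enter the first number: "))    # step 1: Read A
        b = float(read("Enter the second number: "))   # step 2: Read B
        c = a + b                                      # step 3: C = A + B
        write(c)                                       # step 4: Print C
        # Steps 5 and 6: prompt the user and receive the answer.
        answer = read("Add more numbers? (Yes/No): ")
        # Steps 7 and 8: go back to step 1 on "Yes"; otherwise stop.
        # Assumption: anything other than "Yes" is treated as "No".
        if answer.strip().lower() != "yes":
            break


# Scripted demonstration: canned answers stand in for the keyboard.
answers = iter(["2", "4", "Yes", "10", "5", "No"])
results = []
add_pairs(read=lambda prompt: next(answers), write=results.append)
print(results)
```

Calling add_pairs() with no arguments would use the keyboard and the screen, the default input and output devices mentioned above.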
In this manner, we can develop other algorithms. What should be the level of granular-
ity that is desirable? There is no fixed answer to this question. The level of detail and
granularity of the algorithm depends on the next person, the programmer who needs
to understand the algorithm and convert it into a computer program using a programming
language. When the programmer is reasonably expert and has adequate experience
in computer programming, we need not include excessive detail and can keep
the granularity at a coarse level. If the programmer is a trainee, then we need to include
greater detail and finer granularity in the algorithm.
But how do we develop an algorithm?
Developing Algorithms
The following are the steps in developing an algorithm:
Flowcharts
Flowcharts, as the name implies, are charts that depict, in this case, the flow of the execution
of instructions by the computer. Alternatively, flowcharts graphically depict the steps that we
intend the computer to execute, in sequence, to arrive at the desired
solution for the problem. These are graphical methods and assist us in visualizing the process
more easily. It is said that a picture is worth a thousand words. Flowcharts were the only tool
for computer programmers until very recently for developing algorithms for computer
programming, but lately, other graphical tools have been developed, which include Data Flow
Diagrams (DFDs), class diagrams, Unified Modeling Language (UML) diagrams, structure charts, and
so on. But still, at a program level, a flowchart is the best technique to develop an appropriate
algorithm and to evaluate its accuracy in arriving at the desired solution flawlessly.
Should we use both algorithms and flowcharts? It depends on the complexity of the
problem at hand. If the problem is simple in nature, we may use either an algorithm or a
flowchart. If the problem is complex in nature, then it may be better to use both the algorithm
and the flowchart. One point may be noted here: a flowchart is
easier to read, analyze, and interpret than an algorithm, especially when the problem is
complex in nature.
Flowcharts use the following symbols:
Depicts the beginning or the ending of the flowchart. It is either the first block in
the chart or the last one in the chart.
Depicts a step in the process. It depicts the action to be performed.
A decision block. It handles a total of three possible outcomes for a decision
scenario.
Depicts data either as input or output. We use this block when we have not
decided which device to use for input or output.
Depicts paper-based output or a device that handles paper. Scanners are input
devices handling paper, and printers are output devices that handle paper.
Depicts data from or to a disk drive.
Depicts a computer screen.
Depicts the connector between blocks. The direction of the arrow depicts the
flow of execution.
FIGURE 5.1
Flowchart to add two numbers. (Blocks, in order: Start; Read A; Read B; C = A + B; Print C;
decision “Wish to add more numbers?”, where Yes returns to Read A and No leads to End.)
There are some more symbols that are used in preparing flowcharts. You can learn them
as you get more and more proficient in flowcharting. The earlier symbols suffice to begin
flowcharting.
Let us depict the algorithm we enumerated above to add a couple of numbers. It is
depicted in Figure 5.1. Of course, we can extend it to add more numbers. We can also com-
bine all read operations into one process step.
From this, we can enumerate the general rules followed while preparing the flowcharts.
1. The first and last blocks are horizontal ellipses. All other blocks are embedded
between these two blocks. It is the normal practice to caption the first block as
“Start” and the last block as “End” or “Stop.”
2. The desired flow of execution is depicted using a connector line with an arrow-
head at one end.
3. The flow is generally forward, that is, from top toward the bottom and from left
toward right. An exception to this rule is when the flow goes back as a result of a
decision, as shown in Figure 5.1 at the decision block.
4. We can have any number of blocks between the Start block and the End block.
5. When the flowchart cannot be accommodated on one single sheet, we can con-
tinue it on to another sheet by using a connector. A connector is a simple circle
with a number inside it. The number on the connector at the end of the sheet
would be the same as the number inside the connector at the beginning of the next
sheet where the flowchart is continued.
I will end the discussion on algorithms and flowcharts by developing the algorithm
and its flowchart for determining whether a given number is a prime number. What is
a prime number? It is a whole number greater than 1 that is divisible, leaving a
remainder of zero, only by itself and 1. Any number can be divided by 1 leaving a remainder of zero. So, a
prime number is a whole number greater than 1 that cannot be divided, with zero as the
remainder, by any number other than 1 and itself.
In mathematics, we divide the number by a divisor that starts at 2 and is incremented by 1
until the divisor just crosses half the value of the number. This can be illustrated by an example:
Let us take a number 11 and find if it is a prime number
Now, we can generalize the previous steps and write the algorithm as follows:
10. Go to 12
11. Print “M is a Prime Number”
12. End
Now, this algorithm is depicted pictorially as a flowchart in Figure 5.2. Of course, we can
modify this flowchart. We can also have a different algorithm to arrive at the solution.
I have used a simpler version to introduce you to the techniques of developing algorithms
and flowcharting.
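The trial-division idea described above can be sketched in Python as follows. This is a hedged sketch rather than a transcription of the algorithm: the function name is_prime is an assumption, the stopping condition is tested before each division so that 2 is reported as a prime number, and numbers below 2 are treated as not prime.

```python
def is_prime(m):
    """Trial division: try the divisors 2, 3, 4, ... up to half of m."""
    if m < 2:
        return False          # assumption: 0, 1, and negatives are not prime
    n = 2
    while n <= m / 2:         # stop once the divisor crosses half of m
        if m % n == 0:        # remainder is zero, so m has a divisor
            return False
        n = n + 1             # add 1 to N and try again
    return True


for m in (2, 11, 12):
    if is_prime(m):
        print(m, "is a Prime Number")
    else:
        print(m, "is not a Prime Number")
```

For 11, the divisors 2, 3, 4, and 5 are tried and each leaves a nonzero remainder; the next divisor, 6, is more than half of 11, so the loop ends and 11 is reported as a prime number.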
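FIGURE 5.2
Flowchart to determine prime numbers. (Blocks, in order: Start; Read M; N = 2; Divide M by N;
decision “Is the remainder zero?”, where Yes leads to Print “M is not a prime number” and then
Stop; on No, decision “Is N > M/2?”, where No leads to Add 1 to N and back to Divide M by N,
and Yes leads to Print “M is a Prime number” and then Stop.)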
6
Statements and Assignment Statements
Introduction
As we have noted in the earlier chapters, computer programs are a sequence of instructions
to the computer giving details of what needs to be accomplished and the details of the
data to be used. Instructions to the computer are given using statements. “Statement” is
a very frequently used word in general parlance. It is similar in meaning to information,
but it connotes authentication and affirmation and is expected to be much more reliable.
Statements are made in courts of law, by politicians, and in press conferences by others. In
general, a statement has the following attributes:
That is why the word “Statement” is used in respect of computer programs. In the earlier
days, a statement was accommodated in one line. A line and a statement were synonymous.
In those days, a line could accommodate only 80 characters, so when a need arose to
span a statement across multiple lines, computer programmers came up with a
method to continue the statement onto the next line. Modern computer screens can
accommodate more than 80 characters per line, and some screens can accommodate up
to 255 characters per line. Others have the facility of horizontal scrolling and thus accommodate
more than 80 characters per line. Most modern programming languages permit
up to 255 characters per line and also allow multiple lines in a single statement. Presently,
a statement in a computer program has the following attributes:
4. A statement can be contained in one line, or it can span across multiple lines.
When the statement spans across multiple lines, it is the general practice to prefix
the continuing statement with a specially designated character. The continuation
character differs from one programming language to another. In many modern
programming languages, this is not a requirement.
5. Generally, a statement is terminated by a specially designated character. The most
commonly used statement terminator character is the semicolon (;).
6. Statements must follow the prescribed syntax.
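As a hedged illustration of attributes 4 and 5, here is how one language, Python, handles continuation and termination. The variable names and values are made up for the example, and other languages follow different rules. Note that Python places its continuation character at the end of the line being continued.

```python
price = 250
quantity = 4
discount = 100

# One statement on one line.
total = price * quantity - discount

# The same statement continued across lines. Python needs no
# continuation character inside parentheses; the continued lines
# are offset to show that they are subordinate.
total_wrapped = (
    price * quantity
    - discount
)

# A backslash at the end of a line is Python's explicit
# continuation character.
total_backslash = price * quantity \
    - discount

# Python ends a statement at the end of the line; a semicolon is
# optional and mainly separates two statements on one line.
a = 1; b = 2

print(total, total_wrapped, total_backslash, a + b)
```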
All programming languages specify a set of syntax rules for writing statements in that
language. Let us discuss the syntax here.
Syntax
Each statement in a computer program has to adhere to a set of rules specified by the pro-
gramming language. This set of rules is commonly referred to as syntax for that language.
Syntax is defined as “the arrangement of words and phrases to create well-formed sentences
in a language” by Google. Merriam Webster’s Dictionary defines syntax as “arrangement
of words in a sentence, clauses and phrases.” Wikipedia defines syntax as “the set of rules
that defines the combinations of symbols that are considered to be a correctly structured
document or fragment in that language” in the context of computer programming.
Syntax for programming languages has the following characteristics:
1. The program begins with a specially designated keyword to indicate the begin-
ning of a program.
2. The program ends with a specially designated keyword that tells the computer that
the program execution is completed and necessary closing actions can be taken.
3. All program statements have to be embedded between these program-beginning
and ending keywords.
4. It uses words and phrases defined by the programming language. The words
are of two types: words that are defined by the programming languages and the
words that are defined by the programmer. The words defined by the program-
ming language are often referred to as “keywords.”
5. Keywords include certain symbols, such as arithmetic operators (+, −, *, /, ^),
parentheses, relational symbols (<, >, =), and any other programming-language-specific
symbols.
6. The keywords and the words defined by the programmer are mutually exclusive.
The keywords cannot be used by the programmer for any purpose other than the
purpose defined by the programming language. User-defined words are usually
data declarations.
7. Generally, the statement begins with a keyword and is followed by user-defined
words and, in some cases, other keywords. Exceptions to this rule are found
especially in arithmetic statements, in which case the statement can begin with a
programmer-defined word.
8. There are rules for arranging the keywords, symbols, and the programmer-
defined words, and these must be adhered to strictly. No deviation is allowed
from these rules under any circumstances. Of course, some programming
languages like COBOL (Common Business Oriented Language) allow the use of
some words for documentation purposes.
9. A statement in most programming languages is terminated by a specially desig-
nated character. A semicolon (;) is the most commonly used statement termination
character.
Non-adherence to these rules of syntax would give rise to what are referred to as “syntax
errors.” The first step in converting the source code to executable code is checking
the program for syntax errors. The compiler begins the conversion of source code to
executable code only when there are absolutely no syntax errors in the source code of the
program. When the program compiler encounters a syntax error, it halts all further steps
to convert the source code to executable code and outputs the syntax errors so that the
programmer can correct them and resubmit the program for compilation.
Types of Statements
The statements are classified in two ways: by the function they perform and by the com-
plexity of the statements. By the function performed, statements are classified into the
following types:
1. Declaration statements: Declaration statements are used for declaring the data being
used in the program. Declaration statements also declare any custom or special
libraries other than those that were supplied by the vendor in the SDK (Software
Development Kit) used by the program.
2. Assignment statements: Assignment statements assign a value to a variable already
declared earlier in the program. The value to be assigned may be another variable,
a constant value supplied within the program, or the result of evaluating an
arithmetic expression. We will learn about expressions in the later chapters.
3. Input/output statements: These statements, as the name itself implies, receive data
from the outside world or deliver data to the outside world.
4. Control statements: Control statements stop the program execution from going to
the next sequential statement and can shift the execution to another statement,
which need not be the next statement in the sequential order. They assist us in
making programmed decisions and processing the information as necessary.
5. Loops: Loops are blocks of statements that are executed repeatedly
based on a condition. These are very useful statements in programming.
6. Error-handling statements: Even when we take extreme care to avoid errors,
errors can still creep in while the program is running due to errors in the data, unexpected
results, and the environment. We use error-handling statements to trap the error
conditions and take corrective actions to ensure that the program closes smoothly
or to move the program away from the error condition to a normal condition.
7. System calls: Usually, programs are written using the facilities provided by
the programming language. Sometimes, we may need to utilize the facilities
provided by the operating system that are not provided by the programming
language. This is achieved by the system call statements. These are utilized more
by advanced programmers.
8. Inter-process communication: These statements are also not the run-of-the-mill
statements utilized by every programmer. These are utilized by advanced pro-
grammers. These statements are utilized when two programs running in the
computer concurrently need to communicate with each other for an exchange of
information. These are heavily used in the development of the system software
and real-time software.
9. Interrupt handling: These statements are used by programmers in the develop-
ment of software for device handling. Any device connected to the CPU places
an interrupt on the CPU when it needs to communicate. These are also advanced
statements used in the development of system software and real-time software.
10. Device handling: All the peripherals including printers, scanners, other machines,
airplanes, and rockets need special software. These are usually referred to as
device drivers. Device drivers are software programs that handle the devices and
make them perform the desired functions. These have special calls and syntax.
11. Starting and ending statements: Every program begins with a special statement. This
tells the computer that the program begins execution, and this instruction is fol-
lowed by a series of instructions to be executed by the computer. Similarly, every
program ends with a special statement. It tells the computer that the execution of the
program is completed without a hitch so that it can close the program, delete it from
the RAM, and release all the RAM allocated for the data items used in the program,
as well as release the devices used by the program. These are part of every program.
12. Documentation statements: One thing is certain in software development and pro-
gramming. Every developed program put into production needs to be modified
and enhanced during its lifetime. The original programmer who wrote it may
have departed or may not be available to maintain the program. Therefore,
programs need to be written in such a manner that another programmer can
understand and maintain them. Programming languages are normally
cryptic in nature. So, all programming languages provide a feature called
“commenting statements.” The commenting statements are completely ignored by
the computers while compiling and preparing the executable code. These are for
the reference by the programmers maintaining the programs. Usually, a specially
designated character is prefixed to the statement to indicate that this statement is
a commenting statement used for the purpose of documenting the program logic.
By the complexity of the statements, statements are classified into the following types:
1. Single statements: Single statements perform one function and are usually con-
tained in one line or a set of continuing lines. Assignment statements are usually
single statements.
2. Compound statements: Compound statements have multiple statements and per-
form more than one function and span across multiple independent lines. These
are also referred to as “block statements,” or as a block of statements. Control
statements and loops are usually compound statements.
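Several of these statement types can be seen together in a short Python sketch. The routine classify, its data, and the pass mark of 50 are hypothetical and serve only to label the kinds of statements; the annotation on total stands in for a declaration, since Python does not require separate declaration statements.

```python
import math                      # declares a library used by the program

def classify(marks):
    """Hypothetical routine: return "pass" or "fail" for a list of marks."""
    total: float = 0.0           # declaration combined with initialization
    for mark in marks:           # loop: a compound (block) statement
        total = total + mark     # assignment: a single statement
    try:                         # error-handling statement
        average = total / len(marks)
    except ZeroDivisionError:    # traps the error when the list is empty
        return "no data"
    if math.floor(average) >= 50:    # control statement
        return "pass"
    else:
        return "fail"

print(classify([40, 70, 65]))    # output statement
```

The text after each # sign is a documentation (commenting) statement and is ignored when the program runs.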
In this chapter, we will discuss the assignment statements in detail. We will discuss
the documentation statements, starting and ending statements, declaration statements,
system calls, inter-process communication, interrupt handling, and device handling
statements in a separate chapter. I have dedicated a separate chapter to each of the other
statements.
Assignment Statements
Assignment statements form the major chunk of any computer program. They are the statements
in which the actual work is carried out. Assignment statements are borrowed from
mathematics. In mathematics, an assignment statement looks like this:
a ← b+c+d−e
The same rules apply to assignment statements in computer programming too, but the
assignment symbol is not the arrow sign. In most programming languages, it is the equals
sign (=). An assignment statement in a computer program looks like this:
A = B+C
One way in which assignments in computer programming differ from mathematical
assignment statements is that the variable on the left-hand side of the assignment
symbol can also appear on the right-hand side of the assignment symbol. For example,
“A = A + B” is permitted in computer programs but not in mathematical assignments. This
statement results in:
3. Add A and B
4. Assign the result to memory location A
1. The assignment statement contains the assignment symbol, occurring only
once, which is usually the equals sign (=).
2. On the left-hand side of the assignment symbol, there must be a single variable.
3. On the right-hand side of the assignment symbol, there can be:
a. A constant, which can be a number or a literal (a string of characters enclosed
in quotation marks).
b. A mathematical expression containing all variables, all constants, a combination
of variables and constants, or a combination of expressions.
c. The variable on the left-hand side of the assignment symbol can be part of
the expression on the right-hand side of the assignment symbol.
4. During execution of the program, the expression on the right-hand side of the
assignment symbol is evaluated and the result would be assigned and stored
in the memory location symbolized by the variable on the left-hand side of the
assignment symbol.
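These characteristics can be tried out in a few lines of Python. The variable names and values below are made up for the illustration.

```python
b = 5
c = 7

a = 10            # a constant on the right-hand side
name = "Asha"     # a literal (a string of characters in quotation marks)
a = b             # another variable
a = b + c * 2     # an expression mixing variables and constants

# The variable on the left may also appear on the right. The right-hand
# side is evaluated first, using the old value of a, and the result is
# then stored back in a.
a = a + b

print(a)
```

Before the last assignment, a holds 5 + 7 * 2, which is 19, so the statement a = a + b stores 24 in a.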
We will learn in detail about the expressions used in computer programming in later chap-
ters. Let us first learn about initialization statements. Initialization statements are a type of
assignment statements in which an appropriate value is assigned to a variable before it is
used in an expression. Here are a few examples of initialization statements:
Count = 0
Salary = 0
Totals = 0
Names = “” (that is a set of blank spaces)
Ismarried = TRUE
Statements similar to the earlier examples are frequently found in computer programs in
which we assign a constant, usually zero, to a variable. These are referred to as
“initialization statements.” When we declare a variable, the compiler assigns a memory
location to it. What default value would be assigned to that variable by the computer when
the program execution begins? Some programming languages assign an appropriate value,
such as zero for numeric variables and blank spaces for character variables. Others assign a
special value, often referred to as “NULL,” to the variables immediately upon declaration,
and some leave the contents of the memory location undefined. A NULL value is generally not
usable in expressions except in comparisons. When we use a variable with a NULL value in
expressions and assignment statements, the computer may throw an error that results in
abrupt termination of the program execution. Therefore, to avoid such uncontrolled
errors, programmers must assign an appropriate value to the variables immediately after
declaring them. This process is referred to as “initialization.”
While zero is the most assigned value for numeric variables, and blank spaces for char-
acter variables, in initialization statements, other values are also assigned to the variables.
We may also need to initialize the variables multiple times in the program depending on
the specific condition in the program. Most wrong results are caused by forgetting
to initialize the variables at appropriate locations within the programs.
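The need to initialize a variable more than once can be sketched as follows. Python is assumed for illustration, and the sales figures and variable names are hypothetical.

```python
# Hypothetical data: sales amounts for two separate weeks.
weekly_sales = [[10, 20, 30], [1, 2, 3]]

weekly_totals = []
for week in weekly_sales:
    total = 0  # initialize again for every week, before the variable is used
    for amount in week:
        total = total + amount
    weekly_totals.append(total)

print(weekly_totals)  # prints [60, 6]
```

If the statement total = 0 were placed only once, before the outer loop, the second total would wrongly include the first week's sales and come out as 66.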
Here are some of the examples of other assignment statements we can find in computer
programs:
1. We use assignment statements for initializing the variables used in the program.
2. We use assignment statements to assign values to the environment and other
variables, especially the configuration information of the user environment. For
example, we often select screen colors to suit our individual tastes in many appli-
cations. When using applications, we set our preferences to a host of settings made
available for us by the software package. Most computer operating systems make
many options for the individual users to choose from. We set our user ID and
password and keep changing the password often. These are achieved by assign-
ing our preferences using assignment statements.
3. We use assignment statements for evaluating mathematical expressions and
arriving at the solution to a mathematical problem. Computers can solve simple/
complex arithmetic problems used in business transactions, including computa-
tion of salaries for employees, the prices in sales transactions, costs in various
aspects of management, decision support systems, and interest computations in
banking, as well as solving complex problems like calculus, linear programming,
transportation problems, matrix algebra, and, for that matter, any mathematical
problem for which an algorithm is available.
4. We use assignment statements for concatenating strings to form a new string of
characters. Why should we concatenate strings of characters? There are many
practical uses for such an operation. One such use that comes to mind readily is to
form a connection string used to open databases. We usually store the first name,
middle name, and the last names of individuals in our application but we output
the complete name, and this necessitates concatenating the three together.
5. We often use assignment statements to write information to data files and
database tables.
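The concatenation of names mentioned in point 4 might look like the following sketch. Python is assumed for illustration, and the names used are hypothetical sample data.

```python
# Hypothetical name parts stored separately, as an application might hold them.
first_name = "John"
middle_name = "Quincy"
last_name = "Adams"

# The + operator joins strings; a blank space is placed between the parts.
full_name = first_name + " " + middle_name + " " + last_name

print(full_name)  # prints John Quincy Adams
```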
Introduction to Expressions
As we have noted in Chapter 6, expressions are used in statements. Almost every statement
contains at least one expression. Merriam Webster’s dictionary defines an expression gen-
erally as “an act of expressing” and “an act of making your thoughts, feelings etc. known
by speech, writing or some other method,” and in the context of computer programming,
thus, “a mathematical or logical symbol or a meaningful combination of symbols”. Wikipedia
defines an expression thus: “an expression in computer programming is a combination of
explicit values, constants, variables, operators and functions, that are interpreted according to
the particular rules of precedence and of association for the specific programming language.”
Let us enumerate the attributes of expressions in the context of computer programming
to understand them better:
Types of Expressions
Expressions used in computer programming are of three types, namely:
1. Arithmetic expressions
2. Relational expressions
3. Logical expressions
Arithmetic Expressions
Arithmetic expressions are used to solve arithmetic equations and compute the required val-
ues. These expressions can include both variables and constants. The variables and constants
used in arithmetic expressions have to be numeric in type. Of course, some programming
languages allow use of character type variables and constants in arithmetic expressions. But,
they can be used only for addition. When two or more strings of characters are used in an
arithmetic expression using an addition operator, they would be concatenated together. The
variables and constants in the arithmetic expressions are joined together to form an expres-
sion by the arithmetic operators. The following are the arithmetic operators:
1. Addition symbol +
2. Subtraction symbol − (a dash)
3. Multiplication symbol * (an asterisk)
4. Division symbol / (a slash)
5. Exponentiation symbol ^ (a caret)
6. Open parenthesis “(”
7. Close parenthesis “)”
8. For square root, there is no symbol allocated. It is usually achieved by a library
routine. The specific library routine used for finding the square root differs among
programming languages. Usually, it is either “SQRT” or “SQR.”
In real-life mathematics, we use curly braces, { and }, and square brackets, [ and ], along
with parentheses, “(” and “)”, when more than one set of parentheses is needed in the
mathematical expression. In computer programming, we use only parentheses in arithmetic
expressions, nesting them where necessary. Here are a few valid examples of arithmetic expressions:
• a + b
• a + b − c
• a + b * c
• a + (b − c)
• (a − b) * c
• a − b * c / d
• (a + b) ^ 2
• (a + b) ^ 2 − (a − b) ^ 2 / d
• ((a + b) ^ 2 − (a − b) ^ 2) / d
1. The evaluation proceeds from left to right when the precedence level of the
operators is the same.
2. The addition symbol and the subtraction symbols have the lowest priority in the
precedence of evaluation. They both have the same precedence.
3. The multiplication and the division symbol have the same precedence. They have
higher precedence than the addition and subtraction symbols.
4. The exponentiation symbol has higher precedence over the multiplication and
division symbols.
5. Parentheses have higher precedence than all the arithmetic operators.
Now using these rules, let us see how the computer evaluates the previous example
expressions:
1. a + b—there is only one operator. The computer just adds the values of both the
variables and delivers the result.
2. a + b − c—here we have two operators with the same precedence. Therefore, the
computer will add the value of a with the value of b and then it would subtract the
value of c from the sum and deliver the result.
3. a + b * c—here we have two operators with different precedence levels.
Therefore, the computer will first evaluate the part of the expression joined by
the operator with the higher precedence. So, it will multiply the value of b with
the value of c. Then it will add the result to the value of a and then deliver the
result.
4. a + (b − c)—here we have three operators with different precedence levels. The
computer will evaluate the expression within the parentheses. So, the value of c
would be subtracted from the value of b and then the result would be added to the
value of a. Then the result would be delivered.
5. (a − b) * c—here we have three operators with different precedence levels. The
computer will evaluate the expression within the parentheses. So, the value of b
would be subtracted from the value of a and then the result would be multiplied
by the value of c. Then the result would be delivered.
6. a − b * c / d—here we have multiple operators with different precedence levels.
a. Following rule #1 of evaluation, the evaluation proceeds from left to right.
Therefore, b would be multiplied by c.
b. The result would then be divided by d.
c. Then the result would be subtracted from the value of a.
d. Then the result would be delivered.
7. (a + b) ^ 2—the evaluation in this expression is rather straightforward. The expres-
sion in the parentheses would be evaluated first. Then it would be raised by the
value of the exponent, 2, and then the result would be delivered.
8. (a + b) ^ 2 − (a − b) ^ 2 / d—here we have multiple operators with different prece-
dence levels.
a. First, the computer would evaluate the expressions in the parentheses and
raise each result to the power of 2.
b. It would then divide the result of the expression (a − b) ^ 2 by d.
c. Then the quotient would be subtracted from the result of expression (a + b) ^ 2.
d. The result would be delivered.
9. ((a + b) ^ 2 − (a − b) ^ 2) / d—here we have multiple operators with different pre-
cedence levels.
a. First the computer would evaluate the expressions in the inner-most parenthe-
ses. Then the outer parenthesis would be evaluated.
b. Then the result of expression (a − b) ^ 2 would be subtracted from the result of
expression (a + b) ^ 2.
c. Then it will divide the result by d.
d. Then the result would be delivered.
As you can see, the expressions in bullet points #8 and #9 are similar except for the
placement of parentheses, which changes the way the expression is evaluated. The two
expressions would deliver different results.
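The difference between the last two expressions can be checked with a small sketch. Python is assumed for illustration; it writes exponentiation as ** rather than ^, and the values of a, b, and d are hypothetical.

```python
# Hypothetical values, chosen only for illustration.
a, b, d = 5, 3, 2

# Without the outer parentheses, only (a - b) ** 2 is divided by d.
without_outer = (a + b) ** 2 - (a - b) ** 2 / d   # 64 - (4 / 2)

# With the outer parentheses, the whole difference is divided by d.
with_outer = ((a + b) ** 2 - (a - b) ** 2) / d    # (64 - 4) / 2

print(without_outer)  # prints 62.0
print(with_outer)     # prints 30.0
```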
Therefore, we need to be careful when programming the arithmetic expressions, espe-
cially when using the operators with different precedence levels. As a precaution, it is recom-
mended to use the parenthesis operators liberally to avoid confusion about the precedence
values to obtain the desired and accurate result. It is also recommended to avoid writing long
arithmetic expressions, as it can be very confusing when debugging the program. It is better
to break long expressions into multiple smaller expressions and write them in different lines.
When using arithmetic expressions in arithmetic statements, the following precautions
are suggested to obtain an error-free result:
1. The variable receiving the result of the expression needs to be able to accommo-
date the result. If we are using floating-point numbers of the single-precision type,
we need to have a variable of double-precision type on the left-hand side of the
assignment symbol.
2. If we assign the value to a variable that has a lesser capacity than the result of the
arithmetic expression, the result would be truncated to the size of the variable.
This results in an erroneous result.
3. If we use all integer type values in the arithmetic expression, it is better to assign
the result to a variable of single-precision type.
4. If we use all single-precision or a combination of integers and single-precision
variables in the arithmetic expression, then it is better to assign the result to a variable
of double-precision type.
Important: One major precaution we need to take while defining the arithmetic expressions
is to ensure that the denominator never becomes zero. Division by zero is mathematically
undefined, and a computer typically responds to it with a run-time error or a special value
such as infinity. But we can never know what the real-life data will throw at our program. So,
whenever we need to use a division operator in our arithmetic expression, it is essential
to check if the denominator is zero or NULL before evaluating the expression. Division by
zero, along with forgetting the initialization statements, is a major reason for abrupt
termination of the program execution, which can damage data in files and deliver erroneous
results. Multiplication by zero is all right, as the result would be zero, but division by zero
is unacceptable. So, we must take care to ensure that the denominator never becomes zero
while writing programs using arithmetic expressions.
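A check of this kind might be sketched as follows. Python is assumed for illustration; the function name safe_divide is hypothetical, and None plays the role of NULL in this sketch.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when division is not possible."""
    # Check the denominator before the division is attempted.
    if denominator is None or denominator == 0:
        return None
    return numerator / denominator


print(safe_divide(10, 4))  # prints 2.5
print(safe_divide(10, 0))  # prints None
```

Returning None is only one possible design choice. A program could equally report the problem to the user or substitute a default value, depending on what the application needs.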
Relational Expressions
Relational expressions relate one expression with another and determine if the
relationship between the two is true or false. Relational expressions are used for decision
making and are used mainly in control statements. They are seldom used in assignment
statements, although many languages allow the TRUE or FALSE result to be assigned to a
Boolean variable. The
expressions in the relational expressions are joined together by the relational operators.
Relational operators specify the type of relation to be established between both the expres-
sions. The relational operators are given in Table 7.1.
While parentheses are not relational operators, parentheses are often used to enclose
a relational expression. It is a good practice of programming to enclose the entire rela-
tional expression in a set of parentheses. The relational operators given in Table 7.1 are
the popular representation used by most programming languages. But variations can be
seen in some programming languages. For example, COBOL uses full text, such as LESS
THAN, GREATER THAN, NOT EQUAL, and so on. In some programming languages,
the characters le, ge, eq, and so on, are used. You need to check the programming lan-
guage manual or help pages to know exactly what relational operators are used in that
language.
As noted earlier in this chapter, an expression can contain a variable, or a constant, or a
combination of variables and constants. We need to follow the following rules when writ-
ing relational expressions:
TABLE 7.1
Relational Operators
Relation                    Symbol            Characters
Equal to                    ==                eq
Not equal to                != or <> or /=    ne
Greater than                >                 gt
Greater than or equal to    >=                ge
Less than                   <                 lt
Less than or equal to       <=                le
We need to take note of one important aspect here. While it is permitted to use constants
in relational expressions, it is pointless to use only constants on both sides of the relational
operator. What is the point in comparing 3 and 4 to see if 4 is greater than 3?
All relational operators have lesser precedence than all arithmetic operators. So, if an
arithmetic expression is included in a relational expression, it would be evaluated first
before evaluating the relational expression. Now let us consider some examples of rela-
tional expressions:
1. x > y—this expression compares value of x with the value of y. Then, if the value
of x is more than y, it will return the result of TRUE. If the value of y is either equal
to or is more than the value of x, then it will return a result of FALSE.
2. a + b > c—in this expression, it will add the value of a to the value of b and only
then it will compare the resultant value with the value of c. It will finally return a
result of TRUE only when a + b happens to be more than the value of c.
3. (a + b) > c—this expression is similar to the expression in bullet #2 except that a
parenthesis is used to make the process of evaluation more explicit.
4. (a + b) < (c + d)—this has arithmetic expressions on both sides of the relational
operator and both are enclosed in parentheses. Both these expressions would be
evaluated first and then the results would be compared. Then the appropriate
result, be it TRUE or FALSE, would be delivered.
5. ((a + b)/c) >= (x + y/z)—this is another example of two arithmetic expressions
being compared. In the expression on the left-hand side of the relational opera-
tor, the value of a would be added to the value of b and then the sum would be
divided by c. In the other expression, the value of y would be divided by the value
of z and then the quotient would be added to the value of x. Then the results of
both the expressions would be compared to deliver the result of the relational
expression.
Now, we need to understand the difference between some confusing symbols. Of course,
the meaning of symbols is not confusing really, but if we are not careful, we can use them
wrongly. Let us consider some examples. Let us assume that the value of a is 4; value of b
is 4; value of c is 4.001. Now let us use these values in our examples:
5. (a == c)—this evaluates to FALSE because the difference between the values
of a and c, however small, is still a difference. We human beings may consider the
difference to be insignificant for all practical purposes, but a computer would consider
any difference, irrespective of its magnitude, to be significant.
6. (a != c)—this expression evaluates to TRUE, as the values of a and c are not equal.
7. (a >= c)—this expression evaluates to FALSE. The value of c is certainly greater
than that of a in the view of the computer.
8. (a <= c)—this expression evaluates to TRUE as a is certainly less than the value of c.
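These comparisons can be tried directly. The sketch below assumes Python for illustration and uses the values of a, b, and c given above.

```python
a = 4
b = 4
c = 4.001

print(a == b)  # prints True
print(a == c)  # prints False
print(a != c)  # prints True
print(a >= c)  # prints False
print(a <= c)  # prints True
```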
What we have to note here is that for writing good programs, we need to ensure that both
variables being compared are of the same precision. If the variable on the left-hand side is
an integer, the right-hand side variable also needs to be an integer to ensure accurate results.
If we compare an integer value with a single- or double-precision value, the result can be
surprising, because floating-point values are stored only approximately and one value is
converted before the comparison. The same is true when we compare a double-precision
variable with an integer or a single-precision variable. If we wish to write reliable
programs that deliver consistent results, we need to ensure that the precision is the same in
numeric variables used in the relational expressions.
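A well-known instance of such a surprise with floating-point values is sketched below. Python is assumed for illustration. The function math.isclose, from Python's standard library, compares two values within a small tolerance; using it is a technique brought in for this sketch, not one described in the text above.

```python
import math

total = 0.1 + 0.2

# The stored value of total is very slightly more than 0.3,
# so an exact comparison fails.
print(total == 0.3)              # prints False

# Comparing within a small tolerance gives the answer a human expects.
print(math.isclose(total, 0.3))  # prints True
```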
Now let us consider some character data. Let us assume that the variable fname contains
a value of “Thomas”; the variable lname contains a value of “thomas”; the variable mname
contains a value of “Tho mas”; the variable x contains a value of “THOMAS”; and the
variable y contains a value of “thomas”. Now let us see some examples:
Therefore, when we compare two character strings, we convert both to either
lowercase or uppercase before comparing them, so that the results are sensible to
human beings.
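The effect of converting the case before comparing can be sketched as follows. Python is assumed for illustration, and the variables hold the values assumed above. The results are stored in variables here only so that they can be printed.

```python
fname = "Thomas"
lname = "thomas"
mname = "Tho mas"
x = "THOMAS"

# An exact comparison treats uppercase and lowercase letters as different.
exact_match = (fname == lname)

# Converting both sides to the same case makes the comparison ignore case.
lower_match = (fname.lower() == lname.lower())
upper_match = (fname.upper() == x.upper())

# Case conversion does not remove the embedded blank space in mname.
space_match = (fname.lower() == mname.lower())

print(exact_match)  # prints False
print(lower_match)  # prints True
print(upper_match)  # prints True
print(space_match)  # prints False
```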
We human beings are very inconsistent when it comes to spelling our names and use
different spellings for the same name. Some of this has to do with the cultural, national,
and linguistic backgrounds we come from. So, locating a certain record based on imperfect
information is difficult with the aforementioned relational expressions. Instead of trying
to locate the required string of characters using exact match, we like to locate the record
by using a part of the total string to locate a set of records from which we can select the
required one manually. Sometimes, we may locate a set of records as coming from the
same group for some purpose. What we need is not the exact match of the string but find-
ing if a longer string of characters contains within it a shorter string of characters. For
example, we may need to locate records whose name is either Katherine or Catherine.
We know both names are pronounced the same but spelled differently. Instead of searching
twice, we wish to locate all records in whose name, the string “rine” is embedded. None of
the aforementioned relational operators would accomplish our need.
This is a well-recognized need. It is met by a pre-defined library function in
some programming languages, while other languages provide a relational operator for it.
In some of those languages, the dollar symbol “$” or “$$” serves as the relational operator
that permits searching for a string within another longer string. Let us assume the variable
fname contains the value “Katherine”; the variable gname contains the value “Catherine”;
and the variable hname contains the value “Kathereen”. Let us also assume that a variable
xyz contains the value “Brine”. Now let us consider the following examples:
As you can see, this operator locates all names that contain the specified string. In most
programming languages, the shorter string is located on the left-hand side of the relational
operator, but it is possible that some language could have the shorter string on the right-
hand side too.
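A search of this kind can be sketched in Python, which is assumed here for illustration. Python uses the keyword in rather than a dollar symbol, with the shorter string on the left-hand side. The list name names is hypothetical; the values are the ones assumed above.

```python
# The four values assumed above, gathered in a list for convenience.
names = ["Katherine", "Catherine", "Kathereen", "Brine"]

# Keep every name in which the string "rine" is embedded.
matches = [name for name in names if "rine" in name]

print(matches)  # prints ['Katherine', 'Catherine', 'Brine']
```

“Kathereen” is not located because it does not contain the string “rine”, while “Brine” is located even though it is not one of the names we were looking for. This is why such a search yields a set of candidates from which the required one is still selected manually.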
Relational expressions are used for decision making and are used mainly in control
statements. Relational expressions form part of logical expressions. They are seldom
used in assignment statements. Most logical errors in computer programs stem from
poorly formed relational expressions.
1. When using numeric data in relational expressions, it is better to ensure that the
arithmetic expressions on both the sides of the relational operator are of the same
precision. That is, compare integer to integer, single-precision to single-precision
and double-precision to double-precision.
2. It is better to use variables in relational expressions than arithmetic expressions. If
it is unavoidable to use expressions, keep the expressions small so that it is easy to
understand and debug programs.
3. When comparing strings of characters, it is better to convert all the characters
to either uppercase or lowercase before the comparison. Of course, this does not
apply to cases like checking the passwords where the case of the characters is
significant.
4. It is always better to enclose the relational expression in a set of parentheses. It
removes any confusion in debugging the programs.
Logical Expressions
Computers are built using integrated circuit chips (IC), which have transformed into large
scale integrated chips (LSI chips) and now into very large scale integrated circuit chips
(VLSI chips). These chips are built based on logic circuits referred to as gates, namely the
AND gate, OR gate, and NOT gate. These gates are built using diodes and transistors.
While a detailed explanation of these hardware components is out of scope for this book,
it is necessary for us to understand the basics of these chips, so we have a clearer under-
standing of the logical expressions we use in computer programming. These gates are
depicted in Figure 7.1.
These gates can have multiple inputs but have only one output. The combination of
inputs and the corresponding output is depicted in a table referred to as the chip’s “truth
table.” NOT would have only one input and one output. The truth table for the gates shown
in Figure 7.1 is depicted in Table 7.2. In the field of electronics, the value TRUE is depicted
as 1 and FALSE is depicted as 0.
Now the explanation is as follows. The AND gate produces an output only when all the
inputs are present. In other words, the AND gate produces a TRUE output only when all
the inputs are TRUE. The OR gate produces a TRUE output when any one of the inputs
is TRUE. The OR gate produces a FALSE output only when all the inputs are FALSE. The
NOT gate produces an output that is opposite of the input. That is, if the input is TRUE, the
output would be FALSE, and if the input is FALSE, the output would be TRUE.
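The behavior of the three basic gates described above can be reproduced with a small sketch. Python is assumed for illustration, the function names are hypothetical, and TRUE and FALSE are written as 1 and 0, as in electronics.

```python
def and_gate(a, b):
    # Output is 1 only when both inputs are 1.
    return a & b


def or_gate(a, b):
    # Output is 1 when at least one input is 1.
    return a | b


def not_gate(a):
    # Output is the opposite of the single input.
    return 1 - a


print("A B AND OR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))

print("A NOT")
for a in (0, 1):
    print(a, not_gate(a))
```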
In logic gates, there are other gates built using these basic gates. They are NAND, NOR,
and XOR gates. A NAND gate is a combination of an AND gate and a NOT gate. That is,
the output of an AND gate is fed as input to a NOT gate. A NOR gate is one in which the
output of an OR gate is fed as input to a NOT gate. An XOR gate (or Exclusive OR gate) is a
special variety of OR gate. A representation of an XOR gate is depicted in Figure 7.2.
FIGURE 7.1
Logic gates: AND gate (inputs A and B, output C), OR gate (inputs A and B, output C), and
NOT gate (input A, output B).
TABLE 7.2
Truth Table for Logic Gates
Gate A B C
FIGURE 7.2
XOR gate (inputs A and B, output C).
An XOR gate is like an OR gate in its input/output combinations, except that when all the
inputs are TRUE, it returns a FALSE output. That is, a two-input XOR gate returns a TRUE
output only when exactly one of its inputs is TRUE.
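The derived gates can be sketched as combinations of the basic operations. Python is assumed for illustration, the function names are hypothetical, and TRUE and FALSE are again written as 1 and 0.

```python
def nand_gate(a, b):
    # NOT applied to the output of AND.
    return 1 - (a & b)


def nor_gate(a, b):
    # NOT applied to the output of OR.
    return 1 - (a | b)


def xor_gate(a, b):
    # Like OR, except that the output is 0 when both inputs are 1.
    return (a | b) & (1 - (a & b))


print("A B NAND NOR XOR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand_gate(a, b), nor_gate(a, b), xor_gate(a, b))
```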
These logic gates are the basic components of our electronic computers and also are the
basis of our logical expressions.
Now returning to our discussion on computer programming, the inputs are relational
expressions and the output is the result of the evaluation of the logical expression. A logical
expression is a combination of expressions joined together by logical operators. Unlike in
relational expressions, we can have more than one logical operator in a logical expression.
The logical operators are AND (&&), OR (||), NOT (!), and XOR. XOR is rarely used in
computer programming, and not many programming languages support it as a logical operator.
In most programming languages, NOT has the highest precedence among the logical
operators, followed by AND and then OR; operators of the same precedence level are
evaluated from left to right. The expressions placed on either side of the logical
operator can be either relational expressions or variables of data-type Boolean. Let us
assume that a, b, and c are expressions valid for use in logical expressions. Now let us
consider some examples:
4. !a—This is an example of using the NOT operator. The NOT operator returns a
value that is opposite to the value delivered by the evaluation of the expression.
The expression evaluates as follows:
a. If the expression a evaluates to TRUE, the delivered result would be FALSE.
b. If the expression a evaluates to FALSE, the delivered result would be TRUE.
We will see some examples of how all these expressions are used in our chapter on control
statements.
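Until then, a small sketch may help to fix the ideas. Python is assumed for illustration; it spells the logical operators as and, or, and not rather than &&, ||, and !. All variable names and values are hypothetical, and the results are stored in variables only so that they can be printed.

```python
# Hypothetical data about a person.
age = 30
has_license = True
is_suspended = False

# Two Boolean variables and one relational expression joined by logical operators.
can_drive = (age >= 18) and has_license and (not is_suspended)
print(can_drive)  # prints True

# In Python, "and" is evaluated before "or", so parentheses change the result.
a, b, c = True, False, False
default_grouping = a or b and c      # evaluated as a or (b and c)
explicit_grouping = (a or b) and c

print(default_grouping)   # prints True
print(explicit_grouping)  # prints False
```

Writing the parentheses explicitly removes any doubt about the order of evaluation, whatever the rules of the particular language are.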
Introduction
When a program is executed, the computer begins with the first statement and moves
towards the last statement, executing all the statements sequentially one after the other.
The execution of a program is akin to a waterfall. In a waterfall, the water, once it begins
falling, will not stop until it touches the ground and then it would flow, taking the down-
ward slope. Similarly, once a computer begins executing a program, it will not stop until
it reaches the last statement, which informs the computer that the program has ended and
instructs the computer to take all necessary actions to stop the execution in a smooth man-
ner and release all the resources held by the program.
But often, we cannot allow the program to execute like a waterfall. We need to take some
programmed decisions, and move the program execution depending on the outcome of a
programmed decision. Control statements are the tools by which we can control the pro-
gram flow as we desire based on the programmed decisions we built into the program. As
the name implies, these statements control the flow of program execution.
We have the following statements that assist us in controlling the flow of program
execution:
1. Goto
2. If ... Then ... Else
3. Switch ... Case
4. Loops
a. Counting based loops
b. Condition based loops
i. Condition checked at the beginning
ii. Condition checked at the end
Goto Statements
Goto statements are the earliest control statements and, in fact, the Goto is the statement
working behind all the other control statements. This statement is also called the branching statement.
Once this statement is executed, the program flow is prevented from going to the next
statement and is branched off to a different predefined statement in the program.
Goto statement can be used in a standalone manner or as part of an If statement. A Goto
statement pushes the execution to a different statement and leaves it at that. The Goto state-
ment usually consists of the keyword “Goto” followed by a label or a statement number.
The label indicates the statement to which the control needs to be passed on to. Then the
statement that receives control needs to be prefixed by the label. In some programming
languages, each statement is sequentially numbered. In such languages, the keyword
Goto is followed by the desired statement number. In other programming languages, the
method of declaring and using statement labels is described in the language specification.
Generally, the syntax of a Goto statement looks like:
languages, the entire If ... Then ... Else ... statement is written on the same line. But in most
programming languages, the If statement spans across multiple lines. It generally takes
the form:
Programming languages of the C family use curly braces ({}) to enclose the
statements between Then ... Else and between Else ... and Endif. They do not use an
Endif statement at all; each block of statements simply begins and ends with a set of curly
braces.
Often in real life, decisions are not simple to make with just one expression. We may
need to use multiple expressions. In such cases, we may need to use multiple If statements
to make the decision. For example, consider this classification of people by their age:
Now, we use these rules for selecting a person’s stage in life. Now we need to develop a set
of control statements to make the right decision. Let us assume a variable named “age” to
contain the value of the age of the person between 1 and 100. With this let us form a set of
If statements to arrive at the right decision.
Now, as you can see, we have an If statement as part of another If statement. This type of
embedding an If statement within another If statement is referred to as “nesting” of the
If statements. In the present case, there are just two levels of If statements. We refer to this
as “the nesting of If statements is two levels deep.” Real life provides us instances wherein
the nesting of If statements goes even deeper than two, but if we nest the If statements too
deep, it becomes very difficult to analyze the programs to debug errors. So, the general
rule we follow in the programming fraternity is to limit the nesting of If statements to a
maximum of three levels.
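A nesting that is two levels deep might look like the following sketch. Python is assumed for illustration. The function name life_stage, the age limits, and the stage names are all hypothetical and are not taken from any particular classification.

```python
def life_stage(age):
    """Classify a person by age using If statements nested two levels deep."""
    # The age limits below are hypothetical, chosen only for this sketch.
    if age < 13:
        stage = "child"
    else:
        # This If statement is nested inside the Else part of the outer one.
        if age < 20:
            stage = "teenager"
        else:
            stage = "adult"
    return stage


print(life_stage(5))   # prints child
print(life_stage(15))  # prints teenager
print(life_stage(40))  # prints adult
```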
The If statement is perhaps the most important of the computer programming skills and
mastering it is essential to writing good quality computer programs.
Switch <Variable>
    Case a
        Statement
        Statement
    Case b
        Statement
        Statement
    Case n
        Statement
        Statement
    Default
        Statement
        Statement
Endcase
In the earlier statement, a, b, and n are the values for the variable specified in the Switch
statement.
Here are the rules for writing the Switch ... Case statement:
1. The Switch case consists of a switching variable defined by the Switch keyword.
2. There would be a number of cases based on the value of the variable. Each value is
defined by the Case keyword.
3. The Case keyword is usually followed by a constant, which can either be numeric or
character data.
4. Some programming languages allow a range of values with a special syntax.
5. Generally, expressions are not allowed in the Case statements, but exceptions can
always arise.
6. Each Case statement includes a number of statements. There is no restriction on
the number of statements that can be under a Case statement.
7. The last case for this statement would be the Default statement, which specifies the
action to be taken when the value of the variable does not match any value speci-
fied in all the Case statements.
8. The statements in the Case and Default statements can be action statements or
branching statements.
9. Most programming languages require a statement to denote the end of the Switch ...
Case statements. Some would define a keyword, such as Endcase. Some program-
ming languages, such as in the C-family, just use a closing curly brace “}” to denote
the ending of the Switch ... Case statement.
10. The last of the statements under each Case would usually be a “Break” or “Exit”
statement. It would usually be the only word in that statement. The execution of
the Break (or Exit) statement would take the execution to the statement that
immediately follows the Endcase statement.
The total set of statements, beginning with the first Case statement and ending with the
Endcase (or its equivalent) statement, is usually called the Switch Block. The statements
in each of the Case statements are usually called the Case Block.
The Break statement in the Default block is really superfluous, as the program execution
falls to the next statement, which in this case is the Endcase statement, but it is a good
practice to write the Break statement. Sometimes during program maintenance, we are
likely to add another Case statement after the last Case block. Software packages such as
the word processing packages make heavy use of the Switch ... Case statements. Some pro-
grammers consider the Default statement as unnecessary and avoid it, but it is a good prac-
tice to include it. Real life throws unexpected data that can never be completely guessed at
the time of writing the program, so the Default statement handles all exceptions that real
life throws at the program.
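As an illustration only, the Switch ... Case idea can be sketched in Python with an if ... elif ... else chain: each elif plays the role of a Case, and the final else plays the role of Default. The function name describe_key and the one-letter command codes are hypothetical, chosen just for this sketch.

```python
def describe_key(key):
    """Return the action for a one-letter command (hypothetical codes)."""
    if key == "b":          # Case "b"
        action = "bold"
    elif key == "i":        # Case "i"
        action = "italic"
    elif key == "u":        # Case "u"
        action = "underline"
    else:                   # Default: any value not matched above
        action = "ignored"
    return action


print(describe_key("b"))   # bold
print(describe_key("?"))   # ignored
```

Because only one branch of an if ... elif chain ever runs, this form needs no Break statement; the else branch catches every value that was not anticipated when the program was written.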
Loops
A Loop is a construct that causes a set of statements to execute iteratively. How many
times do the statements in the loop block execute? It depends on the type of loop that we
constructed. Of course, the loop has to be finite; if the loop becomes infinite, then we
have inserted a bug, and the program freezes and needs to be manually interrupted. Loops
are basically of two types:
1. Counting-based loops
2. Condition-based loops
a. Condition checked at the beginning
b. Condition checked at the end
Counting-Based Loops
Counting-based loops execute the set of statements a fixed number of times based on
the definition at the beginning. Most programming languages provide a For construct
(For ... Next in BASIC-style languages) for this type of loop. “FOR” loops are the primary
tool in handling arrays in programs. Generally, the FOR loop has the following syntax:
For i = j to k Step l
Statement 1
Statement 2
Statement n
Next
In the earlier example, i, j, k, l, and n are numeric variables. The variable j denotes the initial
number at which the loop begins, and k is the last number at which the loop stops. Once
the value of i becomes greater than k, the execution of the loop stops. The keyword “For”
signifies the beginning of the loop. The keyword “Step” indicates the value by which the
variable i would be incremented after every iteration. The keyword “Next” indicates incre-
menting the value of the variable i by the amount of the value indicated after the keyword
Step.
The variable i would assume an initial value of j and executes the statements in the
loop block. The set of statements in the loop block would be executed even when the
value of the variable i becomes equal to k. Therefore, with a step of 1, the number of times
the set of statements in the loop block is executed is equal to (k − j + 1). For example, if the
statement is “For i = 1 to 6 Step 1,” then the block would be executed (6 − 1 + 1 = 6) times.
Let us consider
an example. Let us assume a single-dimensional array having six cells. Let us fill this
array using a FOR loop:
For i = 0 to 5 Step 1
Read m
arr(i) = m
Next
In the earlier example, i is a counting variable and m is the input variable. Note that we
began the loop with the initial value of zero. Generally, the first element of an array is
denoted as 0th element/cell. With the value of i being incremented by 1 in each of its
iterations, it assumes the values of 0, 1, 2, 3, 4, and 5. As you can see, the loop would be
executed six times to fill the array with values received as input. In each of the iterations,
the computer performs two operations; namely, it receives one value as input and fills the
corresponding array element with that value. When i is incremented and reaches a value
of 6, the execution of the loop stops and moves to the next statement in the program.
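The array-filling loop above can be sketched in Python, assuming the six input values arrive in a list named inputs (a hypothetical stand-in for the Read statement). Note that Python's range(0, 6, 1) stops before 6, so it produces the same values of i as “For i = 0 to 5 Step 1.”

```python
inputs = [10, 20, 30, 40, 50, 60]   # hypothetical values standing in for "Read m"
arr = [0] * 6                        # single-dimensional array with six cells

iterations = 0
for i in range(0, 6, 1):             # i takes the values 0, 1, 2, 3, 4, 5
    m = inputs[i]                    # Read m
    arr[i] = m                       # arr(i) = m
    iterations += 1

print(arr)          # [10, 20, 30, 40, 50, 60]
print(iterations)   # 6, that is, (k - j + 1) = (5 - 0 + 1)
```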
Let us now consider a two-dimensional array and fill it with values input by the user. Let
arr be a two-dimensional array with 3 columns and 6 rows. Here is the pseudo code for it:
For i = 0 to 5
For j = 0 to 2
Read m
arr(i, j) = m
Next
Next
In the earlier example, i and j are counters. The variable m is the input variable. The first
FOR loop sets the value of the row of the array into which the values are being filled. The
second FOR loop sets the value of the array column for filling the value. The statement
arr(i, j) indicates the array element where i is the row number and j is the column number.
If i = 2 and j = 2, then the value goes into row #3 and column #3 (as the computer counts
from zero and we count from 1!).
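A minimal Python sketch of the same nested loop follows. It assumes the input values are simply the numbers 1 to 18 (hypothetical data standing in for Read m) and represents the two-dimensional array as a list of rows.

```python
ROWS, COLS = 6, 3
values = iter(range(1, ROWS * COLS + 1))     # hypothetical input: 1, 2, ..., 18

arr = [[0] * COLS for _ in range(ROWS)]      # 6 rows, each with 3 columns
for i in range(ROWS):                        # outer loop: row index 0..5
    for j in range(COLS):                    # inner loop: column index 0..2
        m = next(values)                     # Read m
        arr[i][j] = m                        # arr(i, j) = m

print(arr[0])      # [1, 2, 3]
print(arr[2][2])   # 9: row #3, column #3 when counting from 1
```

The inner loop runs to completion for every single pass of the outer loop, so the array is filled one full row at a time.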
Here are the general rules for the FOR loop.
1. FOR loops are used to define loops that are executed a finite number of times
based on the value of the counting variable.
2. A FOR loop needs a counter, a numeric variable used for counting the number of
iterations executed by the loop.
3. It also needs an incrementing value/variable to increment the counter by after
completing every iteration. If we do not define an incrementing variable, most
computers assume the increment to be 1.
4. The FOR loop needs some indicator to indicate the last statement in the loop. Some
programming languages use Next, and some simply use a closing curly brace—“}.”
5. The FOR loop can be blank, that is, there need not be any statements between the
For statement and the Next statement. Such loops are called delay loops and are
sometimes used in programs. A delay loop simply delays program execution for the
time it spends counting through the numbers specified in the loop.
Condition-Based Loops
In these loops, the set of statements in the loop block is executed for as long as the
loop condition remains TRUE (or FALSE, as programmed). This loop is used when we do
not know or cannot know how many times the set of statements needs to be executed by
the computer. One example that readily comes to mind is processing the records from
a data file or a table. We normally would not know the number of records contained in
the file or table. Then we define a loop using a condition. These loops are basically of
two types:
1. Condition checked at the beginning: In this loop, the condition is checked before
entering the loop and executing the statements contained within the block. If the
condition evaluates to FALSE (or TRUE as the case may be), the statements in the
block do not get executed even once.
2. Condition checked at the end: In this loop, the condition is checked in the last
statement of the loop. Therefore, all the statements in the loop block get
executed at least once in this loop.
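The difference between the two types can be sketched in Python. Python has no built-in loop that checks its condition at the end, so the second function emulates one with “while True” and a break in the last statement; that emulation, and both function names, are assumptions of this sketch.

```python
def count_pretest(n):
    """Condition checked at the beginning: the body may run zero times."""
    runs = 0
    while n > 0:
        runs += 1
        n -= 1
    return runs


def count_posttest(n):
    """Condition checked at the end: the body always runs at least once."""
    runs = 0
    while True:
        runs += 1
        n -= 1
        if not (n > 0):      # the check sits in the last statement of the loop
            break
    return runs


print(count_pretest(0), count_posttest(0))   # 0 1
print(count_pretest(3), count_posttest(3))   # 3 3
```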
WHILE Loop
The While statement is the most versatile construct for the loop that checks the condition
at the top of the loop. Do ... While is the popular construct for the loop that checks the
condition at the bottom of the loop block. Of course, we need to check the programming
manual of the specific programming language we are using when writing the program to
know the exact constructs for these loops. Let us see some examples of the While loop. Let
us assume FileofNames as a data file or a table from which we read records and simply
print the name.
EOF, meaning “end of the file,” is generally a variable to indicate if all the records in the
file or table are processed and that there are no more records in the file or table to be pro-
cessed. This is the general notation used by many programming languages to denote the
condition that there is no more data available in the file or table. From the earlier example,
we note the following aspects:
1. As long as there are records in the file or table, the condition evaluates to TRUE. It
goes something like this:
a. When there are records, the answer to the question “is EOF NOT TRUE” would
be “Yes, EOF is NOT TRUE.” The execution falls to the first statement in the
loop block.
b. When all the records are exhausted, the answer to the “is EOF NOT TRUE”
would be “No, EOF is TRUE.” The execution falls to the first statement coming
after the loop block.
2. When all the records in the file or table are exhausted, the condition in the While
statement evaluates to FALSE and the computer exits the loop and begins execu-
tion of the statement that immediately follows the loop. We can also insert an IF
statement in the loop and check further conditions, along with a BREAK or EXIT
statement to exit the loop. Let us consider another example.
In this example, we added a condition to check whether the name is John Doe, and if it is, then exit
the program. That is, we are trying to locate the contact details of a specific person from
the file or table.
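A loop of the kind described above can be sketched in Python as follows. The sketch assumes FileofNames holds one name per line and uses io.StringIO with made-up names as a stand-in for a real file; it leaves the loop (rather than the whole program) when the name is found.

```python
import io

# Hypothetical contents of FileofNames, one name per record.
file_of_names = io.StringIO("Asha Rao\nJohn Doe\nMei Lin\n")

printed = []
record = file_of_names.readline()
while record != "":                  # readline() returns "" at end of file (EOF)
    name = record.strip()
    printed.append(name)
    print(name)
    if name == "John Doe":           # extra condition checked inside the loop
        break                        # leave the loop as soon as the name is found
    record = file_of_names.readline()
```

Without the IF and BREAK, the loop would run until readline() reported the end of the file and would print all three names.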
The WHILE loop is a very versatile one, and it can replace the FOR loop. Let us rewrite
the segment of code that we wrote earlier for filling a single-dimensional array in the
section on FOR loops. Look at the following example:
i = 0
WHILE i < 6
Read m
arr(i) = m
i = i + 1
WEND
It will do the same operation as the FOR loop example given in the earlier section on FOR
loops. We had to add two more statements: one to initialize i and one to increment i. But,
one question: why should we use a WHILE loop where a FOR loop is the perfect fit? The
answer is, we need not. I just showed the possibility. I have seen some programmers prefer
to only use the WHILE loop to the exclusion of all other loops. Here are the general rules
for writing WHILE loops:
1. A WHILE loop consists of a minimum of two statements, one to initiate the loop and
the other to end the loop. The keyword WHILE is the construct preferred by most
programming languages; the keyword WEND is not very popular. In the C-family
of programming languages, a closing curly brace “}” is used for ending the loop.
2. The WHILE statement needs to have a relational or logical expression along with
it. This condition sets the rule for stopping the execution of statements within the
block and passes on execution to the statements following the loop block.
3. There can be any number of statements embedded between the WHILE statement
and the WEND statement.
4. There must be some statements contained within the loop block that cause the
condition to evaluate to FALSE. Otherwise, the loop becomes infinite and freezes the
program and the computer.
5. Generally, the WHILE loop is not used to branch control to other statements in the
program. The IF statement is the appropriate one for that purpose. But unexpected
errors can crop up that we need to trap using appropriate statements. Only in such
cases would we use EXIT or BREAK to exit the WHILE loop.
Programmers commonly make two mistakes when writing WHILE loops. One is to forget to
insert statements that terminate the loop by making the condition evaluate to FALSE.
This causes the loop to execute infinitely. The second mistake is to forget to
catch unexpected errors. If we are careful about these two aspects, we can write impec-
cable WHILE loops in our programs.
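Both precautions can be seen together in a short Python sketch. It assumes the input arrives as a list of strings named raw_inputs (hypothetical data, with one deliberately bad value) and that a non-numeric value is the unexpected error to be trapped.

```python
raw_inputs = ["10", "20", "x3", "40", "50", "60"]   # hypothetical input; "x3" is bad data
arr = []

i = 0
while i < len(raw_inputs):
    try:
        m = int(raw_inputs[i])       # Read m and convert it to a number
    except ValueError:
        print("Unexpected input:", raw_inputs[i])
        break                        # trap the error and exit the loop
    arr.append(m)
    i = i + 1                        # without this line the loop would never end

print(arr)   # [10, 20]
```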
DO
Statement 1
Statement 2
Statement n
WHILE <relational or logical expression>
As you can see, the WHILE statement is at the bottom of the block. The DO statement
does not require any other supporting keywords or expressions. It indicates the beginning
of the loop. The WHILE statement needs the condition defined by a relational or logical
expression. All other aspects of this loop are the same as the ones detailed for the WHILE
loop in the preceding section. Therefore, I am not repeating them here.
Other keywords are used in different programming languages. Notable among those
keywords is the REPEAT ... UNTIL set used by the PASCAL programming language. It
is also used in some other languages. As there are now a plethora of programming lan-
guages, perhaps there could be different keywords for this kind of loop. It is also possible
that a few differences could be there between what I described earlier and what those
languages prescribe.
1. First and foremost, we need to ensure that the loop would never be an infinite
loop. I am sad to say that this is one of the biggest challenges in the debugging
and maintenance of computer programs. Computer freezes occur due to infinite
loops. We must write the loop so that its exit condition is eventually met (the
condition becomes FALSE for a WHILE loop, or TRUE for an UNTIL loop) and the loop
terminates. Ensuring that the loop terminates is a best practice in
programming loops.
2. It is often better to use the FOR loop wherever the number of iterations is known in
advance, as it is more compact and, in some languages, more efficient than the
WHILE loop.
3. While programming the FOR loop, I suggest that the maximum limit in the state-
ment be read from a file rather than hard-coding it (writing a number inside the
program), as it would avoid program change during software maintenance. We
need to accept the reality that in this world, everything that can change, will.
Changing a data file is much easier than changing the code, compiling it, testing it,
debugging it, linking it to libraries, and moving it to the production environment.
4. Most programming languages provide for composite keywords that shorten state-
ments. These keywords, while reducing the chore of typing, are hard to decipher
during software maintenance. Therefore, I advocate writing multiple simple state-
ments, even if it increases the typing load. It would help greatly during the soft-
ware maintenance.
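The suggestion of reading the loop limit from a file can be sketched in Python as below. The file name loop_limit.txt, its one-line format, and the helper read_limit are all hypothetical; the sketch creates the file itself in a temporary directory so that it is self-contained.

```python
import os
import tempfile

# Create a small data file holding the loop limit (hypothetical name and value).
limit_path = os.path.join(tempfile.mkdtemp(), "loop_limit.txt")
with open(limit_path, "w") as f:
    f.write("6\n")


def read_limit(path):
    """Read the maximum loop count from a one-line text file."""
    with open(path) as f:
        return int(f.read().strip())


limit = read_limit(limit_path)
squares = []
for i in range(limit):          # the limit comes from the file, not from the code
    squares.append(i * i)

print(squares)   # [0, 1, 4, 9, 16, 25]
```

Changing the number stored in the file changes how many times the loop runs, with no change to the program itself.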
I avoided discussing the IF ... GOTO loops, as they have disadvantages, especially in
leading to error conditions. In the present day, most quality-conscious organizations
prohibit using the GOTO statement except for error-handling situations. So, my humble
suggestion is that you avoid the temptation of using IF ... GOTO loops in your programs.
9
Input Statements
Introduction
As we noted in Chapter 1, computers are data processing tools. Programs help the com-
puter to process the data, and input statements are the statements that bring data from the
outside world into the computer. In this chapter, we look at the way by which computers
take in data from the outside for processing it and delivering the output.
1. The computer receives data from the outside world only when a program executes
an input instruction.
2. The input instruction tells the computer:
a. The specific input device from which data is to be received.
b. The data that needs to be brought in from the specified device.
c. The locations in RAM where the received data is to be stored.
3. The computer receives the data and stores it at the specified location.
This is the manner in which data from the outside world is input into the computer. The
following actions are performed by the computer in receiving data from the specified input
device:
1. The computer loads the device driver of the specific device from which data is
to be received into the RAM. A device driver is the program that performs the
following actions:
a. It controls the device and issues commands to it in a manner that the device
understands the instructions and performs the required actions.
b. It checks if the device is connected, powered up, and ready for interaction with
the computer.
c. It checks the device for proper functioning and raises error messages if it is not
functioning.
We have to note here that there is a huge difference between the speed of the CPU and
of any input device. It would be a travesty if we make the CPU wait all the time that the
input device takes to supply the information. Therefore, we release the CPU to other tasks
while the input device is taking its time to input the information. This is handled in the
following ways:
1. Use of interrupts:
a. The CPU places an interrupt on the input device and returns to the next task
waiting for its allocation.
b. The input device makes the input ready and places an interrupt on the CPU.
c. The CPU responds to the interrupt, receives the information from the input
device, and stores it in the appropriate memory location.
2. Buffers:
a. All the I/O devices are equipped with a small amount of RAM generally
referred to as “buffer,” as it is used to provide a buffer to bridge the gap in the
speed difference.
b. When the CPU places an interrupt on the device, the device fills this buffer
and places an interrupt on the CPU, which reads the input from the buffer of
the device.
3. Direct memory access (DMA): Usually, the receiving of data from input devices is
not handled by the main CPU. Most computers have an auxiliary processor, referred to
as the DMA (Direct Memory Access) controller, which handles the transfer of data
between the peripheral devices and the RAM. The CPU passes the I/O activity to be
performed, along with the ID of the device, the addresses of the RAM, and the details
of the data to be received, to the DMA controller, which receives the data and stores
it at the appropriate locations in the RAM. Once the action is completed,
it informs the operating system about the completion of the action and the CPU
changes the state of the waiting program to “ready.”
Most of the present-day computers use the DMA method to resolve the speed differences
between the CPU and the I/O devices.
However, as a programmer, we need not be concerned with all this. All these activities
are carried out by the operating system of the computer silently and in the background
without requiring any intervention from the programmer or the user. Of course, the user
still has the role of providing the right data to the input device.
We had enumerated various input devices in Chapter 1; now let us have a recap of a few
important devices. These are:
1. The computer keyboard: Usually, the keyboard is the default input device to most
present-day computers. Some time ago, the card reader was the default input
device. In real-time systems, the machine could be the default input device.
2. The hard disk, which is mounted internally on the computer itself: The input from the
hard disk is supplied by the data files and database tables. We really do not access
the hard disk directly, but access the files and database tables on it. We receive data
from data files and tables of the databases.
3. There are other devices that provide data from files, including the magnetic tape,
the CD (Compact Disc), and the DVD (Digital Versatile Disc). Devices such as
the punched card reader and the floppy disk have become obsolete and are no longer
used.
4. There are devices, such as machines and the hardware of vehicles like cars, airplanes,
and rockets, that provide data on a continuous basis.
1. The ID of the input device: This would usually be a number, at least, inside the OS.
Our programming language would provide a number, either predefined or to be
defined by us. This will be used by the computer to locate the device from which
to obtain the input.
2. The location of the information: This could be a file or database table. The file and
database table would be assigned numbers by us.
3. The details of the information to be received: These would be the variables that were
defined earlier in the program by us. Variables would be identified by the names
we assigned during the declaration of the variables. Along with the variable
names, we also need to specify the order in which the variables are to be received.
This is achieved by specifying the variables in the order in which they were stored
in the file or table. If the order specified in the program is different from the order
in which the variables were stored, then an error would result.
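The point about the order of variables can be illustrated with a small Python sketch. It assumes a record stored as “name, age” separated by a comma; the record contents and both function names are hypothetical.

```python
record = "Asha Rao,21"               # hypothetical record stored as: name, age


def read_in_stored_order(line):
    name, age = line.split(",")      # same order as the stored record
    return name, int(age)


def read_in_wrong_order(line):
    age, name = line.split(",")      # order differs from the stored record
    return name, int(age)            # int("Asha Rao") raises ValueError


print(read_in_stored_order(record))  # ('Asha Rao', 21)
try:
    read_in_wrong_order(record)
except ValueError as err:
    print("Error:", err)
```

Here the mismatch is caught only because a name cannot be converted to a number; had both fields been text, the values would have been swapped silently, which is why the order must be specified with care.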
We need to open the file before attempting to read from it. The following functions are
performed in the opening of a file:
1. The OS would locate the device on which the file is located. This would involve
checking the health of the device and ensuring that it is operational and has the
specified file.
2. Copy the file information from the VTOC (Volume Table of Contents), which is
also referred to as the File System, the File Allocation Table, or by some other
name, into RAM.
3. Allocate RAM to hold one record, or a block/set of records, read from the file.
4. The file structure defined in the program would be compared with the structure
of the file on the disk, and if they differ, then an error would be generated
by the OS that needs to be handled by the programmer. We have a separate
chapter on error-handling.
5. If a filter (a condition to selectively retrieve the records) is set, apply the filter while
retrieving the records. This involves allocating RAM and storing the filter
information in it for reference during every read operation.
6. Place an interrupt on the CPU to indicate that the input is ready.
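From the programmer's side, most of these steps are hidden behind a single open call. A minimal Python sketch, assuming a text file of “name, age” records (the file name, its contents, and the function read_adults are hypothetical), shows opening the file, handling the case where it cannot be located, and applying a filter on every read:

```python
import os
import tempfile

# Build a small data file to read from (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), "students.txt")
with open(path, "w") as f:
    f.write("Asha Rao,21\nJohn Doe,17\nMei Lin,19\n")


def read_adults(file_path):
    """Open the file, read it record by record, and keep records passing the filter."""
    selected = []
    try:
        with open(file_path) as f:           # open before reading; closed automatically
            for line in f:                   # one record per read operation
                name, age = line.strip().split(",")
                if int(age) >= 18:           # the filter, applied on every read
                    selected.append(name)
    except FileNotFoundError:
        print("File not found:", file_path)  # the OS could not locate the file
    return selected


print(read_adults(path))            # ['Asha Rao', 'Mei Lin']
print(read_adults(path + ".none"))  # prints a message, then []
```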
For reading information from a database table, we need to connect to the database and
then open the table. It involves the following operations:
1. Connect to the database. We need to provide the following information while con-
necting to a database:
a. The opening string of the machine on which the database is located. This
would be a string of characters and includes the IP address, the name of the
machine, the disk volume, the name of the DBMS software (like Oracle, SQL
Server, Progress, and so on) on which the database is located, and so on. The
contents of the database opening string depend on the installation of the hard-
ware and software combination.
b. The user ID and password for the database. Please note that this combina-
tion is not for the machine but only for the database that is proposed to be
opened.
c. The type of security, which can be the database administrator, user, or any
specific database role that needs to be specified along with the opening string.
d. Any lock (read lock, write lock, exclusive use, or any other lock) that needs to
be applied to the database while performing database operations.
2. Once the string is provided to the connect statement, the OS will connect the pro-
gram to the database.
3. The detailed information about the database, that is, its location, ID, table
information, permissions, user ID and password, etc., is copied to the RAM for
reference during each database operation.
With this, the database gets connected to the program and we can perform read operations
on the database.
Once the database is connected to the program, we can access any table from the data-
base and use it for read operations. Opening any table involves the following operations:
1. The table information, including the ID, the number of records contained in the table,
the indexes generated on the table, and the primary and secondary keys, would be
copied into the RAM for reference during each table operation.
2. The security information and security permissions would be copied into the RAM for
reference during the execution of the program.
3. The filter information for the retrieval of the records would be copied.
Now, the table is ready for receiving information input to the program under execution.
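The connect-then-read sequence can be sketched with Python's built-in sqlite3 module. This is only an approximation of the steps above: SQLite is a file-based database with no user ID, password, or roles, so the sketch assumes an in-memory database and a hypothetical students table created on the spot.

```python
import sqlite3

# ":memory:" creates a throwaway database; a real program would pass a file
# path or, for a server DBMS, a connection string with user ID and password.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO students (name, age) VALUES (?, ?)",
    [("Asha Rao", 21), ("John Doe", 17), ("Mei Lin", 19)],
)

# Read from the table, applying a filter while retrieving the records.
cursor = conn.execute(
    "SELECT name FROM students WHERE age >= ? ORDER BY name", (18,)
)
names = [row[0] for row in cursor]
conn.close()

print(names)   # ['Asha Rao', 'Mei Lin']
```

The WHERE clause plays the role of the filter, and closing the connection releases the information that was held in RAM for the database.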
Input Statements
Nowadays, most user input takes place through the GUI (Graphical User Interface). In the
GUI screens, we have various controls, and the important ones among these are:
1. Form
2. Frame
3. Text boxes
4. Combo boxes
5. List boxes
6. List views
7. Grids
8. Radio buttons
9. Command buttons
10. Labels
11. Links
Form
In a GUI environment, all the controls are placed on the form. When the program is exe-
cuted, the form is displayed on the screen along with all controls placed on it. A form has
many properties associated with it, and here are some important ones:
1. Name: Every form has a name associated with it. We use this name in the pro-
grams when we refer to it. We use “form.control.property” to refer to the property
of a control on a specific form. For example “loginform.userid.text” refers to the
text entered into the box named “userid” on the form named “loginform.”
2. Icon: This defines the icon picture to be displayed in the top left-hand corner
of the form (or at a place defined by the programmer or the specific programming
language).
3. Caption or title: This is the text displayed on the top border of the form when
it is displayed or at an appropriate place defined by the specific programming
language.
4. We can set the colors for the background and foreground.
5. Enabled or Disabled: By setting this property as desired, we can allow modification
of the contents of the controls on the form. When we set this property to Disabled,
the controls are displayed on the form but disallow entry of fresh values or modi-
fication of existing values.
6. Visible: By setting this property dynamically, we can make this form appear and
disappear during execution as desired.
7. There are many more properties to achieve various requirements of the form.
A form has a number of events associated with it, and the important ones are enumerated
here. Each of these events can be programmed as needed by us. Please note that a specific
programming language may have different names for these events:
1. FormLoad: In this event, we can program all those actions that need to be executed
before allowing the user to use the form. All these actions would be executed by
the computer as it loads the form. We usually program such actions as authenticating
the user (checking that the user accessing the form was properly logged in), ensuring
that the user has the necessary permissions to use the facilities provided in the
form, blocking out those specific controls for which the logged-in user has no security
clearance, opening a connection to the database, and so on.
2. FormUnload: In this event, we program all those actions that need to be executed
before closing the form and taking it off the screen. Typically, we write code for
closing all database connections, restoring the form that was on top before this form
was loaded, closing the application if necessary, logging off the user, and so on.
3. Activate and deactivate: In these events, we either activate the form and bring it on
top or deactivate the form and minimize it. By activation, we bring the form on
top of the screen and enable it to receive inputs or use the facilities provided on the
form. By deactivation, we minimize the form and disable it from receiving inputs
or using any facilities provided on the form.
4. Resize: In this event, we change the size of the form to the desired size. Usually
the form will have three sizes, namely, normal, full, and minimized. The normal
size form, typically, would not occupy the entire screen. The exact size of a nor-
mal form depends on the definition provided in the programming language. The
full size however, would occupy the entire screen. The minimized form would be
shown as an icon on the task bar at the bottom/top of the screen and would be
taken off the screen. Of course, we can also define the form size to suit our applica-
tion by giving the coordinates for locating the form and the size of the form on the
screen.
Frame
A frame can be viewed as a subform or a form within a form. When we use radio buttons
(to be discussed in the following sections), we need to use a frame to define all the buttons
as one set. A set of radio buttons within a frame are shown in Figure 9.5. We can also use
frames to arrange the controls on the screen into logical groups, which makes the form
easier for the user to work with. Another advantage of using a frame is that we can
disable, or enable, all the controls enclosed within the frame with one programming
statement. The events and properties associated with a frame include got-focus,
lost-focus, enable, disable, visible, invisible, and so on. These are discussed in the
following section dealing with text boxes.
Text Box
A text box (Figure 9.1) is usually a rectangular area on the screen placed at a designated place
to receive a set of characters as guided by a prompt beside the box. These characters can be
numbers, letters, and other characters as desired by the user. A text box allows for the
entry of values without checking the data type being entered, and it behooves the
programmer to do the type-checking. If we do not check the data type being entered into a text
box, an error would be thrown up when we assign the value to a variable if there is a
mismatch of the data type. The value entered in a text box is transient. It disappears when the
screen is unloaded. We can use the value entered into a text box directly in processing
without assigning the value to a variable. A text box can have a name defined by the programmer.
We can also use the default name assigned by the programming language. A text box has a
number of properties associated with it. The important ones are:
1. Name: The text box has a name associated with it. We use this name when we
refer to it in the program and to capture the value entered into it for processing or
storing.
2. Text: We suffix this property to the name of the text box. This would contain the
value entered by the user. We usually use this property as “text1.text,” which
means the value of the text entered into the text box named “text1.” We can now
assign this value to a variable in this manner—“NameofStudent = Text1.Text.”
This statement would assign the value entered into the text box, named Text1, to
the variable NameofStudent. Of course, if the type of data in the text box is not the
same as that of the variable, an error would be thrown up.
3. Tab Index, Tab Stop: The Tab Index property sets the order in which the cursor comes
into this box when we press the tab button on the keyboard. By using the Tab Stop
property, we can make the cursor either skip this box or stop in this box.
4. Visible: By setting this property, we can make the text box either visible or hidden
when the form is displayed.
5. We have a few properties to select the font to display the text entered by the user,
the background color of the box, foreground color of the box, the size of the box,
and so on. We use these properties to enhance the appearance of the text box.
6. We have a few properties to associate the box to a field in a database table so that
we can import data from the database table into the box.
FIGURE 9.1
A text box.
7. Enabled and Disabled: We can use these properties to permit entry of text into the
box or modification of the existing text in the box. When we set the property to
Disabled, the text would be visible but does not allow its modification.
8. Tool tip: This is a facility to provide some helpful text to the user. The text we define
in the tool tip control becomes visible when the mouse hovers over the box.
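Since a text box accepts any characters, the type-checking mentioned earlier falls to the program. A minimal Python sketch, using a plain string as a stand-in for a property such as Text1.Text (no GUI library is involved, and the names parse_age and age_box_text are hypothetical), might look like this:

```python
def parse_age(text):
    """Type-check the text from a text box before assigning it to a numeric variable."""
    try:
        return int(text.strip())     # succeeds only if the text is a whole number
    except ValueError:
        return None                  # signal a data-type mismatch to the caller


age_box_text = "21"                  # stands in for a property such as AgeBox.Text
age_of_student = parse_age(age_box_text)
print(age_of_student)                # 21
print(parse_age("twenty"))           # None
```

Returning None instead of letting the error propagate lets the program show a message and ask the user to re-enter the value.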
The text box has a number of events associated with it. We can write a program to perform
desired actions when the event is triggered. Here are the important events associated with
a text box:
1. Got-Focus: This event contains statements to be executed when the cursor position
shifts into this box either as a result of pressing the tab button or clicking the left
mouse button when the cursor is positioned in the box. We use this event to high-
light the text already inside the box, but there are many uses for this event.
2. Lost-Focus: This event contains statements to be executed when the cursor shifts
out of this box. Usually, we use this event to validate the value entered into the
box, but we can also program other statements as required by the situation at
hand.
3. Click: We can include such statements that need to be executed when the left mouse
button is clicked when the cursor is placed in the text box.
4. Double-Click: We can include such statements that need to be executed when the
left mouse button is double-clicked (the left button of the mouse is clicked twice in
quick succession) when the cursor is placed in the text box.
5. Change: This event is triggered when the value in the text box is changed. A
change can happen only when there is some text already inside the box before
the focus is shifted into this box and then it is changed. The Keypress event is
triggered whenever a key on the keyboard is pressed, but the Change event
is triggered only when the focus is shifted out of this box. We can program all
such statements that enumerate the actions to be performed by the computer
when this event is triggered.
6. Keyup, Keydown, Keypress: Keydown is triggered when the key is pressed but not
released. Keyup is triggered when the pressed key is released. Keypress is triggered
when the key is pressed and immediately released. We use the Keypress
event to check whether the character is numeric or non-numeric, especially when
we expect the data to be numeric. We also use this event to ensure that the character
is a permitted character and to prevent special characters from being entered along
with the Alt and Ctrl keys.
7. Mouseup, Mousedown, Mousemove: These events are used to program the move-
ments of the mouse and the click of its left button. Oftentimes, we may click on a
box by mistake. In such cases, we can use the Mousedown event to program what
can and cannot be done. Similarly, we can use the Mouseup event to program
when the pressed left button of the mouse is released. Mousemove event is used
to program what needs to be accomplished when the cursor is moved by moving
the mouse such that the cursor hovers on the box. Usually we highlight the text in
the box when the cursor hovers over the box by programming the event.
8. Dragdrop, Dragover: These events are used to program the drag and drop operation
during the cut and paste actions.
9. There are some more events associated with a text box, and these events are continuously
enhanced as well as supplemented to make the programs more powerful and
aesthetically more appealing, and to reduce the actions to be taken by the programmer.
So, we need to refer to the manual of the specific programming language we are
using for the complete set of events and their functionality.
All these events can be set using the facilities provided by the IDE (Integrated Development
Environment) or by using program statements for the event. The other controls have similar
properties and events associated with them. As I have discussed these events here in detail, I
will not repeat them when discussing other controls.
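The typical uses of the Keypress and Lost-Focus events described above can be sketched without any particular GUI toolkit. The following is only a minimal Python sketch: the function names on_keypress and on_lost_focus are hypothetical stand-ins for whatever event handlers the chosen language generates, and it assumes a text box that should accept only a whole number within a given range.

```python
def on_keypress(char: str) -> bool:
    """Return True if the typed character may be accepted into a numeric box."""
    # Accept a single digit only; letters and special characters are rejected.
    return len(char) == 1 and char in "0123456789"


def on_lost_focus(text: str, low: int, high: int) -> str:
    """Validate the whole value when the focus leaves the box.

    Returns an empty string when the value is acceptable,
    otherwise a message that could be shown to the user.
    """
    if text == "":
        return "A value is required."
    if not all(on_keypress(c) for c in text):
        return "Only digits are allowed."
    value = int(text)
    if not low <= value <= high:
        return f"Enter a value between {low} and {high}."
    return ""


# Simulate a user typing "4", "x", "2" into the box: "x" is filtered out.
typed = "".join(c for c in "4x2" if on_keypress(c))
```

In a real program, the toolkit would call such handlers automatically; here they are ordinary functions so that the logic can be tried out on its own.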
Combo Box
A combo box is in fact a combination box combining a text box with a list box (Figure 9.2). It
is a rectangle with a downward-pointing triangle indicating selection. It is used for select-
ing a single value from a set of values already available in the combo box. When the pro-
gram is under execution, clicking the inverted triangle results in the box being extended
downwards displaying the values available for selection as shown in Figure 9.3.
A combo box can be filled with values of a specific attribute (variable or field), and then
the user can select any of the values in the list. Alternatively, the user would be allowed to
enter a fresh value into the box. The box provides the facility to block the user from enter-
ing a new value. Often, this would be used to take in the primary key from a database so
that if an existing value is entered, the relevant data would be retrieved and displayed
in the other appropriate controls. Of course, all these actions need to be programmed by
the programmer! We can store the data in a combo box either in a sorted order or in their
original order. Combo boxes allow the user to select the values by clicking the item in the
displayed list or by pressing the first character in the item and then using arrow keys. We
use combo boxes extensively to present a choice of items and allow the user to select a
single existing value or to enter an altogether new value. A combo box is both an input as
well as an output. It is used as output when we fill the combo box with values, and when
the user selects an item, it will be used as an input.
FIGURE 9.2
A combo box.
FIGURE 9.3
A combo box after clicking the inverted triangle.
FIGURE 9.4
Grid.
List Box
While the text box and the combo box are used to receive input of one data item, the list box
can receive multiple data items for input. But all three controls, the text box, the combo
box, and the list box, receive inputs for the same attribute (or field of a database table). The
list box looks like a text box with a scroll bar at the right side. It permits the
selection of multiple items from the displayed list. The selection of multiple items is
facilitated by holding the control (CTRL) key and then clicking on the desired items.
Grid
While text box, combo box, and list box receive values for one single attribute (or a data-
base field), a grid can receive data items for multiple attributes (or database fields or an
entire record). List view, or a grid (Figure 9.4), is a table with rows and columns displayed
on the screen. It can display values of multiple attributes (or fields of a database table). It
can be directly connected with a database table, and each column can hold the values of a
field. Each row can hold the values of a record. We can fill the grid with only the required
data. In some grids, the act of changing the values in the grid would automatically change
the values in the database table. A grid can be used both as an input and an output. When
we fill the grid with the data from a table, it is an output, and when we receive new values
from the grid, it is used as an input. We use grids in input screens in which we receive
multiple data items to show that the entered data is indeed saved by displaying them in
the grid. It is also used in the enquiry screen to show multiple items that resulted from the
query passed on to the database table.
The grids allow for column headings to be placed on the top row. Some grids allow for
sorting the entire grid by the values of a selected column by clicking on the header row at
the desired column. Grids also allow us to search within the grid to locate the desired values.
Radio Buttons
Radio buttons are to be embedded inside a frame, as shown in Figure 9.5.
Radio buttons are used to present the user with a set of mutually
exclusive options for an attribute from which the user can select only one option. Each radio
button presents one option and together, a set of radio buttons presents all available options
FIGURE 9.5
Radio buttons.
from which only one can be selected. All the options are visible on the screen. The same
functionality can be achieved with a combo box also, but the combo box presents only one
option at a time. To see the other options, we need to click the inverted arrowhead on the
combo box. With radio buttons, all the available options are present on the screen itself,
enabling the user to view all the options at one time without clicking anything. When the
program is executed, only one radio button in the set can be selected and all others would be
deselected. That is, the radio buttons are mutually exclusive. When a radio button is selected,
the previous selection would be removed. Radio buttons are to be used to ensure that only
one option is selected. Each radio button results in a Boolean input that is 0 or 1, or Yes or No.
The caption property is available to controls like the radio button, check box, command
button, the label, etc. This is displayed along with the control when the program is exe-
cuted. This helps the user in efficiently entering the needed input with ease.
Check Boxes
Check boxes are similar to radio buttons in the sense they present all the options avail-
able just like the radio buttons, but the comparison stops there. Each check box results
in a Boolean input as 0 or 1, or Yes/No, for one data item. All the radio buttons in a
frame are intended to obtain input for one data item whereas each check box obtains
input for one data item. Check boxes are mutually inclusive, that is, the user can select
multiple check boxes, all of which are valid for the scenario. Radio buttons are used
to enable the user to select one option out of a set of mutually exclusive options, and
check boxes enable the user to select all the options that complement and supplement each
other. A set of check boxes is shown in Figure 9.6. Another aspect is that the check boxes
need not be embedded inside a frame, as they are mutually inclusive.
FIGURE 9.6
Check boxes.
FIGURE 9.7
Command buttons.
Command Button
Command buttons are used to indicate that the inputs are given, and that the computer
can begin to process the inputs. A sample command button is shown in Figure 9.7. Each
command button will usually require a program to process the data entered into the
controls by the user. It initiates the execution of the program. Usually, command buttons
are used to save the data, cancel the action, delete a set of data, or any other affirmative
actions. It generally has the captions of Save, Cancel, OK, Yes, Book Ticket, Cancel Ticket,
Transmit, and so on. Programs can make the command buttons visible or invisible to
implement security based on the permissions the user has. The properties usually available
to command buttons are enabled, disabled, visible, invisible, caption, size, and picture;
the most commonly used event is click.
Labels
Labels are pieces of text that make it easier for the user to understand what is presented
on the screen. Labels are not editable when the program is under execution. In Figure 9.6,
a label is used: “Check all that applies.” This guides the user on how to use the control.
Usually each of the controls on the screen would have a label associated with it to indi-
cate its name, function, or an explanation. Labels can be made invisible or visible. Labels
are very important controls on the screen, and just changing the labels on the screen can
change the software from one language to another.
In addition to the controls described earlier, the specific programming language being
used may provide some more controls. One point we need to note is that some of these con-
trols like the text box, combo box, list box, and the grid, can be used as both input as well as
output.
Links
In the present day, when applications are Internet-based, links are provided to navigate
the user to other pages within or outside the application. A link has two components,
namely, the URL (Uniform Resource Locator) executable by the browser and the text that
is displayed to the user to aid him/her. Of course, the URL itself can be used as the text,
but using human-readable text is preferred to make the application user-friendly.
Sometimes, the URL could be too long, and it is preferable to use a piece of text in place of
URL. Each programming language provides a facility to define links and link them with
pieces of text to be seen by the end user.
The C-family languages use the “getchar” statement to read one character into a
predefined variable. Other languages have similar keywords for receiving just one character.
This statement can also be used to receive large amounts of data from a file, character
by character. This keyword is the main tool in commands for copying files.
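Copying character by character can be sketched as follows. This is a minimal Python sketch and not the C “getchar” itself: Python has no getchar, so read(1) on a text stream is used as the closest equivalent, and the function name copy_characters is hypothetical. In-memory streams stand in for real files.

```python
import io


def copy_characters(source, destination) -> int:
    """Copy text from source to destination one character at a time."""
    count = 0
    while True:
        ch = source.read(1)      # read a single character
        if ch == "":             # an empty string signals the end of the input
            break
        destination.write(ch)
        count += 1
    return count


# In-memory text streams stand in for an input file and an output file.
src = io.StringIO("hello\nworld\n")
dst = io.StringIO()
copied = copy_characters(src, dst)
```

The same loop would work with files opened for reading and writing, since they offer the same read and write operations.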
To receive a value for a variable, the C-family languages use the “scanf” keyword. The
keyword “scanf” can receive numeric as well as non-numeric values. However, we need
to define the variable appropriately before reading the value from the keyboard. If there
is a mismatch between the type of the defined variable and the data supplied from the
keyboard, the read fails: depending on the language, an error is raised or the statement
reports that no value was stored. Other languages use keywords like “read” and “input”
to receive values into variables. Of course, the type of the defined variable and the value
supplied must be the same. Otherwise, an error will be thrown up.
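The effect of a type mismatch can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the helper name read_int is hypothetical, Python's int conversion stands in for a typed read such as “scanf”, and an in-memory stream stands in for the keyboard.

```python
import io


def read_int(stream, prompt: str = ""):
    """Read one line from stream and convert it to an integer.

    Returns (value, error); exactly one of the two is None.
    """
    if prompt:
        print(prompt, end="")
    line = stream.readline().strip()
    try:
        return int(line), None
    except ValueError:
        # Type mismatch: the text supplied is not a whole number.
        return None, f"'{line}' is not a whole number"


# io.StringIO stands in for the keyboard (sys.stdin) in this sketch.
keyboard = io.StringIO("25\ntwenty-five\n")
age, error = read_int(keyboard)
bad_age, bad_error = read_int(keyboard)
```

The first read succeeds because the text supplied matches the expected type; the second returns an error message instead of a value.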
Whatever values we enter from the keyboard in response to the previous statements, the
value would be displayed on the computer screen for us to view and ensure that it is the
right value. We can also correct the value if necessary by using the backspace key. While
we can present a blank screen to the user to enter the input, it is better to provide him/her
a prompt (a message on the screen that instructs the user what needs to be done) on the
screen to guide him/her. We can also locate the prompt appropriately on the computer
screen at a location we feel is most convenient for the user to read the prompt and enter
appropriate data. The value will not be input to our program until the user presses the
Enter key on the keyboard.
In many languages, the largest value that we can enter into a variable is a double-precision
number for numeric variables and a string of at most 255 characters for character variables.
If we wish to have the user enter a large amount of textual matter, we need special programs
like MS-Word or database packages.
Using the scanf family of statements, we can receive input for one or more variables
into our program. These statements can also be used to read data from flat files. When we
use this scanf family of input statements, we need to specify the input device from which
to get the input data. Usually, it would be an integer defined earlier when we opened
the file for input. If we do not specify a device ID in this statement, it usually expects
the input from the standard input device, which in the present day is the keyboard and
screen combination. Inputting multiple variables would follow a specific protocol
based on the specific programming language being used. When receiving input from a
file using this class of input statements, there would be extensions to this statement to
suit the type of data file organization: sequential, random, or indexed sequential.
Receiving information from a database depends on the specific DBMS software
we are using. There is a standard set of SQL constructs that are common to all
DBMS software packages, but the packages differ in the statements used for connecting to
a database and then opening a database table. For reading, of course, the SELECT statement
is used across all DBMS software packages. Here are some important standard SQL
statements:
1. SELECT: This statement is used to read information from a database table. SELECT
keyword is followed by a * (indicates reading of all the fields in the record, or in
other words, the entire record) or by the list of the fields to be read. The field names
must be written as defined in the database table structure. This statement can
include a relational or logical expression to filter the retrieved records.
2. INSERT: This statement is used to add a new record to the database table. Of
course, this statement also causes the update of the relevant indexes used on the
table. The INSERT keyword is followed by the field names and the values (con-
stants or variables) to be stored into those fields.
3. UPDATE: This statement is used to modify the contents of fields in an existing
record. In this statement, we can change the contents of one field, some fields, or all
the fields as necessitated by the situation. The UPDATE keyword is followed by the
names of the fields to be modified along with the values (constants or variables)
with which to modify the contents of the fields. This statement can include a relational
or logical expression to locate the records to be updated. This statement can
be used to modify the contents of one record, multiple records, or all the records
as necessary.
Of course, there are many other SQL keywords that aid in programming and in manipulating
the data in the databases. For comprehensive coverage, you need to refer to an
SQL programming book or the manual of the specific DBMS you are using. DBMS packages
also make available certain aggregate functions like count, average, sum, and so on to
retrieve the data as well as perform simple arithmetic operations.
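The three statements and the aggregate functions can be tried out with the SQLite database engine that ships with Python. This is a minimal sketch: the table name employee and its fields are hypothetical, and the connection step shown here is specific to SQLite, since each DBMS has its own way of connecting.

```python
import sqlite3

# An in-memory database; the table and field names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)

# INSERT: add new records; the ? placeholders receive values from variables.
conn.executemany(
    "INSERT INTO employee (emp_id, name, salary) VALUES (?, ?, ?)",
    [(1, "John", 50000.0), (2, "Shane", 100000.0)],
)

# UPDATE: modify one field in the records located by the WHERE expression.
conn.execute("UPDATE employee SET salary = ? WHERE emp_id = ?", (55000.0, 1))

# SELECT: read chosen fields, filtered by a relational expression.
rows = conn.execute(
    "SELECT name, salary FROM employee WHERE salary > ? ORDER BY emp_id",
    (52000.0,),
).fetchall()

# Aggregate functions perform simple arithmetic while retrieving the data.
count, total, average = conn.execute(
    "SELECT COUNT(*), SUM(salary), AVG(salary) FROM employee"
).fetchone()

conn.commit()
conn.close()
```

Note that in actual SQL the table name also appears in each statement (INSERT INTO, UPDATE ... SET, SELECT ... FROM), and that passing values through placeholders keeps user input separate from the statement text.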
Data Validation
“To err is human, but to make a real mess, you need a computer,” or so goes the joke about
data errors in the information obtained from computers. Most of the errors are attributable
to errors made while entering data into computers. A few of the errors are caused because
of defects in the programs. The errors caused by the defective programs are easily trace-
able as they are consistent over all the data that is processed by the program, but data entry
errors are difficult to detect and rectify.
In the past, when the data entry was offline, the data was entered by data entry special-
ists and each item of data was verified by reentering the data or by manually checking to
ensure that the data was as accurate as humanly possible. In those days, the data was accu-
rate to 99.96%, or for every 10,000 records, 4 records were likely to be containing defects.
Now, offline data entry is still used but not in mainline business applications. Most of the
data comes from business transactions handled every day by employees working in orga-
nizational business processes. The people entering data are not data entry specialists, but
functional specialists using the computers for record-keeping and other purposes. They
are not likely to enter every item of data twice to ensure the accuracy of the data entered
in the computer. Therefore, we need to build in data validation statements to ensure the
accuracy of the data and to prevent as many errors as possible from entering the corporate
databases.
We have to note that data-entry errors caused purposefully cannot be avoided. By including
data validation statements in our programs, we help the functional specialists using the
computer avoid committing errors inadvertently. Here are some of the methods we use in
programs to prevent inadvertent errors:
1. Masterfile look up: We usually have some master data in our databases, and each
table must have a primary key, which is an important nonduplicated item of data.
In a payroll or HR application, the employee ID is a key item. In a marketing appli-
cation, the customer ID is a key item. In a purchase application, the order number
is a key item. In a retail store, the product codes and the prices are key items. So,
whenever an item of such key is required to be entered, we force the user to select
from a list using a combo box or another appropriate control. That way, the user
is prevented from entering the wrong data. When the user needs to enter data for
such an item, we check the appropriate database table for the existence of the item,
and if it is not available, we flash an error message to alert the user about the mis-
take committed.
2. Range check: This is useful in preventing errors while entering numeric data. For
example, in an HR application, we check the date of birth to be within the range of
employability to ensure the employee has attained the minimum age and is below the
retirement age. When processing financial applications, we check the range of
permissible amounts.
3. Logical checks: We use certain common-sense logic to see if the entered data is accu-
rate. For example, when the user enters the date as February 29, we check if the
year is a leap year or not. In an email address, we expect one “@” character, a dot,
and, at the end, there are certain acceptable words such as com, edu, org, and so
on. In a website address, we expect the letters “www,” and at the end, acceptable
words like com and org. These are a few simple examples, but in every application
there will be many situations like this, and we build in as much logical checking
as possible to ensure accuracy of data.
4. Unusual data: This is used in financial applications. When a payment to a person/
employee/organization goes above the usual amount by a certain percentage, we
alert the user before processing the data. In credit card and debit card applica-
tions, we not only check the limits, but also compare the present transaction with
all the transactions over the past year or so to prevent fraudulent use of the card.
5. Unusual requests: With the Internet making it easy to access data all over the globe,
we now look at the source from where the request is coming. If the request comes
from a foreign location, and especially from an enemy country, we prevent access
and alert the administrator.
6. Unexpected characters: This has become a big problem in Internet applications.
As you perhaps know, HTML (Hyper Text Markup Language) is built up
using tags. Now, if an executable string is entered into a text box as data,
it can cause havoc. We ensure that such tags are either rejected or neutralized so that
they cannot act as valid HTML statements. This has now become an important aspect
of today’s programming. Another aspect is that, while entering data in fields such as
names of personnel where digits and other non-alphabetic characters should not
be present, we check for such unexpected characters and eliminate them or alert
the user about the erroneous entry.
7. Integrity check: When codifying important IDs such as social security numbers,
employee IDs, material codes, order codes, and so on, they are built with some
logic. When someone enters such data, the data will be verified against the logic to
ensure that accurate data is being received.
8. Type-checking: We also check the data being entered to ensure that the right type of
data is being entered. A common mistake is entering an “O” (letter “O”) in place
of zero. This check is especially implemented when accepting numeric data.
9. Constraint-checking: In many cases, we place constraints on the type of data being
entered. In numeric cases, there could be a minimum value or a maximum value.
In character strings, there could be a minimum length and a maximum length of
the string acceptable to the application. In some cases, there needs to be only one
word. We check for all such constraints. In passwords, it has become common to
require a digit and a special character. We enforce such rules with this kind of check.
10. Consistency checks: We check for consistency between the parts of the data being
entered. If the title was entered as “Mr.,” then the gender should be “Male.” If a
product is selected, then the price and discount should be commensurate with the
product. There could be many such consistency checks possible depending on the
application, and we check for these aspects.
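A few of the checks listed above can be sketched in Python. This is only an illustrative sketch: the function names are hypothetical, and the minimum age of 18 and retirement age of 65 are assumptions chosen for the example, not values taken from any particular application.

```python
import html
import re
from datetime import date


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Logical check: reject impossible dates such as February 29 in a non-leap year."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


def age_in_range(birth: date, today: date, minimum: int = 18, maximum: int = 65) -> bool:
    """Range check: is the age within the assumed limits of employability?"""
    # Subtract one year if this year's birthday has not yet occurred.
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    age = today.year - birth.year - (0 if had_birthday else 1)
    return minimum <= age < maximum


def is_whole_number(text: str) -> bool:
    """Type check: digits only, so the letter 'O' in place of zero is caught."""
    return re.fullmatch(r"[0-9]+", text) is not None


def length_ok(text: str, shortest: int, longest: int) -> bool:
    """Constraint check on the length of a character string."""
    return shortest <= len(text) <= longest


def neutralize_tags(text: str) -> str:
    """Unexpected characters: escape HTML so that entered tags cannot act as tags."""
    return html.escape(text)
```

Each function returns a simple result that the calling program can use to accept the value or to alert the user about the erroneous entry.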
4. Vibrations at some critical places like shafts, enclosures, and panels.
5. Various measurements like the voltage, current, electrical resistance, volume of
liquids, levels of cooling liquids, temperatures, and so on.
6. There could be situation-specific parameters internal to the machine.
The external parameters matter especially to moving machines like the automobiles, air-
planes, ships, submarines, rockets, and so on. They are:
1. The environmental parameters like the wind speed, rain, snow, dust, and so on.
2. Any approaching objects.
3. The direction and the ground speed.
4. The time allowed, elapsed, and remaining.
5. The distance to be covered, already covered, and remaining.
6. There would be many more machine-specific parameters.
All this data comes from instruments/sensors mounted on the concerned parts of the
machines, and they collect the data that is basically analog in nature. They would have
either built-in or externally mounted A2D (Analog to Digital) converters that will convert
the analog signal from the instrument to a digital signal suitable for a digital computer.
Now this signal is interpreted and processed by the CPU embedded in the machine.
As computer programmers, we are not concerned how these A2D converters work. We
simply assume that the data comes in digital form readily usable by the CPU and code our
programs.
Final Words
Inputs are vital in computer programming. Inputs can come from master data stored inside
the computer and transaction data generated with every business transaction performed.
Data comes in from various input devices including flat files and database tables. The
important input programming keywords are discussed, but the specific syntax depends
on the programming language selected for the application. During business transactions,
data comes in from GUI screens housing various controls, and the important controls are
discussed in this chapter.
10
Output Statements
Introduction
The purpose of using computers is to generate useful and actionable information by process-
ing data. Once generated, the information needs to be delivered to the user on the desired
medium so that the users, who need the information, can utilize it. In this chapter, we will
discuss how the information is sent from inside the computers to the external world.
the output statements, so they go to the right output device. Generally, the output state-
ment consists of three parts:
1. A keyword such as “write,” “print,” or some such other keyword that specifies
that the purpose of this statement is to deliver some output.
2. The ID of the output device which indicates the desired output device that is going
to receive the generated data/information.
3. The data or information that was generated by our program.
There would be other components in an output statement in other cases. We will discuss
them at appropriate places.
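The three parts of an output statement can be seen in a single line of Python. This is a minimal sketch: the helper name write_line is hypothetical, the keyword is Python's print, the device is given through the file argument, and an in-memory stream stands in for a second output device.

```python
import io
import sys


def write_line(device, *values) -> None:
    """Send the values to the given output device, separated by spaces."""
    # keyword: print | device ID: file=device | data: values
    print(*values, file=device)


write_line(sys.stdout, "Total", 42)   # the standard output device (the screen)

report = io.StringIO()                # an in-memory stand-in for another device
write_line(report, "Total", 42)
```

When no device is specified, print sends the output to the standard output device.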
1. Add or append mode: In this mode, the contents of the file would be retained as they
were, and new records would be added to the end of the file. The file would be
lengthened by the addition of the new records. In this opening statement:
a. The file is located on the magnetic disk connected to the computer. If the file
could not be located, an error message indicating that the file could not be
located would be flashed on the screen.
b. If the file is located, the details about the file, location, record structure, and
number of records, etc., would be copied to a location in the RAM.
c. Sufficient RAM would be allocated to hold a record or a block of records.
This space would be used to receive the saved records.
2. Output mode: When a flat file is opened in output mode:
a. If a file with that name already exists, then it would be deleted. Therefore,
we need to be careful when opening a file in output mode, lest we lose
an existing file.
b. A new entry is made in the VTOC with the name specified in the file opening state-
ment, and disk space would be allocated adhering to the procedures of the OS.
c. Sufficient RAM would be allocated to hold a record or a block of records.
This space would be used to receive the saved records.
3. Modify/update mode: This facility is generally not available in flat files. Flat files,
by design, are not amenable to modification of contents. When a flat file is to
be modified, we create a new file, taking the contents of the existing file and
the changes as input. But in certain specific cases, modification of flat files is
allowed. These conditions are:
a. The record structure must contain fields of fixed length. That is, the field
must contain the same number of characters in every record. If a value has
fewer characters than the field length, it must be padded with either blank spaces
or leading zeroes to fill the length available for the field.
b. The file needs to be organized either in random-access mode or indexed
sequential-access mode. While sequential-access flat files generally do not allow
modify/update mode, certain computers may allow such a facility, but this is
very rare.
Whenever we issue a write statement, the record would be saved in the RAM allocated
while opening the file. Whenever this space is filled, the OS would automatically write
the information to the disk and empty the space to receive new records. The residual
records in this RAM would be written to the disk when we issue the “Close” file
statement. Of course, modern programming languages flush all the buffers at the time
of terminating the program execution, but as programmers, we cannot rely on such
facilities. We need to include the close-file statement to ensure that all the information
received is stored in the files without exception.
Writing information to flat files consists of three steps, namely, the opening of the file,
writing information into it, and then closing the file. The opening statement for a flat
file would consist of the following components:
This results in the file being opened for the specified mode. Now, we can use the file to
send information using the defined file ID.
Once we complete writing data to the file, we must close the file. It is generally a simple
operation. The keyword in most languages is “close,” followed by the file ID. The closure
action would cause the data in the buffer to be written to the file and the memory space
allotted to the file operation is released to the OS. If we do not close the file using the speci-
fied statement, then the data in the buffer is likely to be lost.
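The open, write, and close sequence, and the difference between output mode and append mode, can be sketched in Python as follows. The file name sales.txt is hypothetical and the file is created in a temporary directory. Note that Python's append mode creates a missing file rather than reporting an error, so its behavior is an example of one language and not a general rule.

```python
import os
import tempfile

# A throwaway file path for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "sales.txt")

# Output mode ("w"): creates the file, or erases it if it already exists.
out = open(path, "w")
out.write("first record\n")
out.close()                 # writes out the buffer and releases the file

# Append mode ("a"): existing contents are retained; new records go at the end.
out = open(path, "a")
out.write("second record\n")
out.close()

with open(path) as infile:
    after_append = infile.read()

# Opening in output mode again discards both earlier records.
with open(path, "w") as out:    # "with" closes the file automatically
    out.write("only record\n")

with open(path) as infile:
    contents = infile.read()
```

The with form is a convenience that issues the close for us, which guards against forgetting it and losing the data still held in the buffer.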
1. The keyword “Write,” or “Print,” or some such other meaningful keyword that
indicates to the computer the nature of the operation requested.
2. The file ID, to instruct the computer where to write the information.
3. The format of the output.
4. The list of variables to write.
5. Any other language-specific information.
What is the format of the output? In some programming languages, mere mention of
the field names is adequate. In some programming languages, we need to specify the
format of each of the variables mentioned in the output statement. The format of each
field could be:
Now what happens if the data supplied and the format differ from each other? Ideally, an
error should be thrown up by the OS. But sometimes, the erroneous character in the data
would be replaced with a blank space or a zero and written to the file. So, we must take care
to ensure that the data and the format are exactly the same while writing the data to a file.
What would be the final shape of the record in the file once the output instruction is
executed? There are two types of data records in the flat files. They are:
In fixed field-length records, each field in every record would contain the same number
of characters with the shortage being filled with either blank spaces or zeroes. In variable
field-length records, the number of characters in a field can vary from record to record. Let
us illustrate this with an example.
Let us assume three fields, a first name of 6 characters, a last name of 8 characters, and
an income of 8 characters. In fixed field-length records, this data would appear like this:
JOHN  MCDONALD00050000
SHANE WARNER  00100000
You can see the blank spaces being padded for the alphanumeric data and leading zeroes
for the numeric data. Now the same data, in variable length records, would appear like this:
"JOHN","MCDONALD",50000
"SHANE","WARNER",100000
Now, as you can see, there would be no padding, but the alphanumeric data is enclosed
between quotation marks and there is a comma separating each field. Some programming
languages do not use the quotation marks, but the field separator, generally referred to as
the field delimiter, is used in all programming languages. The field delimiter can be any
character, but the comma and the semicolon are the most popular field delimiters.
How is one record delimited from the other? In fixed-field length records, they would
not be delimited. The earlier two records would appear on the disk, thus:
JOHN  MCDONALD00050000SHANE WARNER  00100000
When we try to read such records using an editor, it would be confusing. But the program
can distinguish clearly, as it counts the number of characters and then selects the right
record. To make reading such records simpler in an editor, another file format, referred to
as Line Sequential, is often used. In this format, each record is delimited using a
carriage-return (ASCII 13) character, a line-feed character (ASCII 10), or a combination of both. While using
a record delimiter makes it easy for us to open the data file in a text editor and make cor-
rections, it increases the file size on the disk.
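How a program picks records out of an undelimited stream by counting characters can be sketched as follows. The record length of 22 characters follows from the 6, 8, and 8 character fields of the earlier example; the function names are assumptions made for this sketch.

```python
RECORD_LENGTH = 22  # 6 + 8 + 8 characters, as in the earlier example


def split_records(data):
    """Cut an undelimited stream into records by counting characters."""
    return [data[i:i + RECORD_LENGTH] for i in range(0, len(data), RECORD_LENGTH)]


def parse_record(record):
    """Slice one record into its fields and strip the padding."""
    first_name = record[0:6].rstrip()
    last_name = record[6:14].rstrip()
    income = int(record[14:22])
    return first_name, last_name, income


def to_line_sequential(data):
    """Add a line-feed record delimiter (ASCII 10) after each record."""
    return "".join(record + "\n" for record in split_records(data))


stream = "JOHN  MCDONALD00050000SHANE WARNER  00100000"
for record in split_records(stream):
    print(parse_record(record))
```

The line-sequential version is one character longer per record, which is the extra disk space mentioned above.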
In variable field-length records, too, sometimes a record delimiter as explained earlier is
used for the same purpose of opening the data file in a text editor to make corrections to
data as necessary.
How do we write programs to ensure that the output is written as either fixed field-length
records or as variable field-length records? Usually there would be a different keyword
in the programming language to denote the type of record. Some programming
languages use the “write” keyword for fixed field-length records and the “print” keyword
for variable field-length records. Other languages use other keywords to
differentiate the record types.
While writing data, we need to ensure the following:
1. Some programming languages mandate that the data be moved into the record
fields. Therefore, the data that resides in the variables needs to be moved to the
output record fields by an assignment statement.
2. We need to ensure that the variable names appear in the write statement in the
same order as they are to be stored in the flat file as defined in its record structure.
3. The variables being written to the file must be of the same data types as the
corresponding fields defined in the file structure.
4. The field lengths must also be the same in the file structure and in the variables being
written to the file.
When these precautions are ignored, the OS is likely to raise errors. We are
lucky if errors are raised, as that gives us an opportunity to correct the mistakes
and re-execute the program. In some cases, no error is raised and the OS
writes the information exactly as specified in its functionality. That would result in erroneous data
files.
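Because a silent error is the worse outcome, we can check the precautions ourselves before writing. The following Python sketch assumes a hypothetical record structure (the RECORD_STRUCTURE table and the build_record function are inventions for this illustration) and refuses to build a record when a data type or a field length does not match.

```python
# Hypothetical record structure: (field name, data type, field length).
RECORD_STRUCTURE = [
    ("first_name", str, 6),
    ("last_name", str, 8),
    ("income", int, 8),
]


def build_record(values):
    """Return one fixed field-length record, or raise ValueError."""
    parts = []
    # Walking the structure keeps the fields in the order of the file.
    for name, data_type, length in RECORD_STRUCTURE:
        value = values[name]
        if not isinstance(value, data_type):
            raise ValueError(f"{name}: expected {data_type.__name__}")
        text = str(value)
        if len(text) > length:
            raise ValueError(f"{name}: longer than {length} characters")
        # Numbers get leading zeroes, text gets trailing blanks.
        parts.append(text.zfill(length) if data_type is int else text.ljust(length))
    return "".join(parts)


print(build_record({"first_name": "JOHN", "last_name": "MCDONALD", "income": 50000}))
```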
Of course, the use of flat files for storing data has come down significantly with the
growth in the use of DBMS packages. Still, flat files are used for storing configuration data
in commercial applications, and flat files are the main data storage method in real-time
applications. Flat files have very little overhead in terms of storage space and, therefore,
wherever there are constraints on storage space, flat files are used.
1. The IP address of the machine on which the DBMS was installed. The IP
address can be in the standard form (such as 192.168.0.1), or a name such as
chemuturi.data.
2. The name of the DBMS package—this helps in making the connection string to
connect to the database. Each DBMS package needs the arrangement of informa-
tion in a specific order in the connection string.
3. The name of the database.
4. The type of security used by the database. In some cases, the security could be
integrated with that of the server. In this case, unless the user is able to connect
to the server, the user would not be allowed to connect to the database. The secu-
rity rights need not be defined separately for the user on the database. The data-
base would use the security definition of the server machine. The other security
method used is to define the user rights on the database. In this case, the user need
not have rights to connect to the server machine to connect to the database. The
connection string would differ in both the cases.
5. The username and password to connect to the database.
With this information in hand, we are now ready to prepare the connection string. The
connection string is nothing but the items listed above concatenated into a string of
characters. We may use a separate statement for each item, using character arithmetic to
concatenate the strings, or write one single statement. I suggest writing separate statements,
as it becomes easier to debug. You would be amazed at programmers’ propensity to err. I would
go to the extent of saying that programmers, both in the original development of programs
and later in maintaining them, spend much more time debugging the programs than
writing them. So, it is better to write a longer program and save a lot of time while
debugging it. It is common to assign the connection string to a variable and use it when-
ever we connect to the database. Usually, we store the string along with the open statement
in a subroutine that is accessible to all the programs in the application and call it whenever
we need to connect to a database.
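As a sketch of building the connection string one statement at a time, consider the Python function below. The “key=value;” layout, the keywords, and the sample values are all hypothetical; each DBMS package documents its own keywords and the order in which it expects them.

```python
def build_connection_string(server, database, user_name, password):
    """Concatenate the connection details, one statement per detail."""
    # The layout below is a made-up format for illustration only.
    connection_string = "Server=" + server + ";"
    connection_string = connection_string + "Database=" + database + ";"
    connection_string = connection_string + "User Id=" + user_name + ";"
    connection_string = connection_string + "Password=" + password + ";"
    return connection_string


# In a real application these values would be read from a configuration file.
print(build_connection_string("192.168.0.1", "payroll", "app_user", "secret"))
```

If the connection fails, we can inspect the string after each statement and see which detail went wrong, which is the debugging advantage of the longer form.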
Now we simply write the statement for connecting to the database along with the
connection string. Usually, an OPEN statement is used to connect to a database. The syntax is
usually like this:
Open <ConnectionString>
“Open” is the keyword that tells the computer to establish a connection with the database
using the parameters given in the connection string. We need to write an error-trapping
statement immediately after the connection statement to trap the error should it occur.
There are any number of reasons why a connection with the database cannot be estab-
lished. Here are some:
1. The connection between the machine on which the application is running and the
database server may be in a disconnected state.
2. The server machine may be suffering from a hardware/software malfunction and
is not working as it should.
3. The number of connections on the database server has reached the maximum.
As stated earlier, a single-user DBMS can accept only one connection at a time.
Multiuser DBMS packages can accept multiple connections concurrently, but the
price of the DBMS varies with the number of connections it can accept concur-
rently. So, if we have installed a 5-user DBMS and are attempting to establish a
6th connection, obviously it fails.
4. Each connection to the DBMS needs a certain amount of RAM in the database server.
If the RAM is full and the server is not able to allocate more RAM for another
connection, or it is not able to allocate the RAM to the new connection request in the
expected time, the server may return an error message.
There could be more reasons like these. Whenever a connection is not established, an
error message is returned to the calling program by the database server or the machine on
which the application is running. We need to trap this error message, decipher it, and flash
a meaningful message to the user so that he/she does not panic and run to
the system administrator or lodge a needless complaint.
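Trapping the error immediately after the connection statement might look like the following Python sketch. It uses the sqlite3 module from the standard library purely as a stand-in for whichever DBMS is in use; the open_database function and the wording of the message are assumptions made for this illustration.

```python
import sqlite3


def open_database(path):
    """Try to connect; return (connection, None) or (None, user message)."""
    try:
        connection = sqlite3.connect(path)
    except sqlite3.Error as error:
        # Trap the error right after the connection statement and
        # turn it into a message the user can understand.
        message = f"The database could not be opened ({error}). Please try again later."
        return None, message
    return connection, None


connection, message = open_database(":memory:")
print("connected" if connection is not None else message)
```

The calling program only has to test whether a connection came back; the technical error text stays available for the log or the support staff.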
Once we have established a connection with the database, we can open any table in that
database.
In the statement opening the database table, the following components are present:
1. A keyword that tells the computer to open the desired database table.
2. The name of the table.
3. The lock to be applied on the table. This lock is usually of three types:
a. Read-only: This lock allows the current program only to read the existing
records. It prevents the current program from inserting new records. It also
prevents the existing records from being updated.
b. Read and write: This type of lock allows the current program to read the existing
records, insert new records, and update the existing records.
c. There could be other types of locks specific to the DBMS package. One such
type is the situational lock. It allows the records to be read and written
depending on the availability of the database. If no other connection
is using the database, the current program would be allowed to write or
update records. Otherwise, an error message would be returned. This is one
method people use to allow a single-user database to be used by multiple
users concurrently.
4. If there is security for the table besides the security on the database, we need to
supply the user ID and password for the table.
Now, we form a string in the order specified by the DBMS package and issue the statement
to open the database table. Once we open the table, we can extract the records from the table.
To extract records from a database table, we need two pieces of information in addition to
the name of the table:
We usually use the SELECT keyword in almost all the DBMS packages. Actually, the
standard SQL specifies the SELECT keyword for extracting the records from a database
table. In most DBMS packages, the table opening and the record extraction statements are
merged into one statement, but there could be different statements for these two actions.
where:
Field1, field2, field3, etc. are the names of the fields defined in the table
Value1, value2, value3, etc. are the values with which the fields would be filled
1. The number of values supplied must be the same as the number of fields mentioned.
2. The order of values supplied must be the same as that of the fields mentioned in
the statement.
3. The data type must be the same for both the field and its corresponding value.
4. The values can be constants or variables, but as we noted earlier, hard-coding
values in the programs is a bad practice. So, we ought to use only variables in this
statement.
5. If we do not enter all the fields available in the table, the remaining fields will be
filled with NULL values or the default values defined in the table design.
6. The primary key field must be included in the list of fields unless it is an auto-
incrementing field.
7. In the case that a record with the same primary key already exists in the table, an
error is thrown up by the DBMS. There is an UPDATE statement to modify the
contents of an existing record.
8. We need to write an error-trapping statement immediately after an INSERT state-
ment to ensure that the record is properly inserted.
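The points above can be tried out with a small Python sketch. It uses the sqlite3 module from the standard library as a stand-in DBMS; the employee table, its fields, and the insert_employee function are hypothetical names chosen for this illustration.

```python
import sqlite3


def insert_employee(connection, emp_id, first_name, last_name):
    """Insert one record; return None on success or an error message."""
    try:
        # Fields and values are listed in the same number and order,
        # and the values are passed as variables, not hard-coded.
        connection.execute(
            "INSERT INTO employee (emp_id, first_name, last_name) VALUES (?, ?, ?)",
            (emp_id, first_name, last_name),
        )
        connection.commit()
    except sqlite3.IntegrityError as error:
        # Raised, for example, when the primary key already exists.
        connection.rollback()
        return f"Could not insert record {emp_id}: {error}"
    return None


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, first_name TEXT,"
    " last_name TEXT, income INTEGER DEFAULT 0)"
)
first_result = insert_employee(connection, 1, "JOHN", "MCDONALD")
second_result = insert_employee(connection, 1, "SHANE", "WARNER")
print(first_result, second_result)
```

The income field is not mentioned in the INSERT statement, so the DBMS fills it with the default value defined in the table design.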
In some DBMS packages, it is possible to insert a blank record and move data into fields
using assignment statements. The blank record is appended at the end of the table and
after the primary key field is entered, it will be indexed. The advantage in this method is
that it is very easy to debug when the insertion fails. In the INSERT statement, all fields
and their values are in one statement, and sometimes it becomes very difficult to locate the
troublemaking value. In the second method, as we use a separate assignment statement
for each field, we can easily locate the value that is causing the trouble and correct it. So, in
my humble opinion, it is better to use the second method of inserting a blank record and
filling the fields with values, if the selected DBMS package allows it.
where:
• Field1, field2, field3 … fieldn are the field names as defined in the table.
• Value1, value2, value3 … valuen are the values of the data that go into the fields.
• Logical expression is the condition used to locate the desired record or records.
1. The data type must be the same for the field and its value.
2. The values can be constants or variables, but as we noted earlier, hard-coding values
in the programs is a bad practice. So, we ought to use only variables in this statement.
3. We need to write an error-trapping statement immediately after an UPDATE state-
ment to ensure that the record is properly updated.
One overwhelming advantage of the UPDATE statement is that we can modify multiple
records with one single statement. We made use of this facility during the Y2K conversion
days to update the tables for changing the 2-digit year to a 4-digit year with one statement
per table. It saved a significant amount of time for the organization and reduced the tedium
for the database specialists.
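Modifying many records with one statement can be sketched in Python as follows. The sqlite3 module stands in for the DBMS, and the invoice table, its fields, and the widen_years function are hypothetical; the rule of adding 1900 to every year below 100 is only an illustration of a year conversion, not the actual procedure used in any Y2K project.

```python
import sqlite3


def widen_years(connection, century, cutoff):
    """Convert every two-digit year in the table with one UPDATE statement."""
    try:
        cursor = connection.execute(
            "UPDATE invoice SET invoice_year = invoice_year + ? WHERE invoice_year < ?",
            (century, cutoff),
        )
        connection.commit()
    except sqlite3.Error as error:
        # Trap the error immediately after the UPDATE statement.
        connection.rollback()
        return None, f"Update failed: {error}"
    # rowcount tells us how many records the single statement modified.
    return cursor.rowcount, None


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, invoice_year INTEGER)")
connection.executemany("INSERT INTO invoice VALUES (?, ?)", [(1, 97), (2, 98), (3, 1999)])
updated, error_message = widen_years(connection, 1900, 100)
print(updated, error_message)
```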
In some DBMS packages, it is possible to locate the desired record and move data into
the fields using assignment statements. The advantage in this method is that it is very easy
to debug when the update action fails. In the UPDATE statement, all fields and their values
are in one statement, and sometimes it becomes very difficult to locate the troublemaking
value. In the second method, as we use a separate assignment statement for each field, we
can easily locate the value that is causing the trouble and correct it. So, in my humble
opinion, it is better to use the second method of updating single records and modifying the fields
with values, provided the selected DBMS package allows it. But if multiple records need
modification with the same values, then the UPDATE statement is the best fit.
4. PDF files are also streams of characters, but they have a header and a footer. We
need to add the header and footer to each output before they are accepted as PDF
files. Creating a text file and giving it the PDF extension will not serve the pur-
pose. Of course, we also need to embed tags for text effects along with the text to
achieve the embellishments on the text or diagrams. Alternatively, utilities are
available to convert textual matter into PDF documents. We can call them in our
programs and direct the output through them to create PDF files.
5. There could be other types of files that would require us to send outputs to them.
We need to study the requirements of each file, study the usage guide of the devel-
opment platform, learn the series of commands and tags that are essential in send-
ing output to these files, and implement them in our programs.
Then each machine, depending on its class, would have some common functionality. For
example, all cars have to move, brake, accelerate, and so on. Each washing machine will
have wash cycle, rinse cycle, start, stop, warnings, and so on. These will be common to a
class of machines but specific to every model or make. We need to learn these specifics for
each occasion and code our programs suitably.
One very important aspect we need to understand to be able to write programs to con-
trol machines is that these programs rely heavily on using system calls. A “system call” is
a programming statement in which we issue a command to a machine using the facility
provided by the development platform to pass control to another program and receive
control or feedback from that program. In fact, issuing SQL statements for managing
database tables is also a system call. Each development platform provides keywords for using
system calls. We need to understand these keywords thoroughly. The following process
takes place when using system calls:
1. When the computer encounters a system call during the course of program execu-
tion, it places the program execution in wait state and passes on the command to
the designated machine.
2. The machine receives the input from the computer through its communication
hardware and initiates its internal program execution.
3. The machine executes that initiated program and performs the ordered functions.
4. The machine sends back the result of execution to the computer. The result could be:
a. Signal to indicate that the program was successfully executed. In this case, the
computer takes the waiting program out of wait state and puts it in ready
state so that it can execute the next instruction in the program.
b. Signal that the program execution failed with an error code. Now the computer
decodes the error code and hands over control to the subprogram that
handles errors.
c. Signal with the information requested by the computer. In this case, the computer
passes the received result to the waiting program and puts it back in
ready state for execution.
This is how a system call is handled by the computer. So, when attempting to write
machine-control programs, we need to master the methodology of coding system calls
and also learn the programming commands provided by the machine.
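The wait-and-resume cycle described above can be sketched in Python with the subprocess module from the standard library, which passes control to another program and waits for its result. The call_program function is an assumption made for this sketch, and a tiny Python command plays the part of the machine's internal program, since a real machine would need its own commands and communication hardware.

```python
import subprocess
import sys


def call_program(arguments, timeout_seconds=30):
    """Pass control to another program and wait for its result."""
    try:
        # The calling program waits here until the other program finishes.
        result = subprocess.run(
            arguments, capture_output=True, text=True, timeout=timeout_seconds
        )
    except (OSError, subprocess.TimeoutExpired) as error:
        return False, f"could not run the program: {error}"
    if result.returncode != 0:
        # Failure signal: hand the error code to the error-handling logic.
        return False, f"error code {result.returncode}"
    # Success signal, possibly carrying the requested information.
    return True, result.stdout.strip()


# Stand-in for a machine's program: a one-line Python command.
print(call_program([sys.executable, "-c", "print('wash cycle started')"]))
```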
1. Write
2. Print
3. Printf
4. Display
5. And so on
where:
• Variables are data items
• Constants are hard-coded values
• “\n” indicates printing a line-feed character or moving the cursor to the next line
We can also format the output as we need to. Other output statements have different syn-
tax. We have to learn the syntax rules of these output statements in the programming
language selected for our project. These statements are now mostly used to deliver outputs
to the printer when we do not use a report-generation utility. We used them and, until the
GUI came along, they were the only choice. But now, with GUI, we just assign the output
to controls on the screen using assignment statements.
We need to deliver the output of processing on many occasions. In fact, we deliver out-
put to the screen much more than to a printer or a hard copy in commercial environments.
Sometimes, we do both. In this section, let us see how to write programs to send output
to a screen.
1. Place the control at an appropriate position on the screen, preferably
in the middle.
2. Provide adequate width for the control but see that horizontal scrolling is avoided
if possible. If horizontal scrolling becomes inevitable, see that the size of the con-
trol is kept as close to the screen size as possible.
3. Provide adequate width for each of the columns. As a guideline, the
width of a column needs to be equal to the size of the data that is assigned to it.
4. If the control provides a facility to keep some space between adjacent columns,
provide a gap of half-a-character between adjacent columns. If such a facility is not
available in the control, then we need to provide a gap of one character between
adjacent columns. This gap will help the users in distinguishing the data easily in
adjacent columns without getting confused.
5. We need to provide control statistics on the last/first screen to help the users in
ascertaining the efficacy of data processing. Control statistics are explained in the
subsequent sections of this chapter.
6. Alphanumeric data has to be left-justified in the column and numeric data has
to be right-justified. Always include at least two digits after the decimal point in
numeric values. All column headings have to be justified the same way the data in
columns is.
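The guidelines on column width, gaps, and justification can be sketched in Python for a plain-text display. The column layout, the one-character gap, and the function names are assumptions made for this sketch; a GUI control would offer its own properties for the same purpose.

```python
# Hypothetical column layout: (heading, width, is_numeric).
COLUMNS = [("Name", 10, False), ("City", 8, False), ("Amount", 10, True)]
GAP = " "  # one character between adjacent columns


def format_headings():
    """Justify each heading the same way as the data below it."""
    cells = [h.rjust(w) if numeric else h.ljust(w) for h, w, numeric in COLUMNS]
    return GAP.join(cells)


def format_row(values):
    """Left-justify text; right-justify numbers with two decimal places."""
    cells = []
    for value, (_, width, numeric) in zip(values, COLUMNS):
        cells.append(f"{value:>{width}.2f}" if numeric else f"{value:<{width}}")
    return GAP.join(cells)


print(format_headings())
print(format_row(["JOHN", "BOSTON", 50000]))
print(format_row(["SHANE", "PERTH", 100000.5]))
```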
When such a control is not available, we need to format the data on the screen using pro-
gramming statements. We need to use a loop to display the data on the screen and embed
the formatting statements within the loop. Of course, present-day SDKs provide
some sort of control that facilitates the display of bulk data in columns. Here are the
statements that help us in formatting the output:
1. Locate on screen: Along with this keyword, we need to supply the row number and
the column number in characters. Screens now come in various sizes, so
we need to pick the screen size that is most popular at the moment and program
our output for it. We need to use a variable for the row number so that it can be
incremented in every iteration of the loop. For the column number, we can use a
constant. This is one occasion, perhaps, where we can hard-code the values without
being accused of bad programming! But it is a better practice to use a variable for
the first column and add a number to this variable for the subsequent columns, like
col_loc+4, col_loc+9, col_loc+15, and so on.
2. Use a loop to retrieve data and position it on the screen using the same coordinates
as in the column headings.
3. Count the number of records displayed and when it reaches one less than the number
of rows that can be accommodated on the screen, hold the display and flash a mes-
sage such as “Press any key to continue….” This will allow the user to read the rows
displayed and see if the desired information is available in the displayed records.
4. Alternatively, we may place navigation buttons at the bottom so the user can go forward
or backward and also to the first record or the last record as she/he desires.
This has become the standard practice today.
5. Based on the user’s choice, perform the desired action, displaying the desired
records until the user chooses to stop browsing the data and selects another
available option.
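The loop that counts the displayed records and holds the display can be sketched in Python as follows. The function names are assumptions made for this sketch, and because the standard input function waits for the Enter key rather than any key, the prompt is worded accordingly.

```python
def pages(records, screen_rows):
    """Split records into screenfuls, keeping one row free for the prompt."""
    rows_per_page = screen_rows - 1
    for start in range(0, len(records), rows_per_page):
        yield records[start:start + rows_per_page]


def browse(records, screen_rows, wait=input, show=print):
    """Display the records one screenful at a time; return the rows shown."""
    shown = 0
    for page in pages(records, screen_rows):
        for record in page:
            show(record)
            shown += 1
        if shown < len(records):
            # Hold the display until the user is ready for the next screen.
            wait("Press Enter to continue....")
    return shown
```

Calling browse(list_of_lines, 25) would display 24 lines at a time on a 25-row screen. The wait and show parameters exist only so that the pause and the display can be replaced, for example by navigation buttons.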
Output to Enquiries
Enquiries are a very common action needed by users. For example, in a ticket-booking
scenario, the user enquires if a ticket is available for purchase. For this, we first present the
user a menu of choices and, after the user selects from the available choices, we retrieve the
desired information and present it on the screen. This information may even be bulky,
needing us to display it in multiple screens one after the other as detailed in the previous section.
If the desired information is just one item, we display it using controls like the text box,
combo box, grid, or any other suitable control by moving the retrieved information into
those controls.
Usually, organizations have software designers performing the design and we program-
mers need to implement it. Positioning of controls and the logic are given by the design-
ers. We need to convert that logic into programming statements, so the functionality is
achieved. Some organizations use graphics designers to provide the screen layout and
we just have to write programs for the controls provided on the screen so that the desired
actions are performed flawlessly.
Output to Printers
In the earlier days of batch processing, the printer was the default output device. In those
days, we formatted the output and sent it to the printer. We, the programmers,
controlled exactly how the output appeared on the printer. Then utilities and tools were
developed to make the job of delivering output to the printer easier for us programmers.
We now have report-generation tools that allow us to design the report on the screen and
then call it in our programs. Let us first understand the contents of the report:
1. Page heading: This would appear at the top of every page. While there is no restric-
tion on the number of lines we can occupy for page heading, we restrict it to a
maximum of four lines. If we use more lines for this purpose, we would be left
with less space to place our data.
a. The first line would contain the name of the organization. If we are using
stationery that has the logo of the organization or the organization’s name as a
watermark at some place, then we can eliminate this line.
b. The second line can optionally have the department name. This is not used
unless there are many departments in the organization.
c. The third line would have the name of the report along with the date of gen-
eration and page number. This line is mandatory.
d. We usually have one more line with the date of the report data, the page number,
and the total number of pages. The date is usually at the top of the page on
the same line as the report name, at either the left-hand corner or the
right-hand corner. The page numbers would typically be at the bottom of the
page on the right-hand corner.
2. Column headings: If we are presenting data in a tabular manner, we need to have
column headings. We generally allocate three lines for this purpose, but we may
need to use four lines sometimes.
a. The first and the last lines of the column headings shall be dashed-lines to
demarcate the headings from the data.
b. We place the column headings between these two lines. We need to restrict the
size of each column heading to the width provided for the data in that column. In
some cases, the heading may be longer than the data; if so, we need to abbreviate it,
if possible. If it is not possible to abbreviate the column heading, we need to put it
in two lines, one below the other, taking away one more line from the data area.
c. Sometimes, we may accommodate more than one data item in the column. For
example, it is common to place name and address in one column, especially
when there is a space crunch. In such cases, we need to use more lines for col-
umn headings. But we need to minimize the number of lines used for columns
so we can allocate more lines for the data area. While the column headings need
to explain the contents of the column, they need to be brief to conserve space.
3. Body of the report: In this place, we present the data retrieved and processed as
necessary. We need to ensure the following when presenting the data.
a. The alphanumeric data needs to be aligned with the left side of the column.
b. The numeric data needs to be aligned with the right side of the column. In
presenting numeric data, we need to maintain uniformity in respect of the
decimal point. We need to either use the decimal point in all cases or not
use it at all. We should not present the decimal point in some
cases and leave it out in others. The number of digits after the decimal point
also needs to be the same for all values presented in the column. If some values
have fewer digits than others, then we need to pad the blank spaces with
zeroes positioned appropriately.
c. If any value is longer than the width provided for the data, we need to wrap
it around to the next line rather than either truncate it or occupy the adjacent
column space. We need to check the size of the data for a size error and trap it
to protect the integrity of the report.
4. Page totals: It is customary to provide page totals of all significant numeric columns at the
bottom of the page, except for such numeric values as serial numbers, identification
numbers, or codes. The page totals occupy three lines, with the
first and the last lines being dashed lines to demarcate the data and the middle
line containing the page totals. The page totals may be longer than the space
provided, and we need to either provide adequate column width or wrap the text. In
some cases, I have seen one extra line used for page totals, or totals presented in
alternate lines to decongest and avoid size problems with the derived values.
5. Running totals: Running totals are not essential, but in some important applications
involving large sums of money, running totals are used. A running total
is the sum of the data presented on the report from the first page until the present
page. On the first page, the page total and the running total will be the same. On the
second page, the running total will be the sum of the previous page total and the
present page total. Usually, the report-generation tool provides this facility. We
usually present the running totals just below the bottom line of the page totals, and
then use another line to demarcate the running totals.
6. Grand totals: We present the grand total at the bottom of the last page. The grand
total is the sum of all the data presented from the first page to the last page. On the last
page, the running total and the grand total would be equal. Of course, when we present
running totals, we need not present the grand total. If we are not presenting
running totals, we need to present the grand total. We use three or four lines
for this purpose. The first and last lines will be demarcation lines and the middle
one or two lines are used for presenting the totals.
Report-generation tools allow us to define all these and connect the report to the database
tables, as well as set relations between the selected tables. Then they also allow us to see a pre-
view of the report as it would appear. These tools also allow us to define a subreport (a report
within a report) so that we can generate really complex reports on the fly. When executed,
the report-generation tool produces the report and displays it on the screen. It enables the
user to save the report in a file in the most popular file
formats, including MS-Excel, PDF, and Word. As programmers, we need to master the
report-generation tool selected for use in our organization. It saves significant programming
effort as well as the tedium of programming and testing the required reports. These tools also
allow the report to be printed on the printer connected to the computer or the network.
If we are using a report-generation tool, we need to include routines that call the report-
generation tool, along with the name of the report we designed, in our programs to gener-
ate the desired report with all the features described earlier.
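When no report-generation tool is at hand, the page totals, running totals, and grand total described earlier have to be computed in our own program. The Python sketch below shows one way to do it for a single numeric column; the page_totals function and the sample amounts are assumptions made for this illustration.

```python
def page_totals(amounts, lines_per_page):
    """Return (page total, running total) for each page, plus the grand total."""
    totals = []
    running_total = 0
    for start in range(0, len(amounts), lines_per_page):
        page_total = sum(amounts[start:start + lines_per_page])
        running_total += page_total
        totals.append((page_total, running_total))
    # After the last page, the running total is the grand total.
    return totals, running_total


totals, grand_total = page_totals([100, 200, 300, 400, 500], 2)
for page_number, (page_total, running_total) in enumerate(totals, start=1):
    print(f"Page {page_number}: page total {page_total:>8.2f}  running total {running_total:>8.2f}")
print(f"Grand total {grand_total:>8.2f}")
```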
Control Statistics
We need to present control statistics in all reports presenting bulk data. First, what is bulk
data? Opinions differ, but in my humble opinion, if we are extracting data from a large
table and presenting most of it in the report, I would say it is bulk data. To be called
bulk data, the report must span more than three pages at a minimum. If we are counting
records or lines in the report, in my humble opinion, 500 records or more can be called
bulk data. It would be ridiculous to classify a two-page report as presenting bulk data.
Control statistics help the users in ascertaining the efficacy of the processing. We present
the following data as control statistics:
1. Total number of records processed: If we are extracting information from a single table,
this information is easy to obtain. If we are using multiple tables, determining this
value becomes dicey. In my humble opinion, it is better to present the number of
records in the main master table on which other tables depend. Alternatively, we
may present each table name and the number of records in that table for all tables
considered in the report. This is going too far, but if the customer demands it, we
need to provide this data.
2. Number of records included: We need to count the number of records included in the
report and present this value. This value would be less than or equal to the total
number of processed records. We may use the same rule we followed for deriving
the total number of records processed for this value also.
3. We also present grand totals of significant numeric values, such as total money
received or paid, the balance, and so on as applicable for the application at hand.
4. We may also present any other values specific to the functionality at hand based
on the need of the users.
5. In fact, we may present the summary of all significant numeric values presented
in the report.
Usually, we output the control statistics on a separate page at the end or at the beginning
of the report.
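Gathering the control statistics can be sketched in Python as follows. The control_statistics function, the record layout, and the selection rule are hypothetical; they only show the three values most commonly presented: records processed, records included, and a grand total.

```python
def control_statistics(records, is_included, amount_field):
    """Summarize a report run: records processed, included, and grand total."""
    included = [record for record in records if is_included(record)]
    return {
        "records_processed": len(records),
        "records_included": len(included),
        "grand_total": sum(record[amount_field] for record in included),
    }


# Hypothetical data: three payments, of which two qualify for the report.
payments = [
    {"name": "JOHN", "amount": 500},
    {"name": "SHANE", "amount": 0},
    {"name": "MARY", "amount": 250},
]
statistics = control_statistics(payments, lambda p: p["amount"] > 0, "amount")
for label, value in statistics.items():
    print(f"{label:<20}{value:>10}")
```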
1. To address: This is usually the IP address of the destination to which we are trans-
mitting information.
2. Type of information: We need to provide the type of information being transmit-
ted by us. It could be a simple text message, a graphic, a file, an email, or some-
thing else.
3. If it is an email, we need to provide the email ID from which we are transmitting
the message, the subject, and the additional IDs if we are sending a cc (carbon
copy) or bcc (blind carbon copy).
4. If we are sending a database connection string, we need to provide the string with
all details like the IP address, the type of security applicable, the user ID, the pass-
word, the name of the database, and any other details specific to the situation.
We need to use the construct provided by the SDK and provide the required informa-
tion as detailed earlier in the specified format along with the construct. We need to pro-
vide error-trapping routines after this statement, as errors, like not being connected to the
Internet, can hinder the transmission. We also need to capture the successful transmission
message and flash it back to the user.
Sending information on the Internet is an involved subject but, fortunately, the SDK takes
care of all the backend work. All we need to do is utilize the constructs provided by
the SDK, code the error-trapping routine, and capture the transmission information
so we can inform the user with a suitable message.
1. Ensure that the network is functional and that we are connected to it.
2. Send signals to the server/client machine to ensure that it is ready to receive the
message. If it is not ready, then wait for some time and resend the signal, asking
for permission to transmit. If the permission to transmit is not received a preset
number of times, abort and flash the error message as designed.
3. Send the message in packets and receive acknowledgement that it is received
properly and resend it if it was not received or not received properly.
4. Once all data is transmitted and acknowledgements are received, terminate trans-
mission and flash a successful transmission completion message.
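The resend-until-acknowledged logic in steps 3 and 4, together with the preset retry limit from step 2, can be sketched in Python roughly as follows. This is only a simulation under stated assumptions: the names send_with_retries and make_flaky_channel are hypothetical, and the "channel" is a stand-in function rather than a real network construct from any SDK.

```python
def send_with_retries(packets, channel, max_attempts=3):
    """Send each packet, resending until acknowledged or attempts run out.

    `channel` is a hypothetical callable that returns True when the
    receiver acknowledges a packet and False otherwise.
    """
    for number, packet in enumerate(packets, start=1):
        for _attempt in range(max_attempts):
            if channel(packet):
                break  # acknowledged; move on to the next packet
        else:
            # The inner loop ended without an acknowledgement: abort.
            return f"Transmission failed at packet {number}"
    return "Transmission completed successfully"


def make_flaky_channel(failures_before_success):
    """Build a fake channel that drops each packet a fixed number of times."""
    attempts = {}

    def channel(packet):
        attempts[packet] = attempts.get(packet, 0) + 1
        return attempts[packet] > failures_before_success

    return channel


message = ["Dear", "customer", "thanks"]
print(send_with_retries(message, make_flaky_channel(1)))  # each packet needs one resend
print(send_with_retries(message, make_flaky_channel(5)))  # the receiver never acknowledges in time
```

In a real program, the message returned here would be flashed to the user, and the retry limit would come from the application's design.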
Final Words
Output is the most important part of data processing, as the very purpose of data processing
is to deliver outputs. In this chapter, we discussed how to program the delivery of outputs
to the most frequently used output devices. All we need to do now is to learn the specific
constructs provided by the programming language and build our program to deliver the
outputs to the desired device.
11
Other Statements
Introduction
While the input statements, output statements, processing statements, and control state-
ments form the most important sets of statements, we have many other functions to be
performed by programs. These statement classes are:
1. Documentation statements
2. Starting and ending statements
3. Declaration statements
4. System calls
5. Inter-program communication
6. Interrupt handling
7. Device handling statements
8. Conversion from numbers to words
Documentation Statements
Why are documentation statements needed inside computer programs? They belong in
user manuals—don’t they? This is a relevant question. Inside computer programs, we
do not document how to operate the program, but one fact we all have to understand
and come to terms with is that any good program that performs useful functions and is
put into production will certainly need maintenance. By the term maintenance, I mean
corrections that need to be implemented in the program due to changes in the business
or technical environment. The tax rules keep changing; the technology keeps getting
upgraded; the business scenario keeps changing; and in this manner, there will be so
many changes that keep happening, and our programs must be updated periodically.
You may be surprised that many programs written in the 1970s are still in production
serving their users effectively. Even a major event like the Y2K (Year 2000) phenom-
enon could not replace these programs! While no accurate figures were available on
the amount of money spent on Y2K maintenance, we can safely conclude that about
5–6 billion dollars was spent by the USA alone! The cost was huge because most of
the programs did not have the proper documentation to assist in easily upgrading the
programs. Of course, in-line documentation was made a part of good programming
standards way back in the late 1970s. It is now unthinkable to write program code without
documentation statements. We write documentation statements in programs in the following
places:
1. Program header: The program header is placed at the beginning of the program and
would consist of:
a. The name of the organization for which the program was developed and
a statement indicating that the intellectual property rights belong to that
organization
b. The author who initially developed that program
c. The date on which the program was originally written
d. The function of the program briefly in one or two lines
2. Every time maintenance is carried out on the program, we document the
maintenance history as an extension of the program header, consisting of:
a. The date of maintenance
b. The author who carried out the maintenance
c. The details of the software change, in brief
d. The location (line numbers) inside the program
e. The reference to the maintenance request that initiated the change
3. At the beginning of every set of control statements, we document the action
performed and the result to be achieved by the set of control statements.
4. At the beginning of every set of looping statements, we document the purpose of
the loop and how it gets terminated.
5. Whenever we use file or table open statements, we document the expected
contents of the file/table to be opened.
6. If we use any complicated set of processing statements and long equations, we
document the functionality briefly.
Documentation statements are ignored by the compiler. These statements would not be
compiled into the executable program. They are there only to assist the software mainte-
nance personnel. Documentation statements are also referred to as commenting statements.
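As an illustration, here is how a program header, a maintenance history, and in-line comments might look in Python, where documentation statements begin with the "#" character. The organization, authors, dates, line numbers, and request number are all hypothetical and are there only to show the layout.

```python
# ---------------------------------------------------------------
# Organization : Example Traders Ltd. (hypothetical); all rights reserved
# Author       : A. Programmer (hypothetical)
# Written on   : 2020-01-15
# Function     : Computes net pay from basic pay and deductions.
#
# Maintenance history
# Date        Author         Change                    Lines   Request
# 2021-04-01  B. Maintainer  Tax rate changed to 12%   16      MR-0042
# ---------------------------------------------------------------

TAX_RATE = 0.12  # tax rate expressed as a fraction of basic pay


def net_pay(basic_pay, deductions):
    # Tax is charged on basic pay first; deductions are subtracted afterwards.
    return basic_pay - basic_pay * TAX_RATE - deductions


# Loop purpose: print the net pay for each employee record.
# Termination: the loop ends when the list of records is exhausted.
for name, basic, deductions in [("Asha", 1000.0, 50.0), ("Ravi", 2000.0, 0.0)]:
    print(name, net_pay(basic, deductions))
```

None of the lines beginning with "#" affect what the program does; they exist only for the person who maintains it later.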
1. We have different types of programs like the main program, the subprogram, the
function, the method, and so on. We use the first statement to indicate the type of
program to the computer. With the help of this statement, the computer can accord
appropriate treatment to the program.
2. In some programming languages, we place some code even before the first state-
ment like header files. In such cases, this statement tells the computer to begin the
program execution from here on.
COBOL used the phrase STOP RUN as the program termination statement. In the C language
family, the closing curly brace (“}”) of the main function marks the end of the program. Most
of the programming languages use “Return” as the terminating statement for the subprograms.
The actual termination statement differs between programming languages, but most
programming languages provide a program termination statement to
communicate to the OS that there are no more statements to be executed in this program.
Declaration Statements
When we write programs, we use a predefined set of keywords defined by the suppliers of
the SDK. This set of predefined keywords tells the computer what action is to be taken.
These are like verbs in the parts of speech. These need to be followed by the objects on
which to take the specified action. We add our own words to supply data to the action
keywords supplied as part of the SDK. These could be variables, constants, and calls to other
programs. During the compilation process, the predefined keywords supplied as part of the
SDK are translated into computer instructions (generally referred to as opcodes), which are
binary numbers. Then the words supplied by us are translated into addresses in the RAM where
data can be found. If the word supplied by us is a call to another program, the OS resolves
it as another program and takes appropriate action.
But how does the compiler know what words are valid among all the words in the
program? While compiling the program, the compiler searches the set of keywords sup-
plied as part of the SDK first and resolves them. Then, for the other words, it searches
the library of routines supplied along with the SDK and resolves those words. Then it
searches the library of routines we added as part of the project and resolves some words.
The remaining words need to be declared within the program. If any word is not found in
any of them, the compiler throws up an error.
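The search order described above can be sketched in Python as a simple lookup. This is a toy model, not how any particular compiler is implemented; the keyword and routine names in the three sets are hypothetical.

```python
KEYWORDS = {"if", "while", "print"}   # hypothetical SDK keywords
SDK_LIBRARY = {"sqrt", "open_file"}   # hypothetical routines shipped with the SDK
PROJECT_LIBRARY = {"compute_tax"}     # hypothetical routines added to the project


def resolve(word, declared_words):
    """Report where `word` is resolved, searching in the order described."""
    if word in KEYWORDS:
        return "keyword"
    if word in SDK_LIBRARY:
        return "SDK library routine"
    if word in PROJECT_LIBRARY:
        return "project library routine"
    if word in declared_words:
        return "declared in program"
    # Not found anywhere: this models the compiler throwing up an error.
    raise NameError(f"'{word}' is not declared")


declared = {"basic_pay"}
for word in ["while", "sqrt", "compute_tax", "basic_pay"]:
    print(word, "->", resolve(word, declared))
```

Calling resolve with a word that appears in none of the four places raises an error, which is the situation a declaration statement is meant to prevent.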
Declaration statements aid us in declaring those words so that the compiler can resolve them
and prepare the executable code. Declaration statements help us in declaring the data,
names of subprograms, functions, and so on. Some declaration statements are implicit. In
the FORTRAN language, names beginning with the letters I, J, K, L, M, and N denote integers,
of which I and J were extensively used. They were so extensively used that whenever we needed
a counter, we automatically used I! In BASIC, any word with the suffix of “$” denoted a
string, and all other words denoted single-precision floating point numbers. Even
in present day programming languages, the word “Sub” declares a subprogram. In the
explicit declaration statements, we declare data variables, arrays, files, database connection
strings, table names, and so on.
In the Visual Basic family, we use the keyword Dim to declare variables. Then we supplement
this keyword with other keywords like Integer to indicate the integer type of data;
Double for double-precision floating point number; Date for declaring a variable as a date
data type; String to declare a word or a string of characters; and so on.
In the C language family, we use data-type keywords as the declaration of the data.
We use keywords like int for declaring an integer-type variable; string (available in some
members of the family, such as C#) for a string of characters for names; bool (boolean in
Java) for a Boolean-type variable; double for double-precision numeric data; and so on.
When we declare a variable, a corresponding amount of RAM is allocated to accommodate
the size declared and is referred to by the name we used in the declaration statement.
Depending on the language, that location may hold a default value such as NULL or zero, or
whatever was left there earlier. Therefore, we have to explicitly move an
initial value into that variable space so that we do not encounter defects during program
execution due to oversight. This process of moving an initial value into the declared variable
is called “initialization.” In fact, many defects encountered in the field during production
are due to non-initialization of variables. Most languages allow the declaration of
variables to be combined with the initialization. For example, in the C language family, the
declaration statement and the initialization statement would be combined and written thus:
int basic_pay = 0;
Most languages allow initializing multiple variables in one single statement, but one
prerequisite is that they all must be of the same type. For example, here is a sample
statement in the C language family declaring and initializing a set of integers:
int i = 0, j = 0, k = 0, l = 0;
We need to learn the different keywords provided by the SDK for declaring data and
program names and then declare the required items as required by our program. Here
are a few common rules to be adhered to while using declaration statements:
1. The syntax rules specified for the declaration statement must be adhered to with-
out exception.
2. The name of the variable or data name being declared would have some rules
such as:
a. The length of the variable name is restricted. In some programming languages,
it is restricted to 8 characters; in some it is allowed up to 40 characters long.
b. It must contain no spaces inside the name, that is, it has to be one word.
c. It permits letters, numerals, and one or two special characters like the
underscore (“_”) or the dash (“-”) characters. No other characters are usually
permitted.
d. In most cases, a variable name must begin with a letter.
e. There would be other rules specific to the programming language, and we
need to adhere to them while defining our own words for declaration.
3. Some programming languages allow declaration statements at any point inside
the program, but it is a good programming practice to place all the declaration
statements at the beginning of the program. That way, it becomes feasible to assess
the amount of RAM being used by the program easily without scrolling through
the entire program.
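The naming rules listed in item 2 can be expressed as a small checking routine. The sketch below is in Python and makes two assumptions that vary between real languages: the maximum length is taken as 40 characters, and only the underscore is accepted as a special character. The name is_valid_name is hypothetical.

```python
import re

MAX_LENGTH = 40  # assumed limit; real languages differ

# A letter first, then any mix of letters, numerals, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name):
    """Return True when `name` satisfies the assumed naming rules."""
    return len(name) <= MAX_LENGTH and NAME_PATTERN.fullmatch(name) is not None


for candidate in ["basic_pay", "2nd_pay", "basic pay", "pay%"]:
    print(candidate, is_valid_name(candidate))
```

Only the first candidate passes: the second begins with a numeral, the third contains a space, and the fourth contains a character that is not permitted.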
1. Local variables: Local variables are local to the program in which they are declared.
The RAM allocated to such variables is released immediately upon closure of
that program. It is especially useful in subprograms. The variables declared in
subprograms are closed and their RAM is released as soon as the subprogram
completes execution. That way, we do not hold on to finite RAM unnecessarily.
These variables can be accessed only by the program in which they are declared.
These variables are also referred to as dynamic variables.
2. Global variables: Global variables, as the name suggests, are accessible to all the sub-
programs and the main program. Usually all the variables declared in the main
program are global variables and are available to all subprograms in the program.
They will be released only when the main program completes execution and is
closed by the OS. Global variables are also referred to as static variables. When
a subprogram needs to return some values to the main program, it makes better
sense to use global variables than to declare additional local variables in the subprogram;
otherwise, we would be using double the amount of RAM for the duration of the execution of
the subprogram.
As you can see, each type has its own place. My humble suggestion is to use local variables
for processing work inside the subprogram but use global variables for exchange of infor-
mation with main program.
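The difference between the two kinds of variables can be seen in a short Python sketch. Python is used here only for illustration; the variable and function names are hypothetical, and other languages declare global variables differently.

```python
total_pay = 0  # global variable: declared in the main program


def add_bonus(basic_pay):
    """Subprogram that hands its result back through a global variable."""
    global total_pay             # use the main program's variable
    bonus = basic_pay // 10      # local variable: exists only during this call
    total_pay = basic_pay + bonus


add_bonus(1000)
print(total_pay)                 # the result is visible to the main program

# After the subprogram ends, its local variable is no longer available.
local_is_visible = "bonus" in globals()
print(local_is_visible)
```

Here the local variable bonus does the processing work inside the subprogram, while the global variable total_pay carries the result back to the main program.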
System Calls
We write programs in higher-level languages. The first level at which the computer
understands instructions is in binary numbers, that is, zeroes and ones. The next level
is the assembly language of the CPU, and then we have programming languages. The
programs we write are translated into machine language by the compiler. The program-
ming languages make use of the facilities provided by the operating system, but not all of
the facilities provided by the OS are used by the programming languages. For example,
if we wish to read the directory listing to see if a certain file exists, we may not have the
keyword to do that. Similarly, creating a new file on the disk, listing all the devices
connected to the computer, and so on, are operations for which keywords are usually not
available in the programming languages. Therefore, most programming languages provide a
facility to call the utilities provided by the OS or other utilities loaded on the computer,
execute them from within our program, and process the results obtained from such execution.
This facility is referred to as a system call. System calls enable us to call a program
resident on our computer and execute it from within our program. They also let us
read information held by the system, such as the processes in execution, the database
connections, the active process, and so on, and use that information in our program.
Of course, to read such information, the user executing the program must have the
necessary security privileges.
Various facilities are provided by different programming languages to make system
calls. In the C language family, the library function “system” is used to make calls to the
system. The command to make a call to the system looks something like this:
system(command);
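For comparison, here is a minimal Python sketch of the same idea, using the standard os and subprocess modules. The helper names file_exists and run_program are hypothetical. To keep the example portable, the program being called is simply another copy of the Python interpreter, but any program resident on the computer could be named instead.

```python
import os
import subprocess
import sys


def file_exists(path):
    """Ask the OS whether a file or directory exists at `path`."""
    return os.path.exists(path)


def run_program(arguments):
    """Run another program and return its exit code and captured output."""
    completed = subprocess.run(arguments, capture_output=True, text=True)
    return completed.returncode, completed.stdout.strip()


# Call another program from within our program and process its result.
code, output = run_program([sys.executable, "-c", "print('hello from a child program')"])
print(code, output)

# Check the directory listing for an entry we expect to be there.
print(file_exists(os.getcwd()))
```

An exit code of zero conventionally means that the called program finished without error, so our program can test it before using the output.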
API Programming
APIs (Application Programming Interfaces) are like system calls. While system calls are
used to call programs that are on the same computer and OS, APIs are used to make calls
to programs that are not of this OS and may not even be on this computer. APIs are what
make the applications like Google Maps work on different computers, including mobile
phones. An API is a set of subprograms, protocols, and software utilities. We can use these
in our programs to call other software like Google Maps, YouTube videos, and Flickr pho-
tos and make them available to our users.
To write programs using APIs, we need to read the manual of the specific API and then
issue appropriate statements in our program. However, we need to supply the following
information when calling APIs in our program:
Once we provide this information and issue the statement in our program, it will be com-
piled, and the desired action will be achieved during the execution of our program. We
need to study the programming manual of the specific API we are going to use before
programming the API. We need to learn the keywords it provides for calling it from our
programs and then implement them in our programs as required.
Inter-Program Communication
This is an important topic deserving full treatment, and I have dedicated Chapter 13 to
this topic.
Interrupt Handling
An interrupt is a high-priority instruction to the processor either from a hardware device
or a software program. When such an interrupt is placed on the CPU, it temporarily sus-
pends the execution of the present program and attends to the interrupt received. Actually,
the CPU executes the current instruction and instead of fetching and executing the next
instruction, it attends to the interrupt. An interrupt is usually placed on the CPU by
hardware devices, but software also places interrupts, especially when an error that is not
trapped by the program is encountered. The divide-by-zero error is one good example of
such a software error that places an interrupt on the CPU.
TABLE 11.1 Interesting Interrupts of PC (columns: Interrupt Number, Interrupt Description)
For example, when we type something on the keyboard, an interrupt is placed on the
CPU. If we attempt to read from the disk, the disk places an interrupt on the CPU when
it is ready to receive or send information. All such interrupts are programmed by the
developers of the system software.
Two terms are associated with interrupts. One is the program that needs to be executed
when an interrupt is placed on the CPU. It is referred to as the “interrupt handler,” or
ISR (Interrupt Service Routine). The second term is the “interrupt vector,” which is the
reference to the respective ISR. All the device drivers are programmed using interrupt
programming.
We need to develop ISRs in low-level languages or the C family languages which
provide constructs to handle interrupts. High-level business-oriented languages like
COBOL usually do not provide interrupt programming facilities.
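Although real ISRs are written in low-level languages, the relationship between an interrupt vector and its ISRs can be modelled in a few lines of Python. This is purely a simulation: no hardware is involved, the function names are hypothetical, and the interrupt numbers are chosen only for illustration.

```python
def divide_error_isr():
    """Simulated ISR for a divide-by-zero error."""
    return "divide error handled"


def keyboard_isr():
    """Simulated ISR for a key press."""
    return "key press handled"


# The "interrupt vector": each interrupt number refers to its ISR.
interrupt_vector = {0: divide_error_isr, 9: keyboard_isr}


def raise_interrupt(number):
    """Look up the ISR for `number` in the vector and execute it."""
    handler = interrupt_vector.get(number)
    if handler is None:
        return f"no handler installed for interrupt {number}"
    return handler()


print(raise_interrupt(9))
print(raise_interrupt(0))
print(raise_interrupt(5))
```

The point of the table is that the code raising the interrupt needs to know only the number; which routine actually runs is decided by whatever reference is stored against that number.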
Interrupt programming is like the usual programming but is focused on device handling
and hardware control. To be able to develop ISRs, we need to study the programming
manual of the device and then develop the needed ISR. Some interesting interrupts of PC
are given in Table 11.1.
and so on), and handshake commands (in session, session ended, and so on). A device
driver program needs to handle all these commands. It also needs to handle all error con-
ditions generated by the device. Device driver programs need to be written in low-level
languages or in the C family, which provides interrupt handling and device programming.
Final Words
This chapter is devoted to explaining the concepts of using advanced aspects of pro-
gramming. This is not the usual everyday stuff. These programs are required only occa-
sionally. We attempt these programs only when we have gained reasonable expertise
in a programming language and in writing software in general. We need to exercise
care while developing the code as well as in testing it. The actual implementation of
these aspects differs from OS to OS and from one programming language to another. Not all
programming languages have implemented all these aspects; therefore, when you
are faced with the need to use these concepts in your programming, please thoroughly
read the programming language manual, understand it fully, experiment, and only then
implement.
12
Error Handling
We programmers should not allow this situation. We ought to invariably include an error-
handling routine in every program we develop.
The OS expects faults to develop and result in failures of the programs in execution. The
failure of the program should not be allowed to affect the functioning of the computer and
hinder the performance of other programs in execution. So, whenever a program fails,
that is, the program encounters a fault from which it cannot recover, the OS performs the
following actions:
If a program is aborted (abruptly closed), depending on the OS and the data storage we
used, the following consequences can happen:
1. If there are any flat files that are open, they could be damaged or some data may
be lost.
2. If any transaction is being processed, it may be lost or partially saved, resulting in
issues of data integrity.
3. If the backend DBMS is a weak one, the table may get corrupted.
4. The user may have already paid the money, in scenarios such as ticket booking,
but may not get the desired result. This can lead to a dispute between the user and
the organization. It can even lead to legal hassles.
5. If a parameter file is impacted, we may have to rebuild it. This can result in our
application being down for some time and result in loss of revenue.
Therefore, we need to ensure that every error is handled appropriately by our program.
1. While the OS is thoroughly tested both inside the development facility and in the
field by prospective users in pre-release beta testing, still a few errors lurk inside
the OS. These errors lurk inside that portion of the OS that is not used by the gen-
eral users. The advanced users are few, and the advanced features of the OS are
not frequently used. Therefore, when you use advanced features, the OS is likely
to throw up errors.
2. It has become commonplace to use third-party code libraries for specialist func-
tionality like file management, database connectivity, rules processing, and so on.
These code libraries could have some bugs that could throw up errors when our
code tries to utilize that faulty code.
3. The framework or the app server is also built on a large code base that may have a few
lurking bugs. When our code tries to use that portion of faulty code, errors can
be thrown up.
4. The DBMS packages we use for storing and retrieving data also could have a few
undetected bugs that can throw up errors.
5. We have networking hardware and software that can generate faults and cause
errors in our program execution.
6. It is widely accepted that the best quality that can be humanly achieved is of a
6-sigma level, at which about 3.4 defects still occur per million opportunities; that is,
in a million lines of source code, there will still be about 3 undetected bugs!
Since it is not possible for us to avoid those errors, we simply have to write a routine to
catch those errors and then allow the user an opportunity to smoothly exit that functional-
ity and do something else. We need to strive to avoid program aborts (abrupt closures of
the programs) to protect the integrity of our application data and programs.
Errors are usually classified into three levels:
1. Minor errors: Minor errors are those errors that do not cause obstruction to pro-
cessing or functionality but can cause a nuisance to the users while using the
application. They can be wrong spellings, wrong placement of labels, insufficient
space for entering data, and such other aspects. They certainly cause inconve-
nience and irritate the user but have no negative effect on the functionality. While
the users can continue to use the application, we need to improve the program by
removing the errors and rerelease the application once again.
2. Major errors: Major errors are those that impact the results of processing or cause
obstructions to using the application. Examples are inaccurate results, diminished precision
of results (for example, a truncated result), and other such faults that are recoverable.
The OS allows us to recover from the error. The OS will
not abort the program. The users can continue to use the application by isolating
the erroneous functionality. This isolation has to be achieved by manual means,
which is not desirable. We need to issue a patch immediately, if possible, to correct
the errors or fix the bugs and rerelease the application as soon as possible.
3. Critical errors: Critical errors are those that cause the program failure. The OS can-
not proceed further in the execution of the program. The OS aborts the program if
the program does not provide for error-handling instructions. When critical errors
surface in the program, we need to withdraw it and fix the bugs immediately. We
should not allow the users to continue using the application. We need to rerelease
the application only after the bugs are fixed and tested.
Error handling has two sections, namely, error prevention and error detection.
Error Prevention
To prevent errors from causing faults, we need to foresee all opportunities in our programs
that can cause a fault and take preventive actions. Here are some areas where errors can
raise faults:
1. Arithmetic operations:
a. The example I have been repeating is the denominator in a division operator
becoming zero. Therefore, in our programs, it is essential to split the math-
ematical formula into two or three parts in our program. We need to write a
statement for the portion that comes before the division operation; then the
division operation; then the remaining portion of the formula. We need to
ensure that each division operation must be in a separate statement. We also
need to write a statement before the division statement to check the value of
the denominator and send an error message to the user if it becomes zero. It is
better to have a common routine for this purpose and call it in our programs
whenever we encounter a division operation.
b. Negative numbers: Arithmetic operations may sometimes result in negative
numbers. Negative numbers can cause faults. For example, if we are finding a
square root of a number that is a derived value from an arithmetic operation in
the previous statement, it can result in a fault and lead to failure, aborting the
program execution. Therefore, it is essential to check for the sign of the num-
ber before subjecting it to square root, cube root, or any other such operation
to ensure that a fault is prevented.
c. Multiplication: When we multiply two positive numbers, the resulting value
can become too large for the receiving variable to accommodate. In most devel-
opment platforms, the receiving variable truncates the value if it is too large to
accommodate. This leads to an erroneous result. While multiplication does not
result in a fault and failure, it has the potential to produce an erroneous result.
d. Size of receiving variable: The result of the arithmetical operation is assigned to
some variable. We need to ensure that this receiving variable is large enough
to receive the result.
2. Loops—interminable: Finite loops based on a counter usually do not cause any
issues, but condition-dependent loops can sometimes become infinite loops and
hang up the system. An advanced OS can detect an infinite loop and terminate
the program execution, but most others freeze the system and it may need a
reboot in microcomputers. To prevent faults, we need to ensure that the loop will
certainly terminate. We have to formulate the condition carefully and insert a
mechanism inside the loop that makes the condition become TRUE or FALSE,
as the case may be.
3. Opening files: We open a file that is expected to be in the location specified or
derived by the program. Sometimes, due to some reason, the file may not be there
altogether, or its name could have changed for some reason. In such cases, it results
in a fault and failure. More often than not, we write programs assuming that the
file open operation will not fail. Therefore, to prevent file-opening errors, we need
to check if the operation has been successful and if it is not successful, we need to
give a suitable message and move the user to some other transaction in the system.
4. Connecting databases: In the present day, the database may not be on the same
computer on which our program is running. It is usually on a separate machine
connected through the Internet. The database connection can fail if the Internet
connectivity fails. Internet connectivity can fail if there is a hardware or software
problem. The connection can also fail if the database is corrupted, even though
such a possibility is very slim. Therefore, we need to check for the success of the
connect-database operation before we move on to another program statement to
ensure that the database is indeed connected. If there is a failure, we need to flash
an appropriate message to the user and navigate the system to another transaction.
5. Empty tables: This is another frequent error that we encounter, especially immediately
upon installation and during the first use. As soon as we open a database table
or a flat file, we begin processing the records with the statement “While not EOF.” Now,
if the file or table is empty, it can result in a fault and a failure.
Therefore, we need to ensure that there are records, using appropriate statements,
in the file or table before issuing the statement “While not EOF” to prevent faults.
6. Machine readiness: In programs that interact with machines, like the printer, a
CNC machine, or any other machine controlled by the computer, the machine
may not be ready for a variety of reasons. When our program interacts with the
machine, we must check for errors reported by the machine, flash a suitable mes-
sage whenever the machine throws up an error, and navigate the user to another
functionality.
7. Deadlock: A deadlock is a serious error that is not easily detectable especially in
a microcomputer OS. This can happen only in the case of dedicated resources.
Some resources like the flat files can be opened by only one program at any given
instance of time. Earlier printers were also dedicated, but with the development
of the SPOOL (Simultaneous Peripheral Operations On-Line) facility, this problem
was solved. With spooling, instead of allowing the program to send output directly to the
printer, the OS redirects it to a file and prints it whenever the printer is free. The
following steps explain how a deadlock can occur:
a. Two (or more) programs are running concurrently.
b. Program A opens a flat file X.
c. Program B opens a flat file Y.
d. Program A tries to open file Y for some reference but has to wait because
program B is using it.
e. Program B tries to open file X for some reference but has to wait because
program A is using it.
f. Now, both programs wait indefinitely!
g. This situation is called a deadlock.
h. How do we detect a deadlock? Unless the OS has the facility, our program
cannot detect it.
i. To solve it programmatically, we need to time out our program. If the required
resource is not made available within a specified amount of time, we need to
navigate the user to another functionality, giving an appropriate message to
try again after some time lapse.
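The time-out approach in the last step of item 7 can be sketched in Python, using locks from the standard threading module to stand in for the two dedicated flat files. The names file_x, file_y, and open_both are hypothetical, and the time-out value is arbitrary.

```python
import threading

file_x = threading.Lock()  # stands in for dedicated flat file X
file_y = threading.Lock()  # stands in for dedicated flat file Y


def open_both(first, second, timeout_seconds):
    """Take two dedicated resources; give up if the second stays busy."""
    with first:
        if not second.acquire(timeout=timeout_seconds):
            # Timed out: `first` is released on leaving the with-block,
            # so the other program is not kept waiting for it.
            return False
        try:
            return True  # both resources are held; real work would go here
        finally:
            second.release()


file_y.acquire()  # simulate program B already using file Y
got_both = open_both(file_x, file_y, timeout_seconds=0.1)
file_y.release()  # program B finishes with file Y

print(got_both)                                        # the first try timed out
print(open_both(file_x, file_y, timeout_seconds=0.1))  # trying again later succeeds
```

Because the program gives back file X when it times out, neither program waits indefinitely, and the user can be asked to try again after some time.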
Whenever we try to perform these operations in our program, we need to write appro-
priate statements to catch the error. These operations may not cause any issue in most
cases, but we need to catch the error in those few cases. Therefore, we need to include
error-handling statements at all these places likely to throw up errors. In addition to the
situations mentioned earlier, there could be situation-specific opportunities for causing
faults, and we need to include suitable error-catching statements.
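Some of the preventive checks discussed in this section can be sketched as common routines in Python. This is one possible arrangement, not a prescribed one: the routine names are hypothetical, and each routine returns a pair of (result, error message) so that the caller can decide how to inform the user.

```python
import math
import os
import tempfile


def safe_divide(numerator, denominator):
    """Check the denominator before carrying out the division."""
    if denominator == 0:
        return None, "Cannot divide by zero"
    return numerator / denominator, None


def safe_square_root(value):
    """Check the sign of the number before taking the square root."""
    if value < 0:
        return None, "Cannot take the square root of a negative number"
    return math.sqrt(value), None


def read_records(path):
    """Check that the file is present and not empty before processing it."""
    if not os.path.isfile(path):
        return None, f"File not found: {path}"
    with open(path) as handle:
        records = handle.read().splitlines()
    if not records:
        return None, f"File is empty: {path}"
    return records, None


print(safe_divide(10, 4))
print(safe_divide(10, 0))
print(safe_square_root(-4))

# Demonstrate the empty-file check with a temporary file.
with tempfile.TemporaryDirectory() as folder:
    empty_path = os.path.join(folder, "empty.txt")
    open(empty_path, "w").close()
    empty_result = read_records(empty_path)
print(empty_result[1])
```

A caller would test the second value of each pair and, when it is not empty, flash that message to the user instead of continuing with the operation.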
Handling Errors
First of all, we need to write an error-catching statement after a statement that is likely
to throw up errors. Some of the possible actions that are likely to throw up errors are
described in the previous section. Usually, the error catching statement would begin
with “On error,” but it could be different in different development platforms. Usually, the
error-catching statement directs the program execution to a subprogram that handles the
error. Error-handling is the only scenario when the “Goto” statement is permitted in good
programming practices! We also pass some parameters to the error-handling subroutine,
which include the error number returned by the OS and other data like the subprogram
in which the error was caught, the next subprogram to which the user would be navigated,
and so on. Here is an example of an error-handling instruction set:
a=b/c
On error goto errorhandler(errno, pgmname)
Let us assume a, b, and c are numeric variables. If c becomes zero, then an error will be
thrown up. The “on error” statement will pass on control to the subprogram, “errorhan-
dler,” which needs the error number returned by the OS and the program name as the
parameters. Now the errorhandler subprogram would look something like this:
In the earlier program, we assumed that the error number returned by the OS was one for
division by zero. Then, we display a message box with only the OK button. When the user
clicks the OK button, we navigate the user to the main menu form. Of course, the earlier
program is not proper syntax. It is shown for illustration purposes only.
We used a switch—case statement. We can put all the error numbers and display differ-
ent messages for each of the error numbers. It is a better programming practice to retrieve
the error messages from a database table rather than hardcoding as shown here.
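The idea of trapping an error and handing it to a common handler can be sketched in Python.
This is only an illustration under stated assumptions: the error number 1 for division by
zero, the message text, and the names error_handler and compute_ratio are all hypothetical,
and a dictionary stands in for the database table of messages.

```python
# Assumed error number for division by zero (hypothetical, for illustration).
DIVISION_BY_ZERO = 1

# Stand-in for a database table that maps error numbers to messages.
ERROR_MESSAGES = {
    DIVISION_BY_ZERO: "Cannot divide by zero. Please correct the value and try again.",
}


def error_handler(errno, pgmname):
    """Look up the message for an error number and tag it with the program name."""
    message = ERROR_MESSAGES.get(errno, "An unexpected error occurred.")
    return f"{pgmname}: {message}"


def compute_ratio(b, c):
    """Return (result, None) on success or (None, message) when an error is trapped."""
    try:
        return b / c, None
    except ZeroDivisionError:
        # Control passes to the common handler instead of aborting the program.
        return None, error_handler(DIVISION_BY_ZERO, "compute_ratio")
```

Calling compute_ratio(6, 3) gives the result with no message, while compute_ratio(1, 0)
gives no result and a message that the caller can display before navigating the user
elsewhere.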
1. Close the program and allow the user to restart it. While so doing, we ought to
ensure that we navigate the program execution to the program closure subpro-
gram, which will close all open files, database tables, database connections, and all
such actions that are necessary to protect the integrity of data and other artifacts.
This is not the best alternative because the user gets frustrated, but sometimes we
may not have any other alternative, and only in such circumstances do we need to
resort to this action.
2. If it is feasible, we need to flash a message to the user, give the user an opportunity
to rectify the situation, and redo the action once again. This is the best alternative,
but it is not possible in all cases.
3. We can flash an appropriate message, then allow the user to read it, and once he/
she acknowledges the message, navigate the user to the program from where the
user came to this current functionality. This may not always be possible.
4. We can flash an appropriate message, then allow the user to read it, and once he/
she acknowledges the message, close the current program and navigate the user to
the opening page or landing page. In this case, we do not force the user to restart
the program, and we allow the user to close the program in a smooth manner to
protect the integrity of the data and the program artifacts.
5. In all cases, we need to send an appropriate message to the application administra-
tor so that the fault can be investigated and corrected. In some scenarios, we may
need to obtain the permission of the user before sending such a message. It is a
better practice to obtain user approval, as the user may know the exact cause of the
failure and he/she may take appropriate action without troubling the application
administrator.
Clicking the wrong button: This is a frequent occurrence that can cause severe damage
unless it is prevented.
1. Clicking the delete button instead of cancel button can cause something to be
deleted. To prevent any damage to data, we need to display a confirmatory mes-
sage asking for the confirmation of the delete action, and only when the user
confirms the delete action should we perform the delete action as designed.
2. Clicking the discard button instead of save button. In this case, too, we need to
display a confirmatory message and perform the designed action only when
the user confirms the action.
3. Clicking the save button instead of the discard button. Usually we do not
take confirmation from the user for save action, as the saved information can
subprogram for checking such fields and call it during the keypress event to
prevent wrong data being entered into the controls and as a sequel into our
database.
Clicking on the empty areas: What happens if we click on the empty area on a form or
a frame of a grid? Most development platforms do nothing. They take no action.
But I have seen that it initiates some action in a few cases! Frames, forms, and such
other controls provide for programming the click or double-click actions for us to
use. If we leave that alone, no action is performed, but if we put in some statements
in such places, it may cause problems during the runtime. My suggestion is not to
program such events unless it has a specific purpose and is so designed.
Canceling the action midway: On many occasions, the users leave the transaction
unfinished or cancel the action midway. While we cannot prevent the users from
so doing, we need to put in actions in our programs so that such actions do not
damage our data integrity. For leaving the transaction unfinished, we need to
time out the session and not save it. For the cancelation, we just need to take a
confirmation before we actually cancel the transaction. But in both the cases, we
should not save the data to our database.
6. In some cases, the ticket is allotted first and then the money collection is initiated.
In some rare cases, the payment collection may fail from the user to whom the
ticket is allotted. Here we are left with an unsold ticket and a dissatisfied customer
in the user who did not get the ticket!
Then there is one more scenario. In report generation, on-the-fly generation from the data-
base has become the norm. In some cases, the report generation may take a significantly
long time and the transactions may keep taking place. Before the report is completed, some
new records may be inserted and some records may be deleted. By the time the report is
produced, the data might have changed and the report may have become erroneous.
These scenarios cannot be visualized by the programmer. These cannot be made part
of the coding standards. Of course, they need to be made a part of the design guidelines.
These error conditions have to be designed before a programmer can implement them.
Thus, the onus for handling the errors rests on the shoulders of
Introduction
Inter-program communication (IPC) is communication between two or more programs
in execution. It is also referred to as inter-process communication, as a program under
execution is called a process for the CPU. For example, when we print a document on a
printer, our word processing program is communicating with the device driver program
that drives the printer. This is much more of a requirement when we develop programs
for system software like the operating system. Communication can happen in many ways
between programs. One way is to pass parameters while calling another program, hand-
ing over the execution, and receiving the results. But if two programs have to continuously
exchange information between them, we need to use other techniques. This chapter
discusses these techniques.
When a program is loaded into the RAM for execution, the OS makes that RAM inaccessible
to other programs. This is to protect the integrity of the program space and to
prevent other processes from modifying the contents of the RAM without authorization.
If other programs can access and modify the data of another program, then hackers will
have a field day and computers will become a joke in terms of the accuracy of results. The
C family programming languages do provide a facility, the pointer, for addressing memory
directly: a pointer holds the address of a location in memory. But a pointer can reach only
the memory allocated to its own program. If a program tries to access a memory location
that is not allocated to it, the OS blocks the access and raises an unrecoverable error
(commonly reported as a segmentation fault or an access violation), aborting the program.
Simply stated, the OS builds fortifications around a program’s space in the RAM. The
memory space of a program cannot be accessed by any other program running on the
computer, or from other computers on the network, unless the OS is asked to share it.
But in building applications, especially in the real-time programming and machine-
control systems, programs occasionally need to communicate with each other. Usually
we pass some data items from one program to the other program as parameters. We have
well-established means for communicating between programs if they are running on
different computers using networking, but if the communication is needed between two
programs running on the same computer concurrently, we need different means. We can
achieve this easily by placing a file or a database table on the disk that can be accessed by
both the programs. But the time taken to open a file or a database table is much longer and
cannot be relied upon in the real-time software applications. Therefore, the inter-program
communication needs to take place via the RAM. Recognizing this need, multi-user operating
systems provide a facility generally referred to as shared memory. Different operating
systems provide different kinds of shared memory. The program usually sees the shared
memory as a variable or an array of variables. The C programming language has no keyword
for declaring shared memory; a C program requests it through library calls provided by
the OS, such as the POSIX shm_open and mmap functions. Here are the general rules to be
adhered to while using shared memory.
1. The shared memory has to be declared before it can be used. In some cases, the
program needs to declare it, and, in some cases, the system administrator needs to
set up the shared memory.
2. Once declared and set up, the shared memory becomes accessible by other
processes.
3. The program needing to access the shared memory needs to connect to it like
opening a file. Each OS provides its own set of statements for connecting to shared
memory. The connection could be to read or to write or both.
4. Once connected, the program can read from the shared memory as well as write
into it, depending on the type of connection.
5. Once the action is completed, the program needs to disconnect from the shared
memory. Of course, some operating systems allow multiple programs to stay con-
nected with the shared memory but not all.
6. In most cases, however, only one program can access the shared memory. That is,
even if multiple programs are connected to the shared memory, only one can read
or write the shared memory at any given point in time.
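The rules above can be sketched with the shared_memory module of Python's standard
multiprocessing package (Python 3.8 or later). This is a minimal illustration, not a
definitive implementation: the function name write_then_read is hypothetical, and both
connections are opened in a single process only to keep the sketch short. In practice the
second connection would be made by another program that has been given the block's name.

```python
from multiprocessing import shared_memory


def write_then_read(payload: bytes) -> bytes:
    """Create a shared block, write through one connection, read through another."""
    # Rule 1: the shared memory is declared (created) before it is used.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        # Rule 3: a second party connects to the block by its name.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            # Rule 4: write through one connection and read through the other.
            owner.buf[:len(payload)] = payload
            received = bytes(reader.buf[:len(payload)])
        finally:
            # Rule 5: disconnect as soon as the action is completed.
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # release the block once nobody needs it
    return received
```

The sketch does not show rule 6. When several programs are connected at the same time,
they still need a mechanism such as a semaphore to take turns.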
Message queue: A message queue is a specific amount of RAM set aside for passing
messages between processes. A message contains multiple data items. A queue
can accommodate multiple messages, and each message has a specific amount
of space allotted to it. When a process needing data from the
message queue connects, the message at the head of the queue is delivered
to it, and all other messages move forward by one slot. A message is
preserved in the queue until it is read. Any process can put its message in the
queue, and any process can read that message. These queues are provided
by the OS and reached through the programming language, so we need to study the
concerned manuals to
understand the facilities offered by the message queue before we can use it in our
programs. Here are the typical features of message queues:
1. To enable message passing, the OS needs to have a built-in message queue
facility that handles the message being passed on by the processes (programs
in execution). We need to learn about this facility by studying the advanced
features of the OS manual.
2. One of the programs utilizing the message queue needs to create and manage
the message queue, issuing appropriate statements. Then the message queue
becomes available for other programs.
3. Each of the subordinate programs needs to declare the message queue in their
programs before using it.
4. Each message has to have a specific length, which depends on the OS that
sets the maximum length and the declaration in the program that sends the
message.
5. Each message needs to include the process ID (PID) at the beginning of the
message. The OS will use this ID to direct the message to the right process.
6. Before using the message queue, the sending program needs to initialize the
message queue programmatically.
7. Using proper program statements, the program can place the message on the
message queue.
8. The OS will then place an interrupt on the process that is the destination for
the message in the message queue.
9. The destination process initiates the process for receiving the message from
the message queue.
10. The transfer can happen in synchronous mode (the receiving process halts
its execution until the transfer is completed) or in asynchronous mode (the
receiving process can continue its execution while the message transfer is
going on).
11. If the transfer could not be completed for any reason, then the OS places the
interrupt again and resends the message to the destination process once again.
The OS keeps doing so until it receives a successful transfer signal from the
destination process.
12. Once the transfer is completed, the receiving process sends a signal indicating
that the transfer completed successfully. Then the OS will remove the message
from the message queue.
As programmers, we need to include statements for composing the message
(concatenating the data items together along with the destination process ID),
initializing the message queue, and then issuing the statement for sending the
message to the message queue. The rest is handled by the OS. I have described
the message queue facility only briefly. You need to study it in depth before attempting
to utilize message queues in your application.
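The flow of a message through a queue can be sketched in Python. Please note the
assumptions: queue.Queue from the standard library is an in-process stand-in for an OS
message queue, a thread stands in for the destination process, and the Message class, the
dest_id field (standing in for the PID), and the function run_demo are hypothetical names
chosen for illustration.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Message:
    dest_id: int   # stands in for the destination process ID
    items: tuple   # the data items carried by the message


def run_demo():
    """Send two messages to a receiver and return what it received, in order."""
    mq = queue.Queue(maxsize=8)   # the queue holds a limited number of messages
    received = []

    def receiver(my_id):
        while True:
            msg = mq.get()        # synchronous mode: wait until a message arrives
            if msg is None:       # None is our own signal that no more messages follow
                break
            if msg.dest_id == my_id:
                received.append(msg.items)

    worker = threading.Thread(target=receiver, args=(42,))
    worker.start()
    mq.put(Message(42, ("temp", 21.5)))   # messages leave the queue in arrival order
    mq.put(Message(42, ("temp", 22.0)))
    mq.put(None)
    worker.join()
    return received
```

A receiver that must not wait could call mq.get_nowait() instead and handle the
queue.Empty exception, which corresponds loosely to the asynchronous mode.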
Message passing: Message passing is a technique in object-oriented applications
to invoke another program to take some action. In other environments, we need to
make a system call to invoke another program. Message passing is akin to calling
a subroutine and passing parameters, with one difference. When we call a subroutine,
the main program halts until the subroutine executes and sends the results back.
In message passing, the sending program sends some data to the receiving program
and both can continue to execute concurrently. Message passing allows programmers
to call another object/program without losing control to the called object/program,
and once the invoked object/program completes execution, the results, if any, are
received through another message. The message can contain multiple data items. The
message is to be formed by the sending program, which then sends it to the
receiving program using the appropriate keywords and syntax. This
sending can happen either in synchronous or asynchronous mode. We need to
learn the specific programming language before writing programs using the message
passing facility. This facility is available in C++ as well as Java.
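The difference between a subroutine call and message passing can be sketched in Python
with the standard concurrent.futures module. This is a loose analogy under stated
assumptions: submitting a task stands in for sending a message, the returned future stands
in for the reply message, and the names compute_total and send_and_continue are
hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_total(prices):
    """The receiving side: act on the data items carried by the message."""
    return sum(prices)


def send_and_continue(prices):
    """Send a request, keep working, and collect the reply when it is needed."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        reply = pool.submit(compute_total, prices)  # send the message; do not wait
        local_work = len(prices)                    # the sender continues executing
        total = reply.result()                      # the result arrives as a reply
    return local_work, total
```

Here send_and_continue([10, 20, 30]) returns the count computed by the sender together
with the total computed by the receiver. The sender waits only at the point where it
actually needs the reply.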
Semaphore: Semaphores are of two types, namely counting semaphores and toggling
semaphores (also called binary semaphores). A counting semaphore is an integer and
a toggling semaphore is a bit that can be either zero or one.
1. Counting semaphore: This kind of semaphore shows how many units of a
resource are available. For example, the printers that are connected to
the computer are counted here. Initially, the OS sets this semaphore to
the number of printers that are connected to the computer. When a printer is under
use by a process, then this semaphore would be decremented, and if a process
releases the printer, then this semaphore would be incremented. This sema-
phore shows only the number of resources that are free, but it will not show
which process is using which resource. We need to use some other mechanism
to do that.
2. Toggling semaphore: This kind of semaphore is used to show if a resource is available or is
locked with another process. For example, if the printer is under use, then
this semaphore is toggled to 1 indicating that the printer is not available.
When the printer becomes free, it will be toggled to zero to show that the
printer is available. The programs read this semaphore and take action as
programmed.
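Both kinds of semaphore can be tried out with Python's standard threading module. This is
a small sketch with assumptions: two printers are assumed, the function names are
hypothetical, and a threading.Lock plays the role of the toggling semaphore. Note that the
lock only records whether the resource is held or free, so it does not follow the
convention described above where 1 means the printer is not available.

```python
import threading


def counting_demo(printers=2):
    """Claim printers until none is free, release one, then claim again."""
    free_printers = threading.Semaphore(printers)   # starts at the number of printers
    outcomes = []
    for _ in range(printers + 1):
        # Each successful claim decrements the count; the last attempt finds none free.
        outcomes.append(free_printers.acquire(blocking=False))
    free_printers.release()                          # a process releases its printer
    outcomes.append(free_printers.acquire(blocking=False))
    return outcomes


def toggling_demo():
    """Show a single resource switching between free and in use."""
    printer_lock = threading.Lock()
    first = printer_lock.acquire(blocking=False)    # the printer was free
    second = printer_lock.acquire(blocking=False)   # the printer is in use
    printer_lock.release()                          # the printer becomes free again
    third = printer_lock.acquire(blocking=False)
    printer_lock.release()
    return first, second, third
```

As in the text, the counting semaphore tells us only how many printers are free, not which
process holds which printer.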
One precaution we need to take is to open the shared memory just before acting upon it and
close it immediately thereafter. When a program is manipulating shared memory, other pro-
grams cannot manipulate it. Shared memory is like a shared toilet. At any given moment of
time, only one entity can use it, but multiple entities can use it one after the other.
14
Coding, Debugging, and Performance Tuning
Introduction
Having learnt the basics of computer programming, including various statements that get
the computer to process the data the way it is designed, let us now discuss the nuts and
bolts of getting them all together to code the program, make it work, remove the lurking
bugs, and then fine tune its performance for release to the users for productive use. In this
chapter, we will discuss these aspects.
Coding
In computer parlance, “code,” in its verb form, and “coding” are the words we use to
denote the work of actually writing the program, putting together all the statements nec-
essary to achieve the functionality assigned to the program. The word “code” in its noun
form is used to denote the program statements in a computer program. Basically, we are
codifying the algorithm for the computer to decipher it and process the data. The word
“code,” or “coding,” is selected because we are basically putting the algorithm in a code
that is understandable to the computer so it can process the instructions. If a layperson
untrained in computer programming reads the program, it simply looks like gibberish
written in English.
Coding also includes removing bugs as and when necessary. In Chapter 4, we discussed the
basics of computer programs, and in Chapter 20, we will be discussing the programming
standards. Right now, let us discuss the programs and programming.
As noted in Chapter 4, a computer program is a series of instructions to the computer
that tell the computer what to do. A computer is a diligent machine but not an intelligent
one. It cannot determine if an instruction is out of order. It just processes the instructions in
the order they are provided in the program. Therefore, it behooves us to ensure that the
instructions are given in the right sequence. Here is how we code the program. We follow a
structured way to code the programs. It is generally referred to as the “program structure.”
The program structure defines how a computer program needs to be coded. A program
would have the following structure, normally:
1. Form load: In event-oriented GUI programming, all the processing operations are
tied to a screen. Therefore, all programs are connected to the events on the screen.
As soon as the user selects the option of running our software by clicking the
option from a menu, we need to load an initial screen. This is usually referred to as
the “landing page/screen” or the “opening page/screen.” We load this screen first.
This will be the first routine to be programmed or rather, it will be the screen that
will be invoked when a user clicks the option in the menu that triggers execution
of our software. All the other screens are invoked from this screen by clicking the
appropriate option or a control. If this screen has some controls on it, then each
control would have many events that need to be programmed.
2. Header routines: In the C language family, header statements are allowed (Java
offers import statements for a similar purpose). These can be declarations,
definitions of formulas as functions, and so on.
We need to write them first whenever we include header routines in a program.
3. Program beginning: Every programming language has a specific beginning that is
denoted simply by a keyword or short phrase such as “Main()” or “Identification Division”
or something like that. We have to write what is expected of it. In most cases,
we do not need to write anything with this statement. We just write the required
statement to tell the computer that this is a main program (not a subroutine). This
statement needs to be on a separate line in a standalone manner.
4. Program header: All professionally managed software development organizations
mandate that a program header is written immediately after the statement that
begins the program. A program header is a number of commenting statements in
which the history of the program is maintained from the first time it was coded
and all the maintenance tasks performed on it. It helps to trace any security
breach if it takes place. What should be written in the program header is given
in Chapter 20 on programming standards. In a GUI environment, we place this
header in the event that loads the screen.
5. Initial routines: In some programming languages like RPG, certain subroutines
need to be executed before beginning the program code. These statements need
to be included immediately after the program header. If there are any predefined
routines that need to be performed at the beginning of the program execution,
they need to be written immediately after the program header.
6. Data declarations: We need to declare all the data items we propose to use inside
the program. Many languages allow for the definition of data anywhere inside
the program, and some programming languages mandate that all data must be
defined at one place. It is a good programming practice to group all the data
declarations at the top of the program, as it makes it easy to locate any data
declaration during maintenance, debugging, or performance-tuning of the
program. One universal question is whether to declare one variable per
statement or to club multiple declarations, albeit of the same data-type, in one
statement. It is generally better to declare one data item per statement as it helps in
maintenance. In case we need to delete or change the type of a data item, it is easier
if there is only one data item per statement. Of course, it increases the number of
lines but the increase due to declarations does not add to program complexity.
7. Initialization: Initialization refers to assigning an initial value to a variable or data
item. In many programming languages, a variable that is declared but not initialized
holds a NULL value or an unpredictable value. NULL is a separate constant that is
neither a space nor a number. When we perform operations, especially arithmetic
operations, such values cause irretrievable
errors and program aborts. Therefore, we need to initialize a variable as soon as it
is declared so that we can prevent this kind of fault. Most programming languages
allow this operation to be combined with the data declaration statement. We can
also initialize a variable at any point in the program. It would be a better practice
to initialize the variables soon after declaration or combine the initialization with
the declaration. This way, we would eliminate the possibility of using NULL values
in arithmetical operations. When we use loops, either finite loops based on
counting or condition-based loops, we need to initialize the counting variable or
the one used in the condition. More often than not, we forget this initialization
operation, which causes infinite loops or accuracy issues in the computations.
8. Input operations: We include these operations in the sequence as and when an
input is required. After the declarations and initialization, we need to include the
input statements because the program needs data for processing. The input may
be received from the keyboard, a file, a table, or a port. Here is the place where the
processing begins. The first step is to get the inputs, then process them, and in the
process, we may need further inputs.
9. Output operations: We include these operations as and when data needs to be
output from the computer in the chronological sequence appropriate to the
action. The outputs may be processed data reports, error messages, confirma-
tory messages, a beeping sound, or anything that we output to the user. We
place these statements at appropriate places in the program depending on the
logic followed by the program.
10. Computational operations: We include these statements to solve mathematical
equations in the chronological sequence they are required. These are written at
appropriate places depending on the logic of processing as defined in a design
document or flow chart.
11. Decision-making operations: Decision-making statements cause the execution to
branch away from the next statement based on the outcome of the decision. We
need to be careful to ensure that the branching is accurately defined and that
the execution thereafter goes to the next chronologically appropriate statement.
They are included at appropriate places in the program depending on the need for
decisions.
12. Error-handling routines: When our program runs, it is likely to throw up some
errors. We need to include error trapping and handling routines in our programs.
Chapter 20 on programming standards covers the topic of defect prevention in
greater detail.
13. Program ending: Every program would have an identifiable beginning and an iden-
tifiable ending. While the statements at the beginning of the program create the
environment for program execution, the statements at the end of the program
carry out housekeeping activities so that the next program can run efficiently. We
need to close the program systematically. We need to:
a. Close all open files, tables, and databases that have been opened previously in
our program.
b. Terminate all connections to databases.
c. Close any input/output devices previously connected to our program.
d. The programming language may mandate certain statements in the end, and
if so, we need to include them.
e. Any other required statements conforming to the logic of the program like
indicating to the user that the program is completed successfully.
f. In the GUI:
i. These statements need to be put in the form unload event (or remove the
main form).
ii. Close all forms that are open.
iii. Close the login and save the session detail in the relevant file or database
table.
In the earlier list, bullets #1 to #7 are to be coded sequentially at the beginning of the
program and bullet #13 shall be the last set of statements in the program. The remaining
actions need to be programmed as necessary in the body of the program.
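The program structure listed above can be sketched as a small Python routine. This is
only an illustration under stated assumptions: the program name average.py, its history
line, and the function average_of_lines are hypothetical, and the GUI-specific items
(form load and unload) are left out.

```python
"""Program header (hypothetical): average.py
History: version 1.0, first coded for illustration.
"""
import io  # header statements: bring in what the program needs


def average_of_lines(source):
    # Data declarations combined with initialization.
    total = 0.0
    count = 0
    try:
        # Input operations: read one line at a time from the source.
        for line in source:
            text = line.strip()
            # Decision-making operation: skip blank lines.
            if not text:
                continue
            # Computational operations.
            total += float(text)
            count += 1
    except ValueError as err:
        # Error-handling routine: report the fault instead of aborting.
        return f"error: {err}"
    finally:
        # Program ending: housekeeping, close what we opened or were given.
        source.close()
    # Output operations: hand the result back to the caller.
    if count == 0:
        return "no data"
    return f"average = {total / count:.2f}"
```

For example, average_of_lines(io.StringIO("10\n20\n\n30\n")) returns "average = 20.00".
Because total and count are initialized where they are declared, the loop never works on
undefined values.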
In this manner, we write the program statement by statement. In the GUI, we program
all the relevant control events as needed by the program logic and complete the program.
Each programming language defines its own program structure and we need to abide
by it. Once we complete the program, we need to compile it. Compiling will convert the
source code (statements written in the selected programming language) into object code
(instructions for the computer in the binary language) that can be executed. This compila-
tion process points out syntax errors present in the program. In the bygone days, the com-
piler used to deliver a list of syntax errors that we needed to check one by one, correct all
the errors and resubmit the program for compilation. We needed to do this until all syntax
errors were removed. But now in IDEs, the cursor highlights the first erroneous statement,
allowing you to correct it. Once we correct it, it takes us to the next error and so on until
we correct all the syntax errors.
Once the compilation is completed and the object file is produced, we need to link
it to the libraries used inside the program. In earlier days, we had to do this manually,
giving commands and listing the libraries to be linked. But now in IDEs, this is carried
out automatically once we select the option to make the EXE file. Once we have linked
the libraries and produced the executable file, we need to test it to see if there are any
logical or computational errors and then rectify those errors. Let us now discuss testing
the developed program.
Testing
It is the testing that delivers the proof of realizing the requirements set for the program and
the adherence of the program to its design. It also proves how well these two are realized
by our program. Since we wrote the program, it is our child. We have every responsibility
to ensure that it works, does what is expected of it, and does not do what is not expected
of it. There are two types of testing, generally referred to as black-box testing and white-box
testing.
In black-box testing, we treat the program as a black box and supply the inputs and look at
the outputs. If the outputs are as expected, then the program is working well, and if the results
are in error, we go back to the program, look for the reasons for the unexpected results, and
then retest the program. We do this until the program delivers the expected results. In giv-
ing inputs, we test with a range of inputs, both valid and invalid, to ensure that the program
delivers accurate results with the right inputs and blocks the wrong inputs. This type of test-
ing has the disadvantage that we would not be able to give all possible combinations of data
inputs that the users in the field would subject our program to. So, we need to do more testing
before we certify our program as good. Here are the steps in black-box testing:
Once the test environment as well as the test data is ready, we need to run the program
and note the actual results and compare them with the expected results. Wherever there is
a variance, we need to note the result. Once the program is completely run, we need to go
back to the program to locate the errors and correct them, then test it again. We need to iter-
ate the activities of testing and rectification until we can detect no further errors in our pro-
gram. Then we pass it on to the organizational quality-control department for their testing.
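Comparing actual results with expected results can be sketched in Python as follows. The
program under test (safe_divide), the test data, and the function run_black_box_tests are
hypothetical and chosen only to illustrate the idea. The test runner knows nothing about
how safe_divide works inside.

```python
def safe_divide(b, c):
    """The program under test; the tester sees only its inputs and outputs."""
    if c == 0:
        raise ValueError("divisor must not be zero")
    return b / c


# Test data: inputs paired with the expected result (valid and invalid inputs).
TEST_CASES = [
    ((10, 2), 5.0),
    ((9, 3), 3.0),
    ((1, 0), ValueError),   # a wrong input must be blocked, not processed
]


def run_black_box_tests():
    """Run every case and return the list of variances found."""
    variances = []
    for args, expected in TEST_CASES:
        try:
            actual = safe_divide(*args)
        except Exception as err:
            actual = type(err)
        if actual != expected:
            variances.append((args, expected, actual))
    return variances
```

An empty list from run_black_box_tests() means no variance was noted for this test data.
It does not prove that the program is free of errors for other inputs.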
White-box testing subjects our program to thorough testing, or in other words, we test
every line of code. White-box testing cannot be carried out the way we carry out black-
box testing. We need a debugger or an IDE for carrying out white-box testing. The IDE or
debugger would have the following facilities:
1. Step through the program statement by statement: Each time we press a specific function
key, one statement of the program will be executed. Of course, we may decide
to let the program run unhindered at any time while stepping through the
statements.
We carry out white-box testing using these facilities. This is how to carry out white-box
testing:
1. When we begin testing in the IDE, we select the option to move through the pro-
gram statement by statement. Alternatively, we can place a breakpoint at the very
first statement and then step through the program, one statement at a time.
2. Pressing the function key to step statement by statement, we begin from the first
statement and then continue until all the statements are executed.
3. After stepping through the first statement, we continue until we come across a
control statement like IF…THEN…ELSE. At this point:
a. We allow the execution to take the branch that is allowed by the data and con-
tinue stepping through until the last statement of the control statement block
is executed.
b. Once all the statements in the control statement block are executed, we move
the execution pointer to the statement prior to the first of the control statement
block.
c. We change the value of the concerned variable such that the execution takes
the other branch of the control statement.
d. We continue stepping through the remaining statements of the control state-
ment block until all the statements in the control statement block in the branch
are executed and the last statement of the control statement block is executed.
4. We repeat step 3 for all the possible branches in the control statement block.
We need to note that the switch…case control statement can have multiple
branches. Similarly, there could be multiple ELSE IF branches in the IF…THEN…ELSE
control statement.
5. When we come across a loop, we need to execute the program in such a way that
the execution enters the loop and executes all the statements in the loop statement
block. We also need to create a condition in which the execution does not enter the loop
statement block, and execute the program. If we read records from a database table
or a flat file, we need to execute the loop
once with records in the table or file and once with zero records in the file or table.
6. While receiving inputs from the keyboard:
a. We need to execute the program receiving expected inputs to see that the
program works as expected with right inputs.
b. We also need to execute the program with wrong inputs to see that the program
rejects them and retains control of execution without resulting in a fault
condition. We need to give a wrong input to each control, one at a time, and
see that the program rejects the wrong input from every control.
c. We also need to execute the program with no inputs at all to see that the
program does not move forward until all the mandatory inputs are received.
7. When we receive inputs from other sources like machines and the Internet:
a. We need to execute the program with right inputs.
b. We need to execute the program with wrong inputs.
c. We need to execute the program with the source disconnected.
d. We need to see if the program rejects wrong inputs and alerts the user about
the nonexistence of connection with the source.
8. When connecting with the databases, we need to execute the program with a
proper connection and with no connection to see that the program handles both
situations properly and retains the control of program execution.
9. When delivering outputs as reports:
a. We need to ensure that the results are accurate.
b. We need to ensure that all the records that need to be included are included
and that all records that need to be excluded are excluded.
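The branch and loop checks in steps 3 to 5 can also be written down as small repeatable runs instead of being stepped through by hand each time. The following is a minimal sketch in Python. The function classify_and_total and its data are hypothetical, chosen only because they contain one IF…THEN…ELSE statement and one loop.

```python
def classify_and_total(records, limit):
    """Sum the record values and report whether the total exceeds the limit."""
    total = 0
    for value in records:        # the loop body is skipped when records is empty
        total += value
    if total > limit:            # first branch of the control statement
        status = "over"
    else:                        # the other branch
        status = "within"
    return total, status


# Drive the execution through every branch and through both loop conditions.
print(classify_and_total([40, 70], 100))   # loop entered, "over" branch
print(classify_and_total([40, 50], 100))   # loop entered, "within" branch
print(classify_and_total([], 100))         # zero records: loop not entered
```

Each call changes the data so that the execution takes a different path, which is what we do manually in the debugger when we change the value of the concerned variable.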
Software testing is a big topic. A new field of software testing has emerged, and a body of
knowledge is being gathered. But even though we are specializing as programmers, the
onus of certifying that the program works satisfactorily without any defects rests on our
shoulders, and we need to test it. In fact, programmers are the first line of software testers,
although they test only their own programs. I have included a brief outline and the
important aspects of testing here. Interested readers may refer to a good book on software
quality assurance.
Debugging
One version of how the word debugging came into being was described in Chapter 4. Here
is the more popular version. The word “debugging” is attributed to Rear Admiral Grace
Hopper. While she was working on the Mark II computer (the Aiken Relay Calculator)
built by Howard Aiken at Harvard, she located a moth stuck between the contacts of relay
#70 on panel F on September 9, 1947. Madam Grace Hopper added the caption “First actual
case of bug being found.” You can see the picture of the log here (Figure 14.1).
I am not sure if it was really written by Madam Grace Hopper, but it is touted so! But of
course, the word “bug” to denote a glitch was used by Thomas Alva Edison, too, and much
earlier. Here is an excerpt from his papers: “It has been so in all my inventions. The first step
is an intuition and comes with a burst, then difficulties arise—this thing gives out and then that
“bugs”—as such little faults and difficulties are called—show themselves and months of intense
watching, study, and labor are requisite before commercial success or failure is certainly reached.”
It is from a letter from Edison to Theodore Puskas, dated November 13, 1878.
FIGURE 14.1
First-ever bug report.
Having noted the history behind the word debugging, let us turn to our discussion on
removing errors, or “bugs,” as we prefer to say, from our programs. We make two kinds
of errors in our programs. The first one is syntax errors, which are trapped by the compiler
and force us to correct them before running the program. The second category is the logical
errors. By logic, I mean that there is some defect in our logic/algorithm. A logical error
produces a wrong or inaccurate result. Here are the logical errors we frequently commit:
a + b² × c ÷ d − e
In this expression, the arithmetic symbols are between the variables. This is referred to as
“infix notation.” There are two other notations referred to as “postfix” and “prefix”
notations. In infix notation, the arithmetic symbol is located between two variables. In postfix
notation, the arithmetic symbol is located immediately after two variables. In prefix
notation, the arithmetic symbol precedes two variables. In postfix notation, we write the earlier
expression thus:
b² c × d ÷ a + e −
Multiply b² by c; then divide the result by d; then add a to the result; and finally subtract e
from the result. As you can see, the arithmetic symbol is placed immediately after the two
values on which it is to act. (a b +) is the same as (a + b). Compilers commonly convert our
arithmetic statements into a form like this for resolving them.
In prefix notation, the arithmetic symbol is placed before the variables it has to act upon.
(a + b) is written as (+ a b).
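To see why the postfix form is convenient for a computer, here is a minimal sketch of a postfix evaluator in Python. It is an illustration only, not the method used by any particular compiler. The function name evaluate_postfix and the sample values are assumptions, and the name b2 simply holds the value of the squared term of the expression.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_postfix(tokens, variables):
    """Evaluate a list of postfix tokens using a stack of operands."""
    stack = []
    for token in tokens:
        if token in OPERATORS:
            right = stack.pop()      # the operand pushed last
            left = stack.pop()       # the operand pushed before it
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(variables[token])
    return stack.pop()


values = {"a": 1, "b2": 16, "c": 3, "d": 4, "e": 5}
# Postfix form of the expression: b2 c * d / a + e -
result = evaluate_postfix(["b2", "c", "*", "d", "/", "a", "+", "e", "-"], values)
print(result)
```

Every operator finds its two operands already waiting on the stack, so no parentheses and no precedence rules are needed while evaluating.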
While solving this equation, we would:
We do it this way because our math teacher taught us BODS (bracket, of, divide, and then
subtract or add). That is the order of precedence of the arithmetic symbols in solving
algebraic equations. That is:
3. Then divide.
4. Then subtract or add.
To make the earlier expression explicit, we rewrite it thus:
a + (b² × c ÷ d) − e or (b² × c ÷ d) + a − e
Should we simplify it this way? When dealing with human beings, it is not necessary
but when you are dealing with a computer, yes, I would say it is a must. The computa-
tional errors crop up because we programmers assume that computers are intelligent.
Computers by nature are diligent workers and intelligence needs to be programmed in!
Who else, apart from us, would do the programming part? Therefore, we need to program
that intelligence into the computer. To avoid computational errors, we need to take the
following precautions:
Now how do we first catch the computational errors in our programs? We do it like this:
3. Nesting too many parentheses leading to confusion in arriving at the results. The
confusion is not for the computer. It is for us.
4. Making the statement too long without splitting it into multiple smaller
statements. When we do so, we forget the parentheses and the order of the operators, and
even use wrong variables!
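The point about long statements can be illustrated with a short Python sketch. The variable names and values are hypothetical. The long one-line form and the split form compute the same value, but the split form lets us inspect each intermediate result while stepping through the program.

```python
a, b, c, d, e = 2.0, 3.0, 4.0, 6.0, 1.0

# One long statement: correct, but hard to inspect while stepping through.
long_form = a + b ** 2 * c / d - e

# The same computation split into smaller statements.
b_squared = b ** 2
product = b_squared * c
quotient = product / d
split_form = a + quotient - e

print(long_form, split_form)
```

If the result is wrong, a breakpoint after any of the smaller statements shows exactly which step went astray.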
4. Step through to the statement with the logical expression and execute it.
5. Pause the execution by placing a breakpoint on the statement immediately suc-
ceeding the logical expression statement.
6. Check the result and if it is not as expected, then step back to the statement with
the logical expression, analyze it, and correct it as necessary.
7. Step back to the statement preceding the logical expression, change the values of
the variables, step forward, and execute the logical expression.
8. Repeat these steps as necessary until the result of the logical expression is as
expected.
In this manner, we need to debug the errors. Once the relational and logical expressions
work without defects, the search will retrieve the right data.
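Changing the values of the variables and executing the logical expression again, as described in the steps above, can be rehearsed in code as well. The sketch below is in Python, and the rule inside is_eligible is a hypothetical business rule used only for illustration. It tries the expression with values on both sides of every boundary.

```python
from itertools import product


def is_eligible(age, income, has_waiver):
    # The logical expression under test (a hypothetical business rule).
    return age >= 18 and (income < 50000 or has_waiver)


# Try the expression with boundary values of every variable.
for age, income, has_waiver in product([17, 18], [49999, 50000], [False, True]):
    print(age, income, has_waiver, "->", is_eligible(age, income, has_waiver))
```

Comparing each printed result with the result we expect quickly shows whether a relational operator or a parenthesis is misplaced.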
Then we rectify all the mistakes and check the report once again, repeating the cycle
until we can detect no more defects.
Performance Tuning
Performance tuning refers to the activities by which we try to improve the response times
of our programs thereby improving the performance of our programs and the throughput
of the computer.
The performance of a software program is measured by the response time delivered
by the program. Response time is the amount of time taken from the time the program
is initiated to the start of the response being perceived by the user. If you type a URL
in the browser, the response time is the amount of time taken from the time you hit the
Enter key or click the appropriate button until the time when the display begins to
appear on the screen. If you are generating a report, response time is from the time you
click the generate-report button to the time the report begins to appear either on the
screen or on the printer. In communication programs, the response time is measured
from the time we click the button to send the message to the time it begins sending the
message. In a nutshell, the response time is measured from the time the user completes
the action and entrusts the data to the computer to the time the computer visibly begins
to act. Unless the user
can perceive the beginning of the action, it would appear that the computer is not act-
ing. Obviously, the computer needs time to process some amount of data before it can
be presented to the user. Sometimes, the data is of low volume and sometimes data is
of very high volume. The amount of processing time taken by the program depends on
the volume of data. When the volume of data is very high, we need to devise ways and
means to speed up the processing.
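Before we tune anything, we need a way to measure the time taken. The following Python sketch uses time.perf_counter from the standard library. The helper measure_response_time and the task total_of_squares are hypothetical names. Note that the sketch measures the full processing time of a task, which is only a stand-in for the response time perceived by the user.

```python
import time


def measure_response_time(task, *args):
    """Return the task's result and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    result = task(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


def total_of_squares(n):
    # Stand-in for a data-processing task whose time grows with the data volume.
    return sum(i * i for i in range(n))


for volume in (1_000, 100_000):
    result, seconds = measure_response_time(total_of_squares, volume)
    print(f"{volume} items processed in {seconds:.6f} seconds")
```

Running the same task with a low and a high volume of data shows how the processing time depends on the volume.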
Even in business applications where the response time is not critical, it is important to
keep the response time short. In multiple studies conducted on user perception, the
following information surfaced:
1. A significant chunk of users will leave a website if the response time is longer than
just 4 seconds!
2. Users with a little more patience leave the website if the response time is longer
than 8 seconds!
3. Most users cannot withstand response times greater than 15 seconds.
No web-based application should have a response time greater than 15 seconds. We,
as programmers, should do our bit to keep the response times as low as possible, and
performance tuning is the way to do it.
The performance of a program and its response time depend on:
computer. Even on a PC, the competing OSs have different sets of programs
with differing complexities. So, the OS also impacts the time taken for
processing the data.
3. Other programs running on the computer. These days, multiple programs are
running concurrently on the computer, and our program has to share the computer
resources with all those programs. The time allocated to our program depends on
the number of programs running concurrently on the OS. The response time deteri-
orates as the number of programs concurrently running on the computer increases.
Even with all these being in place and out of our scope, we still need to ensure that our
program is taking the least possible amount of time for processing the data. For this, we
need to ensure that our program does not contain statements that are not really required
or contributing to processing the data. The following are the wasteful statements that we
need to eliminate from our program:
Of course, it is granted that we do not commit any of these mistakes willfully. But how
do we determine if we did commit those mistakes by oversight? Fortunately, the SDK
developers recognized these possibilities and generally include some performance-tuning
tools. These are:
1. Profiling: The profiling tool is a software tool (also referred to as a profiler) that
runs the program under its control and analyzes the program while it is in exe-
cution. It gives a wide variety of information about the program including the
amount of RAM used, the time consumed by a single statement or a block of state-
ments, the amount of time consumed by the subroutine, the number of times a
subroutine was called, the number of times a statement was executed, and so on.
This information is generally referred to as the execution profile of the program.
We can also use the profiler with a software product to analyze the product dur-
ing execution. In fact, most web-hosting platforms use a profiling tool to collect
information about usage and then to improve the performance of the application.
It then gives additional information about the number of times a program was
called along with the idle times and so on. Using the information collected by
the profiler tool, we inspect the places in the program that are taking more time
and improve them. We do the following to improve performance and reduce the
response times:
a. Delete unnecessary declaration of variables.
b. If it is feasible, we try to use the same variable instead of declaring too many
variables, especially for temporary purposes like flags and counters.
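As an illustration of what an execution profile looks like, here is a minimal sketch using cProfile and pstats, the profiler modules that ship with Python. The functions slow_sum and report are hypothetical stand-ins for the statements of a real program.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total


def report():
    return [slow_sum(10_000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.enable()            # run the program under the control of the profiler
report()
profiler.disable()

# Print the execution profile, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The printed profile lists each subroutine with the number of times it was called and the time consumed in it, which tells us where to look first.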
1. The data types of the data items that are stored in the tables are very impor-
tant. The data types declared in the database ought to be the same as those that
are available in the programming language. For example, most databases have a
data type for “money,” or “currency,” which has a fixed number of digits after the decimal
point, but most programming languages have only integer and floating-point data
types for numbers. Converting and reconverting the data type every time it is used in the
program places an overhead on the program execution!
2. Data redundancy can be controlled in the databases, but it cannot be totally
eliminated, especially in relational databases. We have to carefully design the
databases minimizing the data redundancy.
3. When we need to query the database based on criteria, placing the query in the
program takes a longer amount of time during the program execution. Databases
give facilities to place this query on the database itself, limiting the program
to supply the parameters. It is better to use this facility and limit the program to
supplying the query parameters and receiving the results. This will reduce the
response times.
4. Most modern databases build the required indexes automatically, but if some
databases allow us the facility to build indexes, we need to design the indexes
carefully so that the data retrieval times are kept to a minimum.
5. Databases provide a facility of views or joins (or logical files) which basically take
data from two or more tables on the fly and present it to the program or user. It is
better to use this facility than to have a physical table to store the data collected
from multiple tables. This would reduce the response times.
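Points 3 and 5 can be sketched with sqlite3, the small database engine bundled with Python. The table, column, and view names below are hypothetical. SQLite has no stored procedures, so a fixed query with a parameter stands in here for the idea of the program supplying only the parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    INSERT INTO department VALUES (1, 'Accounts'), (2, 'Stores');
    INSERT INTO employee VALUES (10, 'Asha', 1), (11, 'Ravi', 2), (12, 'Mona', 1);

    -- A view joins the two tables on the fly; no physical table stores the result.
    CREATE VIEW employee_department AS
        SELECT e.emp_id, e.name, d.dept_name
        FROM employee e JOIN department d ON e.dept_id = d.dept_id;
    """
)

# The program supplies only the parameter; the query text stays fixed.
rows = conn.execute(
    "SELECT name FROM employee_department WHERE dept_name = ? ORDER BY name",
    ("Accounts",),
).fetchall()
print(rows)
conn.close()
```

The program never builds a combined table of its own. It asks the view for the rows it needs and receives only the results.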
We need to entrust the activity of the database design to an expert in database technology
and subject the design to quality control activities before using it during the development.
Once the database design is frozen and we begin development of the software, it is
well-nigh impossible to go back and change the database design, as there would be a number of
programs that would need to change.
While coding is carried out by all programmers, including new entrants to program-
ming, the activities of debugging and performance tuning are expert activities and are
generally carried out by senior programmers. Coding guidelines help us write
programs that work efficiently, besides delivering the expected results, but we need
to tune the performance of the programs when the response times are critical. We need to
gain expertise in these activities.
15
Subroutines
Introduction
After a few years of software development, it was recognized that long code is
extremely difficult to debug and maintain. Therefore, structured programming was
advocated, which primarily involved writing a short main program that calls many
subprograms. It is still debated as to how long a long program is. COBOL came up with paragraphs
to shorten the length of a program and FORTRAN came up with subroutines to achieve
structured programming. All the other languages that came later on had some facility or
the other to shorten the main program and have multiple subprograms to achieve the total
functionality. A program that cannot be executed on its own is a subprogram, subroutine,
function, or method. It needs to be called by another program for execution. Each lan-
guage has its own label to refer to this facility. Subroutine, Subprogram, Function, Method,
Procedure, and Object are some of the names that are used for this facility. By whichever
name it may be called, it is simply a facility to hive off some of the code of the main pro-
gram so that the main program is shorter and easier to understand and maintain. I am
using the name “subroutine” in this book to represent all the names used in different
programming languages. In present-day GUI programming, everything is a subroutine. Each
event of a control is a subroutine. The main programs are embedded in the forms. All
forms are connected to a main form or a landing/home page.
Characteristics of a Subroutine
The following are the characteristics of subroutines:
1. Subroutines are usually embedded inside a main program. The subroutines can
also be placed in a library of subroutines in some programming languages. The
library is linked during the process of compilation and producing the execut-
able file.
2. A subroutine is assigned a name which the main program and other subroutines
use to call it and hand over the control of execution to accomplish a specified
functionality.
3. A subroutine is used to deliver one specific functionality. Of course, there is no
restriction on the number of functions a subroutine can deliver, but it makes sense
to hive off the second functionality to another subroutine, as the main objective of
using the subroutine is to keep it short and easier to debug and maintain. Having
multiple functionalities makes the subroutine longer and therefore defeats the
very purpose for which subroutines are used.
4. A subroutine is a self-contained program that delivers, normally, one functional-
ity. It can receive arguments (data) from the main program and returns the result
back to the calling program.
5. A subroutine can call another subroutine. It is not necessary that only the main
program has to call a subroutine. One precaution, however, is necessary. The called
subroutine should not call the subroutine that called it unless there is a condition
that ends the chain of calls. For example, A is a subroutine and it called the
subroutine B. The subroutine B can call other subroutines as necessary, but if it
calls the subroutine A unconditionally, and A in turn calls B again, the calls never
end and the program crashes or the computer freezes!
6. The subroutine begins with a keyword such as “Sub” or some other word to denote
that it is a subroutine. When the subroutine is called, the control of execution
passes to it, and the calling program waits until the execution of the
subroutine is completed.
7. The subroutine ends with a keyword such as “Return” or something like that
to denote the completion of the subroutine. This statement marks the end of the
execution of the subroutine and hands the control of execution back to the calling
program, which resumes from the statement following the call.
8. Usually the subroutine hands the control of execution back to the calling program,
but in some languages, the facility to end even the calling program is available.
This facility is useful especially when the execution encounters an error from
which the program cannot recover and closure is the only option.
9. Subroutines are usually embedded in a main program. An alternative practice is
to build a library of subroutines that deliver commonly used functionalities such
as checking a text box for a valid numeric data item. This library may be built at an
organizational level or a project level.
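A library subroutine of the kind mentioned in the last point, one that checks a text entry for a valid numeric data item, might look like the following Python sketch. The names is_valid_number and read_amount are hypothetical. The second subroutine calls the first, illustrating that a subroutine delivers one functionality and can be called by other subroutines.

```python
def is_valid_number(text):
    """Return True if the text can be read as a number, False otherwise."""
    try:
        float(text.strip())
    except ValueError:
        return False
    return True


def read_amount(text):
    # A second subroutine that calls the first one.
    if not is_valid_number(text):
        return None
    return float(text)


print(is_valid_number(" 12.5 "), is_valid_number("12a"))
print(read_amount("7"), read_amount("x"))
```

Once such a subroutine is placed in a shared library, every form that accepts a numeric input can call it instead of repeating the check.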
Function
Functions were initially used in FORTRAN programs in addition to subroutines. A
function in the context of FORTRAN programming was a single-line program statement in
which an arithmetic formula is assigned to a numeric variable. The formula is defined in
the usual manner using variables and constants. Then this numeric variable was referred
to wherever the formula was needed. In this manner, repeated coding of formula was
avoided, and in their place, variables were used. During compile time, these variables were
substituted with the actual formulas. That way, FORTRAN reduced the tedium of coding
for the programmers.
Then the C language used functions in another way. In the C programming language, all
programs are functions. The main program is called the “main” function. To be executed
independently, the “main” function is essential. All functions are multi-line functions,
unlike in FORTRAN. What is more, C language allows each function to be compiled into
an object file independently, which is not an executable file. Further, C language allowed
all such object files to be combined into a library too! This library could then be linked to
the “main” function at the time of preparing the executable file. That was a great advan-
tage, as a function need not be coded again by another programmer. This is better than the
subroutines used by FORTRAN.
All the later programming languages descended from the C language, such as C++, C#, and
Java, followed this methodology.
Methods
Methods are used in the Java family of programming languages. In Java, a method is
more or less the same as a subroutine. Many preprogrammed methods are made available
to the programmers, who can call them as needed. I am not going into the details of
coding Java methods because this book is not about programming in Java.
In many languages, a subroutine embedded in a program can access the data items of the
calling program. In that case, there is no point in declaring a data item in the subroutine if it
already exists in the memory space of the calling program. It would be a duplication and
would waste precious RAM. But if the data item in the calling program is holding a value and
cannot be released for use by the subroutine, we need to declare the data item in the sub-
routine. Conversely, if a data item is needed only by the subroutine, there is no point in
declaring it in the calling program as it will tie up the RAM for longer duration. While
declaring data, we need to ensure that the declared data item is used only in the program
in which it is declared.
We discussed the static and dynamic data types. A static data item is one that
can be used in all subroutines and calling programs. We can declare static data types in
subroutines also. Such data items survive the closure of the subroutine and will be released
only when its calling program is closed and all of its memory space is released. Dynamic
data items are local to the program in which they are declared as
well as all the subroutines called by it. Unless there is a pressing need, we ought to use
only dynamic data types. We may declare a static data type only as a last resort, when
no alternative is available.
Argument Passing
There are two methods to pass on data to subroutines. One way is to declare the variable
in the calling program so that the subroutine can make use of the variable to carry out
the processing task assigned to it. In this method, the value of the variable in the calling
program can be modified by the subroutine. The second method is to pass arguments
to the subroutine. This lets the variables in the calling program retain
their original values. The subroutine receives only the values and uses them to carry out
the processing task assigned to it. However, we need to declare the variables inside the
subroutine to receive those values and store them until the execution of the subroutine is
completed.
The values being sent by the calling program are referred to as arguments in the
main program. The variables in the subroutine that receive values from the main program are
referred to as the parameters. But, I confess that these terms are not universally accepted.
Some may call both parameters or arguments, or may call the values being sent as parameters
and the receiving values as arguments!
In most cases, the second method of passing arguments is preferred, as the values of the
calling program are not changed by the subroutine in an unpredictable manner. Another
reason is the possibility of the subroutine being used at more than one location in the
main program or by another subroutine. When a subroutine that modifies the variables of
the main program is called multiple times from different locations in the main program or
by different subroutines, the values of those variables become unpredictable and can
produce inaccurate results, and debugging and maintenance become a nightmare.
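The two methods can be compared side by side in a short Python sketch. The variable balance and both subroutine names are hypothetical, and the example uses a simple number. In the first method the subroutine changes the variable of the calling program. In the second it receives a value and returns a result.

```python
balance = 100            # variable declared in the calling program


def add_interest_shared():
    # First method: the subroutine works on the caller's variable directly.
    global balance
    balance = balance + 10


def add_interest_by_argument(amount):
    # Second method: the subroutine receives a value and returns a result.
    return amount + 10


add_interest_shared()
print(balance)                # the caller's variable was changed by the subroutine

new_balance = add_interest_by_argument(balance)
print(balance, new_balance)   # the caller's variable keeps its value
```

With the second method, the calling program decides what to do with the returned value, so calling the subroutine from many places does not silently change shared data.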
When passing arguments, we need to take these precautions:
The number of data items passed on to the subroutine must be the same as the number
of arguments declared in the subroutine. For example, take a look at the following call to
a subroutine:
int basicpay;
float allowances;
float deductions;
float empsalary;
/* here we read the table, obtain the data, and call a subroutine
to compute the salary */
empsalary = call compute_salary(basicpay, allowances,
deductions);
In the earlier statement, “call” is a keyword to call a subroutine. Most programming lan-
guages do not use this kind of keyword. Just the name of the subroutine is sufficient to call
the subroutine. “compute_salary” is the name of the subroutine being called. The three
words, “basicpay,” “allowances,” and “deductions” are the arguments being passed on to
the subroutine. “empsalary” is the variable to which the value computed by the subroutine
is assigned.
Now somewhere in the program is our subroutine named compute_salary. It would look
something like this:
The word “sub” is the keyword to denote that this statement declares a subroutine. Some
programming languages require such a keyword in front of the name of the subroutine,
but some programming languages do not require such a tag. The variables in the
parentheses receive the values passed on by the calling program. They are
the same in number as the arguments in the calling program. Their types also need to be the same.
The variable “d” in the subroutine is the variable in which the result of the processing is
stored. The statement “return d” causes the computed value to be returned to the calling
program, and it would be received by the variable “empsalary” declared in the calling
program. Summarizing this discussion, let us enumerate the rules of argument passing:
1. The number of values passed from the calling program to the subroutine must be
the same as the number of variables mentioned in the parentheses of the subroutine
being called.
2. The order of the arguments passed from the calling program must be the same as
the order of the receiving variables declared in the subroutine.
3. The data type of each of the arguments being passed on by the calling pro-
gram must be the same as the corresponding receiving variable declared in the
subroutine.
4. The size of each receiving variable in the subroutine must be equal to or greater
than the corresponding argument in the calling program. If the size of the receiv-
ing variable is smaller than that of the corresponding argument, then the value
may be truncated and the result can be inaccurate.
5. The names in the arguments (variables supplied by the calling program) need not
be the same as those in the parameters (variables in the subroutine for receiving
values from the calling program).
6. The subroutine can return one or multiple values back to the calling program as
required by the situation at hand.
7. The size of the variable that is designated for receiving the value returned by the
subroutine must be equal to or greater than the corresponding variable of the sub-
routine. If the receiving variable is smaller in size than the returning variable, the
value may be truncated.
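Some of these rules can be tried out in a short Python sketch. The parameter names a, b, and c and the formula (basic pay plus allowances minus deductions) are assumptions made for illustration. Python matches arguments to parameters by position and rejects a wrong number of arguments, but it does not enforce the rules about data types and sizes, which matter in languages with declared types.

```python
def compute_salary(a, b, c):
    """a = basic pay, b = allowances, c = deductions (hypothetical names)."""
    d = a + b - c        # assumed formula: pay plus allowances minus deductions
    return d


basicpay = 30000
allowances = 4500.0
deductions = 2500.0

# Arguments are matched to parameters by position, so number and order matter.
empsalary = compute_salary(basicpay, allowances, deductions)
print(empsalary)

# Passing the wrong number of arguments is rejected when the call is made.
try:
    compute_salary(basicpay, allowances)
except TypeError as error:
    print("Rejected:", error)
```

Swapping the last two arguments would still run, but it would compute a different and wrong salary, which is why the order of the arguments needs care.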
Message Passing
OOM uses the term “message passing” for communication between objects. The C
programming language family does not have any programs or subprograms. It just has
functions, one of which must be the “main” function. In OOM, everything is an object. Therefore,
there is no calling program or subroutine. But objects can and do call each other. When
all are treated equally, the phrase “parameter passing” does not look appropriate. So, the
phrase “message passing” is used instead. In our terminology, there is a calling program and
there is a called program; in OOM terminology, there is an object calling another object and
there is an object receiving that call and returning the result of processing. My averment is
that the called program (or object) is a subroutine of the calling program (or object). My definition
(or that of Mr. Dennis Ritchie, who developed the C programming language) of a
subroutine is that it is a program that cannot be converted into an executable program on its own.
The message being passed between the objects usually contains:
1. The name of the object being called: In a program, many objects are likely to be called.
Therefore, the message needs to contain the object ID so that the message can be
passed on to the appropriate object.
2. Function ID: An object is likely to contain multiple functions within it, so the
message needs to contain the function ID so that the right function is called into
execution.
3. Information: Information consists of data items that need to be passed on to the
function in the object that is being called.
The results are returned by the function using the same mechanism of message passing.
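The three parts of a message can be made visible in a small Python sketch. The class Calculator, the dictionary objects, and the function send_message are hypothetical names. In everyday code, the language does this routing for us when we write a call such as calc.add(2, 3).

```python
class Calculator:
    def add(self, x, y):
        return x + y

    def multiply(self, x, y):
        return x * y


objects = {"calc": Calculator()}     # object ID -> object


def send_message(object_id, function_id, *information):
    """Deliver a message to the named object and return its reply."""
    receiver = objects[object_id]                # 1. the object being called
    function = getattr(receiver, function_id)    # 2. the function within it
    return function(*information)                # 3. the data items passed on


print(send_message("calc", "add", 2, 3))
print(send_message("calc", "multiply", 4, 5))
```

The value returned by send_message is the reply, which travels back to the caller through the same mechanism.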
remember multiple data items, multiple processing tasks, their interrelation, and
so on. To be able to do so, we need super skilled programmers who need to be paid
higher rates, and they are in short supply! With shorter programs, programmers
can easily understand, debug, and modify the code. Normal programmers can easily
understand shorter programs as they contain fewer actions and fewer data
items. Normal programmers are less costly and are in greater supply.
2. Avoid redundancy/duplication of code in the program: Oftentimes, we need to write
the same code multiple times in the programs to cater to common tasks. If we do
not use subroutines, we need to insert the same code at multiple locations in the
program or application. If we have the same code at multiple locations, we may
forget to modify the code at all locations when a change occurs. This gives rise to
code integrity issues. With subroutines, we can avoid this code integrity issue,
especially during software maintenance. Subroutines can be reused in the pro-
gram as required.
3. Use the subroutine across multiple programs: It is a common occurrence that the same
task needs to be performed in multiple programs. If we do not use subroutines, we
need to insert the code in all the programs that need the task. With subroutines,
we can avoid such a situation. We can code the subroutine once and then link it
with the programs that need the task. If we change the code in the subroutine,
all the programs using it will automatically get the updated functionality. With
this facility, we can effectively reuse the code and reduce the development time of
applications.
4. Build libraries: Besides the previous benefits, subroutines facilitate building
libraries, especially in the modern programming languages. We can collect all the
common subroutines and build them into one library. Then, any program
needing a subroutine that is included in the library can use it and build an execut-
able by linking with the library at the link time during the process of building
the executable file. In the case of DLL (Dynamic Link Library), the library will
be linked at the runtime. It promotes code reuse and reduces the total software
development cycle time, too.
5. Make debugging easier during initial development and during software maintenance in
production: Smaller programs are easier to understand, which makes them easier to
debug during initial development and during software maintenance
in production. The time taken for debugging or maintenance
does not grow linearly with the length of the program. If it takes one hour to
locate and make a modification in a 50-line program, either for fixing a bug or for
maintenance, a single 1000-line program takes much more than the 20 hours (1 × 1000/50)
that simple proportion suggests! But if we have
twenty 50-line programs, we take just 20 hours to carry out the same maintenance
task! The same is true in the case of initial development too. Complexity
increases when the quantity or size increases. Imagine posting a single letter
and posting 10,000 letters. Inserting a single letter in an envelope takes practically no
time at all! But inserting 10,000 letters in envelopes and posting them
can take days if you do not use a machine. And if you do use a machine, note that the
complexity has increased so much that a machine became necessary just for inserting
letters in envelopes! So, we need to keep the programs shorter, and subroutines are
a great mechanism to do so.
6. Make quality control and testing easier and quicker: When we walk through a program,
we need to remember most of the code to determine the action being taken
by the program. If the program is long, our memory fails us, and we need to walk
back and forth through the program to understand what the statements are meant
to do. With shorter programs, our memory can easily handle the code, and
we do not spend time walking back and forth during reviews. Similarly, during
testing, we need to run fewer steps to test a short program and
disproportionately more steps to test a longer one. Shorter programs increase
the productivity of our quality-control activities and reduce the time spent on
quality control of our programs. Subroutines are a great way to keep programs short.
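The reuse benefit described in the list above can be illustrated with a minimal sketch. The routine apply_discount and its two callers are hypothetical names invented for this illustration; the point is only that the discount rule is coded once and every caller picks up a change to it.

```python
def apply_discount(amount, percent):
    """Return amount reduced by percent; the only place this rule is coded."""
    return round(amount * (1 - percent / 100), 2)


def invoice_total(prices, percent):
    # First caller: discount the sum of an invoice.
    return apply_discount(sum(prices), percent)


def quote_price(list_price, percent):
    # Second caller: discount a single quoted price.
    return apply_discount(list_price, percent)


if __name__ == "__main__":
    print(invoice_total([100.0, 50.0], 10))
    print(quote_price(80.0, 10))
```

If the discount rule changes, only apply_discount is modified, so there is no second copy of the code to forget.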
1. Limit the tasks to one: No programming language places any restrictions on the
number of tasks that can be included in a subroutine. Usually, it is a best practice
to limit the subroutine to one task. Of course, occasionally we need to include
more tasks in a subroutine, especially when these tasks are closely related and
are selected by clicking an option. But we need to see if we can have a separate
subroutine for each task even in such cases. When we limit the subroutine to one
task, we can generalize it and reuse the code in other projects or applications.
2. Limit the code to 50 lines: Again, no programming language places any restriction on
the number of lines a subroutine can have. Since the very purpose of a subroutine is to
shorten the calling program, coding a long routine defeats the very purpose of a sub-
routine. But, the question always faced by programmers is—how long is really long?
I have seen some organizations defining 50 lines of code as a long program, espe-
cially for subroutines. It is easy for programmers as well as the reviewers to hold 50
lines in memory. Of course, this is an arbitrary number that I observed to be serving
the purpose well. What I advocate is that each organization ought to define its own
maximum length of a subroutine for adherence in that organization. The program-
ming language has an impact on the number of lines that can be easily understood
for easy programming and reviewing. So, the definition of the maximum number of
lines of code that can be permitted in a subroutine ought to be different for each of
the programming languages used in the organization.
3. Make it universal: Remove hard-coding and make the subroutine universal so it can
be used by other programmers. I would say that all subroutines must be written in
such a way that they can be called from any program needing that task to be carried
out. To achieve this objective, we should not use any static variables, either defined
inside the subroutine or defined in the calling program. We should completely avoid
manipulating the variables declared in the calling program. We should receive all
parameters and return all the results back to the calling program. We should declare
all intermediate variables as required within the subroutine itself. This way, we can
hive off the subroutine and make it part of a library of subroutines at the organiza-
tional level for use by all the programmers and projects inside the organization.
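As a rough illustration of the "make it universal" guideline, the sketch below shows a routine that receives everything it needs as parameters, keeps its intermediate variables local, and hands the result back through its return value. The routine name average_score is a hypothetical example, not taken from any library.

```python
def average_score(scores):
    """Return the mean of scores, or None for an empty sequence.

    The routine reads no variable of the calling program and changes none:
    input arrives as a parameter and output leaves as the return value.
    """
    if not scores:
        return None
    total = 0.0              # intermediate variable, local to the routine
    for score in scores:
        total += score
    return total / len(scores)


if __name__ == "__main__":
    marks = [70, 80, 90]
    print(average_score(marks))
```

Because the routine depends on nothing outside itself, it could be moved into a shared library without modification.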
1. Making it too long: This is a common pitfall that programmers often fall into.
We get carried away and make the subroutine too long. When we come across a
situation in which we may need a subroutine longer than 50 lines,
we need to consider the possibility of splitting it into two or
more subroutines. Especially in scientific and mathematical programming, we
may need longer subprograms. In such cases, we can have a separate specification
of subroutine length if we cannot split the processing into multiple subroutines.
My suggestion is to treat 50 lines as the recommended limit for a subroutine. Any
subroutine longer than 50 lines is too long. One may ask: how about
55 lines? Any specification has a certain tolerance. Generally, a 5% tolerance is
accepted in most cases. In some cases, a 10% tolerance is also accepted. For a subroutine,
if we accept a 10% tolerance, the length can go up to 55 lines.
But even in such cases, we need to see if there is a possibility to hive off some of the
code into another subroutine.
2. Stuffing multiple tasks: This is another trap we often fall into. Instead of limiting
the subroutine to one task, we are tempted to stuff more tasks into one subroutine,
especially when the tasks are small steps needing very few lines of code. No
organization would place any restriction on the minimum number of lines of code
a subroutine must have. Therefore, we should not feel compelled to include more
tasks in a subroutine just because it is too short! As far as possible, we should limit the
number of tasks in a subroutine to one.
3. Calling of other subroutines leading to endless call cycles: While developing subroutines, we
may need to include code to call other subroutines. When we call a subroutine,
we need to ensure that the called subroutine does not call the calling subroutine in turn!
Otherwise the subroutines keep calling each other without end. In some
cases, we may call a subroutine that calls another subroutine, which calls the
present subroutine! I will explain:
a. Let us assume we are developing a subroutine named subroutine-1.
b. Subroutine-1 calls subroutine-2.
c. Subroutine-2 calls subroutine-3.
d. Subroutine-3 calls subroutine-1.
e. As you can see, the calls go round in a circle. Unless some condition breaks the
circle, the program hangs or crashes when it runs out of memory for the pending
calls. (Strictly speaking, this is runaway recursion rather than a deadlock, in which
two parties each wait for the other.) We should avoid this kind of circular call
among subroutines.
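The circular chain of calls described above can be reproduced in a few lines. This is only a sketch with made-up routine names. In Python, the interpreter stops such a chain with a RecursionError once its call-depth limit is reached; in other languages the program may instead crash with a stack overflow or appear to hang.

```python
def subroutine_1(depth=0):
    return subroutine_2(depth + 1)


def subroutine_2(depth):
    return subroutine_3(depth + 1)


def subroutine_3(depth):
    # Closes the circle: nothing ever stops the chain of calls.
    return subroutine_1(depth + 1)


def run_cycle():
    """Start the chain and report how it ended."""
    try:
        subroutine_1()
    except RecursionError:
        return "call cycle detected"
    return "finished"


if __name__ == "__main__":
    print(run_cycle())
```

A call cycle is acceptable only when some condition is guaranteed to end it, as in deliberate recursion with a terminating case.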
All in all, subroutines are a great way to reduce the size of programs and thus limit their
complexity to manageable levels. We ought to make, and in the software industry we
actually do make, extensive use of subroutines.
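The length guideline discussed in this chapter (50 lines, with a tolerance of up to 10%) can be checked mechanically. The following is a minimal sketch for Python source only, using the standard ast module; the function name long_subroutines and its default limit and tolerance are assumptions chosen to match the figures in the text, and each organization would substitute its own.

```python
import ast


def long_subroutines(source, limit=50, tolerance=0.10):
    """Return (name, line_count) for every function longer than limit plus tolerance."""
    allowed = int(limit * (1 + tolerance))
    report = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count lines from the "def" line to the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > allowed:
                report.append((node.name, length))
    return report
```

Calling long_subroutines on the text of a source file returns the routines that are candidates for splitting.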
16
Building and Using Libraries
Introduction
When we work in organizations or have our own software development set-up, we soon realize
that many functions recur in our software development projects. It is possible that
these common functions have minor deviations from each other, but most of the code
and functionality remains the same. One way is to write the code, copy it to the new projects,
and make modifications as needed. This method is not seen as a very professional
one, as it takes time not only to repeat and modify the code but also to carry out the
vital quality-control activities of code review and testing. The second method is to build
libraries of useful subroutines for commonly used functionalities, and then use them when
developing software in different projects. This aspect is discussed in this chapter.
Types of Libraries
A normal library is filled with a collection of different books. A library in the context of
software development refers to a collection of independent subroutines that can be called
by other programs. In fact, every programming language provides a set of libraries along
with its development kit. When we run the linking step during the process of preparing
the executable program, the object code of the relevant subroutines from these libraries
is attached to the program code we wrote, and then the executable file is prepared.
The COBOL language provides for copy books, that is, files whose contents are brought into
the source program when it is compiled. A copy book is a
file containing COBOL code, and we bring it in by inserting a COPY statement in the
code. In other programming languages, we simply use the routines inside the program
and include the library during the linking step, along with other libraries, during the
process of preparing the executable file. In present-day IDEs (Integrated Development
Environments), we include the name of our library in the project references. How exactly we
include our library along with the libraries supplied by the SDK (Software Development
Kit) supplier is a matter of detail and changes from one development platform to another.
There are three types of libraries:
1. Static libraries: This is the initial type of library to be used in the third-generation pro-
gramming languages. It began with the FORTRAN programming language, which
supplied a large set of mathematical routines in its libraries. The mathematical
library was the reason why the FORTRAN programming language was selected for
mathematical and scientific programming, and it remains a popular choice for that
type of programming even today. In this kind of library, the library code is attached
to the program code, increasing the size of the executable file. Initially, the entire
library code was attached to the executable file irrespective of the number of routines
from the library that were used in the program. Later on, this was changed to
attaching only those routines that were actually used by the program. This reduced
the size of the executable file. Still, the executable file was larger than the
program itself to the extent of the library subroutines included in it. In those days
of constraints on the available RAM, increased size meant increased pressure on
the resources available, degrading the performance and throughput of
the computer. Static libraries had one major advantage: the
program was a single executable! All the required code was inside the executable file,
so it could be carried to another identical computer with ease.
2. Dynamic libraries: Dynamic libraries are not attached to the executable file during
the linking stage but are needed on the disk for loading into the RAM during the
execution of the program. The dynamic library may be loaded into the RAM either
at the time of starting the execution of the program or during the first time it is
called by the program in execution. The library needs to be available on the disk
in the search path set for the execution of the program. Over time, this developed
into the DLL (Dynamic Link Library) that is used extensively in the software
industry today. DLLs stay resident on the disk and are loaded into the RAM
as and when called by a program in execution. A DLL is cleared from the RAM when
the calling program is closed, along with the other resources held by that program.
What happens when more than one program calls for the same DLL? In the scheme
described here, it is loaded into the program space of both programs. In other words, two copies
of the DLL are loaded into the RAM but at different locations. Each executing program
has its own copy of the DLL in its space in the RAM. Presently, when
a routine in the library is called by a program in execution, only that routine is
loaded into the RAM, not the entire DLL.
3. Shared libraries: A shared library is also referred to as a runtime library. We have
noted the disadvantage of the DLL in the previous item. Some routines are
needed by many programs in execution, and loading a separate copy for each
would consume a significant amount of space in the RAM. This is common in
the software that constitutes the OS. So, OS developers have developed a
technique to keep just one copy of the library in the RAM and allow all the
programs in execution to use it as needed. Instead of loading the entire executable
code of the routine for each of the calling programs, only the data needed
for the variables is created separately for each of the programs using the library
routine. This, in fact, mimics a multi-user OS, in which each user is
provided with a session space in the RAM while the program remains one.
Shared libraries work in a similar manner. Implementation of shared
libraries is in the domain of the OS. Unless the OS provides this facility, we cannot
use shared libraries. Many OSs use shared libraries, and most multi-user OSs provide
a facility for using shared libraries developed by application programmers.
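The idea of a dynamic library being loaded "during the first time it is called" can be sketched by analogy in Python, where importlib.import_module locates a module on the search path and loads it at run time. The LazyLibrary class below is a hypothetical wrapper written for this illustration. It is not how an OS loads a DLL, and Python's own import system additionally caches modules behind the scenes.

```python
import importlib


class LazyLibrary:
    """Load a named module the first time one of its routines is requested."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None          # nothing loaded through this wrapper yet

    @property
    def loaded(self):
        return self._module is not None

    def call(self, routine_name, *args):
        if self._module is None:
            # First call: find the library on the search path and load it.
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, routine_name)(*args)


if __name__ == "__main__":
    maths = LazyLibrary("math")
    print(maths.loaded)
    print(maths.call("sqrt", 16.0))
    print(maths.loaded)
```

If the named module cannot be found on the search path, import_module raises ModuleNotFoundError, which mirrors the requirement that a dynamic library be present on disk when the program runs.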
As you can see, the difference is in the usage of the library. All are built the same way. Now
let us see how the libraries are built so that we can use them in our software development
projects.
Building Libraries
The components that can be included in a library are object programs. An object program
is the compiled version of the source code. We cannot usually include GUI controls in
a library. Most development platforms provide separate facilities for building libraries
with GUI controls. Such libraries are usually referred to as visual component libraries
or by a similar name. Usually we build libraries that perform some processing functions
using program statements. Different programming languages and OSs have different facilities for
including routines in the library. The beauty of a library is that it can be used with programs
written in other languages, provided the languages involved support a common calling
interface. You can write the code of a routine in one language but call
it from a program written in a totally different language. For example, the code of the
library routine might have been written in Visual Basic and, once it is built
into a library, it can be called by a program written in Java, given a suitable bridge
between the two platforms!
Now here are the steps in building a library:
1. Write the routine using the source code. It could be written using a text editor
or an IDE (Integrated Development Environment). The routine should be written
such that it does not call for routines from any other code libraries including the
libraries supplied along with the SDK. It should be a stand-alone program not
needing any external routine for its functioning.
2. Compile the routine to make it into object code. All development platforms pro-
vide this facility to compile any routine into its object code. Remember that an
executable file is primarily object code linked with the required libraries.
3. Develop all the routines you propose to include in the library as stated in
step 1 earlier. Then compile all those routines into their corresponding object
files.
4. Build all those routines into a library:
a. All IDEs provide a facility to build a library.
b. When you select the option to build a library, you need to specify all the rou-
tines that are intended to be included in the library. You also need to give a
name with which the new library is accessed. All these names have to adhere
to the rules of naming specified by the IDE.
c. The IDE builds the library and assigns it the name and stores it in the directory
specified by you.
d. In some cases, you may have to issue an appropriate command to build the
library from the command prompt and supply the names of the object files
being included in the library as well as the name for the new library.
5. Now the library is ready for use!
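The steps above can be mimicked in Python as a rough analogy: each routine is written to a source file, compiled to bytecode with py_compile, and the compiled files are bundled into one zip archive that Python can import from. The function names build_library and use_library, and the module and file names in the usage example, are hypothetical. A compiled language would use its own compiler and librarian or linker tools instead.

```python
import importlib
import os
import py_compile
import sys
import tempfile
import zipfile


def build_library(routines, library_path):
    """Compile each routine and bundle the compiled files into one archive.

    routines maps a module name to its source text (illustrative input).
    """
    with tempfile.TemporaryDirectory() as work_dir:
        with zipfile.ZipFile(library_path, "w") as library:
            for name, source in routines.items():
                source_file = os.path.join(work_dir, name + ".py")
                with open(source_file, "w") as handle:
                    handle.write(source)                 # step 1: write the routine
                compiled_file = py_compile.compile(      # step 2: compile it
                    source_file,
                    cfile=os.path.join(work_dir, name + ".pyc"),
                    doraise=True,
                )
                # Step 4: add the compiled routine to the library.
                library.write(compiled_file, arcname=name + ".pyc")
    return library_path


def use_library(library_path, module_name):
    """Put the archive on the search path and load one module from it."""
    if library_path not in sys.path:
        sys.path.insert(0, library_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "shapes_lib.zip")
    build_library({"shape_routines": "def area(w, h):\n    return w * h\n"}, target)
    print(use_library(target, "shape_routines").area(3, 4))
```

Adding a routine to an existing library would correspond to opening the archive in append mode rather than creating it afresh.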
Usually, IDEs provide facilities to build new libraries as well as to add new routines
to an existing library. We need to select the option either to create a new library or to
add routines to an existing library. Then the IDE will carry out the command and build a
new library or add routines to an existing library as specified by you. If you are building
a library from the command prompt, you need to specify an option either to create a
new library or to add routines to an existing library. These options need to be specified
according to the syntax of the relevant OS command.
The process of building a DLL is also similar to the steps described here. You will be
using the facility provided in the IDE to build a DLL.
In fact, the process of building a library is simple and its benefits are huge! But alas, I do
not find many programmers or organizations making use of this great facility.
libraries accordingly. I have seen some OS that allow libraries to be included at the time of
execution too. Especially in Internet applications, the code often runs in interpreted form.
That is, the source code statements are not precompiled into object code. Each statement is
translated on the fly during execution, then linked to the relevant libraries and executed.
If the execution encounters an error, the runtime environment reports the error and aborts
the execution of the program. In such cases, the libraries are linked to the code at the
time of program execution.
Once each routine is documented in this manner, we need to index it in the table that holds
the searchable index of all the libraries and the routines therein. I suggest the following
structure for such a table:
With such a mechanism, the libraries can be effectively utilized by the programmers and
organizational productivity can be greatly increased.
All of the above constitute the components of an organizational framework that facilitates
and encourages the use of libraries in the organization.
Now, the individual programmers also have a role to help the organization maintain a
healthy environment for building and using the libraries. These are:
1. Identify the routine they came across during their work that can be included in
a library and inform the organizational agency responsible for maintaining the
libraries for possible inclusion.
2. Assist the agency responsible for maintaining the libraries in their assessment of
the suitability of the routine for inclusion.
3. Assist the organizational agency in documenting the routine for future use.
4. Identify any opportunities for improvement in any of the aspects of maintaining
the organizational libraries for more effectiveness.
5. Identify any opportunities for improvement, including the defects that surfaced in
the routines that are in a library, and assist the organization to replace the existing
routine with the improved routine in the library.
6. Carry out any other activity that the organizational agency responsible for
maintaining the libraries assigns as necessary for building and using libraries
effectively and efficiently, including training newer members of the
organization.
In this way, the organization and the individual programmers need to work shoulder-to-
shoulder in a close-knit manner to derive benefits from the activity of building and using
the libraries. While the organization derives financial benefit, the individuals derive the
benefit of avoiding the monotony of writing the same code again and again.
17
Programming Device Drivers
Introduction
Device drivers are programs that are developed to interface between the computer and
the device or machine connected to the computer and that assist the computer in controlling
such devices or machines. A device can be a printer, a VDU (Visual Display Unit), a
tape drive, a mouse, a camera, a CNC (Computer Numerically Controlled) machine, an
airplane, a rocket, a car, or any other such machine. Programming is not limited to com-
puterizing business operations, developing decision support systems, or mathematical
solutions. Computers have wide-ranging applications. In all the applications, the output
has to finally be delivered on a machine or device such as a printer or another machine.
Computers are controlling nuclear reactors, rockets, and flow process production systems
such as fertilizer and pharmaceutical manufacturing. In a business environment, when we
purchase a computer with its peripherals, we get it with system software and the needed
device drivers that are supplied along with the computer. The supplier of the device also
supplies the software or the device driver that controls the device supplied by their orga-
nization. In this chapter, let us discuss how to develop the device drivers. However, this
chapter provides a brief introduction so that you can build on it and develop programs
needed to interface and control the devices. It is by no means intended to make you an
expert device driver developer.
What Is a Device?
From the standpoint of a programmer, what is a device? For our context, a device is any
machine or gadget that can be controlled by a computer. A programmer knows how to
write a computer program that can process information but not run a device. A computer
program can deliver the processed information to any output device, but cannot pilot an
airplane or print an output on a paper. Those activities such as printing, piloting a plane or
a rocket, or running a machine have to be performed by the device itself. What a computer
can do is pass on such information that is needed by the device to run itself effectively.
In the absence of a printer, a typist used to print the matter on paper using a typewriter
machine. The airplane was piloted by a qualified and certified pilot. The human being
tending to the machine performed two functions, namely, running the machine and
making informed decisions. Running the typewriter involves loading the paper, pressing
the keys with the right amount of force, returning the carriage of the typewriter to
the beginning position when the end of the line has been reached, formatting the output by
setting tabs appropriately, and so on. The decisions the human being made included the
positions where to set the tabs, when to return the carriage, ejecting the paper when it is
filled up and loading the new paper, transferring the information from a handwritten
note to the paper, and so on. Engineers have come up with gadgets that can take over the
actions the human being performed, and the computer took over the decision-making
portion of the human being. Our device driver software makes the decisions to drive the
device as well as to transfer the data required by the device to implement the decision. In
other words, the device driver software gives commands to the device along with the data
to implement the command.
Each device that is designed for working under the control of a computer has some
interfacing hardware to interact with the computer. This hardware would perform the
following functions:
Each of the subassemblies of the device would have a means to perform the work as com-
manded by its computer interfacing component.
In short, our programs interact only with this interfacing hardware and software com-
ponent of the device. We need not concern ourselves with how the device implements our
commands and delivers the right output expected of it. For us programmers, our vision of
the device is limited to providing the right command with the right data at the right time
to the interfacing component of the device.
Core functionality actions are those that deliver the expected output from the device. A printer
is expected to print the output on paper. A music player is expected to play music conforming
to the playlist. An airplane is expected to reach its destination flying the specified route.
Ancillary functionality actions are those that keep the core functionality actions safe
and secure for the end users by protecting them from failures and malfunctions of the
device.
While we cannot generalize the core functionality actions into subclasses, we can generalize
the programs we need to develop for the ancillary functionality actions. Let us now
discuss coding these two main classes of actions.
What commands to give and with what data depends on the device itself, and we need to
develop the needed programs accordingly.
3. Data transfer: Once the device is dedicated to our program, we begin transferring a
series of commands followed by corresponding data to the device. We follow these
steps in transferring data:
a. Query the device to ensure that it is ready to receive data.
b. Begin transferring data, one packet at a time. The size of the packet differs
from device to device.
c. Wait for the acknowledgement from the device that the packet is received and
that the data is in healthy condition. If the acknowledgement is not received,
the program needs to resend the packet once again. We repeat sending the
packet until our program receives an acknowledgement from the device that
the data received is in healthy condition.
d. Repeat the previous three steps until all the data needed by the command is
transferred to the device.
e. Move to execute the next instruction in the program.
4. Respond to device interrupts: The device needs to place interrupts on the CPU of the
computer quite a few times. Some occasions would be device-specific. Here are
some general reasons which are common to most devices:
a. The receiving buffer is full: Each computer-controlled device has a small amount
of RAM, and it gets filled up pretty quickly. In this case, the device places this
interrupt on the CPU to halt further data transfer.
b. Receiving buffer is empty: When the data transfer has been halted because the buffer
was full, the device places another interrupt on the CPU once the buffer has been
emptied. The CPU then resumes transferring the remaining data.
c. Error condition: The device may experience a variety of errors: an output
material such as paper is exhausted, the tool is not working, there is a paper jam,
and so on. The device then places an interrupt on the CPU so that the computer can
take appropriate action.
d. Online: Sometimes, the communication may break down between the com-
puter and the device for some reason. In such cases, when the communication
is reestablished, the device places this interrupt to indicate that it is ready to
receive instructions from the computer.
e. There would be many other occasions that necessitate placing an interrupt on
the CPU, and our program needs to be ready to handle all such interrupts.
5. Error trapping: This is a major function of the device driver. The device is subject to
a host of errors. Device driver software needs to trap and handle all error conditions
and steer the computer smoothly on to other functionality without
affecting the functioning of the computer in any manner. Typical conditions include:
a. Buffer full.
b. Out of paper.
c. Not powered up.
d. Error conditions.
e. Ribbon, cartridge, and ink.
f. Parameters out of range.
g. Tool broken.
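The data transfer steps in item 3 above (query readiness, send a packet, wait for the acknowledgement, resend if necessary) can be sketched against a simulated device. Everything here is an assumption made for illustration: the SimulatedDevice class, its fail_on parameter, the packet size, and the cap on retries, which the text does not specify but which keeps the sketch from looping forever.

```python
class SimulatedDevice:
    """Stand-in for a device interface; ignores the attempts listed in fail_on."""

    def __init__(self, fail_on=()):
        self.received = []
        self._attempt = 0
        self._fail_on = set(fail_on)   # attempt numbers left unacknowledged

    def is_ready(self):
        return True

    def receive(self, packet):
        self._attempt += 1
        if self._attempt in self._fail_on:
            return False               # no acknowledgement: packet lost or damaged
        self.received.append(packet)
        return True                    # acknowledgement


def transfer(device, data, packet_size=4, max_retries=5):
    """Send data packet by packet, resending until each packet is acknowledged."""
    for start in range(0, len(data), packet_size):
        if not device.is_ready():                       # step a
            raise RuntimeError("device not ready")
        packet = data[start:start + packet_size]        # step b
        for _ in range(max_retries):
            if device.receive(packet):                  # step c
                break
        else:
            raise RuntimeError("no acknowledgement from device")
    return len(data)                                    # steps d and e: all sent, move on


if __name__ == "__main__":
    printer = SimulatedDevice(fail_on={2})
    print(transfer(printer, b"ABCDEFGHIJ"))
    print(printer.received)
```

A real driver would take the packet size and acknowledgement format from the device's interface specification.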
In this manner, we need to program all the functions required to drive the device. All we
do in programming devices is interact with the digital interface built within the device and
pass appropriate commands along with the needed data in the specified format. The rest
is taken care of by that device interface to produce the expected results from the device.
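The interrupt handling and error trapping described earlier in this chapter can be sketched as a small dispatcher. The Interrupt names and the DriverState class are hypothetical and model only the four general interrupts listed above; real interrupt handling happens at the OS and hardware level and is far more involved.

```python
from enum import Enum, auto


class Interrupt(Enum):
    BUFFER_FULL = auto()
    BUFFER_EMPTY = auto()
    ERROR = auto()
    ONLINE = auto()


class DriverState:
    """Tracks whether the driver may keep sending data to the device."""

    def __init__(self):
        self.sending = True
        self.error_log = []

    def handle(self, interrupt, detail=""):
        if interrupt is Interrupt.BUFFER_FULL:
            self.sending = False                 # halt further data transfer
        elif interrupt in (Interrupt.BUFFER_EMPTY, Interrupt.ONLINE):
            self.sending = True                  # resume or restart transfer
        elif interrupt is Interrupt.ERROR:
            self.sending = False                 # trap the error, keep running
            self.error_log.append(detail or "unspecified device error")
        return self.sending


if __name__ == "__main__":
    driver = DriverState()
    driver.handle(Interrupt.ERROR, "out of paper")
    print(driver.sending, driver.error_log)
```

Recording the error and returning normally, rather than letting it propagate, reflects the goal of trapping errors without disturbing the rest of the computer's work.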
18
Programming Multi-Language Software
Introduction
Now the world has shrunk; if not in land area, it has shrunk in terms of reach, especially
for products and markets. Gone are the days when we developed software in English and
expected all others to master the English language if they wanted to use our products. Others
have changed, and we also need to adapt to the ways of the world and develop software
in such a manner that people with no expertise in English can use it.
The tools for doing so are available right now. Most software development platforms provide
facilities to develop software for use by people of different languages.
Why do we need our software to work in any language other than English? Of course,
English is the language spoken all over the world. No other language is spoken by so many
people as English. But the people who do not speak English in the world outnumber those
that speak English. Thanks to the low-cost IBM PC and its clones, computers are now used
all over the world. Until recently, we were developing software only in English because
the OS supported only English. But now, the OS, especially that of the PC, supports
other languages, including European, Asian, and Arabic languages. Present-day
keyboards allow for the entry of data in native languages other than English. So, if we do
not develop software in languages other than English, we will be losing a large chunk of
the market. Our software is not like a book that can be translated into the native language and
released. There is also no point in developing a separate version of software for each of
the languages. We need to build in features that allow users of different languages to use
our software. It is possible in the present day, and this chapter gives you an insight into
developing multi-language software.
I need to make one thing clear before we proceed further: We do not develop software
in multiple languages, but we develop software that is amenable for use by people using
languages other than English.
1. All the prompts for the user on the screen can be displayed in any language,
within the set of languages in which it was designed to be used.
2. All the tool tips are displayed in the chosen language.
3. All the help displayed is in the chosen language.
4. All the headings and other labels on the report are generated in the chosen
language.
5. The input is allowed to be entered in the chosen language.
6. The end user uses the software in one language with which he/she is comfort-
able. The end user would not use the software in different languages in different
sessions.
But what does multi-language software look like from the programmer’s standpoint?
I am not aware of any programming language that is available in a language other than
English. Efforts were certainly put in to develop programming languages in languages
other than English, and perhaps in some countries they may be available. But once the
programs are compiled, they will be in machine language, that is, in zeroes and ones! The
source statements are hidden from the end user, who sees the software through the screens
and reports or the actions of a machine. So, the language of the programming language
does not matter to the end user. The multi-lingual attribute needs to be achieved in
the user interface, both on the screen and in the reports.
1. Make the software amenable for use in one language only but with a customizable
user interface
2. Make the software amenable for use in English and one other language
3. Make the software amenable for use in multiple languages
1. We place all the label text in a database table that has only one field in which the
label text is stored.
2. When loading a screen/report on the application, we read the right label text from
the database table and assign the text to the labels appropriately.
3. We provide a separate screen in system administration functionality to change the
label text as desired by the customer.
4. We take the following precautions for this functionality:
a. The action of the changing of the label text is allowed only for the system
administrator.
b. The label text change is allowed for the entire functionality only. Each indi-
vidual user cannot have different label text to suit his/her liking.
c. The size (the number of characters) of the new label text is restricted to the
maximum size of the label.
The term “label” used in the previous bullets includes all text visible to the user including
tool tips, captions on buttons, menu items, and so on.
This method, as described, has multi-user software in view. For single-user software,
the method remains the same except that we give the user the facility to change
the label text as desired.
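The steps above can be sketched as follows. This is only a minimal illustration in Python using an in-memory SQLite database, not a definitive implementation. The table name labels, the key column label_id, the value of MAX_LABEL_SIZE, and the function names are all assumptions made for this sketch; the key column is added so that each label can be found, while label_text is the one field that stores the text.

```python
import sqlite3

MAX_LABEL_SIZE = 20  # assumed maximum size of a label on the screen


def create_label_table(conn):
    # label_id is an assumed key column; label_text is the one field
    # holding the text shown to the user.
    conn.execute(
        "CREATE TABLE labels (label_id TEXT PRIMARY KEY, label_text TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO labels VALUES (?, ?)",
        [("lbl_customer", "Customer"), ("btn_save", "Save")],
    )


def load_labels(conn):
    # Read all the label text once, when the screen or report is loaded.
    return dict(conn.execute("SELECT label_id, label_text FROM labels"))


def change_label(conn, label_id, new_text, is_admin):
    # Precaution (a): only the system administrator may change label text.
    if not is_admin:
        raise PermissionError("only the system administrator may change labels")
    # Precaution (c): the new text must fit within the label.
    if len(new_text) > MAX_LABEL_SIZE:
        raise ValueError("label text exceeds the maximum size of the label")
    conn.execute(
        "UPDATE labels SET label_text = ? WHERE label_id = ?", (new_text, label_id)
    )


conn = sqlite3.connect(":memory:")
create_label_table(conn)
change_label(conn, "lbl_customer", "Client", is_admin=True)
labels = load_labels(conn)
print(labels["lbl_customer"])
```

Because the change is stored in the table itself, every user sees the same label text, which matches precaution (b).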
1. We design a database table that has two fields. One field would contain the
label text in English, and the second field would contain label text in the desired
language.
2. We develop the software in our usual language, that is, English, storing all the
label text in the database table field specified for English language label text.
3. Concurrently, we get someone to translate the label text to the desired language
and store it in the database table field that is designated for the desired language
label text. Here the size of the new label text in another language needs to be
restricted to the size of the original label text placed on the screen or report.
4. We will also have a facility to capture the preferred language of the user and store
it in the configuration file. When we launch the screen/report, we read the pre-
ferred language and then read the label text from the appropriate field and assign
it to the labels as necessary. This will achieve the objective of showing the user
interfaces with the language desired by the user.
5. We also build in a facility to change the label text as desired by the end users to
suit their needs. However, this facility is restricted to be accessed only by the sys-
tem administrator in the case of multi-user software. No such restriction would be
placed in the case of single-user software.
6. The size of the new label text when the original label text is changed needs to be
restricted to the size of the original label text of the label placed on the screen or
report.
The term “label” used in the earlier bullets includes all text visible to the user including
tool tips, captions on buttons, menu items, and so on.
In fact, this can serve the purpose of providing software in the language desired by the
customer. However, we need to know who the customer is before we sell the software and
be allowed time to translate the label text. This situation is possible in custom software
development. This is not a good alternative for the COTS product scenario.
1. We design a database table that has as many fields as the number of languages in
which we wish to make our software available. One field would contain the label
text in English and each of the other fields would contain the label text in one
desired language.
2. We develop the software in our usual language, that is, English, storing all the
label text in the database table field specified for English language label text.
3. Concurrently, we get language experts to translate the label text to the desired
languages and store it in the database table fields designated for the respective
languages. Here the size of the new label text in languages other than English
needs to be restricted to the size of the original label text placed on the screen
or report.
4. We will also have a facility to capture the preferred language of the user and store
it in the configuration file. When we launch the screen/report, we read the pre-
ferred language and then read the label text from the appropriate field and assign
it to the labels as necessary. This will achieve the objective of showing the user
interfaces with the language desired by the user.
5. We also build in a facility to change the label text as desired by the end users to
suit their needs. However, this facility is restricted to be accessed only by the sys-
tem administrator in the case of multi-user software. No such restriction would be
placed in case of single-user software.
6. The size of the new label text when the original label text is changed needs to be
restricted to the size of the original label text of the label placed on the screen or
report.
The term “label” used in the earlier bullets includes all text visible to the user including
tool tips, captions on buttons, menu items, and so on.
There is one more method of achieving this functionality. In this method, instead of
using only one table for all the label text, we use a separate table for each of the languages
in which we intend to make our software available. In the configuration file, we capture
the desired language and then use the corresponding table while loading the label and
other text.
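A minimal sketch of the one-table, one-field-per-language design might look like the following Python code. The field names text_en, text_de, and text_fr, the language codes, and the config dictionary standing in for the configuration file are assumptions for illustration only. The fallback to English when a translation is missing is also an assumption of this sketch, not something the method above prescribes.

```python
import sqlite3

# Assumed set of supported languages; each maps to one field of the table.
LANGUAGE_FIELDS = {"en": "text_en", "de": "text_de", "fr": "text_fr"}


def create_label_table(conn):
    conn.execute(
        "CREATE TABLE labels ("
        "label_id TEXT PRIMARY KEY, text_en TEXT NOT NULL, text_de TEXT, text_fr TEXT)"
    )
    conn.executemany(
        "INSERT INTO labels VALUES (?, ?, ?, ?)",
        [
            ("btn_save", "Save", "Speichern", "Enregistrer"),
            ("btn_exit", "Exit", "Beenden", None),  # French translation not yet stored
        ],
    )


def load_labels(conn, preferred_language):
    # The field name comes from a fixed list, never from user input,
    # so it is safe to place it in the SQL text.
    field = LANGUAGE_FIELDS.get(preferred_language, "text_en")
    rows = conn.execute(f"SELECT label_id, text_en, {field} FROM labels")
    # Fall back to English when a translation has not been stored yet.
    return {label_id: translated or english for label_id, english, translated in rows}


conn = sqlite3.connect(":memory:")
create_label_table(conn)
config = {"preferred_language": "fr"}  # stands in for the configuration file
labels = load_labels(conn, config["preferred_language"])
print(labels)
```

With a separate table per language, only the table name chosen in load_labels would change; the rest of the loading logic could stay the same.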
Limitations
Now that we know how to develop software for use in languages other than English, we
need to learn the limitations of such development. Here are some such limitations:
1. For this facility to work, the OS of the computer needs to support the language
desired by the customer. If the OS user interface is in English, and the user likes
to use our software in, let us say, German, which has special characters like “ü,” it
would not be possible unless the OS supports it. That is, our software works on the
OS and if the OS does not provide support for multiple languages, our software
cannot support multiple languages.
2. The European languages are similar to English, but some languages do have much
longer spellings than English. So, the label size has to be larger than that which
is needed for English. This can be circumvented by storing the label size also in
the database table containing the label text of different languages to help accom-
modate more characters in the label text.
3. In languages of the Arabic family, the Indian language family, Chinese, Japanese, and
other such languages, it may be difficult to make our software amenable for use
because I am not sure if any OS supports all those languages. Fonts are made
available in those languages in Windows OS, and perhaps we can use that facility,
but I am not sure about other computers, especially the mainframe and mid-range
computers.
Of course, I have not given a detailed algorithm for each of these alternatives. I have shown
you the way and explained the method so you can design your algorithm. You can use a
loop for assigning the label text to labels beginning with the first control to the last con-
trol on the screen. In a report, you need to pass values of the label text to the labels on the
report as parameters to the program calling the report from the report-generator engine, if
you are using one. If you are creating the report programmatically, you just need to assign
the value of the label text to the variable used to print the labels on the report.
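The loop mentioned above can be sketched as follows. The Control class here is a hypothetical stand-in for whatever screen controls your GUI toolkit provides, and the control names and German label text are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Control:
    # Hypothetical stand-in for a screen control such as a label or a button.
    name: str
    text: str = ""


def apply_labels(controls, label_text):
    # Walk from the first control to the last and assign the stored text.
    for control in controls:
        if control.name in label_text:
            control.text = label_text[control.name]
    return controls


screen = [Control("lbl_customer"), Control("btn_save"), Control("btn_exit")]
label_text = {"lbl_customer": "Kunde", "btn_save": "Speichern", "btn_exit": "Beenden"}
apply_labels(screen, label_text)
for control in screen:
    print(control.name, control.text)
```

A control whose name is not found in the label text keeps its existing text, so a missing row does not blank out the screen.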
19
Programming Languages and Their Evolution
Introduction
As programmers, we need to understand the evolution of programming languages from
the beginning so we can be prepared when the new programming languages come on
to the scene. If you ask me whether learning this history is essential, I would say, “No.”
But learning the history would be advantageous in that it would prepare you for the next
big change. As it has been said, “change is the only permanent thing in this world!” I am
presenting here just a gist of the evolution of the programming languages through the
generations to acquaint you with the development.
those programs. Life has become easier now in programming computers. There were quite
a few assembly languages in those days. Assembly language programs were converted to
machine language instructions using what was referred to as an “Assembler” that trans-
lated each of the assembly language instructions into machine instructions. These assem-
bler languages were christened as 2GLs (2nd Generation Languages), as they are one level
above machine languages. Even today, some portions of OS continue to be written in the
assembly language native to the machine!
High-level languages, as they were called, focused only on the procedure for processing
the data. When the programmers coded the programs in high-level languages, the com-
piler inserted the necessary CPU instructions during the compilation process. The coding
of CPU instructions was automated, and the programmers were relieved of this aspect.
When the compiler compiled a high-level language program instruction, it translated the
instruction into several CPU instructions as needed to achieve the processing specified by
the high-level language instruction.
FORTRAN, whose first compiler was delivered commercially in 1957, is generally accepted
as the first high-level programming language. From the beginning, it was aimed at solving
mathematical problems for scientific purposes, and it had a large library of routines for
that purpose. Of course, the present-day FORTRAN programming language, while it retains
the original flavor, is much more advanced than the original. To this day, it remains a
first choice for applications that involve processing complicated mathematical problems,
such as weather forecasting, astronomical science applications, and so on.
The IBM 1401 computer used Autocoder as its main programming language. It was used
for a considerable amount of time, well into the 1970s. We cannot move forward in our
discussion without mentioning the legendary Ms. Grace Hopper along with CODASYL.
It was Ms. Hopper who firmly believed that computer programs could be written in plain
English instead of in machine language or assembly language. She developed
FLOW-MATIC, the first English-like compiled language, for the UNIVAC I computer. She
advocated for the development of a programming language that was computer-independent
and could be used on any computer.
1. Executive committee: This committee set the policies and provided overall super-
vision of all other committees. It had reviewed the functioning of the rest of the
committees and accorded final approval for all the standards published by them.
2. Programming languages committee: This committee developed the specifications for
a programming language that facilitates exchange of programs and data from
one computer to the other. They came up with the COBOL (Common Business
Oriented Language) programming language.
Ms. Grace Hopper participated in the Programming Languages Committee to guide them
in creating a computer-independent programming language. The committee was influ-
enced by her idea of a programming language that is akin to normal English. The compiler
should translate it to the machine language. This led to the development of the COBOL
language, which was an extension of her FLOW-MATIC programming language. COBOL also
borrowed ideas from IBM’s COMTRAN (Commercial Translator) programming language, a
business-oriented counterpart to FORTRAN. The U.S. government standardized COBOL as
the programming language for all its business data processing applications, which
resulted in COBOL becoming the most popular programming language for data processing
applications in the USA and the rest of the world.
The FORTRAN and COBOL languages held sway over the programming and data pro-
cessing fraternity for a long time, until the 1970s.
The next language that gained significant support from the industry was BASIC
(Beginner’s All-Purpose Symbolic Instruction Code), developed at Dartmouth College
in New Hampshire by John G. Kemeny and Thomas E. Kurtz. While it was intended for
developing programs at the college, its popularity increased due to its simplicity. DEC
(Digital Equipment Corporation) implemented it on their PDP series of lower-cost mini-
computers. They extended the original BASIC and made it an alternative to COBOL
and FORTRAN. BASIC borrowed ideas from FORTRAN. The BASIC programming language,
in many of its implementations, popularized a method of executing the programs that
was referred to as “Interpreting.” Instead of compiling the entire program at one time
and creating an object program that can be executed on the computer, the BASIC
interpreter took each source program statement at the time of execution, translated it
on the fly, and then executed it. Microcomputers, beginning with Apple computers and
later IBM’s PC, adopted this
language as their main programming language. Microsoft took this to a new level and
developed it further—so much so that it is now one of the leading current programming
languages under the title of Visual Basic, or VB in short.
Then, during the development of the UNIX OS, the C language was developed at the
Bell Labs during the early 1970s by Dennis Ritchie. It went on to become one of the most
popular programming languages. The specialty of the C language was that it provided
language constructs to access locations in the RAM and manipulate it along with the con-
structs for normal data processing facilities. In a way, it provided most of the facilities
available in assembler languages as well as the features of the higher-level programming
languages. Now, there are quite a few languages of its ilk, so much so that they are referred
to as the C-family programming languages. C++ and Java belong to this family. There
are many vendors that supply these language compilers to organizations for developing
programs.
The Pascal programming language was developed by Professor Niklaus Wirth, who
released it in 1970. It was named after Blaise Pascal, the renowned French scientist. The
language had the influence of the ALGOL language on it. It was a parallel development to
C, and both had similar objectives and had some similarities in their syntax, too, especially
in the statement blocks and the statement terminator character. Pascal was widely used in
scientific problem-solving applications and to a lesser extent in business applications.
SQL (Structured Query Language) was developed by the developers of DBMSs as a
programming language to program databases. It has been standardized, and a common
set of SQL statements is implemented by all the DBMS suppliers.
Ada was developed for the U.S. Department of Defense, and it was widely used in
programming weapon systems. It was named in honor of Ada Lovelace (born Augusta Ada
Byron), who is credited as the first computer programmer for the work she did along
with Charles Babbage. It is still a preferred programming language for weapon systems
today.
These languages are generally referred to as 3GLs (3rd Generation Languages). They
were also referred to as procedural languages, as they focused on the procedure for solv-
ing the problem and delivering the solution rather than on managing the CPU and the
RAM. While FORTRAN, COBOL, Pascal, C, and BASIC dominated the programming sce-
nario until the 1990s, there were many other 3GLs that were used but to a lesser extent.
These are:
1. They focused on the procedure of solving the problem and their compiler handled
the hardware manipulation.
2. They used the services of the OS to handle the hardware.
3. They were meant for processing bulk data that was entered offline and was made
ready for processing after eliminating data-entry errors.
4. They were used in batch processing, that is, the data is transformed from the initial
stage to the output after being processed sequentially by a number of programs.
5. They mostly produced paper outputs.
6. They used punched cards and mag tapes most of the time, but also used disk
drives toward the end of their era.
Toward the middle of the 1980s, the demand for end-user computing began rising, and
facilities were built into the 3GLs to facilitate online data entry and on-screen outputs.
Punched cards also faded away by that time as large-capacity disk drives made inroads
into computing. By the end of the 1980s, end-user computing became the norm, and offline
data entry gradually gave way to online data entry. Even so, 3GLs, especially COBOL, are
still being used today with offline data entry on mainframe computers with batch
processing wherever the applications involve bulk data processing. Cases like the
processing of traffic tickets issued by the police, tax collections, and collections by
field agents in sectors like insurance are still being handled in this manner. But, let
me hasten to clarify, even these applications
are dwindling with the advent of handheld computers in the form of smartphones, which
enable all of these users to upload data on the fly from the field to the servers in the
data center. While batch processing and the use of 3GLs may continue for some more time
in back-end processing, offline data entry seems to have just a short lifeline into the
future.
1. Thus far, the programs had to be recompiled to execute them on a different brand
of computer. They were portable at “source code” level. With IBM PCs and their
clones, they could be executed on a different brand of computer without recompil-
ing, as long as it used MS-DOS OS.
2. There was a large volume of PCs on the market that used MS-DOS OS in the
organizations.
3. Microsoft developed the SDK (Software Development Kit) to develop software
applications for MS-DOS OS at an affordable price. Almost all popular program-
ming languages were made available on PC. Due to the large volume of PCs, the
prices of popular compilers literally crashed and made them affordable, even by
home-based freelance software developers!
4. There was already a large pool of experienced software developers available in the
market, and universities were offering courses on software development.
5. The IBM PC was a powerful (in those ancient days of just 35 years ago!) computer
with a 16-bit processor and 128 KB RAM expandable to 1 MB!
This encouraged software developers to develop and market general purpose software
as COTS (Commercial Off The Shelf) products. The programs WordStar, Lotus 1-2-3, and
dBase II became very popular, and they further propelled PC sales because they gave
some useful applications to organizations for practical data processing.
At around the same time, the networking of computers was coming of age, wired net-
works became practical, and, by combining networks and PCs, it became possible for
organizations to implement organization-wide data processing with end-user comput-
ing facilities. Networked PCs became a viable alternative to costly mainframe computers.
End-user computing needed data-entry screens that were user-friendly and aesthetically
appealing.
The introduction of PCs gave a fillip to graphics development, and manufacturers
developed and provided excellent graphics facilities for PCs at low cost. While Apple
pioneered this graphics effort, Microsoft also caught on. Apple brought the GUI
(Graphical User Interface) to the mass market with the Macintosh in 1984. That can be
referred to as the harbinger of the 4th Generation Languages (4GLs).
To be fair, we need to give credit to Borland and Philippe Kahn for introducing the first
4GL. Turbo Pascal was the first programming language, in my humble opinion, that
introduced the concepts of a 4GL. It was developed at Borland, which was co-founded and
led by Philippe Kahn. It had the following revolutionary features:
Thus, Turbo Pascal offered the first version of what we now call an IDE (Integrated
Development Environment). Thus began the advent of 4GLs.
The introduction of the GUI (Graphical User Interface) revolutionized the programming
scenario. Different controls were developed for receiving data for different purposes, like
text boxes, combo boxes, buttons, and so on. Each control had various events like click,
double click, hover, change, and so on, each of which could be programmed. The key fea-
ture of the 4GL is the usage of the mouse for moving the cursor on the screen as well as for
selection. While Turbo Pascal was the first 4GL, there are many others, like:
1. Power Builder
2. Visual Basic
3. C++
4. Visual C++
5. Python
6. Ruby on Rails
7. PL/SQL
8. Informix-4GL
9. Oracle Forms
There are a number of other such languages that can be used to program the GUI applica-
tions. The 4GLs have these characteristics:
While 4GLs are the result of a paradigm shift from CUI (Character User Interface) to GUI
(Graphical User Interface), as well as to networked, distributed, and end-user computing,
no such paradigm shift is visible on the horizon to shift gears in programming languages
from 4GLs to 5GLs. But I am sure the shift will come sooner rather than later.
One paradigm shift I am visualizing right now is the possible shift from PCs and laptops
as user terminals to smartphones and tablet computers that connect to the Internet
and to servers at a remote location. Right now, we are developing our programs on PCs and
laptops using simulator software. In my humble opinion, we will stop using them and
shift to tablets and then to smaller-sized smartphones progressively in the next two- to
five-year time frame. That will be the time when we will usher in
the era of 5GLs in computer programming. Will it be called computer programming, or
phone programming, or smart devices programming? Only time will tell!
20
Programming Standards
Introduction to Standards
Why do we need standards for writing computer programs? As it is, writing computer
programs involves creativity and mental work. If we place restrictions on the way we write
programs, the life of a computer programmer becomes untenable! Besides, programming
is a creative activity, and restrictions are the surest weapon to kill creativity!
So go the arguments put forward by the people opposing standards and guidelines for
writing programs. The percentage of programmers opposing programming standards is
significant, and it can be even more than the percentage of those that support the
standards!
This opposition comes from a lack of understanding of the situation and the percent-
age of those opposing standards dwindles as they put in more years in programming
and software maintenance. As programmers begin the software maintenance work, they
begin appreciating the need for programming standards. It is commonly accepted that
the time taken for the development of software is insignificant compared with the time
it spends in maintenance. Do you remember the Y2K problem? Initially, when programs
were developed in the 1950s to the 1980s, they used only two digits to denote the year
in date fields and “19” was assumed to be the century. This was done to conserve space
in RAM as well as on the tape and the disk. As the year 2000 approached, problems
were foreseen, and huge amounts of money were spent just to rectify the two-digit year
problem in the programs! Even if we consider only the software developed in the 1970s,
it has spent more than 30 years in maintenance, and most of those programs are still in
use even today and are being maintained! I am sure you can see the significance of
software maintenance and the necessity to make it easier.
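To see why the two-digit year caused trouble, consider this small Python sketch. It is only an illustration of the problem, not code from any real system of that era; the function names and the sample birth year are hypothetical.

```python
def expand_two_digit_year(two_digit_year):
    # Legacy assumption: the century is always "19".
    return 1900 + two_digit_year


def age_in_years(birth_year_2d, current_year_2d):
    # Both years are stored with only their last two digits.
    return expand_two_digit_year(current_year_2d) - expand_two_digit_year(birth_year_2d)


# A person born in 1965, evaluated in 1999 and then in 2000 (stored as 00).
age_1999 = age_in_years(65, 99)
age_2000 = age_in_years(65, 0)
print(age_1999, age_2000)
```

The first result is a sensible age, while the second is negative because the year 2000 is read as 1900. Every program that compared or subtracted such dates had to be found and corrected.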
Let us now discuss the aspects of standards.
I am not elaborating on these aspects, as this is not a book on standards and standardization.
Standards usually contain the following aspects:
Programming Standards
What do we want from developing and implementing programming standards? It is com-
mon that programmers leave the project or the organization or both in the middle of the
project due to a variety of personal and official reasons. When operating in such condi-
tions, it is imperative that we establish a set of simple coding guidelines for each of the
programming languages so that the next programmer can continue the code where the
first programmer left off. Without a set of coding guidelines, we would have to throw
away the code written by a programmer who left the project. Here are the objectives of
programming standards:
5. Code that is written adhering to coding guidelines can be used for training new
programmers in writing maintainable and reusable code.
6. Coding guidelines ensure a minimum set of quality aspects, including defect
prevention and efficiency of execution, in the code.
1. Naming conventions
2. Formatting of the code
3. Inline documentation
Naming Conventions
In any computer program, we have two types of words:
1. Keywords provided by the programming language that tell the computer what
to do.
2. The variable names that provide the data to be processed.
While the keywords are easy to understand, the variable names are difficult for others
to understand. To know what a variable name denotes, we need to spend some time
deciphering it. To remove ambiguity and bring uniformity and clarity to variable names,
we use naming conventions that define the way in which variables are named by the
programmers. Naming conventions enable the person reading the code to distinguish
quickly between program variables, table fields, file fields, constants, flags, counters,
file names, and so on, and to implement necessary enhancements or fix defects.
Presently, most modern programming languages permit long variable names so that
variables can be named meaningfully to reflect their function. However, long
variable names increase the statement length and reduce programmer productivity, as it
takes more time to type longer names than shorter names. We need to strike a balance
between meaningfulness and brevity. The guideline is that the name must not be shorter
than 5 characters or longer than 25 characters.
We use three-character prefixes to denote the type and origin of a variable. It is
suggested that the name be preceded by two or three such prefixes. The prefixes are
separated from each other and from the name by an underscore character. In case an
underscore character is not permitted by the programming language, the first character
of each name segment shall be a capital letter.
XXX_XXX_XXX_XXXXXXXXXXXXXXXXXXXXXXXXXX
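A convention like this can be checked mechanically. The following Python sketch is one possible checker, written under stated assumptions: prefixes are taken to be exactly three uppercase letters, and the 5-to-25-character guideline is applied to the descriptive part of the name that follows the prefixes. The sample prefixes PGM and NUM are hypothetical and are not taken from the tables in this chapter.

```python
import re

MIN_NAME_LENGTH = 5
MAX_NAME_LENGTH = 25

# Assumption: a prefix is exactly three uppercase letters.
PREFIX_PATTERN = re.compile(r"^[A-Z]{3}$")


def check_variable_name(name):
    if not name.isidentifier():
        return False
    parts = name.split("_")
    # Accept the name if it reads correctly with two or with three prefixes.
    for prefix_count in (2, 3):
        prefixes = parts[:prefix_count]
        descriptive_part = "_".join(parts[prefix_count:])
        if (
            len(prefixes) == prefix_count
            and all(PREFIX_PATTERN.match(prefix) for prefix in prefixes)
            and MIN_NAME_LENGTH <= len(descriptive_part) <= MAX_NAME_LENGTH
        ):
            return True
    return False


for candidate in ("PGM_NUM_EMP_SALARY", "PGM_NUM_QTY", "Salary"):
    print(candidate, check_variable_name(candidate))
```

A checker of this kind could be run over the source files before a code review, so that reviewers spend their time on the logic rather than on the names.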
TABLE 20.1
Suggested Sample of Prefixes
TABLE 20.2
Suggested Sample of Abbreviation of Names
Abbreviation    Expansion
CUS             Customer
EMP             Employee
ID              Identification
LOC             Location
MAT             Material
PRJ             Project
PWD             Password
SAL             Salary
WST             Workstation
QTY             Quantity
AMT             Amount
TABLE 20.3
Sample Variable Names
Variable Name Explanation
This would help us in correctly identifying the program logic, the order of execution, and
the control flow of the program.
Limiting the Length of the Line Such That It Becomes Easily Readable
Modern programming languages permit longer line lengths, up to 255 characters per line.
Similarly, modern screens also permit longer lines. The length of a programming line
must be limited to the length of the line permitted by the screen. There should not be any
need for horizontal scrolling to read the program. If lines longer than the screen width
are required, they may be broken into multiple lines using the statement-continuation
convention permitted by the programming language. However, the continuation lines must
be treated as the subordinate statements described in the earlier section, and their left
margin ought to be offset by one tab-character length.
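As an illustration, here is how a long statement might be broken in Python, where a statement enclosed in parentheses may continue on the following lines. The variable names and values are made up for this sketch; other languages have their own continuation conventions.

```python
unit_price = 12.50
quantity_ordered = 40
discount_rate = 0.05
freight_charges = 18.00

# The statement continues inside the parentheses; each continuation line
# is offset by one indentation level from the line that starts the statement.
invoice_amount = (
    unit_price * quantity_ordered * (1 - discount_rate)
    + freight_charges
)
print(invoice_amount)
```

The offset makes it obvious at a glance that the second and third lines belong to the assignment and are not statements in their own right.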
1. Each program shall have a header. This header will contain the following:
a. The name of the program.
b. The organization name that developed this program.
c. The date of beginning the initial coding of this program.
d. The functionality achieved by the program, in brief.
e. References of calling programs and the programs called from this program, if any.
f. The revision history of the program containing the following information for
each modification:
i. The date of modification.
ii. The name of the programmer who made the modification.
iii. Description of the modification.
2. Each control statement would have an explanation of the purpose of the control
statement and the expected results.
3. Each of the loops, especially those that are used for reading all the records from
tables, would have an explanation at the beginning and at the end of the loop.
4. Each subroutine/subprogram would have an explanation of its purpose, the
parameters it expects to receive, and the values it returns, if any.
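The inline documentation described above might look like the following Python sketch. Every detail in the header, including the program name, organization, dates, and programmer, is hypothetical and is shown only to illustrate the layout.

```python
# Program name  : amount_totals.py            (hypothetical)
# Organization  : Example Software Pvt. Ltd.  (hypothetical)
# Coding began  : 2017-03-01                  (hypothetical)
# Functionality : Totals the amounts read from a list of records.
# Calls         : none
# Called by     : none
# Revision history:
#   Date        Programmer   Description
#   2017-03-15  A. Example   Skipped records that carry no amount.


def total_amount(records):
    """Return the sum of the "amount" field over all the records.

    Parameters: records - a list of dictionaries, each optionally
                holding a numeric "amount".
    Returns:    the total as a number; 0 for an empty list.
    """
    total = 0
    # Loop start: visit every record once, from the first to the last.
    for record in records:
        # Control statement: records without an amount are skipped so
        # that they do not raise an error; the total stays unchanged.
        if "amount" in record:
            total += record["amount"]
        else:
            continue
    # Loop end: every record has been visited.
    return total


print(total_amount([{"amount": 5}, {}, {"amount": 7}]))
```

Note that each comment sits on its own line before the statement it explains, which anticipates the commenting style discussed next.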
Commenting Style
1. As much as possible, comment and code should not be mixed in the same line. The
comment should precede the concerned statement.
2. Keep the length of the commenting line to the length permitted by the screen.
There should be no necessity to scroll the screen horizontally for reading the
comments.
3. As much as possible, do not spread the comment across multiple lines. Each com-
menting line should be self-contained. If more than one comment line is required,
the “#” character shall be used at the end of the previous line to indicate that the
comment is continued on to the next line. Here is a two-line commenting state-
ment for example:
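A two-line comment following this rule might look like the sketch below. It assumes a language such as Python, in which “#” also begins a comment; the function and its names are hypothetical. The trailing “#” on the first comment line signals that the comment continues on the next line.

```python
# Compute the reorder quantity from the stock on hand and the #
# minimum stock level configured for the material.
def reorder_quantity(stock_on_hand, minimum_stock):
    # Nothing needs to be ordered when the stock is above the minimum.
    return max(minimum_stock - stock_on_hand, 0)


print(reorder_quantity(3, 10))
```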
Declaration Statements
We use these statements to declare variables and other objects as necessary. Each
declared variable takes up space in the RAM while under execution, so we should
declare variables carefully to conserve RAM. The following are the guidelines for this
type of statement.
Control Statements
Control statements are one source of defect injection. When we use control structures
without diligent care, the program execution may not go through the path assumed by us
and lead to erroneous results or failures. The following guidelines help in ensuring that
control structures are properly coded to prevent defects.
1. Using the right control structure would go a long way in preventing the defects.
Here are some guidelines for selecting the right control structure.
a. Use “case” structure when multiple courses of action are available based on
the result of one condition.
b. Use an “if” control statement when a set of statements have to be executed only
once, depending on one or more conditions (logical expressions).
c. Use a “for” loop when the maximum number of iterations for the loop is finite
and known beforehand.
d. Use a “while” loop when the maximum number of iterations for the loop is not
known beforehand and is dependent on a condition. There are two kinds of
“while” loops, with one of them checking the condition at the start of the loop
and the other at the end of the loop. It is preferable to use the loop that checks
the condition at the start of the loop.
e. Avoid using the “goto” structure as much as possible. The reasons are:
i. It leads to free-fall execution of the program, and it is difficult to predict the
course of execution.
ii. If it results in closing the program, we may not be able to control the
cleanup activities before smoothly closing the program.
f. Use of “goto” structure is permitted only in the case of error-trapping
statements.
g. Ensure that a program has only one entry and one exit. It is preferable that the
same code segment has the entry point and also the exit point. The program
execution control is exercised from this segment to other segments and finally
exits from this segment. Even the error-trapping statements need to pass the
execution control to this exit point when necessary or if closing the program is
the only option available.
2. When using “if” statements, these precautions are necessary:
a. Always code the “else” part of the statement. While we may not be able to see
the possibility, the conditions in the field are always beyond our comprehen-
sion and the unthinkable always happens. Therefore, coding the “else” part of
the “if” statement helps in the prevention of defects.
b. It may become necessary to nest several “if” statements in our programs, that
is, inserting another “if” statement within an “if” statement. In such cases,
limit the level of subordinate “if” statements to a maximum of three levels.
This means one main “if” statement and two subordinate “if” statements,
totaling a maximum of three “if” statements together in one nest. If it becomes
necessary to have more levels, use “case” structure, break up the program, or
take another look at the program design.
3. When using a “while” loop, ensure that the condition can eventually
become true (or false, as the case may be) so that there is an exit point for the
loop. It is very easy to enter into an infinite loop by using the “while” structure.
This loop is commonly used to read the records from a file or table until
the EOF (End of File) condition is reached, and programmers often forget to move the
record pointer forward in each iteration, leaving the loop processing the same
record forever.
4. When nesting “while” loops, again, limit the nesting to a maximum of
three levels. It is preferable to call a subroutine for each nested “while” loop
rather than code all the statements together, as this loop normally contains
many more statements than other loops.
5. When using the “case” structure, always code the “default” option (that is, the
branch taken when none of the values listed in the cases matches). This prevents
free-falling of the program execution.
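The precautions above can be illustrated with a minimal Python sketch. The function names, the order-size thresholds, and the command letters are hypothetical and chosen only for illustration. Python versions before 3.10 have no "case" statement, so a dictionary lookup with a default value stands in for it here.

```python
def classify_order(quantity):
    """Return a size class, always coding the final "else" part."""
    if quantity <= 0:
        return "invalid"
    elif quantity < 10:
        return "small"
    elif quantity < 100:
        return "medium"
    else:
        # Every value not handled above lands here instead of falling through.
        return "bulk"


def describe_command(command):
    """Emulate a "case" structure; the second argument of get() is the default."""
    actions = {"A": "add record", "D": "delete record", "U": "update record"}
    return actions.get(command, "unknown command")


def process_records(records):
    """Walk through records with a "while" loop that checks its condition first."""
    processed = []
    position = 0
    while position < len(records):       # condition checked at the start
        processed.append(records[position].strip().upper())
        position += 1                    # move the record pointer forward
    return processed
```

Note that the flat "elif" chain keeps the nesting at one level, which is one way of staying within the three-level limit suggested above.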
Loops
Loops are often a source of defects. We often come across three possible defects in
loops:
1. Absence of an exit condition inside the loop: We forget to include a statement inside the
loop that makes the condition true so that the loop can be exited. This often hap-
pens when we try to access a resource that is locked by another program, switched
off, or faulty. In such cases, it is a common practice to use a timer or some such
mechanism to time out the checking and exit the loop. But often, programmers
forget this statement, and the loop becomes infinite. They do that because the
device is expected to be there waiting for the program, but sometimes the device
will not respond due to various reasons. We need to make certain that we have
included a loop exit statement inside a loop, especially when we are looking for
a device.
2. Not incrementing the counter: In finite loops based on counting, we forget to
increment the counter inside the loop. In a For…Next loop, the counter is in the
statement and is incremented automatically, but in While... loops, we need to
declare a counter and then explicitly increment/decrement the counter using
a program statement. Sometimes we forget this statement and it becomes an
infinite loop. We must ensure that a counter-incrementing statement is included
inside the loop.
3. Reading an empty table or file: As most programs use data from flat files or
database tables, we use loops to read the records and process them one by one.
Usually we use the While… loop to perform this function. We expect that there
are some records in the file or table, but sometimes, especially during the first
use, the table or the file is empty, the read statement fails, and a fault results.
Therefore, every time we open a file or table for reading, we need to include a
statement that checks the table or file for the EOF (End of File) condition, along
with other statements that tell the computer what to do in case the file or table
is empty. This prevents a fault arising from an empty table or file.
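The three loop defects can be guarded against as in the following minimal Python sketch. The function names, the timeout values, and the idea of passing the device check as a callable are assumptions made for illustration only.

```python
import time


def wait_for_device(is_ready, timeout_seconds=2.0, poll_seconds=0.01):
    """Poll a device until it is ready, giving up when the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while not is_ready():
        if time.monotonic() >= deadline:
            return False                 # loop exit statement: time out
        time.sleep(poll_seconds)
    return True


def sum_first(values, count):
    """Add up the first `count` values using an explicit counter."""
    total = 0
    index = 0
    while index < count and index < len(values):
        total += values[index]
        index += 1                       # without this line the loop never ends
    return total


def read_records(lines):
    """Process records one by one, handling the empty-file case first."""
    if not lines:                        # nothing to read: EOF at the very start
        return []
    return [line.rstrip("\n") for line in lines]
```

A call such as `wait_for_device(printer_is_online)` would then return `False` instead of hanging forever when the (hypothetical) printer never responds.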
exit by making its loop condition to true. But I agree that nesting of loops often becomes
essential. We should be very careful to see:
Computational Statements
Computational statements, just to remind you, are used to resolve mathematical formulas.
Computational statements, especially the long ones, are likely to inject defects into the
program execution. One reason is that the order in which the computer processes the
arithmetic operators is difficult to perceive. Another is that the results of the
computation are difficult to predict. The following guidelines help in preventing defects
arising out of improper coding of computational statements:
b. Carry out the operation using whole numbers and convert them to decimals
by dividing by 100 (or 10 or 1000 or 10,000 etc.) for presentation or storage pur-
poses as much as possible.
6. Duplication of routines is another common cause for injecting errors. When the
same operations are to be performed in multiple programs, some programmers
duplicate the routine. It is always better to keep one routine and use it in all places
by passing appropriate parameters to it. This protects the integrity of processing
and prevents defects from creeping in.
7. As much as possible, do not use table/data file fields in computational statements,
especially to receive results of computation. Always copy the value of data file/
table field into a variable and then use it in computations. Similarly, receive the
value of computation into a variable and then move it to data file/table file field
just before writing it.
8. When rounding off the value, code the statement on a separate line just for round-
ing off the variable. Do not use the rounding function in combination with a com-
putational statement.
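Several of these guidelines can be seen together in a small Python sketch. It assumes that money is held as whole cents, that all values are non-negative integers, and that a record is a plain dictionary; the field names and the round-half-up rule are illustrative assumptions.

```python
def rounded_percent(amount_cents, rate_percent):
    """Return rate_percent of amount_cents, rounded half up to a whole cent."""
    raw_hundredths = amount_cents * rate_percent    # computation on its own line
    rounded_cents = (raw_hundredths + 50) // 100    # rounding on a separate line
    return rounded_cents


def price_record(record, tax_rate_percent):
    """Compute the total for a record without computing on its fields directly."""
    # Copy the field values into variables before using them.
    unit_price_cents = record["unit_price_cents"]
    quantity = record["quantity"]

    subtotal_cents = unit_price_cents * quantity
    tax_cents = rounded_percent(subtotal_cents, tax_rate_percent)
    total_cents = subtotal_cents + tax_cents

    # Move the result into the record field only at the end, just before writing.
    record["total_cents"] = total_cents
    return record


def format_cents(cents):
    """Convert whole cents to a decimal string for presentation only."""
    return f"{cents // 100}.{cents % 100:02d}"
```

Because `rounded_percent` is a single routine, every program that needs a rounded percentage can call it with its own parameters rather than duplicating the arithmetic.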
Efficiency Guidelines
Efficiency guidelines help us in ensuring the efficiency of execution as well as using the
resources of the computer economically, especially the RAM. The following guidelines
would help:
1. Do not declare any variable or constant without any purpose. It is common prac-
tice among programmers to declare a number of variables with a view that they
may be necessary in the program. Avoid the temptation to declare too many vari-
ables; even though the stringency on resource usage is now a thing of the past,
occupying too much RAM is likely to slow down the computer.
2. As much as possible, declare variables as local to the program and use param-
eters to pass values to other programs or subprograms. When we declare variables
as local variables, their RAM would be released on exit from the routine. If we
declare variables as global variables, they would hold on to the RAM until we stop
execution of the entire set of programs.
3. Open the files (or database tables) only when required, that is, just before the file
operation statements begin, and close them as soon as the file operation state-
ments end. Opening a file or database table occupies a chunk of memory and it
takes CPU time, too, to keep checking the file status/condition. It may also prevent
other concurrent users from accessing the files/tables.
4. Keep a limit on the number of objects that can be kept open concurrently as they
take up large chunks of RAM, which slows down the program execution.
5. Also, do not pack too many controls onto one screen. Instead, divide the
screen into multiple screens/tabs. This would reduce the burden on the RAM
usage.
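Guidelines 2 and 3 can be sketched in Python as follows. The function names and the idea of a one-entry-per-line text file are hypothetical. The `with` statement opens the file just before the file operations and closes it as soon as they end, and all working variables are local to the routine.

```python
def append_entry(path, entry):
    """Open the file just before writing and close it right after."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(entry + "\n")
    # The file is already closed here, so other users can access it.


def last_entry(path):
    """Return the last entry in the file, or None when the file is empty."""
    last = None                          # local variable, released on return
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            last = line.rstrip("\n")
    return last
```

Values travel in and out through parameters and return values, so nothing needs to be declared globally.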
Effectiveness Guidelines
These guidelines help us in using the software effectively. The following guidelines help
us in doing so.
These guidelines are prepared as a starting point for you to develop your own coding
guidelines that are best suited for your organization. You may use these guidelines as they
are now, or add to them, modify them, or remove some of them at your free will. What I
suggest is that you have guidelines for code consistency, defect prevention, and efficiency
and effectiveness aspects.
21
Personal Software Process
Introduction
Every employee wants to work in a methodical manner without having to go back and
forth making mistakes and correcting them. The employee also wants to produce deliver-
ables that have zero defects so that no other person can point out a mistake in the deliver-
able. But the reality is different. We commit mistakes, we do go back and forth making and
correcting our mistakes. Others, especially the quality control persons, do point out our
mistakes. To err is human, and we are human beings and commit mistakes even against
our will. We do not have all the time in the world to do the best possible job. We do have
deadlines to meet and deliveries to make. There are other pressures at the workplace, like
competing for awards.
There are three aspects to working:
We need to meet all the earlier three aspects while working in organizations. Then there
are three levels of acceptable performance:
In organizations, most employees are in the first two levels, namely, the penalty-avoidance
level and the normal performance level. Only a few perform at the award level. There are
various reasons for performance being low. These could include a lack of personal ambition,
a lack of self-motivation, and a lack of knowledge about how to improve performance.
My experience shows me that:
Every individual has limitations, both physiological and psychological, on their capacity
to do work. These are genetic factors that the individual has since birth. It is extremely dif-
ficult to correct these congenital defects, but the individual with grit and commitment can
overcome these. Lack of proper training can be alleviated with training, either on the job
or in a classroom. This is very easy to correct and bridge the gap. What usually happens
is that all individuals perform equally at the time they join the workforce, but some gal-
lop ahead while others lag behind. This happens because those that galloped ahead knew
how to improve their performance either by structured instruction, parental guidance, or
by sheer intuition. For those of you that do not know how to improve your performance
and also how to highlight that improved performance, I am discussing the path in this
chapter.
There are two aspects to any work, and they are productivity and quality. Productivity
is the rate of achievement, and quality is the presence or absence of defects. Productivity is
measured in the amount of work performed per hour or per day for individuals. Quality is
measured as the number of defects per million opportunities. For example, if you wrote a
program with, let us assume, 1000 lines of code and committed 3 errors, then your quality
is 3 defects per thousand lines of code, or 0.3 defects per 100 lines of code, or 3000 defects
per million lines of code! The world of programming strives to achieve the level of 3
defects per million lines of code. Of course, the target of 3 defects per million lines of
code is for the delivered lines of code that were subjected to rigorous quality-control
activities. This measure is also called the defect density.
So, basically, your programming work is measured by the number of lines of code devel-
oped per day and the defect density in those lines of code. Therefore, to improve your per-
formance, you need to develop a higher number of lines of code (increasing productivity)
and reduce the number of defects in those lines of code (decrease the defect density). How
do you do this? Let us discuss.
Productivity
We receive a salary as compensation for carrying out the work assigned to us in the orga-
nization. The value of our work should not only earn the salary paid to us but also some
more money to cover the overhead and result in some profit for the organization to be
passed on to the entrepreneurs who invested in our company. Initially, we may not be able
to fulfill this obligation, but as time passes, we need to improve our expertise as well as our
productivity and fulfill this obligation. To be able to achieve this target, we need to track
our productivity on a regular basis and see the trend in our productivity.
In order to derive our productivity, we need two items of data, namely, the size of the
programs we developed, and the time taken for developing them. Then, productivity can
be derived using the formula:
4. Many languages allow for the declaration of multiple variables, including their
initialization, in one line itself. We can also declare just one variable in one line
and initialize it in another line. Should we count each variable as two lines if we
declare and initialize multiple variables in one line?
1. I suggest that the commenting lines need to be counted as the LOC because they
have a purpose, even if they are not processed by the computer. They assist the
quality-control persons in understanding the program, and also the programmers
in the future for the easy maintenance of code. You also spend time thinking and
forming brief sentences to convey the meaning accurately and precisely. They con-
sume your time and they are mandatory in any professional organization.
2. Again, I suggest that you take the physical line as one LOC. If you take a state-
ment, it may contain blocks of statements within it. For example, in loops, subloops
can exist. Similarly, in control statements, other control statements and even loops
may be there. So, it would be accurate to take a physical line as one LOC. The situ-
ation is the same for all programmers.
3. In the case of short statements and long statements, I would suggest showing no
difference for counting the LOC. If we begin assigning weights to lines based on
length, the data collection would become tedious and time-consuming. The
counting of the LOC should not be so rigorous that it becomes more time-consuming
than writing the lines themselves! On the whole, the short lines and
the long lines would average out. The other aspect is that all the lines in a program
would never be of the same length.
4. Yes, we can declare one variable per line or multiple variables in the same line.
We can also initialize the variable during declaration in the same line or on a
separate line. The choice is ours. But in this aspect, the organizational coding
standards would specify how to declare and initialize variables. If they specify
one variable per line we code accordingly, and if they specify multiple variables
per line, we code in that manner. Since our coding practices are in adherence to
the organizational coding standards, we count just the physical lines irrespec-
tive of the fact that multiple variables are declared and initialized in the same,
single line.
I am sure that a professional programmer would never include blank or unnecessary state-
ments in the program just to boost their productivity. Also, a professional organization
would certainly subject the code to peer review, and if there are any unnecessary lines in
the program, they would be pointed out and corrected. There is an advantage in counting
physical lines because you can easily develop a small utility to count physical lines of code.
All you need to count is the number of line-ending characters (CR, LF, or both, depending on the platform). Better still, most IDEs give the line
count and we do not have to develop a special utility for that purpose.
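The small counting utility mentioned above could look like the following Python sketch. The function names are hypothetical. `str.splitlines()` is used so that the count does not depend on the line-ending convention of the file, and blank lines are counted as physical lines.

```python
def count_physical_lines(text):
    """Count physical lines, whatever line-ending convention the text uses."""
    return len(text.splitlines())


def count_file_lines(path):
    """Count the physical lines of a source file stored at `path`."""
    # newline="" keeps the original line endings instead of translating them.
    with open(path, encoding="utf-8", newline="") as handle:
        return count_physical_lines(handle.read())
```

For example, `count_physical_lines("a = 1\r\nb = 2\r\n")` counts two lines, the same as for text that uses plain LF endings.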
When we come to time spent on developing the programs, the questions arise as to:
1. Should we include the time spent in reading the specifications or the design docu-
ment in the time taken for developing the program?
2. Sometimes, we just need to take some time thinking about the problem or how to
achieve the functionality. Should we include that time also?
3. How about the time we spent in answering queries from the project leader or
bosses about the progress of the work?
4. How about the time we spend on personal needs like drinking water at the water
fountain or drinking coffee?
5. What about the time that is wasted on various workplace disturbances?
I suggest that you include all those times because you are getting paid for those times, too.
I would say that the only times to be excluded are those wasted as a result
of organization-wide disturbances when no one was able to work. Apart from that, include
all the time you spent from the moment you begin working on the program until you release
it to the quality-control activities.
Now, productivity needs to be computed at two stages. The first one is the initial cod-
ing. We need to take the time and the size when we completed the coding and submitted
it to the quality-control activities. This is referred to as the initial coding productivity. The
quality-control activities uncover defects, and we may insert some more code in fixing
those defects, so we need to compute the second productivity metric after the completion
of the quality-control activities. This is referred to as the final productivity metric in the
industry.
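As a minimal sketch, assuming productivity is the size in lines of code (LOC) divided by the effort in person hours (PH), the two productivity figures could be computed in Python as follows. The function name and the sample figures are illustrative assumptions.

```python
def productivity(loc, person_hours):
    """Lines of code produced per person hour (assumed definition)."""
    if person_hours <= 0:
        raise ValueError("person_hours must be positive")
    return loc / person_hours


# Initial coding: 400 LOC written in 32 person hours.
initial_productivity = productivity(400, 32)

# After fixing the defects found by quality control: 430 LOC in 40 person hours.
final_productivity = productivity(430, 40)
```

With these sample figures the initial productivity is 12.5 LOC per person hour and the final productivity is 10.75, which shows how the defect-fixing effort lowers the final figure.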
Quality
Quality is not easily amenable to measurement. A customer expects no defects in the
deliverable, but experience shows that zero-defect delivery is a goal rather than a reality.
So, quality is measured as the delivered defects per unit of delivered product or
service. Now, the final delivery is effected only after the deliverable is subjected to
quality-control activities and the defects uncovered during those activities are rectified.
Even then, some defects still linger inside the deliverable. Quality assurance of the
deliverables is a large subject in itself and is out of scope for this book. For us, the computer
programmers, quality is important and is our responsibility because the quality-control
activities only “uncover” the defects but never correct them! We have to build the deliver-
able without defects in the first place and then rectify the defects when pointed out by the
quality-control activities. Therefore, we need to measure our quality and then continu-
ously improve it.
For our purposes, let us define the quality of the deliverable as the number of defects
uncovered in our work per 100 lines of code written by us. This measure is
referred to as the “defect density” in the industry. The formula for computing the defect
density is:
Number of defects: It is the number of defects that are uncovered after you submitted the
program to quality control for peer review and testing. The defects might be uncovered
in review, unit testing, integration testing, system testing, acceptance testing, and any
other testing that our deliverable was subjected to. We would certainly get the data for the
quality-control activities that are immediately conducted on our deliverable, but we may
not get the data for later testing activities. We ought to collect as much data as is available
and compute our defect density.
Just as in productivity, we need to compute the initial quality metric when the first set of
quality-control activities on our program are completed and the final quality metric when
all the quality-control activities are completed.
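Following the definition given above (defects uncovered per 100 lines of code), the defect density can be sketched in Python like this. The function name and the `per` parameter are illustrative assumptions.

```python
def defect_density(defects, loc, per=100):
    """Number of defects per `per` lines of code (100 by default)."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return defects * per / loc


# 3 defects found in a 1000-line program.
per_hundred = defect_density(3, 1000)                  # defects per 100 LOC
per_million = defect_density(3, 1000, per=1_000_000)   # defects per million LOC
```

For the example used earlier in this chapter, 3 defects in 1000 lines, this gives 0.3 defects per 100 LOC and 3000 defects per million LOC. The same function serves for the initial and the final quality metric; only the defect count passed in differs.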
Schedule
Schedule, or the date of delivery, is another important facet of our work. Whenever
work is allocated to us, invariably it is accompanied by a date by which it needs to be com-
pleted and delivered. This delivery date is assigned, and it is expected of us to complete
the work by that date and submit the artifact to the quality-control people. The quality
control takes its own time and is subject to another schedule, so we need to restrict our
schedule to delivering our code to the quality-control activities. An important aspect of
this assumption is that we would not deliver defective code intentionally, and the code
would only have the defects that are on par with the defect density standard of the orga-
nization. Now, the schedule metric is computed using the formula:
A question arises here: surely the date is the significant item, not the number of days
taken? When a supervisor or project leader allocates work, he or she usually specifies the
beginning date and the ending date. More often than not, we would be able
to begin work on the specified date, but sometimes, due to the vagaries of the organizational
environment, we may not be able to begin work on that specific artifact on the scheduled
day. If the start slips, the completion would also slip. If we
take only the date as sacrosanct, our schedule metric would be in the red most of the time.
That is why the schedule metric is always computed using the number of days spent on the
artifact.
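A minimal Python sketch of the schedule metric follows. It assumes the metric is the number of days scheduled divided by the number of days actually spent, which is consistent with the reading given later in this chapter that a value above 1 means delivery ahead of schedule and a value below 1 means late delivery. The function name and the sample figures are illustrative.

```python
def schedule_metric(scheduled_days, actual_days):
    """Scheduled days divided by actual days (assumed definition)."""
    if actual_days <= 0:
        raise ValueError("actual_days must be positive")
    return scheduled_days / actual_days


ahead = schedule_metric(10, 8)      # finished in 8 of the 10 days allowed
late = schedule_metric(10, 12.5)    # needed 12.5 days instead of 10
```

Because only the numbers of days are used, a late start caused by the organizational environment does not push the metric into the red.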
Data Collection
To be able to compute the above metrics, we need data. As we are the source of that data,
we need to collect it ourselves, and meticulously. This data will help us to
monitor our progress toward excellence. I suggest using a spreadsheet like MS-Excel and
the format shown in Table 21.1 to collect the data.
I suggest that you maintain this spreadsheet in ascending chronological order. That way,
we can easily plot a trend graph. It would be better if you began a new spreadsheet every
year so that you can compare year-on-year data and see the trend.
Figure 21.1 shows an illustrative trend graph for the initial and final productivity metrics
for assumed data. This graph is given only for illustrative purposes, and the data shown
therein should not be taken as real-life data.
TABLE 21.1
Work Register or Data Collection Format for Computing the Metrics
Column headings include: Initial Effort in PH | Final Effort in PH | Actual Start Date | Schedule Metric | Initial LOC | Final LOC
FIGURE 21.1
Trend graph for the initial and final productivity metrics. (Productivity Metric Trend Graph: Artifact 1 to Artifact 6 on the horizontal axis, values from 0 to 140 on the vertical axis, with separate Initial and Final series.)
Using similar methodology, we need to plot the graphs for defect density and schedule
metrics. We can compute periodical metrics for these three categories and then compare
the improvements in our performance. I suggest that you compute the metrics once every
quarter, as once every month is too short and we may not have completed many programs
to draw inferences based on the averages. Six months may be too long, as we lose oppor-
tunities for improvement based on factual data. You may select the periodicity based on
your environment and unique situation.
You may use these formulas for computing these metrics periodically:
The data needed for computing these metrics is available in the spreadsheet suggested in
the previous section if you maintain it regularly. You ought to take these precautions for
this process to be effective:
1. Be totally honest while collecting data. After all, you are computing these metrics
to improve yourself and your performance. Unless the data is accurate, the results
and the inferences cannot be accurate.
2. Compute these metrics regularly in the periodicity selected by you. If you are lax
in generating these metrics, you lose valuable feedback on your performance.
3. Set improvement goals realistically. Based on your capacity, set the improvement
goals. For example, there is no point in setting a 100% improvement goal for the
next period as it is not achievable. Similarly, setting a 5% improvement goal is also
counterproductive, as that can be the margin of error in the measurement.
4. Normalize the data before you compute and contrast your metrics. If you have
worked on a platform that is totally new to you, it is better to take that data out
of consideration as it would drag down your metrics. Similarly, do not consider
data that includes some known hindrance that slowed down your performance.
Perhaps there was a hold from the customer, and because of this, you were forced
to be idle for a day or two. This is not your fault.
1. Productivity metrics ought to be increasing. That is, the LOC per person hour or
person day must be increasing. If you achieved 100 LOC per person day in one
period, it must increase to 101 LOC or more during the next measurement. If it
goes down below 100 LOC in this example, then your performance is degrading!
2. Defect density must be diminishing. That is, if you had 2 defects per 100 LOC
in the first period, they must diminish to 1 or lower in the next measurement
period. If they go above 2, then your performance degraded. When we measure
defect density per each 100 LOC, it may be in fractional numbers. We should show
improvement, at least in fractions.
3. The schedule metric needs to be as close to 1 as possible. If it is more than 1, it
means that we delivered ahead of schedule. If it is less than 1, it means that we
missed the delivery date and delivered late.
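The periodic comparison can be sketched in Python as follows. This is only one possible reading: it assumes that the metrics for a period are computed from the period totals (total LOC, total person hours, total defects, total days), and the dictionary keys and function names are hypothetical.

```python
def period_metrics(entries):
    """Aggregate one period of work-register entries into the three metrics."""
    total_loc = sum(e["loc"] for e in entries)
    total_hours = sum(e["person_hours"] for e in entries)
    total_defects = sum(e["defects"] for e in entries)
    scheduled = sum(e["scheduled_days"] for e in entries)
    actual = sum(e["actual_days"] for e in entries)
    return {
        "productivity": total_loc / total_hours,            # LOC per person hour
        "defect_density": total_defects * 100 / total_loc,  # defects per 100 LOC
        "schedule_metric": scheduled / actual,
    }


def improved(previous, current):
    """Check each metric against the direction it is expected to move in."""
    return {
        "productivity": current["productivity"] > previous["productivity"],
        "defect_density": current["defect_density"] < previous["defect_density"],
        # The schedule metric should move closer to 1.
        "schedule_metric": abs(current["schedule_metric"] - 1)
        <= abs(previous["schedule_metric"] - 1),
    }
```

Calling `improved(period_metrics(first_quarter), period_metrics(second_quarter))` on two lists of register entries then shows, metric by metric, whether performance improved between the two periods.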
We perform activities other than writing software code during our working time. We also
conduct peer review on the code developed by our colleagues as well as test the code writ-
ten by our colleagues. It is better to maintain separate spreadsheets for different activities
and compute these three metrics suggested in the previous section.
Methodology
The effort needed to perform a task depends on the methodology used to perform the
task. If we code a program taking a design document as the basis, we achieve one kind of
results, and if we code a program based on the explanation given by the designer or project
leader, we get a totally different set of results. This is true especially for productivity
metrics and schedule metrics. Defect metrics are not affected by the methodology used. Then
again, developing a program using coding standards takes a different amount of effort than
coding a program without adhering to any coding standards. How do we handle
these aberrations?
I suggest that you maintain different spreadsheets for different methodologies of
working. Ideally, we ought to follow one methodology in software development, and
most professionally managed organizations do follow one documented and continu-
ously improved methodology across all the projects in the organization, but the specific
organization you work for may not be one such organization. If your organization fol-
lows a documented software development process for all projects, then you can maintain
one spreadsheet. But if your organization follows different methodologies for different
projects based on customer preference, you had better maintain different spreadsheets.
Otherwise, you are likely to get erroneous values and you really cannot understand if
you are improving or not.
Is it necessary to document your own development methodology? I would say yes, but…
It is better to document the methodology you adopt in performing your work. It need not
be an elaborate document. You simply enumerate the steps you go through in performing
your work. Here are some guidelines for different activities for you to consider, but do
develop your own methodology based on your unique situation.
Coding Methodology
Here are the guidelines for the coding process:
1. Study the design document if there is one, or spend some time in understanding
the explanation given by the project leader.
2. Think through the algorithms that need to be used and finalize them.
3. Open the IDE and set up the programming environment.
4. Load the form if made available by the graphics designers or lay out the form on
the screen.
5. Begin programming:
a. First, code the form load event.
b. Then, code the events, beginning with the control at the top left-hand corner,
move to the right, and then move downwards if the right end is reached.
c. Code the “save/update” button at the end.
6. When coding the events:
a. First, declare the variables.
b. Then, code the algorithms.
c. For each event, check if there is a piece of code available for reuse and use it
wherever possible.
d. When event coding is completed, remove unused variables and trash code, if
any.
7. Review the code to ensure that the results expected of the code would be deliv-
ered effectively and efficiently and in conformance with the organizational coding
guidelines.
8. Conduct white-box unit testing of the entire code and fix any errors uncovered.
9. Submit the code to the project leader or quality control for carrying out quality-
control activities.
Of course, you can make it more elaborate or abridge the earlier guidelines to suit your
unique situation. You can add more guidelines or remove some of the guidelines depend-
ing upon your organizational environment.
Review Methodology
Here are the guidelines for the review process:
1. First, study the design document or discuss the functionality with the author of
that program and learn its functionality.
2. Open the artifact in its IDE.
3. Scroll the entire program and ensure that the formatting of the code adheres to
organizational coding guidelines. Record the mistakes in formatting if any in the
prescribed review report format.
4. Review the form load event program and note down any errors.
5. Review the code of all controls beginning with the control on the left-hand side
top corner and move toward the right and downwards progressively.
6. Ensure that all necessary events of each control are coded.
7. While reviewing the code of all the controls including the form, ensure that the
algorithm used is appropriate for the scenario at hand. Note down suggestions for
improvement, if any.
8. While reviewing the code, ensure compliance to organizational coding guidelines
for defect prevention, efficiency, effectiveness, and the accuracy of the results.
9. While reviewing the code, ensure that no trash code (the code that should not be
there) is present.
10. While reviewing the code, ensure that no malicious code is present in the code
anywhere.
11. Prepare the review report.
12. Record the opportunities for improvement, if you noticed any, and include them
in the review report.
13. Verify the review report for accuracy of the defects pointed out.
14. Hand over the review report to the project leader or the author of the artifact.
You can improve the earlier guidelines to suit your unique environment by adding or
dropping some of the guidelines as necessary.
Testing Methodology
I suggest these guidelines for conducting your testing work:
1. Study the design document for the artifact, if there is one. Otherwise, discuss the
functionality of the artifact given to you for testing with the author of the artifact.
2. Obtain and study the testing guidelines for the type of testing to be carried out,
so that you understand what is expected of you. The type of testing could be unit
testing, integration testing, system testing, negative testing, stress testing, or
any other type of testing.
3. If the test environment is already set up, study the testing environment so that you
can conduct the testing effectively and efficiently. If the test environment is not in
existence, plan and set it up in consultation with your project leader or the author
of the artifact.
4. Obtain the test plan and test cases, if they are ready, and study how you need
to go about testing the artifact. If there is no test plan or there are no test cases,
you need to prepare them. Of course, the test plan and test cases need not be elaborate,
but they need to be comprehensive so that all aspects of the artifact can be thoroughly tested.
5. Keep the test report format, either in soft copy or hard copy, ready to record the
test results. Enter all the test cases in it, along with the expected results for each
test case.
6. Conduct the testing and record all the instances where the actual results deviate
from the expected results.
7. Once testing is completed, review the results of each test case, contrasting the
expected and actual results to decide if the test case passed or failed. A test case
passes when the actual result is the same as the expected result. If it did
not pass, mark the test case as failed in the manner required by the testing
report format.
8. Review the report and then sign it off.
9. Hand over the test report to the project leader or the author of the artifact as
required and take up a new task.
Of course, you may improve the earlier guidelines as necessary to suit your unique envi-
ronment. You may add, delete, or modify the guidelines as necessary.
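Steps 5 to 7 can be sketched in Python as a tiny test report. The report layout, the dictionary keys, and the sample function under test are all hypothetical, and a real test report would follow the format prescribed by your organization.

```python
def add(a, b):
    """A trivial function under test, used only for illustration."""
    return a + b


def run_test_cases(function, test_cases):
    """Run each test case and record the expected result, actual result, and status."""
    report = []
    for case in test_cases:
        actual = function(*case["inputs"])
        report.append({
            "name": case["name"],
            "expected": case["expected"],
            "actual": actual,
            # A test case passes when the actual result equals the expected result.
            "status": "passed" if actual == case["expected"] else "failed",
        })
    return report


cases = [
    {"name": "two positives", "inputs": (2, 3), "expected": 5},
    {"name": "wrong expectation", "inputs": (2, 2), "expected": 5},
]
report = run_test_cases(add, cases)
```

Entering the expected results before running the tests, as step 5 recommends, keeps the pass or fail decision from being influenced by what the program actually produced.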
Housekeeping
You need to do some housekeeping in order to deliver excellent results continuously.
This will help you to learn about your own performance. All of us perceive that our
performance is top notch, but until we compare our performance with that of others, we
will never know where we stand in comparison with our peers. Everybody is unique, but
when it comes to on-the-job performance, our performance needs to be on par with that
of our colleagues who have similar qualifications and experience and draw a comparable
salary. You need to perform these ancillary activities in addition to your main
software engineering activities. Here they are:
1. Recording details of work: Whenever some work is allocated to you, the first thing
you need to do is make an entry in the work register described in the previous sec-
tion of this chapter. It should not take more than five minutes of your time. Then,
when you complete the assigned work, that is, when you are about to return the
deliverable, complete the remaining entries in the work register for the completed
task. This will enable you to conduct an analysis of your performance whenever
you wish to. Do this diligently.
What is important is to adhere to a defined method in your work, then measure your
performance continuously, subject the performance measurements to analysis to draw
inferences about your performances, and then studiously improve your performance
through careful goal setting.
Index