UNIVERSITY OF MAIDUGURI
Maiduguri, Nigeria
CENTRE FOR DISTANCE LEARNING
MANAGEMENT SCIENCES
STUDY GUIDE
ACC 205: DATA PROCESSING AND PROGRAMMING
GENERAL INFORMATION
Course Code and Title: ACC 205: Data Processing & Programming
Credit Unit: 2
Year: 2015
Total Hours: 28 hours, at two hours per week of study.
For any queries or questions, contact the course lecturer using your email through the Centre for
Distance Learning Portal.
You are welcome to this Study Unit. Each Unit is arranged to simplify your study. In each topic of
the Unit we have an introduction, learning outcomes, in-text information, in-text questions and
answers, a summary and self-assessment exercises. In-text questions and answers serve as
motivation for your reading and encourage you to pay attention to the major points in the text. Tutors
will be available at the designated contact centres for tutorials. Meet them to resolve your questions
and obtain other guidance. The Centre expects you to plan your work well. Should you wish to read
further, you can supplement your study with more information from the list of references and
suggested readings available in each Study Unit.
PRACTICE EXERCISES
SELF ASSESSMENT EXERCISES (SAES)
These are provided at the end of each topic or Study Session. The exercises help you to assess
whether or not you have actually studied and understood the topic/Study Session. Solutions to the
exercises are provided at the end of the Study Unit so that you can assess yourself.
HOW TO PREPARE FOR EXAMINATION
To prepare for the examination you should read and understand the study materials provided for
you on CD-ROM, in print, or as downloads from the Portal.
Other things you need to do to prepare for the examination include working through all the sample
questions at the end of every Study Session/topic and reading the suggested/recommended texts.
ASSESSMENTS
- The continuous assessment for all courses makes up 30% of the total marks.
- The examination makes up 70% of the total marks.
- Feedback and advice are a component of the continuous assessment.
The examination shall be conducted at the Centre for Distance Learning (the Centre). Students are to
come to the Centre on the examination date with all the necessary requirements.
Introduction
Each organisation, regardless of its size or purpose, generates data to keep a record of events and
transactions that take place within the business. Generating and organising this data in a useful way
is called data processing. In this Session, we shall discuss various terms such as data, information,
data processing and data processing system.
Study Outcomes:
After going through this lesson, you will be in a position to
1.1 define the concepts of data, information and data processing;
1.2 explain various data processing activities;
1.3 describe the data processing cycle and computer processing operations; and
1.4 explain the data processing system (data elements, records, files and databases).
There is no hard and fast rule for determining when data becomes information. A set of letters and
numbers may be meaningful to one person but may have no meaning to another. Information is
identified and defined by its users.
For example, when you purchase something in a departmental store, a number of data items are put
together, such as your name, address, the articles you bought, the number of items purchased, the
prices, the tax and the amount you paid. Separately, these are all data items, but if you put them
together, they represent information about a business transaction.
Hours Worked
Emp No.   Name     Mon   Tues   Wed   Thurs   Fri   Total Hours
110       Haruna    8     8      8     8       8     40
150       Daniel    8     8      8     8       6     38
160       Rabiu     8     4      8     8       8     36
170       Fanta     8     6      8     5       8     35
Fig. 1.2 The source document for a payroll application is the weekly time sheet
a) Collection
Data originates in the form of events, transactions or some observations. This data is then recorded
in some usable form. Data may be initially recorded on paper and then converted into a
machine-usable form for processing. Alternatively, it may be recorded by a direct input device
in a paperless, machine-readable form. Data collection is also termed data capture.
b) Conversion
Once the data is collected, it is converted from its source documents to a form that is more suitable
for processing. The data is first codified by assigning identification codes. A code comprises
numbers, letters, special characters or a combination of these. For example, an employee may be
allotted the code 52-53-162 and categorized as 'A class', etc. It is useful to codify data when it
requires classification. To classify means to categorize, i.e., data with similar characteristics are
placed in similar categories or groups. For example, one may like to arrange accounts data according
to account number or date, so that a balance sheet can easily be prepared. After classification, the
data is verified or checked to ensure its accuracy before processing starts. After verification, the
data is transcribed from one data medium to another. For example, where data processing is done
using a computer, the data may be transferred from source documents to a machine-sensible form
on magnetic tape or disk.
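A minimal sketch of the conversion step is given below. It shows raw transaction records being classified (grouped) by an identification code so that a per-account balance can be prepared; the field names and figures are assumptions made for illustration and are not taken from the study text.

```python
# Sketch of the conversion step: records carrying an identification code
# (account_no) are classified into groups so a balance can be prepared.
# Field names and values are hypothetical.
from collections import defaultdict

raw_transactions = [
    {"account_no": "52-53-162", "amount": 1500.00, "date": "2015-07-01"},
    {"account_no": "52-53-101", "amount": 250.00,  "date": "2015-07-01"},
    {"account_no": "52-53-162", "amount": 300.00,  "date": "2015-07-02"},
]

# Classification: records with the same account number go into the same group.
by_account = defaultdict(list)
for record in raw_transactions:
    by_account[record["account_no"]].append(record)

for account_no, records in sorted(by_account.items()):
    total = sum(r["amount"] for r in records)
    print(account_no, "balance:", total)
```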
c) Manipulation
Once data is collected and converted, it is ready for the manipulation function which converts data
into information. Manipulation consists of the following activities:
Sorting: It involves the arrangement of data items in a desired sequence. Usually, it is easier
to work with data if it is arranged in a logical sequence. Most often, the data are arranged in
alphabetical sequence. Sometimes sorting itself will transform data into information. For
example, a simple act of sorting the names in alphabetical order gives meaning to a
telephone directory. The directory will be practically worthless without sorting.
Business data processing extensively utilises sorting techniques. Virtually all the records in
business files are maintained in some logical sequence. Numeric sorting is common in
computer-based processing systems because it is usually faster than alphabetical sorting.
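A minimal sketch of the sorting activity follows, using the time-sheet data of Fig. 1.2. The dictionary structure is an assumption made purely for illustration; it shows alphabetical sorting (as in a telephone directory) and numeric sorting by employee number.

```python
# Sketch of sorting: the same records arranged alphabetically by name and
# numerically by employee number. Data is taken from Fig. 1.2; the record
# structure is an illustrative assumption.
employees = [
    {"emp_no": 110, "name": "Haruna", "total_hours": 40},
    {"emp_no": 170, "name": "Fanta",  "total_hours": 35},
    {"emp_no": 150, "name": "Daniel", "total_hours": 38},
    {"emp_no": 160, "name": "Rabiu",  "total_hours": 36},
]

# Alphabetical sorting, as in a telephone directory.
by_name = sorted(employees, key=lambda e: e["name"])

# Numeric sorting by employee number, common in computer-based processing.
by_number = sorted(employees, key=lambda e: e["emp_no"])

print([e["name"] for e in by_name])      # ['Daniel', 'Fanta', 'Haruna', 'Rabiu']
print([e["emp_no"] for e in by_number])  # [110, 150, 160, 170]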
Calculating: Arithmetic manipulation of data is called calculating. Items of recorded data
can be added to one another, subtracted, divided or multiplied to create new data as shown
in fig. 2.2(a). Calculation is an integral part of data processing. For example, in calculating
an employee's pay, the hours worked multiplied by the hourly wage rate gives the gross pay.
Based on total earnings, income-tax deductions are computed and subtracted from the gross pay
to arrive at the net pay.
Weekly Payroll Summary Report                                      07/07/2015
Dept.   Employee Number   Name      Hours Worked   Pay Rate   Gross Wages
2       170               Mohd      34             10.00      340.00
        175               Rabiu     32              9.00      288.00
        158               Aisha     20              5.00      100.00
        160               Kaltume   36              8.00      288.00
        165               Usman     45              9.00      405.00
        159               Ahmad     25              3.00       75.00
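Below is a minimal sketch of the calculating activity: gross wages are the hours worked multiplied by the hourly rate, as in the payroll summary above. The 10% income-tax rate used to arrive at net pay is a hypothetical figure added only to illustrate the deduction step.

```python
# Sketch of calculating: gross wages = hours worked x hourly rate,
# then a deduction is subtracted to arrive at net pay.
# The 10% tax rate is an assumed figure for illustration only.
payroll = [
    ("Mohd",    34, 10.00),
    ("Rabiu",   32,  9.00),
    ("Aisha",   20,  5.00),
    ("Kaltume", 36,  8.00),
    ("Usman",   45,  9.00),
    ("Ahmad",   25,  3.00),
]

TAX_RATE = 0.10  # hypothetical flat deduction

for name, hours, rate in payroll:
    gross = hours * rate
    net = gross - gross * TAX_RATE
    print(f"{name:<8} gross: {gross:7.2f}  net: {net:7.2f}")
```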
d) Storage and Retrieval
Storing: Data and information are retained for future use; storing is an essential data processing
activity. Of course, data should be stored only if the value of having them in future exceeds
the storage cost.
Retrieving: To retrieve means to recover or find again the stored data or information.
Retrieval techniques use data storage devices. Thus, whether in file cabinets or in
computers, data can be recalled for further processing. Retrieval and comparison of old data
gives meaning to current information.
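A minimal sketch of storing and retrieving is shown below: processed records are written to a file and recalled later for further processing. The file name "payroll_week27.json" and the record contents are hypothetical.

```python
# Sketch of storing and retrieving: records are retained on disk and
# recalled later, e.g. to compare old data with current figures.
# The file name and record contents are assumptions for illustration.
import json

records = [{"emp_no": 170, "name": "Mohd", "gross": 340.00}]

# Storing: retain the data for future use.
with open("payroll_week27.json", "w") as f:
    json.dump(records, f)

# Retrieving: recover the stored data for further processing.
with open("payroll_week27.json") as f:
    old_records = json.load(f)

print(old_records)
```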
e) Communication
f) Reproduction
[Figure: data collection, input and conversion stages of the data processing cycle]
A system can be defined as a group of interrelated components that work together towards a common
goal by "accepting inputs and producing outputs in an organised process". For example, a production
system accepts raw material as input and produces finished goods as output.
Similarly, a data processing system can be viewed as a system that uses data as input and
processes this data to produce information as output.
[Figure: INPUT → PROCESSING → OUTPUT]
There are many kinds of data processing systems. A manual data processing system is one that
utilizes tools like pens and filing cabinets. A mechanical data processing system uses devices such
as typewriters, calculating machines and bookkeeping machines. Finally, electronic data
processing uses computers to automatically process data.
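The sketch below illustrates the input–processing–output view of an electronic data processing system: raw data goes in, and information comes out. The function and field names are illustrative assumptions, not part of the study text.

```python
# Sketch of INPUT -> PROCESSING -> OUTPUT: hours worked (data) go in,
# a payroll line (information) comes out. Names are hypothetical.
def process(employee):
    """Processing: turn raw data into information (gross wages)."""
    return {"name": employee["name"],
            "gross": employee["hours"] * employee["rate"]}

raw_input_data = {"name": "Mohd", "hours": 34, "rate": 10.00}  # input
information = process(raw_input_data)                          # processing
print(information)                                             # output
```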
a) Data Organisation
Having discussed the Data Processing Cycle (also called Information Processing Cycle) and the
components of a computer, we will now describe how data is organised before processing on a
computer. Data can be arranged in a variety of ways, but a hierarchical approach to organisation is
generally recommended.
Data Item: A data item is the smallest unit of information stored in a computer file. It is a
single element used to represent a fact, such as an employee's name or an item price. In a
payroll application, the employee number 170 is a data item; the employee name Mohd is also a
data item.
Field: Data items are physically arranged as fields in a computer file. Their length may be
fixed or variable. Since all employees have 3-digit employee numbers, a 3-digit field is
required to store that data; hence it is a fixed-length field. In contrast, since
customers' names vary considerably from one customer to another, a variable amount of
space must be available to store this element; this is called a variable-length field.
Record: A record is a collection of related data items or fields. Each record normally
corresponds to a specific unit of information. For example, various fields in the record,
illustrated in Fig. 1.2(a), are employee number, employee's name, basic salary and house
rent allowance. This is the data used to produce the payroll register report. The first record
contains all the data concerning the employee MOHD. Each subsequent record contains all
the data for a given employee. It can be seen how each related item is grouped together to
form a record.
File: The collection of records is called a file. A file contains all the related records for
an application. Therefore, the payroll file shown in Fig. 1.5 contains all records required to
produce the payroll register report. Files are stored on some medium, such as floppy
disk, magnetic tape or magnetic disk.
Database: The collection of related files is called a database. A database contains all the
related files for a particular application.
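The sketch below models the data hierarchy just described (data item, field, record, file, database) with ordinary Python structures. This is only an illustrative assumption; a real system would hold files on a storage medium or in a database management system, and the salary figures are placeholders.

```python
# Sketch of the data hierarchy: data item -> field -> record -> file -> database.
# Structures and values are illustrative placeholders.
data_item = "Mohd"                      # a single data item (an employee's name)

record = {                              # a record: related fields grouped together
    "emp_no": 170,                      # fixed-length field (3-digit number)
    "name": "Mohd",                     # variable-length field
    "basic_salary": 340.00,             # placeholder value
    "house_rent_allowance": 50.00,      # placeholder value
}

payroll_file = [record]                 # a file: all related records for an application

database = {                            # a database: the related files for an application
    "payroll": payroll_file,
    "employees": [],
}

print(database["payroll"][0]["name"])
```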
A physical record is the amount of data that is treated as a single unit by the input-output device.
Portions of the same logical record may be located in different physical records, or several logical
records may be located in one physical record. For example, in the case of magnetic tape, a number of
logical records are stored in the form of a block to increase the data transfer speed, and this block
is referred to as a physical record.
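A minimal sketch of blocking follows: several logical records are grouped into one physical record (block) before being written, as on magnetic tape. The blocking factor of 4 is an arbitrary illustrative choice.

```python
# Sketch of blocking: logical records are grouped into physical records
# (blocks) that the I/O device transfers as single units.
# The blocking factor is an arbitrary illustrative choice.
logical_records = [f"record-{i}" for i in range(10)]
BLOCKING_FACTOR = 4

physical_records = [
    logical_records[i:i + BLOCKING_FACTOR]
    for i in range(0, len(logical_records), BLOCKING_FACTOR)
]

# Each inner list is one block.
for block in physical_records:
    print(block)
```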
ITQ
1. Identify various data processing activities.
2. Define the various steps of the data processing cycle.
3. Define the terms data, data processing and information.
ITA
1. Data processing activities are grouped under the following five basic categories:
i. Collection
ii. Conversion
iii. Manipulation
iv. Storage and retrieval
v. Communication
2. The various steps involved in data processing cycle are as follows:
i. Data input
ii. Data processing
iii. Output
iv. Storage
3. The word data is the plural of datum which means fact,
observation, assumption or occurrence. On the other hand,
information can be defined as data that has been transformed into a
meaningful and useful form for specific purposes.
Data processing is the process through which facts and figures are collected, assigned meaning,
communicated to others and retained for future use. It is a series of actions or operations that
converts data into useful information. In a data processing system, we include the resources that are
used to accomplish the processing of data.
Summary
In this lesson, we have learnt how raw data is converted into useful information. The importance of
computer in carrying out the various data processing activities has been explained. Discussion about
hierarchy of data is also included in this lesson.
Self-Assessment Questions
1. Explain the following data processing activities:
(a) Sorting
(b) Summarizing
References/Suggested Readings
Introduction
Computer programs are collections of instructions that tell a computer how to interact with the user,
interact with the computer hardware and process data. The first programmable computers required
the programmers to write explicit instructions to directly manipulate the hardware of the computer.
This “machine language” was very tedious to write by hand since even simple tasks such as
printing some output on the screen required 10 or 20 machine language commands. Machine
language is often referred to as a “low level language” since the code directly manipulates the
hardware of the computer.
Study Outcomes:
After going through this lesson, you will be in a position to
1.1 explain programming languages and
1.2 explain real-time and distributed computing.
High-level programming languages, while simple compared to human languages, are more
complex than the languages the computer actually understands, which are called machine languages.
Each type of CPU has its own unique machine language.
Lying between machine languages and high-level languages are languages called assembly
languages. Assembly languages are similar to machine languages but they are much easier to
program in because they allow a programmer to substitute names for numbers. Machine languages
consist of numbers only.
Lying above high-level languages are languages called fourth-generation languages (usually
abbreviated 4GL). 4GLs are far removed from machine languages and represent the class of
computer languages closest to human languages.
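To make the levels concrete, the sketch below shows the same small task at different levels of abstraction. The "assembly" shown in the comments is a generic illustration of names substituted for numbers, not the instruction set of any particular CPU.

```python
# The same task -- add two numbers and keep the result -- at different levels.
#
# Machine language:  patterns of numbers only, e.g. 0001 0110 0010 ...
# Assembly language: names substituted for numbers (generic illustration):
#     LOAD  A
#     ADD   B
#     STORE TOTAL
# High-level language (the lines below):
a_value, b_value = 2, 3
total = a_value + b_value
print(total)
```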
The question of which language is best is one that consumes a lot of time and energy among
computer professionals. Every language has its strengths and weaknesses. For example, FORTRAN
is a particularly good language for processing numerical data, but it does not lend itself very well to
organizing large programs. Pascal is very good for writing well-structured and readable programs
but it is not as flexible as the C programming language. C++ embodies powerful object-oriented
features but it is complex and difficult to learn.
The choice of which language to use depends on the type of computer the program is to run on,
what sort of program it is and the expertise of the programmer.
Program Structure
Real-Time Computing
Real-time computing (RTC), or reactive computing, is the study of hardware and software
systems that are subject to a "real-time constraint", e.g. operational deadlines from event to system
response. Real-time programs must guarantee response within strict time constraints, often referred
to as "deadlines". Real-time responses are often understood to be in the order of milliseconds, and
sometimes microseconds. Conversely, a system without real-time facilities cannot guarantee a
response within any timeframe (regardless of actual or expected response times).
A system is said to be real-time if the total correctness of an operation depends not only upon its
logical correctness but also upon the time in which it is performed. Real-time systems, as well as
their deadlines, are classified by the consequence of missing a deadline:
Hard: missing a deadline is a total system failure.
Firm: infrequent deadline misses are tolerable, but may degrade the system's quality of service; the
usefulness of a result is zero after its deadline.
Soft: the usefulness of a result degrades after its deadline, thereby degrading the system's quality
of service.
Thus, the goal of a hard real-time system is to ensure that all deadlines are met, but for soft real-
time systems, the goal becomes meeting a certain subset of deadlines in order to optimize some
application-specific criteria. The particular criteria optimized depend on the application, but some
typical examples include maximizing the number of deadlines met, minimizing the lateness of tasks
and maximizing the number of high priority tasks meeting their deadlines.
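The sketch below illustrates the soft real-time goal of maximizing the number of deadlines met: each job's completion time is checked against a per-job deadline and misses are counted. The 10 ms budget and the stand-in workload are invented for illustration.

```python
# Sketch of a soft real-time loop: count how many jobs meet a per-job
# deadline. The deadline and workload are illustrative assumptions.
import time

DEADLINE_SECONDS = 0.010   # assumed 10 ms budget per job

def do_work():
    time.sleep(0.002)      # stand-in for real processing

met, missed = 0, 0
for _ in range(100):
    start = time.monotonic()
    do_work()
    elapsed = time.monotonic() - start
    if elapsed <= DEADLINE_SECONDS:
        met += 1
    else:
        missed += 1

print(f"deadlines met: {met}, missed: {missed}")
```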
Hard real-time systems are used when it is imperative that an event be reacted to within a strict
deadline. Such strong guarantees are required of systems for which not reacting in a certain interval
of time would cause great loss in some manner, especially damaging the surroundings physically or
threatening human lives (although the strict definition is simply that missing the deadline
constitutes failure of the system). For example, a car engine control system is a hard real-time
system because a delayed signal may cause engine failure or damage. Other examples of hard real-
time embedded systems include medical systems such as heart pacemakers and industrial process
controllers. Hard real-time systems are typically found interacting at a low level with physical
hardware in embedded systems. Early video game systems such as the Atari 2600 and
Cinematronics vector graphics had hard real-time requirements because of the nature of the
graphics and timing hardware.
In the context of multitasking systems, the scheduling policy is normally priority driven (pre-
emptive schedulers). Other scheduling algorithms include Earliest Deadline First, which, ignoring
the overhead of context switching, is sufficient for system loads of less than 100%. New overlay
scheduling systems, such as an Adaptive Partition Scheduler, assist in managing large systems with
a mixture of hard real-time and non real-time applications.
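A minimal sketch of Earliest Deadline First scheduling follows: at every scheduling decision the ready task with the earliest absolute deadline runs first. The task set and time units are invented for illustration, and each task is simply run to completion.

```python
# Sketch of Earliest Deadline First (EDF): the ready task with the earliest
# absolute deadline runs first. Tasks and timings are illustrative.
import heapq

# (absolute_deadline, name, remaining_time) -- heapq orders by deadline
ready = [(12, "logging", 3), (5, "sensor_read", 2), (8, "control_loop", 4)]
heapq.heapify(ready)

clock = 0
while ready:
    deadline, name, remaining = heapq.heappop(ready)
    clock += remaining                       # run the task to completion
    status = "met" if clock <= deadline else "MISSED"
    print(f"t={clock:2d}  {name:<13} deadline={deadline}  {status}")
```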
Soft real-time systems are typically used where there is some issue of concurrent access and the
need to keep a number of connected systems up to date with changing situations. For example,
software that maintains and updates the flight plans for commercial airliners; the flight plans must
be kept reasonably current but can operate to a latency of seconds. Live audio-video systems are
also usually soft real-time; violation of constraints results in degraded quality, but the system can
continue to operate.
In a real-time DSP process, the analyzed (input) and generated (output) samples can be processed
(or generated) continuously in the time it takes to input and output the same set of samples,
independent of the processing delay. It means that the processing delay must be bounded even if
the processing continues for an unlimited time. That means that the mean processing time per
sample is no greater than the sampling period, which is the reciprocal of the sampling rate. This is
the criterion whether the samples are grouped together in large segments and processed as blocks or
are processed individually, and whether there are long, short, or non-existent input and output
buffers.
Consider an audio DSP example; if a process requires 2.01 seconds to analyze, synthesize, or
process 2.00 seconds of sound, it is not real-time. If it takes 1.99 seconds, it is or can be made into a
real-time DSP process.
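The check below sketches this criterion: a process is real-time if the mean processing time per sample does not exceed the sampling period (the reciprocal of the sampling rate). The 44,100 Hz sampling rate is an assumed figure; the durations mirror the audio example above.

```python
# Sketch of the real-time DSP criterion: mean processing time per sample
# must not exceed the sampling period. Sampling rate is an assumption.
def is_real_time(processing_seconds, audio_seconds, sampling_rate=44_100):
    samples = audio_seconds * sampling_rate
    mean_time_per_sample = processing_seconds / samples
    sampling_period = 1.0 / sampling_rate
    return mean_time_per_sample <= sampling_period

print(is_real_time(2.01, 2.00))  # False: the process falls behind the input
print(is_real_time(1.99, 2.00))  # True: it can keep up
```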
A common real-life analogy is standing in a line or queue waiting for the checkout in a grocery store.
If the line asymptotically grows longer and longer without bound, the checkout process is not real-
time. If the length of the line is bounded, customers are being "processed" and output as rapidly, on
average, as they are being inputted, and that process is real-time. The grocer might go out of
business, or will at least lose business, if he/she cannot make the checkout process real-time (so
it is fundamentally important that this process be real-time).
A signal-processing algorithm that cannot keep up with the flow of input data, with output falling
farther and farther behind the input, is not real-time. But if the delay of the output (relative to the
input) is bounded for a process that operates over an unlimited time, then that signal-processing
algorithm is real-time, even if the throughput delay may be very long.
Some kinds of software, such as many chess-playing programs, can fall into either category. For
instance, a chess program designed to play in a tournament with a clock will need to decide on a
move before a certain deadline or lose the game, and is therefore a real-time computation, but a
chess program that is allowed to run indefinitely before moving is not. In both of these cases,
however, high performance is desirable: the more work a tournament chess program can do in the
allotted time, the better its moves will be, and the faster an unconstrained chess program runs, the
sooner it will be able to move. This example also illustrates the essential difference between real-
time computations and other computations: if the tournament chess program does not make a
decision about its next move in its allotted time it loses the game—i.e., it fails as a real-time
computation—while in the other scenario, meeting the deadline is assumed not to be necessary.
High-performance is indicative of the amount of processing that is performed in a given amount of
time, while real-time is the ability to be done with the processing to yield a useful output in the
available time.
Distributed System
A distributed system links a number of independent computing entities with local properties by
way of a communication mechanism. Consequently, algorithms and other design components must
take into consideration the synchrony and the failure model. A useful summary (not entirely
objective) of distributed computing concerns is included in Deutsch's Eight Fallacies of Distributed
Computing. All of these are useful to consider in (real-time) distributed design; each is a departure
point for essential design and implementation concerns:
Time synchronization
What are the requirements for, and mechanisms for achieving, clock synchrony? Many
applications require only NTP; more stringent requirements may necessitate special hardware
(e.g. IRIG-B) or other approaches.
Synchrony requirements
What are the synchrony assumptions constraining and requirements for system synchrony?
This is connected to clock synchrony but not identical.
Design patterns
What are the moving parts, and how do they relate over the transport? (In particular, how do
these relationships affect timeliness?)
Middleware
How are you going to encode the distributed aspects of the system? Examples include Real-
Time CORBA.
Time Constraints
How are you going to document, measure, and enforce time constraints in the system? (A small
sketch of measuring a per-call time budget follows this list.)
Partial Failure. A real-time system typically has reliability requirements. One of the unique
aspects of distributed systems is the potential for whole classes of failures called "partial"
failures, due either to true crash/communications failures or to timeliness errors that must be
treated as failures.
RTOS. What real-time operating system(s) will be employed?
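The sketch below, referred to under Time Constraints above, shows one way to measure and enforce a time budget on a call in a distributed system: each invocation is timed and violations are logged so they can be treated as partial failures. The 50 ms budget and the fetch_flight_plan stand-in are invented for illustration.

```python
# Sketch of measuring and enforcing a per-call time constraint.
# The budget and the fetch_flight_plan stand-in are hypothetical.
import time
import logging

logging.basicConfig(level=logging.WARNING)

def time_constrained(budget_seconds):
    def wrap(func):
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            result = func(*args, **kwargs)
            elapsed = time.monotonic() - start
            if elapsed > budget_seconds:
                # A timeliness error: log it so it can be handled as a failure.
                logging.warning("%s exceeded %.3fs budget (took %.3fs)",
                                func.__name__, budget_seconds, elapsed)
            return result
        return wrapper
    return wrap

@time_constrained(0.050)
def fetch_flight_plan(flight_id):
    time.sleep(0.02)                 # stand-in for a remote call
    return {"flight": flight_id}

print(fetch_flight_plan("AZ123"))
```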
ITQ
1. What are computer programs?
ITA
1. These are collections of instructions that tell a computer how to interact with the user and
the computer hardware as well as process data.
Self-Assessment Question
1. Distinguish between real-time and distributed computing.
References/Suggested Readings