CS121 Lecture Note
FACULTY OF SCIENCE
MATHEMATICAL SCIENCE DEPARTMENT
CS 121
Lecture note
Basic Data Structures: Meaning of data structure. Brief discussion on Array, Linked lists,
stack, queues, tree: tree traversal, use of
A computer is a programmable machine that accepts data, stores and manipulates it,
and provides output in a useful format.
In the early stages of its development, the computer was seen as a mere tool to assist in
computation. However, it has now grown into a sophisticated tool that can be
programmed to do a variety of tasks. In fact, it is hard to think of any field or area of human
endeavour where computers are not employed and heavily relied upon.
Most of the computers that we see and use today are based on the Von Neumann architecture.
The basic idea behind the Von Neumann architecture is the concept of the ‘stored program’.
That is, the computer works under the control of instructions, called a program (software),
stored in its memory. The CPU fetches (reads) both the data and the instructions from main
memory, decodes (interprets) the instructions and finally executes them. This constitutes what is
called the fetch-execute cycle, on the basis of which CPU speed is measured.
Our focus as students of Computer Science is not limited to knowing the principles behind
how computers operate; we must also be able to create and analyse computational procedures that,
when implemented, cause a computer to carry out meaningful tasks. The fields of Computer
Science and Software Engineering primarily focus on the design and implementation of
software.
There are many definitions of “problem solving”. Here is one: problem solving is the process
of analysing a given situation and generating an appropriate response or solution. In the context
of Computer Science, problem solving can be defined as the process of analysing a given
problem and producing its solution in the form of an algorithm that can be implemented using a
programming language on the computer.
The following problem-solving steps are expected to be adhered to strictly. As you progress
in studying Computer Science, you will see that it is these steps that evolve into what is
called the Software Development Life Cycle (SDLC), or software process models such as the
Waterfall model, the V-model, Agile, etc.
2.2 Algorithm
When you start your car, you follow a step-by-step procedure – i.e, the algorithm might look
something like this:
Attributes of Algorithm
i. Precision – an algorithm must be clear and give the correct solution in all cases.
ii. Unambiguity – every step of an algorithm must have exactly one interpretation, so that
it is always clear what is to be done and in what order. An algorithm should also not
contain a series of steps performing the same task.
iii. Finiteness – an algorithm should not run forever. It must have an end.
iv. Efficiency – the efficiency of an algorithm has to do with how much of various types of
resources it consumes, for example processing time and memory. Some algorithms
consume much memory and time when executed. In computer science, we are
interested in algorithms that are time and space (memory) efficient.
As far as problem solving using a computer is concerned, the aim is to implement an algorithm using
a programming language. To give a better understanding of the steps in an algorithm, and so
make implementation easier, an algorithm can be represented in two forms: (1) pseudocode and (2)
flowchart.
Although flowcharts can be visually appealing, pseudocode is often the preferred choice for
algorithm development because of the following reasons:
Pseudocode will vary according to whoever writes it. That is, one person’s pseudocode is
often quite different from that of another person. However, there are some common control
structures (i.e., features) that appear whenever we write pseudocode.
The point is that there are a variety of ways to write pseudocode. The important thing to
remember is that your algorithm should be clearly explained, with no ambiguity as to the
order in which your steps are performed.
2.4 Flowchart
A flowchart is used to represent an algorithm in diagrammatic form. A flowchart uses symbols
to show the operations and decisions to be followed by a computer in solving a problem. A
flowchart is a convenient technique to show the flow of control in a program. The flowchart is one
of the programming tools that help programmers understand an algorithm better.
In fact, flowcharts are the plan to be followed when the program is written. Expert
programmers may write programs without drawing the flowcharts. But for a beginner it is
recommended that a flowchart should be drawn before writing a program. This reduces the
number of errors and omissions that might occur during coding/program creation. The
flowchart symbols are as follows:
Advantages of Flowchart
i. Flowcharts guide or help programmer in writing the actual code in a high level
language.
ii. A flowchart is an excellent communication technique that explains the logic of a
program to other programmers/people.
iii. A flowchart is an important tool in the hands of a programmer, which helps in
designing the test data for systematic testing of programs.
iv. Flowcharts are used as working models in designing new programs and software
systems.
Disadvantages
Whether using a flowchart or pseudocode, you should test your algorithm by manually going
through the steps in your head to make sure that you did not forget a step or a special
situation. Often, you will find a flaw in your algorithm because you forgot about a special
situation that could arise. Only when you are convinced that your algorithm will solve your
problem, should you go ahead to the next step.
Example 1
Problem analysis
This problem requires the length and breadth of the rectangle to be specified. Hence it
requires input. The length and breadth can be integer or floating point numbers, so the output
(area of the rectangle) should be a floating point number (to generalize the case).
start
print “enter length”
input length
print “enter breadth”
input breadth
area = length * breadth
print area
stop
Example 2
Write an algorithm/pseudocode to find the average of the positive integers from 1 to 20 (i.e.,
1, 2, 3, 4, ..., 20). Represent the algorithm in a flowchart.
Problem analysis
This problem does not require any input. All that is needed is to have a variable initialized to 1
and then continuously increment the variable until it reaches 20 while accumulating the
sum. The output should be a floating point number.
Algorithm
1. Start
2. Declare a variable n and initialize it to 1 (i.e., n = 1)
3. Declare a variable sum and initialize it to 0 (i.e., sum = 0)
4. Repeat steps 5 and 6 while n is less than or equal to 20, otherwise go to step 7
5. Add n to sum (sum = sum + n)
6. Increment n by 1 (i.e., add 1 to n so that n will point to the next integer in the sequence)
7. Divide sum by 20 and assign the result to average.
8. Print average
9. Stop
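start
n = 1
sum = 0
REPEAT while n ≤ 20
    sum = sum + n
    n = n + 1
endREPEAT
average = sum / 20
print average
stop
[Flowchart: average of the integers from 1 to 20. The decision box tests n <= 20; on Y the flow performs sum = sum + n and n = n + 1 and returns to the test; on N it computes average = sum / 20, prints average and stops.]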
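The loop above can be sketched in Python as follows. This is only an illustrative translation: the function name `average_of_first` and its `limit` parameter are assumptions introduced here, and the accumulator is called `total` rather than `sum` so that it does not hide Python's built-in `sum` function.

```python
def average_of_first(limit=20):
    """Average of the integers 1..limit, following the REPEAT loop step by step."""
    n = 1          # step 2: start at the first integer
    total = 0      # step 3: running sum (named `sum` in the pseudocode)
    while n <= limit:        # step 4: loop test
        total = total + n    # step 5: accumulate
        n = n + 1            # step 6: move to the next integer
    return total / limit     # step 7: division always yields a float in Python


print(average_of_first())  # 10.5
```

Tracing the loop by hand gives sum = 210 after the last pass, and 210 / 20 = 10.5, which matches the printed value.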
Example 4
Problem analysis - This problem does not require any input. The output should be an integer
number. We first need to know what an even number is.
EvenNumberCounter = 0
Number = 1
REPEAT while Number <= 1000
    Remainder = Number % 2
    IF Remainder == 0 THEN
        EvenNumberCounter = EvenNumberCounter + 1
        Number = Number + 1
    ELSE
        Number = Number + 1
    endIF
endREPEAT
Print EvenNumberCounter
Exit
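As a rough Python sketch of the same counting loop (the function name `count_even_numbers` and the `limit` parameter are assumptions made for illustration), note that the increment of the number happens in both branches of the pseudocode, so it can be written once after the test:

```python
def count_even_numbers(limit=1000):
    """Count the even numbers from 1 to limit using the remainder test."""
    even_number_counter = 0
    number = 1
    while number <= limit:
        remainder = number % 2       # 0 for even numbers, 1 for odd numbers
        if remainder == 0:
            even_number_counter += 1
        number += 1                  # done in both IF and ELSE branches of the pseudocode
    return even_number_counter


print(count_even_numbers())  # 500
```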
Example 5: Write an algorithm that reads an integer value and determines whether the integer
is even or odd.
Start
input number
IF number is integer THEN
    remainder = number % 2
    IF remainder == 0 THEN
        Print “the number is even”
    ELSE
        Print “the number is odd”
    endIF
ELSE
    Print “please enter an integer value”
    Goto step 2
endIF
Stop
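A minimal Python sketch of this check is given below. It is an illustration under stated assumptions: the function name `classify` is invented here, the input is taken as text (as it would arrive from a keyboard), and instead of jumping back to the input step the function simply returns the error message so that the caller can ask again.

```python
def classify(text):
    """Report whether the text holds an even integer, an odd integer, or neither."""
    try:
        number = int(text)           # fails for input such as "4.5" or "abc"
    except ValueError:
        return "please enter an integer value"
    remainder = number % 2
    if remainder == 0:
        return "the number is even"
    return "the number is odd"


print(classify("8"))
print(classify("13"))
print(classify("4.5"))
```

A program that reads from the keyboard could call `classify(input())` inside a loop and repeat until the result is not the error message, which plays the role of the "Goto step 2" line.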
[Flowchart to determine whether an input number is even or odd]
[Flowchart to count even numbers from 1 to 1000]
So where do algorithms come from? Usually they have to be developed, often with a lot of
thought and hard work. Skill in algorithm development is something that comes with
practice, but there are techniques and guidelines that help.
EXERCISES
1. Write an algorithm or pseudocode that displays “Hello world” ten times, and represent
the algorithm in a flowchart.
1. Start
2. Print “please enter your age”
3. Input a number and assign it to variable age
4. IF age >= 18 THEN
Print “You are qualified to vote.”
5. ELSE
Print “You are not qualified to vote”
endIF
6. Stop
1. Start
2. Declare a variable N and assign 1 to it (i.e., N = 1).
3. While N is less than or equal to 10, do steps 4 and 5 and return to step 3, else go
to step 6
4. Print N
5. Increment N by 1 (i.e., N = N + 1)
6. Stop
Different approaches to programming have developed over time. Notably, an early approach
is structured programming, which has been advocated since the mid-1960s. The concept of a
"programming paradigm" as such dates at least to 1978.
In Computer Science, a programming paradigm can be seen as a pattern that serves as a school
of thought for the programming of computers. Programming paradigms are ways to classify
programming languages based on their features [Wikipedia]. Each paradigm supports a set of
concepts that makes it the best for a certain kind of problem.
i. Procedural/imperative paradigm
ii. Functional paradigm
iii. Logic paradigm
iv. Object Oriented paradigm
Logic programming is rule-based. The program consists of axioms and inference rules. Logic
programming is the use of a formal logic notation to communicate computational processes
to a computer. Predicate calculus is the notation used in current logic programming
languages. Logic programs are declarative rather than procedural, which means that only the
specifications of the desired results are stated rather than detailed procedures for producing
them. The logic programming paradigm fits well with problem domains that deal with the extraction
of knowledge from basic facts and relations. Program execution becomes a systematic search
in a set of facts, making use of a set of inference rules. An example is Prolog.
Bit – stands for binary digit (i.e., 0 or 1). The bit is the basic unit of information in digital
computers. Data in digital computers are represented as sequences of bits (0s and 1s), which
correspond to the electrical states off or on, or high or low. When a character is pressed on the
keyboard, the computer internally recognizes the character as a sequence of bits. Electrical
signals such as voltage exist throughout the computer in one of two recognizable
states (0 or 1). The manipulation of binary information is done by logic circuits called gates.
Gates are blocks of hardware that produce signals of binary 1 or 0 when input logic
requirements are satisfied. A variety of logic gates are used in digital systems. Each gate has
a distinct graphic symbol and its operation can be described by means of an algebraic expression.
1 Byte = 8 bits
1 KB = 2^10 bytes = 1024 bytes, i.e., thousands of bytes
1 MB = 2^10 KB = 1024 KB = (1024 × 1024 bytes), i.e., millions of bytes
1 GB = 1024 MB, i.e., billions of bytes
1 TB = 1024 GB
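The conversions in this table can be checked with a few lines of Python. The constant names and the helper `to_bytes` are assumptions chosen for this sketch; each unit is defined as 2^10 times the one before it, exactly as in the table.

```python
BYTE = 1
KB = 2**10 * BYTE   # 1024 bytes
MB = 2**10 * KB     # 1024 * 1024 bytes
GB = 2**10 * MB
TB = 2**10 * GB


def to_bytes(amount, unit):
    """Convert an amount expressed in the given unit into a whole number of bytes."""
    return int(amount * unit)


print(to_bytes(2, GB))    # 2147483648
print(to_bytes(0.5, MB))  # 524288
```

The same helper can be used to check an answer to the memory card exercise once it has been worked out by hand.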
Exercise: Work out the number of bytes of a memory card whose capacity is 1.5GB.
Word – is a sequence of bits of a particular size or length. Words can be of 16 bits, 32 bits, 64
bits, or any other size that makes sense within the context of a computer’s organization. [A
computer whose word size is 64 bits, typically called a 64-bit machine, is capable of fetching
64 bits of data/instruction from main memory.] Operating systems (OS), antivirus and other
application software are designed based on the word length of the microprocessor. A 32-bit OS can
be installed on a 64-bit machine, but a 64-bit OS can never be installed on a 32-bit machine.
File –
A database usually comes as part of application software called a DBMS (Database Management
System). A DBMS is software that consists of an integrated set of computer programs that allow
users to interact with one or more databases and provide access to the data contained in the
database. A DBMS is a generalized software system for manipulating databases. A DBMS
supports a logical view (schema, subschema); physical view (access methods, data
clustering); data definition language; data manipulation language; and important utilities,
such as transaction management and concurrency control, data integrity, crash recovery, and
security. Examples of DBMS are MySQL, Microsoft Access, Microsoft SQL Server, and IBM
Dbase II.
DATA STRUCTURES
A data structure is a particular way of organizing data in a computer so that it can be used
efficiently. Data structures can implement one or more particular abstract data types (ADT),
which specify the operations that can be performed on a data structure and the computational
complexity of those operations. In comparison, a data structure is a concrete implementation
of the specification provided by an ADT.
Different kinds of data structures are suited to different kinds of applications, and some are
highly specialized to specific tasks. For example, relational databases commonly use B-tree
indexes for data retrieval, while compiler implementations usually use hash tables to look up
identifiers.
Data structures provide a means to manage large amounts of data efficiently for uses such as
large databases and internet indexing services. Usually, efficient data structures are key to
designing efficient algorithms. Some formal design methods and programming languages
emphasize data structures, rather than algorithms, as the key organizing factor in software
design. Data structures can be used to organize the storage and retrieval of information stored
in both main memory and secondary memory.
There are numerous types of data structures, generally built upon simpler primitive data
types:
An array is a number of elements in a specific order, typically all of the same type.
Elements are accessed using an integer index to specify which element is required
(Depending on the language, individual elements may either all be forced to be the
same type, or may be of almost any type). Typical implementations allocate
contiguous memory words for the elements of arrays (but this is not always a
necessity). Arrays may be fixed-length or resizable.
A linked list (also just called list) is a linear collection of data elements of any type,
called nodes, where each node has itself a value, and points to the next node in the
linked list. The principal advantage of a linked list over an array, is that values can
ARRAY
An array data structure, or simply an array, is a data structure consisting of a collection of
elements (values or variables), each identified by at least one array index or key. An array is
stored so that the position of each element can be computed from its index tuple by a
mathematical formula. The simplest type of data structure is a linear array, also called one-
dimensional array.
The memory address of the first element of an array is called first address or foundation
address.
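The "mathematical formula" mentioned above can be illustrated with a short Python sketch. It assumes zero-based indexing, elements of equal size stored contiguously, and row-major order for the two-dimensional case; the function names and the sample addresses are made up for illustration.

```python
def element_address(first_address, index, element_size):
    """Address of element `index` in a one-dimensional array."""
    return first_address + index * element_size


def element_address_2d(first_address, row, col, num_cols, element_size):
    """Address of element (row, col) in a two-dimensional array stored row by row."""
    return first_address + (row * num_cols + col) * element_size


# An array of 4-byte integers whose first element is stored at address 1000.
print(element_address(1000, 3, 4))           # 1012
print(element_address_2d(1000, 1, 2, 4, 4))  # 1024
```

Because the address is obtained by arithmetic alone, reaching any element takes the same amount of work regardless of its position.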
The term array is often used to mean array data type, a kind of data type provided by most
high-level programming languages that consists of a collection of values or variables that can
be selected by one or more indices computed at run-time. Array types are often implemented
by array structures; however, in some languages they may be implemented by hash tables,
linked lists, search trees, or other data structures.
Arrays are used to implement mathematical vectors and matrices, as well as other kinds of
rectangular tables. Many databases, small and large, consist of (or include) one-dimensional
arrays whose elements are records.
When data objects are stored in an array, individual objects are selected by an index that is
usually a non-negative scalar integer. Indexes are also called subscripts. An index maps the
array value to a stored object.
There are three ways in which the elements of an array can be indexed:
0 (zero-based indexing): The first element of the array is indexed by subscript of 0.
LINKED LIST
A linked list is a linear collection of data elements in which the linear order is not given by their
physical placement in memory. Instead, each element points to the next by means of a pointer. It is a
data structure consisting of a group of nodes which together represent a sequence. Under the
simplest form, each node is composed of data and a reference (in other words, a link) to the
next node in the sequence. This structure allows for efficient insertion or removal of elements
from any position in the sequence during iteration. More complex variants add additional
links, allowing efficient insertion or removal from arbitrary element references.
A linked list whose nodes contain two fields: an integer value and a link to the next node. The
last node is linked to a terminator used to signify the end of the list.
The principal benefit of a linked list over a conventional array is that the list elements can
easily be inserted or removed without reallocation or reorganization of the entire structure
because the data items need not be stored contiguously in memory or on disk, while an array
has to be declared in the source code, before compiling and running the program. Linked lists
allow insertion and removal of nodes at any point in the list, and can do so with a constant
number of operations if the link previous to the link being added or removed is maintained
during list traversal.
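A minimal sketch of a singly linked list in Python follows. The class and function names are assumptions made for this illustration; Python object references stand in for pointers, and `None` plays the role of the terminator. Insertion and removal after a known node each touch only a constant number of links, as described above.

```python
class Node:
    """One node: a value plus a link to the next node (None marks the end)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def insert_after(node, value):
    """Insert a new node right after `node` by rewiring two links."""
    node.next = Node(value, node.next)
    return node.next


def remove_after(node):
    """Unlink and return the node that follows `node`, if there is one."""
    removed = node.next
    if removed is not None:
        node.next = removed.next
    return removed


def to_list(head):
    """Traverse from the head and collect the values in order."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


head = Node(12, Node(99, Node(37)))
insert_after(head, 50)     # 12 -> 50 -> 99 -> 37
remove_after(head.next)    # removes 99
print(to_list(head))       # [12, 50, 37]
```

No other node is moved or copied by either operation, which is the contrast with an array that the paragraph above draws.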
Advantages
Linked lists are a dynamic data structure, which can grow and be pruned, allocating
and deallocating memory while the program is running.
Insertion and deletion node operations are easily implemented in a linked list.
Dynamic data structures such as stacks and queues can be implemented using a linked
list.
There is no need to define an initial size for a linked list.
Items can be added or removed from the middle of list.
Backtracking is possible in two way linked list.
Disadvantages
They use more memory than arrays because of the storage used by their pointers.
Nodes in a linked list must be read in order from the beginning as linked lists are
inherently sequential access.
Nodes are stored non-contiguously, greatly increasing the time required to access
individual elements within the list, especially with a CPU cache.
Difficulties arise in linked lists when it comes to reverse traversing. For instance,
singly linked lists are cumbersome to navigate backwards and while doubly linked
lists are somewhat easier to read, memory is consumed in allocating space for a back-
pointer.
The field of each node that contains the address of the next node is usually called the 'next
link' or 'next pointer'. The remaining fields are known as the 'data', 'information', 'value',
'cargo', or 'payload' fields.
The 'head' of a list is its first node. The 'tail' of a list may refer either to the rest of the list after
the head, or to the last node in the list. In Lisp and some derived languages, the next node
may be called the 'cdr' (pronounced could-er) of the list, while the payload of the head node
may be called the 'car'.
Singly linked lists contain nodes which have a data field as well as a 'next' field, which points
to the next node in line of nodes. Operations that can be performed on singly linked lists
include insertion, deletion and traversal.
In a 'doubly linked list', each node contains, besides the next-node link, a second link field
pointing to the 'previous' node in the sequence. The two links may be called 'forward(s)' and
'backwards', or 'next' and 'prev' ('previous').
A doubly linked list whose nodes contain three fields: an integer value, the link forward to the
next node, and the link backward to the previous node
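The three-field node described here might be sketched in Python as below. The names `DNode`, `link`, `forward` and `backward` are assumptions for illustration only. The back-pointer makes reverse traversal as simple as forward traversal, at the cost of one extra link per node.

```python
class DNode:
    """Node of a doubly linked list: a value, a 'next' link and a 'prev' link."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


def link(values):
    """Build a doubly linked list and return its (head, tail) pair."""
    head = tail = None
    for value in values:
        node = DNode(value)
        if tail is None:
            head = node          # first node becomes the head
        else:
            tail.next = node     # forward link
            node.prev = tail     # backward link
        tail = node
    return head, tail


def forward(head):
    """Values from head to tail, following the 'next' links."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


def backward(tail):
    """Values from tail to head, following the 'prev' links."""
    values = []
    while tail is not None:
        values.append(tail.value)
        tail = tail.prev
    return values


head, tail = link([12, 99, 37])
print(forward(head))    # [12, 99, 37]
print(backward(tail))   # [37, 99, 12]
```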
Many modern operating systems use doubly linked lists to maintain references to active
processes, threads, and other dynamic objects. A common strategy for rootkits to evade
detection is to unlink themselves from these lists.
In a 'multiply linked list', each node contains two or more link fields, each field being used to
connect the same set of data records in a different order (e.g., by name, by department, by
date of birth, etc.). While doubly linked lists can be seen as special cases of multiply linked
list, the fact that the two orders are opposite to each other leads to simpler and more efficient
algorithms, so they are usually treated as a separate case.
In the last node of a list, the link field often contains a null reference, a special value used to
indicate the lack of further nodes. A less common convention is to make it point to the first
node of the list; in that case the list is said to be 'circular' or 'circularly linked'; otherwise it is
said to be 'open' or 'linear'.
Sentinel nodes
In some implementations an extra 'sentinel' or 'dummy' node may be added before the first
data record or after the last one. This convention simplifies and accelerates some list-handling
algorithms, by ensuring that all links can be safely dereferenced and that every list (even one
that contains no data elements) always has a "first" and "last" node.
Empty lists
An empty list is a list that contains no data records. This is usually the same as saying that it
has zero nodes. If sentinel nodes are being used, the list is usually said to be empty when it
has only sentinel nodes.
Hash linking
The link fields need not be physically part of the nodes. If the data records are stored in an
array and referenced by their indices, the link field may be stored in a separate array with the
same indices as the data records.
List handles
Since a reference to the first node gives access to the whole list, that reference is often called
the 'address', 'pointer', or 'handle' of the list. Algorithms that manipulate linked lists usually
get such handles to the input lists and return the handles to the resulting lists. In fact, in the
context of such algorithms, the word "list" often means "list handle". In some situations,
however, it may be convenient to refer to a list by a handle that consists of two links, pointing
to its first and last nodes.
Combining alternatives
The alternatives listed above may be arbitrarily combined in almost every way, so one may
have circular doubly linked lists without sentinels, circular singly linked lists with sentinels,
etc.
STACK
A stack is an abstract data type that serves as a collection of elements, with two principal
operations: push, which adds an element to the collection, and pop, which removes the most
recently added element that was not yet removed. The order in which elements come off a
Example of stack
The name "stack" for this type of structure comes from the analogy to a set of physical items
stacked on top of each other, which makes it easy to take an item off the top of the stack,
while getting to an item deeper in the stack may require taking off multiple other items first.
Considered as a linear data structure, or more abstractly a sequential collection, the push and
pop operations occur only at one end of the structure, referred to as the top of the stack. This
makes it possible to implement a stack as a singly linked list and a pointer to the top element.
A stack may be implemented to have a bounded capacity. If the stack is full and does not
contain enough space to accept an entity to be pushed, the stack is then considered to be in an
overflow state. The pop operation removes an item from the top of the stack.
The push operation adds an element and increments the top index, after checking for
overflow:
Similarly, pop decrements the top index after checking for underflow, and returns the item
that was previously the top one:
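The following is a minimal sketch of an array-based stack with bounded capacity, written in Python. The class name `BoundedStack` and the choice of exceptions are assumptions made for this illustration; here `top` is the index of the next free slot, so it also equals the number of items stored.

```python
class BoundedStack:
    """Stack with fixed capacity, stored in a list used as an array."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = 0   # index of the next free slot

    def push(self, item):
        if self.top == len(self.items):      # check for overflow first
            raise OverflowError("stack overflow")
        self.items[self.top] = item
        self.top += 1                        # then increment the top index

    def pop(self):
        if self.top == 0:                    # check for underflow first
            raise IndexError("stack underflow")
        self.top -= 1                        # then decrement the top index
        item = self.items[self.top]
        self.items[self.top] = None          # clear the vacated slot
        return item


stack = BoundedStack(3)
stack.push("a")
stack.push("b")
print(stack.pop())   # b
print(stack.pop())   # a
```

The last item pushed is the first one popped, and both operations work only at the top of the stack.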
QUEUE
A queue is a particular kind of abstract data type or collection in which the entities in the
collection are kept in order and the principal (or only) operations on the collection are the
addition of entities to the rear terminal position, known as enqueue, and removal of entities
from the front terminal position, known as dequeue. This makes the queue a First-In-First-
Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will
be the first one to be removed. This is equivalent to the requirement that once a new element
is added, all elements that were added before have to be removed before the new element can
be removed. Often a peek or front operation is also provided, returning the value of the front
element without dequeuing it. A queue is an example of a linear data structure, or more
abstractly a sequential collection.
Queues provide services in computer science, transport, and operations research where
various entities such as data, objects, persons, or events are stored and held to be processed
later. In these contexts, the queue performs the function of a buffer.
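A queue with the enqueue, dequeue and peek operations described above can be sketched in Python on top of the standard `collections.deque`. The class name `Queue` and its method names are assumptions chosen to match the terms used in these notes.

```python
from collections import deque


class Queue:
    """First-In-First-Out queue: add at the rear, remove from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)           # add at the rear

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from an empty queue")
        return self._items.popleft()       # remove from the front

    def peek(self):
        if not self._items:
            raise IndexError("peek at an empty queue")
        return self._items[0]              # front element, not removed

    def __len__(self):
        return len(self._items)


q = Queue()
for job in ["print report", "send mail", "save file"]:
    q.enqueue(job)
print(q.peek())      # print report
print(q.dequeue())   # print report
print(q.dequeue())   # send mail
```

The jobs leave the queue in the order in which they arrived, which is the buffering behaviour described in the paragraph above.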
TREE
A tree structure or tree diagram is a way of representing the hierarchical nature of a
structure in a graphical form. It is named a "tree structure" because the classic representation
resembles a tree, even though the chart is generally upside down compared to an actual tree,
with the "root" at the top and the "leaves" at the bottom.
A tree structure is conceptual, and appears in several forms.
In the example, "encyclopedia" is the parent of "science" and "culture", its children. "Art"
and "craft" are siblings, and children of "culture", which is their parent and thus one of their
ancestors. Also, "encyclopedia", as the root of the tree, is the ancestor of "science", "culture",
"art" and "craft". Finally, "science", "art" and "craft", as leaves, are ancestors of no other
node.
Tree structures can depict all kinds of taxonomic knowledge, such as family trees, the
biological evolutionary tree, the evolutionary tree of a language family, the grammatical
structure of a language (a key example being S → NP VP, meaning a sentence is a noun
phrase and a verb phrase, with each in turn having other components which have other
components), the way web pages are logically ordered in a web site, mathematical trees of
integer sets, et cetera.
The Oxford English Dictionary records use of both the terms "tree structure" and "tree-
diagram" from 1965 in Noam Chomsky's Aspects of the Theory of Syntax.
In a tree structure there is one and only one path from any point to any other point. Computer
science uses tree structures extensively.
Examples of Tree Structures
Internet:
o usenet hierarchy
o Document Object Model's logical structure, Yahoo! subject index, DMOZ
Operating system: directory structure
Information management: Dewey Decimal System, PSH, this hierarchical bulleted list
Management: hierarchical organizational structures
Computer Science:
o binary search tree
o Red-Black Tree
o AVL tree
o R-tree
Biology: evolutionary tree
Business: pyramid selling scheme
Project management: work breakdown structure
Linguistics:
o (Syntax) Phrase structure trees
o (Historical Linguistics) Tree model of language change
BINARY TREE
A binary tree is a tree data structure in which each node has at most two children, which are
referred to as the left child and the right child. A recursive definition using just set theory
notions is that a (non-empty) binary tree is a triple (L, S, R), where L and R are binary trees or
the empty set and S is a singleton set. Some authors allow the binary tree to be the empty set
as well.
$$ n - l = \sum_{k=0}^{\log_2(l) - 1} 2^k = 2^{\log_2(l)} - 1 = l - 1 $$
This means that a perfect binary tree with l leaves has n = 2l - 1 nodes.
The maximum possible number of null links (i.e., absent children of the nodes) in a
complete binary tree of n nodes is (n+1), where only 1 node exists in bottom-most
level to the far left.
The number of internal nodes in a complete binary tree of n nodes is ⌊n/2⌋ (n/2 rounded down).
For any non-empty binary tree with n0 leaf nodes and n2 nodes of degree 2, n0 = n2 +
1.
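Two of the properties above can be checked on small trees with the following Python sketch: n = 2l - 1 for a perfect binary tree, and n0 = n2 + 1 for any non-empty binary tree. The names `TreeNode`, `perfect_tree` and `count` are assumptions introduced for this illustration; a height of 0 is taken to mean a tree with a single node.

```python
class TreeNode:
    """Binary tree node with optional left and right children."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def perfect_tree(height):
    """Build a perfect binary tree of the given height (height 0 is one node)."""
    if height < 0:
        return None
    return TreeNode(perfect_tree(height - 1), perfect_tree(height - 1))


def count(node):
    """Return (nodes, leaves, nodes with two children) for the tree rooted at node."""
    if node is None:
        return (0, 0, 0)
    left_n, left_l, left_2 = count(node.left)
    right_n, right_l, right_2 = count(node.right)
    is_leaf = node.left is None and node.right is None
    has_two = node.left is not None and node.right is not None
    return (left_n + right_n + 1,
            left_l + right_l + int(is_leaf),
            left_2 + right_2 + int(has_two))


n, leaves, two_children = count(perfect_tree(3))
print(n, leaves, two_children)       # 15 8 7
print(n == 2 * leaves - 1)           # True
print(leaves == two_children + 1)    # True
```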
Tree Traversal
A tree traversal (also known as tree search) is a form of graph traversal and refers to the
process of visiting (checking and/or updating) each node in a tree data structure, exactly once.
Such traversals are classified by the order in which the nodes are visited.
Depth-first search
These searches are referred to as depth-first search (DFS), as the search tree is deepened as
much as possible on each child before going to the next sibling. For a binary tree, they are
defined recursively as operations performed at each node, starting with the root, as
follows:
The general recursive pattern for traversing a (non-empty) binary tree is this: At node N you
must do these three things:
(L) Recursively traverse its left subtree. When this step is finished you are back at N again.
(R) Recursively traverse its right subtree. When this step is finished you are back at N again.
(N) Process N itself.
We may do these things in any order. If we do (L) before (R), we call it left-to-right traversal,
otherwise we call it right-to-left traversal. The following methods show left-to-right traversal.
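Pre-order: process N, then traverse the left subtree, then the right subtree (N, L, R).
In-order: traverse the left subtree, then process N, then traverse the right subtree (L, N, R).
In-order: A, B, C, D, E, F, G, H, I.
Post-order: traverse the left subtree, then the right subtree, then process N (L, R, N).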
The trace of a traversal is called a sequentialisation of the tree. The traversal trace is a list of
each visited root. No one sequentialisation according to pre-, in- or post-order describes the
underlying tree uniquely. Given a tree with distinct elements, either pre-order or post-order
paired with in-order is sufficient to describe the tree uniquely. However, pre-order with post-
order leaves some ambiguity in the tree structure.
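The three depth-first orders can be sketched in Python as below. The example tree is an assumption: the original figure is not reproduced in these notes, so the tree was rebuilt from the in-order and level-order listings that the notes give for it, which together determine a binary tree with distinct elements. The class and function names are chosen for illustration.

```python
class Node:
    """Binary tree node holding a value and optional left/right subtrees."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


# Tree rebuilt from the in-order and level-order listings in these notes.
tree = Node("F",
            Node("B", Node("A"), Node("D", Node("C"), Node("E"))),
            Node("G", None, Node("I", Node("H"))))


def pre_order(node):
    """(N) first, then (L), then (R)."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """(L) first, then (N), then (R)."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """(L) first, then (R), then (N)."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


print(", ".join(pre_order(tree)))    # F, B, A, D, C, E, G, I, H
print(", ".join(in_order(tree)))     # A, B, C, D, E, F, G, H, I
print(", ".join(post_order(tree)))   # A, C, E, D, B, H, I, G, F
```

The three functions differ only in where the node's own value is placed relative to the two recursive calls.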
Depending on the problem at hand, the pre-order, in-order or post-order operations may be
void, or you may only want to visit a specific child, so these operations are optional. Also, in
practice more than one of pre-order, in-order and post-order operations may be required. For
example, when inserting into a ternary tree, a pre-order operation is performed by comparing
items. A post-order operation may be needed afterwards to re-balance the tree.
Breadth-first search
Level-order: F, B, G, A, D, I, C, E, H.
Trees can also be traversed in level-order, where we visit every node on a level before going
to a lower level. This search is referred to as breadth-first search (BFS), as the search tree is
broadened as much as possible on each depth before going to the next depth.
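Level-order traversal is usually written with a queue instead of recursion. The Python sketch below uses the standard `collections.deque` as that queue. The tree is an assumption rebuilt from the in-order and level-order listings in these notes, since the original figure is not reproduced, and the names are chosen for illustration.

```python
from collections import deque


class Node:
    """Binary tree node holding a value and optional left/right subtrees."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def level_order(root):
    """Visit the nodes level by level, left to right within each level."""
    visited = []
    waiting = deque([root]) if root is not None else deque()
    while waiting:
        node = waiting.popleft()        # oldest waiting node comes out first
        visited.append(node.value)
        if node.left is not None:       # children wait behind the current level
            waiting.append(node.left)
        if node.right is not None:
            waiting.append(node.right)
    return visited


tree = Node("F",
            Node("B", Node("A"), Node("D", Node("C"), Node("E"))),
            Node("G", None, Node("I", Node("H"))))

print(", ".join(level_order(tree)))   # F, B, G, A, D, I, C, E, H
```

Because the queue is First-In-First-Out, every node on one level is taken out before any node of the next level.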
Other types
There are also tree traversal algorithms that classify as neither depth-first search nor breadth-
first search. One such algorithm is Monte Carlo tree search, which concentrates on analyzing
the most promising moves, basing the expansion of the search tree on random sampling of the
search space.
QBASIC
Introduction
Can you guess why it is called BASIC? That is, what do you suppose is the emphasis of the
BASIC computer programming language? The answer is simple: it is called BASIC because
the language emphasizes the basic ideas found in all programming languages. BASIC is an
acronym for Beginners’ All-purpose Symbolic Instruction Code. The language BASIC
was designed in the early 1960s for teaching the basic principles of programming to
non-science majors. It has been popular ever since. There are many versions of BASIC. These
notes use QBasic, a version that Microsoft released in 1991 (its predecessor, QuickBASIC,
dates from 1985) and that once came free
with Microsoft operating systems. With QBasic you can easily write small programs and get
the idea of what programming is about. Other versions of BASIC are intended for
Lesson 1
Running Your First Program
Before you can create a program in QBASIC, you need the QBASIC interpreter.
The QBASIC interpreter can be downloaded from the Internet.
The QBASIC interpreter has a startup window with a blue background.
The startup window will normally ask you to press a key to continue.
Press the key that is appropriate.
QBasic on your Computer
If your computer is running any variety of a Microsoft operating system, it can run
QBasic.
If you are running a more recent operating system, click on the "Start" button.
o Click on "Run"
o In the "Open" box, enter CMD
o Click "OK" (or hit "Enter")
When you get a window with the prompt, enter the command: qbasic then press the
ENTER key.
QBASIC programming language is made up of some fundamental building blocks which are
treated as follows:
a. QBASIC Data Types: Data items in QBASIC can be
classified into two major categories:
(i.) Constants and
(ii.) Variables
Constant: A constant is a value (a number or a string) within a program that does not change.
Constant Declaration
To declare constants of any type (integer, real or string), the keyword used in QBasic is
CONST.
A constant can be written directly as a value or given a descriptive name with CONST:
E.g.
CONST CourseUnit = 4.0
CONST CGPA = 4.49
CONST CourseTitle = "CS142"
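As a small illustration of how a named constant is used once it has been declared, here is a minimal sketch. The names PI and RADIUS are made up for this example and do not come from the notes.

```basic
' Using a named constant in a calculation
CONST PI = 3.14159
LET RADIUS = 2
PRINT "Area of the circle is", PI * RADIUS * RADIUS
END
```

The constant is referred to by name wherever its value is needed, and the program is not allowed to assign a new value to PI later on.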
Variable
Modern computers have a large amount of main memory (also called RAM). This memory is
used for many things. When you run a QBasic program, the statements of the program are
stored in main memory. The data for the program also can be stored in main memory. A
variable in QBasic is a small amount of computer memory that has been given a name. You (the
programmer) think of the name you want to use. The QBasic system will use a section of main
memory for that name. A variable is like a small box that holds a value. The value in the variable
can change (that is why it is called a variable). Here is a program that uses a variable:
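The program itself is missing at this point in the notes. Judging from the walkthrough later in this section, which stores 23.5 in a variable called NUM and then prints "The variable contains", it was presumably close to the following sketch. The comment lines are an assumption.

```basic
' Program with a variable (reconstructed from the later walkthrough)
'
LET NUM = 23.5
PRINT "The variable contains", NUM
END
```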
The location of a variable in RAM is called its "address".
In algebra, what do you call symbols like "x" and "y", as in 3x^2 + 2y?
Variable Declaration
There are two methods for variable declaration:
i. Explicit Declaration: Here, you explicitly declare the variable as a type. This is done by
using the DIM statement.
ii. Implicit Declaration: Here, the type is indicated by a symbol placed at the end of the
variable name:
$ -- String
% -- Integer
& -- Long
! -- Single
# -- Double
With these symbols, a variable can be declared and given a value in a single statement using the
LET keyword.
E.g.
LET x$ = "QBASIC is cool!"
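The two ways of declaring a variable can be compared side by side. This is only a sketch; the variable names Age, Title, Price# and Count% are invented for the illustration.

```basic
' Explicit declaration with DIM ... AS
DIM Age AS INTEGER
DIM Title AS STRING
LET Age = 19
LET Title = "CS121"
' Declaration by type symbol at the end of the name
LET Price# = 4.49
LET Count% = 3
PRINT Title, Age, Price#, Count%
END
```

A variable declared with DIM ... AS is used without a type symbol afterwards, while a variable declared by symbol carries that symbol every time it is used.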
Names for variables must be single words that the programmer picks. The names don't have to
be real words, but they should be chosen wisely to help in understanding the program they are in.
Look over the following rules.
QUESTION :
Which of the following are OK names to use for a variable that will hold a floating point
number?
SUM, GRAND TOTAL, MyValue, 16Candles, SUM23, YEAR%
Answer:
o SUM --- OK (made up of correct characters, not too long)
o MyValue --- OK (upper and lower case letters can be used)
o SUM23 --- OK (digits can be used after the first letter)
o GRAND TOTAL --- BAD (no spaces allowed)
o 16Candles --- BAD (digits can't be used as first character)
o YEAR% --- BAD (last character % means this is an integer variable)
As soon as you move the cursor out of the PRINT statement the QBasic system will change
your program:
QUESTION :
VALUE is the name of a variable. Which of the following name a different variable?
Answer:
MILES MILE
528.7 0
The arithmetic expression MILE / 12.5 will get the 0 from MILE and divide it by 12.5, resulting
in 0. Finally the PRINT will write to the monitor:
0
This is a bug, and hard to track down unless you carefully look at the spelling of each variable
name in the program. If a program you write mysteriously calculates an incorrect answer of zero,
check the spelling of each variable!
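The bug described above can be reproduced with a program as short as this one. It is a sketch that uses the values mentioned in the text; the original program is not shown in this part of the notes.

```basic
' The variable MILES is misspelled as MILE in the PRINT statement
LET MILES = 528.7
PRINT MILE / 12.5      ' MILE was never given a value, so it holds 0
END
```

QBasic does not report an error here. It quietly creates a second variable named MILE with the value 0, so the program prints 0.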
i. INPUT Command
Syntax
INPUT list of variables
OR
INPUT "Prompting Message", list of variables
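A short sketch of the second form of the INPUT statement follows. The variable names FIRST and SECOND and the wording of the prompt are made up for the example.

```basic
' Read two numbers with a prompting message, then use them
INPUT "Type two numbers separated by a comma: ", FIRST, SECOND
PRINT "Their sum is", FIRST + SECOND
END
```

When a comma follows the prompting message, QBasic shows only the message. When a semicolon is used instead, a question mark is printed after the message.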
In the program made up of the statements LET NUM = 23.5 and PRINT "The variable contains", NUM,
the name NUM is a variable. The programmer chose the name NUM. When the
program runs, several things happen when the LET statement executes:
Memory is reserved for the variable NUM
The number 23.5 is stored in the variable
So after this statement has executed a section of memory named NUM holds 23.5:
NUM
23.5
After the first statement has executed, the second statement executes:
PRINT "The variable contains", NUM
The variable NUM already exists, so no more memory is reserved for it.
The PRINT statement does several things:
It prints "The variable contains"
It looks in the variable NUM for a number.
It prints out the number it found.
QUESTION:
What do you think the following program will write to the monitor?
' Program with a variable
'
LET VALUE = 2 + 3
PRINT "The result is", VALUE
END
Answer:
The program will print: The result is 5
Saving a Result
Look at this program again:
' Program with a variable
'
LET VALUE = 2 + 3
PRINT "The result is", VALUE
END
The LET statement can be used with a variable to save the result of arithmetic. The value saved
in the variable will stay there until another statement changes it, or until the program stops
running.
QUESTION:
What do you think the following program will write to the monitor?
' Saving a result in a variable
'
LET SUM = 1 + 2 + 3
PRINT "The sum is", SUM
END
Answer:
The program will print: The sum is 6
Exercise
Using LET and other relevant commands, write a QBASIC program that finds the area and
perimeter of a triangle.
Numeric Functions
These are functions for mathematical calculations provided by QBASIC. Numeric
functions take numbers as arguments and return a number as output. Some of these functions
are:
Num = ABS(signednumber)
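A few of the built-in numeric functions can be tried out with a sketch like this one. Only ABS is named in the notes; SQR, INT and SGN are other standard QBasic functions added here for illustration.

```basic
' Some built-in numeric functions
PRINT ABS(-7.5)     ' absolute value: 7.5
PRINT SQR(16)       ' square root: 4
PRINT INT(3.7)      ' largest whole number not greater than 3.7: 3
PRINT SGN(-2)       ' sign of the number: -1
END
```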
Lecture Topics:
o Input and Output in a computer system.
Up until now in these notes, all the data a program uses have been part of the program
itself. For example:
' Calculate Miles per Gallon
'
LET MILES = 45678.3 - 45149.6
LET GALLONS = 12.5
PRINT MILES / GALLONS
END
The data is the first odometer reading (45149.6), the second odometer reading (45678.3),
and the number of gallons.
QUESTION:
Will this program do the same thing each time it is run?
Answer:
Yes.
Input Devices
Consider the program.
LET MILES = 45678.3 - 45149.6
LET GALLONS = 12.5
PRINT MILES / GALLONS
END
Every time you run the program it will do exactly the same thing. The next time you fill
up your gas tank, you would have to change the program to calculate with the new
values. As written, the program is not very useful.
Most useful computer programs input data from various sources when they run. Input
means data that comes from outside the program. A program that does this can work
with new data each time it is run.
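As a sketch of what such a program might look like, the miles-per-gallon calculation can be rewritten so that the readings come from the keyboard. The variable names FIRSTREADING and SECONDREADING and the prompt texts are assumptions made for this illustration.

```basic
' Calculate Miles per Gallon from values typed by the user
PRINT "Type the first odometer reading"
INPUT FIRSTREADING
PRINT "Type the second odometer reading"
INPUT SECONDREADING
PRINT "Type the number of gallons"
INPUT GALLONS
LET MILES = SECONDREADING - FIRSTREADING
PRINT "Miles per gallon:", MILES / GALLONS
END
```

The statements never change, but each run can work with different readings.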
Sources of Input Data
o The keyboard
o A hard disk
o A floppy disk
o A CD-ROM
o The mouse
o A video camera
o An audio digitizer (part of a sound board)
o The Internet
o Hundreds of other sources
For us, most data will come from the user of a program typing on the keyboard. Some
sources of data (like audio data) require special hardware. A piece of hardware that is a
source of input to a computer program is called an input device.
Output Devices
QUESTION:
Some devices are on the list of input devices and on the list of output devices. What were
some of these devices?
Answer:
o a hard disk
o a diskette
o the Internet
I/O
The hard disk of a computer system is used for both Input to a program and Output
from a program. Other devices are exclusively input devices (such as the keyboard or
mouse) or output devices (such as the printer and monitor.) Input and Output are so
important to a computer system that the abbreviation I/O is used. Any device which
does either input, output, or both is called an I/O device. The movement of data from
or to such a device is often called "I/O".
QUESTION:
What is the first thing the program writes to the monitor?
Answer:
Type a number
Now the second statement INPUT NUMBER executes. The INPUT statement is used
to get data from the keyboard. It will:
o Print a question mark "?" to the screen.
o Wait until the user has typed in a number.
o Get the number when the user hits the "enter" key.
o Put the number in the variable NUMBER.
Just after this statement starts the monitor will look something like:
Type a number
?
The question mark came from the INPUT statement. The INPUT statement is waiting
for the user (you) to type a number. Say that you type 23. Now the monitor looks like:
Type a number
? 23
To enter the number, hit the enter key. The INPUT statement puts the 23 into the
variable NUMBER. Next, the PRINT statement executes.
QUESTION:
What will the monitor finally look like?
Answer:
The program has done both output and input. It output data to the monitor when it
wrote "Type a number" and the "?". It did input from the keyboard when it got the 23
and stored it in NUMBER. Then it did output again when it wrote 46 to the monitor.
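The program being discussed is not reproduced in this part of the notes. The walkthrough above (a prompt, an INPUT into NUMBER, and an output of 46 for an input of 23) is consistent with a program such as the following, which should be read as a reconstruction.

```basic
' Double a number typed by the user (reconstructed)
PRINT "Type a number"
INPUT NUMBER
PRINT NUMBER * 2
END
```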
After each run of the program, push any key to return to the QBasic screen. The picture
shows the DOS window after the program was run three times.
QUESTION:
Does the program work with floating point numbers?
Answer:
Yes. Floating point numbers are those with a decimal point, such as in the second run of
the program. The variable NUMBER can hold floating point values.
The variable should be the correct type for the expected input. Remember that the last
character of a variable name indicates what the variable is expected to hold. For example,
a variable VALUE# can potentially hold a very big floating point number. A variable
DATA% is expected to hold an integer (no decimal point). Here is a program that does
input of an integer:
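The program is missing here. Based on the question and answer that follow, which use the variable DATA% and double the value typed, it was presumably similar to this sketch.

```basic
' Double an integer typed by the user (reconstructed)
PRINT "Type a number"
INPUT DATA%
PRINT DATA% * 2
END
```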
QUESTION:
Say that the user types 1.2 when the INPUT statement of this program asks for data.
What will the monitor show after the user hits "enter"?
Answer:
Type a number
? 1.2
2
The variable DATA% can only hold an integer, which cannot have a decimal point. The
user typed 1.2, but only the 1 was put into the variable. (QBasic converts the value to a
whole number before storing it.) The 1 was then doubled, and 2 was output.
Control Structure
Loops
Many machines do their work by using repeated motions, cycles. The engine in your car
performs the same motions over and over again as it burns gasoline to provide power.
Electric motors are similar; they convert electric power into spinning motion. Both of
these machines are useful because they do the same things over and over and can keep
going as long as we want.
Computer programs, also, use cycles. Much of the usefulness of computer software
comes from doing things in cycles. In programming, cycles are called loops. When a
program has a loop in it, some statements are done over and over as long as is needed to
get the work done. Most computer programs execute millions of program statements
each time they are used. Most of these statements are the same statements being
executed over and over many times. This section of the lesson discusses loops in QBasic.
Lecture Topics:
o The DO ... LOOP
o Using CONTROL-BREAK to Exit a Loop.
o The Loop Body.
o Indenting Loop Bodies.
o Sequential Execution And Looping Combined.
o The SOUND Statement.
NUMBER
100
Now the statements inside the loop start again with the first statement after DO:
o For a second time, the PRINT statement prints "Type a number".
o For a second time, the INPUT statement gets a number to put in the variable
NUMBER.
o Say that the user typed in 50. The number 50 will replace what was previously in
the variable.
o For a second time, the second PRINT statement prints "6% of the number is"
and then computes and prints 50 * 0.06.
o The LOOP statement marks the end of the loop.
At this point, variable NUMBER now has the new value of 50 in it:
50
QUESTION:
What do you suppose happens next? Have the user enter the number 200.
Answer:
The statements inside the loop are executed a THIRD time:
o For the third time, the PRINT statement prints a prompt.
o For the third time, the INPUT statement gets a number to put in the variable
NUMBER.
o Say that the user typed in 200. The number 200 will replace what was previously
in the variable.
o For the third time, the second PRINT statement prints "6% of the number is"
and then computes and prints 200 * 0.06.
o The LOOP statement marks the end of the loop.
Ending the Loop
Now the monitor shows:
Type a number
? 100
6% of the number is 6
Type a number
? 50
6% of the number is 3
Type a number
? 200
6% of the number is 12
NUMBER
200
The program will continue forever, performing the statements inside the DO ... LOOP
again and again. If you run this program, you will have to stop it, somehow. To stop the
program, hit CONTROL-BREAK. (Hold down the key marked "Ctrl" then tap the key
marked "Break.")
Brackets
Look again at the program:
' Example of a loop
DO
  PRINT "Type a number"
  INPUT NUMBER
  PRINT "6% of the number is", NUMBER * 0.06
LOOP
END
The DO and the LOOP are matched like brackets ( ) or [ ]. The statements inside of
them will be done over and over, starting with the first enclosed statement and going on
in sequence. You can have as many statements as you need between DO and LOOP:
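For instance, the loop body of the example could be extended with further statements. This is a sketch, and the added 7% line is made up for the illustration.

```basic
' A loop body with more statements
DO
  PRINT "Type a number"
  INPUT NUMBER
  PRINT "6% of the number is", NUMBER * 0.06
  PRINT "7% of the number is", NUMBER * 0.07
  PRINT
LOOP
END
```

All five statements between DO and LOOP are executed in order on every pass through the loop.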
QUESTION:
Do you see anything in the DO or the LOOP statement that says how many times the
statements between them are to be repeated?
Answer:
No, neither the DO nor the LOOP (nor the statements between them) say how many
times to repeat. The statements between the DO and the LOOP will be repeated
endlessly, or until the user hits CONTROL-BREAK.
A Loop Body
The statements between the DO and LOOP are called the body of the loop. So you can
say that for the example program, the loop body is repeated endlessly. Soon you will
learn how to write loops that do not repeat endlessly. Sometimes, however, endlessly
repeating loops are useful.
A Story Problem
Say that you are in charge of putting price tags on new merchandise in a clothing store. A
new shipment of several hundred items has just arrived. You have a list that tells you the
wholesale price of each item. The markup in your store is 50 percent, so the price tag
should be 1.5 times the wholesale price. You would like a program that asks you for the
QUESTION :
In rough outline, what do you suppose the program looks like?
Answer:
Here is a rough outline for the program:
DO
  ' ask the user for the wholesale price of an item
  ' get the price, put it in a variable
  ' compute and print out the retail price
LOOP
END
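One way the outline might be filled in is sketched below. The variable name WHOLESALE and the wording of the messages are assumptions. The factor 1.5 comes from the 50 percent markup in the story problem.

```basic
' Compute retail prices from wholesale prices (markup of 50 percent)
DO
  PRINT "Type the wholesale price"
  INPUT WHOLESALE
  PRINT "The retail price is", WHOLESALE * 1.5
LOOP
END
```

Like the earlier loop example, this program repeats until the user stops it with CONTROL-BREAK.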
Loop Conditions
The DO WHILE statement acts like a "gate keeper" for the loop body: execution is allowed to
enter the loop body only when a certain condition is true. This chapter will discuss how that
condition can be described.
Lesson Topics:
o Review of DO WHILE statements.
o The condition part of DO WHILE statements.
o Comparisons between numbers.
o Ending a loop with a Sentinel value.
o Examples of programs using sentinel values.
o General Scheme for DO WHILE loops with sentinels.
QUESTION:
(Review:) What is a loop condition?
Answer:
A loop condition is a test that is part of the DO WHILE statement.
Here the DO WHILE in statement 2 is the loop condition (the gate keeper). Execution enters
the loop body only if
COUNT <= 5
That is, only if the number stored in COUNT is less than or equal to 5. Then statements 3 and 4
are executed, and then LOOP sends execution back to the DO WHILE. The test COUNT <= 5
must be true every time the loop body starts to execute.
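The program that this passage describes is not shown in this part of the notes. Judging from the slightly modified version that appears later in this section, it was presumably the following. The starting value of 0 is an assumption carried over from that later version.

```basic
' Loop with LESS THAN OR EQUAL test (reconstructed)
LET COUNT = 0              'Statement 1
DO WHILE COUNT <= 5        'Statement 2
  PRINT COUNT              'Statement 3
  LET COUNT = COUNT + 1    'Statement 4
LOOP
END
```

Under that assumption the loop body runs six times and prints the numbers 0 through 5.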
QUESTION:
DO WHILE condition
. . . loop body . . .
LOOP
o The DO and the LOOP are brackets that mark the loop body.
o The WHILE condition lets execution enter (or re-enter) the loop body only if condition
is true.
o condition is a test involving variables and numbers.
So far in these chapters the condition has tested if one number is less than or equal to another
number. You can also test if one number is LESS THAN another number:
COUNT < 5
This tests if COUNT is LESS THAN 5. Here is the program, slightly modified.
' Loop with LESS THAN test
'
LET COUNT = 0 'Statement 1
'
DO WHILE COUNT < 5 'Statement 2
  PRINT COUNT 'Statement 3
  LET COUNT = COUNT + 1 'Statement 4
LOOP
END
Now the DO WHILE lets execution into the loop body only if COUNT is less than (but not
equal to) 5.
QUESTION:
Say that Statement 4 has just changed COUNT to 5. The LOOP statement sends execution back
to the DO WHILE. Will the loop body execute again?
Answer:
NO, because COUNT has a 5 stored in it, and:
COUNT < 5
^
|
+----- holds a 5
is FALSE because 5 is NOT less than 5.
QUESTION:
You are driving in your car and it starts to rain. The rain falls on your windshield and makes it
hard to see. What should you do with the windshield wipers?
Answer:
Turn the windshield wipers on.
Two-way Decisions
The windshield wipers are controlled with an ON-OFF switch. The decision to turn the switch
on looks like this:
In this picture of a decision, you are supposed to start at the top, then follow the line to the
question:
is it raining?
The answer to the question is either TRUE or FALSE.
o If the answer is TRUE:
o follow the line labeled TRUE,
o do the directions in the box "turn wipers on", follow the line to "continue"
o If the answer is FALSE:
o follow the line labeled FALSE,
o do the directions in the box "turn wipers off", follow the line to "continue"
QUESTION:
How many ways can you go from "start" to "continue"?
Answer:
There are two paths through this chart.
The words IF, THEN, ELSE, and END IF are brackets that divide the part of the program into
two branches. The ELSE is like a dividing line between the "true branch" and the "false branch".
The IF statement always asks a question (usually about the number in a variable).
o The answer will be TRUE or FALSE.
o If the answer is TRUE only the true-branch is executed.
o If the answer is FALSE only the false-branch is executed.
o No matter which branch was chosen, execution continues with the statement after the
END IF.
A two-way decision is like picking which of two roads to take to the same destination. The fork
in the road is the IF statement, and the two roads come together just after the END IF
statement.
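The program that the following questions refer to is not shown in this part of the notes. A version consistent with the output given in the answers, and with the extended version that appears later, would look like the sketch below. It should be treated as a reconstruction.

```basic
' Two-way decision (reconstructed)
PRINT "Enter a number"
INPUT NUMBER
'
IF NUMBER >= 0 THEN
  PRINT "The number is zero or positive."   ' true branch
ELSE
  PRINT "The number is negative."           ' false branch
END IF
'
PRINT "Bye"                                 ' this statement is always done
END
```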
QUESTION:
The user runs the program and enters "12". What will the program print?
Answer:
Enter a number
? 12
The number is zero or positive.
Bye
The true branch was executed because the answer to the question NUMBER >= 0 was true.
The Program as a Chart
Here is the program again, done as a chart. Because the answer to the question was "true", the
path on the left was done.
QUESTION:
The user runs the program and enters "-5". What will the program print?
Answer:
Enter a number
? -5
The number is negative.
Bye
ONLY the FALSE branch was executed because the answer to the question NUMBER >= 0
was FALSE.
More than one Statement per Branch
Here is the program again with some added statements:
PRINT "Enter a Number"
INPUT NUMBER
'
IF NUMBER >= 0 THEN
  PRINT "The number is zero or positive" ' true branch
  PRINT "Positive numbers are > 0" ' true branch
ELSE
  PRINT "The number is negative" ' false branch
  PRINT "Negative numbers are < 0" ' false branch
END IF
'
PRINT "Bye" ' this statement is always done
END
The statements in the true branch are executed when the question in the IF statement is TRUE.
There can be as many statements as you want in the true branch. The true branch consists of the
statements between the IF statement and the ELSE statement. Of course, the statements in the
false branch are executed when the question in the IF statement is FALSE. There can be as many
statements as you want in the false branch. The false branch consists of the statements between the
ELSE statement and the END IF statement.
The condition in an IF statement looks like the condition in a DO WHILE statement: it compares what is held
in a variable with other values. You can use the same comparisons: <, <=, =, and so on.
QUESTION:
Is the following program correct?
PRINT "Enter a Number"
INPUT NUMBER
'
IF NUMBER >= 0 THEN
  PRINT "The square root is:", SQR( NUMBER )
ELSE PRINT "There is no square root"
  PRINT "Run the program again."
END IF
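Answer:
No. The ELSE should be on a line by itself, but here a PRINT statement has been written on the same line.
QUESTION:
How would you fix the above program?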
Answer:
PRINT "Enter a Number"
INPUT NUMBER
'
IF NUMBER >= 0 THEN
  PRINT "The square root is:", SQR( NUMBER )
ELSE
  PRINT "There is no square root"
  PRINT "Run the program again."
END IF
'
PRINT "Bye"
END
In this program there are a different number of statements in the true branch than in the false
branch. This is fine.