PASCAL Plus Data Structures, Algorithms, and Advanced Programming PDF
Until recently there has not been much consensus among educators on
what formal education is necessary for a computer professional. It has
always been considered essential to have a great deal of mathematical
knowledge, as well as an understanding of computer hardware. Software
education, however, often consisted of the teaching of a number of
programming languages. The development of programming techniques, it was
often assumed, was not a subject of formal education, but rather a matter
of experience. This situation is analogous to an English department
teaching only grammar without freshman composition, literature, or
creative writing.
Today the shared experience of a generation of programmers has com-
bined to form a body of practical and theoretical knowledge that can be
taught along with computer languages. Software is now an area of study in
its own right. The evolution of this situation can be clearly seen in the
progression of computer science curricula advocated by the major com-
puter professional organizations.
Three main professional organizations represent the spectrum of
computer educators and professionals: the ACM (Association for Computing
Machinery), the IEEE (Institute of Electrical and Electronics Engineers),
and the DPMA (Data Processing Management Association). Each of these
organizations has contributed greatly to computer science education over
the years by publishing curriculum guidelines that identify what a student
should know upon graduation from a four-year institution of higher
learning.
The first curriculum guidelines were published in 1968 by the ACM.
These guidelines assumed that only those students who had a broad mathe-
matical background could possibly learn computer science. Programming
was considered a skill that the student was to master on his or her own,
perhaps within a laboratory setting. In 1976 the IEEE issued guidelines for
Preface
ADDITIONAL FEATURES
Chapter Goals: At the beginning of each chapter the goals of that chap-
ter are presented. These goals are then tested in the exercises and pretests
at the end of the chapter.
Chapter Exercises: At the end of each chapter (except Chapter 13), there
is a set of paper-and-pencil exercises to test whether the chapter goals have
been attained. The complete exercise answers are in the back of the book.
Chapter Pretests: At the end of each chapter (except Chapter 13) there is
a test for the student to measure his or her own progress. The answers to
these questions are in the Instructor's Guide.
Class Tested: This manuscript has been class tested with approximately
3,000 students over two years at the University of Texas at Austin. In addi-
tion, it has been reviewed and class tested at the following sites:
Southern Illinois University at Carbondale (Robert J. McGlinn)
University of Maryland (Richard H. Austing)
Broome Community College (Morton Goldberg)
National Cathedral School (Suzanne Golomb)
ACKNOWLEDGEMENTS
Thanking people who have contributed to a textbook is a little like saying
thank-you at the Academy Awards: You thank a few and run the risk of
missing someone important or you thank everyone and run the risk of los-
ing the audience.
We cannot list all the students who helped so much by using and critiqu-
ing various manuscripts, but we can thank each of our fellow professionals
who acted as reviewers:
Jeff Brumfield (The University of Texas at Austin)
Frank Burke (Middlesex County College)
Thomas E. Byther (University of Maine at Orono)
Henry A. Etlinger (Rochester Institute of Technology)
David L. Feinstein (University of South Alabama)
Gary Ford (University of Colorado at Colorado Springs)
George W. Heyworth (United States Military Academy)
Nancy Penney (University of Kansas)
Bruce Presley (Lawrenceville School)
Edwin D. Reilly (State University of New York at Albany)
Contents
1 PROGRAMMING TOOLS
The Programmers' Toolbox
The Goal
Why Bother?
Getting Started
Understanding the Problem
Top-down Design of Data Structures
Writing Easily Modifiable Code
Documentation
Debugging
Program Testing
A Warning
Summary
Application: Command-Driven Problems
Exercises
Pre-Test
One-dimensional Arrays
Two-dimensional Arrays
Records
Packed Structures
Error Checking
Summary
Application: Strings
Exercises
Pre-Test
3 STACKS
4 QUEUES
7 RECURSION
An Example of Recursion
Programming Recursively
Function Sum Revisited
A Boolean Function
A Recursive Procedure
A Few More Points
Revprint Revisited
How Recursion Works
NFACT: One More Time
Summary
Application: Quicksort: An Example of a Recursive Algorithm
Exercises
Pre-Test
10 VERIFICATION
12 SEARCHING
APPENDIXES
EXERCISE ANSWERS
GLOSSARY
CHAPTER PROGRAMMING ASSIGNMENTS
INDEX
Applications
Command-driven problems
Strings
Expression evaluation
Maze
Simulation
Magazine circulation
Quicksort: An example of a recursive algorithm
Index
The insertion sort
Chapter 1 Programming Tools
THE GOAL
It is not enough to write a program that does something. A programmer
must determine what the program is supposed to do and then write a good
program that accomplishes the task.
A good program
(a) works (that is, accomplishes its intended function),
(b) can be read and understood,
(c) can be modified if necessary without excruciating effort, and
(d) is completed on time and within its budget.
We will not guarantee that simply by reading this chapter you will be
able to write programs that will all run perfectly the first time. For all but
small, trivial programs that is a rare occurrence. We will promise that the
techniques described in this chapter will help you write programs that are
easier to understand, to test and debug, and to modify. And, very likely, you
will be able to write programs that are debugged and running before the
deadline in class or at work.
The ideas in this chapter are not limited to Pascal. The techniques we
will discuss are equally applicable to FORTRAN, PL/I, BASIC, and other
programming languages.
If everything in this chapter is old hat to you, if the chapter is a review of
your first programming course, congratulations! You were taught well the
first time around. If all or most of this chapter is new to you, have patience.
With a little practice and perseverance, you will learn good habits that will
serve you all your programming life.
WHY BOTHER?
In your first Pascal course, you learned the vocabulary and the grammar of
the language-the syntax of Pascal. You learned the reserved words and the
constructs for selection (IF-THEN-ELSE) and looping (WHILE-DO).
You learned the mechanism for declaring and manipulating the built-in
data structures-the array and the record. And you learned how to define
and use subprograms-procedures and functions.
You may or may not have learned a technique for putting it all together to
solve problems. There are several popular approaches, including flow-
charting, top-down, and bottom-up methods. Though all of these ap-
proaches have their fans and promoters, in this book we will use the top-
down approach to designing computer programs.
Why is it important to have a technique for programming? Why not just
sit down and write programs? Aren't we wasting a lot of time and paper,
when we could just as easily write the program directly in Pascal or FOR-
TRAN or assembly language?
If the degree of our programming sophistication never had to rise above
the level of trivial problems (like summing a list of prices or averaging
grades), we might get away with such a code-first technique (or, rather, lack
of technique). Many new owners of personal computers program this way,
hacking away at the code until the program works more or less correctly.
However, if the problem is not small or not trivial, or if the program may
need later modification (and most programs do), the need for a structured
programming methodology becomes apparent.
GETTING STARTED
The First Step
No matter which programming design technique you use, the first steps
will be the same. Imagine the following all-too-familiar situation: On the
third day of class, you are given a twelve-page description of Programming
Assignment One, which must be running perfectly and turned in by noon, a
week from yesterday. You read the assignment and realize that this program
will be three times as long as any program you've ever written before. Now,
what is your first step?
The responses below are typical of those given by a class of computer
science students in such a situation:
1. PANIC                                      39%
2. PICK UP A PENCIL AND START WRITING         26%
3. DROP THE COURSE                            24%
4. COPY THE CODE FROM A SMART CLASSMATE        7%
5. STOP AND THINK                              4%
Response 1 is a reasonable reaction from students who have not yet learned
a good programming technique. Students who answered with response 3
will find their education coming along rather slowly. Response 4 will get
you scholastic probation at most universities. Response 2 seems like a good
idea, given the deadline looming ahead. But resist the temptation to grab
Gput specifications, the goals, the requirements, and all the assumptions
about the problem.
When you have finished this task, you have not only mastered your fear
of the blank page, but also started the documentation for your program. In
addition, you will become aware of any holes in the specifications. For
instance, are embedded blanks in the input significant or can they be ig-
nored? Do you need to check for errors in the input? By getting the answers
to these questions at this stage, you can write the program correctly the first
time.
At this point, you will probably appreciate your twelve-page assignment
description. Chances are that in twelve pages most of the details are speci-
fied, and you will not be expected to divine them by ESP or by hanging
around the professor's office. Of particular horror are "simple" program
specifications, like the following:
Output How should the output be formatted? Should you print out all
the information in the student record or just the name and GPA? Should
you list by class rank from highest to lowest or vice versa? And so on.
Special Requirements Do you need to use the most efficient sort possi-
ble or can you use a slow but simple sorting routine?
You must know some details in order to write and run the program. Obvi-
ously, you cannot expect to input the data correctly if you do not know how
it will be formatted. If the output will be written to a file and saved for use
as input to another program, you must know the precise format required.
Other details, if not explicitly stated in the program description, may be
handled according to the programmer's preference. Decisions about un-
stated or ambiguous specifications are called assumptions, and they should
always be written explicitly in the program's documentation. In short, a
complete description will both clarify the problem to be solved and serve
as written documentation of the program.
Writing a complete description of the class rank problem is left as an
exercise. A sample description is given in the Answer Key in the back of the
book.
Figure 1-1.
Since following either route will get the traveler to Joe's, both answers are
functionally correct.
However, if the request for directions had special requirements, one
solution might be clearly preferable to the other. For instance, "I'm late to
dinner. Tell me the quickest route to Joe's Diner" prompts the first re-
sponse, whereas "Is there a pretty road to Joe's?" suggests the second.
When no special requirements are known, the choice is a matter of personal
preference: Which road do you like better?
In this book, we will present numerous algorithms. Choosing between
two algorithms that do the same task often depends on the requirements of
a particular application. If no such requirements exist, the choice depends
on the programmer's own style.
ALGORITHM
solve a problem in ...
ing routines will result in the correct answer. However, if computing speed
is mentioned in the requirements, one routine may be preferable to the
other.
TOP-DOWN DESIGN
Once you have fully clarified the goals of the program, you can begin to
develop a strategy for meeting them. The method we will use is called
top-down design. This method, also called stepwise refinement, takes the
divide-and-conquer approach. First the problem is broken into several
large tasks. Each of these in turn is divided into sections, the sections are
subdivided and so on. The important feature is that details are deferred as
long as possible.
This approach is probably familiar to you. Suppose you needed to write a
comprehensive term paper on "Walt Whitman, His Poetry, and His Effect
on American Literature." You could just sit down and start writing the
paper. On the other hand, if you want to write a paper that sticks to the
subject and presents an orderly discussion of the topic, you had better write
an outline first. The main modules, denoted by large Roman numerals,
might consist of
I. Biography of Walt Whitman
II. Whitman's poems
III. His effect on American literature
Within each main section, you would then add subsections. For in-
stance, under Section I, you could add
A. Family history
B. Birth to early years
C. Newspaper years
D. The war years
E. The last years
Under each of these sections, you would add numbered subsections.
Note that you have deferred as much detail as possible until the lower
levels. For instance, I.A. does not tell the names of Whitman's parents and
I.B. does not include the date of his birth. When the outline is complete
down to the lowest level, it is a relatively simple task to write the paper.
Note that the decisions on what to write are begun before the outline is
written and continue through its development. By the time you actually
begin writing the paper, these decisions have already been made.
The development of a computer program by top-down design proceeds
in a similar way. You have already thought out the problem, defined it
completely, and established your goals. You then devise a general strategy
for solving the problem by dividing it into manageable units. Next, each of
these large modules is subdivided into several tasks. The top levels of the
top-down design will not be written in source code (like Pascal or FORTRAN
or BASIC), but rather in English or "pseudo-code." This divide-and-conquer
activity goes on until, at the lowest level, you are down to individual
lines of code.
Figure 1-2. Columns: Idea, Top-down Design, Code. Row shown: General strategy, Main module, Main program.
Now the problem is simpler to code into a well-structured program. This
approach encourages programming in logical units, using procedures and
functions. The main module of the top-down design will become the main
program, and subsections will develop into procedures. Figure 1-2 shows
how the development of the top-down design parallels the thought process
for solving the problem.
As an example, let's write part of the top-down design for the class rank
problem. We have ascertained the input and output required, as well as the
special requirements. The input is free format with each item separated by
at least one blank. NAME consists of two character strings, LAST and
FIRST, SEX is a one-character code, DEPT is a two- or three-letter abbre-
viation, and CLASS is an integer in the range 1-4. The last record is fol-
lowed by a zero in place of the ID number.
GETDATA
SORTDATA
PRINTRESULTS
The program is now divided into three logical units, each of which will
probably be developed into a procedure in the final program. In fact, we
can already imagine what the main program will look like:
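As a minimal sketch (not the book's listing), the three modules GETDATA, SORTDATA, and PRINTRESULTS might appear in a main program like the one below. The program name CLASSRANK and the one-line procedure bodies are placeholders assumed for illustration; in a finished program the procedures would also take parameters.

```pascal
PROGRAM CLASSRANK (OUTPUT);
(* Sketch only: the name CLASSRANK and the placeholder  *)
(* bodies are assumptions, not the finished program.    *)

PROCEDURE GETDATA;
BEGIN
  WRITELN('GETDATA CALLED')
END;  (* getdata *)

PROCEDURE SORTDATA;
BEGIN
  WRITELN('SORTDATA CALLED')
END;  (* sortdata *)

PROCEDURE PRINTRESULTS;
BEGIN
  WRITELN('PRINTRESULTS CALLED')
END;  (* printresults *)

BEGIN  (* main program *)
  GETDATA;
  SORTDATA;
  PRINTRESULTS
END.  (* main program *)
```

The main program reads almost exactly like the Level 1 design, which is the point of deferring detail to the lower levels.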
Next, we develop each of these tasks by adding one layer of detail. Let's
consider the first module:
GETDATA Level 1
GETRECORD Level 2
get IDNUM
IF IDNUM <> 0
  THEN
    increment NUMSTUDENTS
    store IDNUM
    GETWORD (LASTNAME)
    store LASTNAME
    GETWORD (FIRSTNAME)
    store FIRSTNAME
    get SEX
    store SEX
    GETWORD (DEPT)
    store DEPT
    get CLASS
    store CLASS
    get GPA
    store GPA
But where do we store all these data? We will discuss the process used to
design data structures in the next section. For now, we will use an array of
records.
Again, some lines of this module will convert directly into Pascal code.
The storing tasks can be coded as
STUDENTS[I].IDNUM := IDNUM;
STUDENTS[I].LASTNAME := LASTNAME;
STUDENTS[I].GPA := GPA
Lines like "get CLASS" and "get GPA" can also be coded directly as
simple READ statements. Other lines, like "GETWORD," will require further
breaking down.
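The storing tasks above presuppose a declaration for the array of records. The sketch below shows one way it might look; the type names, the limits MAXSTUDENTS and NAMELENGTH, and the sample values are all assumptions made for illustration, and only three of the fields are shown.

```pascal
PROGRAM STOREDEMO (OUTPUT);
(* Sketch only: type names, limits, and sample data are assumed. *)
CONST
  MAXSTUDENTS = 100;   (* assumed upper limit on class size *)
  NAMELENGTH  = 10;    (* assumed length of a name string   *)
TYPE
  NAMETYPE    = PACKED ARRAY[1..NAMELENGTH] OF CHAR;
  STUDENTREC  = RECORD
                  IDNUM    : INTEGER;
                  LASTNAME : NAMETYPE;
                  GPA      : REAL
                END;
  STUDENTLIST = ARRAY[1..MAXSTUDENTS] OF STUDENTREC;
VAR
  STUDENTS    : STUDENTLIST;
  NUMSTUDENTS : INTEGER;
  I           : INTEGER;
BEGIN
  NUMSTUDENTS := 0;
  (* Store one record, as GETRECORD would after reading it. *)
  NUMSTUDENTS := NUMSTUDENTS + 1;
  I := NUMSTUDENTS;
  STUDENTS[I].IDNUM := 1234;
  STUDENTS[I].LASTNAME := 'WHITMAN   ';   (* padded to 10 characters *)
  STUDENTS[I].GPA := 3.5;
  WRITELN(STUDENTS[I].IDNUM : 6, ' ', STUDENTS[I].LASTNAME,
          STUDENTS[I].GPA : 5 : 2)
END.
```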
GETWORD Level 3
SKIPBLANKS Level 4
REPEAT
get CHARACTER
UNTIL CHARACTER not a blank
Figure 1-3. Top-down design tree, Levels 0 through 4. Level 1 contains GETDATA, SORTDATA, and PRINTRESULTS (each of these sections is similarly subdivided); Level 4 contains SKIPBLANKS and PADWORD.
Data Structures
"Abstract" data structures
Intermediate development of data structures
Implementation in terms of primitive data types in the source language
Throughout this book we will discuss the need for data abstraction. The
goal is to be able to manipulate the program's data with regard to its logical
representation, rather than its physical storage. To a limited degree, pro-
gramming languages provide built-in data structures that hide the physical
placement of data in memory. For instance, you tend to picture a two-di-
mensional array as having rows and columns, though you know that its data
is actually stored linearly in contiguous memory locations. A high-level
programming language like Pascal allows you to access members of a
two-dimensional array through their logical addresses, for instance:
TABLE[ROW, COLUMN]
(We will discuss the implementations of built-in data structures in Chap-
ter 2.)
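To make the distinction between logical and physical addressing concrete, the sketch below stores the same values in a two-dimensional array and in a one-dimensional array, and computes by hand the position that corresponds to TABLE[ROW, COLUMN]. It assumes row-by-row (row-major) storage, which is common but depends on the implementation; the names LINEAR and POSITION and the array sizes are illustrative.

```pascal
PROGRAM ROWMAJOR (OUTPUT);
(* Sketch only: assumes rows are stored one after another. *)
CONST
  NUMROWS = 3;
  NUMCOLS = 4;
  TOTAL   = 12;   (* NUMROWS times NUMCOLS *)
TYPE
  TABLETYPE  = ARRAY[1..NUMROWS, 1..NUMCOLS] OF INTEGER;
  LINEARTYPE = ARRAY[1..TOTAL] OF INTEGER;
VAR
  TABLE    : TABLETYPE;
  LINEAR   : LINEARTYPE;
  ROW      : INTEGER;
  COLUMN   : INTEGER;
  POSITION : INTEGER;
BEGIN
  FOR ROW := 1 TO NUMROWS DO
    FOR COLUMN := 1 TO NUMCOLS DO
      BEGIN
        (* Logical access: the language hides the placement. *)
        TABLE[ROW, COLUMN] := 10 * ROW + COLUMN;
        (* Physical access: we compute the placement ourselves. *)
        POSITION := (ROW - 1) * NUMCOLS + COLUMN;
        LINEAR[POSITION] := 10 * ROW + COLUMN
      END;
  ROW := 2;
  COLUMN := 3;
  POSITION := (ROW - 1) * NUMCOLS + COLUMN;
  (* Both references name the same logical element. *)
  WRITELN(TABLE[ROW, COLUMN] : 4, LINEAR[POSITION] : 4)
END.
```

The reference TABLE[ROW, COLUMN] lets the programmer ignore the arithmetic that the second array makes explicit.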
Why is it important to be able to hide the actual implementation of a data
structure? We have already seen how this abstraction simplifies the
program-design process. As we consider various programmer-defined data
structures in the following chapters, we will discover other advantages to
data abstraction.
My Smart Friend
One of the nice things about writing a top-down design is that when one
task is especially difficult or complicated, it doesn't have to put a halt to the
whole works. You merely make that task into a module and give it a name,
thinking "I'll do the rest and my smart friend can take care of this part
later." You can then return to the tricky module, working out its details and
submodules at a later time. Even though you know that, eventually, you
will have to come back to it yourself, it is very reassuring to think that your
smart friend will take care of it.
DOCUMENTATION
Documentation consists of the written descriptions, specifications, design,
and actual code of a program.
We mentioned earlier that your complete description is an important
piece of the program's documentation. The external documentation of a
program is the written information that is outside of the body of executable
code. In addition to the program specifications, the external documentation
may include the history of the program's development and subsequent
modifications, the top-down design, prologues to the program and its
subprograms, and the user's manual.
The internal documentation includes comments, program formatting,
and self-documenting code. The goal of all these features is to make the
program readable, understandable, and easy to modify.
Comments
Comments can usually be taken directly from the pseudo-code in your top-
down design. Although it is important to use comments to help the reader
determine what the programmer is trying to do, it is not necessary to com-
ment on every statement. Frequent use of comments like
or
(* If value is found, return its index and stop processing; otherwise, check next value. *)
IF VAL = LIST[INDEX]
THEN
BEGIN
LISTINDEX := INDEX;
FOUND := TRUE
END
ELSE
INDEX := INDEX + 1
Prettyprinting
Program formatting, known as prettyprinting, is encouraged by Pascal's
relaxed rules about blank spaces and lines. By using blank spaces to indent,
we can convey the logical structure of a program at a glance. In addition,
blank lines can be used to separate the logical units of the program. Many
systems have a prettyprinting program that will do much of the formatting
for you. (In general, however, it is not a good idea to rely too much on such
a tool.)
Self-documenting Code
Self-documenting code uses meaningful identifier names to convey the in-
tended function of variables and constants. For instance,
CONST T = 0.0112;
VAR X, Y : REAL;
X := Y * T
doesn't tell us anything about the relationships and use of the constant and
variables. On the other hand,
Using Constants
The use of constants to represent "magic" numbers like pi, nonvariable tax
rates, the maximum number of items in an array, and so on also makes code
more readable and easily modifiable. Consider a program that reads
numbers into an array DATA (ARRAY[1..100] OF INTEGER). Numerous
operations will be performed on the values in the array throughout the
program, each with a section of code beginning
FOR I := 1 TO 100 DO ...
At some later time, if you need to increase the size of DATA to 200
elements, you must go back through the whole program to change every place
that the size of the array is indicated:
FOR I := 1 TO 200 DO ...
and declared the array as ARRAY[1..MAX], the modifications of the array
size would be trivial, involving only one line:
Throughout the rest of the program, the references to the size of the array
would not need modification:
FOR I := 1 TO MAX DO ...
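A minimal sketch of the named-constant approach follows. It assumes the constant is declared as MAX with the value 200; the program name, the type name DATATYPE, and the summing loop are illustrative.

```pascal
PROGRAM CONSTDEMO (OUTPUT);
(* Sketch only: the loops stand in for the program's many *)
(* operations on the array.                                *)
CONST
  MAX = 200;   (* the only line to edit when the size changes *)
TYPE
  DATATYPE = ARRAY[1..MAX] OF INTEGER;
VAR
  DATA : DATATYPE;
  I    : INTEGER;
  SUM  : INTEGER;
BEGIN
  FOR I := 1 TO MAX DO
    DATA[I] := I;
  SUM := 0;
  FOR I := 1 TO MAX DO
    SUM := SUM + DATA[I];
  WRITELN('SUM OF ', MAX : 1, ' ELEMENTS IS ', SUM : 1)
END.
```

Changing the array to 100 or 500 elements means editing the CONST line and nothing else.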
DEBUGGING
The top-down design methodology will help you write better programs,
but it will not guarantee that these programs will be correct on the first run.
In fact, only the most trivial programs are likely to be written perfectly at
first. By taking an aggressive approach to debugging, however, you can
usually simplify the process of finding and removing errors from programs.
The goal is to prevent as many types of errors as possible and to make the
rest easy to detect and correct.
Errors in Syntax
"If worse comes to worst, read the manual."
the typo (U left off VALU) should generate a syntax error (undeclared
identifier). But what happens if VAL is an identifier declared elsewhere in the
program? The assignment statement may succeed, and a hard-to-detect log-
ical error will result. (For this reason, it is a good idea to avoid nearly
identical identifiers.) In short, don't put your trust in the omnipotent com-
piler; check your own typing.
Programmers should also be familiar with the implementation of the
language at their site, including all of its idiosyncrasies. A program written
in the standard Pascal you learned in school may not run on the computer at
your first job. When in doubt, check the manual!
Errors in Logic
In general, syntax errors are relatively easy to locate and correct. Logical
errors, which occur during the execution of a program, are usually harder to
detect.
There are two broad categories of logical errors encountered: those that
stop execution of the program and those that allow execution to continue
but produce the wrong results. The first type is often the result of the pro-
grammer's making too many assumptions. For instance,
X := Y / Z
READ(X)
will cause a run-time error (type conflict) when the input is a nonnumeric
character.
These are situations where a defensive posture produces good results.
You can check explicitly for error-creating conditions rather than letting
them abort your program. For instance, it is generally unwise to make many
assumptions about the correctness of input, especially input from a key-
board. A better approach is to check explicitly for the correct type and
bounds of the input. The programmer can then decide how to handle the
errors (request new input, go on to next record, print out error message,
terminate program, and so on), rather than leaving the decision to the
system.
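As one illustration of this defensive posture, the sketch below guards a division like X := Y / Z, on the assumption that a zero divisor is the error-creating condition of interest. The procedure names SAFEDIVIDE and REPORT, the flag OK, and the message text are hypothetical.

```pascal
PROGRAM SAFEDIV (OUTPUT);
(* Sketch only: the program decides how to handle a zero *)
(* divisor instead of letting the system abort the run.  *)

PROCEDURE SAFEDIVIDE (Y, Z : REAL; VAR X : REAL; VAR OK : BOOLEAN);
(* Sets X to Y / Z and OK to TRUE when Z is nonzero;  *)
(* otherwise leaves X unchanged and sets OK to FALSE. *)
BEGIN
  OK := Z <> 0.0;
  IF OK
    THEN X := Y / Z
END;  (* safedivide *)

PROCEDURE REPORT (Y, Z : REAL);
VAR
  X  : REAL;
  OK : BOOLEAN;
BEGIN
  SAFEDIVIDE(Y, Z, X, OK);
  IF OK
    THEN WRITELN('QUOTIENT IS ', X : 8 : 2)
    ELSE WRITELN('ERROR: DIVISOR IS ZERO; NO DIVISION PERFORMED')
END;  (* report *)

BEGIN  (* main program *)
  REPORT(10.0, 4.0);
  REPORT(10.0, 0.0)
END.  (* main program *)
```

Here the caller chooses to print a message and continue; requesting new input or skipping the record would fit the same pattern.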
Errors that do not stop execution but produce incorrect results are often
harder to prevent and to locate. Our aggressive approach will take a dual
focus.
First, don't use questionable code. Hand-check sections of code with
paper and pencil. Look for loops that execute one time too many or too few.
Make sure that all loops terminate. Check for variables that are incorrectly
initialized or not initialized at all (but should be). Make sure that proce-
dures that return values have VAR parameters, and that the order of the
parameters in the heading and call matches up. See if the algorithm is
correctly coded: no steps left out, all steps in the right order. Try out the
lower and upper bounds of the algorithm.
Second, plan your debugging in the design phase of your program. When
you write your top-down design, you will identify predictable trouble
spots; anything left for your smart friend is surely suspect. Then, insert
temporary "debug WRITELNs" into your code in the places where errors
are likely to occur. If you want to trace the program's execution through a
complicated sequence of nested procedures, add output statements that
indicate when you are entering and/or leaving a procedure. A more useful
debug output will also indicate the values of key variables, especially pa-
rameters of the procedure or function.
If hand-testing doesn't reveal all the bugs before you run the program,
well-placed debug lines will at least help you to locate the rest of them at
execution time. One popular technique is to make the debug WRITELNs
dependent on a Boolean flag, which can then be turned off and on as neces-
sary. For instance, one section of code known to be error-prone may be
flagged in various spots for trace output using the Boolean variable
DEBUGFLAG:
IF DEBUGFLAG
  THEN WRITELN('SECTION A ENTERED')
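A slightly fuller sketch of the same idea is shown below: the debug lines report entry to and exit from a procedure along with the values of its parameters, and the flag is switched off partway through the run. The procedure ADDTOTOTAL and its data are invented for illustration.

```pascal
PROGRAM DEBUGDEMO (OUTPUT);
(* Sketch only: ADDTOTOTAL is a stand-in for an error-prone routine. *)
VAR
  DEBUGFLAG : BOOLEAN;
  TOTAL     : INTEGER;

PROCEDURE ADDTOTOTAL (AMOUNT : INTEGER; VAR TOTAL : INTEGER);
BEGIN
  IF DEBUGFLAG
    THEN WRITELN('ADDTOTOTAL ENTERED: AMOUNT = ', AMOUNT : 1,
                 ', TOTAL = ', TOTAL : 1);
  TOTAL := TOTAL + AMOUNT;
  IF DEBUGFLAG
    THEN WRITELN('ADDTOTOTAL EXITED: TOTAL = ', TOTAL : 1)
END;  (* addtototal *)

BEGIN  (* main program *)
  DEBUGFLAG := TRUE;      (* trace output on *)
  TOTAL := 0;
  ADDTOTOTAL(5, TOTAL);
  DEBUGFLAG := FALSE;     (* trace output off *)
  ADDTOTOTAL(7, TOTAL);
  WRITELN('FINAL TOTAL = ', TOTAL : 1)
END.  (* main program *)
```

Because the debug lines stay in the code, they can be turned back on later if the program is modified.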
PROGRAM TESTING
How do you know when you have found all the bugs in a program? You
never know for sure. Running the program on sample data sets can reveal
errors, but it does not guarantee that other errors do not exist. In all but the
most limited cases, it is not practical to test the program on every possible
set of data.
Many programmers are rather lax about program testing. It doesn't seem
as interesting, challenging, or glamorous as writing the original program.
Furthermore, after a large investment in a particular solution, who
wouldn't be reluctant to see it fail? However, thorough testing is an integral
part of the programming process, and it can be as challenging as writing the
program itself.
Here is the challenge: Try to find data sets that will "break" the code in
as many different ways as possible. This goal requires you to change roles,
to become the adversary, rather than the creator, of the program. For this
reason, it is often desirable to have someone else try to break the program.
In fact, many software companies have separate staff for program design
and program testing.
A more structured approach requires that the test cases cause every pos-
sible path to be executed at least once. (A simple IF-THEN-ELSE state-
ment has two paths.) This approach is more thorough. For a large program
with many paths, however, it may not be a practical testing strategy.
Structured Testing
A large, complicated program can be tested in a structured way through a
method very similar to the structured approach used to design the program.
One method is top-down testing. Testing begins at the top levels, to see if
the overall logical design works and if the interfaces between modules are
correct. This approach assumes, at each level, that the lower levels will
work correctly. To make this assumption, you replace the lower-level sub-
programs with dummy modules called stubs. A stub may consist of a single
WRITELN statement, indicating that you have entered the procedure, or it
may assign a value if the subprogram is supposed to return a value.
As a simple example, consider a program that reads in a command and a
pair of values, executes the appropriate operation, and prints the result. To
test the top level, run the main program:
BEGIN
RESULT := 0.0
END; (* operate *)
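A sketch of top-level testing with a stub standing in for OPERATE appears below. The stub body follows the fragment above; the parameter list, the type name OPTYPE, and the hard-coded command and values (used here in place of reading input) are assumptions.

```pascal
PROGRAM STUBDEMO (OUTPUT);
(* Sketch only: tests the top level with OPERATE stubbed out. *)
TYPE
  OPTYPE = (ADDCOM, SUBCOM, MULCOM, DIVCOM);   (* assumed type name *)
VAR
  RESULT : REAL;

PROCEDURE OPERATE (COMMAND : OPTYPE; X, Y : REAL; VAR RESULT : REAL);
(* Stub: announces that it was entered and returns a dummy value. *)
BEGIN
  WRITELN('OPERATE ENTERED');
  RESULT := 0.0
END;  (* operate *)

BEGIN  (* main program *)
  OPERATE(ADDCOM, 2.0, 3.0, RESULT);
  WRITELN('RESULT = ', RESULT : 6 : 2)
END.  (* main program *)
```

If the trace line and the dummy result appear, the interface between the main program and OPERATE works, even though no arithmetic has been written yet.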
At the next level of testing, substitute the actual procedures for the stubs,
creating new stubs to stand in for subprograms called from the second-level
modules. For instance, Procedure OPERATE contains a CASE statement:
CASE COMMAND OF
  ADDCOM : ADD(X, Y, RESULT);
  SUBCOM : SUBTRACT(X, Y, RESULT);
  MULCOM : MULTIPLY(X, Y, RESULT);
  DIVCOM : DIVIDE(X, Y, RESULT)
END; (* case *)
You can create stubs to stand in for the untested procedures ADD, SUB-
TRACT, and so on.
Finally, at the lowest level, replace the stubs with the real procedures
ADD, SUBTRACT, and so on.
Although a program as simple as this one does not require such a structured
testing strategy, it illustrates the approach.
A second program testing strategy is to test from the bottom up. This
approach tests the lowest level subprograms first, using driver programs to
call them. This is a useful approach in testing and debugging a critical
module, where an error would have significant effects on other modules. It
is also useful in a group-programming environment, where each program-
mer writes and tests a separate module. This testing strategy is illustrated
with an example at the end of the chapter.
TEST DRIVER
A program that sets up the testing environment by declaring and as-
signing initial values to variables, then calls the procedure to be
tested.
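A minimal test driver along these lines might look like the sketch below. It tests a procedure ADD whose body is assumed here (the heading matches the call ADD(X, Y, RESULT) used earlier); the test values and messages are illustrative.

```pascal
PROGRAM DRIVER (OUTPUT);
(* Sketch only: a driver for one low-level procedure. *)
VAR
  X, Y, RESULT : REAL;

PROCEDURE ADD (X, Y : REAL; VAR RESULT : REAL);
(* Procedure under test; this body is assumed. *)
BEGIN
  RESULT := X + Y
END;  (* add *)

BEGIN  (* driver *)
  (* Set up the testing environment. *)
  X := 2.5;
  Y := 4.0;
  RESULT := 0.0;
  (* Call the procedure to be tested. *)
  ADD(X, Y, RESULT);
  (* Compare with the value worked out by hand. *)
  IF ABS(RESULT - 6.5) < 0.0001
    THEN WRITELN('ADD PASSED')
    ELSE WRITELN('ADD FAILED: RESULT = ', RESULT : 8 : 4)
END.  (* driver *)
```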
money to build the scaffolding, even though it will not be part of the final
product. But without it, the building could not be constructed.
A WARNING
If, after you have gone through the whole top-down design process, your
approach just doesn't work, don't be afraid to start over. If you think that
you've already invested a lot of time in a solution that has become overly
complicated, convoluted, and cumbersome, just think of what will be in-
volved in trying to maintain the code.
It is unwise to skimp on any of these tasks for the sake of speeding up the
programming process. A program that is not well designed and coded will
be expensive to maintain over its whole lifetime of use.
We can speed up the process of problem solving, however. It is not al-
ways necessary to solve every problem from scratch. As you go along, you
will pick up some standard approaches to common programming tasks. For
instance, in your first course, writing the code to keep a cumulative counter
took some thought: first initialize the counter, then increment it on every
iteration. Now you can write this kind of code in your sleep.
In this book, we will discuss many popular and useful algorithms. Famil-
iarity with well-known algorithms can simplify the problem-solving proc-
ess. It is not intended that the student memorize every algorithm. They are
only included to show ways that others before you have solved common
computing problems.
Our job is not to write the code to sort, merge, or print. Our job is to write
the top level of the program, which reads in a command, determines which
task is to be done, and calls the proper subprogram to execute the task.
Our main module, then, is quite straightforward:
GET COMMAND
WHILE COMMAND <> STOP DO
  CASE COMMAND OF
    MERGE : DOMERGE;
    SORT  : DOSORT;
    PRINT : DOPRINT
  GET COMMAND
WRITE ('RUN IS FINISHED')
The Pascal declarations for this structure and a picture of this logical
structure after it has been initialized are shown in the program below.
(Note that the name for the type of the character array is STRING5. When
dealing with strings of alphanumeric characters, we will call the type
STRING followed by the number of characters in the string.)
FIRSTCOM : COMTYPE;
COMMAND : COMTYPE;
COMTABLE : TABLE;
COMTABLE
[MERGE]
[SORT]
[PRINT]
[STOP]
COMSTRING: used to store commands as read in
COMSTRING
SKIPBLANKS (* Returns first nonblank character. *)
WHILE more characters DO
get next letter into COMSTRING
CONVERT from STRING to COMTYPE
SKIPBLANKS Level 2
REPEAT
READ CH
UNTIL CH <> BLANK
CONVERT Level 2
COMMAND ← FIRSTCOM
WHILE COMTABLE[COMMAND] <> COMSTRING
COMMAND ← SUCC(COMMAND)
DOMERGE Level 1
SKIPBLANKS(CH);
CASE CH OF
'I' : (* Call integer merge routine. *)
'R' : (* Call real merge routine. *)
PROCEDURE INITIALIZE;
(* Initializes the command table and sets FIRSTCOM. *)
COMTABLE[MERGE] := 'MERGE';
COMTABLE[SORT] := 'SORT ';
COMTABLE[PRINT] := 'PRINT';
COMTABLE[STOP] := 'STOP ';
FIRSTCOM := MERGE
(************************************************)
PROCEDURE SKIPBLANKS (VAR CH : CHAR);
(* Returns the first nonblank character from the input stream. *)
(************************************************)
PROCEDURE CONVERT (COMSTRING : STRING5;
VAR COMMAND : COMTYPE);
(* Takes a command in character string format and returns *)
(* the appropriate command in the user-defined type COMTYPE. *)
(* Assumes that COMSTRING is a valid command string. *)
(************************************************)
INDEX := 1;
REPEAT (* Read COMSTRING. *)
COMSTRING[INDEX] := CH;
INDEX := INDEX + 1;
READ(CH)
UNTIL CH = ' ';
CONVERT(COMSTRING, COMMAND)
(************************************************)
BEGIN (* main program *)
INITIALIZE; (* Set up command table. *)
GETCOMMAND(COMMAND); (* Get first command. *)
WHILE COMMAND <> STOP DO
BEGIN
(* Process according to command. *)
CASE COMMAND OF
MERGE : DOMERGE;
SORT : DOSORT;
PRINT : DOPRINT
END; (* case *)
GETCOMMAND(COMMAND) (* Get next command. *)
END; (* while *)
WRITELN('RUN IS FINISHED')
END. (* main program *)
A built-in error check can be created by adding one more element to the
defined type COMTYPE; call it ERROR. Add a condition to your WHILE
loop that stops if the COMMAND becomes ERROR.
These additions are illustrated below.
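A minimal sketch of this error check follows, assuming the declarations used earlier in this application. The helper function SAME and the test strings are hypothetical and are included only so that the sketch is self-contained; the search in CONVERT stops either when the command is found or when COMMAND becomes ERROR.

```pascal
PROGRAM ERRDEMO (OUTPUT);
(* Sketch only: ERROR is appended to COMTYPE as a sentinel value. *)

TYPE
  STRING5 = PACKED ARRAY[1..5] OF CHAR;
  COMTYPE = (MERGE, SORT, PRINT, STOP, ERROR);

VAR
  COMTABLE  : ARRAY[MERGE..STOP] OF STRING5;
  FIRSTCOM  : COMTYPE;
  COMMAND   : COMTYPE;
  COMSTRING : STRING5;

FUNCTION SAME (A, B : STRING5) : BOOLEAN;
(* Hypothetical helper: compares two strings character by character. *)
VAR
  I     : INTEGER;
  EQUAL : BOOLEAN;
BEGIN
  EQUAL := TRUE;
  FOR I := 1 TO 5 DO
    IF A[I] <> B[I]
      THEN EQUAL := FALSE;
  SAME := EQUAL
END;

PROCEDURE CONVERT (COMSTRING : STRING5;
                   VAR COMMAND : COMTYPE);
(* Returns ERROR when COMSTRING is not found in the command table. *)
VAR
  FOUND : BOOLEAN;
BEGIN
  COMMAND := FIRSTCOM;
  FOUND := FALSE;
  (* The added condition: stop if COMMAND becomes ERROR. *)
  WHILE (COMMAND <> ERROR) AND NOT FOUND DO
    IF SAME(COMTABLE[COMMAND], COMSTRING)
      THEN FOUND := TRUE
      ELSE COMMAND := SUCC(COMMAND)
END;

BEGIN
  COMTABLE[MERGE] := 'MERGE';
  COMTABLE[SORT]  := 'SORT ';
  COMTABLE[PRINT] := 'PRINT';
  COMTABLE[STOP]  := 'STOP ';
  FIRSTCOM := MERGE;

  COMSTRING := 'SORT ';
  CONVERT(COMSTRING, COMMAND);
  IF COMMAND = SORT
    THEN WRITELN('SORT RECOGNIZED');

  COMSTRING := 'SQRT ';          (* not a valid command *)
  CONVERT(COMSTRING, COMMAND);
  IF COMMAND = ERROR
    THEN WRITELN('INVALID COMMAND')
END.
```

Because ERROR is the last value of COMTYPE, SUCC(STOP) yields ERROR and the table is never indexed past its last entry.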
Test Driver
The command-driven program has many real-world applications. It is
also useful as a programming tool: to build a driver for a bottom-up test
strategy. The input command is used to determine which test to perform.
For instance, if a program has several subprograms that may be independ-
ently tested, we can write a test driver that inputs a command that indicates
which procedure is to be tested and its parameters, if any. An example
follows:
Note that many cases for each procedure may be tested using this driver,
by replacing (* Set up calling arguments .... *) with code that reads in the
calling arguments from the input stream. This allows us to use this program
as a generalized test tool, as opposed to creating a separate driver for each
procedure to be tested.
1. Write a complete description of the class rank problem discussed in this chap-
ter.
6. Give a top-down design for getting up in the morning and going to an eight
o'clock class.
7. What part of a program's code is analogous to the main module of the top-down
design?
8. How does the top-down method encourage the use of procedures and func-
tions?
9. The implementation of the data structures that will be used in a program should
be decided before anything else. T F
10. There is usually only one good way to solve a problem and to structure its data.
T F
11. In general, most of the effort involved over the lifetime of a program is spent in
maintaining it. T F
12. How can the code of a program be considered part of its documentation?
14. Use meaningful identifiers and constants to make sense out of the following
code:
VAR X, Y, Z : REAL;
15. Differentiate between errors in logic and in syntax. When is each type of error
likely to be found?
16. The compiler will always catch your typos and syntax errors. T F
17. This program has three separate errors, each of which causes an infinite loop.
Can you find them?
(* main *)
BEGIN
COUNT := 1;
WHILE COUNT < 10 DO
WRITELN('HELLO'); (* Procedure INCREMENT adds
INCREMENT(COUNT); 1 to the value of COUNT *)
WRITELN('GOODBYE')
END. (* main *)
18. What are stubs and drivers used for in program testing? Which testing strategy
goes with each?
19. The following is a program description with some problems.
(* ** THIS IS A PROGRAM TO TAKE SOME NAMES
** AND ALPHABETIZE THEM. INPUT: THE NAMES OF
** SOME PEOPLE, THEIR CITIES, AND ZIPCODES.
** OUTPUT: THE NAMES, CITIES AND ZIPCODES
** PRINTED OUT, ALPHABETIZED BY LAST NAMES.
*)
Write a COMPLETE program description, given the following sample input
and output. (Don't write the program.)
Sample Input:
Col. 1 9 10 11 19 20 21 34 35 36 40
FIRSTNAME (blank) LASTNAME (blank) CITY (blank) ZIPCODE
TOM Q. JONES CHICAGO 30303
MISS SMITH HOLLYWOOD 99999
CAROLE VAN KIRK SWAMPCITY 00000
SUSAN MACMILLAN AUSTIN,TX 78787
NOMORE
(denotes
end of
input)
Sample Output:
TOM Q. JONES CHICAGO 30303
SUSAN MACMILLAN AUSTIN,TX 78787
MISS SMITH HOLLYWOOD 99999
CAROLE VAN KIRK SWAMP CITY 00000
20. This is a program to read in 25 words, sort them, and print them out. The input
is formatted one word per line, each word left-justified in the first 16 columns.
As you can see, this program has a few style problems.
PROGRAM BADSTYLE (INPUT, OUTPUT);
4. What is an algorithm?
5. Logical errors are easier to find and correct than syntax errors. T F
6. Two goals of internal program documentation are to make it and
PROGRAM BADSTYLE;
LABEL 10;
VAR A, B, C : INTEGER;
BEGIN (*MAINPROGRAM*) C :=0;
10: READ(A);
WRITE(A:5); B := A * 100 DIV 18; WRITELN(B); C := C + 1;
IF C < 117 THEN GOTO 10 END.
Chapter 2 Built-in Data Structures
One of the tools that beginning programmers often take for granted is the
high-level language in which they write their programs. Since most of us
first learn to program in a language like Pascal, we do not appreciate its
branching and looping structures and built-in data structures until we are
later introduced to languages that do not have these features.
In the class rank example in Chapter 1, we decided to use an array of
records to store our data. But what is an array? What is a record? Pascal, as
well as many other high-level programming languages, provides arrays and
records as built-in data structures. As a Pascal programmer, you can use
these tools without concern about their implementation, much as a carpen-
ter can use an electric drill without knowing about electricity.
However, there are many interesting and useful ways of structuring data
that are not provided in general-purpose programming languages. The pro-
grammer who wants to use these structures must build them. In this book,
we will look in detail at four very useful data structures: stacks, queues,
lists, and binary trees. We will describe each of these structures and design
algorithms to manipulate them. We will build (implement) them using the
tools that are available in the Pascal language. Finally, we will examine
applications where each is appropriate.
First, however, we will develop a definition of data structure and an
approach that we can use to examine data structures. By way of example,
we will apply our definition and approach to familiar Pascal data structures:
the one-dimensional array, the two-dimensional array, and the record. In
later chapters, we will extend this definition and approach to other user-
defined data structures.
This definition does not specify how the accessing is to be done, only
that these functions return the desired element. For built-in data structures
(those provided in a programming language), the programmer may never
know how this accessing is done; it is transparent. For a data structure that
is not provided in a language, a set of procedures and functions that carry
out the accessing functions is specified by the programmer. In a job situa-
tion, the coding of these procedures and functions might well be done by
someone else. In either case, how the accessing functions and procedures
are coded would be immaterial to the program using them.
The definition of a data structure above doesn't say anything about what
the data structure means. The physical or logical relationship of one ele-
ment to another and the mode of access of each element are specified in the
structure's definition, but the meaning of the relationships of one element
to another is not. This meaning can be defined only when the structure is
used in a program to represent the data in a particular problem. By way of
analogy, let's look at Figure 2-1, which shows a floor plan with four rooms.
The doors (accesses) to each room are marked. Is this the plan for an apart-
ment? An office suite? A doctor's office? There is no way to tell. We can see
only a shell. Only when the rooms are occupied will we be able to tell what
the plan represents.
What we are leading up to is that there are three distinct ways to look at
a data structure: as an abstract collection of elements with a particular set of
accessing functions, as a coding problem (i.e., how will the accessing func-
tions be implemented?), and as a way of representing the data relationships
in a specific context. We will analyze each data structure from these three
perspectives.
ONE-DIMENSIONAL ARRAYS
Since almost all programming languages have a one-dimensional array as a
built-in data structure, this is a good place to start. At the logical level, a
one-dimensional array is a structured data type made up of a finite collec-
tion of ordered elements, all of which are of the same data type. (By ordered
we simply mean that there is a first element, a second element, and so on.
Since the collection is finite, there is also a last element.) Accessing is done
through the use of an index that allows you to specify the desired element
by giving its position in the collection.
The syntax and semantics of the language specify the accessing function.
In Pascal you declare a data type that defines what a one-dimensional array
will look like. You then create an array variable to be of that type in the
VAR section. For example,
[Figure: the array DATA, with cells [1], [2], ..., [10]]
The syntax of the accessing function is the name of the collection of
elements, followed by an open bracket ('['), followed by an indexing
expression whose value is between 1 and 10, concluded by a closed bracket
(']'). The semantics of the accessing function is "locate the element
associated with the indexing expression in the collection of elements whose
name is DATA."
DATA[1]
DATA[I]
The accessing function is used in one of two ways: to specify a place into
which a value is to be copied or to specify a place from which a value is to
be extracted. As a user of arrays, you have probably not given much thought
to how they are implemented. The implementation is transparent to you;
the Pascal run-time support system took care of the accessing functions.
However, someone somewhere wrote the code that actually accessed the
correct place in the array. Let's change hats here and look at how arrays are
actually represented and accessed.
[Figure 2-3: memory layouts showing bytes, halfwords, words, and double words for an 8-bit machine, 16-bit machines, and a 60-bit CDC machine]
           DATA      VALUES    LETTERCOUNT
UPBOUND    10        2         'Z'
LOBOUND    1         -3        'A'
BASE       unknown   unknown   unknown
SIZE       1         1         1
We can look at the declaration for DATA and immediately see that ten cells
will be required. However, the index type of VALUES is more compli-
cated. To determine the number of cells needed by a particular index type,
take the ORD of the upper bound of the index type, subtract the ORD of the
lower bound of the index type, then add one. The following table applies
this formula to the arrays above:
Index Type    ORD(UPBOUND) - ORD(LOBOUND) + 1    Number of Cells
[1..10]       10 - 1 + 1                         10
[-3..2]       2 - (-3) + 1                       6
['A'..'Z']    ORD('Z') - ORD('A') + 1            26
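The formula can be checked with a short program. This is only a sketch; the program name is hypothetical, and the count of 26 for the letter range assumes a character set such as ASCII in which the capital letters are contiguous.

```pascal
PROGRAM CELLDEMO (OUTPUT);
(* Sketch only: applies ORD(UPBOUND) - ORD(LOBOUND) + 1 *)
(* to the three index types discussed in the text.      *)
BEGIN
  (* DATA : index type [1..10] *)
  WRITELN('DATA        NEEDS ', ORD(10) - ORD(1) + 1, ' CELLS');
  (* VALUES : index type [-3..2] *)
  WRITELN('VALUES      NEEDS ', ORD(2) - ORD(-3) + 1, ' CELLS');
  (* LETTERCOUNT : index type ['A'..'Z'], contiguous letters assumed *)
  WRITELN('LETTERCOUNT NEEDS ', ORD('Z') - ORD('A') + 1, ' CELLS')
END.
```

On a character set where the letters are not contiguous, the third result would be larger than 26, because the array must reserve a cell for every ordinal value between the bounds.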
Now we have determined the base address of each array: DATA is 100,
VALUES is 110, and LETTERCOUNT is 116. The arrangement of these
arrays in memory gives us the following relationships:
Given               the program must access
DATA[1]             100
DATA[9]             108
LETTERCOUNT['A']    116
LETTERCOUNT['C']    118
VALUES[-1]          112
VALUES[0]           113
In our discussion so far we have assumed that the component type of the
array requires only one memory cell. If the component type requires SIZE
cells, the formula to determine the number of cells needed becomes
(ORD(UPBOUND) - ORD(LOBOUND) + 1) * SIZE
If the component type of the array takes more than one word in memory,
we will again have to adjust the formula slightly. If each element takes
SIZE words in memory, the formula becomes
Location(INDEX) = BASE + (ORD(INDEX) - ORD(LOBOUND)) * SIZE
The following table shows the calculations if SIZE is 3 and LETTERCOUNT
is assigned to Location 50:
Address   Index    Calculation
50        ['A']    LETTERCOUNT + (ORD('A') - ORD('A')) * 3 = 50
53        ['B']    LETTERCOUNT + (ORD('B') - ORD('A')) * 3 = 53
125       ['Z']    LETTERCOUNT + (ORD('Z') - ORD('A')) * 3 = 125
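The accessing formula can be written as a small function. The following is a sketch rather than the code a compiler would generate; the function name LOCATION and its parameter names are assumptions made for illustration, and the letter results assume contiguous capital letters as in ASCII.

```pascal
PROGRAM ADDRDEMO (OUTPUT);
(* Sketch only: Location = BASE + (ORD(INDEX) - ORD(LOBOUND)) * SIZE *)

FUNCTION LOCATION (BASE, INDEXORD, LOBOUNDORD, SIZE : INTEGER) : INTEGER;
(* INDEXORD and LOBOUNDORD are the ORD values of the index *)
(* and of the lower bound of the index type.               *)
BEGIN
  LOCATION := BASE + (INDEXORD - LOBOUNDORD) * SIZE
END;

BEGIN
  (* LETTERCOUNT at Location 50 with SIZE = 3 *)
  WRITELN(LOCATION(50, ORD('A'), ORD('A'), 3));    (* 50  *)
  WRITELN(LOCATION(50, ORD('B'), ORD('A'), 3));    (* 53  *)
  WRITELN(LOCATION(50, ORD('Z'), ORD('A'), 3));    (* 125 *)
  (* VALUES[-1] with BASE = 110, LOBOUND = -3, SIZE = 1 *)
  WRITELN(LOCATION(110, -1, -3, 1))                (* 112 *)
END.
```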
In Pascal, the upper bound (UPBOUND) for each array is used in error
checking. After an indexing expression has been calculated, this value is
compared to the BASE + UPBOUND. If the calculated expression is
greater than or equal to the BASE + UPBOUND, the address is not within
the array. An error message is printed and execution halts.
TWO-DIMENSIONAL ARRAYS
TABLE[I][J]: I specifies which VECTOR; J specifies which position within VECTOR
TABLE[I, J]: I specifies which element in first dimension; J specifies which element in second dimension
CONST NUMSTOCKS = 6;
NUMWEEKS = 4;
[Figure 2-4: AVGPRICES drawn as a table with rows [1] to [6] and columns [1] to [4]. The value in AVGPRICES[3, 2] represents the average price of the third stock for the second week.]
AVGPRICES   ADDRESSES
[1, 1]      400        Row 1
[1, 2]      401
[1, 3]      402
[1, 4]      403
[2, 1]      404        Row 2
[2, 2]      405
[2, 3]      406
[2, 4]      407
...
[6, 1]      420        Row 6
[6, 2]      421
[6, 3]      422
[6, 4]      423
Figure 2-5.
the block of storage set aside for this table. Let's assume that the address is
400.
To access an element in the second row, you must skip over the elements
in the first row. To access an element in row I, you must skip over I - 1
rows. How many elements are there in each row? The number of elements
in the second dimension tells us. There is one element in a row for each
column. Therefore,
UB COLUMN - LB COLUMN + 1
gives us the number of items in each row. Since there are four columns in
AVGPRICES, there are four elements in a row.
The base address plus the number of elements in a row times the num-
ber of rows to be skipped gives us the first cell in the correct row. Consider
this value to be the base address of the correct row. Now the same formula
can be used to find the correct row item as was used to find the place in the
one-dimensional case: the column index minus the lower bound of the
column.
Location(I1, I2) = BASE + ((UB2 - LB2 + 1) * (I1 - LB1) + (I2 - LB2)) * SIZE
where
UB1 = upper bound of first dimension
LB1 = lower bound of first dimension
UB2 = upper bound of second dimension
LB2 = lower bound of second dimension
I1 = first dimension index
I2 = second dimension index
BASE = base address of the array
SIZE = the number of cells each component type occupies
AVGPRICES   ADDRESSES
[1, 1]      400        Column 1
[2, 1]      401
[3, 1]      402
[4, 1]      403
[5, 1]      404
[6, 1]      405
...
[1, 4]      418        Column 4
[2, 4]      419
[3, 4]      420
[4, 4]      421
[5, 4]      422
[6, 4]      423
                   COLUMN-MAJOR                          ROW-MAJOR
AVGPRICES[1, 1]    400 + 6 x (1 - 1) + (1 - 1) = 400     400
AVGPRICES[3, 2]    400 + 6 x (2 - 1) + (3 - 1) = 408     409
AVGPRICES[4, 1]    400 + 6 x (1 - 1) + (4 - 1) = 403     412
AVGPRICES[6, 4]    400 + 6 x (4 - 1) + (6 - 1) = 423     423
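The two storage orders can be compared with a pair of functions. This is a sketch under the assumptions of the AVGPRICES example (base address 400, one cell per element); the function names ROWMAJOR and COLMAJOR are hypothetical.

```pascal
PROGRAM MAJORDEMO (OUTPUT);
(* Sketch only: address calculation for AVGPRICES[1..6, 1..4]. *)
CONST
  BASE = 400;          (* assumed base address, as in the text *)
  SIZE = 1;            (* cells per element *)
  LB1 = 1;  UB1 = 6;   (* bounds of first dimension (stocks) *)
  LB2 = 1;  UB2 = 4;   (* bounds of second dimension (weeks) *)

FUNCTION ROWMAJOR (I1, I2 : INTEGER) : INTEGER;
(* Skip I1 - LB1 complete rows, then I2 - LB2 items within the row. *)
BEGIN
  ROWMAJOR := BASE + ((UB2 - LB2 + 1) * (I1 - LB1) + (I2 - LB2)) * SIZE
END;

FUNCTION COLMAJOR (I1, I2 : INTEGER) : INTEGER;
(* Skip I2 - LB2 complete columns, then I1 - LB1 items within the column. *)
BEGIN
  COLMAJOR := BASE + ((UB1 - LB1 + 1) * (I2 - LB2) + (I1 - LB1)) * SIZE
END;

BEGIN
  WRITELN('[1,1] ', COLMAJOR(1, 1), ' ', ROWMAJOR(1, 1));   (* 400 400 *)
  WRITELN('[3,2] ', COLMAJOR(3, 2), ' ', ROWMAJOR(3, 2));   (* 408 409 *)
  WRITELN('[4,1] ', COLMAJOR(4, 1), ' ', ROWMAJOR(4, 1));   (* 403 412 *)
  WRITELN('[6,4] ', COLMAJOR(6, 4), ' ', ROWMAJOR(6, 4))    (* 423 423 *)
END.
```

The first and last elements land at the same address in both orders; only the elements in between are arranged differently.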
RECORDS
Pascal has an additional built-in data type: the record. This very useful
structure is not available in all programming languages. FORTRAN, for
example, does not support this structure. However, COBOL, a business-
oriented language, uses records extensively.
A record is a structured data type made up of a finite collection of not
necessarily homogeneous elements called fields. Accessing is done
through a set of named field selectors.
Let's define the syntax and semantics of the accessing function within
the context of the following Pascal declaration:
CAR.YEAR: record variable, period, field selector
CAR.COLOR[2]: record variable, period, field selector, second element in the field
* In some machines this may not be exactly true, since boundary alignment (full or half word)
may require that some space in memory be skipped so that the next field starts on a byte that
is divisible by 2 or 4. This is true of the IBM 360/370. See Figure 2-3.
CARTYPE
FIELD    LENGTH
YEAR     1
COLOR    10
PRICE    1

ADDRESS
1000     (YEAR field)
1001     (COLOR field)
1002
...
1010
1011     (PRICE field)
Figure 2-7.
this table is examined to see where the desired field is within the record
variable. Figure 2-7 shows such a table for CARTYPE and what the record
CAR would look like in memory if the base address of CAR were 1000 and
each element took one cell.
If the accessing expression CAR.YEAR is encountered, the location is
the base address of CAR plus the sum of the lengths of the fields in CAR-
TYPE that precede the desired field YEAR.
CAR.YEAR = 1000 + 0 = 1000
CAR.COLOR = 1000 + 1 = 1001
CAR.PRICE = 1000 + 1 + 10 = 1011
Note that CAR.COLOR is the base address of an array. If an element
within this array is to be accessed, this base address is used in the array
accessing formula.
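The field lookup can be sketched as a table of field lengths that is summed up to the desired field. The names FIELDNAME, LEN, and FIELDADDR below are hypothetical and stand in for whatever table the compiler keeps; the base address 1000 and the field lengths are those of the CAR example.

```pascal
PROGRAM FIELDDEMO (OUTPUT);
(* Sketch only: locating a field from a table of field lengths. *)
CONST
  BASE = 1000;                       (* assumed base address of CAR *)
TYPE
  FIELDNAME = (YEAR, COLOR, PRICE);  (* fields in declaration order *)
VAR
  LEN : ARRAY[FIELDNAME] OF INTEGER; (* length of each field in cells *)

FUNCTION FIELDADDR (F : FIELDNAME) : INTEGER;
(* Base address plus the lengths of all fields that precede F. *)
VAR
  G      : FIELDNAME;
  OFFSET : INTEGER;
BEGIN
  OFFSET := 0;
  G := YEAR;
  WHILE G <> F DO
    BEGIN
      OFFSET := OFFSET + LEN[G];
      G := SUCC(G)
    END;
  FIELDADDR := BASE + OFFSET
END;

BEGIN
  LEN[YEAR] := 1;
  LEN[COLOR] := 10;
  LEN[PRICE] := 1;
  WRITELN('CAR.YEAR     AT ', FIELDADDR(YEAR));            (* 1000 *)
  WRITELN('CAR.COLOR    AT ', FIELDADDR(COLOR));           (* 1001 *)
  WRITELN('CAR.PRICE    AT ', FIELDADDR(PRICE));           (* 1011 *)
  (* COLOR is an array: apply the array formula from its base. *)
  WRITELN('CAR.COLOR[2] AT ', FIELDADDR(COLOR) + (2 - 1) * 1)  (* 1002 *)
END.
```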
PACKED STRUCTURES
The structures we have examined so far have not been packed structures.
In a packed structure, the elements may or may not map directly into an
integral number of cells. The form of a packed structure is machine de-
pendent. Figure 2-3 showed several different memory configurations. In a
machine where the word size is eight bits, it takes eight bits to represent a
character and a packed structure is identical to its unpacked version. In a
machine where there are 60 bits in a word, each character requires six bits
and ten characters can be packed into one word.
The accessing functions remain basically the same in packed structures.
You calculate how many places to skip from a base address to get to the one
you want. However, in a packed structure the number of places may be in
terms of binary digits (bits) or bytes rather than words.
ERROR CHECKING
In planning your programs, you should always build in a way to check for
invalid data. This same principle applies to the implementation level of
data structures.
Pascal has the error checking built into the accessing function. For
instance, suppose the user tries to access DATA[11] where DATA is a
ten-element ARRAY[1..10]. Since the calculated address is outside the range of
cells assigned to DATA, a message is printed saying INDEX OUT OF
RANGE and the program crashes.
In contrast, FORTRAN leaves error checking to the user. If the address
calculated is outside of the range of addresses associated with DATA, a
value is either stored into or extracted from the wrong place. In other
words, the programmer is responsible for checking to make sure the value
of the index is within range.
These two approaches illustrate different philosophies of error handling.
Pascal checks for run-time errors and takes control away from the program
if an error occurs. FORTRAN does very little run-time error checking.
There is a third philosophy that checks for errors but leaves the determi-
nation of what to do if an error condition arises to the programmer. That is,
the accessing routines check for errors, but leave the error-handling deci-
sion to the programmer by setting an error flag. After each call to an access-
ing routine, the programmer can check the error flag to see if an error condi-
tion occurred. If so, the programmer makes the decision as to how to handle
error conditions within the context of the semantics of the problem itself.
This third philosophy is used in the application at the end of this chap-
ter. In the planning stage of each operation, possible error conditions are
considered. An error flag is an output parameter from each procedure
where an error condition is possible.
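A minimal sketch of this third philosophy follows. The procedure RETRIEVE, its parameter names, and the sample data are assumptions made for illustration; the point is only that the accessing routine detects the problem and the caller decides what to do about it.

```pascal
PROGRAM FLAGDEMO (OUTPUT);
(* Sketch only: an accessing routine that reports errors via a flag. *)
CONST
  MAXINDEX = 10;
TYPE
  LISTTYPE = ARRAY[1..MAXINDEX] OF INTEGER;
VAR
  DATA  : LISTTYPE;
  VALUE : INTEGER;
  ERROR : BOOLEAN;
  I     : INTEGER;

PROCEDURE RETRIEVE (VAR LIST : LISTTYPE;
                    SLOT : INTEGER;
                    VAR VALUE : INTEGER;
                    VAR ERROR : BOOLEAN);
(* Sets ERROR instead of halting; VALUE is unchanged on error. *)
BEGIN
  ERROR := (SLOT < 1) OR (SLOT > MAXINDEX);
  IF NOT ERROR
    THEN VALUE := LIST[SLOT]
END;

BEGIN
  FOR I := 1 TO MAXINDEX DO
    DATA[I] := I * I;

  RETRIEVE(DATA, 3, VALUE, ERROR);
  IF NOT ERROR
    THEN WRITELN('DATA[3] = ', VALUE);

  RETRIEVE(DATA, 11, VALUE, ERROR);
  IF ERROR
    THEN WRITELN('INDEX 11 IS OUT OF RANGE; THE CALLER DECIDES WHAT TO DO')
END.
```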
Summary
In this chapter we have examined two useful built-in data structures
from three perspectives: the logical level, the application level, and the
implementation level.
A data structure as defined at the logical level is called an abstract data
type. A programmer should be able to use an abstract data type without
being concerned about its implementation level. In your previous course
you did use both the array type and the record type without having to know
how they were implemented.
We will continue to study additional abstract data types that are not built
into existing programming languages. We will not stop at the logical level;
we will implement these structures in Pascal. Once you have an implemen-
tation for these structures, you can forget about that level. Subsequent pro-
grams can be written using these structures as if they were built-in.
This concept of augmenting or enhancing a programming language with
your own structured data types is a very important one. Pascal does not
have a mechanism for actually adding them to the compiler. You must de-
clare them in each program and include accessing procedures and func-
tions. There is no way to make them physically transparent to the user.
This concept has, however, been incorporated in the programming lan-
guage ADA. The packaging feature of ADA that allows you to do this will
be discussed in the last chapter.
APPLICATION: STRINGS
The chief programmer in your team took the general descriptions listed,
determined an appropriate data structure in which to represent a string,
and wrote detailed specifications for each operation. You, as junior pro-
grammer, have been assigned to write the Pascal code to implement the
operations according to these specifications.
Each string will be represented by two parts: the string of characters
itself and an integer variable that indicates how many characters are in the
string. Since Pascal doesn't allow variable-length arrays, a maximum length
must be chosen. Each string will be represented as an array of the maxi-
mum length, but the string itself will constitute only that part of the array
from position 1 to the position stated in the LENGTH field.
The declarations you are to use are:
CONST MAXLENGTH = ?;
STRINGTYPE = RECORD
LENGTH: INDEXRANGE;
CHARS: ARRAY[1..MAXLENGTH] OF CHAR
END; (* record *)
[Figure: a STRING record, showing its .LENGTH field and its .CHARS array]
LENGTH OPERATION
Isn't it inefficient to have a one-line function? Why not just replace the
function invocation, LENGTH(STRING), in the calling program with
STRING.LENGTH? Remember that the implementation is transparent to
the user: You know this implementation takes only one line of code; the
user sees this operation only from the logical level. If you let the user
access the length directly, you have violated the concept of an abstract
data type.
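A sketch of the one-line function follows. The value of MAXLENGTH and the definition of INDEXRANGE are assumptions, since the declarations above leave them open, and the parameter is called S here because STRING is a reserved word in some later Pascal dialects.

```pascal
PROGRAM LENDEMO (OUTPUT);
(* Sketch only: the LENGTH operation hides the record layout. *)
CONST
  MAXLENGTH = 80;                 (* assumed value *)
TYPE
  INDEXRANGE = 0..MAXLENGTH;      (* assumed definition *)
  STRINGTYPE = RECORD
                 LENGTH : INDEXRANGE;
                 CHARS  : ARRAY[1..MAXLENGTH] OF CHAR
               END; (* record *)
VAR
  NAME : STRINGTYPE;

FUNCTION LENGTH (VAR S : STRINGTYPE) : INDEXRANGE;
(* Returns the number of characters currently in S. *)
BEGIN
  LENGTH := S.LENGTH
END;

BEGIN
  NAME.CHARS[1] := 'R';
  NAME.CHARS[2] := 'O';
  NAME.CHARS[3] := 'S';
  NAME.CHARS[4] := 'E';
  NAME.LENGTH := 4;
  (* The caller uses the operation, not the field. *)
  WRITELN('LENGTH IS ', LENGTH(NAME))
END.
```

If the representation later changed, for example to a string terminated by a special character, only the body of LENGTH would need to be rewritten.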
GETLINE OPERATION
We will need to keep track of the position within the string (POS) in
order to store characters into LINE. The value of POS when we leave the
WHILE loop will be the number of characters in the line. LINE[POS + 1]
to LINE[MAXLENGTH] will contain blanks. How do we set the ERROR
flag? IF EOLN is TRUE, all of the characters fit into the string and there
has not been an error. Now the code can be written.
GETSTRING OPERATION
For those of you who have not used the built-in data type called SET, we
will take a side trip and show you how to define and use a set. The follow-
ing declaration statements will create a set type named SETOFPUNCT
and a set variable named ENDSTRING.
TYPE PUNCTMARKS = ' ' .. 'A'; (* This gets most of the punctuation *)
(* marks in ASCII and EBCDIC. *)
If you only want blanks to mark the ends of words, you can use the
following assignment statement:
ENDSTRING := [' ']
IF CH IN ENDSTRING
THEN (* yes, it is *)
ELSE (* no, it is not *)
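The following sketch puts the set test to work by counting the words in a short line. It uses SET OF CHAR for simplicity, which is an assumption: some Pascal implementations limit the size of a set and would require a subrange base type instead. The sample line and the variable names are hypothetical.

```pascal
PROGRAM SETDEMO (OUTPUT);
(* Sketch only: using a set of terminator characters with IN. *)
TYPE
  SETOFPUNCT = SET OF CHAR;       (* assumed base type *)
VAR
  ENDSTRING : SETOFPUNCT;
  LINE      : PACKED ARRAY[1..11] OF CHAR;
  I, WORDS  : INTEGER;
BEGIN
  ENDSTRING := [' ', ',', '.'];   (* characters that end a word *)
  LINE := 'RED, GREEN.';
  WORDS := 0;
  FOR I := 2 TO 11 DO
    (* A word ends where a terminator follows a non-terminator. *)
    IF LINE[I] IN ENDSTRING
      THEN IF NOT (LINE[I - 1] IN ENDSTRING)
             THEN WORDS := WORDS + 1;
  WRITELN('WORDS FOUND: ', WORDS)
END.
```

The test CH IN ENDSTRING replaces what would otherwise be a chain of comparisons joined by OR, and the set of terminators can be changed with a single assignment.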
GETSTRING
READ a character
WHILE character not in ENDSTRING AND more room in word DO
STORE character in string
READ character
SET length of string
PAD word with blanks
SET error flag
IF error
READ while character not in ENDSTRING
We need an index here to play the same role as POS in Procedure GET-
LINE. We can tell if there has been an error by checking to see if the last
character read (CH) is in ENDSTRING. If not, we know that there is an
error condition.
POS := 0;
READ(DATA, CH); (* Get first character. *)
ERROR := NOT (CH IN ENDSTRING); (* Set error flag. *)
WHILE NOT (CH IN ENDSTRING) DO
READ(DATA, CH) (* Word is longer than MAXLENGTH so read till end. *)
END; (* getstring *)
PRINT OPERATION
This operation is very simple. The string will be printed exactly as it is,
character by character, using procedure WRITE. No WRITELN is issued.
BEGIN (* print *)
FOR POS := 1 TO STRING.LENGTH DO
WRITE(STRING.CHARS[POS])
END; (* print *)
SUBSTRING OPERATION
STARTPOS + NUM - 1
should give us this position. Let's try several values to convince ourselves
that we are right. If NUM is 1, STARTPOS is the last (and only) character to
move. If NUM is 3, STARTPOS + 2 is the last position to be copied. If
END; (* substring *)
CONCAT OPERATION
The CONCAT operation "adds" one string to the end of a second string.
Note that we have only two strings as parameters, not three. We will
leave the result in the first string. That is, we will copy the characters of the
second string, one at a time, into the first string beginning at position
STRING1.LENGTH + 1. (See Figure 2-8.)
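A minimal sketch of this copying step follows. It is not the specification's own code: the error rule (the combined length must not exceed MAXLENGTH), the value of MAXLENGTH, the definition of INDEXRANGE, and the sample strings are all assumptions made so that the sketch is self-contained.

```pascal
PROGRAM CONCATDEMO (OUTPUT);
(* Sketch only: append STRING2 to the end of STRING1. *)
CONST
  MAXLENGTH = 10;                 (* assumed value *)
TYPE
  INDEXRANGE = 0..MAXLENGTH;      (* assumed definition *)
  STRINGTYPE = RECORD
                 LENGTH : INDEXRANGE;
                 CHARS  : ARRAY[1..MAXLENGTH] OF CHAR
               END; (* record *)
VAR
  FIRST, SECOND : STRINGTYPE;
  ERROR : BOOLEAN;
  I : INTEGER;

PROCEDURE CONCAT (VAR STRING1 : STRINGTYPE;
                  STRING2 : STRINGTYPE;
                  VAR ERROR : BOOLEAN);
(* Assumed rule: ERROR is TRUE if the result would not fit. *)
VAR
  POS : INDEXRANGE;
BEGIN
  ERROR := STRING1.LENGTH + STRING2.LENGTH > MAXLENGTH;
  IF NOT ERROR
    THEN
      BEGIN
        (* Copy one character at a time, starting at LENGTH + 1. *)
        FOR POS := 1 TO STRING2.LENGTH DO
          STRING1.CHARS[STRING1.LENGTH + POS] := STRING2.CHARS[POS];
        STRING1.LENGTH := STRING1.LENGTH + STRING2.LENGTH
      END
END;

BEGIN
  FIRST.LENGTH := 3;
  FIRST.CHARS[1] := 'R';  FIRST.CHARS[2] := 'E';  FIRST.CHARS[3] := 'D';
  SECOND.LENGTH := 4;
  SECOND.CHARS[1] := 'R';  SECOND.CHARS[2] := 'O';
  SECOND.CHARS[3] := 'S';  SECOND.CHARS[4] := 'E';

  CONCAT(FIRST, SECOND, ERROR);
  IF ERROR
    THEN WRITELN('RESULT WOULD NOT FIT')
    ELSE
      BEGIN
        FOR I := 1 TO FIRST.LENGTH DO
          WRITE(FIRST.CHARS[I]);
        WRITELN
      END
END.
```

Note that the two words are run together in the result; the operation adds no blank between them.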
[Figure 2-8: CONCAT, showing STRING1 (ROSES ARE RED) and STRING2 as input, and STRING1 as output]
DELETE OPERATION
There are two distinct algorithms that we could use here. We could take
that part of the original string on the right of the part to be deleted and
move it down to fill in the deleted section. The part freed up could then be
filled with blanks. Another approach would be to take advantage of two
procedures we have already coded: CONCAT and SUBSTRING. We will
use this second approach. We will use SUBSTRING to break the original
string into two parts: the part before the deleted characters and the part
following them. These two strings can then be concatenated to produce the
string we want. Figure 2-9 shows a code walk-through of this algorithm.
INPUT:
STRING:
POS: 11
NUM: 15
LASTPOS: 11 + 15 - 1 = 25
ERROR: 25 > 29 = FALSE
STRING: (* after second call to SUBSTRING *)
STRING:
DELETE
Note that it wasn't necessary to check the value of the error flag returned
from SUBSTRING and CONCAT, since we know that we set up the param-
eters correctly.
INSERT OPERATION
INSERT
SEARCH OPERATION
INPUT:
STRING:
INSTRING:
POS: 11
ERROR: 14 + 9 > MAXLENGTH = FALSE
TSTRING: (* after first call to SUBSTRING *)
STRING:
STRING:
OUTPUT:
STRING: (* after second call to CONCAT *)
The algorithm has two parts. We will keep looking in the main string for
an occurrence of the first letter in the substring. If we never find it, we
know the substring is not there. If we find a match for the first character, we
continue trying to match the next characters. If we match them all, the
search is successful. If we find a character that does not match, we go back
and begin looking for another match for the first character in the substring.
The process stops when the search is successful or when the number of
characters left to check in the main string is less than the length of the
substring.
SEARCH
When a match for the first letter in SUBSTR has been found in STRING,
we check the next letter in SUBSTR with the next position in STRING.
This expands to a second loop nested inside the first:
MATCH ← TRUE
WHILE more letters in SUBSTR AND MATCH
IF the letter at this position of SUBSTR = the letter
at the appropriate position of STRING
THEN increment position
ELSE MATCH ← FALSE
One line of the algorithm may require some explanation. How do we set
MORETOCHECK, the Boolean variable that controls the outer loop?
There is more to check when there are enough characters left in STRING
to warrant continuing the search. That is, when the current line position
in STRING plus the length of SUBSTR minus 1 exceeds the length of
STRING, there is no possibility of a match. When this occurs,
MORETOCHECK is FALSE, and the process ends.
PROCEDURE SEARCH (STRING, SUBSTR : STRINGTYPE;
VAR FOUND: BOOLEAN;
VAR POS : INDEXRANGE;
VAR ERROR: BOOLEAN);
(* Insert specification for proper documentation. *)
BEGIN
SUBPOS := 1;
MATCH := TRUE;
(* Check rest of word. *)
WHILE (SUBPOS < SUBSTR.LENGTH)
AND MATCH DO
(* If it matches, keep comparing. *)
(* Otherwise increment STPOS. *)
IF SUBSTR.CHARS[SUBPOS + 1] =
STRING.CHARS[STPOS + SUBPOS]
THEN SUBPOS := SUBPOS + 1
ELSE (* doesn't match *)
BEGIN
MATCH := FALSE;
STPOS := STPOS + 1
END;
FOUND := MATCH;
END (* first character match *)
IF FOUND
THEN POS := STPOS (* search successful *)
END (* not error *) (* Set POS. *)
END; (* search *)
STRING:
SUBSTR:
STRING is scanned character by character until there is a match for
SUBSTR.CHARS[l].
Now: SUBSTR.CHARS[l] = STRING.CHARS[7]
We try to match SUBSTR.CHARS[2] with STRING.CHARS[8] and the
match fails.
We go back to scanning STRING character by character until we find an-
other match for SUBSTR.CHARS[l].
Now: SUBSTR.CHARS[l] = STRING.CHARS[16]
We try to match the next characters, and the following are the results:
SUBSTR.CHARS[2] = STRING.CHARS[17]
SUBSTR.CHARS[3] = STRING.CHARS[18]
Now: SUBPOS = SUBSTR.LENGTH (3 = 3)
MATCH = TRUE, FOUND = TRUE, POS = 16
Figure 2-11. Walk-Through of Search Operation
COMPARE OPERATION
SUBSTR1 - a string
SUBSTR2 - a string
OUTPUT: COMPARE ← LESS if SUBSTR1 comes before SUBSTR2
alphabetically (is less than)
← EQUAL if SUBSTR1 is identical to SUBSTR2
← GREATER if SUBSTR1 comes after SUBSTR2
alphabetically (is greater than)
ASSUMPTIONS: RELATION is a user-defined data type consisting of
(LESS, EQUAL, GREATER)
Since the strings we are concerned with here are not packed arrays, we
will have to compare them character by character. The first time we find
two characters that are not the same, the order of those two characters de-
termines the order of the two strings. We must set up our loop to start by
comparing the first character of both strings and then to continue compar-
ing characters either until a pair of characters that do not match is found or
until there are no more characters in the shorter string. If the two strings are
of the same length and all of the characters match, they are equal. If all of
the characters match and one string is shorter than the other, the shorter
one comes before the longer one alphabetically.
FUNCTION COMPARE (SUBSTR1, SUBSTR2 : STRINGTYPE) : RELATION;

VAR POS, MINLENGTH : INTEGER;
    STILLMATCH     : BOOLEAN;

BEGIN
  (* MINLENGTH ← length of shorter word. *)
  IF SUBSTR1.LENGTH < SUBSTR2.LENGTH
     THEN MINLENGTH := SUBSTR1.LENGTH
     ELSE MINLENGTH := SUBSTR2.LENGTH;
  POS := 1;
  STILLMATCH := TRUE;
  WHILE (POS <= MINLENGTH) AND STILLMATCH DO
    (* Search for characters that do not match. *)
    IF SUBSTR1.CHARS[POS] = SUBSTR2.CHARS[POS]
       THEN POS := POS + 1 (* Keep comparing. *)
       ELSE
         BEGIN (* two strings do not match *)
           STILLMATCH := FALSE;
           IF SUBSTR1.CHARS[POS] < SUBSTR2.CHARS[POS]
              THEN COMPARE := LESS
              ELSE COMPARE := GREATER
         END;
  IF STILLMATCH (* ran out of characters in shorter string *)
     (* Use length to determine ordering. *)
     THEN IF SUBSTR1.LENGTH = SUBSTR2.LENGTH
             THEN COMPARE := EQUAL
             ELSE IF SUBSTR1.LENGTH < SUBSTR2.LENGTH
                     THEN COMPARE := LESS
                     ELSE COMPARE := GREATER
END; (* compare *)
APPEND OPERATION
CONST
  STRING = ', ';
Of course, we could create such a literal string, but we could not use it
with our string operations, because it is of a different type than our strings.
Pascal would create a packed array of two characters in this case. Since our
strings are records, however, we have to create constant strings by using
APPEND.
Remember, in our concatenation example we ended up with two words
run together. We could have used APPEND to put a blank between those
two words.
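As a rough illustration of building a "constant" string with APPEND, the sketch below creates a two-character separator one character at a time. The record layout uses the field names LENGTH and CHARS seen in the string routines above, but MAXLENGTH, the parameter name LINE, and this particular APPEND heading (no error parameter) are assumptions made for the example, not the specification from the text.

```pascal
PROGRAM APPENDDEMO (OUTPUT);
(* Sketch only: MAXLENGTH and this APPEND heading are assumptions. *)
CONST MAXLENGTH = 80;
TYPE  STRINGTYPE = RECORD
                     LENGTH : 0..MAXLENGTH;
                     CHARS  : ARRAY[1..MAXLENGTH] OF CHAR
                   END; (* record *)
VAR   SEPARATOR : STRINGTYPE;
      I         : INTEGER;

PROCEDURE APPEND (VAR LINE : STRINGTYPE; CH : CHAR);
(* Adds CH to the end of LINE if there is room; otherwise does nothing. *)
BEGIN
  IF LINE.LENGTH < MAXLENGTH
     THEN
       BEGIN
         LINE.LENGTH := LINE.LENGTH + 1;
         LINE.CHARS[LINE.LENGTH] := CH
       END
END; (* append *)

BEGIN
  SEPARATOR.LENGTH := 0;          (* start with the empty string *)
  APPEND(SEPARATOR, ',');
  APPEND(SEPARATOR, ' ');
  (* Print the string between brackets so the blank is visible. *)
  WRITE('[');
  FOR I := 1 TO SEPARATOR.LENGTH DO
    WRITE(SEPARATOR.CHARS[I]);
  WRITELN(']')
END.
```

Because SEPARATOR is a STRINGTYPE record rather than a packed array literal, it can be passed to the other string operations such as CONCAT.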
Testing: At this point, you have not yet finished coding the string opera-
tions in accordance with the specifications you were given. Before you can
consider the job finished, you will have to demonstrate that these
procedures and functions have been thoroughly tested. Since these routines are
all low-level utility routines, you should use a bottom-up testing strategy.
You must write a driver program that will call or invoke each procedure
and function with varying data values and print the results. Drivers of this
type are so useful that we will digress here and describe a driver program
called a command-driven tester.
The basic structure of the program is the same as the command-driven
system discussed in Chapter 1. The names of the procedures and functions
to be tested become the commands in the system. The task to be done for a
particular command is to execute the procedure or function with the same
name and print out the results.
Since each of the procedures and functions has a string as a parameter,
we will have to read in which string the procedure or function is to manipu-
late. We will declare an array of strings. The input to the tester will be the
name of the procedure or function to be executed followed by an integer
number, which will be used as an index into the array of strings.
If the procedure to be executed needs additional parameters, they must
also be read in. When GETSTRING and GETLINE are the input com-
mands, the data to be read in are on the line immediately following the
command. Therefore, the file name to be given GETSTRING and GETLINE
will be whatever file the commands themselves are coming in on.
The syntax of the input for a command-driven tester for our string
operations is summarized below.
LENGTH, STRING
GETLINE, LINE (* data are on next line *)
GETSTRING, STRING, CH, CH, ... (* data are on next line *)
PRINT, STRING
SUBSTRING, STRING SUBSTR STARTPOS NUM
CONCAT, STRING1 STRING2
DELETE, STRING POS NUM
INSERT, STRING INSTRING POS
SEARCH, STRING SUBSTR
COMPARE, SUBSTR1 SUBSTR2
APPEND, STRING CH
STOP.
GETLINE, 1
GOOD MORNING AMERICA
would cause Procedure GETLINE to be called with STRINGS[1] as its
second parameter (the first parameter would be the file from which the
input is being read). Procedure GETLINE reads the next line of data and
stores it in STRINGS[1].
Note that the heading of Procedure GETLINE is
PROCEDURE GETLINE (VAR DATA : TEXT; VAR LINE : STRINGTYPE;
                   VAR ERROR : BOOLEAN);
The invoking statement in the driver for this example is
GETLINE(DATA, STRINGS[1], ERROR);
Procedure GETLINE expects a string variable of STRINGTYPE as its
second parameter. The driver passes it a row of an array variable,
STRINGS. Is there any problem? No, each row in the array STRINGS is a
string variable of STRINGTYPE. After GETLINE has been executed, the
driver prints the contents of STRINGS[1].
The person who defined the test data would look at the printed output to
see if 'GOOD MORNING AMERICA' was printed. If it was, the GET-
LINE procedure worked correctly on this input string.
Testing using a command-driven tester is divided into two parts. The
first part is to write and debug the command-driven tester. The second part
is to create the input to the tester so that each procedure and function is
thoroughly tested.
The first part requires a large investment of time when you first design
the tester. If your tester is designed in good modular fashion, however, the
second time you need to use it the time investment will be minimal. The
basic structure of the program will remain the same; only the data, such as
the contents of the command table, need to be changed. If you will be
testing a second implementation of the same data type, nothing needs to be
changed.
The second part-designing the test data-will vary from application to
application. You have to make sure each boundary case, each error, and
several general cases are tested. For example, to test the GETLINE opera-
tion thoroughly, we would have to do the following:
1. Input several strings whose lengths were between 1 and MAX-
LENGTH. These are the general cases.
2. Input a string of length 0 and a string of length MAXLENGTH. These
are the boundary cases.
3. Input a string of length greater than MAXLENGTH. This tests the
error condition.
One of the programming assignments for this chapter asks you to finish
the command-driven testing program for the string data type that we have
outlined here. One of the exercises at the end of the chapter asks you to
define the input data necessary to test several of these routines.
Exercises
1. Create a one-dimensional real array whose index type is 1..40.
2. Describe the Pascal accessing function of a one-dimensional array at the logical
level.
Use the following declarations for 3 and 4:
TYPE NUMTYPE = ARRAY[1..5] OF INTEGER;
     LETTYPE = ARRAY['A'..'Z'] OF CHAR;
     FPTYPE = ARRAY[-4..8] OF REAL;
VAR NUM : NUMTYPE;
LET: LETTYPE;
FP : FPTYPE;
3. How much storage is set aside for
(a) NUM
(b) LET
(c) FP
4. If storage for NUM begins in location 0, and LET begins immediately after
NUM, and FP begins immediately after LET, calculate the addresses of the
following elements.
NUM[1]
NUM[3]
NUM[5]
LET['A']
LET['N']
LET['Z']
FP[ -4]
FP[O]
FP[6]
5. Create a two-dimensional array where the index type of the first dimension is
1..10, the index type of the second dimension is 'A'..'Z', and the component
type is CHAR.
6. Create a two-dimensional array where the index type is 1..10, and the
component type is an array whose index type is 'A'..'Z' and component type is
CHAR.
7. Define the Pascal accessing function to a two-dimensional array at the logical
level.
Use the following declarations for 8 and 9:
(a) VALU
(b) TABLE
(c) BOX
9. If the three arrays are assigned consecutively beginning at location 1000, calcu-
late the addresses of the following elements (row major order).
VALU[1,1]
VALU[5,2]
VALU[5,6]
TABLE[1,-2]
TABLE[2,1]
TABLE[3,0]
BOX['A',1]
BOX['Z',2]
BOX['N',3]
10. Define a record at the logical level.
11. Define the Pascal accessing function to a record at the logical level.
Use the following declarations for 12, 13, and 14.
TYPE PEOPLE = RECORD
NAME : ARRAY[1..20] OF CHAR;
BDATE : INTEGER;
AGE : INTEGER;
ADDRESS : ARRAY[1..15] OF CHAR
END; (* record *)
VAR PERSON: PEOPLE;
12. How much storage does each field require?
NAME
BDATE
AGE
ADDRESS
13. If PERSON is assigned beginning at location 50, calculate the addresses of the
following elements.
PERSON.NAME
PERSON.BDATE
PERSON.ADDRESS
PERSON.AGE
14. If CROWD is an array of ten PEOPLE beginning at location 10, calculate the
addresses of the following elements:
CROWD[1].NAME
CROWD[1].NAME[1]
CROWD[5].BDATE
CROWD[4].AGE
CROWD[10].ADDRESS[6]
CROWD[1].NAME[20]
15. Give the general formula for accessing an element in a two-dimensional array
stored in column-major order.
1. Which of the following formulas is used to access an element LIST [I], where
BASE is the address associated with the array LIST [1 .. 100]?
(a) BASE + I
(b) BASE - I
(c) BASE + I - 1
2. Name four data structures.
3. It is the responsibility of the person writing the code to modify specifications as
necessary. T F
4. Limiting communication within a program is called ______.
5. Pascal stores arrays in ______ order, while FORTRAN uses ______
ordering.
6. CASE selectors must be of ______ type.
7. Pascal has the error checking built into the accessing functions. T F
8. At which level do you picture the organization and specify the general access-
ing procedures and functions-abstract, usage, or implementation?
9. Define data structure.
10. List the three distinct ways of looking at a data structure.
11. The formula that gives the Ith position in a one-dimensional array, each of
whose elements is of size N, is ______.
12. Define a two-dimensional array using either of the two methods from the text.
13. Records, a built-in data type, are made up of a finite collection of ______
that are accessed by means of ______.
(b) LETTER
(c) FPNUMBER
15. If NUMBER begins at location 100, and LETTER and FPNUMBER immedi-
ately follow, calculate the addresses of the following elements:
(a) NUMBER[1]
(b) NUMBER[6]
(c) LETTER['D']
(d) FPNUMBER[1,1]
(e) FPNUMBER[2,2]
(f) FPNUMBER[4,3]
Use the following declarations for problems 16 and 17:
TYPE CARTYPE = RECORD
MAKE : ARRAY[1..20] OF CHAR;
MODEL : INTEGER;
COST : REAL;
PRICE : REAL
END; (* record *)
VAR CAR: CARTYPE;
16. How much storage does each field require?
(a) MAKE
(b) MODEL
(c) COST
(d) PRICE
17. If MAKE is stored beginning at location 100, calculate the addresses of the
following elements:
(a) CAR.MAKE
(b) CAR.MAKE[4]
(c) CAR.MODEL
(d) CAR.PRICE
Chapter 3 Stacks
Goals
To be able to hand-simulate stack operations at the logical
level.
To be able to hand-simulate the effect of stack operations on
a particular implementation of a stack.
To be able to encode the basic stack operations given the
implementation specifications.
To be able to determine when a stack is an appropriate data
structure for a specific problem.
To be able to code the solution to a problem for which a
stack is an appropriate data structure.
DATA ENCAPSULATION
Separation of the representation of data from the applications that use
the data at a logical level.
different program, we can just lift out its implementation procedures and
functions. We may be using the data structure for a completely different
application, but the basic accessing functions and operations will remain
the same.
Let's illustrate these important concepts with a simple data structure
called the stack.
WHAT IS A STACK?
Consider the illustrations in Figure 3-1. Although the various pictures are
very different, each illustrates a common concept: the stack.
STACK
A data structure in which elements are added and removed from only
one end; a "last in, first out" (LIFO) structure.
[Figure 3-1: a stack of cafeteria trays, a stack of neatly folded shirts, and a stack of pennies.]
remember this rule of stack behavior: A stack is an LIFO (last in, first out)
list.
To summarize, what is the accessing function for a stack? We retrieve
elements only from the top of the stack. Assignment of new elements to the
stack is also through the top.
OPERATIONS ON STACKS
You need to be familiar with a number of operations in order to use a stack.
You must be able to create or clear a stack; that is, to initialize it to its empty
state. (An empty stack is one that contains no elements.) As mentioned
before, a stack is a dynamic structure, changing first when new elements
are added to the top of the stack (called pushing an element onto the stack)
and second when its top element is removed (called popping the stack).
You must also be able to check whether a stack is empty before you attempt
to pop it. Furthermore, although as a logical data structure a stack is never
full, for a particular implementation you may need to test whether the stack
is full before you try to push another element onto it.
For a moment, let's envision a stack as a stack of building blocks, and see
how the basic PUSH and POP operations affect it.
POP (STACK, X)
means: POP THE TOP BLOCK
AND PUT IT INTO X
POP (STACK, X)
PUSH (STACK, X)
(* Read rest of string and compare characters to those in the stack. *)
REVERSE := TRUE;
WHILE NOT EMPTYSTACK(STACK) DO
  BEGIN
    READ(CH2);
    POP(STACK, CH1);
    IF CH1 <> CH2
       THEN REVERSE := FALSE
  END
END; (* reverse *)
Note that we haven't yet considered how the stack will be implemented.
The details of the implementation are hidden somewhere in the code of the
Procedures PUSH, POP, and CLEARSTACK and the Function EMPTY-
STACK. This observation illustrates information hiding. You don't need to
know how the stack has been implemented to be able to use the PUSH and
POP routines.
We need one more thing before we can implement our stack as an array.
We need to know how to find the top element when we want to POP, and
where to put our new element when we want to PUSH. Remember:
Though we know that we can access any element of an array directly, we
have agreed to the convention "last in, first out" for a stack. We will access
the stack only through the top, not through the bottom or the middle. Rec-
ognizing this distinction from the start is important: Even though the im-
plementation of the stack may be a random-access structure like an array,
the stack itself as a logical entity is not randomly accessed.
There are a number of ways to keep track of the top position. We could
have an integer variable TOP that would indicate the index of the current
top position. However, this scheme would require us to pass TOP as an
additional parameter to Procedures PUSH and POP. It would be better to
find a way to bind both the elements and the top indicator into a single
entity, STACK. This can be accomplished by extending the array to include
one more position, STACK[0], in which we will store the index of the
current top element. So we modify our declarations as follows:
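A minimal sketch of such declarations is given here, wrapped in a tiny program so that it can be compiled on its own. The value MAXSTACK = 100 and the type name STACKTYPE are assumptions chosen for illustration; only the idea of reserving STACK[0] for the top indicator comes from the discussion above.

```pascal
PROGRAM STACKDECL (OUTPUT);
(* Sketch only: MAXSTACK = 100 is an assumed size. *)
CONST MAXSTACK = 100;
TYPE  STACKTYPE = ARRAY[0..MAXSTACK] OF INTEGER;
      (* Position 0 holds the index of the top element; *)
      (* positions 1..MAXSTACK hold the stack elements.  *)
VAR   STACK : STACKTYPE;

PROCEDURE CLEARSTACK (VAR STACK : STACKTYPE);
(* Initializes STACK to its empty state. *)
BEGIN
  STACK[0] := 0
END; (* clearstack *)

BEGIN
  CLEARSTACK(STACK);
  WRITELN('TOP INDEX = ', STACK[0]:1)
END.
```

Binding the top indicator into the array means PUSH and POP need only the stack itself as a parameter, at the cost of requiring the elements to be integers.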
BEGIN (* emptystack *)
  EMPTYSTACK := STACK[0] = 0
END; (* emptystack *)
IF STACK[0] = 0
   THEN EMPTYSTACK := TRUE
   ELSE EMPTYSTACK := FALSE
We said before that the stack as an abstract data structure cannot be full.
Why then are we coding a function to test for a full stack? This function is
made necessary by our choice of implementation, since the array has fixed
bounds.
To add, or PUSH, an element onto the stack is a two-step task:
BEGIN (* push *)
  STACK[0] := STACK[0] + 1;
  STACK[STACK[0]] := NEWELEMENT
END; (* push *)
3 15 25 35 (contains garbage)
4 15 25 35 65 (contains garbage)
To use this operation in a program, we must make sure that the stack is
not already FULL before we call PUSH.
IF NOT FULLSTACK(STACK)
THEN PUSH(STACK, 85);
If the stack is already full when we try to PUSH, the result is called stack
overflow. Error checking for overflow may be handled in different ways.
We could test for overflow inside our PUSH procedure, instead of in the
calling program. We might add a Boolean variable, OVERFLOW, to the
formal parameter list. The revised algorithm would be
IF stack is full
   THEN OVERFLOW ← TRUE
   ELSE OVERFLOW ← FALSE
        increment top indicator
        STACK[top indicator] ← new element
Which version of PUSH you decide to use may depend on the
specifications, especially if the utility procedures are being written by different
programmers, as often happens on a large program. Since the interface
differs in the number of parameters, it is important to establish whose
responsibility it is to check for overflow.
Try writing Procedure TESTANDPUSH yourself.
We want to POP the stack. The value in STACK[0] tells us that the top
element is stored in STACK[3]. First the top element is popped from
STACK[3]. Then the top indicator (STACK[0]) is decremented, giving us
the following:
Note that after popping, 38 is still stored in the third element place in the
array, but we cannot access it through our stack. The stack only contains
two elements.
To execute the POP operation illustrated above, we must first test for an
empty stack, and then call our POP procedure:
IF NOT EMPTYSTACK(STACK)
THEN POP(STACK, X)
When the stack is empty and we try to POP it, the resulting condition is
called stack underflow. Obviously, the test for underflow could also be
written into the POP operation. The algorithm of POP would be modified
slightly, to return a Boolean variable, UNDERFLOW, in addition to the
popped element.
IF stack is empty
   THEN UNDERFLOW ← TRUE
   ELSE UNDERFLOW ← FALSE
        popped element ← STACK[top indicator]
        decrement top indicator
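The algorithm above might be coded along the following lines for the array implementation that keeps the top indicator in STACK[0]. This is a sketch, not the text's own listing: MAXSTACK, the parameter name POPPEDELEMENT, and the test values are assumptions made for the example.

```pascal
PROGRAM POPDEMO (OUTPUT);
(* Sketch only: MAXSTACK and POPPEDELEMENT are assumed names. *)
CONST MAXSTACK = 100;
TYPE  STACKTYPE = ARRAY[0..MAXSTACK] OF INTEGER;
VAR   STACK     : STACKTYPE;
      X         : INTEGER;
      UNDERFLOW : BOOLEAN;

FUNCTION EMPTYSTACK (STACK : STACKTYPE) : BOOLEAN;
BEGIN
  EMPTYSTACK := STACK[0] = 0
END; (* emptystack *)

PROCEDURE PUSH (VAR STACK : STACKTYPE; NEWELEMENT : INTEGER);
(* Assumes the caller has already checked that the stack is not full. *)
BEGIN
  STACK[0] := STACK[0] + 1;
  STACK[STACK[0]] := NEWELEMENT
END; (* push *)

PROCEDURE POP (VAR STACK : STACKTYPE; VAR POPPEDELEMENT : INTEGER;
               VAR UNDERFLOW : BOOLEAN);
(* Removes the top element unless the stack is empty. *)
BEGIN
  IF EMPTYSTACK(STACK)
     THEN UNDERFLOW := TRUE
     ELSE
       BEGIN
         UNDERFLOW := FALSE;
         POPPEDELEMENT := STACK[STACK[0]];
         STACK[0] := STACK[0] - 1
       END
END; (* pop *)

BEGIN
  STACK[0] := 0;                 (* empty stack *)
  PUSH(STACK, 15);
  PUSH(STACK, 25);
  POP(STACK, X, UNDERFLOW);      (* X receives 25 *)
  WRITELN(X:1, ' ', UNDERFLOW);
  POP(STACK, X, UNDERFLOW);      (* X receives 15 *)
  POP(STACK, X, UNDERFLOW);      (* stack is empty: underflow *)
  WRITELN(UNDERFLOW)
END.
```

When underflow is reported, POPPEDELEMENT is left unchanged, so the caller should test UNDERFLOW before using the value.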
This operation is similar to POP, but it doesn't change the stack in any
way. The top indicator is not modified. This function could also be written
with an internal test for underflow.
Note that our implementation of a stack as an array with the top indicator in
the first position takes advantage of the fact that the elements in the stack
are of the same type (integer) as the top indicator. If the elements in the
stack are to be of another type (real numbers, for instance), another method
of keeping track of the top of the stack must be used. One such method is to
make STACKTYPE a record with two fields: ELEMENTS (an array of ele-
ments) and TOP (an integer index to the array).
We will also have to change the headers of all the stack operation
procedures and functions that specify the element type in the parameters. For
instance,
POP and STACKTOP will have to be modified in the same way. We can get
around having to make these trivial changes throughout the program by
adding a type ELTYPE (element type) in our declarations.
All of the procedure and function headings will use the type ELTYPE for
stack elements; for example,
Now a change in element type will result in changes only to the
declarations of the program. This feature makes our program more easily
modifiable.
A stack represented with these declarations might have the following
utility routines:
BEGIN (* push *)
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := NEWELEMENT
END; (* push *)
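For reference, the record-based representation described above can be sketched as a complete program. The field names ELEMENTS and TOP and the type names STACKTYPE and ELTYPE come from the discussion; MAXSTACK, the choice of REAL for ELTYPE, the parameter name POPPEDELEMENT, and the routine bodies other than PUSH are assumptions made for this illustration.

```pascal
PROGRAM RECSTACK (OUTPUT);
(* Sketch only: MAXSTACK and the choice ELTYPE = REAL are assumptions. *)
CONST MAXSTACK = 100;
TYPE  ELTYPE    = REAL;          (* change only this line for other elements *)
      STACKTYPE = RECORD
                    ELEMENTS : ARRAY[1..MAXSTACK] OF ELTYPE;
                    TOP      : 0..MAXSTACK
                  END; (* record *)
VAR   STACK : STACKTYPE;
      X     : ELTYPE;

PROCEDURE CLEARSTACK (VAR STACK : STACKTYPE);
BEGIN
  STACK.TOP := 0
END; (* clearstack *)

FUNCTION EMPTYSTACK (STACK : STACKTYPE) : BOOLEAN;
BEGIN
  EMPTYSTACK := STACK.TOP = 0
END; (* emptystack *)

FUNCTION FULLSTACK (STACK : STACKTYPE) : BOOLEAN;
BEGIN
  FULLSTACK := STACK.TOP = MAXSTACK
END; (* fullstack *)

PROCEDURE PUSH (VAR STACK : STACKTYPE; NEWELEMENT : ELTYPE);
BEGIN
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := NEWELEMENT
END; (* push *)

PROCEDURE POP (VAR STACK : STACKTYPE; VAR POPPEDELEMENT : ELTYPE);
BEGIN
  POPPEDELEMENT := STACK.ELEMENTS[STACK.TOP];
  STACK.TOP := STACK.TOP - 1
END; (* pop *)

BEGIN
  CLEARSTACK(STACK);
  IF NOT FULLSTACK(STACK)
     THEN PUSH(STACK, 2.5);
  IF NOT FULLSTACK(STACK)
     THEN PUSH(STACK, 7.25);
  IF NOT EMPTYSTACK(STACK)
     THEN
       BEGIN
         POP(STACK, X);          (* last in, first out: X receives 7.25 *)
         WRITELN(X:6:2)
       END
END.
```

Changing ELTYPE to CHAR or INTEGER requires no change to the bodies of the utility routines, which is the flexibility the text is after.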
Summary
We have defined a stack at the logical level as an abstract data type and
discussed two implementations that use arrays to contain the stack
elements. The first used an extra slot in the array to store the index of the top
element of a stack of integers. The second implementation used a record to
separate the representations of the top indicator and the elements of the
stack.
Which of these two implementations is better? Certainly the second one
is more flexible, since it allows the type of the elements in the stack to be
changed without affecting the code of the stack utility routines. This makes
the utilities portable; they could be used in another program for a com-
pletely different purpose. The first implementation is rather rigid; we are
limited to stacks whose elements are of the same type as the array index.
of the operators to tell us which part to evaluate first, but simply rely on the
location of the parentheses). The innermost level of parentheses indicates
which part of the expression must be evaluated first, and we work outward
from there.
The second question, involving the storage of intermediate values, sug-
gests that we must design an appropriate data structure for our solution. If
we knew how many intermediate values would be produced, we could
declare temporary variables to hold them (TEMPI, TEMP2, ... TEMPN).
But, obviously, the number of intermediate operands produced will vary
from expression to expression. Luckily, we know of an ideal data structure
for saving dynamically changing values for later processing-the stack.
Let's consider how we can use a pair of stacks to evaluate the fully paren-
thesized expression
Z = (((Y - 1) / X) * (X + Y))
We will use one stack to store the operators and a second to store the
operands. Assuming for now that the expression is fully and correctly pa-
renthesized, we will ignore the left parentheses ("("). As we pass through
the expression from left to right, we push each element onto the
appropriate stack, until we come to a right parenthesis (")"). Figure 3-2 shows how
the two stacks will look when we come to the first right parenthesis. At this
point, we have reached the innermost level of parentheses (for this term, at
least), and we can perform the first operation. Where are the two operands?
They should be the last and next-to-last values pushed onto the operand
stack. Where is the operator? It should be the top element on the operator
stack. To evaluate the first intermediate operand, we pop the top two
elements from the operand stack; these become OPERAND2 and OPERAND1,
respectively. Then we pop the top element from the operator stack
and perform the appropriate operation, 5 - 1, producing the value 4. Note
that Y - 1 = 5 - 1 = 4 is an intermediate step in the evaluation of the larger
expression. The result of this step, 4, will be one of the operands in the next
step, calculating (Y - 1)/X. Where do we put operands? That's right, on the
operand stack, so we PUSH 4 onto the operand stack.
Figure 3-2. [Stack diagram for Z = (((Y - 1) / X) * (X + Y)) with X = 2 and Y = 5: the operand and operator stacks before and after evaluating (Y - 1).]
Figure 3-4 shows the processing that occurs when we reach the next
right parenthesis. This cycle is continued until we reach the end of the
expression (Figure 3-5). At that point the operator stack should be empty,
and the operand stack will contain only one value-the evaluated result of
the whole expression.
Let's use this strategy to write a program for a very simple expression
calculator.
Figure 3-4. [Stack diagram for Z = (((Y - 1) / X) * (X + Y)) with X = 2 and Y = 5: the operand and operator stacks before and after evaluating ((Y - 1) / X).]
Figure 3-5. [Stack diagram for the same expression: the operand and operator stacks while evaluating (X + Y) and then (((Y - 1) / X) * (X + Y)).]
(varname) : (expression);
where (varname) consists of a single letter, (real number) is a real number
literal (like 3.5), and (expression) is a string representing a fully and
correctly parenthesized expression made up of operators (the characters +, -,
*, and /) and varnames. (Note that literal constants cannot be used in an
expression in our simple example, only varnames like X and Y.)
At least one blank (maybe more) will separate each element of an
assignment statement. Each assignment will terminate with a semicolon.
Note that assignments of literal values to varname are indicated by the =
operator, while assignments of expressions are indicated by a colon. We are
using two different assignment operators to simplify the parsing (breaking
into component parts) of the statement. There will be exactly one
assignment statement per line, and the last line will contain only the character o.
Examples of valid assignment statements are
X = 5.0;
Y = 92.34;
Z:((X-Y)*(X+Y));
Assumptions: For the sake of simplicity, we will allow the following
assumptions:
1. The expressions will be fully and correctly parenthesized.
2. The assignment statements will be valid forms.
3. The operations in expressions will be valid at run time. This means
that we will not try to divide by 0. We will also assume that any
varname that has not been defined before it is used as a term in an
expression will be given the value of 0.0.
We have put these tremendous limitations on the input in order to
concentrate on the processing of the expression evaluation using stacks. Since
we will test and use this expression evaluator as an interactive tool (i.e., all
the input will come from the keyboard), these assumptions are pretty
unreasonable. We will come back to this point later. For now we will allow
these restrictions so that we can develop the algorithms of interest to a stack
user.
Data Structures: We have already seen that we will need to use a pair of
stacks to hold intermediate values of operands and operators. Let's assume
that someone was nice enough to prepare all the necessary stack utilities
for our use. (We may not need to use all of them.)
For the operand stack:
We will need one other data structure for the processing of this program.
We need a place to store the values assigned to various varnames, since
these values will be required in the subsequent evaluation of expressions
that use them. For our simple example, the range of varnames is limited to
the characters of the alphabet, so we can use the varname itself as an index
to an array of real values:
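One possible way to declare such a list is sketched below as a small program. The names VALUES and VARTYPE appear in the design and listing fragments; the type name VALUELIST and the sample assignments are assumptions made for the example.

```pascal
PROGRAM VALUESDEMO (OUTPUT);
(* Sketch only: VALUELIST is an assumed type name. *)
TYPE  VARTYPE   = 'A'..'Z';
      VALUELIST = ARRAY[VARTYPE] OF REAL;
VAR   VALUES : VALUELIST;
      INDEX  : VARTYPE;

BEGIN
  (* Every varname starts with the value 0.0. *)
  FOR INDEX := 'A' TO 'Z' DO
    VALUES[INDEX] := 0.0;
  (* The varname itself is the index, so no searching is needed. *)
  VALUES['X'] := 5.0;
  WRITELN('X IS ', VALUES['X']:8:2);
  WRITELN('Q IS ', VALUES['Q']:8:2)
END.
```

Storing a value and retrieving it are each a single indexed access, which is why the design later isolates them in SAVEVALUE and RETRIEVE.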
Top-Down Design:
MAIN Level 0
INITIALIZE
WHILE FLAG not stop DO
PROCESSLINE
print final message
INITIALIZE will set the data structures to their starting values and set
FLAG to some value other than stop.
INITIALIZE Level 1
GETVARNAME
IF VARNAME is stop marker CO')
THEN FLAG ← stop
ELSE GETVALUE
INITVALUES Level 2
FOR INDEX := 'A' TO 'Z' DO
   VALUES[INDEX] ← 0.0
GETVARNAME Level 2
GETCHAR Level 3
SAVEVALUE Level 3
EVALUATE Level 3
GETCHAR (TOKEN)
WHILE TOKEN is not ENDEXPRESS (';') DO
   CASE TOKEN is
      'A'..'Z' : RETRIEVE value of VARNAME
                 push retrieved value onto stack of operands
      '+', '-',
      '*', '/' : push TOKEN onto stack of operators
      '('      : do nothing
      ')'      : PERFORM appropriate operation
   END CASE
   GETCHAR (TOKEN)
pop operands stack to get final value of expression
RETRIEVE Level 4
PERFORM gets the next two operands and the next operator from their
respective stacks, performs the indicated operation, and pushes the result
back onto the operand stack.
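The EVALUATE and PERFORM designs can be tried out with the following sketch. It is not the program from the text: the expression is a fixed packed array instead of keyboard input, and MAXSTACK, EXPRLENGTH, EXPRTYPE, NEXT, OPER, ANSWER, and the routine names PUSHREAL, POPREAL, PUSHCHAR, and POPCHAR are assumptions made for the example. It also relies on a subrange in the CASE label list, which most Pascal compilers accept but which is an extension to the standard.

```pascal
PROGRAM EVALDEMO (OUTPUT);
(* Sketch only: a fixed expression stands in for keyboard input. *)
CONST MAXSTACK   = 50;
      EXPRLENGTH = 14;
TYPE  EXPRTYPE  = PACKED ARRAY[1..EXPRLENGTH] OF CHAR;
      REALSTACK = RECORD
                    TOP      : 0..MAXSTACK;
                    ELEMENTS : ARRAY[1..MAXSTACK] OF REAL
                  END; (* record *)
      CHARSTACK = RECORD
                    TOP      : 0..MAXSTACK;
                    ELEMENTS : ARRAY[1..MAXSTACK] OF CHAR
                  END; (* record *)
VAR   OPERANDS  : REALSTACK;
      OPERATORS : CHARSTACK;
      VALUES    : ARRAY['A'..'Z'] OF REAL;
      EXPR      : EXPRTYPE;
      NEXT      : INTEGER;
      TOKEN     : CHAR;
      ANSWER    : REAL;

PROCEDURE PUSHREAL (VAR STACK : REALSTACK; NEWELEMENT : REAL);
BEGIN
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := NEWELEMENT
END; (* pushreal *)

PROCEDURE POPREAL (VAR STACK : REALSTACK; VAR POPPED : REAL);
BEGIN
  POPPED := STACK.ELEMENTS[STACK.TOP];
  STACK.TOP := STACK.TOP - 1
END; (* popreal *)

PROCEDURE PUSHCHAR (VAR STACK : CHARSTACK; NEWELEMENT : CHAR);
BEGIN
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := NEWELEMENT
END; (* pushchar *)

PROCEDURE POPCHAR (VAR STACK : CHARSTACK; VAR POPPED : CHAR);
BEGIN
  POPPED := STACK.ELEMENTS[STACK.TOP];
  STACK.TOP := STACK.TOP - 1
END; (* popchar *)

PROCEDURE PERFORM;
(* Pops two operands and one operator, applies the operator, *)
(* and pushes the result back onto the operand stack.        *)
VAR OPERAND1, OPERAND2, ANSWER : REAL;
    OPER : CHAR;
BEGIN
  POPREAL(OPERANDS, OPERAND2);   (* last value pushed *)
  POPREAL(OPERANDS, OPERAND1);   (* next-to-last value pushed *)
  POPCHAR(OPERATORS, OPER);
  CASE OPER OF
    '+' : ANSWER := OPERAND1 + OPERAND2;
    '-' : ANSWER := OPERAND1 - OPERAND2;
    '*' : ANSWER := OPERAND1 * OPERAND2;
    '/' : ANSWER := OPERAND1 / OPERAND2
  END; (* case *)
  PUSHREAL(OPERANDS, ANSWER)
END; (* perform *)

BEGIN
  OPERANDS.TOP := 0;
  OPERATORS.TOP := 0;
  FOR TOKEN := 'A' TO 'Z' DO
    VALUES[TOKEN] := 0.0;
  VALUES['A'] := 5.0;
  VALUES['B'] := 3.0;
  EXPR := '((A+B)*(A-B));';      (* exactly EXPRLENGTH characters *)
  NEXT := 1;
  TOKEN := EXPR[NEXT];
  WHILE TOKEN <> ';' DO
    BEGIN
      CASE TOKEN OF
        'A'..'Z'           : PUSHREAL(OPERANDS, VALUES[TOKEN]);
        '+', '-', '*', '/' : PUSHCHAR(OPERATORS, TOKEN);
        '('                : ;   (* do nothing *)
        ')'                : PERFORM
      END; (* case *)
      NEXT := NEXT + 1;
      TOKEN := EXPR[NEXT]
    END;
  POPREAL(OPERANDS, ANSWER);     (* final value of the expression *)
  WRITELN('X IS ', ANSWER:8:2)
END.
```

With A = 5.0 and B = 3.0 the three right parentheses trigger 5 + 3, then 5 - 3, then 8 * 2, so the value left on the operand stack is 16.00.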
PERFORM Level 4
There are a couple of points worth noting in this top-down design. You
may have wondered why we pushed the details of SAVEVALUE and RE-
TRIEVE to lower levels in the design, even though they each "expanded"
to only a single line. Why didn't we just put that line directly into the
design? Note that these two tasks involve the manipulation of the array
VALUES, the data structure in which we are storing the designated or cal-
culated values of VARNAME. By separating this function into a lower
level, we have tried to encapsulate the data structure. Why should we
bother to do this, since we already decided how to implement this list of
values? We are trying to make the design more easily modifiable. What
happens if we decide to allow VARNAME to be a string of characters,
instead of a single character? We cannot index the array of values by a string,
and thus we will have to change the whole implementation of this list. By
moving the part of the design that touches this data structure into a lower
level, we can try to limit the changes that would result from a modification
to the program.
We have accomplished this same flexibility by manipulating the various
stacks through the stack utilities listed above. Do you know from the design
how the stack of operators or the stack of operands has been implemented?
The algorithm and data structures are now sufficiently defined to allow
us to write our calculator program. Because of the brevity of the top level,
we have combined the Level 0 and Level 1 designs into the main program.
CHARSTACK = RECORD
              TOP : INTEGER;
              ELEMENTS : ARRAY[1..MAXSTACK] OF CHAR
            END; (* record *)
FLAG : FLAGTYPE;
VARNAME : VARTYPE;
PROCEDURE CLEARREAL (VAR STACK : REALSTACK);
(* Sets the stack of real numbers to empty. *)
BEGIN (* clearreal *)
  STACK.TOP := 0
END; (* clearreal *)
BEGIN
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := NEWELEMENT
END;
BEGIN (* getvarname *)
END; (* getvarname *)
PROCEDURE PERFORM;
(* Performs the next operation in the expression evaluation, and *)
(* leaves the result at the top of the operands stack.           *)
BEGIN (* perform *)
(* perform *)
GETCHAR(TOKEN);
(* case *)
(* evaluate *)
READLN;
WRITELN(VARNAME, ' IS ', NEWVALUE:8:2); (* Print message. *)
WRITELN
(* Initialize. *)
CLEARREAL(OPERANDS); (* Set the stacks to empty. *)
CLEARCHAR(OPERATORS);
INITVALUES(VALUES); (* Set values of all varnames to 0. *)
FLAG := OK;
WRITELN('GOODBYE. COME BACK SOON.') (* Print final message. *)
our calculator program. We first examine the top level of the program. To
make this program as robust as possible, we will specify that if there is an
input error anywhere in a line, we will stop processing that line, print an
appropriate message, and go on to the next line. What is involved in
recovering from an error within a line? First of all, we will want to get rid of the
rest of the line. That's easy; a simple READLN will take care of it. What
about the program's data, the stacks and the values list? Normally, if there
are no errors, both stacks end up empty at the end of an expression evalua-
tion. However, if we stop processing midway through the expression, there
may be elements left over in the stacks. To get rid of these data, we will
need to reset both stacks to their empty states, using the CLEAR routines.
Should we also reinitialize the list VALUES? No, the values that are stored
in this list have been assigned as the result of successful lines of input, so
we will leave them alone.
ROBUSTNESS
The ability of a program to recover to a known state following an error.
We can rewrite the top level of the program with error checking by add-
ing a third value, ERROR, to FLAGTYPE and letting Procedure GET-
VALUE return an additional parameter, FLAG.
What conditions in GETVALUE will set FLAG to ERROR? First of all,
an error may result from bad input. In general, it is a good idea to check any
input, especially when it comes from the keyboard. (The safest approach-
although it is not particularly convenient-is to read in everything as
CHAR data and to convert it yourself to the appropriate type.)
One kind of input situation that may cause errors is seen in Procedure
GETVALUE. We check the assignment operator, a CHAR variable, to see if
it is '='; if not, we assume that it is ':'. What if, in fact, ASSIGN is neither '='
nor ':'? Since it is not '=', we take the ELSE clause and process the rest of
the line as an expression. It would be safer to check explicitly for each
value; if ASSIGN is neither '=' nor ':', we set the error flag:
IF ASSIGN = '='
  THEN
  ELSE
    IF ASSIGN = ':'
      THEN
      ELSE FLAG := ERROR
Other input errors occur if the parentheses are not correct. Let's see what
happens, for instance, if the expression is not fully parenthesized. The
input lines
A = 5.0;
B = 3.0;
X:((A+B)*(A-B);
will produce the result "X IS 2.00" instead of "X IS 16.00". The last opera-
tion is never performed. (Try it yourself.)
We can check for matching parentheses by keeping a counter, PARCOUNT.
At the beginning of an expression evaluation we set PARCOUNT
to 0, then add 1 to it each time we encounter a left parenthesis and subtract
1 each time we come across a right parenthesis. If PARCOUNT is not 0
when we get to the end of the expression, there has been an error in the
input, and we must reject the line.
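The counting scheme just described can be sketched as a small stand-alone program. This is only an illustration, not part of Program CALCULATOR: the program name PARCHECK, the variable LINE, and the use of the STRING type and LENGTH function (extensions found in Turbo Pascal and Free Pascal, not in standard Pascal) are all assumptions made for the example.

```pascal
PROGRAM PARCHECK;
(* Sketch: count the parentheses in one expression held in a string. *)
(* LINE and the STRING type are assumptions made for this example.   *)

VAR
  LINE     : STRING;
  PARCOUNT : INTEGER;
  I        : INTEGER;

BEGIN
  LINE := '((A+B)*(A-B)';          (* sample input; one ')' is missing *)
  PARCOUNT := 0;
  FOR I := 1 TO LENGTH(LINE) DO
    IF LINE[I] = '('
      THEN PARCOUNT := PARCOUNT + 1
      ELSE
        IF LINE[I] = ')'
          THEN PARCOUNT := PARCOUNT - 1;
  (* A nonzero count at the end of the expression means an input error. *)
  IF PARCOUNT = 0
    THEN WRITELN('PARENTHESES BALANCE')
    ELSE WRITELN('ERROR: UNMATCHED PARENTHESES')
END.
```

For the sample line the count ends at 1, so the error message is printed. A stricter version could also reject the line as soon as PARCOUNT drops below 0, which catches a right parenthesis that appears before its matching left parenthesis.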
What about expressions like (A + B * C)? The parentheses match, but
there aren't enough of them. Again, in this case, one of the operations
would not be performed. We can check for this situation also: There should
be as many sets of parentheses as there are operators; if there are not, an
error has occurred.
We can't prevent errors in the input, especially input from a keyboard,
but we can try to limit their effect on the continuing execution of the pro-
gram.
A second source of run-time errors is software limitations like stack over-
flow. In this case, there is nothing really wrong with either the input or the
logic of the program, but design decisions like the size of the data struc-
tures may impede the execution. We discussed in this chapter the need to
check for stack overflow; in this program it might occur if the expression is
complicated and requires the stacking of more values than the stack has
been declared to hold. To implement the error checking in this program,
we would use the FULLREAL and FULLCHAR functions that have been
provided. If one of them returned a TRUE value, we would need to send an
error message, reset the stacks to empty, and skip to the next line. What
good would a "software error: stack overflow" message be to the user of
this program? If the user consistently got this message, it would be clear
that the software needed modification in order to be useful. A programmer
might be called in to "tune" the size of the data structure to make it fit the
needs of the user.
We have mentioned several ideas for making Program CALCULATOR a
more reliable piece of software. When we add error checking, we decrease
the likelihood that the program will fail at run time. By increasing the
robustness of the program, we also increase the satisfaction of its user.
One of the things that made our calculator so clumsy to use was the need for
parentheses to tell us the order of evaluation. The way that we are used to
seeing expressions is called infix notation: the operator is in between the
operands. Infix notation can be fully parenthesized or it can rely on a
scheme of operator precedence, as well as the use of parentheses to over-
ride the rules, to express the order of evaluation within an expression. For
instance, the multiplication operators * and / usually have a higher prece-
dence than the addition operators + and -. The use of a precedence
scheme like this reduces the need for parentheses to situations where we
want to override the order imposed by the scheme. For example, if we want
to multiply first,
A+B*C
is sufficient to express the correct order of evaluation of the expression.
However, if we want to do the addition before the multiplication (breaking
the rule), we must indicate this with parentheses:
(A + B) * C
For one of the programming exercises in this chapter you are asked to
develop a program that incorporates a precedence scheme into the expres-
sion evaluation.
The problem with infix notation is its ambiguity. We must resort to an
agreed-upon scheme to determine how to evaluate the expression. There
are other ways of writing expressions, however, that do not require paren-
theses or such precedence schemes. We will briefly describe two of them
here, and then show how we can convert from infix to another notation with
the help of our new friend, the stack.
Prefix Notation
In prefix notation, the operator precedes the operands. For instance, we
would write the infix expression "A + B" as "+ A B". The infix expression
(A + B) * C
7 * 4 = 28
Now we are at the end of the expression, and we are left with the single
value 28 as the result of the expression evaluation.
Use this approach to show that the prefix expression
*-+435/+243
equals 4.
We will leave to the reader the development of this algorithm into a
procedure. Hint: Note that using the most recent operator implies a last-in-
first-out type of solution.
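As a hedged illustration of the last-in-first-out idea, here is one possible sketch of a prefix evaluator. It does not follow the hand method above step by step; instead it uses a common alternative technique, scanning the expression from right to left and stacking operands as it goes. The program name PREFIXEVAL, all variable names, the restriction to single-digit operands, the use of integer division (DIV) for '/', and the STRING type (a Turbo Pascal and Free Pascal extension) are assumptions made for the example. The expression is assumed to be well formed, so no error checking is done.

```pascal
PROGRAM PREFIXEVAL;
(* Sketch: evaluate a prefix expression of single-digit operands *)
(* by scanning it from right to left with a stack of integers.   *)

CONST MAXSTACK = 50;

VAR
  STACK : ARRAY[1..MAXSTACK] OF INTEGER;
  TOP   : INTEGER;
  I     : INTEGER;
  LEFT  : INTEGER;
  RIGHT : INTEGER;
  EXPR  : STRING;
  CH    : CHAR;

BEGIN
  EXPR := '*-+435/+243';
  TOP := 0;
  FOR I := LENGTH(EXPR) DOWNTO 1 DO
    BEGIN
      CH := EXPR[I];
      IF (CH >= '0') AND (CH <= '9')
        THEN
          BEGIN                  (* operand: push its numeric value *)
            TOP := TOP + 1;
            STACK[TOP] := ORD(CH) - ORD('0')
          END
        ELSE
          BEGIN                  (* operator: pop two, apply, push result *)
            LEFT := STACK[TOP];          (* most recently pushed = left operand *)
            RIGHT := STACK[TOP - 1];
            TOP := TOP - 1;
            CASE CH OF
              '+' : STACK[TOP] := LEFT + RIGHT;
              '-' : STACK[TOP] := LEFT - RIGHT;
              '*' : STACK[TOP] := LEFT * RIGHT;
              '/' : STACK[TOP] := LEFT DIV RIGHT
            END
          END
    END;
  WRITELN('RESULT IS ', STACK[TOP])
END.
```

Run on the expression from the text, the sketch leaves the single value 4 on the stack. Because the scan runs right to left, the first value popped for each operator is its left operand, which matters for '-' and '/'.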
How can we convert an infix expression into a prefix expression? Our
solution to this question will make use of the stack routines from this chapter.
Since this module directly codes into a Pascal procedure, using PUSH,
POP, and CONCAT, we will not include the code here.
Postfix Notation
In an alternative way of writing expressions, postfix notation, the operator
follows the operands. For instance, the infix expression A + B would be
written as the postfix expression A B +. The infix expression
(A + B) * C
which requires parentheses to indicate the order of evaluation, would be
written as the postfix expression
AB+C*
You should note three features of postfix notation:
- like prefix notation, the relative order of the operands is maintained,
- parentheses are not necessary, since postfix expressions are by nature
  unambiguous, and
- the postfix notation is not merely the reverse of the equivalent prefix
  notation.
The algorithms to convert expressions into postfix notation and to evalu-
ate postfix expressions, like those for prefix notation, make use of the stack
data structure. For one of the programming assignments at the end of this
chapter you are asked to develop and code algorithms for postfix expression
notation.
So far, we have limited our discussion to expressions that are stored in
strings. We will return to the topic of prefix and postfix notation for
expressions in Chapter 9, when we will see a different way of representing
expressions.
APPLICATION: MAZE
[Figure: a sample maze, a grid of 0s and 1s with the exit marked E. The grid did not survive extraction.]
[1] E 1 0 0 1
[2] 0 0 0 1 1
[3] 0 1 0 1 1
[4] 0 1 0 1 0
[5] 1 1 1 1 0
Figure 3-7.
tained a 1? The position [1,1] would not have been put in the stack, so
instead of moving into [1,1], you would have moved into [3,1]. You would
have been in an infinite loop! Square [3,1] was the starting square. You
would have cycled from [3,1] to [2,1] to [3,1] over and over and over. You
will need to mark the squares you have visited so that you will not visit
them again. You decide to mark them with a period (.). This means that you
will put on the stack those squares that contain a 0 or E but not those
squares that contain a 1 or a period.
There was one thing that you did automatically that will have to be made
explicit in the program. When you put [4,1] and [2,1] on the stack, you
knew by looking that there wasn't a square on the left of [4,1]. These
outside squares are a special case. The easiest way to handle them is to put
a border of 1s around the whole maze. Then the borders of the actual maze
are handled just like any other square.
Let's summarize this discussion with a list of the steps to take at each
move.
1. At the beginning square we will examine the four adjacent squares
(the one above, the one below, and the two on either side) and put the
ones with a 0 or an E in them on the stack.
2. Mark the square we are in as having been visited by putting a period
in the square. This will protect against infinite loops.
3. Get the next move from the stack. That is, pop the stack. Make the
square whose coordinates have been popped the current square.
4. Repeat the process until you either reach the exit point or try to back-
track and the stack is empty. When you try to get an alternative path
from the stack and it is empty, then there are no more possibilities.
This means that there is no exit from the beginning square. You are
surrounded by 1s and periods and the stack is empty.
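The four steps above can be sketched as a compact program. This is a simplified illustration under stated assumptions, not the program developed in this section: the program name MAZESKETCH, the type MOVETYPE, the array ROWS, the hard-coded 5 by 5 maze and starting square [3,1], and the use of global variables in place of parameters are all choices made for the example. The STRING type is a Turbo Pascal and Free Pascal extension.

```pascal
PROGRAM MAZESKETCH;
(* Sketch of the four steps on a 5 x 5 maze with a border of 1s. *)

CONST MAXSTACK = 100;   (* each square can be stacked at most 4 times *)

TYPE
  MOVETYPE = RECORD
    ROW, COL : INTEGER
  END;

VAR
  MAZE    : ARRAY[0..6, 0..6] OF CHAR;
  ROWS    : ARRAY[1..5] OF STRING;
  STACK   : ARRAY[1..MAXSTACK] OF MOVETYPE;
  TOP     : INTEGER;
  R, C    : INTEGER;
  MOVE    : MOVETYPE;
  FREE    : BOOLEAN;
  TRAPPED : BOOLEAN;

PROCEDURE PUTONSTACK (I, J : INTEGER);
(* Stack square [I, J] only if it holds a 0 or an E. *)
BEGIN
  IF (MAZE[I, J] = '0') OR (MAZE[I, J] = 'E')
    THEN
      BEGIN
        TOP := TOP + 1;
        STACK[TOP].ROW := I;
        STACK[TOP].COL := J
      END
END;

BEGIN
  ROWS[1] := 'E1001';
  ROWS[2] := '00011';
  ROWS[3] := '01011';
  ROWS[4] := '01010';
  ROWS[5] := '11110';
  FOR R := 0 TO 6 DO               (* fill everything, border included, with 1s *)
    FOR C := 0 TO 6 DO
      MAZE[R, C] := '1';
  FOR R := 1 TO 5 DO               (* copy the actual maze inside the border *)
    FOR C := 1 TO 5 DO
      MAZE[R, C] := ROWS[R][C];

  MOVE.ROW := 3;
  MOVE.COL := 1;
  TOP := 0;
  FREE := FALSE;
  TRAPPED := MAZE[MOVE.ROW, MOVE.COL] = '1';
  WHILE NOT FREE AND NOT TRAPPED DO
    IF MAZE[MOVE.ROW, MOVE.COL] = 'E'
      THEN FREE := TRUE
      ELSE
        BEGIN
          PUTONSTACK(MOVE.ROW + 1, MOVE.COL);     (* step 1 *)
          PUTONSTACK(MOVE.ROW - 1, MOVE.COL);
          PUTONSTACK(MOVE.ROW, MOVE.COL + 1);
          PUTONSTACK(MOVE.ROW, MOVE.COL - 1);
          MAZE[MOVE.ROW, MOVE.COL] := '.';        (* step 2 *)
          IF TOP = 0
            THEN TRAPPED := TRUE                  (* no moves left *)
            ELSE
              BEGIN                               (* step 3 *)
                MOVE := STACK[TOP];
                TOP := TOP - 1
              END
        END;
  IF FREE
    THEN WRITELN('I AM FREE!')
    ELSE WRITELN('HELP! I AM TRAPPED!')
END.
```

Starting from [3,1] in this maze, the search wanders into several dead ends, backs up by popping the stack, and eventually pops [1,1], which holds the E, so the sketch reports that it is free. A square may be stacked more than once before it is marked; popping it again later is harmless, because all of its open neighbors have already been handled.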
Assumptions:
The entry point is within the maze.
Data Structures:
A two-dimensional array (MAZE) to represent the maze.
A record to hold a position in the maze represented by a row/column pair
(MOVE).
A stack of possible moves (S).
To simplify the processing we will declare the maze to be (0..11, 0..11).
Ones will be put in the borders. This will keep us from having to check for
the edges at each move.
INITIALIZE
GETMAZE
WHILE more entry points DO
  GETSTART
  PROCESSMAZE(MAZE, MOVE)

INITIALIZE                                Level 1
GETMAZE                                   Level 1
GETSTART                                  Level 1
PRINTMAZE(MOVE)                           Level 2
MARK(MOVE)                                Level 2
STACKPOSSIBLES                            Level 2
  IF MAZE[ROW + 1, COL] is a 0 or E
    PUT on stack (ROW + 1, COL)
  IF MAZE[ROW - 1, COL] is a 0 or E
    PUT on stack (ROW - 1, COL)
  IF MAZE[ROW, COL + 1] is a 0 or E
    PUT on stack (ROW, COL + 1)
  IF MAZE[ROW, COL - 1] is a 0 or E
    PUT on stack (ROW, COL - 1)
PUTONSTACK(I, J)                          Level 3
  TMOVE.ROW ← I
  TMOVE.COL ← J
  PUSH(S, TMOVE)
Control Structures:
TRAPPED is set to TRUE (1) if the starting point contains a 1 or (2) if the
stack is empty.
FREE is set to TRUE if current square contains an E.
Parameters:
GETMAZE and INITIALIZE will both need MAZE as a VAR parame-
ter.
STACKPOSSIBLES will need MOVE as a value parameter and the stack
S. Remember, MOVE contains the row and column of the current
square. That is, by putting the coordinates of the desired square into
MOVE, you are there; you have MOVEd.
PUTONSTACK takes a pair of integers that represent a row/column pair
as parameters. They are put into a record and pushed onto the stack S.
MARK and GETSTART are only one line of code each and should just
be coded inline.
PROCESSMAZE needs the maze and the beginning position. Note that
this procedure changes the values in the maze by marking positions as
it goes through them. Therefore the maze has to be restored before the
next starting position is read and the process begins again. We can
take care of this by passing MAZE as a value parameter to
PROCESSMAZE. Therefore the changes are made to the copy passed
to PROCESSMAZE, not the original. In order not to make a copy of
the copy of MAZE, we will pass the maze as a VAR parameter to
PRINTMAZE and STACKPOSSIBLES even though neither actually
needs it as a VAR parameter.
STACKTYPE = RECORD
  TOP : 0..MAXSTACK;
  STACK : ARRAY[1..MAXSTACK] OF ETYPE
END; (* record *)
(************************************************)
PROCEDURE PUSH (VAR S : STACKTYPE; X : ETYPE);
(* Adds X to the top of stack S. *)
BEGIN (* push *)
  S.TOP := S.TOP + 1;
  S.STACK[S.TOP] := X
END; (* push *)
(************************************************)
PROCEDURE POP (VAR S : STACKTYPE; VAR X : ETYPE);
(* Removes the top element from stack S and returns it in X. *)
BEGIN (* pop *)
  X := S.STACK[S.TOP];
  S.TOP := S.TOP - 1
END; (* pop *)
(************************************************)
BEGIN (* clearstack *)
  S.TOP := 0
END; (* clearstack *)
(************************************************)
FUNCTION EMPTYSTACK (S : STACKTYPE) : BOOLEAN;
(* Returns TRUE if stack S is empty, FALSE otherwise. *)
BEGIN (* emptystack *)
  EMPTYSTACK := S.TOP = 0
END; (* emptystack *)
(************************************************)
PROCEDURE GETMAZE (VAR MAZE : MAZETYPE);
(* Reads values into array representing the maze. *)
(************************************************)
PROCEDURE PUTONSTACK (I, J : INTEGER);
(* Row and column of possible move are put into an *)
(* ETYPE record and pushed on the stack S.         *)
(************************************************)
(************************************************)
BEGIN (* stackpossibles *)
(************************************************)
(************************************************)
FOR ROW := 0 TO 11 DO
  BEGIN
    MAZE[ROW, 0] := '1';
    MAZE[ROW, 11] := '1'
  END;
(* set top and bottom rows to 1's *)
FOR COL := 1 TO 10 DO
  BEGIN
    MAZE[0, COL] := '1';
    MAZE[11, COL] := '1'
  END
END; (* initialize *)
(************************************************)
STACKTYPE;
END; (* processmaze *)
(************************************************)
BEGIN (* main program *)
  INITIALIZE(MAZE);
  GETMAZE(MAZE);
  WHILE NOT EOF DO
    BEGIN
      READLN(MOVE.ROW, MOVE.COL);
      PROCESSMAZE(MAZE, MOVE)
    END
END. (* main program *)
Sample Output: The following output was created by running this pro-
gram on six test cases.
[Sample output: six printed mazes, each bordered by 1s, with one square marked by an asterisk (*) and followed by a one-line message. The maze grids did not survive extraction. The messages, in order, were:]
I AM FREE!
I AM FREE!
I AM FREE!
HELP! I AM TRAPPED!
I AM FREE!
I AM FREE!
2. Show what is written by the following segments of code. (S1 and S2 are stacks of
integer elements; I, J, and K are integer variables.)
(a) CLEARSTACK(S1);
    CLEARSTACK(S2);
    FOR I := 1 TO 10 DO
      PUSH(S1, I);
    WHILE NOT EMPTYSTACK(S1) DO
      BEGIN
        POP(S1, I);
        IF I MOD 2 = 0
          THEN PUSH(S2, I)
      END;
    WHILE NOT EMPTYSTACK(S2) DO
      BEGIN
        POP(S2, I);
        WRITELN(I)
      END;
(b) I := 1;
    CLEARSTACK(S1);
    CLEARSTACK(S2);
    WHILE I * I < 50 DO
      BEGIN
        J := I * I;
        PUSH(S1, J);
        I := I + 1
      END;
    FOR I := TO 5 DO
      BEGIN
        POP(S1, J);
        PUSH(S2, J)
      END;
    POP(S1, I);
    FOR J := 1 TO DO
      BEGIN
        POP(S2, K);
        PUSH(S1, K)
      END;
    WHILE NOT EMPTYSTACK(S1) DO
      BEGIN
        POP(S1, I);
        WRITELN(I)
      END;
Use the following information for Exercises 3, 4, 5, and 6: A stack, S, is imple-
mented by an ARRAY[1..5] OF CHAR; TOP is an integer variable (0..5); C is a
character variable. For each example below, show the result of the operation on the
stack. If overflow or underflow occurs, check the appropriate box; otherwise show
the new contents of the array, TOP, and C. (Note: Some values in the array may not
be in the stack.)
3. S                          TOP = 3
                              C = 'F'
   [1] [2] [3] [4] [5]

   S                          TOP = ___
                              C = ___
   [1] [2] [3] [4] [5]

4. S                          TOP = 5
                              C = 'A'
   [1] [2] [3] [4] [5]
   PUSH(S, C);
   OVERFLOW? ___  UNDERFLOW? ___
   S                          TOP = ___
                              C = ___
   [1] [2] [3] [4] [5]

5. S                          TOP = 1
                              C = 'X'
   [1] [2] [3] [4] [5]
   POP(S, C);
   OVERFLOW? ___  UNDERFLOW? ___
   S                          TOP = ___
                              C = ___
   [1] [2] [3] [4] [5]

6. S                          TOP = 5
                              C = 'C'
   [1] [2] [3] [4] [5]
   POP(S, C);
   OVERFLOW? ___  UNDERFLOW? ___
   S                          TOP = ___
                              C = ___
   [1] [2] [3] [4] [5]
7. Use Procedures POP and PUSH and Functions EMPTYSTACK and STACKTOP
to do the following:
(a) Set I (an integer variable) to the third element in the stack, leaving the
stack without its top two elements.
(b) Set I to the value of the third element in the stack, leaving the stack un-
changed.
(c) Given an integer N, set I to the Nth element in the stack, leaving the stack
without its top N elements.
(d) Given an integer N, set I to the Nth element in the stack, leaving the stack
unchanged. (Hint: Use a second stack.)
(e) Set I equal to the bottom element in the stack, leaving the stack empty.
(f) Set I equal to the bottom element in the stack, leaving the stack un-
changed.
8. Show what is wrong with the following segment of code. (S is a stack of integer
elements; X, Y, and Z are integer variables.)
CLEARSTACK(S);
X := 5;
PUSH(S, Y);
PUSH(S, X - 2);
POP(S, X);
POP(S, Y - 2);
POP(S, Z);
9. Read in a string of characters and determine if they are a palindrome. (A palin-
drome is a sequence of characters that reads the same forward and backward.)
The character '.' ends the string. Example:
ABLE WAS I ERE I SAW ELBA..
Write 'yes' if the string is a palindrome, and 'no' otherwise. This does not need
to be a complete program, only a program fragment. You may assume that the
data are correct and that the maximum number of characters is 100.
(a) TOP = 1
CH='X'
[1] [2] [3] [4] [5]
POP(S, CH)
TOP =
CH =
[1] [2] [3] [4] [5]
OVERFLOW? UNDERFLOW? _
(b) TOP = 5
CH='C'
[1] [2] [3] [4] [5]
PUSH(S,'A')
TOP =
CH =
[1] [2] [3] [4] [5]
OVERFLOW? _ UNDERFLOW? _
3. What is the output produced by the following segment of code?
N := 1;
CLEARSTACK(STACK);
REPEAT
  IF N <= 7
    THEN BEGIN
      PUSH(STACK, N);
      N := 2 * N
    END
    ELSE BEGIN
      POP(STACK, N);
      WRITE(N);
      N := 2 * N + 1
    END
UNTIL EMPTYSTACK(STACK) AND N > 7;
WRITELN;
Hint: The REPEAT-UNTIL loop will repeat 14 times; in 7 of these 14 iterations
the WRITE(N) statement will execute.
Advice: Keep track of the values of STACK and N at the end of each iteration.
4. Suppose that TEST is some Boolean function that takes any given integer and
returns either TRUE or FALSE. Consider the following segment of code:
CLEARSTACK(STACK);
FOR I := 1 TO 3 DO
  IF TEST(I) THEN WRITE(I)
  ELSE PUSH(STACK, I);
WHILE NOT EMPTYSTACK(STACK) DO
BEGIN
POP(STACK, I);
WRITE(I)
END;
WRITELN;
Which of the following are possible outputs of the above code? Circle one: P
(for possible) or I (for impossible).
(a) 1 2 3 P I
(b) 1 3 2 P I
(c) 2 1 3 P I
(d) 3 1 2 P I
(e) 2 3 1 P I
(f) 3 2 1 P I
5. Write a procedure REPLACES which takes as arguments a stack and two varia-
bles. If the first variable (OLDEL) is found anywhere in the stack, replace it
with the second variable (NEWEL). If OLDEL is not found in the stack, return
the stack unchanged. You may assume the following stack operations have been
defined and coded. You do not know what implementation has been used, how-
ever.
CLEARSTACK (VAR S : STACKTYPE)
PUSH (VAR S : STACKTYPE; CH : CHAR)
POP (VAR S : STACKTYPE; VAR CH : CHAR)
EMPTYSTACK (S : STACKTYPE): BOOLEAN
Chapter 4 Queues
What is the accessing function for a queue? For adding elements, we
access the rear of the queue; for removing elements, we access the front.
The middle elements are logically inaccessible, even if we physically store
the queue in a random-access structure like an array.
OPERATIONS ON QUEUES
The bookstore example suggests two operations that can be applied to a
queue. First, new elements are added to the queue, an operation that we
will call ENQ (en-queue). ENQ(Q, X) means "add item X at the rear of
queue Q." We also take elements off the front of the queue, an operation
that we will call DEQ (de-queue). DEQ(Q, X) means "remove the front
element from queue Q, and return its contents in X." Unlike the stack
operations PUSH and POP, whose names are fairly standard, the adding
and deleting operations on a queue have no standard names. ENQ is some-
times called ADD or INSERT; DEQ can also be called REMOVE or DE-
LETE.
Another useful queue operation is checking whether the queue is
empty. EMPTYQ(Q) returns TRUE if queue Q is empty and FALSE other-
wise. We can only DEQ when the queue is not empty. Theoretically, we
can always ENQ, since in principle a queue is not limited in size. We know
from our experience with implementing stacks, however, that certain im-
plementations (an array representation, for instance) require that we test
whether the structure is full before we add another element. This real-
world consideration applies to queues as well.
Finally, we need to be able to initialize a queue to an empty state, an
operation that we will call CLEARQ(Q).
In terms of program design, what we have just described are the inter-
faces to the queue. We don't know anything at this point about the insides
of these queue routines; we only know that to use them we must have the
correct interface, or calling sequence: the name of the procedure or func-
tion, and the parameter list.
A Simple Problem
Suppose a string contains character information in the form
substring1.substring2
where substring1 and substring2 are the same length, and are separated by
a period. We want to write a function to see if the two substrings are the
same. If so, function MATCH returns TRUE; otherwise, it returns FALSE.
For instance, MATCH(STRING1), where STRING1 is
'ABCDEFG.ABCDEFG'
returns TRUE, while MATCH(STRING2), where STRING2 is
'ABCDEFG.ABCDEFQ'
returns FALSE.
The algorithm for this function is as follows:
CLEARQ(QUEUE);
(* Get the first character. *)
CH1 := GETCHAR(STRING);
(* Queue all the characters in the first substring, up to the period. *)
WHILE CH1 <> '.' DO
  BEGIN
    ENQ(QUEUE, CH1);
    CH1 := GETCHAR(STRING)
  END;
(* Compare the queued substring to the second substring, one character at a *)
(* time, until there is a mismatch or the end of the substring. *)
MATCHING := TRUE;
WHILE NOT EMPTYQ(QUEUE) AND MATCHING DO
  BEGIN
    (* Get a character from each substring. *)
    CH2 := GETCHAR(STRING);
    DEQ(QUEUE, CH1);
The front of the queue is fixed, remember, at the first slot in the array,
while the rear of the queue moves down with each ENQ. Now we DEQ the
front element in the queue:
DEQ (Q, X)
[1] [2] [5]
This deletes the element in the first array slot and leaves a hole. To keep
the front of the array fixed at the top of the array, we will need to move
every element in the queue up one slot:
The need to move up elements in the array was caused by our decision to
keep the front of the queue fixed in the first array slot. If we keep track of
the index of the front, as well as the rear, of the queue, we can let both ends
of the queue float in the array.
 A                      FRONT = 1
[1] [2] [3] [4] [5]     REAR = 1
(a) ENQ(Q, 'A')
 A   B                  FRONT = 1
[1] [2] [3] [4] [5]     REAR = 2
(b) ENQ(Q, 'B')
 A   B   C              FRONT = 1
[1] [2] [3] [4] [5]     REAR = 3
(c) ENQ(Q, 'C')
     B   C              FRONT = 2
[1] [2] [3] [4] [5]     REAR = 3
(d) DEQ(Q, X)   X = 'A'
Figure 4-1.
Figure 4-1 shows how several ENQs and DEQs would affect the queue.
The ENQ operations have the same effect as before; they add elements to
subsequent slots in the array and increment the index of the REAR indica-
tor. The DEQ operation is simpler, however. Instead of moving elements
up to the beginning of the array, it merely increments the FRONT indicator
to the next slot. For simplicity, in these figures, we only show the elements
that are in the queue. The other slots may contain garbage, including val-
ues that have been DEQed.
Letting the elements in the queue float within the array will create a
new problem when the REAR gets to the end of the array. In our first
design, this situation told us that the queue was full. Now, however, it is
possible for the rear of the queue to reach the end of the (physical) array
when the (logical) queue is not yet full [Figure 4-2(a)].
[1] [2] [3] [4] [5]     FRONT = 4   REAR = 5
(a) REAR is at the bottom of the array.
[1] [2] [3] [4] [5]     FRONT = 4   REAR = 1
(b) Using the array as a circular structure, we can wrap
the queue around to the top of the array.
Figure 4-2.
Since there may still be space available at the beginning of the array, the
obvious solution is to let the queue "wrap around" the end of the array.
That is, the array can be treated as a circular structure, in which the last slot
is followed by the first slot [Figure 4-2(b)]. To get the next position for
REAR, for instance, we can use an IF statement:
IF REAR = MAXQUEUE
THEN REAR := 1
ELSE REAR := REAR + 1
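The same wraparound can also be computed in one expression with the MOD operator. The short sketch below is only an illustration: the program name WRAPDEMO and the variables NEXTIF and NEXTMOD are assumptions made for the example. It steps REAR through a five-slot array and prints the next position as computed both by the IF statement and by the MOD expression, so the two can be compared.

```pascal
PROGRAM WRAPDEMO;
(* Sketch: two equivalent ways to find the slot that follows REAR *)
(* when the array is treated as a circular structure.             *)

CONST MAXQUEUE = 5;

VAR
  REAR    : INTEGER;
  NEXTIF  : INTEGER;
  NEXTMOD : INTEGER;

BEGIN
  FOR REAR := 1 TO MAXQUEUE DO
    BEGIN
      (* The IF statement from the text. *)
      IF REAR = MAXQUEUE
        THEN NEXTIF := 1
        ELSE NEXTIF := REAR + 1;
      (* MOD binds more tightly than +, so this is (REAR MOD MAXQUEUE) + 1. *)
      NEXTMOD := REAR MOD MAXQUEUE + 1;
      WRITELN(REAR, ' -> ', NEXTIF, ' ', NEXTMOD)
    END
END.
```

For every value of REAR from 1 to 5 the two results agree; in particular, slot 5 is followed by slot 1 in both. The MOD form works only because the array is indexed from 1 to MAXQUEUE; a different lower bound would need a different expression.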
[1] [2] [3] [4] [5]     FRONT = 3   REAR = 3
(a) DEQ(Q, X)
[1] [2] [3] [4] [5]     FRONT = 4   REAR = 3
(b) The result of removing the last element.
Figure 4-3.
[1] [2] [3] [4] [5]     FRONT = 4   REAR = 2
(a) ENQ(Q, 'E')
[1] [2] [3] [4] [5]     FRONT = 4   REAR = 3
(b) The result of adding the last element, making the queue full.
Figure 4-4.
the queue, leaving the queue full. The values of FRONT and REAR, how-
ever, are identical in the two situations.
The first solution that comes to mind is to add another field to our queue
record, in addition to FRONT and REAR: a count of the elements in the
queue. When the count is 0, the queue is empty; when the count is equal to
the maximum number of array slots, the queue is full. Note that keeping
this count adds processing to the ENQ and DEQ routines. If the queue user
frequently needed to know the number of elements in the queue, however,
this would certainly be a good solution. We will leave the development of
this design as a homework assignment.
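For readers who want a starting point, one possible shape of the counted design is sketched below. It is not the only way to do the assignment. The program name COUNTQ, the field name COUNT, the five-slot CHAR queue, and the use of MOD for the wraparound (which has the same effect as an IF test against MAXQUEUE) are all assumptions made for the example. Here FRONT is the index of the front element itself, and ENQ and DEQ assume that the caller has already checked FULLQ or EMPTYQ.

```pascal
PROGRAM COUNTQ;
(* Sketch: a circular queue that keeps a count of its elements. *)

CONST MAXQUEUE = 5;

TYPE
  QTYPE = RECORD
    ELEMENTS : ARRAY[1..MAXQUEUE] OF CHAR;
    FRONT    : INTEGER;     (* index of the front element *)
    REAR     : INTEGER;     (* index of the rear element  *)
    COUNT    : INTEGER      (* number of elements queued  *)
  END;

VAR
  Q : QTYPE;
  X : CHAR;

PROCEDURE CLEARQ (VAR Q : QTYPE);
BEGIN
  Q.FRONT := 1;
  Q.REAR := MAXQUEUE;       (* so the first ENQ wraps REAR to slot 1 *)
  Q.COUNT := 0
END;

FUNCTION EMPTYQ (Q : QTYPE) : BOOLEAN;
BEGIN
  EMPTYQ := Q.COUNT = 0
END;

FUNCTION FULLQ (Q : QTYPE) : BOOLEAN;
BEGIN
  FULLQ := Q.COUNT = MAXQUEUE
END;

PROCEDURE ENQ (VAR Q : QTYPE; NEWVAL : CHAR);
(* Assumes that the queue is not full. *)
BEGIN
  Q.REAR := Q.REAR MOD MAXQUEUE + 1;
  Q.ELEMENTS[Q.REAR] := NEWVAL;
  Q.COUNT := Q.COUNT + 1
END;

PROCEDURE DEQ (VAR Q : QTYPE; VAR DEQVAL : CHAR);
(* Assumes that the queue is not empty. *)
BEGIN
  DEQVAL := Q.ELEMENTS[Q.FRONT];
  Q.FRONT := Q.FRONT MOD MAXQUEUE + 1;
  Q.COUNT := Q.COUNT - 1
END;

BEGIN
  CLEARQ(Q);
  WRITELN('EMPTY? ', EMPTYQ(Q));
  FOR X := 'A' TO 'E' DO
    ENQ(Q, X);
  WRITELN('FULL? ', FULLQ(Q));
  DEQ(Q, X);                (* removes 'A' and frees slot 1 *)
  ENQ(Q, 'F');              (* REAR wraps around into slot 1 *)
  WHILE NOT EMPTYQ(Q) DO
    BEGIN
      DEQ(Q, X);
      WRITE(X)
    END;
  WRITELN
END.
```

With the count in hand, all five slots can be used: the queue reports full after 'A' through 'E' are added, and the final loop removes B, C, D, E, and F in that order even though F is physically stored in slot 1.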
Another common, but less intuitive, approach is to let FRONT indicate
the index of the array slot preceding the front element in the queue, not the
front element itself. (The reason for this will not be immediately clear;
keep reading.) REAR will still indicate the index of the rear element in the
queue. In this case, a queue is empty when FRONT = REAR. Before we
DEQ [Figure 4-5(a)], we first check for the empty condition. Since the
queue is not empty, we can DEQ. FRONT is incremented to indicate the
true first queue element, and the value of that slot is assigned to X. (Note
that updating the FRONT index precedes assigning the value in this de-
sign, since FRONT does not point to the actual front element at the begin-
ning of DEQ.) After this DEQ, EMPTYQ will find that FRONT is now
equal to REAR, indicating that the queue is empty [Figure 4-5(b)].
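A minimal sketch of this second design follows, so that the order of the steps in ENQ and DEQ can be seen in one place. The program name FLOATQ, the helper function NEXT, and the five-slot CHAR queue are assumptions made for the example. The full test shown, in which the slot after REAR is FRONT, is one common choice for this design; it means that one array slot is always left unused.

```pascal
PROGRAM FLOATQ;
(* Sketch: circular queue in which FRONT is the index of the *)
(* slot preceding the front element.                          *)

CONST MAXQUEUE = 5;

TYPE
  QTYPE = RECORD
    ELEMENTS : ARRAY[1..MAXQUEUE] OF CHAR;
    FRONT    : 1..MAXQUEUE;
    REAR     : 1..MAXQUEUE
  END;

VAR
  Q : QTYPE;
  X : CHAR;

FUNCTION NEXT (I : INTEGER) : INTEGER;
(* Hypothetical helper: the slot that follows slot I in the circle. *)
BEGIN
  IF I = MAXQUEUE
    THEN NEXT := 1
    ELSE NEXT := I + 1
END;

PROCEDURE CLEARQ (VAR Q : QTYPE);
BEGIN
  Q.FRONT := MAXQUEUE;
  Q.REAR := MAXQUEUE
END;

FUNCTION EMPTYQ (Q : QTYPE) : BOOLEAN;
BEGIN
  EMPTYQ := Q.FRONT = Q.REAR
END;

FUNCTION FULLQ (Q : QTYPE) : BOOLEAN;
(* Full when one more ENQ would make REAR equal to FRONT. *)
BEGIN
  FULLQ := NEXT(Q.REAR) = Q.FRONT
END;

PROCEDURE ENQ (VAR Q : QTYPE; NEWVAL : CHAR);
(* Assumes that the queue is not full. *)
BEGIN
  Q.REAR := NEXT(Q.REAR);
  Q.ELEMENTS[Q.REAR] := NEWVAL
END;

PROCEDURE DEQ (VAR Q : QTYPE; VAR DEQVAL : CHAR);
(* Assumes that the queue is not empty. FRONT is advanced first, *)
(* because it indicates the slot before the front element.       *)
BEGIN
  Q.FRONT := NEXT(Q.FRONT);
  DEQVAL := Q.ELEMENTS[Q.FRONT]
END;

BEGIN
  CLEARQ(Q);
  FOR X := 'A' TO 'D' DO
    ENQ(Q, X);
  WRITELN('FULL? ', FULLQ(Q));
  WHILE NOT EMPTYQ(Q) DO
    BEGIN
      DEQ(Q, X);
      WRITE(X)
    END;
  WRITELN
END.
```

In this sketch the queue reports full after four elements have been added to the five-slot array, and the loop then removes A, B, C, and D in order, after which FRONT equals REAR again.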
[1] [2] [3] [4] [5]     FRONT = 2   REAR = 3
(a) DEQ(Q, X)
[1] [2] [3] [4] [5]     FRONT = 3   REAR = 3
(b) Testing for an empty queue: FRONT = REAR.
Figure 4-5.
FRONT = 3
REAR = 2
VAR Q : QTYPE;
(************************************************)
PROCEDURE CLEARQ (VAR QUEUE : QTYPE);
(* Initialize QUEUE to empty condition. *)
(************************************************)
BEGIN (* fullq *)
END;
(************************************************)
(************************************************)
(************************************************)
PROCEDURE DEQ (VAR QUEUE : QTYPE;
               VAR DEQVAL : ELTYPE);
(* Remove the front element from QUEUE, and return it in DEQVAL.  *)
(* QUEUE.FRONT is the index of the array slot preceding the front *)
(* element in the queue. Assumes that the queue is not empty.     *)
(************************************************)
Note that DEQ, like POP, does not actually remove the value of the
element from the queue. The value that has just been DEQed still exists
physically in the array, but cannot be accessed because of the change in
QUEUE.FRONT.
This solution is not nearly as simple or intuitive as our first queue de-
sign. What did we gain by adding some amount of complexity to our de-
sign? We wanted to achieve better performance; specifically, we needed a
more efficient DEQ algorithm. In the design of data structures and program
algorithms, we will find that there are often tradeoffs: a more complex
algorithm may give more efficient performance, a less efficient solution
may allow us to save much memory space. As always, we must make design
decisions according to what we know of the requirements of the problem.
QUEUE APPLICATIONS
One application in which queues figure as the prominent data structure is
the computer simulation of a real-world situation. The sample program at
the end of this chapter describes this type of application in more detail.
Queues are also used in many ways by the operating system, the pro-
gram that schedules and allocates the resources of a computer system. One
of these resources is the CPU (central processing unit) itself. If you are
working on a multi-user system and you tell the computer to run a particu-
lar program, the operating system adds your request to its "job queue."
When your request gets to the front of the queue, the program you re-
quested is executed. Similarly, the various users of the system must share
the I/O devices (printers, disks, tapes, card readers, and so forth). Each
device has its own queue of requests to print, read, or write to these de-
vices.
MAIN                                      Level 0
INITIALIZE                                Level 1
CARARRIVES                                Level 1
CARTOTELLER                               Level 1
  IF car in queue
    get car
    increment number of cars
    add time spent in queue to total time
    set TELLER to SERVICETIME
To increment the time of each car in the queue, we must access each
element in the queue. The queue itself must remain unchanged. We can do
this by dequeueing a car, incrementing it, and enqueueing it again. Re-
member that the car is being represented in the queue by a simple variable
that keeps track of how many minutes the car has been in the queue.
DEQ(Q, CAR) returns the counter for the car which is at the front of the
line in the variable CAR. You add 1 to CAR and then ENQ(Q, CAR). Now
the counter for that car is at the end of the line. You continue this process
until you are back at the point where you began.
The problem is to determine when we are back where we started. We
can solve this by putting a dummy element or flag in the queue. When we
dequeue this flag element, we have accessed each element in the queue.
Since our elements are just positive integers in this case, we can use -1 as
a flag.
Increment CLOCK
set FLAG to -1            (* -1 is entered at the rear of        *)
IF TELLER <> 0            (* the queue. Elements are removed     *)
  decrement TELLER        (* from the front, incremented, and    *)
ENQ(Q, FLAG)              (* put back on the rear of the queue.  *)
DEQ(Q, CAR)               (* When the -1 is dequeued, each       *)
WHILE CAR <> FLAG         (* element of the queue has been       *)
  increment CAR           (* incremented and the original order  *)
  ENQ(Q, CAR)             (* of the queue has been restored.     *)
  DEQ(Q, CAR)
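To see the flag technique in isolation, here is a small self-contained sketch. It is a simplification made for illustration only: the program name FLAGDEMO, the ten-slot global queue, the parameterless form of ENQ and DEQ, and the three sample waiting times are all assumptions, and the queue is kept in global variables rather than passed as a parameter.

```pascal
PROGRAM FLAGDEMO;
(* Sketch: add 1 to every element of a queue, using -1 as a flag *)
(* to tell when each element has been visited exactly once.      *)

CONST
  MAXQUEUE = 10;
  FLAG = -1;

VAR
  ELEMENTS : ARRAY[1..MAXQUEUE] OF INTEGER;
  FRONT    : INTEGER;     (* slot preceding the front element *)
  REAR     : INTEGER;     (* slot of the rear element         *)
  CAR      : INTEGER;

PROCEDURE ENQ (NEWVAL : INTEGER);
BEGIN
  REAR := REAR MOD MAXQUEUE + 1;
  ELEMENTS[REAR] := NEWVAL
END;

PROCEDURE DEQ (VAR DEQVAL : INTEGER);
BEGIN
  FRONT := FRONT MOD MAXQUEUE + 1;
  DEQVAL := ELEMENTS[FRONT]
END;

BEGIN
  FRONT := MAXQUEUE;              (* empty queue *)
  REAR := MAXQUEUE;
  ENQ(5);                         (* three cars that have waited *)
  ENQ(2);                         (* 5, 2, and 1 minutes         *)
  ENQ(1);

  ENQ(FLAG);                      (* mark the end of the line *)
  DEQ(CAR);
  WHILE CAR <> FLAG DO
    BEGIN
      CAR := CAR + 1;             (* one more minute in the queue *)
      ENQ(CAR);
      DEQ(CAR)
    END;

  WHILE FRONT <> REAR DO          (* print what is left, front first *)
    BEGIN
      DEQ(CAR);
      WRITE(CAR, ' ')
    END;
  WRITELN
END.
```

After the pass the queue holds 6, 3, and 2, in the original order, and the flag is gone. Note that while the pass is running the queue holds one more element than the number of cars, so the array must have room for the flag.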
CONTROL STRUCTURES
CONST ZERO = 0;
      MAXQUEUE = 100;
TYPE  INDEXTYPE = 1..MAXQUEUE;
      QUEUETYPE = RECORD
        ELEMENTS : ARRAY[INDEXTYPE] OF INTEGER;
        FRONT,
        REAR : INDEXTYPE
      END; (* record *)
(************************************************)
BEGIN (* clearq *)
  Q.FRONT := MAXQUEUE;
  Q.REAR := MAXQUEUE
END; (* clearq *)
(************************************************)
BEGIN (* emptyq *)
  EMPTYQ := Q.
END; (* emptyq *)
(************************************************)
PROCEDURE ENQ (VAR Q : QUEUETYPE; NEWVAL : );
(* See text for complete documentation. *)
BEGIN (* enq *)
  IF Q.REAR = MAXQUEUE
    THEN Q.REAR := 1
    ELSE Q.REAR := Q.REAR + 1;
  Q.ELEMENTS[Q. :=
END; (* enq *)
(************************************************)
BEGIN (* deq *)
  IF Q.FRONT = MAXQUEUE
    THEN Q.FRONT := 1
    ELSE Q.FRONT := Q.FRONT + 1;
  DEQVAL := Q.ELEMENTS[Q.FRONT]
END; (* deq *)
(************************************************)
FUNCTION RANDOM : REAL;
CONST PI = 3.14158;
VAR TEMP : REAL;
BEGIN (* random *)
  TEMP := SEED + PI; (* SEED is a global variable. *)
  TEMP := EXP(5.0 * LN(TEMP));
  SEED := TEMP - TRUNC(TEMP);
  RANDOM := SEED
END; (* random *)
(************************************************)
PROCEDURE INITIALIZE;
(* Initialize all simulation variables. *)
(************************************************)
PROCEDURE GETPARAMETERS (VAR ARRIVALPROB : REAL;
                         VAR SERVICETIME : INTEGER;
                         VAR DATAOK : BOOLEAN);
(* ARRIVALPROB and SERVICETIME are read until a pair where *)
(* both are positive is found. DATAOK is FALSE if no good *)
(* pair of values for ARRIVALPROB and SERVICETIME is found. *)
BEGIN (* getparameters *)
  READLN(ARRIVALPROB, SERVICETIME);
  WRITELN('PROBABILITY OF ARRIVAL IS ', ARRIVALPROB);
  WRITELN('TIME TO SERVICE IS : ', SERVICETIME);
  DATAOK := (ARRIVALPROB > ZERO) AND (SERVICETIME > ZERO);
  (* If this input is incorrect, read until a correct pair is found. *)
(************************************************)
(************************************************)
CLOCK := CLOCK + 1;
IF TELLER <> 0
THEN
TELLER := TELLER - 1;
WHILE CAR <> FLAG DO (* Entire queue will be accessed. *)
END
(************************************************
(************************************************
The data for the first run of this program were as follows:
120
0.4 5
0.3 4
0.5 7
0.5 -3
-0.1 5
-0.1 -2
Note that the first three lines of arrival probabilities and service times
are correct. The fourth line has a negative service time, the fifth line has a
negative arrival probability, and the sixth line has negative values for both
of these items. This should test the error-checking routine.
However, it doesn't work. We get a TRIED TO READ PAST EOF error.
Can you find the cause?
each transaction tells us the probability of a car arriving during the transac-
tion time. If the probability is greater than 1.0, then the queue will be
unstable.
If two cars arrive every 5 minutes but only one leaves, the queue just
keeps getting longer and longer and longer! (You have probably experi-
enced this situation.) In each of the cases where the wait time just keeps
increasing, you will see that the probability of a car arriving within the
transaction time is greater than 1.0.
In the cases where the probability of a car arriving within the transaction
time is equal to 1.0, you will note the highest fluctuations (i.e., 0.25 with
a transaction time of 4 minutes and 0.20 with a transaction time of 5
minutes).
When we were discussing error conditions, we indicated that there
would be a case where the relationship between the arrival probability and
the service time should be checked. If the queue is known to be unstable
from the beginning, the numbers are meaningless. You should consider
putting in such a check in a job situation. In the case where the probability
multiplied by the service time is greater than 1, we could print
a message and read in another pair of arrival probabilities and service
times.
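One way to express such a check is shown below. This is only a sketch under stated assumptions: the function name UNSTABLE, the procedure REPORT, and the message texts are hypothetical and are not part of the simulation program in the text.

```pascal
PROGRAM STABILITY;
(* Sketch only: UNSTABLE and REPORT are hypothetical helpers. *)

FUNCTION UNSTABLE (ARRIVALPROB : REAL; SERVICETIME : INTEGER) : BOOLEAN;
(* TRUE when, on average, more than one car arrives during *)
(* the time it takes to service one car.                   *)
BEGIN
  UNSTABLE := ARRIVALPROB * SERVICETIME > 1.0
END;

PROCEDURE REPORT (ARRIVALPROB : REAL; SERVICETIME : INTEGER);
BEGIN
  IF UNSTABLE(ARRIVALPROB, SERVICETIME)
    THEN WRITELN('UNSTABLE QUEUE: READ ANOTHER PAIR')
    ELSE WRITELN('PARAMETERS ACCEPTED')
END;

BEGIN
  REPORT(0.4, 5);   (* 0.4 * 5 = 2.0, so this pair is rejected *)
  REPORT(0.1, 5)    (* 0.1 * 5 = 0.5, so this pair is accepted *)
END.
```

In the real program a test like this would most naturally sit in GETPARAMETERS, next to the check that both values are positive.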
We have used a random number generator or function to simulate the
arrival of a car. The idea here is that each time a random number generator
is called, it will give us a particular value depending purely on chance.
Much theoretical work has gone into the development of algorithms to pro-
duce random numbers. However, given a particular function to produce
random numbers and the current output from the function, the next value is
completely predictable-not random at all!
Therefore, the numbers from such a function are called pseudo-random.
For simulation purposes, however, pseudo-random numbers are sufficient.
If a simulation is run for a long enough period of time, the theory of random
numbers says that the wait time will converge no matter what random (or
pseudo-random) number generator you use.
This is illustrated in Table 4-1. The average wait times for cases where
the probability of a car arriving during a transaction time is less than 1.0
fluctuate over time but stay within the same general range.
Now let's answer our original questions. With the probability of 0.2 of a
car arriving each minute (i.e., the probability of 1.0 that a car will arrive
every 5 minutes) and a transaction time of 5 minutes, the average wait times
vary from 4 minutes to 1 hour and 19 minutes. The wait time is already too
long. The bank had better either lower the average transaction time or add
a new drive-in window immediately.
Use the following information for Exercises 1 and 2. Q is a queue that contains
integer elements. X, Y, and Z are integer variables. Show what is written by the
following segments of code:
1. CLEARQ(Q);
   ENQ(Q, 5);
   ENQ(Q, 6);
   ENQ(Q, 7);
   ENQ(Q, 8);
   DEQ(Q, X);
   DEQ(Q, Y);
   ENQ(Q, X);
   ENQ(Q, Y+1);
   DEQ(Q, X);
   ENQ(Q, Y);
   WHILE NOT EMPTYQ(Q) DO
     BEGIN
       DEQ(Q, X);
       WRITELN(X)
     END
2. CLEARQ(Q);
   X := 5;
   Y := 7;
   ENQ(Q, X);
   ENQ(Q, 5);
   ENQ(Q, Y);
   DEQ(Q, Y);
   ENQ(Q, 2);
   ENQ(Q, X);
   ENQ(Q, Y);
   Z := X - Y;
   IF Z = 0
     THEN WHILE NOT EMPTYQ(Q) DO
            BEGIN
              DEQ(Q, X);
              WRITELN(X)
            END
     ELSE WRITELN('THE END')
(a) FRONT = 1
[1] [2] [3] [4] [5] REAR = 4
ENQ(Q, 'F')
OVERFLOW? _ UNDERFLOW? _
FRONT = _ REAR = _
FRONT = 5
[1] [2] [3] [4] [5] REAR = 4
ENQ(Q, 'G')
OVERFLOW? _ UNDERFLOW? _
FRONT = _ REAR =_
FRONT = 4
[1] [2] [3] [4] [5] REAR = 5
ENQ(Q, 'H')
OVERFLOW? _ UNDERFLOW? _
FRONT = _ REAR =_
FRONT = 2
[1] [2] [3] [4] [5] REAR = 1
DEQ(Q, X)
OVERFLOW? _ UNDERFLOW? _
FRONT = _ REAR = _ X = _
DEQ(Q, X)
OVERFLOW? _ UNDERFLOW? _
FRONT=_ REAR=_ X=_
(f) FRONT = 5
[1] [2] [3] [4] [5] REAR = 3
DEQ(Q, X)
OVERFLOW? _ UNDERFLOW? _
FRONT=_ REAR=_ X=_
5. You have been assigned the task of testing the set of general-purpose utility
routines. You decide to use a bottom-up approach, and you write a test-driver to
read in a series of commands to manipulate the queue. The commands are
CLEAR (* Clears the queue. *)
ENQ element
DEQ (* Dequeues an element and prints it. *)
PRINTALL (prints the current elements in the queue without changing
the queue)
Assuming that ELTYPE is INTEGER and MAXQUEUE = 5, create a set of test
data (commands) that would adequately test the queue routines. The driver is
written to test for empty and full before dequeuing and enqueuing.
6. Using the final implementation discussed in this chapter, write procedures
with the following interfaces:
(a) PROCEDURE TESTENQ (VAR QUEUE : QTYPE;
                       NEWVAL : ELTYPE;
                       VAR OFLOW : BOOLEAN);
    (* Tests QUEUE for overflow condition before *)
    (* trying to add NEWVAL to the rear of the *)
    (* queue. *)
(b) PROCEDURE TESTDEQ (VAR QUEUE : QTYPE;
                       VAR DEQVAL : ELTYPE;
                       VAR UFLOW : BOOLEAN);
    (* Tests QUEUE for underflow condition before *)
    (* removing the front element from the queue *)
    (* and assigning it to DEQVAL. *)
7. The user of the queue routines will have frequent need for a count of the ele-
ments in the queue, so we add another Function QCOUNT(Q). We decide to
add a field COUNT to the record that contains the queue. Rewrite the queue
routines using the following declaration for QTYPE:
TYPE QTYPE = RECORD
               ELEMENTS : ARRAY[INDEXTYPE] OF ELTYPE;
               COUNT : INTEGER;
               FRONT, REAR : INDEXTYPE
             END; (* record *)
Note that we no longer need to reserve an unused slot in the array to differenti-
ate between an empty and full queue.
8. A deque is what you get when you cross a stack with a schizophrenic queue. You
can add to and delete from either end of a deque. It's sort of a FLIFLO (first or
last in, first or last out) structure. Using an array to contain the elements of the
deque, write
(a) the declarations for DEQUETYPE
(b) PROCEDURE INDEQUEFRONT
(c) PROCEDURE INDEQUEREAR
(d) PROCEDURE OUTDEQUEFRONT
(e) PROCEDURE OUTDEQUEREAR
(f) a description (25 words or less) of some application of this data structure.
9. A (fictional) operating system queues jobs waiting to execute according to the
following scheme:
□ Users of the system have relative priorities according to their user ID number:
    users 100-199 highest
    users 200-299 next highest
    users 300-399
□ Within each priority group, the jobs execute in the same order that they arrive
in the system.
□ If there is a highest-priority job queued, it will execute before any other job; if
5.
FRONT = 4
REAR = 1
At this high level, these are logical operations on a list. At a low level, these
operations will be implemented as Pascal procedures or functions that
manipulate an array or other data structure that contains the list's elements.
In between, there are intermediate design decisions.
LIST REPRESENTATIONS
Figure 5-1. Total space for 2500 records, of which 2000 will be wasted.
[Figure: lists labeled Sleep Inn, Holiday Inn, Sheraton, and Y'all Come Inn.]
STOCKHOLDERS
ANATOMY OF A NODE
Each node in a linked list must contain at least two fields. The first is the
data field. This field may be a simple integer, a character, a string, or even a
large record containing many other fields. In the following discussion, we
will refer to this field as the INFO field. The second field is the pointer, or
link, to the next node in the list. We will refer to this field as the NEXT
field. This is one node:
INFO NEXT
The NEXT field of the last node in the list contains a special value that is
not a valid address. This value tells us that we have reached the end of the
list.
SOME NOTATION
(1) P ← LIST
(2) WHILE the list is not empty DO
(3)   print (INFO(P))
(4)   P ← NEXT(P)
[Figure 5-4. Panels include (a) The original list and (d) NEXT(P) ← LIST.]
How did the information get into the linked list? Consider a list whose
nodes were inserted, one node at a time, at the beginning of the list. (See
Figure 5-4.) After three nodes were added, the list looked like Figure
5-4(a). Now, to insert a new node containing the data value 5 to the front of
the list, we must first get an empty node from somewhere. Let's assume
that all the nodes not currently in use are in a pool of available nodes.
Whenever we need one, we call the procedure GETNODE (whose coding
we will discuss shortly), which will return the pointer to an empty node
(the INFO and NEXT fields of this new node may actually contain some
values, but we will consider them to be garbage). Now that we have a free
node, we can put the value 5 in its INFO field and insert the node into the
list by manipulating the relevant pointers.
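The insertion just described can be sketched in Pascal, using the array representation that this chapter develops later. This is an illustration under stated assumptions: MAX = 10, an INTEGER INFO field, and the procedure name INSERTFIRST are choices made for the example, and GETNODE here is a simplified stand-in for the version coded later in the chapter.

```pascal
PROGRAM FRONTINSERT;
(* Sketch only: nodes live in an array, and pointers are array indexes. *)
(* MAX = 10, the INTEGER INFO field, and INSERTFIRST are assumptions.   *)

CONST NULL = 0;
      MAX  = 10;

TYPE  POINTERTYPE = 0..MAX;
      NODETYPE = RECORD
                   INFO : INTEGER;
                   NEXT : POINTERTYPE
                 END;

VAR   NODES : ARRAY[1..MAX] OF NODETYPE;
      LIST, AVAIL, P : POINTERTYPE;

PROCEDURE INITIALIZE;
(* Link every node into the pool of available nodes. *)
VAR I : INTEGER;
BEGIN
  FOR I := 1 TO MAX - 1 DO
    NODES[I].NEXT := I + 1;
  NODES[MAX].NEXT := NULL;
  AVAIL := 1;
  LIST := NULL
END;

PROCEDURE GETNODE (VAR P : POINTERTYPE);
(* Return the first available node, or NULL if the pool is empty. *)
BEGIN
  P := AVAIL;
  IF AVAIL <> NULL
    THEN AVAIL := NODES[AVAIL].NEXT
END;

PROCEDURE INSERTFIRST (NEWVALUE : INTEGER);
(* Put NEWVALUE into a new node at the front of the list. *)
VAR NEWNODE : POINTERTYPE;
BEGIN
  GETNODE(NEWNODE);
  IF NEWNODE <> NULL
    THEN
      BEGIN
        NODES[NEWNODE].INFO := NEWVALUE;
        NODES[NEWNODE].NEXT := LIST;   (* new node points to old first node *)
        LIST := NEWNODE                (* external pointer points to new node *)
      END
END;

BEGIN
  INITIALIZE;
  INSERTFIRST(3);
  INSERTFIRST(2);
  INSERTFIRST(1);
  INSERTFIRST(5);
  P := LIST;
  WHILE P <> NULL DO      (* writes 5, 1, 2, 3, one per line *)
    BEGIN
      WRITELN(NODES[P].INFO);
      P := NODES[P].NEXT
    END
END.
```

The order of the two pointer assignments matters: if LIST were changed first, the old first node could no longer be reached.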
To delete the first node of a linked list, we need to make the external
pointer LIST equal to the NEXT field of the first node in the list. This task
is easily accomplished (see Figure 5-5) as follows:
LIST ← NEXT(LIST)
Note, in Figure 5-5(b), that the node containing the value 1 is no longer
pointed to by any pointers. We no longer have a way to access this node at
all. After many nodes have been deleted in this manner, a large amount of
space will be taken up by these unused, and unusable, nodes. It would be
better to put each node deleted back into the pool of available nodes. To do
this, let's assume the existence of a procedure FREENODE (which we will
write later) that takes a temporary pointer to an unneeded node to put that
node back into the pool of available nodes. We can rewrite our delete-first-node
algorithm, using a temporary pointer P to hold onto the node that will
be deleted. (See Figure 5-6.)
[Figure 5-6. Delete-first-node with FREENODE. (a) P ← LIST. (b) LIST ← NEXT(LIST).]
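A minimal Pascal sketch of delete-first-node with FREENODE follows, again assuming the array representation introduced later in the chapter. The procedure name DELETEFIRST, the value of MAX, and the hand-built sample list are assumptions made for illustration.

```pascal
PROGRAM FRONTDELETE;
(* Sketch only: DELETEFIRST and the sample data are assumptions. *)

CONST NULL = 0;
      MAX  = 5;

TYPE  POINTERTYPE = 0..MAX;
      NODETYPE = RECORD
                   INFO : INTEGER;
                   NEXT : POINTERTYPE
                 END;

VAR   NODES : ARRAY[1..MAX] OF NODETYPE;
      LIST, AVAIL : POINTERTYPE;

PROCEDURE FREENODE (P : POINTERTYPE);
(* Put the node pointed to by P at the front of the AVAIL list. *)
BEGIN
  NODES[P].NEXT := AVAIL;
  AVAIL := P
END;

PROCEDURE DELETEFIRST;
(* Assumes the list is not empty. *)
VAR P : POINTERTYPE;
BEGIN
  P := LIST;                   (* hold onto the node being deleted *)
  LIST := NODES[LIST].NEXT;    (* unlink it from the list *)
  FREENODE(P)                  (* return it to the pool *)
END;

BEGIN
  (* Build the list 1, 2, 3 by hand in nodes 1 to 3; nodes 4 and 5 are free. *)
  NODES[1].INFO := 1;  NODES[1].NEXT := 2;
  NODES[2].INFO := 2;  NODES[2].NEXT := 3;
  NODES[3].INFO := 3;  NODES[3].NEXT := NULL;
  NODES[4].NEXT := 5;
  NODES[5].NEXT := NULL;
  LIST := 1;
  AVAIL := 4;

  DELETEFIRST;

  (* LIST is now 2, AVAIL is 1, and node 1 links to the old AVAIL, 4. *)
  WRITELN('LIST = ', LIST);
  WRITELN('AVAIL = ', AVAIL);
  WRITELN('NEXT OF AVAIL = ', NODES[AVAIL].NEXT)
END.
```

Without the temporary pointer P, the deleted node would be unreachable by the time FREENODE was called.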
To get a node from the list of available nodes, we will use our delete-
first-node algorithm. NEWNODE is the pointer to the node returned by
GETNODE. (See Figure 5-7.)
[Figure 5-7. (a) NEWNODE ← AVAIL. (b) AVAIL ← NEXT(AVAIL); the node pointed to by NEWNODE is returned.]
[Figure: (a) NEXT(OLDNODE) ← AVAIL.]
CONST NULL = 0;
      MAX = maximum nodes in list;
CONST NULL = 0;
      MAX = 500;
TYPE  DATETYPE = RECORD
                   DAY, MONTH, YEAR : INTEGER
                 END; (* datetype *)
      INFOTYPE = RECORD
                   NAME, ADDRESS, CITY : STRING20;
                   ARRIVAL : DATETYPE;
                   LENGTHSTAY : INTEGER;
                   ROOMCODE : CHAR;
                   CREDITCODE : CHAR;
                   CREDITNO : INTEGER;
                   CHARGE : REAL
                 END; (* infotype *)
      NODETYPE = RECORD
                   GUEST : INFOTYPE;
                   NEXT : POINTERTYPE
                 END; (* nodetype *)
We can print the name of the first guest in the Hilton list by the state-
ment
WRITELN(NODES[HILTON].GUEST.NAME);
We will always access a field of a given node by its array index (a pointer)
and field specification. For instance, in the following list, the INFO field
of the first node in the list is referenced by NODES[LIST].INFO. The
INFO field of NODE(P) can be referenced by NODES[P].INFO. The
INFO field of the following node (containing 'C') can be referenced by
NODES[NODES[P].NEXT].INFO, and so on.
GETTING STARTED
Let's consider the simplest situation: We plan to keep one list of data and
one list of available nodes in the array NODES. Each node record has two
fields: INFO and NEXT. When we first begin, the linked list of data will be
empty, so LIST = 0 (or NULL). AVAIL will point to a list containing all of
the nodes in the array (see Figure 5-10). The nodes can be linked together
in any order, of course, but it is simplest to string them together sequen-
tially, as in the procedure, INITIALIZE.
[7] 8
[8] 9
[9] 10
[10] 0
Figure 5-10. Initializing the AVAIL list to contain all the nodes in the array.
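The sequential linking shown in Figure 5-10 could be coded along the following lines. This is a sketch, assuming MAX = 10 to match the figure and an INTEGER INFO field; the main program that prints the NEXT fields is added only to show the result.

```pascal
PROGRAM INITDEMO;
(* Sketch only: MAX = 10 matches Figure 5-10; INFO type is an assumption. *)

CONST NULL = 0;
      MAX  = 10;

TYPE  POINTERTYPE = 0..MAX;
      NODETYPE = RECORD
                   INFO : INTEGER;
                   NEXT : POINTERTYPE
                 END;

VAR   NODES : ARRAY[1..MAX] OF NODETYPE;
      LIST, AVAIL : POINTERTYPE;
      I : INTEGER;

PROCEDURE INITIALIZE;
(* String all the nodes together sequentially into the AVAIL list *)
(* and make the list of data empty.                               *)
VAR P : INTEGER;
BEGIN
  FOR P := 1 TO MAX - 1 DO
    NODES[P].NEXT := P + 1;      (* node P links to node P + 1 *)
  NODES[MAX].NEXT := NULL;       (* last node ends the AVAIL list *)
  AVAIL := 1;
  LIST := NULL
END;

BEGIN
  INITIALIZE;
  (* Show each index with its NEXT field, as in the figure. *)
  FOR I := 1 TO MAX DO
    WRITELN('[', I, '] ', NODES[I].NEXT)
END.
```

The INFO fields are deliberately left untouched; as noted earlier, the contents of an available node are treated as garbage.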
VAR P: POINTERTYPE;
P ← LIST
WHILE the list is not empty DO
  print (INFO(P))
  P ← NEXT(P)
P := LIST;
WHILE P <> NULL DO
  BEGIN
    WRITE(NODES[P].INFO);
    P := NODES[P].NEXT
  END;
[Figure 5-11. The array NODES, slots [1] through [10], with external pointers LIST and AVAIL. Prints: D G J M R]
Using the array shown in Figure 5-11, let's see what is written by this
segment of code. P is originally set to 3 (the value of the external pointer,
LIST), and the value of NODES[3].INFO is printed. Then P is advanced
to NODES[3].NEXT, or 6. Since 6 <> NULL, the loop is repeated.
NODES[6].INFO is printed, and P is advanced to NODES[6].NEXT, or 1.
Again P <> NULL, so the loop is repeated.
This cycle continues until P = 4. Then NODES[4].INFO is printed, and
P is advanced to NODES[4].NEXT, or 0 (our NULL value). This 0 signifies
the end of the list, and we exit the loop.
Why are there two NULL values in the NEXT field? Simply because two
linked lists are being stored in the array, each with its own external
pointer (LIST and AVAIL) and each with a final pointer value
(NODES[P].NEXT = 0). In Figure 5-10, there is only one list stored in the
array, the list of available nodes, so only one NULL value is seen in the
NEXT field.
Coding GETNODE and FREENODE is also simple. Since all of our
nodes are coming from the same source, NODES, we will access this array
and the external pointer AVAIL globally.
PROCEDURE GETNODE (VAR P : POINTERTYPE);
BEGIN (* getnode *)
  IF AVAIL <> NULL
    THEN (* If free nodes exist, get a node. *)
      BEGIN
        P := AVAIL;
        AVAIL := NODES[AVAIL].NEXT
      END
    ELSE (* No free nodes; set P to NULL. *)
      P := NULL
END; (* getnode *)
(************************************************)
PROCEDURE FREENODE (P : POINTERTYPE);
(* Puts the node pointed to by P into the list of available nodes. *)
[Figure 5-12. (a) Get a node and store the value. (b) Find the insertion place.]
First, we will need to get a node to put the value, NEWVALUE, into. We
will call the pointer to this node NEWNODE.
The first two lines of the algorithm can be coded directly. The insert task
can be broken into two parts:
Find the place to insert [Figure 5-12(b)]
Connect the pointers [Figure 5-12(c)]
The first task involves comparing the new value to the INFO field of
each successive node in the list (using a temporary pointer P to traverse the
list) until NEWVALUE < INFO(P). This task may be seen as a WHILE
loop:
P ← LIST
WHILE NEWVALUE >= INFO(P) DO
  advance P
However, when the correct place is found, P is pointing to the node that
should follow the new node. We cannot access the preceding node to
change its pointer.
Here we have found the place, but we cannot get back to change the NEXT
field of the node containing 6.
P ← LIST
WHILE NEWVALUE >= INFO(NEXT(P)) DO
advance P
Working through the example in Figure 5-12, we quickly notice that the
first node is skipped in the comparison. Further, it is clear that if
NEWVALUE < INFO(LIST) (the first node), we have a special case. The
reason this case is special becomes apparent when we do the second task of
the insert algorithm: connecting the pointers. In the general case [Figure
5-12(c)], we need to change two pointers:
NEXT(NEWNODE) ← NEXT(P)
NEXT(P) ← NEWNODE
When the new node goes at the front of the list, we change the external pointer instead:
NEXT(NEWNODE) ← LIST
LIST ← NEWNODE
We can modify our algorithm to test for this case before entering the
WHILE loop:
P ← LIST
IF NEWVALUE < INFO(P)
THEN change the pointers to insert NODE(NEWNODE) as the 1st node
ELSE
WHILE ...
We have considered the general case of inserting into the middle of the
list and the special case of inserting at the beginning of the list. What about
the case when NEWVALUE is greater than all the values in the list? We
will then need to insert at the end of the list. Our loop control
IF LIST = NULL
  THEN LIST ← pointer to new node
  ELSE
    P ← LIST
    IF NEWVALUE < INFO(P)
      THEN change the pointers to insert NODE(NEWNODE)
           as the 1st node
      ELSE PLACEFOUND ← FALSE
           WHILE NEXT(P) <> NULL
                 AND NOT PLACEFOUND DO
             IF NEWVALUE >= INFO(NEXT(P))
               THEN advance P
               ELSE PLACEFOUND ← TRUE
           Change pointers to insert NODE(NEWNODE)
Hint: Remember that Pascal evaluates all of the conditions of the WHILE
clause. What happens when NEXT(P) = NULL? This consideration is
language-dependent; some other programming languages stop evaluating
compound Boolean expressions as soon as the result is determined (e.g.,
after the first FALSE when expressions are ANDed together).
Consider the linked list in Figure 5-13(a). (The INFO fields of nodes in the
AVAIL list are represented by X to simplify our view of the list.) Following
the insert algorithm given above, we will insert a node with an INFO value
of 8. (This is the general case of inserting into the middle of the list.) First,
the value is put into the INFO field of the first available node
(NODES[4].INFO). Next, the value of AVAIL is advanced to
NODES[4].NEXT, or 6. Then, the linked list pointed to by LIST is
traversed until the insertion place is found [when NEWVALUE <
INFO(NEXT(P))]. This situation occurs when P = 9 and
INFO(NEXT(P)) = 11. The two pointers are changed (NODES[4].NEXT
and NODES[9].NEXT), and our insertion is complete. In all, we have
changed 4 values: AVAIL, NODES[4].INFO, NODES[4].NEXT, and
NODES[9].NEXT. [See Figure 5-13(b).]
Figure 5-13.
Figure 5-14.
ELSE
  BEGIN (* general case *)
    (* Find insertion place. *)
    PLACEFOUND := FALSE;
    WHILE (NODES[P].NEXT <> NULL) AND
          NOT PLACEFOUND DO
      IF NEWVALUE >= NODES[NODES[P].NEXT].INFO
        THEN P := NODES[P].NEXT
        ELSE PLACEFOUND := TRUE;
    (* Connect the pointers. *)
    NODES[NEWNODE].NEXT := NODES[P].NEXT;
    NODES[P].NEXT := NEWNODE
  END (* general case *)
END (* insert into nonempty list *)
END; (* insert *)
Since we know that the value DELETEVAL will be in the list, we can
find the node with a simple WHILE loop:
LIST ← NEXT(LIST)
FREENODE(P)
The other cases are more complicated. When P is pointing to the node
we want to delete, we don't have a pointer to its predecessor in the list,
which we would normally use in order to change its NEXT field. We could
use the method of peeking ahead from the insert algorithm. However, an-
other simple way to deal with this problem is to keep a pair of pointers,
CURRENTNODE and BACKNODE, to traverse the list. When CUR-
RENTNODE is pointing to the node we want to delete, BACKNODE is
pointing to its predecessor. Deleting NODE(CURRENTNODE) becomes
simple [Figure 5-15(b)]:
NEXT(BACKNODE) ← NEXT(CURRENTNODE)
FREENODE(CURRENTNODE)
[Figure 5-15. (a) DELETE(2), showing CURRENTNODE. (b) DELETE(6), showing BACKNODE and CURRENTNODE. (c) DELETE(8), showing BACKNODE and CURRENTNODE.]
Using a second pointer will complicate our search loop slightly. We can
initialize CURRENTNODE to LIST, and BACKNODE to NULL. Then,
WHILE INFO(CURRENTNODE) <> DELETEVAL, we increment both
pointers, like an inchworm inching his way. BACKNODE (the tail of the
inchworm) catches up with CURRENTNODE (the head), then CUR-
RENTNODE advances (Figure 5-16).
How do we know if we are deleting the first node? If BACKNODE is
still equal to NULL when we drop out of the search loop, we know that the
pointers have not been advanced and that we need to delete the first node.
Figure 5-16.
The whole algorithm:
(* initialize pointers *)
CURRENTNODE ← LIST
BACKNODE ← NULL
FREENODE(CURRENTNODE)
VAR CURRENTNODE,
    BACKNODE : POINTERTYPE;
Consider the linked list in Figure 5-17(a). We want to delete the node
containing the value 9. CURRENTNODE is initialized to 2, the index of
the first node in the list, and BACKNODE is initialized to zero (NULL).
The search loop begins.
[Figure 5-17. (a) NODES, with LIST = 2 and AVAIL = 9. (b) NODES, with LIST = 2.]
When CURRENTNODE is pointing to the node containing 9
(CURRENTNODE = 8), BACKNODE is pointing to its immediate prede-
cessor in the list (BACKNODE = 5). We are not deleting the first node in
the list, so NODES[BACKNODE].NEXT is set to NODES[CUR-
RENTNODE].NEXT, effectively skipping NODE(CURRENTNODE).
Then FREENODE(CURRENTNODE) is called, adding NODE(CUR-
RENTNODE) to the beginning of the AVAIL list. Freeing NODE(CUR-
RENTNODE) necessitates changing NODES[CURRENTNODE].NEXT
and AVAIL.
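The two-pointer deletion traced above might be coded roughly as follows. This is a sketch rather than the text's own procedure: the names DELETENODE and PRINTLIST, the value of MAX, and the hand-built list 2, 6, 8 are assumptions. As in the discussion, the value to delete is assumed to be in the list.

```pascal
PROGRAM DELETEDEMO;
(* Sketch only: DELETENODE, PRINTLIST, and the sample list are assumptions. *)

CONST NULL = 0;
      MAX  = 6;

TYPE  POINTERTYPE = 0..MAX;
      NODETYPE = RECORD
                   INFO : INTEGER;
                   NEXT : POINTERTYPE
                 END;

VAR   NODES : ARRAY[1..MAX] OF NODETYPE;
      LIST, AVAIL : POINTERTYPE;

PROCEDURE FREENODE (P : POINTERTYPE);
BEGIN
  NODES[P].NEXT := AVAIL;
  AVAIL := P
END;

PROCEDURE DELETENODE (DELETEVAL : INTEGER);
(* Assumes DELETEVAL is somewhere in the list. *)
VAR CURRENTNODE, BACKNODE : POINTERTYPE;
BEGIN
  CURRENTNODE := LIST;
  BACKNODE := NULL;
  WHILE NODES[CURRENTNODE].INFO <> DELETEVAL DO
    BEGIN                                (* the inchworm moves one node *)
      BACKNODE := CURRENTNODE;
      CURRENTNODE := NODES[CURRENTNODE].NEXT
    END;
  IF BACKNODE = NULL
    THEN LIST := NODES[LIST].NEXT        (* deleting the first node *)
    ELSE NODES[BACKNODE].NEXT := NODES[CURRENTNODE].NEXT;
  FREENODE(CURRENTNODE)
END;

PROCEDURE PRINTLIST;
VAR P : POINTERTYPE;
BEGIN
  P := LIST;
  WHILE P <> NULL DO
    BEGIN
      WRITE(' ', NODES[P].INFO);
      P := NODES[P].NEXT
    END;
  WRITELN
END;

BEGIN
  (* Build the list 2, 6, 8 by hand in nodes 1 to 3; nodes 4 to 6 are free. *)
  NODES[1].INFO := 2;  NODES[1].NEXT := 2;
  NODES[2].INFO := 6;  NODES[2].NEXT := 3;
  NODES[3].INFO := 8;  NODES[3].NEXT := NULL;
  NODES[4].NEXT := 5;
  NODES[5].NEXT := 6;
  NODES[6].NEXT := NULL;
  LIST := 1;
  AVAIL := 4;

  DELETENODE(6);  PRINTLIST;    (* middle node: 2 and 8 remain *)
  DELETENODE(2);  PRINTLIST;    (* first node: 8 remains *)
  DELETENODE(8);  PRINTLIST     (* only node: the list is now empty *)
END.
```

Deleting the last node needs no special case, because NEXT(CURRENTNODE) is already NULL and is simply copied into NEXT(BACKNODE).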
Using the declarations for the array NODES and an external pointer
STACK, let's write the basic routines needed for stack operations. Note that
type STACKTYPE is now equal to POINTERTYPE.
It can be readily seen that the PUSH and POP operations are analogous
to our insert-first-node and delete-first-node algorithms.
NODES[NEWNODE].NEXT := STACK;
STACK := NEWNODE
END; (* push *)
PROCEDURE POP (VAR STACK : STACKTYPE;
               VAR POPPEDELEMENT : INFOTYPE);
(* Delete the top element from STACK, and return it in POPPEDELEMENT. *)
(* Assumes the stack is not empty. NODES is accessed globally. *)
BEGIN (* pop *)
  (* Get the value of the top element. *)
  POPPEDELEMENT := NODES[STACK].INFO;
Writing a routine to test for an empty stack is a simple task. When the
stack is empty, its external pointer will equal NULL.
Do we need to test for a full stack? If all the stacks we are using share the
same pool of available nodes and we have declared NODES to be large
enough to hold the maximum number of elements in all the stacks com-
bined, we will never have to test whether a particular stack is full. If we do
not have this assurance, we will have to write a function FULL. (Try this
yourself.)
VAR Q : QTYPE;
We can access the front of the queue through the pointer Q.FRONT and the
rear of the queue through Q.REAR.
Note the relative positions of QFRONT and QREAR. Had they been
reversed (Figure 5-18), we could have used our insert-first-node algorithm
for ENQ, but how could we DEQ? To delete the last node of the linked list,
we need to be able to reset QFRONT to point to the node preceding the
deleted node. Since our pointers all go forward, we can't get back to the
preceding node. To accomplish this task, we would have to either traverse
the whole list (very inefficient, especially if the list is long) or keep a list
with pointers in both directions. Such a doubly linked list is not necessary
if we set up our queue pointers correctly to begin with.
Figure 5-18. A bad queue design.
Note in the insert algorithm that inserting into an empty queue is a special
case, since we need to make QFRONT point to the new node also.
Similarly, in our delete algorithm, we will need to allow for the case of
deleting when the list only contains one node, leaving the list empty. If,
after we have deleted the front node, QFRONT is NULL, we know that the
queue is now empty. In this case, we will need to set QREAR to NULL
also. The algorithm for deleting the front element from a queue is illus-
trated as follows:
This algorithm assumes that the test for EMPTYQ is made before entering
the delete routine-that is, that we know that the queue has at least one
node.
TEMP ← QFRONT
DEQVALUE ← INFO(QFRONT)
QFRONT ← NEXT(QFRONT)
IF QFRONT = NULL (* queue is now empty *)
  THEN QREAR ← NULL
FREENODE(TEMP)
We know that when the queue is empty, both QFRONT and QREAR
will equal NULL. Testing for an empty queue simply involves checking
either of these pointers for NULL.
BEGIN (* emptyq *)
  EMPTYQ := Q.REAR = NULL
  (* or *)
  (* EMPTYQ := Q.FRONT = NULL *)
END; (* emptyq *)
(************************************************)
PROCEDURE DEQ (VAR Q : QTYPE;
               VAR DEQVALUE : INFOTYPE);
(* Remove the front element from the queue, and return it in DEQVALUE. *)
(* Accesses NODES globally. Assumes that the queue is not empty. *)
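Putting the linked queue algorithms together, a complete version might look like the sketch below. It follows the insert and delete algorithms described above, but the details are assumptions: INFOTYPE is taken to be INTEGER, MAX is 10, GETNODE is simplified, and ENQ assumes that a free node is available.

```pascal
PROGRAM LINKEDQ;
(* Sketch only: INFOTYPE = INTEGER and MAX = 10 are assumptions. *)

CONST NULL = 0;
      MAX  = 10;

TYPE  POINTERTYPE = 0..MAX;
      INFOTYPE = INTEGER;
      NODETYPE = RECORD
                   INFO : INFOTYPE;
                   NEXT : POINTERTYPE
                 END;
      QTYPE = RECORD
                FRONT, REAR : POINTERTYPE
              END;

VAR   NODES : ARRAY[1..MAX] OF NODETYPE;
      AVAIL : POINTERTYPE;
      Q : QTYPE;
      X : INFOTYPE;

PROCEDURE INITIALIZE;
VAR I : INTEGER;
BEGIN
  FOR I := 1 TO MAX - 1 DO
    NODES[I].NEXT := I + 1;
  NODES[MAX].NEXT := NULL;
  AVAIL := 1
END;

PROCEDURE GETNODE (VAR P : POINTERTYPE);
BEGIN
  P := AVAIL;
  IF AVAIL <> NULL
    THEN AVAIL := NODES[AVAIL].NEXT
END;

PROCEDURE FREENODE (P : POINTERTYPE);
BEGIN
  NODES[P].NEXT := AVAIL;
  AVAIL := P
END;

FUNCTION EMPTYQ (Q : QTYPE) : BOOLEAN;
BEGIN
  EMPTYQ := Q.REAR = NULL
END;

PROCEDURE ENQ (VAR Q : QTYPE; NEWVALUE : INFOTYPE);
(* Assumes that a free node is available. *)
VAR NEWNODE : POINTERTYPE;
BEGIN
  GETNODE(NEWNODE);
  NODES[NEWNODE].INFO := NEWVALUE;
  NODES[NEWNODE].NEXT := NULL;
  IF Q.REAR = NULL
    THEN Q.FRONT := NEWNODE                 (* empty queue: special case *)
    ELSE NODES[Q.REAR].NEXT := NEWNODE;
  Q.REAR := NEWNODE
END;

PROCEDURE DEQ (VAR Q : QTYPE; VAR DEQVALUE : INFOTYPE);
(* Assumes that the queue is not empty. *)
VAR TEMP : POINTERTYPE;
BEGIN
  TEMP := Q.FRONT;
  DEQVALUE := NODES[Q.FRONT].INFO;
  Q.FRONT := NODES[Q.FRONT].NEXT;
  IF Q.FRONT = NULL                         (* queue is now empty *)
    THEN Q.REAR := NULL;
  FREENODE(TEMP)
END;

BEGIN
  INITIALIZE;
  Q.FRONT := NULL;
  Q.REAR := NULL;
  ENQ(Q, 10);
  ENQ(Q, 20);
  ENQ(Q, 30);
  WHILE NOT EMPTYQ(Q) DO     (* writes 10, 20, 30, one per line *)
    BEGIN
      DEQ(Q, X);
      WRITELN(X)
    END
END.
```

Both ENQ and DEQ touch only the ends of the list, so neither needs to traverse it.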
There is a common problem with using linear linked lists: Given a pointer
to a node somewhere in the list, we can access all of the nodes that follow,
but none of the nodes that precede it. With a linear singly linked list struc-
ture, we must always have the pointer to the beginning of the list to access
all of the nodes in the list.
We can, however, change the linear list slightly by making the pointer in
the NEXT field of the last node point back to the FIRST NODE instead of
NULL:
Now our list is circular, rather than linear. We can start at any node in the
list and traverse the whole list. For this reason, we can make our external
pointer to the list point to any node and still access all the nodes in the list.
It is convenient, but not necessary, to let the external pointer to a circular
list point to the last node in the list. In this way, we can easily access both
ends of the list, since the NEXT field of the last node contains a pointer to
the first node. (See Figure 5-19.) Note that an empty circular list has been
represented by a NULL value for the external pointer to the list.
Using a circular, rather than a linear, linked list makes one obvious
change in our list traversal algorithm: We no longer stop traversing the list
when the pointer becomes NULL. Instead, we look for the external pointer
as a stop sign. Let's write a procedure that will print out the elements of a
circular linked list.
We can initialize a temporary pointer, P, to LIST, the external pointer.
Then we can print one node ahead of the pointer, until P comes full circle-
when P = LIST. Note that this algorithm works even when there is only
one node in the list-when P, LIST, and NODES[P].NEXT are all equal
[Figure 5-19(b)].
Figure 5-19. Some circular linked lists, with external pointer to rear element.
P ← LIST
REPEAT
  Print INFO(NEXT(P))
  P ← NEXT(P)
UNTIL P = LIST
VAR P: POINTERTYPE;
BEGIN (* printlist *)
  P := LIST;
  (* Check for empty list. If not empty, print the elements. *)
  IF P = NULL
    THEN WRITELN('THE LIST IS EMPTY')
    ELSE
      REPEAT
        WRITELN(NODES[NODES[P].NEXT].INFO);
        P := NODES[P].NEXT
      UNTIL P = LIST
END; (* printlist *)
We can access the rear node, in order to ENQ, through the external
pointer, QUEUE, and the front node, in order to DEQ, through
NEXT(QUEUE). An empty queue would be represented by QUEUE =
NULL.
Coding the ENQ and DEQ procedures for this implementation of a
queue is left as a homework exercise.
A list of items is something that most people use often in their everyday
lives. Thus we are already familiar with the concept of storing the elements
of a list sequentially, as if we were writing them down one after another on
a piece of paper. An array is often used to fill this role in computer solutions
to problems that use lists. The elements of the list are stored in subsequent
locations of the array.
In this chapter, we introduced the concept of linking the elements in a
list, allowing us to keep them physically in any convenient order. The links
preserve the logical order of the elements.
At an implementation level, we can use an array to store the elements
and their links. An external pointer to the list indicates where in the array
we can find the first element. Although the array itself is a random-access
data structure (we can locate any element directly through its index), we
can only access the elements of the linked list stored there through the
pointer scheme we have imposed. That is, if the linked list is stored in an
array called DATA, we do not know anything about the relative position of
a list element by accessing its array position directly. DATA[5] may be the
first, fifth, or any other element in the list, or it may not be in the list at all.
We only give meaning to the structure by using the external pointer to
access the beginning of the list and then following the pointers from element
to element.
Using an array representation for a linked list, we are responsible for
creating our own GETNODE and FREENODE operations. In the next
chapter, we will see how Pascal provides a mechanism for letting the sys-
tem allocate and free nodes for us. The basic list operations at a logical
level remain the same. The changes we will describe occur at the imple-
mentation level.
All the entries except the expiration date are just strings of characters.
The date is in the form mm/yy (the month followed by the year). For example,
11/85 would be November 1985; 02/87 would be February 1987. The
missing columns between fields will contain blanks.
After the original input, in each succeeding month, there will be new
subscriptions entered in the same format.
Output: Three sets of mailing labels, ordered by zip code:
1. those people whose subscriptions have expired
2. those people whose subscriptions will expire in the current
month
3. those people whose subscriptions will expire sometime after
the current month
The data for the subscribers whose subscriptions have not expired is
saved to be used the next time the program is run. It is called the retained
data of the program.
The output labels must have the following format.
(FIRST NAME) (LAST NAME) (with exactly one blank between)
(STREET ADDRESS)
(CITY), (STATE) (ZIP) (one blank between state and zip)
Management agrees that this description is exactly what they want. The
input format looks strange but is correct. It comes from the old days of
computing, when information had to be in particular columns on a line or
card. The blanks were inserted between each field of information for clarity
when printing since the entire field may be filled with characters. These
blanks may actually be useful when you write your input routines.
Several questions concerning input and output processing occur to you
immediately. Should you keep the subscription records as they are given to
you from sales, converting them to the proper format for printing on each
run, or should you convert them to a format for easy printing once and save
them in that format? Saving them in a format for easy printing seems rea-
sonable, but just what should that format be?
The precise printing specification would imply that you know the length
of each first name, city, and state so that the appropriate commas and blanks
can be inserted. Perhaps the STRINGTYPE data type defined in Chapter 2
would be appropriate.
If you decide to keep the data in a format for easy printing, another
question arises: Should you write a special-purpose program to convert the
original input to the printing format? Since each monthly run will contain
new subscriptions in the sales format, maybe the program can be organized
so that the first run has no master file and all subscribers are treated as new
subscribers.
You decide to keep these questions in mind while you do a first pass at
the top level of your design. You also decide to save the subscription data in
a format designed for easy printing, which you will call the label format.
MAIN Level 0
The exact format of the labels themselves can be decided later, but the
data structures to be used to hold the three sets of labels must be deter-
mined before any further development can be done. Since the sizes of the
three lists to be printed may vary from 0 to the whole subscription list, a
linked representation seems appropriate for EXPIRED, EXPIRING, and
OKAY. In contrast, the master lists of labels could be read into an array of
records, putting new subscriptions into the bottom of the same array. This
data structure is shown below.
[Figure: the array SUBSCRIBERS, indexed [1] through [N+1], and the three linked lists EXPIRED, EXPIRING, and OKAY.]
MAIN Level 0
READ current date
WHILE more labels DO
    GET label
    INSERT into proper list ordered by zip code
WHILE more new subscribers DO
    GET person
    CREATE label
    INSERT into proper list ordered by zip code
PRINTLISTS
COMBINE EXPIRING AND OKAY ordered by last name
PRINT
SAVE
One objective was to have the program itself handle the conversion of
the original file. The first time the program is run, the first WHILE loop
will be skipped because the file containing the master list of subscribers
will be empty. All the subscribers will be treated as new. This should take
care of the conversion problem. In succeeding months, the file of labels
will be the file written by the program itself. This has a distinct advantage.
The file can be treated as a nontext file. You can read and write each label
as a complete record, rather than writing and reading each field separately.
This makes the input/output routines much easier to write and faster to
execute.
For those of you who have not used nontext files before, we will make a
slight digression and describe how they operate.
The built-in data type for files is TEXT. A file of type TEXT is actually a
FILE OF CHAR. READ(LN) and WRITE(LN) are procedures provided to
handle input and output with TEXT files. If a file is declared to be of some
other type, READ(LN) and WRITE(LN) cannot be used. The built-in pro-
cedures to be used with non-TEXT files are GET and PUT, which read and
write one element of the component type of the file at a time.
RESET and REWRITE have their usual meanings: they open the file
for reading or writing. Remember that RESET also defines the file buffer
variable. The file buffer variable is the name of the file (e.g., MASTER)
followed by an up-arrow. To get the next component of a file, you use
built-in Procedure GET. The parameter of GET is a file name.
GET(FILE1) puts into FILE1↑ the next component of FILE1. Let's summarize
the terminology referring to files:
MASTER↑ refers to the current record on file MASTER, called the file
buffer variable. (See Appendix H.)
MASTER↑.FNAME refers to the FNAME field of the current record on
file MASTER.
GET(MASTER) advances the file to the next record.
MASTER↑ := SUBSCRIBER copies the record SUBSCRIBER into the file
buffer variable for file MASTER.
PUT(MASTER) writes the record on file MASTER.
EOF(MASTER) is TRUE when the file is positioned past its last record,
so that MASTER↑ is undefined.
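The terminology above can be tried out in a small program. What follows is a minimal sketch, assuming a standard Pascal compiler that supports GET, PUT, and file buffer variables; some dialects (Turbo Pascal, for instance) provide only READ and WRITE for non-TEXT files. The record type SUBSCRIBERREC and its two integer fields are hypothetical stand-ins for the label record, and the up-arrow is typed as ^.

```pascal
program FileBufferDemo(output);
(* Writes three records to a nontext file with PUT, then reads them *)
(* back through the file buffer variable with GET.                  *)
type
  SubscriberRec = record          (* hypothetical, simplified record *)
                    Zip  : integer;
                    Date : integer
                  end;
var
  Master : file of SubscriberRec; (* local file, not named in the heading *)
  I : integer;
begin
  rewrite(Master);                (* open for writing *)
  for I := 1 to 3 do
    begin
      Master^.Zip := 100 * I;     (* fill the file buffer variable *)
      Master^.Date := I;
      put(Master)                 (* write the buffer to the file *)
    end;
  reset(Master);                  (* open for reading; defines Master^ *)
  while not eof(Master) do
    begin
      writeln(Master^.Zip : 1, ' ', Master^.Date : 1);
      get(Master)                 (* advance to the next record *)
    end
end.
```

Note that the loop uses the current record first and calls GET afterward; RESET has already placed the first record in the buffer.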
MAIN Level 0
INSERTPROPER Level 1
PRINTLISTS Level 1
PRINTLABEL(EXPIRED)
PRINTLABEL(EXPIRING)
PRINTLABEL(OKAY)
COMBINE Level 1
PRINTMASTER Level 1
SAVE Level 1
The level 2 routines will all require that you know exactly what the label
records look like. The type declarations on page 224 showed all the fields
as STRINGTYPE. This seems an appropriate representation for everything
except the zip code and the date. The zip code could be carried as an
integer number. This would make the insertion into the three lists more
efficient, since numeric comparisons are much faster than string compari-
sons using our COMPARE function. However, a five-digit integer is greater
than MAXINT on some machines. Therefore, for consistency across machines,
it is safer to represent the zip code as a string. This will make your
program more portable.
DATE will be stored in the label records as an integer value that was
calculated in this fashion from the original sales information. The current
date will also be input and saved in this format. The following table shows
several dates in the original format and the converted format, and indicates
into which list the label containing that date would be put.
To input the fields of STRINGTYPE, you have two choices: You can use
either GETLINE or GETSTRING. GETSTRING reads a string until a de-
limiter character is read. What character could be used here? The first
name field may have initials and/or names, so a blank won't work as a
delimiter. In fact, the same is true of all of the fields.
You could instead input the entire line using GETLINE, and break the
line into the proper fields using SUBSTRING and DELETE. However,
there is a different problem here. The strings would be the length of the
field in the original line. There may be trailing blanks that you would need
to remove. Again you could use other string operations such as SEARCH,
SUBSTRING, and DELETE to get rid of these trailing blanks.
Yet, if you look at the input description, you will see that the routine to
input the proper columns into each field is really quite easy. Likewise,
removing trailing blanks only requires beginning at the end of the string
and moving up until a nonblank character is found.
This, then, is a case where you should write a special-purpose input
routine for the variables of STRINGTYPE. This routine would store the
characters directly into the proper field and remove trailing blanks.
CREATELABEL Level 1
GETFIELD will need three parameters: the string in which it will store
the characters and the beginning and ending column numbers, which will
indicate how many characters should be read.
GETFIELD Level 2
INDEX ← 0
FOR CT from FIRST to LAST
    READ CH
    increment INDEX
    STR.CHARS[INDEX] ← CH
move INDEX left to the first nonblank character (stopping at 1)
IF STR.CHARS[INDEX] = ' '
    THEN STR.LENGTH ← 0
    ELSE STR.LENGTH ← INDEX
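The trailing-blank step of this design can be checked on its own, without a file. The following is a minimal sketch, assuming a STRINGTYPE-like record with a maximum length of 20; the sample field 'Mary Ann  ' (ten columns, two trailing blanks) and the variable FIELD are illustrative only.

```pascal
program TrimDemo(output);
(* Copies a fixed-width field into a string record and computes *)
(* its length without trailing blanks.                          *)
const
  MaxLength = 20;
type
  StringType = record
                 Length : 0..MaxLength;
                 Chars  : array[1..MaxLength] of char
               end;
var
  Str : StringType;
  Field : packed array[1..10] of char;   (* illustrative input field *)
  I, Index : integer;
begin
  Field := 'Mary Ann  ';                 (* 10 columns, 2 trailing blanks *)
  for I := 1 to 10 do
    Str.Chars[I] := Field[I];
  Index := 10;
  (* Back up over trailing blanks, stopping at column 1. *)
  while (Index > 1) and (Str.Chars[Index] = ' ') do
    Index := Index - 1;
  if Str.Chars[Index] = ' '
    then Str.Length := 0                 (* the field was entirely blank *)
    else Str.Length := Index;
  writeln(Str.Length : 1);
  for I := 1 to Str.Length do
    write(Str.Chars[I]);
  writeln
end.
```

The embedded blank in the first name is kept, which is why a blank could not serve as a field delimiter.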
All of the design has now been completed except for INSERT and
PRINTLABEL. You should be able to use Procedure INSERT from page
206. All you need to do is change the variable names in the comparison
statement. When inserting into the three separate mailing lists, you will
compare on zip code. When inserting into LIST, you will compare on last
name.
Unfortunately, Pascal does not let you pass a field name as a parameter.
You could write two insert procedures, one for the zip code field and one
for the last name field. Another alternative, which will be illustrated here,
is to pass the name of the function to use in the comparison as a parameter.
(Yes, you can pass function or procedure names as parameters.)
You will have to write two functions, COMNAME and COMZIP. COM-
NAME will call function COMPARE with the name fields. COMZIP will
call function COMPARE with the zipcode fields. Procedure INSERT will
have an added formal parameter which is a function name. The function
will be called when the insert algorithm needs to make a comparison.
When you are inserting into the three lists by zipcode, the actual param-
eter for Procedure INSERT will be COMZIP. When you are inserting into
the list kept in alphabetical order by last name, the actual parameter to
Procedure INSERT will be COMNAME.
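Passing a function name as a parameter can be seen in isolation in a small program. This is a minimal sketch in standard Pascal, not the INSERT procedure itself: the record LABELREC, the helper COMPAREINT, and the procedure SHOWORDER are hypothetical, and the keys are reduced to an integer and a single character. Some dialects (Turbo Pascal, for instance) require a named function type in place of the full function heading in the parameter list.

```pascal
program FuncParamDemo(output);
(* One routine serves two keys by taking the comparison function *)
(* as a parameter.                                               *)
type
  Relation = (Less, Equal, Greater);
  LabelRec = record                  (* hypothetical, simplified label *)
               Zip  : integer;
               Name : char
             end;
var
  A, B : LabelRec;

function CompareInt(X, Y : integer) : Relation;
begin
  if X < Y
    then CompareInt := Less
    else if X = Y
      then CompareInt := Equal
      else CompareInt := Greater
end;

function ComZip(NewL, OldL : LabelRec) : Relation;
begin
  ComZip := CompareInt(NewL.Zip, OldL.Zip)
end;

function ComName(NewL, OldL : LabelRec) : Relation;
begin
  ComName := CompareInt(ord(NewL.Name), ord(OldL.Name))
end;

(* The third parameter is a function; the caller chooses the key. *)
procedure ShowOrder(L1, L2 : LabelRec;
                    function XCompare(NewL, OldL : LabelRec) : Relation);
begin
  case XCompare(L1, L2) of
    Less    : writeln('first precedes second');
    Equal   : writeln('equal keys');
    Greater : writeln('second precedes first')
  end
end;

begin
  A.Zip := 300;  A.Name := 'B';
  B.Zip := 200;  B.Name := 'C';
  ShowOrder(A, B, ComZip);           (* compares 300 with 200 *)
  ShowOrder(A, B, ComName)           (* compares 'B' with 'C' *)
end.
```

The same two labels come out in opposite orders, depending only on which function name is passed.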
PRINTLABEL Level 2
As you look back over this design in preparation for coding it, you may
see an inconsistency in the notation. The statement
GET label
has been used in several modules. In COMBINE, it was defined to be POP,
i.e., remove the first node in the linked list. In the MAIN module,
it refers to the built-in procedure GET. In PRINTMASTER,
PRINTLABEL, and SAVE, it means to cycle through the list without re-
moving the node.
Although it was clear what was meant at each point, we should be more
precise with our language even when we are creating our top-down de-
signs. A more precise formulation of "GET label" in the cases where we
are cycling would be "label ~ next label."
An improvement shows up at this stage. There is a great deal of duplication
in PRINTMASTER and SAVE. Both of these modules cycle through
the list of subscribers to be used for next month's processing.
PRINTMASTER prints the list, and SAVE writes each label out to
PRINTANDSAVE Level 1
The MAIN module needs to be revised one more time to reflect the
combination of the modules PRINTMASTER and SAVE into one module,
PRINTANDSAVE.
MAIN Level 0
READ date
WHILE more labels DO
    INSERTPROPER(MASTER↑)
    GET(MASTER)
WHILE more new subscriptions DO
    CREATELABEL(SUBSCRIBER)
    INSERTPROPER(SUBSCRIBER)
PRINTLISTS
COMBINE EXPIRING and OKAY
PRINTANDSAVE
STRINGTYPE = RECORD
               LENGTH : INDEXRANGE;
               CHARS : ARRAY[1..MAXLENGTH] OF CHAR
             END; (* record *)
LABELTYPE = RECORD
              FNAME,                  (* first name *)
              LNAME,                  (* last name *)
              ADDRESS,                (* street address *)
              CITY,                   (* city *)
              STATE,                  (* state *)
              ZIPCODE : STRINGTYPE;   (* zip code *)
              DATE : INTEGER          (* expiration date *)
            END; (* record *)
NODETYPE = RECORD
             SLABEL : LABELTYPE;      (* address label *)
             NEXT : POINTERTYPE
           END; (* record *)
RELATION = (LESS, EQUAL, GREATER);    (* result of string compare *)
FILETYPE = FILE OF LABELTYPE;
VAR NODES : ARRAY[1..MAXNODES] OF NODETYPE;
    AVAIL,                  (* available space external pointer *)
    NODE,                   (* temporary pointer variable *)
    EXPIRING,               (* labels which expire this month *)
    EXPIRED,                (* labels which expired last month *)
    OKAY,                   (* labels expiring in the future *)
    LIST : POINTERTYPE;     (* EXPIRING combined with OKAY *)
    MONTH, YEAR,
    CURRENTDATE : INTEGER;  (* date of run to be read from console *)
    MASTER : FILETYPE;      (* file of labels kept from month to month *)
    NEWFILE : TEXT;         (* file of new subscriptions *)
    SUBSCRIBER : LABELTYPE; (* temporary label for new subscriber *)
(************************************************)
PROCEDURE INITIALIZE;
(* Initialize available space list to contain all nodes; *)
(* initialize list pointers to NULL. *)
VAR P: POINTERTYPE;
<************************************************
PROCEDURE GETNODE (VAR P : POINTERTYPE);
(* See text for proper documentation. *)
BEGIN (* getnode *)
  IF AVAIL <> NULL
    THEN
      BEGIN
        P := AVAIL;
        AVAIL := NODES[AVAIL].NEXT
      END
    ELSE
      P := NULL
END; (* getnode *)
<************************************************
PROCEDURE FREENODE (P : POINTERTYPE);
(* See text for complete documentation. *)
BEGIN (* freenode *)
  NODES[P].NEXT := AVAIL;
  AVAIL := P
END; (* freenode *)
<************************************************
BEGIN (* getfield *)
  INDEX := 0;
  (* Read in and store characters in field, which includes trailing blank. *)
  FOR CT := FIRST TO LAST DO
    BEGIN
      READ(NEWFILE, CH);
      INDEX := INDEX + 1;
      STR.CHARS[INDEX] := CH
    END; (* for loop *)
  (* Move index from right to first nonblank character. *)
  REPEAT
    INDEX := INDEX - 1
  UNTIL (STR.CHARS[INDEX] <> ' ') OR (INDEX = 1);
  IF STR.CHARS[INDEX] = ' '
    THEN STR.LENGTH := 0
    ELSE STR.LENGTH := INDEX
END; (* getfield *)
(************************************************
<************************************************
(************************************************ )
FUNCTION COMPARE (SUBSTR1, SUBSTR2 : STRINGTYPE) : RELATION;
(* See chapter 2 for proper documentation. *)
BEGIN (* compare *)
  IF SUBSTR1.LENGTH < SUBSTR2.LENGTH
    THEN MINLENGTH := SUBSTR1.LENGTH
    ELSE MINLENGTH := SUBSTR2.LENGTH;
  POS := 1;
  STILLMATCH := TRUE;
  IF STILLMATCH
    THEN
      IF SUBSTR1.LENGTH = SUBSTR2.LENGTH
        THEN COMPARE := EQUAL
        ELSE
          IF SUBSTR1.LENGTH < SUBSTR2.LENGTH
            THEN COMPARE := LESS
            ELSE COMPARE := GREATER
END; (* compare *)
(************************************************
FUNCTION COMNAME (NEWLABEL, OLDLABEL : LABELTYPE) : RELATION;
(* Function COMPARE is called with the LNAME fields as parameters. *)
BEGIN
  COMNAME := COMPARE(NEWLABEL.LNAME, OLDLABEL.LNAME)
END;
(************************************************)
FUNCTION COMZIP (NEWLABEL, OLDLABEL : LABELTYPE) : RELATION;
(* Function COMPARE is called with the ZIPCODE fields as parameters. *)
BEGIN
  COMZIP := COMPARE(NEWLABEL.ZIPCODE, OLDLABEL.ZIPCODE)
END;
(************************************************
PROCEDURE INSERT (VAR LIST : POINTERTYPE;
                  NEWLABEL : LABELTYPE;
                  FUNCTION XCOMPARE (NEWLABEL, OLDLABEL :
                                     LABELTYPE) : RELATION);
(* See text for proper documentation. *)
(* Insertion is by zip or name based on Function XCOMPARE. *)
(************************************************ )
BEGIN (* pop *)
  SLABEL := NODES[LIST].SLABEL;
  TEMP := LIST;
  LIST := NODES[LIST].NEXT;
  FREENODE(TEMP)
END; (* pop *)
(************************************************
(************************************************
(************************************************
BEGIN (* combine *)
(************************************************
PROCEDURE PRINTLABEL (LIST : POINTERTYPE);
(* Labels are printed according to specification. *)
BEGIN (* printlabel *)
  WHILE LIST <> NULL DO
    BEGIN
      TLABEL := NODES[LIST].SLABEL;
      PRINT(TLABEL.FNAME);
      WRITE(' ');
      PRINT(TLABEL.LNAME);
      WRITELN;
      PRINT(TLABEL.ADDRESS);
      WRITELN;
      PRINT(TLABEL.CITY);
      WRITE(', ');
      PRINT(TLABEL.STATE);
      WRITE(' ');
      PRINT(TLABEL.ZIPCODE);
      LIST := NODES[LIST].NEXT;
      WRITELN
    END
END; (* printlabel *)
(************************************************
PROCEDURE PRINTLISTS (OKAY, EXPIRING, EXPIRED: POINTERTYPE);
(* Invokes PRINTLABEL to print three sets of labels. *)
<************************************************ }
(************************************************
BEGIN (* main *)
  WRITELN('MAIN PROGRAM BEGUN');
  RESET(NEWFILE);
  RESET(MASTER);
  INITIALIZE;
  READ(CURRENTDATE);
  WHILE NOT EOF(MASTER) DO
    BEGIN
      WRITELN('READING MASTER');
      INSERTPROPER(MASTER↑, EXPIRED, EXPIRING, OKAY);
      GET(MASTER)
    END;
  PRINTLISTS(OKAY, EXPIRING, EXPIRED);   (* Print labels. *)
  COMBINE(LIST, OKAY, EXPIRING);
  PRINTANDSAVE(LIST)                     (* Create new master list. *)
Your program has been running now for several months. There have
been no major problems, but it seems to take a long time to execute. The
company is concerned that as the number of subscribers increases, the
amount of machine time (and cost) required will be excessive. They ask
you to analyze the program and see if you can find ways to speed it up.
Several thoughts occur to you immediately. You could change the data
structure of the fields from strings to packed arrays of a fixed length. This
would speed up the comparison operation when labels are inserted into the
master list ordered alphabetically. Of course, the print routine would also
take longer, since you would have to check for trailing blanks as you
printed.
Before doing anything drastic like changing the data structure, you de-
cide to analyze the processing. You were able to save processing during the
original design by noticing that PRINTMASTER and SAVE both cycled
through the same list. You were able to cut out one traversal of the linked
list by combining those two modules. Perhaps there are other places where
some of the processing is duplicated.
The first loop in the main module gets a label from the master file and
inserts it into its list for printing. The second loop in the main module gets a
new subscriber, creates a label and inserts it into its list for printing. There
is no duplication here.
Procedure PRINTLISTS calls Procedure PRINTLABEL three times with
three different lists to print; no duplication here. Procedure COMBINE
removes labels from the lists that are to be kept and inserts them into the
master list. There is no duplication here. PRINTLABEL cycles through the
three lists, printing them. Wait! PRINTLABEL and COMBINE both cycle
through EXPIRING and OKAY.
The control structure in Procedure PRINTLABEL can be moved up into
Procedure PRINTLISTS. As you cycle through the lists, labels can be re-
moved. After the labels from EXPIRED are printed, nothing more will be
done to them. As the labels from EXPIRING and OKAY are being printed,
they can be inserted into LIST. Procedure COMBINE can now be deleted
completely. The design for PRINTLISTS becomes as follows.
PRINTLISTS Level 1
1. Show how the linked list would be affected by the following operations:
[Figure: the array NODES, indexed [1] through [10], with the external pointers LIST and AVAIL.]
2. (a) Write a segment of code to print out the INFO (char) portion of a linked list
pointed to by LIST. (NULL is represented by 0.)
(b) Show what would be printed out by applying the above code to the follow-
ing list:
[Figure: the array NODES, indexed [1] through [10], with the external pointers LIST and AVAIL.]
10. Why are there two Os (NULL values) in the NEXT field of the array?
Exercises 245
11. A circular linked list contains integer elements. LIST points to the last node in
the list. Write a procedure to print the positive (not including 0) elements in the
list. If there are none, print NO POSITIVE ELEMENTS.
12. Using the description of a queue implemented as a circular list (see the figure
on page 218), write PROCEDURE ENQ (VAR Q : PTR; X : ELTYPE). You do
not have to test for FULLQ.
13. Write PROCEDURE DEQ (VAR Q : PTR; VAR X : ELTYPE) for a queue imple-
mented as a circular list. You do not have to test for EMPTYQ.
For this test use the following declarations where appropriate. You may use GET-
NODE without defining it.
CONST MAX = 7;
      NULL = 0;
TYPE PTR = 0..MAX;
     NODE = RECORD
              INFO : INTEGER;
              NEXT : PTR
            END; (* record *)
1. Fill in the contents of the array NODES after the following numbers have been
inserted into their proper place in the list pointed to by LIST. The list has been
initialized and is empty to begin with. These operations are the first executed
following the execution of Procedure INITIALIZE. Show also the contents of
LIST and AVAIL. Numbers: 17, -23, 42, -17 (the list should be in ascending
order).
2. Given the situation shown above, show the values of the variables that have
been changed by the following insertion and deletion. The list should remain
in order.
INSERT: 7
DELETE: 42
(a) NODES[NODES[NODES[B].NEXT].NEXT].INFO is
(b) NODES[NODES[B].NEXT].INFO is
(c) NODES[NODES[NODES[NODES[B].NEXT].NEXT].NEXT].INFO
is
(d) NODES[NODES[NODES[A].NEXT].NEXT].INFO is
4. LIST PTR
[Figure: a linked list with the external pointers LIST and PTR.]
Write one Pascal statement for each of the following three tasks. Each time
restart with the original data structure as shown above.
(a) Make PTR point to the first block in the list.
(b) Print out the value in the block pointed to by PTR.
(c) Make the block pointed to by PTR point to the first block in the list (creat-
ing a circle).
5. Write a procedure to sum the elements of a linked list pointed to by LIST.
6. Write a procedure, NEXTOP, that inserts a node with the value VAL into the
linked list pointed to by LIST immediately before NODE(P). (P is a pointer, not
a value.) The formal parameters should be LIST, P, and VAL.
250 I
Chapter 6 Pointer Variables
DYNAMIC ALLOCATION
Creation of a variable's storage space in memory during the execution
(rather than compilation) of the program.
NODETYPE = RECORD
INFO : INFOTYPE;
NEXT : POINTERTYPE
END; (* record *)
P := LIST
Using Pascal Pointer Variables 1253
If, however, we decided to keep another list with nodes of a different
type (e.g., ↑XNODETYPE), we could not assign values of type
↑NODETYPE (as defined above) to pointer variables of the second
type. Therefore, after adding these declarations to the ones above
XNODETYPE = RECORD
DATA : INTEGER;
NEXT : XPOINTERTYPE
END; (* record *)
XLIST := LIST;
LIST := Q;
P := XLIST;
These variables are all pointers, but LIST and P point to records of
type NODETYPE, while XLIST and Q point to records of type
XNODETYPE. We could, however, make the assignment
Q := XLIST;
The definition of types in the VAR section is legal but bad style.
These variables cannot subsequently be used as actual parameters for
procedures and functions. That is, we cannot write
UP-ARROW SYNTAX
We have already seen that the symbol ↑, followed by a type identifier,
defines a pointer type. This up-arrow symbol is also used in other ways.
Pointer-variable-name↑ denotes the variable to which the pointer variable
is pointing. For example, using the declarations of POINTERTYPE and
NODETYPE above, LIST↑ refers to the node of the linked list pointed to
by the pointer variable LIST. Note the difference between LIST and
LIST↑: the first is a pointer variable (an address), while the second is the
actual data it points to (a record).
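The difference between a pointer and the variable it points to shows up clearly when two pointers share one node. The following is a minimal sketch, assuming INFOTYPE is INTEGER; on most keyboards the up-arrow is typed as ^, which is the form used here.

```pascal
program ArrowDemo(output);
(* Shows that P := List copies an address, not a record. *)
type
  PointerType = ^NodeType;
  NodeType = record
               Info : integer;     (* assumes INFOTYPE is INTEGER *)
               Next : PointerType
             end;
var
  List, P : PointerType;
begin
  new(List);                       (* List holds an address *)
  List^.Info := 5;                 (* List^ is the record at that address *)
  List^.Next := nil;
  P := List;                       (* P and List now point to the same node *)
  P^.Info := 2;                    (* changes the one shared node *)
  writeln(List^.Info : 1);
  dispose(List)
end.
```

Because P and LIST hold the same address, the value stored through P is the value seen through LIST.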
P := P↑.NEXT
Figure 6-1. (a) LIST and P are pointers. (b) P := P↑.NEXT (c) P := LIST (d) LIST↑.INFO := 2 (e) Q := LIST↑.NEXT
P := LIST
• Set the value of the INFO field of the first node in the list to 2 [Figure
6-1(d)]:
LIST↑.INFO := 2
Both LIST↑.INFO and 2 are of type INTEGER. (This could also have
been accomplished, in this example, by the statement P↑.INFO := 2.)
• Set another pointer, Q, to point to the second node in the list [Figure
6-1(e)]:
Q := LIST↑.NEXT
To see how this works, follow the arrows in Figure 6-1(e). LIST points
to the first node. LIST↑ is the first node. LIST↑.NEXT is the NEXT
field of the first node. It contains a pointer to the second node. The
NEXT field is of type POINTERTYPE, as is the variable Q, so we can
make the assignment.
As you can see, it is possible to make some very complicated pointer
expressions. For instance, to print the value of the INFO field of the third
node of the linked list in Figure 6-2(a), we can write
WRITELN(LIST↑.NEXT↑.NEXT↑.INFO)
Use the diagrams in Figure 6-2 to convince yourself that the statement is
correct.
P := LIST
Figure 6-2. (a) LIST (b) LIST↑ ... (g) LIST↑.NEXT↑.NEXT↑.INFO
WRITELN(P↑.INFO)
P := P↑.NEXT
This process is repeated until the end of the list is reached. How do we
know when we have reached the end? The last node in the list has a special
value, called NIL, in its NEXT field. In the array-of-records implementa-
tion, we created a special constant called NULL to indicate an impossible
address (index) in the array. Similarly, the Pascal reserved word NIL is an
impossible address for a pointer variable. We can control our print loop
with the statement
WHILE P <> NIL DO
(Note that we cannot substitute the value 0 for the word NIL in this imple-
mentation.) We can represent the value NIL graphically by placing a slash
across the field whose value is NIL.
A procedure that would print all the values in a linked list might look
like this:
VAR P : POINTERTYPE;
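A complete, runnable version of such a print procedure might be sketched as follows, assuming INFOTYPE is INTEGER. The procedure name PRINTLIST and the list-building loop in the main program are illustrative, and the up-arrow is typed as ^.

```pascal
program PrintListDemo(output);
(* Builds the list 10, 20, 30 and prints it with a traversal loop. *)
type
  PointerType = ^NodeType;
  NodeType = record
               Info : integer;     (* assumes INFOTYPE is INTEGER *)
               Next : PointerType
             end;
var
  List, NewNode : PointerType;
  I : integer;

procedure PrintList(List : PointerType);
(* Prints the INFO field of every node, first to last. *)
var
  P : PointerType;
begin
  P := List;
  while P <> nil do                (* NIL marks the end of the list *)
    begin
      writeln(P^.Info : 1);
      P := P^.Next
    end
end;

begin
  List := nil;
  for I := 3 downto 1 do           (* insert 30, 20, 10 at the front *)
    begin
      new(NewNode);
      NewNode^.Info := I * 10;
      NewNode^.Next := List;
      List := NewNode
    end;
  PrintList(List)
end.
```

An empty list needs no special case: P starts out NIL and the loop body never executes.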
[Figure 6-3. Pushing NEWNODE onto STACK, panels (a) through (c).]
NEW(P)
BEGIN (* push *)
  (* Get a new node and put VAL into the INFO field. *)
  NEW(NEWNODE);                  (* Figure 6-3(a) *)
  NEWNODE↑.INFO := VAL;          (* Figure 6-3(b) *)
DISPOSE(P)
makes future reference to P illegal, until P has been redefined by an assign-
ment statement (e.g., P : = LIST) or a subsequent call to NEW(P).
What happens when DISPOSE is called? There is no standard treatment
in the Pascal language; each implementation of Pascal has its own version
of the DISPOSE procedure. In some implementations, P is set to NIL; in
others, P is left unchanged, with a valid address stored in P. Therefore, it is
the programmer's responsibility to make sure that references to P never
follow a call to DISPOSE.
A simple example of the use of the DISPOSE procedure can be seen in
the following procedure (see Figure 6-4), which removes the first node
from the list and returns the value of its INFO field to the calling program.
(This should also sound familiar by now.)
[Figure 6-4. Removing the first node from STACK, panels (a) through (d), showing TOP and VAL.]
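A procedure of this kind might be sketched as follows. This is a minimal sketch rather than the text's own listing: it assumes INFOTYPE is INTEGER and a nonempty stack, it adds a PUSH procedure only to give POP something to remove, and it types the up-arrow as ^.

```pascal
program PopDemo(output);
(* Pushes two values, then pops them, disposing of each node. *)
type
  PointerType = ^NodeType;
  NodeType = record
               Info : integer;     (* assumes INFOTYPE is INTEGER *)
               Next : PointerType
             end;
var
  Stack : PointerType;
  Val : integer;

procedure Push(var Stack : PointerType; Val : integer);
var
  NewNode : PointerType;
begin
  new(NewNode);
  NewNode^.Info := Val;
  NewNode^.Next := Stack;
  Stack := NewNode
end;

procedure Pop(var Stack : PointerType; var Val : integer);
(* Assumes that Stack is not empty. *)
var
  Top : PointerType;
begin
  Top := Stack;                    (* remember the first node *)
  Val := Top^.Info;                (* save its value *)
  Stack := Top^.Next;              (* unlink it before disposing *)
  dispose(Top)                     (* Top is not referenced again *)
end;

begin
  Stack := nil;
  Push(Stack, 1);
  Push(Stack, 2);
  Pop(Stack, Val);
  writeln(Val : 1);
  Pop(Stack, Val);
  writeln(Val : 1);
  if Stack = nil
    then writeln('empty')
end.
```

Everything needed from the node is copied out before DISPOSE is called, so no reference to the disposed node ever follows the call.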
DEBUGGING HINTS
Compile-Time Problems
• Pascal pointer variables are typed pointers. That is, they point to data
of a particular type and only that type. Remember that pointers must
be of the same type in order for them to be compared or assigned to
each other.
• Don't confuse the pointer with the variable it points to. If P and Q are
both type PTRTYPE = ↑NODETYPE [where NODETYPE is a record
with two fields, INFO and NEXT (type PTRTYPE)], P := Q↑.NEXT
is legal, since P and the NEXT field of Q are both type
PTRTYPE, but P := Q↑ is illegal, since P is a pointer and Q↑ is a
node.
• Be careful when using complex pointer expressions. Some compilers
limit the complexity of pointer expressions, requiring the programmer
to rewrite the expression using a temporary variable. For instance, if
your compiler will not accept
DATA1 := P↑.NEXT↑.NEXT↑.NEXT↑.INFO
you can write
TEMPPTR := P↑.NEXT↑.NEXT;
DATA1 := TEMPPTR↑.NEXT↑.INFO
Run-Time Problems
• Be sure that a pointer is not NIL before accessing its referenced variable.
If the pointer P is either NIL or undefined, accessing P↑ will give
you a run-time error.
• Be especially careful with compound expressions in a WHILE loop.
Most Pascal compilers evaluate both sides of a compound expression
(one using AND or OR), regardless of the outcome of the first expression.
For instance, a loop condition such as (P <> NIL) AND (P↑.INFO <> VAL)
references P↑ even when P is NIL.
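One way around the problem is to keep the pointer test and the field test in separate expressions, using a Boolean flag. The following is a minimal sketch, assuming INFOTYPE is INTEGER; the flag FOUND and the search value 40 are illustrative, and the up-arrow is typed as ^.

```pascal
program SafeSearchDemo(output);
(* Searches a list for a value without ever referencing P^ *)
(* when P is NIL.                                          *)
type
  PointerType = ^NodeType;
  NodeType = record
               Info : integer;     (* assumes INFOTYPE is INTEGER *)
               Next : PointerType
             end;
var
  List, NewNode, P : PointerType;
  I : integer;
  Found : boolean;
begin
  List := nil;
  for I := 3 downto 1 do           (* build the list 10, 20, 30 *)
    begin
      new(NewNode);
      NewNode^.Info := I * 10;
      NewNode^.Next := List;
      List := NewNode
    end;
  P := List;
  Found := false;
  (* Both operands are safe to evaluate, whatever the value of P. *)
  while (P <> nil) and not Found do
    if P^.Info = 40                (* reached only when P <> NIL *)
      then Found := true
      else P := P^.Next;
  if Found
    then writeln('found')
    else writeln('not found')
end.
```

The value 40 is not in the list, so P runs off the end; the loop then stops on P = NIL without touching P^.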
MORE ON LISTS
Now that you are familiar with the syntax of the Pascal pointer type, let's
look at some other things we can do with linked lists.
Now, when we insert into or delete from the list, there will be only one
case to consider-the case of inserting or deleting in the middle of the list.
No value for the key field will be "smaller" than that in the header node or
"larger" than that in the trailer node. (See Figure 6-6.)
A header node may also be used for a quite different purpose. There may
be times when you wish to carry some special information about the list,
data that you will need often. For example, you may frequently want to
know how many nodes there are in the list. You could keep count of the
number of elements in the list in a separate variable, LISTCOUNT.
Alternatively, you could bind the count information to the list itself by storing
the count in the INFO field of a header node, incrementing and decrement-
ing the count as you insert into and delete from the list.
Depending on the particular application, you may want to use a header,
a trailer, both, or neither.
Figure 6-7. We want to delete NODE(P), but we can't get to its predecessor node.
Note that in a linear doubly linked list, the BACK field of the first node,
as well as the NEXT field of the last node, contains NIL.
Figure 6-8. (a) Inserting into a singly linked list. (b) Inserting into a doubly linked list.
our pointer to Q's successor. A correct order for the pointer changes would
be
P↑.BACK := Q;
P↑.NEXT := Q↑.NEXT;
Q↑.NEXT↑.BACK := P;
Q↑.NEXT := P;
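The four assignments can be checked in a small program. This is a minimal sketch, assuming a node type with an integer INFO field and BACK and NEXT pointers, and assuming that Q is not the last node (so Q↑.NEXT is not NIL). The variables FIRST and LAST are illustrative, and the up-arrow is typed as ^.

```pascal
program DoubleInsertDemo(output);
(* Inserts a node after Q in a doubly linked list, then prints *)
(* the list forward and backward.                              *)
type
  PointerType = ^NodeType;
  NodeType = record
               Info : integer;
               Back, Next : PointerType
             end;
var
  First, Last, Q, P : PointerType;
begin
  (* Build the two-node list 10, 30. *)
  new(First);
  new(Last);
  First^.Info := 10;  First^.Back := nil;    First^.Next := Last;
  Last^.Info := 30;   Last^.Back := First;   Last^.Next := nil;
  Q := First;
  new(P);
  P^.Info := 20;
  (* Insert P after Q; Q^.Next is changed last because it is *)
  (* the only pointer to Q's successor.                      *)
  P^.Back := Q;
  P^.Next := Q^.Next;
  Q^.Next^.Back := P;
  Q^.Next := P;
  P := First;                      (* forward traversal *)
  while P <> nil do
    begin
      writeln(P^.Info : 1);
      P := P^.Next
    end;
  P := Last;                       (* backward traversal *)
  while P <> nil do
    begin
      writeln(P^.Info : 1);
      P := P^.Back
    end
end.
```

Printing in both directions confirms that the NEXT pointers and the BACK pointers were all updated consistently.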
Given a circular doubly linked list, let's write a procedure to traverse the
list backwards, printing all the elements in the list. (We will assume that
the external pointer LIST points to the last node in the list, as discussed in
the previous chapter. Note, however, that since the list is doubly linked, it
is completely symmetrical. We could just as well have LIST point to the
first node.)
VAR P : POINTERTYPE;
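A backward traversal of this kind might be sketched as follows. This is a minimal sketch, assuming integer INFO fields and LIST pointing to the last node; the procedure name PRINTBACKWARD and the list-building loop are illustrative, and the up-arrow is typed as ^.

```pascal
program BackwardDemo(output);
(* Builds the circular doubly linked list 1, 2, 3 and prints it *)
(* from last to first.                                          *)
type
  PointerType = ^NodeType;
  NodeType = record
               Info : integer;
               Back, Next : PointerType
             end;
var
  List, NewNode : PointerType;
  I : integer;

procedure PrintBackward(List : PointerType);
(* List points to the last node, or is NIL for an empty list. *)
var
  P : PointerType;
begin
  if List <> nil
    then
      begin
        P := List;                 (* start at the last node *)
        repeat
          writeln(P^.Info : 1);
          P := P^.Back
        until P = List             (* back where we started *)
      end
end;

begin
  List := nil;
  for I := 1 to 3 do
    begin
      new(NewNode);
      NewNode^.Info := I;
      if List = nil
        then
          begin                    (* a single node points to itself *)
            NewNode^.Next := NewNode;
            NewNode^.Back := NewNode
          end
        else
          begin                    (* insert after the current last node *)
            NewNode^.Back := List;
            NewNode^.Next := List^.Next;
            List^.Next^.Back := NewNode;
            List^.Next := NewNode
          end;
      List := NewNode              (* the new node is now the last node *)
    end;
  PrintBackward(List)
end.
```

In a circular list no pointer is NIL, so the loop ends when P returns to its starting node rather than when it reaches NIL.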
Salesperson  003  004  026  056  072  124  155  237  274  287  ...  822  853  933  945  949
Addams         0   36   91    0    0    0    0    0    0   28  ...    0    0    0    0    0
Baker         93    0   33   59    0    0    0    0    0    0  ...    0    0   56    0    0
Cole          39    0    0   26   55    0    0    0    0   33  ...    0    0   39    0    5
Dale           0    0    0    0    0    0    0    0    0    0  ...    0   76   47   98   45
Xavier         0   20   23   33   64    0    0    0    0    0  ...   36    0    0    0    0
Young          0    0    0    0    0   54   46   78   36    0  ...   71    0    0    0    0
Zarro         48    0    0    0    0    0    0    0    0   69  ...    0   87    0   67    0
[Figure 6-10. The sparse matrix SALES represented as linked lists, with a header node for the matrix, header nodes for the columns, and header nodes for the rows.]
Applications ot Linked Lists 1269
example of a sparse matrix. Though it is natural to think of implementing a
matrix as a two-dimensional array, a sparse matrix may be more efficiently
implemented (with regard to space use, not time) as a linked-list structure.
A sparse matrix represented as linked lists with header nodes is pictured in
Figure 6-10.
An operating system is a program that manages and allocates the re-
sources of a computer system. Operating systems use linked lists in many
ways. The allocation of memory space may be managed using a doubly
linked list of variably sized blocks of memory. Doubly linking the list facili-
tates removal of blocks from the middle of the list. In a multi-user system,
the operating system may keep track of user jobs waiting to execute through
linked queues of control blocks.
The Pascal pointer type, coupled with dynamic storage allocation, gives us
another way to implement linked lists. In general, applications that use
linked lists can be coded with either an array or pointer-variable imple-
mentation. Therefore, the applications discussed in the previous chapter
are also appropriate to this one.
There are, however, situations in which the ability to allocate more
space dynamically makes the pointer-variable implementation a better
choice. In a program in which the amount of data to be stored is very
unpredictable or may vary widely, the dynamic implementation has major
advantages. Pointer variables provide efficient access to nodes. The actual
address of the node is stored in the pointer. In an array implementation, the
address of NODES[P] must be computed by adding P (an index) to the base
address of the array NODES, as we saw in Chapter 2.
Pointer variables also present a problem when we need to retain the data
in the data structure between runs of the program. For instance, we may
want to write all the nodes in the list to a file, and then use this file as input
the next time we run the program. An array index would still be valid on the
next run of the program, while a pointer variable (an actual address)
would be meaningless.
Finally, there are a number of languages which do not have pointer
types. If you were programming in FORTRAN, for instance, you would
have to represent pointers as array indexes. In fact, since FORTRAN also
doesn't support record types, you would need to set up a parallel array of
pointers (indexes).
(b) Q := P            P  Q  R
(f) R↑.NEXT := P      P  Q  R
2. Write one statement (using the ↑ notation) to effect the change indicated by the
dotted line.
[Figures (a) through (d): linked-list diagrams with the required change drawn as a dotted line; (d) shows the external pointers LIST and P.]
BEGIN
IF P↑.INFO MOD 2 = 0
THEN EVEN := TRUE
ELSE EVEN := FALSE
END;
(b) PROCEDURE SUCCESSOR (P : PTR);
BEGIN
WHILE P <> NIL DO
WRITELN(P↑.INFO, ' IS FOLLOWED BY ',
P↑.NEXT↑.INFO);
END;
BEGIN
VAL := LIST↑.INFO;
DISPOSE(LIST);
LIST := LIST↑.NEXT
END;
8. Write a procedure to delete a node from a doubly linked linear list that does not
have a header or a trailer node.
[Diagram omitted: a circular doubly linked list with external pointers A, B,
and C.]
9. For the circular doubly linked list above:
(a) Express the INFO field of node 1 referenced from pointer A.
(b) Express the INFO field of node 1 referenced from pointer C.
(c) Express the NEXT field of node 4 referenced from pointer B.
(d) Express the NEXT field of node 4 referenced from pointer C.
(e) Express node 1 referenced from pointer B.
(f) Express the BACK field of node 3 referenced from pointer B.
(g) Express the BACK field of node 2 referenced from pointer B.
Pre-Test
PTR;
INTEGER;
BEGIN (* count *)
  CT := 0;
  P := ______;
  WHILE ______ DO
    BEGIN
      CT := CT + 1;
      P := ______
    END;
  ______ := CT
END; (* count *)
2. Fill in the values indicated by the pointer variable expressions below, based on
the following linked list:
[Diagram omitted: a linked list with external pointers A and B.]
(a) A↑.NEXT↑.NEXT↑.INFO
(b) B↑.NEXT↑.INFO
(c) B↑.NEXT↑.BACK↑.BACK↑.INFO
(d) A↑.BACK↑.NEXT↑.INFO
3. TYPE PTRTYPE = ↑BLOCK;
        BLOCK = RECORD
                  SCORE : INTEGER;
                  NAME : PACKED ARRAY[1..20] OF CHAR;
                  NEXT : PTRTYPE
                END; (* record *)
Write a code fragment to take a list of blocks (as defined above) pointed to by
START and create two new lists pointed to by PASS and FAIL, respectively, in
which the first list contains all blocks having scores >= 50 and the second
contains all blocks having scores < 50. It is required that the blocks in the two
new lists occur in exactly the same order as in the original list.
4. Write a function, MEMBER, that returns TRUE if a value VAL occurs anywhere
in the linked list pointed to by LIST. This function returns FALSE otherwise.
Use the following function heading:
FUNCTION MEMBER (VAL: INTEGER; LIST: PTRTYPE)
: BOOLEAN;
5. TYPE PTRTYPE = ↑NODETYPE;
        NODETYPE = RECORD
                     INFO : CHAR;
                     NEXT : PTRTYPE
                   END; (* record *)
Write a nonrecursive Pascal procedure, REVERSE, that takes one input param-
eter, CIRCLE, of type PTRTYPE. Assume that CIRCLE points to the "first"
node in a singly linked nonempty circular list. The purpose of this procedure is
to print out the characters in the list CIRCLE in reverse order. For instance, the
output for the following example would be ZYX:
Hint: Use a stack of pointers. You may assume that the appropriate STACK-
TYPE declaration and all stack utility routines are already written.
Chapter 7 Recursion
As a beginning programmer, you may have been told never to use a func-
tion name within the function on the right-hand side of an assignment state-
ment, as in the following program:
VAR I : INTEGER;
BEGIN (* sum *)
  SUM := 0;
  FOR I := 1 TO NUMITEMS DO
    SUM := SUM + LIST[I]
END; (* sum *)
You were probably told that using a function name this way would cause
something mysterious and undesirable to occur: the function would try to
call itself recursively. You may also have been told that you would learn
how to use recursion as a powerful programming tool in a later course. In
this chapter we will explore how to understand and to write recursive
functions and procedures, as well as how recursion works in a high-level
language like Pascal.
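For contrast, here is a minimal sketch of one common way to write the summing function without the accidental recursive call: accumulate the total in a local variable and assign to the function name exactly once. The local name TOTAL, the LISTTYPE declaration, and the small driver program are assumptions made for this illustration.

```pascal
PROGRAM SUMDEMO;
CONST
  MAXLIST = 5;
TYPE
  LISTTYPE = ARRAY[1..MAXLIST] OF INTEGER;
VAR
  DATA : LISTTYPE;
  K : INTEGER;

FUNCTION SUM (VAR LIST : LISTTYPE; NUMITEMS : INTEGER) : INTEGER;
VAR
  I, TOTAL : INTEGER;
BEGIN (* sum *)
  TOTAL := 0;
  FOR I := 1 TO NUMITEMS DO
    TOTAL := TOTAL + LIST[I];
  SUM := TOTAL        (* the function name appears only on the left *)
END; (* sum *)

BEGIN
  FOR K := 1 TO MAXLIST DO
    DATA[K] := K;                   (* 1, 2, 3, 4, 5 *)
  WRITELN(SUM(DATA, MAXLIST))       (* 1 + 2 + 3 + 4 + 5 = 15 *)
END.
```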
AN EXAMPLE OF RECURSION
Mathematicians often define concepts in terms of the process used to
generate them. For instance, one mathematical description of n! (read
"n factorial"; this value is equal to the number of permutations of n
elements) is
    n! = 1,                                    if n = 0
    n! = n * (n - 1) * (n - 2) * ... * 1,      if n > 0
Consider the case of 4!. Since n > 0, we use the second part of the
definition:
    4! = 4 * 3 * 2 * 1 = 24
This definition essentially provides a different definition for each value
of n, since the three dots stand in for the intermediate factors. That is,
the definition of 2! is 2 * 1, while the definition of 3! is 3 * 2 * 1, and
so forth.
We can also express n! with a single definition for any nonnegative value
of n:
    n! = 1,                  if n = 0
    n! = n * (n - 1)!,       if n > 0
This definition is recursive, since we express the factorial function in
terms of itself.
Let's consider the recursive calculation of 4! intuitively. Four is not
equal to 0, so we use the second half of the definition:
    4! = 4 * (4 - 1)! = 4 * 3!
Of course, we can't do the multiplication yet, since we don't know the
value of 3!. So we call up our good friend Sue Ann, who has a Ph.D. in
math, to find the value of 3!.
Sue Ann has the same formula that we have for calculating the factorial
function, so she knows that
    3! = 3 * (3 - 1)! = 3 * 2!
She doesn't know the value of 2!, however, so she puts you on hold and
calls up her friend Max, who has an M.S. in math.
Max has the same formula that Sue Ann has, so he quickly calculates that
    2! = 2 * (2 - 1)! = 2 * 1!
But Max can't complete the multiplication because he doesn't know the
value of 1!. He puts Sue Ann on hold and calls up his mother, who has a
B.A. in math education.
Max's mother has the same formula that Max has, so she quickly figures
out that
    1! = 1 * (1 - 1)! = 1 * 0!
Of course, she can't perform the multiplication, since she doesn't have the
value of 0!. So Mom puts Max on hold and calls up her colleague Bernie,
who has a B.A. in English literature.
Bernie doesn't need to know any math to figure out that 0! = 1 because
he can read that information in the first clause of the formula (n! = 1, if
n = 0). He reports the answer immediately to Max's mother. She can now
complete her calculations:
    1! = 1 * 0! = 1 * 1 = 1
and she reports back to Max. Max now performs the multiplication in his
formula, and learns that
    2! = 2 * 1! = 2 * 1 = 2
He reports back to Sue Ann, who can now finish her calculation:
    3! = 3 * 2! = 3 * 2 = 6
Sue Ann calls you with this exciting bit of information. You can now
complete your calculation:
    4! = 4 * 3! = 4 * 6 = 24
PROGRAMMING RECURSIVELY
Notice the two uses of NFACT in the ELSE clause. On the left side of
the assignment statement, NFACT is the function name receiving a value.
This use is the one that we are accustomed to seeing. On the right side of
the assignment statement, NFACT is a recursive call to the function, with
the parameter N - 1.
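A minimal sketch of a function consistent with this description follows. The three lines of the IF statement correspond to lines 1, 2, and 3 of a walk-through of the function. The driver program, and the assumption that N is never negative, are additions made for this illustration.

```pascal
PROGRAM FACTDEMO;

FUNCTION NFACT (N : INTEGER) : INTEGER;
(* Assumes N >= 0. *)
BEGIN (* nfact *)
  IF N = 0                              (* line 1 *)
    THEN NFACT := 1                     (* line 2: base case *)
    ELSE NFACT := N * NFACT(N - 1)      (* line 3: recursive call *)
END; (* nfact *)

BEGIN
  WRITELN(NFACT(4))                     (* 4! = 24 *)
END.
```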
Let's walk through the calculation of 4! using function NFACT. The
original value of N is 4.
Line  Action
 1    4 is not equal to 0, so skip to ELSE clause.
 3    NFACT := 4 * NFACT(4 - 1)
      First recursive call returns us to the beginning of the function,
      with N = 3.
 1    3 is not equal to 0, so skip to ELSE clause.
 3    NFACT := 3 * NFACT(3 - 1)
      Second recursive call returns us to the beginning of the function,
      with N = 2.
 1    2 is not equal to 0, so skip to ELSE clause.
 3    NFACT := 2 * NFACT(2 - 1)
      Third recursive call returns us to the beginning of the function,
      with N = 1.
 1    1 is not equal to 0, so skip to ELSE clause.
 3    NFACT := 1 * NFACT(1 - 1)
      Fourth recursive call returns us to the beginning of the function,
      with N = 0.
 1    0 = 0, so go to line 2.
 2    NFACT := 1
      The value of NFACT(0) is returned to the calling statement, the
      fourth recursive call.
 3    NFACT := 1 * NFACT(0) = 1 * 1 = 1
      The value of NFACT(1) is returned to the calling statement, the
      third recursive call.
 3    NFACT := 2 * NFACT(1) = 2 * 1 = 2
      The value of NFACT(2) is returned to the calling statement, the
      second recursive call.
 3    NFACT := 3 * NFACT(2) = 3 * 2 = 6
      The value of NFACT(3) is returned to the calling statement, the
      first recursive call.
 3    NFACT := 4 * NFACT(3) = 4 * 6 = 24
      The function now returns a value of 24 to the original calling
      statement, e.g., WRITE(NFACT(4)).
FUNCTION SUM REVISITED
If only we had a function that would sum all the rest of the elements in
the array .... But we do have one! Function SUM sums elements in an
array; we just need to start at the second, instead of the first, element (a
smaller case). This indicates that we will need to pass the starting place
(an array index) to the function as a parameter, rather than keeping the
index as a local variable.
FUNCTION SUM
Definition of problem: Sum all the elements in the array.
Size of problem: the whole array LIST, from LIST[1] to
    LIST[NUMITEMS].
Base case: when I = NUMITEMS, SUM ← LIST[I].
General case: when I < NUMITEMS, SUM ← LIST[I] + SUM(rest
    of LIST).
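The design above can be sketched as follows. The parameter order (array, starting index I, then NUMITEMS), the VAR array parameter (used here only to avoid copying the array on every call), and the driver program are assumptions made for this illustration.

```pascal
PROGRAM RECSUM;
CONST
  MAXLIST = 100;
TYPE
  LISTTYPE = ARRAY[1..MAXLIST] OF INTEGER;
VAR
  INVENTORY : LISTTYPE;
  K : INTEGER;

FUNCTION SUM (VAR LIST : LISTTYPE; I, NUMITEMS : INTEGER) : INTEGER;
(* Sums LIST[I] through LIST[NUMITEMS]; assumes I <= NUMITEMS. *)
BEGIN (* sum *)
  IF I = NUMITEMS
    THEN SUM := LIST[I]                              (* base case *)
    ELSE SUM := LIST[I] + SUM(LIST, I + 1, NUMITEMS) (* general case *)
END; (* sum *)

BEGIN
  FOR K := 1 TO MAXLIST DO
    INVENTORY[K] := 1;
  WRITELN(SUM(INVENTORY, 1, 100))    (* one hundred 1s sum to 100 *)
END.
```

Each recursive call receives a starting index one larger than before, so the part of the array still to be summed shrinks by one element per call until only LIST[NUMITEMS] is left.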
The user of this function would invoke it with the name of the array, and
the upper and lower indexes of the array to be summed. For instance, to
print the sum of the first 100 elements in the array INVENTORY, we would
state
A BOOLEAN FUNCTION
Our next problem is to write a Boolean function, SEARCH, that searches an
array, LIST (ARRAY[1..MAXLIST] OF INTEGER), for the value, VAL.
Using the approach discussed above, we generate the following
information:
FUNCTION SEARCH
Definition of problem: search array LIST from LIST[1] to
    LIST[MAXLIST]. Return TRUE if VAL is
    found, FALSE otherwise.
Size of problem: size of the array (MAXLIST).
Base cases: (1) when LIST[I] = VAL, SEARCH ← TRUE;
    (2) when I = MAXLIST (whole array searched) and
        LIST[I] <> VAL, SEARCH ← FALSE.
General case: SEARCH the rest of the array.
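A minimal sketch of a function that follows this design is shown below. The parameter order, the constant SIZE used to declare the array type, and the sample data are assumptions made for this illustration. The index I names the first element that has not yet been examined.

```pascal
PROGRAM SRCHDEMO;
CONST
  SIZE = 5;
TYPE
  LISTTYPE = ARRAY[1..SIZE] OF INTEGER;
VAR
  NUMS : LISTTYPE;
  K : INTEGER;

FUNCTION SEARCH (VAR LIST : LISTTYPE;
                 MAXLIST, VAL, I : INTEGER) : BOOLEAN;
(* Searches LIST[I] through LIST[MAXLIST] for VAL. *)
BEGIN (* search *)
  IF LIST[I] = VAL
    THEN SEARCH := TRUE                        (* base case 1 *)
    ELSE IF I = MAXLIST
      THEN SEARCH := FALSE                     (* base case 2 *)
      ELSE SEARCH := SEARCH(LIST, MAXLIST, VAL, I + 1)
END; (* search *)

BEGIN
  FOR K := 1 TO SIZE DO
    NUMS[K] := 2 * K + 1;                (* 3, 5, 7, 9, 11 *)
  WRITELN(SEARCH(NUMS, SIZE, 7, 1));     (* 7 is in the array *)
  WRITELN(SEARCH(NUMS, SIZE, 8, 1))      (* 8 is not *)
END.
```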
We know from the definition and size of the problem that we will need to
pass several pieces of information to the function. Of course, we need to
pass the array (LIST), its size (MAXLIST), and VAL. This information
would be required by an iterative solution as well. For a recursive solution,
the function needs an additional parameter: the index (I) of the lower limit
of the array to be searched. In our initial function call, this value
will be 1. In our general case (SEARCH the rest of the array), this
parameter will have to be incremented, effectively diminishing the size of
the problem solved by the recursive call. That is, searching the array from
I + 1 to MAXLIST is a smaller task than searching from I to MAXLIST.
Following is the Function SEARCH frozen in mid-execution:
[Figure omitted: Function SEARCH frozen in mid-execution, with VAL = 7.]
Note again that the index that acts as a counter through the array, I, is
A RECURSIVE PROCEDURE
So far we have written several recursive functions. Now let's try to write a
recursive procedure. We want to print out the elements of a linked list
implemented with pointer variables.
Use the following declarations:
    NODE = RECORD
             INFO : INFOTYPE;
             NEXT : PTR
           END; (* record *)
By now, you are probably protesting that this task is so easy to
accomplish nonrecursively that it doesn't make any sense to write it using
recursion. So let's make the task more fun: Write a procedure, REVPRINT, to
print out the elements of a linked list backwards. This problem is
somewhat more challenging to write nonrecursively, but simple to solve
recursively.
What is the task to be performed? First, we want to print out the second
through last elements in the list, in reverse order. Then, we need to print
the first element in the list (Figure 7-1).
[Figure 7-1. The algorithm for the recursive REVPRINT. First, print out the
second through last elements of the list, backwards. Then, print the first
element. Result: E D C B A]
We know how to do the second part of this task. If P points to the first
node in the list, we can print its contents with the statement
WRITE(P↑.INFO). The first part of the task is not much more complicated,
since we already have a procedure for printing out the second
through last elements of the list: REVPRINT. Of course, we have to adjust
the parameter somewhat: REVPRINT(P↑.NEXT). This says: print, in
reverse order, the linked list pointed to by P↑.NEXT. Of course, this task
is also accomplished recursively in two steps:
PROCEDURE REVPRINT
Definition: print out the list in reverse order.
Size: number of elements in the list pointed to by P.
Base case: when list is empty, do nothing.
General case: REVPRINT the list pointed to by P↑.NEXT, then
    print P↑.INFO.
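The design above can be sketched as the following program. The pointer symbol ↑ is typed as ^ here, as most compilers require. The choice of CHAR for INFOTYPE, the TYPE declaration of PTR, and the driver that builds the list A B C D E are assumptions made for this illustration.

```pascal
PROGRAM REVDEMO;
TYPE
  INFOTYPE = CHAR;
  PTR = ^NODE;
  NODE = RECORD
           INFO : INFOTYPE;
           NEXT : PTR
         END; (* record *)
VAR
  LIST, Q : PTR;
  CH : CHAR;

PROCEDURE REVPRINT (P : PTR);
BEGIN (* revprint *)
  IF P <> NIL                    (* base case: empty list, do nothing *)
    THEN
      BEGIN
        REVPRINT(P^.NEXT);       (* first, print the rest backwards *)
        WRITE(P^.INFO)           (* then, print this element *)
      END
END; (* revprint *)

BEGIN
  (* Build the list A B C D E by inserting E, D, ..., A at the front. *)
  LIST := NIL;
  FOR CH := 'E' DOWNTO 'A' DO
    BEGIN
      NEW(Q);
      Q^.INFO := CH;
      Q^.NEXT := LIST;
      LIST := Q
    END;
  REVPRINT(LIST);                (* writes EDCBA *)
  WRITELN
END.
```

Swapping the two statements in the general case would print the list in its original order instead.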
REVPRINT REVISITED
BEGIN (* revprint *)
  (* Push pointers to all nodes onto stack. *)
  CLEARSTACK(STACK);
  P := LIST;
  WHILE P <> NIL DO
    BEGIN
      PUSH(STACK, P);
      P := P↑.NEXT
    END; (* while P <> nil *)
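A complete nonrecursive version along these lines might look like the sketch below. Only CLEARSTACK and PUSH appear in the fragment above; the array-based STACKTYPE, the routines EMPTYSTACK and POP, and the driver program are assumptions made so that the example stands alone. The pointer symbol ↑ is typed as ^.

```pascal
PROGRAM REVSTACK;
CONST
  MAXSTACK = 20;
TYPE
  PTR = ^NODE;
  NODE = RECORD
           INFO : CHAR;
           NEXT : PTR
         END;
  STACKTYPE = RECORD
                ELEMENTS : ARRAY[1..MAXSTACK] OF PTR;
                TOP : INTEGER
              END;
VAR
  LIST, Q : PTR;
  CH : CHAR;

PROCEDURE CLEARSTACK (VAR STACK : STACKTYPE);
BEGIN
  STACK.TOP := 0
END;

FUNCTION EMPTYSTACK (VAR STACK : STACKTYPE) : BOOLEAN;
BEGIN
  EMPTYSTACK := STACK.TOP = 0
END;

PROCEDURE PUSH (VAR STACK : STACKTYPE; P : PTR);
(* Assumes the stack is not full. *)
BEGIN
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := P
END;

PROCEDURE POP (VAR STACK : STACKTYPE; VAR P : PTR);
(* Assumes the stack is not empty. *)
BEGIN
  P := STACK.ELEMENTS[STACK.TOP];
  STACK.TOP := STACK.TOP - 1
END;

PROCEDURE REVPRINT (LIST : PTR);
VAR
  STACK : STACKTYPE;
  P : PTR;
BEGIN (* revprint *)
  (* Push pointers to all nodes onto stack. *)
  CLEARSTACK(STACK);
  P := LIST;
  WHILE P <> NIL DO
    BEGIN
      PUSH(STACK, P);
      P := P^.NEXT
    END;
  (* Pop the pointers; the last node pushed is printed first. *)
  WHILE NOT EMPTYSTACK(STACK) DO
    BEGIN
      POP(STACK, P);
      WRITE(P^.INFO)
    END
END; (* revprint *)

BEGIN
  LIST := NIL;
  FOR CH := 'E' DOWNTO 'A' DO
    BEGIN
      NEW(Q);
      Q^.INFO := CH;
      Q^.NEXT := LIST;
      LIST := Q
    END;
  REVPRINT(LIST);                (* writes EDCBA *)
  WRITELN
END.
```

The explicit stack does by hand what the recursive version leaves to the system: it remembers each node until everything after it has been printed.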
HOW RECURSION WORKS
VAR X, Y, Z : INTEGER;
Z := X + Y;
SYMBOL    LOCATION
X         0000
Y         0001
Z         0002
[Figure 7-2. Static allocation of space for a program with two subprograms.
Starting at location 0000, memory holds the main program, then SUBPROGRAM 1
and SUBPROGRAM 2, each with space for its RETURN address, parameters,
local variables, and code.]
VAR A, B, C : INTEGER;
    ANS : REAL;
    LIST : LISTTYPE;
would create the following memory assignments:
BASE →  A
        B
        C
        ANS
        LIST[0]
        ...
        LIST[9]
VAR I : INTEGER;
        RETURN
        Y
        I        ← TOP
TOP is set to the address of the last parameter or local variable. The space
required to store these values, therefore, is allowed to grow along with the
level of nesting of procedure calls.
After the procedure or function has finished executing, it returns these
locations for reuse by resetting TOP to TOP minus the locations being
returned. Storage allocation of data connected with a particular invocation
of a procedure or function is supported by a run-time stack, with all the
parameters and local variables accessed by their positions relative to the
top of the stack. These variables are pushed onto the stack at the entrance to
a subprogram, then popped off the stack when the procedure or function
completes execution.
This scheme might be compared to another way of allocating seats in an
auditorium where a lecture has been scheduled. A finite number of
invitations are issued, but each guest is asked to bring a chair. In addition,
each guest can invite an unlimited number of friends, as long as they all
bring their own chairs. Of course, if the number of extra guests gets out of
hand, the space in the auditorium will run out, and there will not be enough
room for any more friends or chairs. Similarly, the level of recursion in a
program must eventually be limited by the amount of memory available in the
run-time stack.
NFACT-ONE MORE TIME
When the original call
    ANS := NFACT(5)      (* R1: the original call *)
is executed, three locations are put in the run-time stack: one for the
return address, one for the formal parameter N, and one for the function
identifier, which is in essence a VAR parameter. The return address will be
the place in the translated code where the result is stored into ANS. Let's
call it R1. The value of the actual parameter N is stored in the second place
(N's relative place), and the third place will still be undefined, since we
have not yet made any assignments to NFACT.
            global variables
(RETURN)    R1
(N)         5
(NFACT)     ?        ← TOP
NFACT := N * NFACT(N - 1)
            global variables
(RETURN)    R1
(N)         5
(NFACT)     ?
(RETURN)    R2
(N)         4
(NFACT)     ?        ← TOP
NFACT := N * NFACT(N - 1)
So the function NFACT is called again from the place within the function
that we called R2. This process continues until the situation looks like
Figure 7-3:
            (RETURN)   R1
            (N)        5        1st call
            (NFACT)    ?
            (RETURN)   R2
            (N)        4        2nd call
            (NFACT)    ?
            (RETURN)   R2
            (N)        3        3rd call
            (NFACT)    ?
            (RETURN)   R2
            (N)        2        4th call
            (NFACT)    ?
            (RETURN)   R2
            (N)        1        5th call
            (NFACT)    ?
            (RETURN)   R2
            (N)        0        6th call
TOP →       (NFACT)    ?
Now, as the code is being executed, we again ask the question:
Is N (the value in TOP - 1) equal to 0? Yes! Now we take the THEN
branch, which stores the value 1 in NFACT (located in TOP). This time the
function has executed to completion. The value in NFACT is returned to
the place of the call (R2) and the stack is popped. (That is, TOP becomes
TOP - 3.)
The place of the call is where the returned value is multiplied by N (the
value in TOP - 1) and stored in NFACT (in location TOP). This is done,
and the function has been completed. The value in NFACT is then
returned to the place of the call, and TOP is reset to TOP - 3.
This process continues until we are back to the first call:
            global variables
(RETURN)    R1
(N)         5
(NFACT)     ?        ← TOP
and 24 has just been returned as the value of NFACT(N - 1). This value is
multiplied by N (that is, 5) and the result, 120, is stored in NFACT (location
TOP). This assignment completes the execution of Function NFACT. The
value in location TOP (that is, NFACT) is returned to the place of the
original call and TOP is reset to TOP - 3. Now, 120 is stored in ANS and
the statement following the original call is executed.
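To make the stack activity concrete, the following sketch imitates it with an explicit array of activation records. It is only a model written for this illustration, not a description of how any particular compiler lays out its stack. The record type FRAME and its field names are assumptions, and TOP here counts whole records rather than individual memory locations, so popping is TOP - 1 instead of TOP - 3.

```pascal
PROGRAM STACKSIM;
CONST
  MAXDEPTH = 10;
TYPE
  FRAME = RECORD
            RETADDR : INTEGER;   (* 1 stands for R1, 2 stands for R2 *)
            N : INTEGER;         (* the value parameter *)
            FVALUE : INTEGER     (* the slot for the function value *)
          END;
VAR
  STACK : ARRAY[1..MAXDEPTH] OF FRAME;
  TOP, RETURNED, ANS : INTEGER;
BEGIN
  (* Original call NFACT(5): push the first activation record. *)
  TOP := 1;
  STACK[TOP].RETADDR := 1;
  STACK[TOP].N := 5;
  (* Each recursive call pushes another record until N = 0. *)
  WHILE STACK[TOP].N <> 0 DO
    BEGIN
      TOP := TOP + 1;
      STACK[TOP].RETADDR := 2;
      STACK[TOP].N := STACK[TOP - 1].N - 1
    END;
  (* Base case: N = 0, so the function value is 1. *)
  STACK[TOP].FVALUE := 1;
  (* Unwind: return each value to R2, pop, and do the multiplication. *)
  WHILE STACK[TOP].RETADDR = 2 DO
    BEGIN
      RETURNED := STACK[TOP].FVALUE;
      TOP := TOP - 1;
      STACK[TOP].FVALUE := STACK[TOP].N * RETURNED
    END;
  (* Return to R1: store the result in ANS and pop the last record. *)
  ANS := STACK[TOP].FVALUE;
  TOP := TOP - 1;
  WRITELN(ANS)                   (* 5! = 120 *)
END.
```

Six records are on the stack at the deepest point, one for each call from NFACT(5) down to NFACT(0), which matches the six calls traced in the text.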
APPLICATION: QUICKSORT-AN EXAMPLE OF A RECURSIVE ALGORITHM
[Figure 7-4. A list of items A...Z is split into two smaller lists, A...L
and M...Z.]
IF not finished
  THEN select a splitting value V
       SPLIT on V
       QSORT the elements less than or equal to V
       QSORT the elements greater than V
QSORT(LIST, 1, N);
Let's use the value in LIST[FIRST] as the splitting value, V. After the
call to SPLIT, all the elements less than or equal to V will be on the left
side of the array and all those greater than V will be on the right side of the
array:
The two "halves" meet at SPLITPOINT, the index of the last element
that is less than or equal to V. Note that we don't know the value of
SPLITPOINT until the splitting process is complete. We can then swap V with
the value at SPLITPOINT:
[Figure omitted: the array from [1] to [N], with FIRST, SPLITPOINT, and
LAST marked.]
Our recursive calls to QSORT will use this index (SPLITPOINT) to
reduce the size of the problem in the general case.
QSORT(LIST, FIRST, SPLITPOINT - 1) sorts the bottom "half" of the
array. QSORT(LIST, SPLITPOINT + 1, LAST) sorts the top half of the
array. The value in LIST[SPLITPOINT] is already in its correct position.
What is the base case? When the segment being examined has fewer than
two elements, we do not need to go on. So "IF not finished" can be
translated into "IF FIRST < LAST".
We can now code Procedure QSORT.
2. Does each recursive call involve a smaller case of the problem? Yes.
   SPLIT divides the segment into two not-necessarily-equal pieces,
   and each of these smaller pieces is then QSORTed. Note that even if
   V is the largest or smallest value in the segment, the two pieces will
   still be smaller than the original one:
   [Figure omitted: a segment from FIRST to LAST with SPLITPOINT marked.]
In good top-down fashion we have shown that our algorithm will work if
Procedure SPLIT works. Now we must develop our splitting algorithm. We
must find a way to get all of the elements equal to or less than V on one side
of V and the elements greater than V on the other side.
We will do this by using a pair of indexes, RIGHT and LEFT. RIGHT
will be initialized to FIRST + 1 and LEFT will be initialized to LAST.
[See Figure 7-5(a).] We then move RIGHT toward the middle, comparing
LIST[RIGHT] to V. If LIST[RIGHT] <= V, we keep incrementing
RIGHT; otherwise, we leave RIGHT and begin moving LEFT toward the
middle. [See Figure 7-5(b).]
Now LIST[LEFT] is compared to V. If it is greater than V, we continue
decrementing LEFT; otherwise, we leave LEFT in place. [See Figure
7-5(c).]
At this point, it is clear that LIST[LEFT] and LIST[RIGHT] are each
on the wrong side of the array. Note that the elements to the left of
LIST[RIGHT] or to the right of LIST[LEFT] are not necessarily sorted;
they are just on the correct side with respect to V. To put LIST[RIGHT]
and LIST[LEFT] onto their correct sides, we merely swap them, then
increment RIGHT and decrement LEFT. [See Figure 7-5(d).]
[Figure 7-5 omitted. Panel (a): Initialization.]
Now we repeat the whole cycle, moving RIGHT to the right until it
encounters a value that is greater than V, then moving LEFT to the left
until it encounters a value that is less than or equal to V. [See Figure
7-5(e).]
When does the process stop? When RIGHT and LEFT pass each other,
no further swaps are necessary. Now we just exchange LIST[FIRST] and
LIST[LEFT], and the SPLIT procedure is finished. [See Figure 7-5(f).]
  SWAP(LIST[FIRST], LIST[LEFT]);
  SPLITPOINT := LEFT
END; (* split *)
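Putting the pieces together, a complete Quicksort along the lines described might be sketched as follows. The loop structure inside SPLIT is one possible realization of the RIGHT and LEFT scan, written so that no array element outside FIRST..LAST is ever examined. The SWAP procedure, the array size, and the sample data are assumptions made for this illustration.

```pascal
PROGRAM QSDEMO;
CONST
  N = 8;
TYPE
  LISTTYPE = ARRAY[1..N] OF INTEGER;
VAR
  LIST : LISTTYPE;
  K : INTEGER;

PROCEDURE SWAP (VAR X, Y : INTEGER);
VAR
  TEMP : INTEGER;
BEGIN
  TEMP := X;
  X := Y;
  Y := TEMP
END;

PROCEDURE SPLIT (VAR LIST : LISTTYPE; FIRST, LAST : INTEGER;
                 VAR SPLITPOINT : INTEGER);
(* Assumes FIRST < LAST. Uses LIST[FIRST] as the splitting value. *)
VAR
  RIGHT, LEFT, V : INTEGER;
BEGIN (* split *)
  V := LIST[FIRST];
  RIGHT := FIRST + 1;
  LEFT := LAST;
  REPEAT
    (* Move RIGHT to the right past values <= V. *)
    WHILE (RIGHT < LEFT) AND (LIST[RIGHT] <= V) DO
      RIGHT := RIGHT + 1;
    IF (RIGHT = LEFT) AND (LIST[RIGHT] <= V)
      THEN RIGHT := RIGHT + 1;
    (* Move LEFT to the left past values > V. *)
    WHILE (RIGHT <= LEFT) AND (LIST[LEFT] > V) DO
      LEFT := LEFT - 1;
    (* Both stopped on the wrong side: swap them and move on. *)
    IF RIGHT < LEFT
      THEN
        BEGIN
          SWAP(LIST[RIGHT], LIST[LEFT]);
          RIGHT := RIGHT + 1;
          LEFT := LEFT - 1
        END
  UNTIL RIGHT > LEFT;
  SWAP(LIST[FIRST], LIST[LEFT]);
  SPLITPOINT := LEFT
END; (* split *)

PROCEDURE QSORT (VAR LIST : LISTTYPE; FIRST, LAST : INTEGER);
VAR
  SPLITPOINT : INTEGER;
BEGIN (* qsort *)
  IF FIRST < LAST
    THEN
      BEGIN
        SPLIT(LIST, FIRST, LAST, SPLITPOINT);
        QSORT(LIST, FIRST, SPLITPOINT - 1);
        QSORT(LIST, SPLITPOINT + 1, LAST)
      END
END; (* qsort *)

BEGIN
  LIST[1] := 9;  LIST[2] := 20; LIST[3] := 6;  LIST[4] := 10;
  LIST[5] := 14; LIST[6] := 8;  LIST[7] := 60; LIST[8] := 11;
  QSORT(LIST, 1, N);
  FOR K := 1 TO N DO
    WRITE(LIST[K], ' ');         (* 6 8 9 10 11 14 20 60 *)
  WRITELN
END.
```

On the first call to SPLIT with this data, V is 9; the values 20 and 8 are swapped, the scan stops with LEFT at position 3, and 9 is exchanged with the 6 there, so SPLITPOINT is 3.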
What happens if our splitting value is the largest or the smallest value in
the segment? The algorithm will still work correctly, but because of the
lopsided splits it will not be quick.
Is this situation likely to occur? It depends on how we choose our split-
ting value and on the original order of the data in the array. If we use
LIST[FIRST] as the splitting value and the array is already sorted, then
every split will be lopsided. One side will contain one element, while the
other side will contain all but one of the elements. Thus our Quicksort will
not be a quick sort. This splitting algorithm favors an array in random order.
It is not unusual, however, to want to sort an array that is already in
nearly sorted order. If this is the case, a better splitting value would be the
middle value,
REPEAT
  WHILE LIST[RIGHT] < V DO
    RIGHT := RIGHT + 1; (* LIST[RIGHT] < V *)
  WHILE LIST[LEFT] > V DO
    LEFT := LEFT - 1; (* LIST[LEFT] > V *)
SPLITPT1 := RIGHT;
SPLITPT2 := LEFT
END; (* split2 *)
Notice that QSORT2 checks to see how many elements are in a segment
and does not recurse if there is only one. This makes the code more
efficient.
Exercises
1. Write a recursive function that takes as parameters two elements of a set, and
returns TRUE if the first element is less than the second and FALSE otherwise.
You can assume that both elements are actually in the set and the last element is
called LASTEL. You may assume that EL1 and EL2 are not LASTEL.
Example: Given the enumerated type
ETYPE = (DOG, FOX, COYOTE, HYENA, WOLF, LASTEL);
if EL1 is FOX and EL2 is WOLF, then Function LESSTHAN will return
TRUE. If EL1 is HYENA and EL2 is DOG, then LESSTHAN will return
FALSE.
Hint: Use the SUCC function.
2. How would Procedure REVPRINT from this chapter be changed to make it
print out a list in order?
Use these declarations for Problem 3.
TYPE PTR = ↑NODE;
     NODE = RECORD
              INFO : INTEGER;
              NEXT : PTR
            END; (* record *)
SUMSQRS = (5 * 5) + (2 * 2) + (3 * 3) + (1 * 1) = 39
4. For each of the following recursive functions, tell whether or not the function
will "work" and if so, what does it do? If not, why not?
(a) FUNCTION MYSTERY (LIST, TEMP : PTR) : INTEGER;
    BEGIN (* mystery *)
      TEMP := LIST;
      IF TEMP = NIL
        THEN MYSTERY := 0
        ELSE MYSTERY := TEMP↑.INFO +
                        MYSTERY(LIST, TEMP↑.NEXT)
    END; (* mystery *)
(b) FUNCTION PUZZLE (LIST : PTR) : INTEGER;
    BEGIN (* brainless *)
      IF LIST = NIL
        THEN BRAINLESS := FALSE
        ELSE BRAINLESS := (1 + 1 = 2) AND
                          BRAINLESS(LIST↑.NEXT)
    END; (* brainless *)
5. Show what would be returned by the following recursive function after the
given calls.
FUNCTION QUIZ (BASE, LIM : INTEGER) : INTEGER;
BEGIN (* quiz *)
  IF BASE = LIM
    THEN QUIZ :=
    ELSE IF BASE > LIM
      THEN QUIZ := 0
      ELSE QUIZ := BASE + QUIZ(BASE + 1, LIM)
END; (* quiz *)
(a) X := QUIZ(0, 3)      X is ______
(b) Y := QUIZ(10, -7)    Y is ______
(c) Z := QUIZ(5, 5)      Z is ______
END;
END; (* print *)
VAR J : INTEGER;
IF LL <> UL
  THEN BEGIN
    J := MAXPOS(A, LL, UL);
    SWAP(A, J, UL);
    SORT(A, LL, UL - 1)
  END
END; (* sort *)
Pre-Test
1. Consider the following recursive Pascal procedure that prints and permutes
characters in a peculiar fashion:
PROCEDURE PANDP (X, Y, Z : CHAR; N : INTEGER);
BEGIN
  IF N > 0
    THEN
      BEGIN
        WRITE(X);
        PANDP(Y, Z, X, N - 1);
        WRITE(Z)
      END (* if *)
END; (* pandp *)
Show the output produced by each of the following three procedure calls:
(a) PANDP('A', 'B', 'C', 1)
(b) PANDP('A', 'B', 'C', 2)
(c) PANDP('A', 'B', 'C', 3)
2. Let COMM(N, K) represent the number of different committees of K people
that can be formed given N people to choose from. For example,
COMM(4, 3) = 4, since, given four people A, B, C, and D, there are four
possible three-member committees: ABC, ABD, ACD, and BCD. It is well known
that
COMM(N, 1) = N
COMM(K, K) = 1
COMM(N, K) = COMM(N - 1, K - 1) + COMM(N - 1, K) for N > K > 1
Write a recursive Pascal function to compute COMM(N, K) for N >= K and
K >= 1.
3. Consider the following TYPE declaration:
PTRTYPE = ↑NODETYPE;
NODETYPE = RECORD
             INFO : CHAR;
             NEXT : PTRTYPE
           END;
Recall that a palindrome is a list that has the same sequence of elements when
read from right to left as it does when read from left to right. Define a recursive
Pascal procedure, PALINDROMIZE, that takes a pointer to a linked list of
characters as its input parameter and prints out a palindrome twice as long as
the linked list. For instance, the following PALINDROMIZE(P) prints out
ABCCBA:
4. Write a recursive function, POSCT, to calculate the number of cells that contain
positive values in the list pointed to by LIST.
5. Consider the following recursive function that calculated X - Y. (The two writ-
ten statements are only to help you trace the function execution.)
Chapter 8 Binary Search Trees
Goals
• sibling
• ancestor
• descendant
• level
• subtree
in a doubly linked list, a node in a binary search tree does not necessarily
point to the nodes whose values immediately precede and follow it. The
nodes pointed to may be any nodes in the list, as long as they satisfy the
basic rule: the node to the "left" contains a value smaller than the node
pointing to it, and the node to the "right" contains a larger value. (We will
assume that the nodes in the tree are ordered with respect to some key
field.)
Figure 8-2 shows a binary search tree that could be created from the
nodes in Figure 8-1. Notice that, for any given node, the nodes to its left
contain smaller values and the nodes to its right contain larger values. As in
a linear linked list, the first node in the tree is pointed to by an external
pointer. To access the node containing the value 10, we look in the first
node (called the root of the tree). The value in the root node is smaller than
10, so we know by the basic rule that the node we seek is located some-
where to its right. Now we check the node immediately to its right and
compare the value there to 10. It is smaller, so we move again to the right.
This process continues until we arrive at the node that contains 10. Note,
by following the path from the root to the node containing 10, that our
search only required four comparisons. By contrast, the search for the value
10 in the linear linked list required ten comparisons.
How should duplicate nodes be handled? In some applications, it may
be desirable to ignore them. In another situation, occurrences of duplicates
may be noted by checking a special flag field or incrementing a counter
field in the node. Or new nodes with duplicate values may be inserted to
the right or left of the original node. The choice of how to handle duplicates
is dependent on the nature of the problem. In our discussion of binary tree
algorithms in this chapter, we will assume, for the sake of simplicity, that
the nodes are ordered with respect to some unique value.
In this chapter we will learn the basic tree vocabulary, and then develop
the algorithms and implementations of some of the procedures needed to
[Figure 8-3. A binary tree with ROOT at level 0 (L0) and further nodes at
levels L1, L2, and L3.]
SEARCHING THE TREE
We will access the whole tree through an external pointer (e.g., ROOT).
As with linear linked lists, we can access nodes through their pointers. We
will refer to three basic fields in each node of a tree:
• INFO(P)-contains the data stored in the node pointed to by P. Like
the INFO fields of linked list nodes, it may contain an integer, charac-
ter, record, or any other data type.
• LEFT(P)-a pointer to the left child of the node pointed to by P.
• RIGHT(P)-a pointer to the right child of the node pointed to by P.
P ← ROOT
WHILE INFO(P) <> VAL DO
  IF INFO(P) > VAL
    THEN P ← LEFT(P)
    ELSE P ← RIGHT(P)
(* Assuming that VAL is in the tree, P will point to the *)
(* node containing the desired value when we exit       *)
(* from the loop.                                        *)
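In Pascal, this search might be sketched as follows. Unlike the algorithm above, this version does not assume that VAL is in the tree: it returns NIL when the search falls off the bottom. The function names FIND and MAKENODE, the use of INTEGER for the INFO field, and the small hand-built tree are assumptions made for this illustration. The pointer symbol ↑ is typed as ^.

```pascal
PROGRAM BSTFIND;
TYPE
  PTRTYPE = ^NODETYPE;
  NODETYPE = RECORD
               INFO : INTEGER;
               LEFT, RIGHT : PTRTYPE
             END; (* record *)
VAR
  ROOT : PTRTYPE;

FUNCTION MAKENODE (VAL : INTEGER; L, R : PTRTYPE) : PTRTYPE;
(* Hypothetical helper, used here only to build a small test tree. *)
VAR
  P : PTRTYPE;
BEGIN
  NEW(P);
  P^.INFO := VAL;
  P^.LEFT := L;
  P^.RIGHT := R;
  MAKENODE := P
END;

FUNCTION FIND (ROOT : PTRTYPE; VAL : INTEGER) : PTRTYPE;
(* Returns a pointer to the node containing VAL, or NIL if absent. *)
VAR
  P : PTRTYPE;
  FOUND : BOOLEAN;
BEGIN (* find *)
  P := ROOT;
  FOUND := FALSE;
  WHILE (P <> NIL) AND NOT FOUND DO
    IF P^.INFO = VAL
      THEN FOUND := TRUE
      ELSE IF P^.INFO > VAL
        THEN P := P^.LEFT
        ELSE P := P^.RIGHT;
  FIND := P
END; (* find *)

BEGIN
  (* Tree: 5 at the root; 3 to its left; 8 to its right, *)
  (* with 7 and 10 as the children of 8.                  *)
  ROOT := MAKENODE(5, MAKENODE(3, NIL, NIL),
                      MAKENODE(8, MAKENODE(7, NIL, NIL),
                                  MAKENODE(10, NIL, NIL)));
  WRITELN(FIND(ROOT, 10) <> NIL);    (* 10 is in the tree *)
  WRITELN(FIND(ROOT, 4) <> NIL)      (* 4 is not *)
END.
```

The Boolean flag FOUND keeps the loop condition from examining P^.INFO when P is NIL, which matters in compilers that evaluate both operands of AND.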
must look at every node in the list. If the list contains 1000 nodes, you must
make 1000 comparisons. If the 1000 nodes were arranged in a binary search
tree (and the tree was evenly balanced), you would never make more than
11 comparisons, no matter what node you were seeking!
INSERTING INTO A BINARY SEARCH TREE
LEFT(NEWNODE) ← NULL
RIGHT(NEWNODE) ← NULL
INFO(NEWNODE) ← VAL
Now that the new node has been created, we can search for its insertion
point. After initializing a temporary pointer, P, to point to the root node, we
can move P left and right through the tree, as if we were searching for VAL
in the tree. When P equals NULL, we will have found the insertion point.
Of course, once P is NULL, we cannot link the new node to the node P was
pointing to just before it became NULL. We have "fallen out" of the tree
and need to climb back into it. In Figure 8-5, P is equal to NULL when we
have found the insertion point. We need the pointer P' to be able to access
the node containing 13, in order to set its right link to point to node 14.
[Figure 8-5 omitted: P' points to the node containing 13.]
Our solution will be to have a second pointer trail P as it moves through
the tree. When P becomes NULL, this back pointer allows us to access the
leaf node to which we will link the new node.
The algorithm for the search for the insertion point is
  (* BACK catches up to P. *)
  BACK ← P
  (* Advance P. *)
  IF INFO(P) > VAL
    THEN move P left
    ELSE move P right
At the end of the loop, BACK will point to the node to which we want to
link the new node.
The third task is to fix the pointers to attach the new node. In the general
case, we can compare the new value to the value in the node pointed to by
BACK. We then set either the left or right pointer field of NODE(BACK) to
NEWNODE.
IF BACK = NULL
  THEN ROOT ← NEWNODE
  ELSE attach new node to NODE(BACK)
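The three tasks (create the node, search for the insertion point with BACK trailing P, and attach the node) can be sketched together as one procedure. The use of INTEGER for INFOTYPE, the helper PRINTINORDER, and the driver program are assumptions made for this illustration. The pointer symbol ↑ is typed as ^.

```pascal
PROGRAM BSTINS;
TYPE
  INFOTYPE = INTEGER;
  PTRTYPE = ^NODETYPE;
  NODETYPE = RECORD
               INFO : INFOTYPE;
               LEFT, RIGHT : PTRTYPE
             END; (* record *)
VAR
  ROOT : PTRTYPE;

PROCEDURE INSERT (VAR ROOT : PTRTYPE; VAL : INFOTYPE);
VAR
  NEWNODE, P, BACK : PTRTYPE;
BEGIN (* insert *)
  (* Task 1: create the new node. *)
  NEW(NEWNODE);
  NEWNODE^.LEFT := NIL;
  NEWNODE^.RIGHT := NIL;
  NEWNODE^.INFO := VAL;
  (* Task 2: search for the insertion point, with BACK trailing P. *)
  P := ROOT;
  BACK := NIL;
  WHILE P <> NIL DO
    BEGIN
      BACK := P;
      IF P^.INFO > VAL
        THEN P := P^.LEFT
        ELSE P := P^.RIGHT
    END;
  (* Task 3: attach the new node. *)
  IF BACK = NIL
    THEN ROOT := NEWNODE            (* the tree was empty *)
    ELSE IF BACK^.INFO > VAL
      THEN BACK^.LEFT := NEWNODE
      ELSE BACK^.RIGHT := NEWNODE
END; (* insert *)

PROCEDURE PRINTINORDER (P : PTRTYPE);
(* Hypothetical helper: writes the values from smallest to largest. *)
BEGIN
  IF P <> NIL
    THEN
      BEGIN
        PRINTINORDER(P^.LEFT);
        WRITE(P^.INFO, ' ');
        PRINTINORDER(P^.RIGHT)
      END
END;

BEGIN
  ROOT := NIL;
  INSERT(ROOT, 5);  INSERT(ROOT, 9);  INSERT(ROOT, 7);
  INSERT(ROOT, 3);  INSERT(ROOT, 8);  INSERT(ROOT, 12);
  PRINTINORDER(ROOT);               (* 3 5 7 8 9 12 *)
  WRITELN
END.
```

ROOT is a VAR parameter because the very first insertion must change the external pointer itself.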
From our discussion of the insert algorithm, it can be seen that the order
in which nodes are inserted determines the shape of the tree. Figure 8-6
illustrates how the same data, inserted in different orders, will produce
very differently shaped trees. If the values are inserted in order (or in
reverse order), the resulting tree will be very skewed. A random mix of the
elements will produce a shorter, "bushy" tree. Since the height of the tree
(the maximum level of nodes in the tree) determines the maximum number
of comparisons in a binary search, the tree's shape is very important.
Minimizing the height of the tree will maximize search efficiency.
There are algorithms to adjust a tree to make its shape more desirable, but
these schemes are subject matter for more advanced courses.
[Figure 8-6. The input order determines the shape of the tree.]
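The effect of input order on height can be checked with a short experiment. The sketch below inserts the letters A through G twice, once in alphabetical order and once in a mixed order, and reports the height of each tree, counting the root as level 0. The procedure ADDNODE (a compact recursive stand-in for the insert algorithm), the function HEIGHT, and the particular mixed order are assumptions made for this illustration.

```pascal
PROGRAM SHAPE;
TYPE
  PTRTYPE = ^NODETYPE;
  NODETYPE = RECORD
               INFO : CHAR;
               LEFT, RIGHT : PTRTYPE
             END;
VAR
  SKEWED, BUSHY : PTRTYPE;
  CH : CHAR;

PROCEDURE ADDNODE (VAR P : PTRTYPE; VAL : CHAR);
BEGIN
  IF P = NIL
    THEN
      BEGIN
        NEW(P);
        P^.INFO := VAL;
        P^.LEFT := NIL;
        P^.RIGHT := NIL
      END
    ELSE IF P^.INFO > VAL
      THEN ADDNODE(P^.LEFT, VAL)
      ELSE ADDNODE(P^.RIGHT, VAL)
END;

FUNCTION HEIGHT (P : PTRTYPE) : INTEGER;
(* Maximum level of any node; the root is at level 0. *)
(* An empty tree is given the height -1.              *)
VAR
  HL, HR : INTEGER;
BEGIN
  IF P = NIL
    THEN HEIGHT := -1
    ELSE
      BEGIN
        HL := HEIGHT(P^.LEFT);
        HR := HEIGHT(P^.RIGHT);
        IF HL > HR
          THEN HEIGHT := HL + 1
          ELSE HEIGHT := HR + 1
      END
END;

BEGIN
  SKEWED := NIL;
  FOR CH := 'A' TO 'G' DO            (* values arrive in order *)
    ADDNODE(SKEWED, CH);
  BUSHY := NIL;                      (* same values, mixed order *)
  ADDNODE(BUSHY, 'D'); ADDNODE(BUSHY, 'B'); ADDNODE(BUSHY, 'F');
  ADDNODE(BUSHY, 'A'); ADDNODE(BUSHY, 'C'); ADDNODE(BUSHY, 'E');
  ADDNODE(BUSHY, 'G');
  WRITELN(HEIGHT(SKEWED));           (* 6: one node per level *)
  WRITELN(HEIGHT(BUSHY))             (* 2: seven nodes on three levels *)
END.
```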
Taken together, these pieces of the insert algorithm can be coded as a
PTRTYPE = ↑NODETYPE;
NODETYPE = RECORD
             INFO : INFOTYPE;
             LEFT, RIGHT : PTRTYPE
           END; (* record *)
VAR NEWNODE,
    P,
    BACK : PTRTYPE;
BEGIN (* insert *)
[Figure omitted: deleting a leaf node. The link from the parent of X is
removed and the node containing X is DISPOSEd.]
[Figure omitted: deleting a node with one child. The parent of X is made to
point to the child of X, and the node containing X is deleted.]
2. Deleting a node with only one child. The simple solution for deleting
a leaf will not suffice for deleting a node with children, since we don't
want to lose all of its descendants from the tree. We want to make the
pointer from the parent skip over the deleted node and point instead
to the child of the node we intend to delete. We then dispose of the
unneeded node (Figure 8-8).
3. Deleting a node with two children. This case is the most complicated,
as we cannot make the parent of the deleted node point to both of the
deleted node's children. In fact, there are several ways to accomplish
this deletion. One common method is to replace the node we wish to
delete with the node that is closest in value to the deleted node. This
node can come from either the left or the right subtree. In this exam-
ple, we will replace the node to be deleted with the node of closest
IF RIGHT(BACK) = P
    THEN RIGHT(BACK) ← NULL
    ELSE LEFT(BACK) ← NULL
IF RIGHT(BACK) = P
THEN set RIGHT(BACK)
ELSE set LEFT(BACK)
Deleting a node with two children involves searching the tree for the
value that is closest to the value in the node to be deleted (immediately
before or immediately after). We will not actually delete NODE(P). Instead,
we will delete its contents by putting this closest value in its place.
Then we can delete the node whose value we moved. Our algorithm guarantees
that the node we are now deleting has at most one child, so its
deletion is relatively simple.
[Figure 8-13: tree diagrams showing the positions of P, BACK, and TEMP.]
The following algorithm will find this value, copy it into NODE(P), and
delete the unneeded node. We will let P act as a placeholder, and use
TEMP and BACK to find and delete the replacement node.
BACK ← P
TEMP ← LEFT(P)
WHILE RIGHT(TEMP) <> NULL DO
    BACK ← TEMP
    move TEMP to the right
Replace info in deleted node
Delete NODE(TEMP)
Deleting NODE(TEMP) is simple, since at the termination of the loop,
RIGHT(TEMP) is NULL. Therefore, we know that NODE(TEMP) has at
most one child. Deleting this node requires only a simple pointer
manipulation:
This design has one serious problem: If we try to delete the root node,
difficulties will result. Particularly, if the root has no children or one child,
it will be necessary to change the value of the external pointer to the tree,
ROOT. In addition, the references to BACK^.LEFT and BACK^.RIGHT
will cause run-time errors, since BACK = NIL.
There are a couple of solutions to this problem. We could keep a header
node in the tree, so that we need never attempt to delete the root. Alterna-
tively, we could write the delete routine with the additional VAR parame-
ter ROOT, checking for this case separately with a few more IFs.
Procedure DELETENODE below incorporates the second approach.
Note that we do not have to do any special processing for deleting the root
node when it has two children, since the actual node is not deleted.
(* If NODE(P) is a leaf *)
IF (P^.RIGHT = NIL) AND (P^.LEFT = NIL)
   THEN
      IF BACK = NIL   (* NODE(P) is the last node in tree. *)
         THEN ROOT := NIL
         ELSE
            IF BACK^.RIGHT = P
               THEN BACK^.RIGHT := NIL
               ELSE BACK^.LEFT := NIL

END; (* deletenode *)
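The root-handling logic described above can be sketched as a complete program. This is a reconstruction under stated assumptions, not the chapter's listing: INFO is assumed to be a single character, ADDNODE and PRINTTREE are hypothetical helpers, and the leaf and one-child cases are folded together through a local pointer CHILD (which is NIL when NODE(P) is a leaf). The two-children case follows the BACK/TEMP search given earlier.

```pascal
PROGRAM DELETEDEMO;
(* Sketch: deletion with a VAR ROOT parameter and a back pointer. *)
TYPE
  PTRTYPE  = ^NODETYPE;
  NODETYPE = RECORD
               INFO        : CHAR;
               LEFT, RIGHT : PTRTYPE
             END;
VAR TREE : PTRTYPE;

PROCEDURE ADDNODE (VAR P : PTRTYPE; ITEM : CHAR);   (* hypothetical helper *)
BEGIN
  IF P = NIL
    THEN
      BEGIN
        NEW(P);
        P^.INFO := ITEM;
        P^.LEFT := NIL;
        P^.RIGHT := NIL
      END
    ELSE IF ITEM < P^.INFO
      THEN ADDNODE(P^.LEFT, ITEM)
      ELSE ADDNODE(P^.RIGHT, ITEM)
END;

PROCEDURE PRINTTREE (P : PTRTYPE);                  (* in-order print *)
BEGIN
  IF P <> NIL THEN
    BEGIN
      PRINTTREE(P^.LEFT);
      WRITE(P^.INFO, ' ');
      PRINTTREE(P^.RIGHT)
    END
END;

(* P points to the node to delete; BACK is its parent (NIL for the root). *)
PROCEDURE DELETENODE (VAR ROOT : PTRTYPE; P, BACK : PTRTYPE);
VAR TEMP, CHILD : PTRTYPE;
BEGIN
  IF (P^.LEFT <> NIL) AND (P^.RIGHT <> NIL)
    THEN
      BEGIN   (* two children: NODE(P) itself stays in the tree *)
        BACK := P;
        TEMP := P^.LEFT;
        WHILE TEMP^.RIGHT <> NIL DO
          BEGIN
            BACK := TEMP;
            TEMP := TEMP^.RIGHT
          END;
        P^.INFO := TEMP^.INFO;
        IF BACK = P
          THEN BACK^.LEFT := TEMP^.LEFT
          ELSE BACK^.RIGHT := TEMP^.LEFT;
        DISPOSE(TEMP)
      END
    ELSE
      BEGIN   (* leaf or one child *)
        IF P^.LEFT <> NIL
          THEN CHILD := P^.LEFT
          ELSE CHILD := P^.RIGHT;      (* NIL when NODE(P) is a leaf *)
        IF BACK = NIL
          THEN ROOT := CHILD           (* deleting the root *)
          ELSE IF BACK^.RIGHT = P
            THEN BACK^.RIGHT := CHILD
            ELSE BACK^.LEFT := CHILD;
        DISPOSE(P)
      END
END;

BEGIN
  TREE := NIL;
  ADDNODE(TREE, 'J');
  ADDNODE(TREE, 'B');
  ADDNODE(TREE, 'S');
  DELETENODE(TREE, TREE, NIL);   (* root with two children *)
  PRINTTREE(TREE);  WRITELN;     (* B S *)
  DELETENODE(TREE, TREE, NIL);   (* root with one child *)
  PRINTTREE(TREE);  WRITELN;     (* S *)
  DELETENODE(TREE, TREE, NIL);   (* root is the last node *)
  IF TREE = NIL THEN WRITELN('tree is empty')
END.
```

The three calls exercise exactly the situations that cause trouble: the first leaves the root node in place and only changes its contents, while the second and third must change the external pointer itself, which is why ROOT is a VAR parameter.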
ALTERNATE DELETION ALGORITHM
As an example, in Figure 8-14, the search for the value B ends with P
pointing to the node containing B and BACK pointing to the node contain-
ing J. Copies of these two pointers are sent to procedure DELETENODE.
(Remember, they are value parameters.) When the delete routine wants to
modify the appropriate field of the parent node, it has no information
regarding the relative position of NODE(P): right or left? We must explicitly
check for this information.

[Figure 8-14: ROOT, BACK, and P at the end of the search for B.]
It would be nice if, instead of an external pointer to NODE(P), we could
send the delete routine the actual pointer in the tree (see Figure 8-15). In
this case, the delete routine no longer needs to have a back pointer at all,
since the pointer to the node to be deleted (a VAR parameter) can be modi-
fied directly. As a matter of fact, we know the name of this pointer. If
NODE(P) is the root node, we call DELETENODE(ROOT). If
BACK^.LEFT = P, then we call DELETENODE(BACK^.LEFT).
Otherwise, we call DELETENODE(BACK^.RIGHT). Note that, in any
case, we are sending an actual pointer in the tree to the delete routine.
Since we will make this pointer a VAR parameter, we can make changes in
the tree directly.
Figure 8-15. Pointer BACK is external to the tree, but BACK^.LEFT is an actual
pointer in the tree.
IF NODE(P) is a leaf
    THEN P ← NULL
    ELSE
        IF NODE(P) has one child
            THEN IF RIGHT(P) = NULL
                THEN P ← LEFT(P)
                ELSE P ← RIGHT(P)
Actually, the cases of no children and one child can be considered together.
In either case, if RIGHT(P) = NULL, we want to set P to LEFT(P).
[When NODE(P) is a leaf, LEFT(P) will be NULL, but that's okay.] Otherwise,
if LEFT(P) = NULL, we set P to RIGHT(P). Let's start again:
IF RIGHT(P) = NULL
    THEN P ← LEFT(P)
    ELSE IF LEFT(P) = NULL
        THEN P ← RIGHT(P)
ELSE ... we know that NODE(P) has two children. We will leave P in
place, and use a local pointer variable, TEMP, to search for the replace-
ment value. A local back pointer, BACK, is also used.
When the node containing the replacement value is found, the values
are copied into NODE(P); then the replacement node is deleted.
INFO(P) ← INFO(TEMP)
IF BACK = P
    THEN LEFT(BACK) ← LEFT(TEMP)
    ELSE RIGHT(BACK) ← LEFT(TEMP)
The only thing left to do is to dispose of the deleted node. In the case of
the node with two children, we can now delete NODE(TEMP). But in the
other two cases, we have "jumped over" the deleted node without saving
its pointer. This problem is easily remedied by setting TEMP to P at the
beginning of the procedure. Now in all cases we finish with a call to
DISPOSE(TEMP).

END; (* deletenode2 *)
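The design steps above can be gathered into one procedure. The program below is a sketch that follows the pseudocode as given, under the assumptions that INFO holds a single character and that ADDNODE and PRINTTREE (hypothetical helpers) are acceptable stand-ins for the chapter's insert and print routines. The search loop in FINDANDKILL is also our own guess at that procedure's shape; only its final IF statement comes from the text.

```pascal
PROGRAM DELETEDEMO2;
TYPE
  PTRTYPE  = ^NODETYPE;
  NODETYPE = RECORD
               INFO        : CHAR;
               LEFT, RIGHT : PTRTYPE
             END;
VAR
  TREE    : PTRTYPE;
  LETTERS : STRING;
  I       : INTEGER;

PROCEDURE ADDNODE (VAR P : PTRTYPE; ITEM : CHAR);   (* hypothetical helper *)
BEGIN
  IF P = NIL
    THEN
      BEGIN
        NEW(P);
        P^.INFO := ITEM;
        P^.LEFT := NIL;
        P^.RIGHT := NIL
      END
    ELSE IF ITEM < P^.INFO
      THEN ADDNODE(P^.LEFT, ITEM)
      ELSE ADDNODE(P^.RIGHT, ITEM)
END;

PROCEDURE PRINTTREE (P : PTRTYPE);                  (* in-order print *)
BEGIN
  IF P <> NIL THEN
    BEGIN
      PRINTTREE(P^.LEFT);
      WRITE(P^.INFO, ' ');
      PRINTTREE(P^.RIGHT)
    END
END;

(* P is the actual pointer in the tree, so assigning to it relinks the tree. *)
PROCEDURE DELETENODE2 (VAR P : PTRTYPE);
VAR TEMP, BACK : PTRTYPE;
BEGIN
  TEMP := P;                         (* remember the node to dispose of *)
  IF P^.RIGHT = NIL
    THEN P := P^.LEFT                (* leaf, or left child only *)
    ELSE IF P^.LEFT = NIL
      THEN P := P^.RIGHT             (* right child only *)
      ELSE
        BEGIN                        (* two children *)
          BACK := P;
          TEMP := P^.LEFT;
          WHILE TEMP^.RIGHT <> NIL DO
            BEGIN
              BACK := TEMP;
              TEMP := TEMP^.RIGHT
            END;
          P^.INFO := TEMP^.INFO;
          IF BACK = P
            THEN BACK^.LEFT := TEMP^.LEFT
            ELSE BACK^.RIGHT := TEMP^.LEFT
        END;
  DISPOSE(TEMP)
END;

PROCEDURE FINDANDKILL (VAR ROOT : PTRTYPE; KEY : CHAR);
VAR P, BACK : PTRTYPE;
    FOUND   : BOOLEAN;
BEGIN
  P := ROOT;  BACK := NIL;  FOUND := FALSE;
  WHILE (P <> NIL) AND NOT FOUND DO
    IF P^.INFO = KEY
      THEN FOUND := TRUE
      ELSE
        BEGIN
          BACK := P;
          IF KEY < P^.INFO
            THEN P := P^.LEFT
            ELSE P := P^.RIGHT
        END;
  IF FOUND THEN
    IF P = ROOT
      THEN DELETENODE2(ROOT)
      ELSE IF BACK^.LEFT = P
        THEN DELETENODE2(BACK^.LEFT)
        ELSE DELETENODE2(BACK^.RIGHT)
END;

BEGIN
  TREE := NIL;
  LETTERS := 'JBSAFKZ';
  FOR I := 1 TO 7 DO ADDNODE(TREE, LETTERS[I]);
  FINDANDKILL(TREE, 'J');    (* root, two children *)
  FINDANDKILL(TREE, 'B');    (* one child *)
  FINDANDKILL(TREE, 'Z');    (* leaf *)
  PRINTTREE(TREE);           (* A F K S *)
  WRITELN
END.
```

Note that the FOUND flag keeps the loop test from dereferencing P when it is NIL, which matters on compilers that evaluate both operands of AND.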
To complete Procedure FINDANDKILL, we can replace the last line
with
IF P = ROOT
   THEN DELETENODE2(ROOT)
   ELSE
      IF BACK^.LEFT = P
         THEN DELETENODE2(BACK^.LEFT)
         ELSE DELETENODE2(BACK^.RIGHT)
TREE TRAVERSALS
Traversing a tree means visiting all of its nodes, for example, to print all of
the values in the tree. In traversing a linear linked list, we set a temporary
pointer equal to the start of the list, and then follow the links from one node
to the next until we reach a node whose pointer value is NIL. Similarly, to
traverse a binary tree, we initialize our pointer to the root of the tree. But
where do we go from there? To the left or to the right? Do we print the root
first, or should we print the leaves first?
Suppose we decide to print out the values in the tree in order, from
smallest to largest. We first need to print the root's left subtree (the values
in the tree that are smaller than the value in the root node). Then we print
the value in the root node. Finally, we print the values in the root's right
subtree, whose values are larger than the value in the root node (Figure
8-16).
[Figure 8-16: the root's left subtree is printed first, the root second, and the right subtree last.]
of the root (Figure 8-17), we need to repeat the whole procedure-print the
nodes to the left, print NODE(P), then print the nodes to the right. We will
move our pointer again to the left, not yet printing anything, but again we
won't be able to get back to nodes we have passed. We need some way of
keeping track of these nodes as we pass them. Later we will want to re-
trieve them, beginning with the node that was most recently saved and
proceeding backwards.
Figure 8-17. When you have finished with the left subtree, how can you get back
up to print the root and the right subtree?
Luckily we know of a data structure that is just perfect for this backtracking
task: a stack. As we travel down the left branch as far as possible, we
push the pointers to the nodes we have passed onto the stack:
When P is NULL, we know the left subtree is empty, and we can climb
back into the tree by popping the stack. We print the value in the node
pointed to by P. Then we traverse that node's right subtree by setting P to
RIGHT(P) and repeating the whole routine.
How do we know when we are finished? In traversing a linear linked
list, we need only note when P is equal to NULL. When we are traversing a
binary tree, this condition is not sufficient; we also need to look at the
status of the stack. When P is NULL and the stack is empty, we are fin-
ished.
Using our pointer-variable representation of a binary search tree, we
come up with the following procedure. We will assume that the declarations
for STACKTYPE, as well as the stack utility routines, exist elsewhere
in the program.

VAR STACK : STACKTYPE;   (* stack of pointers used to keep track of nodes *)
                         (* as they are passed, until they are printed    *)

CLEARSTACK(STACK);
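The stack-based traversal described above can be sketched as follows. Only CLEARSTACK and STACKTYPE are named in the text; the array-based stack, the routines PUSH, POP, and EMPTYSTACK, the helper ADDNODE, and the single-character INFO field are assumptions made so that the example is self-contained.

```pascal
PROGRAM STACKTRAVERSAL;
CONST MAXSTACK = 50;                 (* assumed stack capacity *)
TYPE
  PTRTYPE   = ^NODETYPE;
  NODETYPE  = RECORD
                INFO        : CHAR;
                LEFT, RIGHT : PTRTYPE
              END;
  STACKTYPE = RECORD
                TOP      : 0..MAXSTACK;
                ELEMENTS : ARRAY[1..MAXSTACK] OF PTRTYPE
              END;
VAR
  TREE    : PTRTYPE;
  LETTERS : STRING;
  I       : INTEGER;

PROCEDURE CLEARSTACK (VAR STACK : STACKTYPE);
BEGIN
  STACK.TOP := 0
END;

FUNCTION EMPTYSTACK (VAR STACK : STACKTYPE) : BOOLEAN;
BEGIN
  EMPTYSTACK := STACK.TOP = 0
END;

PROCEDURE PUSH (VAR STACK : STACKTYPE; P : PTRTYPE);
BEGIN
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := P
END;

PROCEDURE POP (VAR STACK : STACKTYPE; VAR P : PTRTYPE);
BEGIN
  P := STACK.ELEMENTS[STACK.TOP];
  STACK.TOP := STACK.TOP - 1
END;

PROCEDURE ADDNODE (VAR P : PTRTYPE; ITEM : CHAR);   (* hypothetical helper *)
BEGIN
  IF P = NIL
    THEN
      BEGIN
        NEW(P);
        P^.INFO := ITEM;
        P^.LEFT := NIL;
        P^.RIGHT := NIL
      END
    ELSE IF ITEM < P^.INFO
      THEN ADDNODE(P^.LEFT, ITEM)
      ELSE ADDNODE(P^.RIGHT, ITEM)
END;

PROCEDURE INORDER (ROOT : PTRTYPE);
VAR STACK : STACKTYPE;
    P     : PTRTYPE;
BEGIN
  CLEARSTACK(STACK);
  P := ROOT;
  (* Finished only when P is NIL and the stack is empty. *)
  WHILE (P <> NIL) OR NOT EMPTYSTACK(STACK) DO
    BEGIN
      WHILE P <> NIL DO              (* go left as far as possible *)
        BEGIN
          PUSH(STACK, P);
          P := P^.LEFT
        END;
      POP(STACK, P);                 (* climb back into the tree *)
      WRITE(P^.INFO, ' ');
      P := P^.RIGHT                  (* then traverse the right subtree *)
    END
END;

BEGIN
  TREE := NIL;
  LETTERS := 'MATB';
  FOR I := 1 TO 4 DO ADDNODE(TREE, LETTERS[I]);
  INORDER(TREE);                     (* A B M T *)
  WRITELN
END.
```

The four letters M, A, T, B give a tree whose in-order output is A B M T. The inner loop always pushes at least one pointer or finds a nonempty stack, so the POP that follows it is safe.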
Figure 8-18 traces through this procedure, using a tree with only four
nodes.
Figure 8-18. [Trace of the procedure on a four-node tree containing A, B, M,
and T. Successive panels show the position of P and the contents of STACK as
the output grows: A, then A B, then A B M, then A B M T. Near the end, STACK
is empty but P <> NIL, so the loop keeps going; the traversal stops when P is
NIL and STACK is empty.]
Recursive Tree Traversals
The procedure given in the previous section is not exactly intuitive, and
only an example of a walk-through like the one in Figure 8-18 makes it
really clear how the tree is traversed. As we said in Chapter 7, the binary
search tree provides us with an example of a more elegant use of recursion.
Pascal offers us a way of performing the necessary stacking to keep track
of the nodes in the tree through its support of recursive programming. We
can write a very short recursive procedure to do this in-order printing oper-
ation.
Let's use the technique we developed in Chapter 7 to describe the prob-
lem in detail:
END; (* inorder2 *)
To print out the whole tree, we initially call this procedure with the state-
ment INORDER2(ROOT). Note that the initialization of P to the root of the
tree is effected through the original calling argument.
As an exercise, use the three-question method to verify this procedure.
[Figure: a binary search tree containing the values P, F, B, H, G, S, R, Y, T, W, and Z.]

INORDER:   B F G H P R S T W Y Z
PREORDER:  P F B H G S R Y T W Z
POSTORDER: B G H F R W T Z Y S P
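The three orders listed above can be reproduced with three short recursive procedures that differ only in where the WRITE statement sits. The sketch below assumes single-character INFO fields and builds the tree by inserting the letters in the order P F B H G S R Y T W Z, which yields a binary search tree with those traversals; ADDNODE and the procedure names are our own choices for this illustration.

```pascal
PROGRAM THREEORDERS;
TYPE
  PTRTYPE  = ^NODETYPE;
  NODETYPE = RECORD
               INFO        : CHAR;
               LEFT, RIGHT : PTRTYPE
             END;
VAR
  TREE    : PTRTYPE;
  LETTERS : STRING;
  I       : INTEGER;

PROCEDURE ADDNODE (VAR P : PTRTYPE; ITEM : CHAR);   (* hypothetical helper *)
BEGIN
  IF P = NIL
    THEN
      BEGIN
        NEW(P);
        P^.INFO := ITEM;
        P^.LEFT := NIL;
        P^.RIGHT := NIL
      END
    ELSE IF ITEM < P^.INFO
      THEN ADDNODE(P^.LEFT, ITEM)
      ELSE ADDNODE(P^.RIGHT, ITEM)
END;

PROCEDURE INORDER (P : PTRTYPE);     (* left subtree, node, right subtree *)
BEGIN
  IF P <> NIL THEN
    BEGIN
      INORDER(P^.LEFT);
      WRITE(P^.INFO, ' ');
      INORDER(P^.RIGHT)
    END
END;

PROCEDURE PREORDER (P : PTRTYPE);    (* node, left subtree, right subtree *)
BEGIN
  IF P <> NIL THEN
    BEGIN
      WRITE(P^.INFO, ' ');
      PREORDER(P^.LEFT);
      PREORDER(P^.RIGHT)
    END
END;

PROCEDURE POSTORDER (P : PTRTYPE);   (* left subtree, right subtree, node *)
BEGIN
  IF P <> NIL THEN
    BEGIN
      POSTORDER(P^.LEFT);
      POSTORDER(P^.RIGHT);
      WRITE(P^.INFO, ' ')
    END
END;

BEGIN
  TREE := NIL;
  LETTERS := 'PFBHGSRYTWZ';
  FOR I := 1 TO LENGTH(LETTERS) DO ADDNODE(TREE, LETTERS[I]);
  WRITE('INORDER:   ');  INORDER(TREE);    WRITELN;
  WRITE('PREORDER:  ');  PREORDER(TREE);   WRITELN;
  WRITE('POSTORDER: ');  POSTORDER(TREE);  WRITELN
END.
```

In each procedure the base case (P is NIL) does nothing, and the recursion itself does the stacking that the nonrecursive version had to manage by hand.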
BEGIN (* postorder *)
  (* Base case: if P is NIL, do nothing. *)
  IF P <> NIL
     THEN (* general case *)
        BEGIN
          (* Traverse the left subtree. *)
          POSTORDER(P^.LEFT);
When might one want to use a postorder traversal of a binary search tree?
Consider the following situation: We want to traverse the tree and delete
all the nodes that meet some criterion. In this case, we could use any of the
traversal orders with the same final results. However, we know that deleting
a leaf is simpler than deleting a node with children. If we can start our
traversal at the bottom of the tree (where the leaves are) using a postorder
traversal, we will increase the likelihood of deleting leaves.
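One way to picture this use of postorder is a procedure that prunes both subtrees before it examines the node itself. The sketch below is illustrative only: PRUNE is a hypothetical name, the criterion (delete every vowel) is invented for the example, INFO is assumed to be a single character, and the delete routine is our reading of the VAR-pointer deletion design from earlier in the chapter.

```pascal
PROGRAM PRUNEDEMO;
TYPE
  PTRTYPE  = ^NODETYPE;
  NODETYPE = RECORD
               INFO        : CHAR;
               LEFT, RIGHT : PTRTYPE
             END;
VAR
  TREE    : PTRTYPE;
  LETTERS : STRING;
  I       : INTEGER;

PROCEDURE ADDNODE (VAR P : PTRTYPE; ITEM : CHAR);   (* hypothetical helper *)
BEGIN
  IF P = NIL
    THEN
      BEGIN
        NEW(P);
        P^.INFO := ITEM;
        P^.LEFT := NIL;
        P^.RIGHT := NIL
      END
    ELSE IF ITEM < P^.INFO
      THEN ADDNODE(P^.LEFT, ITEM)
      ELSE ADDNODE(P^.RIGHT, ITEM)
END;

PROCEDURE PRINTTREE (P : PTRTYPE);                  (* in-order print *)
BEGIN
  IF P <> NIL THEN
    BEGIN
      PRINTTREE(P^.LEFT);
      WRITE(P^.INFO, ' ');
      PRINTTREE(P^.RIGHT)
    END
END;

PROCEDURE DELETENODE2 (VAR P : PTRTYPE);
VAR TEMP, BACK : PTRTYPE;
BEGIN
  TEMP := P;
  IF P^.RIGHT = NIL
    THEN P := P^.LEFT
    ELSE IF P^.LEFT = NIL
      THEN P := P^.RIGHT
      ELSE
        BEGIN
          BACK := P;
          TEMP := P^.LEFT;
          WHILE TEMP^.RIGHT <> NIL DO
            BEGIN
              BACK := TEMP;
              TEMP := TEMP^.RIGHT
            END;
          P^.INFO := TEMP^.INFO;
          IF BACK = P
            THEN BACK^.LEFT := TEMP^.LEFT
            ELSE BACK^.RIGHT := TEMP^.LEFT
        END;
  DISPOSE(TEMP)
END;

(* Postorder: both subtrees are pruned before the node itself is tested. *)
PROCEDURE PRUNE (VAR P : PTRTYPE);
BEGIN
  IF P <> NIL THEN
    BEGIN
      PRUNE(P^.LEFT);
      PRUNE(P^.RIGHT);
      IF P^.INFO IN ['A', 'E', 'I', 'O', 'U']   (* made-up criterion *)
        THEN DELETENODE2(P)
    END
END;

BEGIN
  TREE := NIL;
  LETTERS := 'MEAIOUTB';
  FOR I := 1 TO LENGTH(LETTERS) DO ADDNODE(TREE, LETTERS[I]);
  PRUNE(TREE);
  PRINTTREE(TREE);                   (* B M T *)
  WRITELN
END.
```

Because a node's subtrees have already been pruned when the node is tested, any value copied up from the left subtree in the two-children case is one that has already passed the test.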
Input:
The dictionary of words not to include in the index on file WORDS. These
words are not in alphabetical order.
The text to be indexed on file BOOK. Pages are separated by the symbol #
surrounded by blanks. This end-of-page marker may appear anywhere in
the text between words.
Output:
Words and page references for any word deleted from the index because of
frequency. If a word appears twice on the same page, the page number
should be listed only once.
MAIN Level 0
GET DICTIONARY will read a list of words from file WORDS and save
them in the dictionary of words to reject.
REPEAT
GETWORD(WORD, BOOK)
IF WORD = ENDOFPAGE
increment CURRENTPAGE
UNTIL length(WORD) >= MINLENGTH
Before we can go on to the next level of our design, we must decide what
data structures to use. There are two decisions to be made here. We must
decide how to represent words and how to represent the index and the
dictionary of words to omit from the index.
There are two ways that we can represent words: as packed arrays of
characters or as the stringtype defined in Chapter 2. Before determining
which representation to use, let's review what processing we will do to
words. They are to be read, compared, and printed. Each operation can be
applied to either representation. However, we will need to know the num-
ber of characters in a word. If we use the stringtype, determining the num-
ber of characters is a predefined function. We could, of course, write such a
function on our packed arrays.
Which then is the better representation? The comparison operation is
the operation that will be applied the most often as we search both the
dictionary and the index. The comparison operation defined on the
stringtype is slow because it is done character by character. Since we are
working with words only, not lines, we can determine a maximum number
of characters we wish to carry for each word and define the packed array to
be of this size. We can pad the shorter words with blanks and truncate the
longer words. We save space because we don't need to have a length field,
and the comparison operation is much faster because we can compare the
words directly. For these reasons, we decide to use a packed array of
characters to represent a string.
The index and the dictionary will be searched frequently, so the data
structure we choose should allow for rapid searching. A binary search tree
seems a logical choice for both the index and the dictionary. The question
is whether or not both trees can have the same type of node. It would
certainly simplify processing if they could.
Each tree will need a field for a word, but the index will also need a
place to keep page numbers. One possible structure would be the
following:
NODE = RECORD
         WORDFIELD : STRINGTYPE;
         PAGELIST  : RECORD
                       INDEX : 0..LIMIT;
                       PAGE  : ARRAY[1..LIMIT]
                                 OF INTEGER
                     END; (* record *)
         RIGHT, LEFT : NODEPTR
       END; (* record *)
The INDEX field indicates the number of pages associated with a word
and PAGE[INDEX] contains the last page number entered in the list of
pages.
However, if we used the same node type to represent the dictionary,
there would be a lot of wasted space. One alternative would be to store a
pointer to the PAGELIST record in the node, rather than the record itself.
This pointer would contain NIL when the node is used to represent the
words in the dictionary. This way the only wasted space in the dictionary
node would be for the pointer, not the whole record containing the page
list. Another alternative would be to represent the page list as a linked list.
This would save space in the index because a cell would only be assigned
for a page if it were needed. The processing would be a little slower, how-
ever, because of repeated calls to NEW.
Which is best? Or rather, which is better? The answer depends on your
requirements. If space is the resource you need to conserve, you should use
two different node types, one for the dictionary and one for the index. This
would minimize the use of space, but the processing would be more com-
plicated. We would have to have two sets of tree operations, one for each
node type.
Let us assume that programmer time is the resource we are conserving
and use one node type containing a pointer to the page list. We will not
make a decision on the representation of the page list. Instead, we will treat
the page list as a queue. Our operations on the page list will be in terms of
queue operations. Our nodes will simply contain a field of type QPTR, a
pointer to a queue. The only queue operation that we will have to change is
FULL. We will have FULL return TRUE when the number of elements in
the queue is equal to the limit on the number of pages.
GETWORD Level 2
WORD ← blanks
COUNTER ← 0
Skip to first character (CH) that is not blank or punctuation.
WHILE CH NOT a punctuation mark or a blank
      AND COUNTER < MAXLENGTH
    increment COUNTER
    WORD[COUNTER] ← CH
    READ(DATA, CH)
Skip to next punctuation or blank if not already there.
LENGTH Level 2
IF WORD = blanks
    LENGTH ← 0
ELSE
    COUNTER ← MAXLENGTH
    WHILE WORD[COUNTER] = blank DO
        COUNTER ← COUNTER - 1
    LENGTH ← COUNTER
This algorithm to determine the length of a string will work with any
string. However, in this particular case, we can actually replace this
algorithm with a simple Boolean expression. We really don't need to know the
length; we just need to know if the word is at least MINLENGTH characters.
Since our words do not contain any embedded blanks, we can simply
ask if WORD[MINLENGTH] is a blank. If it isn't, then the word is at least
MINLENGTH characters long. Therefore our terminating expression in
the REPEAT-UNTIL in GETAWORD will be a Boolean function,
LENGTHOK. Function LENGTHOK is simply the one line
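A minimal sketch of this test, assuming that words are stored blank-padded in a packed array of characters as decided above. The values MAXLENGTH = 10 and MINLENGTH = 3 are assumptions for the example, MAKEWORD is a hypothetical helper that pads or truncates a string into the array, and the variable is called AWORD rather than WORD because WORD is a predefined type name on some compilers.

```pascal
PROGRAM LENGTHCHECK;
CONST
  MAXLENGTH = 10;                    (* assumed maximum word size *)
  MINLENGTH = 3;                     (* assumed minimum word size *)
TYPE
  STRINGTYPE = PACKED ARRAY[1..MAXLENGTH] OF CHAR;
VAR
  AWORD : STRINGTYPE;

(* Hypothetical helper: copy S into AWORD, padding with blanks *)
(* or truncating to MAXLENGTH characters.                       *)
PROCEDURE MAKEWORD (S : STRING; VAR AWORD : STRINGTYPE);
VAR I : INTEGER;
BEGIN
  FOR I := 1 TO MAXLENGTH DO
    IF I <= LENGTH(S)
      THEN AWORD[I] := S[I]
      ELSE AWORD[I] := ' '
END;

(* TRUE if AWORD has at least MINLENGTH characters. Relies on the *)
(* word being blank-padded with no embedded blanks.               *)
FUNCTION LENGTHOK (AWORD : STRINGTYPE) : BOOLEAN;
BEGIN
  LENGTHOK := AWORD[MINLENGTH] <> ' '
END;

BEGIN
  MAKEWORD('to', AWORD);
  WRITELN(LENGTHOK(AWORD));          (* FALSE: only two characters *)
  MAKEWORD('run', AWORD);
  WRITELN(LENGTHOK(AWORD));          (* TRUE *)
  MAKEWORD('thiswordistoolong', AWORD);
  WRITELN(LENGTHOK(AWORD))           (* TRUE: truncated, still long enough *)
END.
```

The body of LENGTHOK is the single Boolean expression the text describes; everything else in the program exists only to exercise it.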
INSERT Level 2
PRINTQUEUE Level 2
WRITE(P^.WORDFIELD)
WHILE NOT EMPTYQ(P^.QUEUE^) DO
    DEQ(P^.QUEUE^, PAGE)
    WRITE(PAGE:4, ',')
We have not yet considered error conditions. What will happen to
GETAWORD if the last word in either file is fewer than three characters long?
We will get a TRIED TO READ PAST END OF FILE message. Since one of
the marks of a good program is robustness, we must take care of this
situation. Remember that robustness means that the program will not crash.
That is, all abnormal conditions are taken care of within the program itself.
We will therefore have to add EOF to the terminating condition of our
REPEAT loop in GETAWORD. PROCESSWORD must be skipped if
EOF is TRUE.
NODE = RECORD
         WORDFIELD : STRINGTYPE;
         QUEUE : QPTR;   (* pointer to queue of page numbers *)
         RIGHT, LEFT : NODEPTR
       END; (* record *)
(************************************************)
PROCEDURE CLEARQ (VAR Q : QUEUETYPE);
(* Initialize Q to empty condition. *)
(************************************************)
EMPTYQ = Q.FRONT
END;
(************************************************)
FUNCTION FULLQ (Q : QUEUETYPE) : BOOLEAN;
(* Returns TRUE if Q is full; FALSE otherwise. *)
(************************************************)
BEGIN (* enq *)
  IF Q.REAR = MAXQUEUE
     THEN Q.REAR := 1
     ELSE Q.REAR := Q.REAR + 1;
  Q.ELEMENTS[Q.REAR] := NEWPAGE
END; (* enq *)
(************************************************)
PROCEDURE DEQ (VAR Q : QUEUETYPE; VAR NEXTPAGE : PAGETYPE);
(* Removes NEXTPAGE from the Q. Assumes queue is not empty. *)
BEGIN (* deq *)
  IF Q.FRONT = MAXQUEUE
     THEN Q.FRONT := 1
     ELSE Q.FRONT := Q.FRONT + 1;
  NEXTPAGE := Q.ELEMENTS[Q.FRONT]
END; (* deq *)
(************************************************)
FUNCTION QUEUEREAR (Q : QUEUETYPE) : PAGETYPE;
(* QUEUEREAR returns the last element inserted into the queue, *)
(* leaving the queue unchanged. Assumes that Q is not empty.   *)
BEGIN (* queuerear *)
  QUEUEREAR := Q.ELEMENTS[Q.REAR]
END; (* queuerear *)
(************************************************)
PROCEDURE PRINTQUEUE (P : NODEPTR);
(* Prints an entry in the dictionary, followed by the pages on which the entry *)
(* occurs. Pages are printed in numeric order on file BOOKINDEX.               *)
(* Queue of pages is DISPOSEd. P^.QUEUE is set to NIL.                         *)
  WRITELN(BOOKINDEX);
  DISPOSE(P^.QUEUE);
  P^.QUEUE := NIL
END; (* printqueue *)
(************************************************)
PROCEDURE INSERT (VAR ROOT : NODEPTR; WORD : STRINGTYPE);
(* Insert node representing new word into
VAR NEWNODE,
    CURRENT,
    BACK : NODEPTR;   (* trailing node *)
BEGIN (* insert *)
  IF DEBUG   (* debug is true during testing *)
     THEN WRITELN(DEBUGFILE, WORD,
                  'BEING INSERTED IN DICTIONARY');
(************************************************)
(************************************************)
(************************************************)
(* is FALSE, and BACK points to the node in the index to which WORD's node *)
(* should be attached. *)
(************************************************)
PROCEDURE UPDATEPAGELIST (BACK : NODEPTR;
                          ENTRY : NODEPTR;
                          VAR INDEX : NODEPTR;
                          VAR DICTIONARY : NODEPTR);
(* If CURRENTPAGE is on page list, nothing is done. Otherwise page is put    *)
(* on page list. If number of pages is at the limit, the ENTRY is deleted    *)
(* from the INDEX and the word is printed along with its page list.          *)
(* The word is then inserted into the DICTIONARY of words not to be indexed. *)
(************************************************)
PROCEDURE INSERTWORD (BACK : NODEPTR;
                      WORD : STRINGTYPE;
                      VAR INDEX : NODEPTR);
(* WORD is inserted into the INDEX at the place pointed to by BACK. *)
(************************************************)
(************************************************)
(************************************************)
PROCEDURE GETWORD (VAR DATA : TEXT;
                   VAR WORD : STRINGTYPE);
(* Stores into WORD characters beginning at the next character NOT IN *)
(* PUNCTUATION and continues until either a character IN PUNCTUATION  *)
  WORD := BLANKS;
  COUNTER := 0;
  REPEAT
    READ(DATA, CH)
  UNTIL NOT (CH IN PUNCTUATION) OR EOF(DATA);
  (* Read until a character IN PUNCTUATION is found, in case WORD *)
  (* was longer than MAXLENGTH. *)
(************************************************)
(************************************************)
PROCEDURE GETAWORD (VAR WORD : STRINGTYPE;
                    VAR DATA : TEXT);
(* Invokes GETWORD and checks to see if WORD returned is an    *)
(* end of page marker, increments CURRENTPAGE if it is.        *)
(* Checks to be sure WORD is at least minimum length and       *)
(* returns to get another WORD if it is not minimum length.    *)
(************************************************)
PROCEDURE GETDICTIONARY (VAR DICTIONARY : NODEPTR;
                         VAR WORDS : TEXT);
(* Words not to be included in the index are read from file WORDS *)
(* and stored into a binary search tree DICTIONARY.               *)
(************************************************)
REWRITE(BOOKINDEX);
RESET(BOOK);
RESET(WORDS);
PUNCTUATION := [',', '.', '!', ';', ':', ' '];
IF DEBUG
   (* debug is true during testing *)
   THEN TESTPRINTTREE(DICTIONARY);
(* Read the text of the manuscript on file BOOK, processing          *)
(* each word individually. If a word is not in the dictionary        *)
(* of words to be skipped, it is entered into the index and the      *)
(* pages on which it occurs are recorded. When the number            *)
(* of pages on which a word occurs reaches a predetermined limit,    *)
(* the word is removed from the index, its page list is printed,     *)
(* and the word is entered into the dictionary of words to skip.     *)
WHILE NOT EOF(BOOK) DO
  BEGIN
    GETAWORD(WORD, BOOK);
    IF NOT EOF(BOOK)
       THEN
         PROCESSWORD(WORD, INDEX, DICTIONARY)
  END;
(* Print each word in the index along with the page numbers *)
(* on which it occurs. *)
INORDER(INDEX)
Testing and Debugging: There were two major logic errors in the first
version of this program that we wrote. The function to check the last element
entered into the queue was coded to check the front of the queue
instead of the rear. (This error has been corrected for the top-down design
shown here.) The second error was that the index was passed as a value
parameter to the INSERT procedure. Therefore the external pointer to the
index was never changed; it remained NIL.
We tracked down these errors in our testing by inserting intermediate
prints which were activated by a Boolean variable called DEBUG.
DEBUG was set to TRUE in the CONST section and the DEBUG output
was written to a file called DEBUGFILE. We have given the code for this
program exactly as it was on the final test run, including all the debugging
aids used. The text input and the two output files follow in the next section.
In a small program like this, it would be easy to go through and remove
the debugging print statements. However, in a large project they should be
left in the program. The DEBUG constant would be changed to FALSE so
that the testing output would not be generated during production runs. If
the program were altered at a later date, the debugging output would be
turned on again for further testing.
The testing of this program also uncovered an omission in the design.
The program treats lower and upper case letters as different characters.
Therefore, "Run" and "run" are considered to be different words. Unless
our input text consists only of upper case letters, we must do something to
equate the upper and lower case versions of the alphabetical characters.
The easiest solution is to convert all letters, as they are read in, to either the
upper or the lower case representation, depending on which way you
would like to print out the text.
This conversion requires knowledge of the character set for the machine
on which the program will be run. Note that this modification to the
program has a major software implication. Since character sets differ, your
program will no longer be portable; it will only run on machines that have
the same character set.
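As an illustration of why the conversion depends on the character set, here is a sketch that assumes ASCII, where the upper case letters and the lower case letters each occupy consecutive codes. TOLOWER is a hypothetical name; on a machine whose character set does not store the letters contiguously, this arithmetic would give wrong results.

```pascal
PROGRAM CASEDEMO;

(* Assumes ASCII: 'A'..'Z' and 'a'..'z' are each contiguous, so the   *)
(* distance between the two cases is the same for every letter.       *)
FUNCTION TOLOWER (CH : CHAR) : CHAR;
BEGIN
  IF (CH >= 'A') AND (CH <= 'Z')
    THEN TOLOWER := CHR(ORD(CH) + ORD('a') - ORD('A'))
    ELSE TOLOWER := CH                (* leave everything else alone *)
END;

BEGIN
  WRITELN(TOLOWER('R'), TOLOWER('u'), TOLOWER('n'));   (* run *)
  WRITELN(TOLOWER('#'))                                (* # is unchanged *)
END.
```

Applying such a function to each character as it is read would make "Run" and "run" the same word in the index.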
File: BOOK
This is a story of Dick and Jane. # Dick and Jane are friends. # Dick and
Jane like to run. # See Dick run. Dick can run very fast. # See Jane run.
# Run, Dick and Jane. # At the end of the day, Dick and Jane are tired. #
You would be tired if you ran like Dick and Jane.
File: WORDS
This
and
are
like
the
thiswordistoolong
would
you
File: BOOKINDEX
Dick 1 2 3 4 6
Jane 1 2 3 5 6
Run 6
See 4 5
You 8
can 4
day 7
end 7
fast 4
friends 2
ran 8
run 3 4 5
story 1
tired 7 8
very 4
File: DEBUGFILE
Dick IS NOT IN DICTIONARY
Dick IS BEING INSERTED IN THE INDEX
and IS FOUND IN DICTIONARY
Jane IS NOT IN DICTIONARY
Jane IS BEING INSERTED IN THE INDEX
Dick IS NOT IN DICTIONARY
Dick IS FOUND IN INDEX
and IS FOUND IN DICTIONARY
Jane IS NOT IN DICTIONARY
Jane IS FOUND IN INDEX
are IS FOUND IN DICTIONARY
friends IS NOT IN DICTIONARY
friends IS BEING INSERTED IN THE INDEX
Dick IS NOT IN DICTIONARY
Dick IS FOUND IN INDEX
and IS FOUND IN DICTIONARY
Jane IS NOT IN DICTIONARY
Jane IS FOUND IN INDEX
like IS FOUND IN DICTIONARY
run IS NOT IN DICTIONARY
run IS BEING INSERTED IN THE INDEX
See IS NOT IN DICTIONARY
See IS BEING INSERTED IN THE INDEX
Dick IS NOT IN DICTIONARY
Dick IS FOUND IN INDEX
run IS NOT IN DICTIONARY
run IS FOUND IN INDEX
Dick IS NOT IN DICTIONARY
Dick IS FOUND IN INDEX
can IS NOT IN DICTIONARY
can IS BEING INSERTED IN THE INDEX
run IS NOT IN DICTIONARY
run IS FOUND IN INDEX
very IS NOT IN DICTIONARY
very IS BEING INSERTED IN THE INDEX
fast IS NOT IN DICTIONARY
fast IS BEING INSERTED IN THE INDEX
See IS NOT IN DICTIONARY
See IS FOUND IN INDEX
Jane IS NOT IN DICTIONARY
Jane IS FOUND IN INDEX
run IS NOT IN DICTIONARY
run IS FOUND IN INDEX
Run IS NOT IN DICTIONARY
Run IS BEING INSERTED IN THE INDEX
Dick IS NOT IN DICTIONARY
Dick IS FOUND IN INDEX
Dick BEING INSERTED IN DICTIONARY
1. Use the following tree to answer each question independently. (Answer each
question for the original tree.)
43
9 49
4 55
31
(a) Show what the tree would look like after adding node 3.
(b) Show what the tree would look like after adding node 90.
(c) Show what the tree would look like after adding node 56.
(d) Show what the tree would look like after deleting node 20.
(e) Show what the tree would look like after deleting node 43.
(f) Show what the tree would look like after deleting node 55.
(g) What are the ancestors of node 33?
(h) What are the descendants of node 20?
(i) Show what would be printed by a postorder traversal of the tree.
(j) Show what would be printed by a preorder traversal of the tree.
(k) Show what would be printed by an inorder traversal of the tree.
(l) What is the maximum possible number of nodes in the tree at the level
of node 55?
(m) What is the maximum possible number of nodes in the tree at the level
of node 31?
(n) How many nodes would be in the tree if it were completely full down to
the level of node 31?
2. The information field of the nodes of a binary tree contains three-letter words.
Show how the tree will look after the following words are read in (in this order).
Assume the tree is empty before you begin adding nodes.
fox dog leg hoe egg elf boy box zoo
3. How would a binary tree look if the information were already ordered when it
was read in?
5. The binary search tree is ordered according to student number. Print the names
of all the women, ordered from smallest to largest ID number. (Write a recursive
procedure.)
6. A binary tree contains integer values in the INFO field of each node. Write a
function, SUMSQRS, that returns the sum of the squares of the values in the tree.
7. Write a nonrecursive procedure, ANCESTOR, that prints the ancestors of a given
node whose INFO field contains a value NUM. NUM only occurs once in the
tree. Do not print NUM. You may assume that ROOT is not empty. Use the
following procedure heading:
PROCEDURE ANCESTOR (ROOT: PTR; NUM : INTEGER);
8. (a) Write a recursive procedure, ANCESTOR. Use the following heading:
PROCEDURE ANCESTOR (P : PTR; NUM : INTEGER);
(b) Write a call to Procedure ANCESTOR in Exercise 8a to print the ancestors
of the node containing the value 14.
9. Develop the delete algorithm using the immediate successor of the value to be
deleted for the case of deleting a node with two children.
Pre-Test
Use the following tree for problems 1 and 2.
D s
A E w
D
G
BEGIN (* doubleorder *)
  IF P <> NIL
     THEN BEGIN
            WRITE(P^.DATA);
            DOUBLEORDER(P^.LEFT);
            DOUBLEORDER(P^.RIGHT);
            WRITE(P^.DATA)
          END (* if *)
END; (* doubleorder *)
4. Write a Pascal procedure that takes the root of a binary tree as an input parame-
ter and traverses this tree level by level (starting with the root node and then
from left to right on each level). The data field of each node is printed when that
node is visited.
Example: This procedure should print the data fields of the tree given in
problem 3 in the following (alphabetical) order: ABCDEFGJM.
Hint: Use a queue of tree pointers. You may assume that all queue utility
routines are given and that queuetype is appropriately declared as a TYPE of
queue of tree pointers.
5. Fill in the blank (less than, equal to, more than). The maximum number of
nodes that can be on level i of a binary tree is ______ the sum of all the
nodes in the tree from the root through level i-1.
6. You are writing a program which uses a binary tree as the basic data structure.
For generality you decide to use an array of records instead of Pascal pointer
variables. Given the following declarations, show how the tree would look in
memory after it has been loaded in the order shown in the INFO field. Be sure
to fill in all spaces. Available space is linked in the LEFT field. If you do not
know the contents of a space, insert a "?".
7. Show the contents of NODES after B has been inserted and R has been deleted.
You may fill in all spaces or only those that have changed.
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
Chapter 9 Trees Plus
Goals
To be able to show how an arithmetic expression may be
stored in a binary tree.
To be able to build and evaluate a binary expression tree.
To be able to show how a binary tree may be stored in a
nonlinked representation in an array.
To be able to define the following terms:
  • full binary tree
  • complete binary tree
  • heap
To be able to explain the strategy used in the heapsort
algorithm.
To be able to define the following terms related to graphs:
  • directed graph
  • undirected graph
  • vertex
  • edge
  • path
  • complete graph
  • weighted graph
  • adjacency matrix
  • adjacency list
To be able to explain the difference between a depth-first and
a breadth-first search, and to describe how stacks and queues
can be used to implement these searching strategies.
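[Figure: four small expression trees, each with an operator in the root and two operands in the leaves, representing 5 + 2, 8 - 5, 4 * 3, and 9 / 3.]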
Note that the values are in the leaf nodes, while the operator is in their
parent node.
[Figure: an expression tree with * in the root, a - subtree on the left, and a + subtree on the right.]
Let's start at the root node and evaluate the expression. The root contains
the operator, *, so we look at its children to get the two operands. The
subtrees to the left and right of the root contain the two operands. Since the
node to the left of the root contains another operator, -, we know that the
left subtree itself consists of an expression. We must evaluate the subtrac-
tion of the value in that node's right child from the value in its left child
before we can do the multiplication. Similarly, in the node on the right of
the root, we find another operator, +. Therefore, we know that we must
evaluate the addition of the operands in that node's left and right children
before we can do the multiplication. This example illustrates that the oper-
ations at higher levels of the tree are evaluated later than those below them.
The operation at the root of the tree will always be the last operation per-
formed.
See if you can determine the expressions represented by the following
binary trees:
[Figure: two binary expression trees, (a) and (b).]
Expression Evaluation
Let's develop a function to evaluate a binary expression tree. We know that
the value of the complete tree is equal to
(operand1) (bin operator) (operand2)
where (bin operator) is one of the binary operators (+, -, *, or /) in the root
node, (operand1) is the value of its left subtree, and (operand2) is the value
of its right subtree. What is the value of the left subtree? If the left subtree
consists of a single node containing a value, operand1 is that value itself. If
the left subtree consists of an expression, we must evaluate it. Of course,
we can use our same expression evaluation function to calculate its value.
That is, our function will be a recursive one. The right subtree is evaluated
similarly.
Let's summarize the recursive solution.
- - - - - - - - - FUNCTION EVAL - - - - - - - - -
Definition of problem: Evaluate the expression represented by the
binary tree.
Size of problem: The whole tree pointed to by TREE.
Base case: If the content of the node is an operand, THEN
EVAL ← the value of the operand
General case: If the content of the node is an operator, THEN
EVAL ← EVAL(left subtree) (bin operator) EVAL(right subtree)
IF INFO(TREE) is an operand
  THEN EVAL ← INFO(TREE)
  ELSE (* It is an operator. *)
    CASE INFO(TREE) OF
      '+' : EVAL ← EVAL(LEFT(TREE)) + EVAL(RIGHT(TREE))
      '-' : EVAL ← EVAL(LEFT(TREE)) - EVAL(RIGHT(TREE))
      '*' : EVAL ← EVAL(LEFT(TREE)) * EVAL(RIGHT(TREE))
      '/' : EVAL ← EVAL(LEFT(TREE)) / EVAL(RIGHT(TREE))
    END CASE
When you try to code this function, you notice that we are using the
INFO field of the tree node to contain two different types of data-
sometimes a character representing an operator, and other times a numeric
value. How can we represent two data types in the same field of the node?
Pascal and several other high-level languages provide a data type that is
ideal for this situation: the variant record. This data type allows us to use
record variables for multiple purposes. (See Appendix G for a review of the
syntax of the Pascal variant record type.) We can declare each tree node as
a record, with the INFO field varying according to the data type that will be
stored in it, either operator or operand.
The declaration of the variant record type TREENODE has two parts:
• A fixed part, consisting of LEFT and RIGHT, the pointers to left and
right children. These fields will be included in every variable of type
TREENODE.
• A variant part, which begins with the keyword CASE. The field CON-
TENTS is called the tag field, and its type, INFOTYPE, is the tag
type. The value in the tag field will determine which set of additional
fields will be part of the record variable. In this case, we see from the
declaration of INFOTYPE that the CONTENTS field may take on the
values OPERATOR and OPERAND. When CONTENTS has the
value of OPERATOR, the TREENODE-type record includes a charac-
ter field OPER, in addition to LEFT and RIGHT; this field will be
used to contain a character that represents an operator. When CON-
TENTS has the value of OPERAND, the TREENODE-type record
includes the real number field VAL, in addition to LEFT and RIGHT;
this field will be used to contain the value of an operand.
Using these declarations, we can now write the function.
BEGIN (* eval *)
  IF TREE^.CONTENTS = OPERAND
    THEN EVAL := TREE^.VAL
END; (* eval *)
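To see the variant record and the recursive function working together, here is a minimal sketch of a complete program. It follows the field names described above (LEFT, RIGHT, CONTENTS, OPER, VAL), but several things are assumptions made for illustration: the tag values are named OPERATORNODE and OPERANDNODE because OPERATOR is a reserved word in some compilers (Free Pascal, for one), and the helper functions MAKEOPERAND and MAKEOPERATOR and the sample expression are ours.

```pascal
PROGRAM EVALDEMO;

TYPE
  INFOTYPE = (OPERATORNODE, OPERANDNODE);   (* tag values; names assumed *)
  TREEPTR  = ^TREENODE;
  TREENODE = RECORD
               LEFT, RIGHT : TREEPTR;        (* fixed part   *)
               CASE CONTENTS : INFOTYPE OF   (* variant part *)
                 OPERATORNODE : (OPER : CHAR);
                 OPERANDNODE  : (VAL : REAL)
             END;

VAR
  ROOT : TREEPTR;

(* Hypothetical helper: builds a leaf node holding a value. *)
FUNCTION MAKEOPERAND (V : REAL) : TREEPTR;
VAR P : TREEPTR;
BEGIN
  NEW(P);
  P^.CONTENTS := OPERANDNODE;
  P^.VAL := V;
  P^.LEFT := NIL;
  P^.RIGHT := NIL;
  MAKEOPERAND := P
END;

(* Hypothetical helper: builds an operator node over two subtrees. *)
FUNCTION MAKEOPERATOR (OP : CHAR; L, R : TREEPTR) : TREEPTR;
VAR P : TREEPTR;
BEGIN
  NEW(P);
  P^.CONTENTS := OPERATORNODE;
  P^.OPER := OP;
  P^.LEFT := L;
  P^.RIGHT := R;
  MAKEOPERATOR := P
END;

FUNCTION EVAL (TREE : TREEPTR) : REAL;
BEGIN (* eval *)
  IF TREE^.CONTENTS = OPERANDNODE
    THEN EVAL := TREE^.VAL                   (* base case *)
    ELSE (* It is an operator. *)
      CASE TREE^.OPER OF
        '+' : EVAL := EVAL(TREE^.LEFT) + EVAL(TREE^.RIGHT);
        '-' : EVAL := EVAL(TREE^.LEFT) - EVAL(TREE^.RIGHT);
        '*' : EVAL := EVAL(TREE^.LEFT) * EVAL(TREE^.RIGHT);
        '/' : EVAL := EVAL(TREE^.LEFT) / EVAL(TREE^.RIGHT)
      END
END; (* eval *)

BEGIN
  (* Sample tree for (8 - 5) * (4 + 3) *)
  ROOT := MAKEOPERATOR('*',
            MAKEOPERATOR('-', MAKEOPERAND(8), MAKEOPERAND(5)),
            MAKEOPERATOR('+', MAKEOPERAND(4), MAKEOPERAND(3)));
  WRITELN(EVAL(ROOT):0:1)                    (* prints 21.0 *)
END.
```

Notice that the subtraction and the addition are evaluated before the multiplication in the root, as the discussion of levels predicts.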
Figure 9-1. [Binary expression trees.]
corresponding prefix, infix, and postfix notations. Note that we cannot write
the infix notation directly from the tree because of the need to add paren-
theses.
[Figure: an expression tree with operands A, Y, and B, shown with its prefix, infix, and postfix notations.]
The general approach that we will use to put nodes into the tree is as
follows: insert new nodes, each time moving to the left until we have put in
an operand. Then backtrack to the last operator, and put the next node to its
right. We continue in the same pattern: if we have just inserted an operator
node, we put the next node to its left; if we have just inserted an operand
node, we backtrack and put the next node to the right of the last operator.
In addition to the tree we are creating, we will need a temporary data
structure in which to store pointers to the operator nodes, to support the
backtracking we just described. Did you guess from the word backtrack
that we would use a stack? We will also use a flag, NEXTMOVE, to indicate
whether our next node should be attached to the left or the right, based on
whether the current node contains an operator or an operand. (We are as-
suming the presence of some special character to denote the end of the
expression. In the figures, we are representing this character as a semi-
colon; however, its identity is not relevant to the processing.)
NEXTMOVE ← LEFT
CLEAR stack
get next SYMBOL
[Figures 9-2 through 9-8 trace the algorithm on the prefix expression *+A-BCD;
showing ROOT, NEWNODE, the stack PTRSTACK, the current SYMBOL, and the flag NEXTMOVE.]
Figure 9-2. Before first iteration of loop. PTRSTACK is empty; NEXTMOVE = LEFT.
Figure 9-3. End of first iteration of loop. PTRSTACK holds LASTNODE; NEXTMOVE = LEFT.
Figure 9-4. End of second iteration of loop. NEWNODE contains A; PTRSTACK holds LASTNODE 1 (top) and LASTNODE 0; NEXTMOVE = RIGHT.
Figure 9-5. End of third iteration of loop. PTRSTACK holds LASTNODE 0; NEXTMOVE = LEFT.
Figure 9-6. End of fourth iteration of loop. PTRSTACK holds LASTNODE 2 (top) and LASTNODE 0; NEXTMOVE = RIGHT.
(Next node will be attached to the RIGHT of NODE(LASTNODE 2).)
Figure 9-7. End of fifth iteration of loop. PTRSTACK holds LASTNODE 0; NEXTMOVE = RIGHT.
(Next node will be attached to the RIGHT of NODE(LASTNODE 0).)
Figure 9-8. End of sixth (last) iteration of loop. PTRSTACK is empty; NEXTMOVE = RIGHT.
(Next symbol is last symbol, so we quit.)
Did you get the same result? Do a preorder traversal of the resulting tree to
see if it indeed matches the original prefix expression.
Before we write Procedure BUILDTREE, let's specify clearly what the
input will look like. It will be a valid prefix expression, made up of single-
letter operands and the binary operators +, *, -, and /. The expression will
be followed by some delineating character, which we will know as
LASTSYMBOL. We will have available the lower-level procedure GET-
SYM, which takes a prefix expression (TYPE = PREFIXTYPE) and returns
the next character in the expression. We will not worry about how GET-
SYM does this. We will also be able to call the CLEARSTACK, PUSH, and
POP procedures for using the stack of pointers.
Procedure BUILDTREE takes a prefix expression and returns the
pointer to the root of the equivalent binary expression tree.
BEGIN (* buildtree *)
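The left-then-backtrack strategy can be sketched as a complete program. This is one possible rendering under stated assumptions, not necessarily the procedure as originally written: every node holds a single character, GETSYM reads from a global array rather than taking the expression as a parameter, the stack is a simple global array, and the names NEWNODE, ISOPERATOR, GOLEFT, and GORIGHT are ours.

```pascal
PROGRAM BUILDDEMO;

CONST
  EXPRLEN    = 8;
  LASTSYMBOL = ';';
  MAXSTACK   = 20;

TYPE
  PREFIXTYPE = PACKED ARRAY [1..EXPRLEN] OF CHAR;
  TREEPTR    = ^TREENODE;
  TREENODE   = RECORD
                 INFO : CHAR;
                 LEFT, RIGHT : TREEPTR
               END;
  MOVETYPE   = (GOLEFT, GORIGHT);

VAR
  PREFIX   : PREFIXTYPE;
  POSITION : INTEGER;                      (* next character to read *)
  PTRSTACK : ARRAY [1..MAXSTACK] OF TREEPTR;
  TOP      : INTEGER;
  ROOT     : TREEPTR;

PROCEDURE GETSYM (VAR SYMBOL : CHAR);      (* simplified stand-in *)
BEGIN
  SYMBOL := PREFIX[POSITION];
  POSITION := POSITION + 1
END;

PROCEDURE PUSH (P : TREEPTR);
BEGIN TOP := TOP + 1; PTRSTACK[TOP] := P END;

PROCEDURE POP (VAR P : TREEPTR);
BEGIN P := PTRSTACK[TOP]; TOP := TOP - 1 END;

FUNCTION ISOPERATOR (CH : CHAR) : BOOLEAN;
BEGIN
  ISOPERATOR := (CH = '+') OR (CH = '-') OR (CH = '*') OR (CH = '/')
END;

FUNCTION NEWNODE (CH : CHAR) : TREEPTR;
VAR P : TREEPTR;
BEGIN
  NEW(P); P^.INFO := CH; P^.LEFT := NIL; P^.RIGHT := NIL;
  NEWNODE := P
END;

PROCEDURE BUILDTREE (VAR ROOT : TREEPTR);
VAR
  SYMBOL         : CHAR;
  LASTNODE, NODE : TREEPTR;
  NEXTMOVE       : MOVETYPE;
BEGIN (* buildtree *)
  TOP := 0;                                (* clear the stack *)
  NEXTMOVE := GOLEFT;
  GETSYM(SYMBOL);
  ROOT := NEWNODE(SYMBOL);
  NODE := ROOT;
  GETSYM(SYMBOL);
  WHILE SYMBOL <> LASTSYMBOL DO
    BEGIN
      LASTNODE := NODE;
      NODE := NEWNODE(SYMBOL);
      IF NEXTMOVE = GOLEFT
        THEN BEGIN                         (* last node was an operator *)
               PUSH(LASTNODE);
               LASTNODE^.LEFT := NODE
             END
        ELSE BEGIN                         (* backtrack to last operator *)
               POP(LASTNODE);
               LASTNODE^.RIGHT := NODE
             END;
      IF ISOPERATOR(SYMBOL)
        THEN NEXTMOVE := GOLEFT
        ELSE NEXTMOVE := GORIGHT;
      GETSYM(SYMBOL)
    END
END; (* buildtree *)

PROCEDURE PREORDER (P : TREEPTR);
BEGIN
  IF P <> NIL THEN
    BEGIN
      WRITE(P^.INFO);
      PREORDER(P^.LEFT);
      PREORDER(P^.RIGHT)
    END
END;

BEGIN
  PREFIX := '*+A-BCD;';
  POSITION := 1;
  BUILDTREE(ROOT);
  PREORDER(ROOT);                          (* prints *+A-BCD *)
  WRITELN
END.
```

The preorder traversal at the end reproduces the original prefix expression, which is the check suggested above. The sketch assumes a valid prefix expression and does no error checking.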
Now that we know how to put a prefix expression into a binary tree, what
can we do with the tree? We have already seen that we can evaluate it or
print it in several different notations. We can also use the tree format to do
more complicated processing, like differentiating the expression with re-
spect to one of its variables. This problem is included in the programming
assignments for this chapter. Don't panic-this task is much easier than it
sounds, once the expression is in a binary expression tree. As a matter of
fact, you don't even need to know any calculus to write the program, since
all the processing is described in the program's specifications.
A NONLINKED
REPRESENTATION OF BINARY TREES
Our discussion of the implementation of binary trees has so far been lim-
ited to a scheme in which the pointers from parent to children are explicit
in the data structure. A field is declared in each node for the pointer to the
left child and the pointer to the right child.
There is a way to store a binary tree in an array in which the relation-
ships in the tree are not physically represented by link fields, but are im-
plicit in the algorithms that manipulate the tree stored in the array. The
code is, of course, much less self-documenting, but we save memory space
because there are no pointers.
Let's take a binary tree and store it in an array in such a way that the
parent-children relationships are not lost. We will store the tree in the array
level by level, left to right. This mapping is as follows:
        O
      /   \
     J     S
    / \   / \
   C   M Q   U

TREE
[1] 'O'
[2] 'J'
[3] 'S'
[4] 'C'
[5] 'M'
[6] 'Q'
[7] 'U'
The number of nodes in the tree is NUMNODES. The tree is stored with
the root in TREE[1] and the last node in TREE[NUMNODES].
If we can take the array representation and redraw the tree, the relation-
ships have been maintained. We can then write algorithms to manipulate
the tree in this form. An examination of where in the array a node's children
reside will give us the information we need to reconstruct the tree.
TREE[1]'s children are in TREE[2] and TREE[3]
TREE[2]'s children are in TREE[4] and TREE[5]
TREE[3]'s children are in TREE[6] and TREE[7]
Do you see the pattern? For any node TREE[I],
TREE[I]'s left child is in TREE[I * 2]
TREE[I]'s right child is in TREE[I * 2 + 1]
Another way of saying this is that the root of the tree is stored in
TREE[1], and for a node stored in TREE[I], the root of its left subtree will
be in TREE[I * 2] and the root of its right subtree will be in TREE[I * 2 +
1], provided those indexes do not exceed NUMNODES. Notice that the nodes in the
array from
TREE[NUMNODES DIV 2 + 1] to TREE[NUMNODES]
are leaf nodes.
In fact, we can take any arbitrary array and create a binary tree. Whether
it means anything as a tree is another matter. We will use this fact in the
discussion of the heapsort algorithm later in this chapter.
Let's write a procedure to print the elements in the tree in preorder
using the array representation of the tree. Rather than reinvent the wheel,
let's see if we can't simply adapt Procedure PREORDER written for the
dynamic storage representation.
Let's go through this procedure line by line and see if we can make the
switch to the new representation. The parameter will be an index rather
than a pointer. What does P <> NIL actually tell us? If this condition is
false (if P is equal to NIL), the last node we visited was a leaf node. In the
array representation, if the index into the array is outside of the array
bound, the last node visited was a leaf node. Now we simply replace the
pointer reference with the corresponding array reference and we have fin-
ished changing the procedure.
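Following that recipe, a sketch of the converted procedure might look like this. The parameter is an index, the test P <> NIL becomes a test that the index is still within the tree, and each pointer reference becomes an array reference. The global variables and the sample tree (the seven letters stored level by level) are assumptions made so that the program is complete.

```pascal
PROGRAM ARRAYPREORDER;

CONST
  MAXNODES = 7;

TYPE
  TREETYPE = ARRAY [1..MAXNODES] OF CHAR;

VAR
  TREE     : TREETYPE;
  NUMNODES : INTEGER;

PROCEDURE PREORDER (I : INTEGER);
(* Prints the subtree whose root is in TREE[I] in preorder. *)
BEGIN
  IF I <= NUMNODES THEN              (* replaces P <> NIL *)
    BEGIN
      WRITE(TREE[I]);                (* visit the node    *)
      PREORDER(I * 2);               (* left subtree      *)
      PREORDER(I * 2 + 1)            (* right subtree     *)
    END
END;

BEGIN
  TREE[1] := 'O';
  TREE[2] := 'J';  TREE[3] := 'S';
  TREE[4] := 'C';  TREE[5] := 'M';
  TREE[6] := 'Q';  TREE[7] := 'U';
  NUMNODES := 7;
  PREORDER(1);                       (* prints OJCMSQU *)
  WRITELN
END.
```

This version assumes the tree is completely filled out through TREE[NUMNODES]; it makes no provision for missing nodes.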
What happens if the tree is not completely filled out? For example, the
whole right subtree of the node whose value is O might not be there. To use
this representation, we must store a dummy value in those positions in the
array in order to maintain the proper parent-child relationship. What that
dummy value might be would depend on what the information represented
in the tree actually is. If the tree represented data that could not logically
contain a number, we could use a number as a dummy value. If it is possi-
ble that your tree might not be completely filled out, the algorithms to
manipulate the tree have to reflect this possibility. For example, to deter-
mine if the last node visited was a leaf, you would have to compare the
value in DATA[I] to that dummy value, after checking to see if I is within
the array bounds.
In the next section we will discuss a sorting technique that is based on a
binary tree represented in an array.
HEAPSORT
The heapsort sorting algorithm sorts an array of values that represent a tree
with a special property: the heap property. We will need several formal
definitions in order to develop this algorithm.
A full binary tree is a binary tree in which all the leaves are on the same
level and every nonleaf node has two children. The tree in Figure 9-8 (page
385) is a full binary tree.
A complete binary tree is a binary tree that is either full or full through
the next-to-last level, with the leaves on the last level as far left as possible.
Figure 9-9 shows some examples.
Figure 9-9. [Six example binary trees, (a) through (f); the labels "complete" and "neither" mark which of them are complete.]
A complete binary tree is a heap if, for every node, the value stored in
that node is greater than or equal to the value in each of its children. If a
heap is stored in an array representation as described earlier, the heap
property means that, for every nonleaf node TREE[I],
(TREE[I] >= TREE[I * 2]) AND (TREE[I] >= TREE[I * 2 + 1])
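The heap property can be tested directly from this condition. The sketch below is an illustration only; the function name ISHEAP and the sample values are assumptions. It checks every nonleaf node against its children, taking care that the last nonleaf node may have only a left child.

```pascal
PROGRAM HEAPCHECK;

CONST
  MAXNODES = 9;

TYPE
  TREETYPE = ARRAY [1..MAXNODES] OF INTEGER;

VAR
  TREE : TREETYPE;
  TEMP : INTEGER;

FUNCTION ISHEAP (VAR TREE : TREETYPE; NUMNODES : INTEGER) : BOOLEAN;
VAR
  I  : INTEGER;
  OK : BOOLEAN;
BEGIN
  OK := TRUE;
  (* Nonleaf nodes are TREE[1] .. TREE[NUMNODES DIV 2]. *)
  FOR I := 1 TO NUMNODES DIV 2 DO
    BEGIN
      IF TREE[I] < TREE[I * 2] THEN OK := FALSE;
      IF I * 2 + 1 <= NUMNODES THEN         (* right child exists *)
        IF TREE[I] < TREE[I * 2 + 1] THEN OK := FALSE
    END;
  ISHEAP := OK
END;

BEGIN
  (* Sample values, chosen for illustration *)
  TREE[1] := 100;  TREE[2] := 19;  TREE[3] := 36;
  TREE[4] := 17;   TREE[5] := 3;   TREE[6] := 25;
  TREE[7] := 1;    TREE[8] := 2;   TREE[9] := 7;
  WRITELN(ISHEAP(TREE, MAXNODES));          (* prints TRUE  *)

  (* Exchange the root with the last node; the root is now too small. *)
  TEMP := TREE[1];  TREE[1] := TREE[9];  TREE[9] := TEMP;
  WRITELN(ISHEAP(TREE, MAXNODES))           (* prints FALSE *)
END.
```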
The figure that follows is a tree with the heap property and its represen-
tation in an array. Take a good look and see what it is about this representa-
tion that should help us in sorting the array.
We know where the maximum value in the array is: it will always be in
TREE[1] if the tree is a heap. We will make use of this fact in the following
strategy. The values we wish to sort are stored in the array, DATA, which
contains MAXDATA elements.
          Z
        /   \
       S     T
      / \   / \
     O   C G   E
    /
   B

TREE
[1] 'Z'
[2] 'S'
[3] 'T'
[4] 'O'
[5] 'C'
[6] 'G'
[7] 'E'
[8] 'B'
DATA[1] .. DATA[MAXDATA - 1]
DATA[1] .. DATA[MAXDATA - 2]
and so on.
This process is pictured as follows:
[Figure: the array DATA, beginning at DATA[1], as the sort proceeds.]
Now we can write the HEAPSORT procedure. Yes, we have not yet
shown how to build a heap. But in good top-down fashion, we will give this
task a name, BUILDHEAP, and come back to it later.
discussion at the point that BUILDHEAP is called the first time through
the loop. We know that the left subtree and the right subtree of the root are
still heaps, because only the root node, DATA[1], has been changed.
[Figure: after the swap, the two subtrees of the root are each STILL A HEAP.]
When BUILDHEAP is called within the loop, there are two possibili-
ties. If the value in DATA[1] is greater than or equal to the values of its
children, the heap property is intact and we don't have to do anything. In
the other case, we know that the maximum value that we need to put into
DATA[1] is either in its left child, DATA[2], or in its right child, DATA[3].
We will determine which of the children has the larger value and swap
that value with the value in DATA[1]. Now we know that either the subtree
whose root value was swapped with DATA[1] is a heap, in which case we
are through, or its root value must be swapped with the maximum of the
values of its children.
Yes, this is a recursive process. We are working with smaller and smaller
subtrees until
1. the value in the root of the subtree being examined is greater than or
equal to the values of its two children, or
2. the root of the subtree being examined is a leaf node.
Since this is a recursive process, a recursive solution seems logical.
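Figure 9-10. [Trees (a) through (e): the original tree and the result of each application of BUILDHEAP.]
                             [1] [2] [3] [4] [5] [6] [7] [8] [9]
ORIGINAL VALUES               25  17  36   2   3 100   1  19   7
AFTER BUILDHEAP, COUNT = 4    25  17  36  19   3 100   1   2   7
AFTER COUNT = 3               25  17 100  19   3  36   1   2   7
AFTER COUNT = 2               25  19 100  17   3  36   1   2   7
AFTER COUNT = 1              100  19  36  17   3  25   1   2   7
Tree is a heap.
Figure 9-11.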
The subtrees whose roots contain the values 19, 7, 3, 100, and 1 are
heaps because they are leaf nodes. Therefore, the shaded subtree in Figure
9-10(a) has left and right subtrees that are heaps. Procedure BUILDHEAP
will take the tree whose root has the value 2 and return the tree as illus-
trated in Figure 9-10(b).
The shaded subtree in Figure 9-10(b) has left and right subtrees that are
heaps. Procedure BUILDHEAP will take that tree and return the one in
Figure 9-10(c). Procedure BUILDHEAP will take the shaded tree in Fig-
ure 9-10(c) and return the tree in Figure 9-10(d).
Now we just have to apply BUILDHEAP to the whole tree and we have
changed the original tree shown in Figure 9-10(a) into the heap shown in
Figure 9-10(e). We have done this conversion by successively applying
Procedure BUILDHEAP.
We can now see that we will have to change our Procedure HEAPSORT
slightly. The first call to BUILDHEAP must be within a loop where the
variable for the lower bound starts with the first nonleaf node and ends
with 1. What is the first nonleaf node? Since half of the nodes of a complete
binary tree are leaves (prove this yourself), the first nonleaf node may be
found at DATA[MAXDATA DIV 2].
PROCEDURE HEAPSORT (VAR DATA : ARRAYTYPE;
                    MAXDATA : INDEXTYPE);
(* Sorts the first MAXDATA elements of array DATA in *)
(* ascending order, using the heapsort algorithm.    *)
BEGIN (* heapsort *)
  (* Build the original heap. *)
  FOR COUNT := (MAXDATA DIV 2) DOWNTO 1 DO
    BUILDHEAP(DATA, COUNT, MAXDATA);
END; (* heapsort *)
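Putting the pieces together, here is a minimal sketch of a complete heapsort program. It is one way to fill in the parts discussed in this section, not necessarily the original code. The body of BUILDHEAP follows the recursive description given earlier; the sorting loop swaps the root with the last unsorted element and then restores the heap over the remaining elements. The parameter names ROOT and BOTTOM, the local variable MAXCHILD, the procedure SWAP, and the sample values are assumptions.

```pascal
PROGRAM HEAPSORTDEMO;

CONST
  MAXINDEX = 9;

TYPE
  INDEXTYPE = 1..MAXINDEX;
  ARRAYTYPE = ARRAY [INDEXTYPE] OF INTEGER;

VAR
  DATA : ARRAYTYPE;
  I    : INTEGER;

PROCEDURE SWAP (VAR A, B : INTEGER);
VAR TEMP : INTEGER;
BEGIN
  TEMP := A;  A := B;  B := TEMP
END;

PROCEDURE BUILDHEAP (VAR DATA : ARRAYTYPE; ROOT, BOTTOM : INTEGER);
(* Assumes the subtrees of DATA[ROOT] are heaps. Restores the *)
(* heap property for the tree stored in DATA[ROOT]..DATA[BOTTOM]. *)
VAR MAXCHILD : INTEGER;
BEGIN
  IF ROOT * 2 <= BOTTOM THEN                 (* ROOT is not a leaf *)
    BEGIN
      MAXCHILD := ROOT * 2;                  (* left child *)
      IF MAXCHILD < BOTTOM THEN              (* right child exists *)
        IF DATA[MAXCHILD + 1] > DATA[MAXCHILD] THEN
          MAXCHILD := MAXCHILD + 1;
      IF DATA[ROOT] < DATA[MAXCHILD] THEN
        BEGIN
          SWAP(DATA[ROOT], DATA[MAXCHILD]);
          BUILDHEAP(DATA, MAXCHILD, BOTTOM)
        END
    END
END;

PROCEDURE HEAPSORT (VAR DATA : ARRAYTYPE; MAXDATA : INDEXTYPE);
VAR COUNT : INTEGER;
BEGIN (* heapsort *)
  (* Build the original heap. *)
  FOR COUNT := (MAXDATA DIV 2) DOWNTO 1 DO
    BUILDHEAP(DATA, COUNT, MAXDATA);
  (* Sorting loop: move the maximum to the end, then reheap. *)
  FOR COUNT := MAXDATA DOWNTO 2 DO
    BEGIN
      SWAP(DATA[1], DATA[COUNT]);
      BUILDHEAP(DATA, 1, COUNT - 1)
    END
END; (* heapsort *)

BEGIN
  (* Sample values, chosen for illustration *)
  DATA[1] := 25;  DATA[2] := 17;   DATA[3] := 36;
  DATA[4] := 2;   DATA[5] := 3;    DATA[6] := 100;
  DATA[7] := 1;   DATA[8] := 19;   DATA[9] := 7;
  HEAPSORT(DATA, MAXINDEX);
  FOR I := 1 TO MAXINDEX DO
    WRITE(DATA[I], ' ');                     (* 1 2 3 7 17 19 25 36 100 *)
  WRITELN
END.
```

With these sample values, the first loop leaves the array as 100 19 36 17 3 25 1 2 7 before the sorting loop begins. When the two children are equal, this sketch arbitrarily prefers the left child.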
Figure 9-12 takes the heap created in Figure 9-11 (the last line) and shows
the changes in the array that occur as a result of each iteration of the sorting
loop. The elements that are sorted are underlined.
[1] [2] [3] [4] [5] [6] [7] [8] [9]
HEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
SWAP
BUILDHEAP
EXIT FROM
SORTING LOOP
Figure 9-12.
GRAPHS
Binary trees provide a very useful way of representing relationships in
which a hierarchy exists. That is, a node is pointed to by at most one other
node, and each node points to at most two other nodes. If we remove the
restriction that each node can point to at most two other nodes, we have a
nonbinary tree, as pictured on the next page.
More Terminology
If (Vi, Vj) is an edge in E(G), Vi and Vj are said to be adjacent. If (Vi, Vj) is
a directed edge, you say Vi is adjacent to Vj, and Vj is adjacent from Vi.
V(G1) = {A, B, C, D}
E(G1) = {(A, B), (A, D), (B, C), (B, D)}
[A picture of G1]
[A picture of G2]
V(G3) = {1, 3, 5, 7, 9, 11}
E(G3) = {(1, 3), (3, 1), (5, 7), (5, 9), (9, 11), (9, 9), (11, 1)}
[A picture of G3]
Figure 9-13.
A path from vertex Vi to vertex Vj is a sequence of vertices that connects
Vi to Vj. For a path to exist from Vi to Vj, there must be edges (Vi, Vk1),
(Vk1, Vk2), ..., (Vkn, Vj). That is, there must be an uninterrupted se-
quence of lines from Vi through any number of nodes to Vj. In the following
graph there is a path from vertex A to vertex G, but not from vertex A to
vertex C.
[Figure: a graph whose vertices include A and M.]
Figure 9-14.
Figure 9-15. [A graph of towns with a number on each edge.]
Representing Graphs
A graph can be represented by either an adjacency matrix or adjacency lists.
If there are N nodes in a graph, an adjacency matrix is a table with N rows
and N columns. The value in the [i, j] position in the table is 1 if (Vi, Vj) is
an edge in E and 0 otherwise. If the graph is a weighted graph, the [i, j] cell
can contain the weight on that edge if the edge is in E and a 0 otherwise.
Adjacency lists are linked lists, one for each node, containing the names
of the nodes to which it is connected. The heads of each of these lists are
held in an array of pointers. Figure 9-16 shows graph G2 represented both
as an adjacency matrix and as adjacency lists.
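Both representations can be declared in a few lines. The sketch below is an illustration under assumptions: the seven vertices are named O, J, S, C, M, Q, and U, the six directed edges run from each parent to its children in the tree of those letters, and all identifiers (MATRIX, LISTS, EDGENODE, and so on) are ours. The program fills in the adjacency matrix, builds the adjacency lists from it, and prints the lists.

```pascal
PROGRAM GRAPHREPS;

CONST
  N = 7;                                   (* number of vertices *)

TYPE
  VERTEXTYPE = 1..N;
  EDGEPTR    = ^EDGENODE;
  EDGENODE   = RECORD
                 VERTEX : VERTEXTYPE;      (* vertex at the end of the edge *)
                 NEXT   : EDGEPTR
               END;

VAR
  NAMES  : PACKED ARRAY [1..N] OF CHAR;
  MATRIX : ARRAY [VERTEXTYPE, VERTEXTYPE] OF 0..1;
  LISTS  : ARRAY [VERTEXTYPE] OF EDGEPTR;  (* heads of the lists *)
  I, J   : INTEGER;
  P      : EDGEPTR;

BEGIN
  NAMES := 'OJSCMQU';

  (* Start with no edges. *)
  FOR I := 1 TO N DO
    BEGIN
      LISTS[I] := NIL;
      FOR J := 1 TO N DO
        MATRIX[I, J] := 0
    END;

  (* Adjacency matrix: a 1 in [I, J] means an edge from I to J. *)
  MATRIX[1, 2] := 1;  MATRIX[1, 3] := 1;   (* O to J, O to S *)
  MATRIX[2, 4] := 1;  MATRIX[2, 5] := 1;   (* J to C, J to M *)
  MATRIX[3, 6] := 1;  MATRIX[3, 7] := 1;   (* S to Q, S to U *)

  (* Adjacency lists: insert at the front, scanning each row   *)
  (* backward so that the lists end up in left-to-right order. *)
  FOR I := 1 TO N DO
    FOR J := N DOWNTO 1 DO
      IF MATRIX[I, J] = 1 THEN
        BEGIN
          NEW(P);
          P^.VERTEX := J;
          P^.NEXT := LISTS[I];
          LISTS[I] := P
        END;

  (* Print each vertex followed by the vertices adjacent from it. *)
  FOR I := 1 TO N DO
    BEGIN
      WRITE(NAMES[I], ':');
      P := LISTS[I];
      WHILE P <> NIL DO
        BEGIN
          WRITE(' ', NAMES[P^.VERTEX]);
          P := P^.NEXT
        END;
      WRITELN
    END
END.
```

The matrix always takes N * N cells, while the lists take one node per edge, so the lists save space when a graph has few edges. For a weighted graph, the matrix cells (or a field added to EDGENODE) would hold the weight instead of a 1.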
With all data structures, we need a systematic way to reach or search for
each element. To access each element in a one-dimensional array, we use a
count-controlled loop going from the index of the first element to the index
of the last element. For a two-dimensional array, we use a pair of nested
count-controlled loops. For a stack, we pop each element until the stack
becomes empty. For a queue, we remove each element until the queue
becomes empty.
For a tree, three traversals are commonly used, each of which goes to the
deepest level of the tree and works up. This strategy of going down a
branch to its deepest point and moving up is called a depth-first strategy.
To systematically visit each node in a tree, we could visit each node on
level 0 (the root), then each node on level 1, then each node on level 2, etc.
Visiting each node by level in this way is called a breadth-first strategy.
With graphs, both a depth-first strategy and a breadth-first strategy are use-
ful. We will outline both algorithms within the context of an example.
Adjacency matrix
    O  J  S  C  M  Q  U
O   0  1  1  0  0  0  0
J   0  0  0  1  1  0  0
S   0  0  0  0  0  1  1
C   0  0  0  0  0  0  0
M   0  0  0  0  0  0  0
Q   0  0  0  0  0  0  0
U   0  0  0  0  0  0  0

Adjacency lists
O → J → S
J → C → M
S → Q → U
C
M
Q
U
Figure 9-16.
Using Graphs
Graphs are useful structures for representing physical relationships such as
roads between cities or airline routes. Let's assume that we are interested
in airline routes between seven cities: Austin, Dallas, Houston, Denver,
Atlanta, Chicago, and Washington, D.C.
Our favorite airline flies the following routes:
Atlanta to Houston
Atlanta to Washington
Austin to Dallas
Austin to Houston
Chicago to Denver
Dallas to Austin
Dallas to Chicago
Dallas to Denver
Denver to Atlanta
Denver to Chicago
Houston to Atlanta
Washington to Atlanta
Washington to Dallas
[Figure: the airline routes drawn as a directed graph whose vertices are AUSTIN, DALLAS, HOUSTON, DENVER, ATLANTA, CHICAGO, and WASHINGTON.]
The question we can answer with this data structure is "Can I get from
city X to city Y on my favorite airline?" This is equivalent to asking whether
a path exists in the graph from city X to city Y.
The strategy we will use is first to take our starting city and see if we can
reach our destination directly. If we can't, we will take each of the places
we can reach directly and see if we can reach our destination directly from
any one of them (i.e., in one stop). If we can't, we repeat the process to see
if we can reach our destination from any of the next level of cities (i.e., in
two stops). The process continues until we find our destination or deter-
mine that we can't get there from here.
We need a systematic way to keep track of the cities as we investigate
them. If we can't reach our destination in one stop, we need to remember
all the cities we can reach in one stop so we can fan out from them if we
don't find our destination at the one-stop level.
Does this problem of remembering alternative branching points sound
familiar? Remember the maze problem in Chapter 3? We kept track of alter-
native routes by using a stack. We can do the same thing here. Rather than
put the row number and column number of a square on the stack, we will
put the cities we need to fan out from on the stack.
We will begin by looking at each city we can reach directly. If we find
our destination, the search is over. Otherwise, each city is put on the stack.
When we have looked at all the cities we can reach directly and have not
found our destination, we will pop the stack and start looking for our desti-
nation from the city that we have taken from the stack.
Let's apply this algorithm to our sample airline route data. We want to go
from Austin to Washington. The places we can reach directly from Austin
are Dallas and Houston. Neither is our destination, so they are pushed onto
the stack. [See Figure 9-18(a).] When we determine that there are no other
flights from Austin, we pop the stack and start searching from Houston.
Atlanta can be reached from Houston, but that isn't our destination, so
Atlanta goes on the stack. [See Figure 9-18(b).] There are no more flights
out of Houston on this airline, so we pop the stack and begin our search
from Atlanta.
There is a flight from Atlanta to Houston, but since that isn't our destina-
tion, Houston goes on the stack. [See Figure 9-18(c).] There is also a flight
from Atlanta to Washington. Washington is indeed our destination, so our
question has been answered. Yes, we can take our favorite airline from
Austin to Washington.
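The stack-based search can be sketched as follows. The routes are the thirteen flights listed earlier, stored in an adjacency matrix of Booleans. One detail goes beyond the description so far and is our own addition: each city is marked as visited when it is put on the stack, so that no city is ever stacked twice and the search must end. The names CITYTYPE, FLIGHT, CANFLY, and VISITED are assumptions.

```pascal
PROGRAM AIRLINE;

TYPE
  CITYTYPE = (ATLANTA, AUSTIN, CHICAGO, DALLAS, DENVER,
              HOUSTON, WASHINGTON);

VAR
  FLIGHT : ARRAY [CITYTYPE, CITYTYPE] OF BOOLEAN;
  A, B   : CITYTYPE;

FUNCTION CANFLY (START, DEST : CITYTYPE) : BOOLEAN;
(* Returns TRUE if a path exists from START to DEST. *)
VAR
  STACK      : ARRAY [1..7] OF CITYTYPE;   (* each city stacked at most once *)
  TOP        : INTEGER;
  VISITED    : ARRAY [CITYTYPE] OF BOOLEAN;
  CITY, NEXT : CITYTYPE;
  FOUND      : BOOLEAN;
BEGIN
  FOR CITY := ATLANTA TO WASHINGTON DO
    VISITED[CITY] := FALSE;
  FOUND := FALSE;
  TOP := 1;
  STACK[1] := START;
  VISITED[START] := TRUE;
  WHILE (TOP > 0) AND NOT FOUND DO
    BEGIN
      CITY := STACK[TOP];                  (* pop the stack *)
      TOP := TOP - 1;
      (* Look at each city we can reach directly from CITY. *)
      FOR NEXT := ATLANTA TO WASHINGTON DO
        IF FLIGHT[CITY, NEXT] THEN
          IF NEXT = DEST
            THEN FOUND := TRUE
            ELSE IF NOT VISITED[NEXT] THEN
              BEGIN
                VISITED[NEXT] := TRUE;
                TOP := TOP + 1;            (* push the city *)
                STACK[TOP] := NEXT
              END
    END;
  CANFLY := FOUND
END;

BEGIN
  FOR A := ATLANTA TO WASHINGTON DO
    FOR B := ATLANTA TO WASHINGTON DO
      FLIGHT[A, B] := FALSE;
  FLIGHT[ATLANTA, HOUSTON] := TRUE;
  FLIGHT[ATLANTA, WASHINGTON] := TRUE;
  FLIGHT[AUSTIN, DALLAS] := TRUE;
  FLIGHT[AUSTIN, HOUSTON] := TRUE;
  FLIGHT[CHICAGO, DENVER] := TRUE;
  FLIGHT[DALLAS, AUSTIN] := TRUE;
  FLIGHT[DALLAS, CHICAGO] := TRUE;
  FLIGHT[DALLAS, DENVER] := TRUE;
  FLIGHT[DENVER, ATLANTA] := TRUE;
  FLIGHT[DENVER, CHICAGO] := TRUE;
  FLIGHT[HOUSTON, ATLANTA] := TRUE;
  FLIGHT[WASHINGTON, ATLANTA] := TRUE;
  FLIGHT[WASHINGTON, DALLAS] := TRUE;
  WRITELN(CANFLY(AUSTIN, WASHINGTON))      (* prints TRUE *)
END.
```

For the trip from Austin to Washington, this sketch examines the cities in the same order as the walk-through: Dallas and Houston are stacked, Houston is popped, Atlanta is stacked and popped, and Washington is found from Atlanta. Because of the marking, Houston is not stacked a second time.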
However, if there were only one flight from Atlanta and it were to Hous-
ton, our algorithm would be in trouble. In fact, it would be an infinite
Figure 9-19.
[Figure: the airline route graph with the starting city, AUSTIN, marked START HERE.]
Figure 9-20.
1. Show the preorder, inorder, and postorder notation of the following binary ex-
pression trees. (Note that you must supply parentheses for the inorder notation.)
(a) (b)
+
*
*
a
c d a b
12
4 2
3. Show the binary expression tree that represents the following preorder expres-
sion:
/- +abc+d-e*f+gh
[infix = (((a + b) - c)/(d + (e - (f *(g + h)))))]
4. Write a recursive procedure to print the preorder representation of a binary
expression tree. (Use the variant record implementation of the nodes in the
tree.)
5. Tell whether the following trees are complete, full, or neither.
(a)
m
b s w
a h
(b) m
b s w
(c)
m
s w
a c
p s
8. (a) Show how the representation of a complete binary tree as an array can be
modified to represent any binary tree.
(b) Implement your representation on the following tree.
35
26 47
12 28 82
30 61
9. Given an array with the following values, show how the original heap would be
arranged.
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
10. Show how the heap in Problem 6 would look after three values were in place,
before reheaping.
11. Draw a picture of the undirected graphs specified below.
(a) G1 = (V, E)
V(G1) = {X, Y, Z, W}
E(G1) = {(X, Y), (X, Z), (Z, Z), (Z, W)}
(b) G2 = (V, E)
V(G2) = {RED, PURPLE, WHITE, PINK, BLUE}
E(G2) = {(RED, PURPLE), (RED, WHITE), (PINK, PINK),
(BLUE, PINK)}
12. Draw a picture of the directed graphs specified below.
(a) G1 = (V, E)
V(G1) = {MARY, JOSH, SUSAN, GEORGE, BILL, SARAH}
E(G1) = {(MARY, BILL), (SUSAN, JOSH), (SARAH, JOSH),
(SARAH, BILL), (JOSH, SUSAN), (GEORGE, SUSAN),
(MARY, GEORGE), (BILL, MARY), (GEORGE, SARAH)}
(b) G2 = (V, E)
V(G2) = {0, 1, 2, 3, 5, 8, 13, 21}
E(G2) = {(0, 1), (1, 2), (2, 3), (3, 5), (5, 8), (8, 13), (13, 21)}
13. Draw the adjacency matrix for G1 and G2 in Problem 12.
14. Using the adjacency matrix for G1 developed in Problem 13, describe the path
from SARAH to SUSAN using
(a) a breadth-first strategy.
(b) a depth-first strategy.
Pre-Test
4 5 8
15
7 6
10 80
3. Given an array with the following values, show how the original heap would
look:
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
4. Show how the heap produced by Problem 3 would look after three values were
in place, after reheaping.
5. Define the graphs pictured:
(a)
0~(----~)®
1
o
G1 is a directed graph.
G1 = (V, E)
V(G1) = { }
E(G1) = { }
93 21
(b) a )------{
14 47 30
C t-----"""""'d
50
Goals
To be able to apply verification rules to assignment
statements.
To be able to apply verification rules to selection statements.
To be able to determine a loop invariant.
To be able to apply verification rules to loops.
How many times have you corrected "one last bug" only to find another?
How many "only one more run" runs have you made? If you are human, far
too many.
Debugging gets rid of known bugs. Good testing helps you to find more
bugs. Unless the test data checks every possible combination of branches
with every possible input, however, testing cannot really prove conclu-
sively that the program is correct. In a large program, the number of possi-
ble combinations of branches makes this approach unfeasible.
For this reason, one of the theoretical areas of computer science research
involves program verification. The goal of this research is to establish a
method of proving programs correct that is analogous to the method of
proving theorems in geometry. The techniques exist to do this, but the
proofs of the programs are long and often more complicated than the pro-
grams themselves. So the proofs must be proved correct, and the proofs of
the proofs, and so on.
Therefore, the major thrust of verification research is building automatic
program provers. That is, computer scientists would like to build a verifia-
ble program that verifies other programs.
Why is this subject being mentioned in a text to be used for a second or
third programming course? Program verification is discussed here for two
reasons. First, to stress the point that computer science is indeed a science.
There is a body of mathematical theory that can be used to help us con-
struct good programs. Second, program verification is discussed to demon-
strate how some of its techniques can be useful now to you, a beginning
computer science student.
Assignment Statement
The general rule for assignment statements is as follows:
{P} v := e {Q}
This means that {P} is {Q} with every occurrence of v replaced
with e.
The following table gives some illustrations of this rule:
{P}               S              {Q}
{J = 6}           J := J + 1     {J = 7}
{K = 3}           L := K + 3     {K = 3 AND L = 6}
{DATA[I] = J}     J := J + 1     {DATA[I] = J - 1}
If we look at the output assertion {Q}, we can see that every occurrence of v
(the left side of the assignment) in {Q} is really e (the right side of the
assignment) before the assignment is done. That is what this rule is saying.
4121 Chapter 10 Verification
{Q}
Note that if there is no ELSE clause, S2 is the identity function; that is,
everything keeps its previous identity.
The following example to set MAX to the maximum of X and Y demon-
strates how the verification of the selection statement works.
{P}  {true}  (* no preconditions exist *)
     IF X > Y
       THEN {X > Y}  {X = maximum(X, Y)}
{S1}     MAX := X    {MAX = maximum(X, Y)} is TRUE
       ELSE {X <= Y} {Y = maximum(X, Y)}
{S2}     MAX := Y    {MAX = maximum(X, Y)} is TRUE
{Q}  {MAX = maximum(X, Y)}
In the THEN branch we know that X is greater than Y, and thus maxi-
mum(X, Y) equals X. In the ELSE branch we know that X is not greater
than Y, and thus maximum(X, Y) equals Y.
The next example shows a case where the verification fails. The segment
of code is supposed to set MAX to the maximum of X and Y.
{P}  {true}  (* no preconditions exist *)
     IF X > Y
{S1}   THEN {X > Y}  {X = maximum(X, Y)}
         MAX := X
         {MAX = maximum(X, Y)} is TRUE
       (* no else *) {X <= Y} {MAX = MAX}
{S2}   (* identity *) {MAX = maximum(X, Y)} is undefined
{Q}  {MAX = maximum(X, Y)} is FALSE.
In the THEN branch we know that X is greater than Y, and thus maxi-
mum(X, Y) equals X. In the ELSE branch, we know that X is not greater
than Y. However, because there is no action, MAX is unchanged and it
is probably left containing an incorrect or undefined value.
[Figure: flowchart of a loop controlled by the expression B.]
Just what is this loop invariant {I} that we are trying to prove? It is a
statement about the relationships of the variables involved in the loop. It
expresses the semantics of the loop, as opposed to the mechanics. The
expression B, which controls the loop repetition, is used to tell when the
semantics of the loop would be violated if another repetition were per-
formed.
Let's tie all this together by looking at an example: summing the N inte-
ger values in an array, DATA.
SUM := 0;
I := 1;
WHILE I <= N DO
  BEGIN
    SUM := SUM + DATA[I];
    I := I + 1
  END
Initialization The loop invariant says that at the end of any iteration, the
area of the array before DATA[I] is already summed. Substituting 1 for I in
the loop invariant, we get "SUM contains the sum of all the elements in the
array up to but not including DATA[1]," which is correct, since SUM = 0.
[Figure 10-1. The array DATA, indexed from [1] to [N]: the part of the
array from DATA[1] through DATA[I - 1] is summed.]
Preservation We know from the loop invariant that SUM contains the
sum of the values from DATA[1] up to and including DATA[I - 1]. The
first statement in the loop adds the value in DATA[I] to SUM. So now SUM
contains the sum of the values from DATA[1] up to and including DATA[I].
The next statement increments I. Applying the rule for assignment
statements, we replace the I with I - 1 in the previous statement. We have
that SUM contains the sum of the values from DATA[1] up to and including
DATA[I - 1], which is our loop invariant.
This seems like an awful lot of work to show something that is obvious
anyway: the code is correct. It may be obvious in this small example, but
the technique can be used to verify much more complicated examples.
How can this technique be immediately useful? If you think about your
loop in terms of the loop invariant first (i.e., in terms of the semantics of
your loop), you can choose the correct initialization and termination
conditions the first time. Substituting them into the invariant, you can show
that the initialization is correct and that the termination is correct. Second,
if the invariant is added to the code as a comment, a reader can immediately
tell the meaning of the loop, not just the code to accomplish it.
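As an illustration of the last point, the summing loop can carry its invariant as a comment. This is only a sketch: the program name, the array size N = 5, and the sample values are assumptions made so that the fragment can run on its own.

```pascal
PROGRAM SumDemo;
(* Sketch only: N and the contents of DATA are made-up sample values. *)
CONST
  N = 5;
VAR
  DATA : ARRAY[1..N] OF INTEGER;
  SUM, I : INTEGER;
BEGIN
  (* Fill the array with the sample values 10, 20, ..., 50. *)
  FOR I := 1 TO N DO
    DATA[I] := 10 * I;

  SUM := 0;
  I := 1;
  (* Invariant: 1 <= I <= N + 1 AND SUM = DATA[1] + ... + DATA[I - 1] *)
  WHILE I <= N DO
    BEGIN
      SUM := SUM + DATA[I];
      I := I + 1
    END;
  (* On exit I = N + 1, so SUM = DATA[1] + ... + DATA[N]. *)
  WRITELN('SUM = ', SUM)
END.
```

With the sample values above the loop should report a sum of 150. The comment states both the range of I and the meaning of SUM, so a reader can check initialization (I = 1, SUM = 0) and termination (I = N + 1) by substitution.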
LABEL 10;
CONST MAX = 100;
VAR LIST : ARRAY[1..MAX] OF INTEGER;
VALUE1,
PLACE : INTEGER;
PLACE := 1;
WHILE PLACE <= MAX DO
IF LIST[PLACE] = VALUE1
THEN GOTO 10
ELSE PLACE := PLACE + 1
The invariant of this loop is "VALUE1 is not contained in the array LIST
up to but not including LIST[PLACE]," and the terminating condition on
the loop control is PLACE <= MAX. But if VALUE1 is found in the array,
the terminating condition will still be true at the exit of the loop. Note that
the code is functionally correct; it works. However, it is very difficult to
verify programs that include GOTO statements.
The difficulty of verification is not the only problem with the GOTO
statement. The use of GOTO leads to unstructured programs with multiple
entries into and exits from control structures. This is sometimes called
"spaghetti code" because trying to read it can be like trying to untangle a
bowl of spaghetti. Programs with GOTO statements are harder to read,
understand, modify, and debug.
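For comparison, the same search can be written without a GOTO, so that the loop has a single exit and the invariant can be applied at that exit. The following is a sketch under stated assumptions: the Boolean flag FOUND, the shortened MAX = 5, and the sample data are not part of the original fragment.

```pascal
PROGRAM SearchDemo;
(* Sketch only: MAX is shortened to 5 and the data are sample values. *)
CONST
  MAX = 5;
VAR
  LIST : ARRAY[1..MAX] OF INTEGER;
  VALUE1, PLACE : INTEGER;
  FOUND : BOOLEAN;   (* hypothetical flag that replaces the GOTO *)
BEGIN
  LIST[1] := 7; LIST[2] := 3; LIST[3] := 9; LIST[4] := 4; LIST[5] := 1;
  VALUE1 := 9;

  PLACE := 1;
  FOUND := FALSE;
  (* Invariant: VALUE1 is not contained in LIST[1]..LIST[PLACE - 1] *)
  WHILE (PLACE <= MAX) AND NOT FOUND DO
    IF LIST[PLACE] = VALUE1
      THEN FOUND := TRUE
      ELSE PLACE := PLACE + 1;
  (* On exit: either FOUND is TRUE and LIST[PLACE] = VALUE1, *)
  (* or PLACE = MAX + 1 and VALUE1 is not in the array.      *)
  IF FOUND
    THEN WRITELN('Found at position ', PLACE)
    ELSE WRITELN('Not found')
END.
```

Because the loop can only be left when its controlling expression is false, the negated condition combined with the invariant gives the assertion that holds after the loop.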
An examination of the way the values in the array DATA are changing
reveals the following pattern:
(a) DATA[1]..DATA[1] is sorted.
(b) DATA[1]..DATA[2] is sorted.
(c) DATA[1]..DATA[3] is sorted.
(d) DATA[1]..DATA[4] is sorted.
(e) DATA[1]..DATA[5] is sorted.
Since there are five elements, the entire array has been sorted.
Earlier we discussed the concept of a loop invariant. Can you see what
the loop invariant is in this algorithm? Figure 10-2 shows the data structure,
the loop invariant, and the accompanying algorithm.
[Figure 10-2. DATA[I]..DATA[N] is unexamined.
(b) Put DATA[I] in its proper slot.
(c) Increment I by 1 and you are back at (a).]
"Put DATA[I] in its proper slot" must now be defined. There are three
possibilities to consider. Remember that the first I - 1 elements are
already sorted with respect to one another (but not necessarily to the rest of
the array).
1. DATA[I] > DATA[I - 1]. DATA[I] is greater than all the sorted
elements. In this case DATA[I] is in its proper place, and we don't need
to do anything.
2. DATA[I] = DATA[I - 1]. DATA[I] is equal to the largest sorted
element. Now there is a choice: We can interchange them or leave them
as they are. If we leave them as they are, we preserve the order in
which duplicate values occur in the original data. If we are talking
about integer values, the order of duplicates isn't important. However,
if we are actually sorting an array of records using an integer
value field to order them, we might want to preserve the original
order. A sort that preserves this original order is a stable sort. We will
choose to preserve the order here by combining this case with the one
above. That is, we will do nothing if DATA[I] is equal to
DATA[I - 1].
3. DATA[I] < DATA[I - 1]. DATA[I] is less than the largest sorted
element, so we swap them. Now DATA[I - 1] becomes the DATA[I] in
this analysis, and we are back to our original choices. Clearly this
implies an inner loop which will run from I down to the place where
situation (1) or (2) occurs or the loop counter reaches 1. Figure 10-3
shows a picture of the data structure during this processing.
J <- I
WHILE (J > 1) AND (DATA[J] < DATA[J - 1]) DO
swap DATA[J] and DATA[J - 1]
decrement J
Let's look at each of the four parts of the loop invariant in detail:
J >= 1 will be used to determine the value of J at the end of the loop.
"DATA[1]..DATA[J - 1] is sorted" says that the first J - 1 elements of
the array are sorted with respect to each other.
"DATA[J]..DATA[I] is sorted" says that the elements from index J to
index I are sorted with respect to each other.
The fourth part of the invariant ties the two halves of the array together.
It says that if J is greater than 1 but less than I, then DATA[J - 1] is
less than or equal to DATA[J + 1]. This will be used to show that
when a swap is made between DATA[J - 1] and DATA[J], the second
half is still sorted.
J >= 1 AND
DATA[1]..DATA[J - 1] is sorted AND
DATA[J]..DATA[I] is sorted AND
IF 1 < J < I, DATA[J - 1] <= DATA[J + 1]
Termination There are two cases here: either J is not greater than 1 or
DATA[J] is not less than DATA[J - 1]. Let's look at each case separately.
1. Our loop invariant says that J >= 1, and the conditional expression
(J > 1) is false. Therefore, J must be equal to 1. Substituting 1 for J in
the relevant parts of our loop invariant, we have
DATA[1]..DATA[0] is sorted.
DATA[1]..DATA[I] is sorted.
The first statement is true because there are no elements in this set.
The second statement is just what we wanted to prove.
2. The second terminating condition says that DATA[J] is not less than
DATA[J - 1]. Therefore we know the following facts:
DATA[1]..DATA[J - 1] is sorted.
DATA[J]..DATA[I] is sorted.
DATA[J - 1] <= DATA[J].
The first two lines state that the two halves are each sorted. The third
states that the largest in the first half is less than or equal to the smallest
in the second half. Hence, DATA[1]..DATA[I] is sorted.
Notice that in this discussion we say that the set of values from DATA[1]
to DATA[0] is empty. This is not like a FOR loop, in which you can go both
TO and DOWNTO. We are discussing values that are in an array from
position I + 1 up to and including position I, and this set of values is empty.
Therefore, anything general we want to say about all elements in that array
segment is true. For example, they are sorted, summed, examined, and
unexamined.
Preservation After the swap, we know that DATA[J - 1] is now less
than DATA[J]. If this is the first time through the loop, J is equal to I, and
DATA[J - 1]..DATA[I] is sorted. If it is not the first time through the loop,
we know that DATA[J] is less than DATA[J + 1] from our loop invariant.
Therefore DATA[J - 1]..DATA[I] is sorted. We also know that
DATA[J - 2] is less than DATA[J], because they were in a sorted segment
of the array. DATA[1] to DATA[J - 2] is still sorted because it hasn't
been changed.
Before we go on, let's summarize what it is we now know:
DATA[1]..DATA[J - 2] is sorted.
DATA[J - 1]..DATA[I] is sorted.
DATA[J - 2] <= DATA[J].
Applying the rule of assignment statements, we substitute J for J - 1:
DATA[1]..DATA[J - 1] is sorted.
DATA[J]..DATA[I] is sorted.
DATA[J - 1] <= DATA[J + 1].
Now let's code the whole algorithm as a procedure.
VAR I, J : INTEGER;
PLACEFOUND : BOOLEAN;
I := I + 1
END (* outer loop *)
END; (* inssort *)
nate. We add 1 to I each time through the loop without changing N;
therefore, I will eventually be incremented beyond N, and the loop condition
will become false.
In the inner loop, J is set to I, which is initialized to 2 and only increases.
Since J is decremented by 1 each time through the loop, J will eventually
reach 1, and the loop will terminate.
Preservation The body of the loop consists of two parts: the inner loop and
the statement that increments the loop counter I. We have already proved
the inner loop. Therefore we can use the terminating condition of the inner
loop as an assertion here. We know that DATA[1] to DATA[I] is sorted.
After I is incremented, we apply the assignment rule and get that DATA[1]
to DATA[I - 1] is sorted.
You will note that the code for the inner loop in the final procedure is
slightly different from the algorithm we proved. This is because Pascal
evaluates both sides of a conditional expression regardless of the outcome
of the first part. That is, even if the result of the first part gives the answer
directly, the second part is still evaluated. This means that even though J is
equal to 1, the second half of the condition, the expression DATA[J] <
DATA[J - 1], will be evaluated. Since trying to access DATA[0] will cause a
run-time error, this test must be made within the loop, where we know this
condition cannot occur, and the result returned through a Boolean flag.
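The flag technique just described can be sketched as a complete program. The names I, J, and PLACEFOUND come from the declarations shown above; the body of INSSORT, the temporary TEMP, the type name ARRAYTYPE, the array size, and the sample data are assumptions, so this is one plausible reading rather than the original listing.

```pascal
PROGRAM InsSortDemo;
(* Sketch only: the procedure body and the sample data are assumptions. *)
CONST
  N = 5;
TYPE
  ARRAYTYPE = ARRAY[1..N] OF INTEGER;
VAR
  SAMPLE : ARRAYTYPE;
  K : INTEGER;

PROCEDURE INSSORT (VAR DATA : ARRAYTYPE);
VAR
  I, J, TEMP : INTEGER;
  PLACEFOUND : BOOLEAN;
BEGIN
  I := 2;
  (* Outer invariant: DATA[1]..DATA[I - 1] is sorted. *)
  WHILE I <= N DO
    BEGIN
      J := I;
      PLACEFOUND := FALSE;
      WHILE (J > 1) AND NOT PLACEFOUND DO
        (* J > 1 here, so DATA[J - 1] is a legal reference. *)
        IF DATA[J] < DATA[J - 1]
          THEN
            BEGIN
              TEMP := DATA[J];
              DATA[J] := DATA[J - 1];
              DATA[J - 1] := TEMP;
              J := J - 1
            END
          ELSE PLACEFOUND := TRUE;
      I := I + 1
    END (* outer loop *)
END; (* inssort *)

BEGIN
  SAMPLE[1] := 36; SAMPLE[2] := 24; SAMPLE[3] := 10;
  SAMPLE[4] := 6;  SAMPLE[5] := 12;
  INSSORT(SAMPLE);
  FOR K := 1 TO N DO
    WRITE(SAMPLE[K], ' ');
  WRITELN
END.
```

The comparison DATA[J] < DATA[J - 1] appears only inside the loop body, which is entered only when J > 1, so DATA[0] is never referenced no matter how the compiler evaluates the AND.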
Exercises
{P} S {Q}
I := I + 2 {I = 6}
K := K - 3 {K = 6}
J := N + 2 {N = 1 AND J = 3}
J := J - 1 {DATA[I] = J + I}
2. Write the code to find the minimum of two numbers, X and Y, and verify the
code.
3. Write the code that implements the following loop invariant.
1 <= I <= N + 1 AND VALU = Π DATA[1]..DATA[I - 1]
{P} S {Q}
J := 2 * J {J = 18}
J := K DIV 2 {(J = 12) AND (K = 24)}
J := J + K - L {(J = 0) AND (K = 5) AND (L = 10)}
2. Prove the following segment of code by filling in the missing assertions on the
side.
{X >= 0 AND Y >= 0}
IF X > Y
THEN { }
Z := Y - X { }
ELSE { }
Z := X - Y { }
{X >= 0 AND Y >= 0 AND Z <= 0}
3. A loop invariant:
(a) is false just before the loop is executed the first time. T F
(b) may or may not be true after the execution of each statement within the
loop. T F
(c) is true at the top of the loop prior to each execution of the loop body.
T F
(d) is always true. T F
(e) is false after the loop execution is complete. T F
4. Prove the following segment of code by filling in the missing assertions on the
side.
{TRUE}
{(A - B) * (A + B)
BEGIN
X := A + B; { }
Y := A - B; { }
C := X * Y; { }
END;
{C = A^2 - B^2}
Chapter 11 Sorting Algorithms and Efficiency Considerations
At many points in this book, we have gone to great trouble to keep lists of
elements in sorted order: student records sorted by ID number, integers
sorted from smallest to largest, words sorted alphabetically. The goal of
keeping sorted lists, of course, is to facilitate searching them. Given an
appropriate data structure, a particular list element can be found faster if
the list is sorted.
Putting an unordered list of data items into order (sorting) is a very
common and useful operation. Whole books have been written about various
sorting algorithms, as well as algorithms for searching an ordered list to
find a particular element. The goal, of course, is to come up with better
sorts. Since the efficiency of a sort may in large part determine the
efficiency of the whole program, a good sorting routine is very desirable. This
is one area in which programmers are sometimes encouraged to sacrifice
clarity in favor of speed of execution. For this reason, we will use the
subject of sorting algorithms to illustrate the considerations involved in
measuring efficiency.
This chapter then will have a dual focus: (1) to introduce several sorting
algorithms and some terminology used to describe sorts, and (2) to talk
about efficiency considerations, using the comparison of sorting algorithms
as an example.
First, however, we must decide what we mean by efficiency.
WHAT IS "GOOD"?
In Chapter 7, we said that quicksort is a good sorting algorithm. What do we
mean by good? How can we compare two algorithms that do the same task?
To make such a comparison, we must first define a set of objective
measures that can be applied to each algorithm. The analysis of algorithms is an
important area of theoretical computer science; in advanced courses, you
will undoubtedly see extensive work in this area. We will look at a small
part of this topic, only enough to let us determine which of two algorithms
requires less work to accomplish a particular task.
How do we measure the work that two algorithms perform? The first
solution that comes to mind is simply to code the algorithms and then
compare the execution times for running the two programs. The one with the
shortest execution time is clearly the better algorithm. Or is it? We can
really only say that Program A is more efficient than Program B on this
computer. Execution times are specific to a particular computer. Of course,
we could test the algorithms on all possible computers, but we want a more
general measure.
A second possibility is to count the number of instructions or statements
executed. This measure, however, varies with the programming language
used, as well as with the style of the individual programmer. To standardize
this measure somewhat, we could count the number of passes through a
critical loop in the algorithm. If each iteration involves a constant amount of
work, this measure will give us a meaningful yardstick of efficiency.
These musings lead to the idea of isolating a particular operation
fundamental to the algorithm and counting the number of times that this
operation is performed. Suppose, for example, that we are searching an array for a
certain value. We might count how many comparisons between the target
value and the elements in the array are necessary to locate the value. If we
were summing the elements in an integer array, we could count the integer
addition operations needed. (Note that this count is a function of the
number of elements in the array. For an array of N elements, there will be N - 1
addition operations. So we can compare the algorithms for the general case,
not just for a specific array size.) If we want to compare algorithms for
multiplying two real matrices together, we can come up with a measure
that combines the real multiplication and addition operations required for
matrix multiplication.
This last example brings up an interesting consideration: Sometimes an
operation will so dominate the algorithm that the other operations fade into
the background "noise." If we want to buy elephants and goldfish, for
example, and we are considering two pet stores, we really only need to
compare the prices of elephants; the cost of the goldfish is trivial by
comparison. Similarly, real multiplication is so much more expensive than addition
in terms of computer time that the addition operation is a trivial factor in
the efficiency of the whole matrix multiplication algorithm; we might as
well count only the multiplication operations, ignoring the addition. In
analyzing algorithms, we can often find one operation that dominates the
algorithm, effectively relegating the others to the noise level.
From our examination of the insertion sort in Chapter 10, we can see that
the dominant operation in the loop is the comparison operation: which
element is bigger? In our study of other sorting algorithms, we will use the
number of comparisons as a measure of the efficiency of each algorithm.
Each sort takes the same input (an unordered array) and produces the
same output (an ordered array containing the same elements). We will
consider Sort A to be more "efficient" than Sort B if it can accomplish its task
with a smaller amount of work. That is, a more efficient sort, according to
our measure, will require fewer comparisons. (There are other ways to
measure the efficiency and cost of an algorithm, which will be discussed
later.)
As in the case of summing the elements in an array, we do not actually
have to count the number of critical operations on an array of a particular
size. The number of comparisons will be some function of the number of
elements (N) in the array. Therefore, we can express the number of
comparisons in terms of N (like N + 5 or N^2) rather than as an integer value
(like 52).
NOTE: In all the examples that follow, DATA is an ARRAY[1..N] OF
INTEGER, and all sorting is in order of increasing value.
swaps the value with itself, rather than checking for the special case.)
4. The smallest value in the unsorted part of the array
(DATA[4]..DATA[5]) is in position [5]; we swap DATA[4] and DATA[5].
5. DATA[5] now must contain the largest value in the array. The values
in DATA are now sorted. The algorithm for the selection sort is shown
in Figure 11-2.
[Figure: DATA[1]..DATA[I - 1] is sorted.]
Let's break down step (b) of the algorithm further. Our task is to find the
index of the minimum value in the unsorted section of the array. We will
need a loop that starts with I and compares the values in the subarray
DATA[I]..DATA[N], returning the index of the smallest value (MINDEX).
The loop control variable, J, representing the cursor moving through the
unordered part of the array, will begin at I + 1. In each iteration we will
ask: Is the value in DATA[J] less than the value in DATA[MINDEX]? If so,
we update MINDEX. We increment J and keep checking until we reach
the end of the array (when J > N). MINDEX is now the index of the
smallest value in the unordered part of the array. The algorithm is illustrated in
Figure 11-3.
DATA[J] is being examined (J is a counter going from I + 1 to N)
DATA[J + 1]..DATA[N] is unexamined
J <= N + 1
(b) IF DATA[J] < DATA[MINDEX] THEN MINDEX := J
(c) Increment J by 1 and you are back at (a).
Figure 11-3. The algorithm for finding the location of the minimum element.
You will note that each loop is expressed in terms of the relationship of
the variables within the loop. What is such a description called? That's
right: a loop invariant. We will define these loop invariants in words, and
leave the proofs of them as an exercise. Try them yourself, using the
technique from Chapter 10.
The invariant for the algorithm in Figure 11-2 is "The elements from
DATA[1] up to and including DATA[I - 1] are sorted, and all the values in
DATA[I] up to and including DATA[N] are greater than or equal to any of
the values in DATA[1]..DATA[I - 1]."
The invariant for the inner loop (Figure 11-3) is "DATA[MINDEX] is
less than or equal to any values from DATA[I] up to and including
DATA[J - 1]."
The code for Procedure SELECTSORT is given below.
Note that the loops have been written as WHILE, rather than FOR,
loops. FOR loops would shorten the code, but they leave the loop control
variable undefined at the end of the loop. Because of this fact, we could not
prove that the loop invariants hold. Therefore, the algorithm is coded using
the more general WHILE loop. Of course, in practice either construct will
work.
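A minimal sketch of a straight selection sort written with WHILE loops follows. It uses the names I, J, and MINDEX from the discussion above; the temporary TEMP, the type name ARRAYTYPE, the array size, and the sample data are assumptions, and the details may differ from the original procedure.

```pascal
PROGRAM SelectSortDemo;
(* Sketch only: one way to code the algorithm of Figures 11-2 and 11-3. *)
CONST
  N = 5;
TYPE
  ARRAYTYPE = ARRAY[1..N] OF INTEGER;
VAR
  SAMPLE : ARRAYTYPE;
  K : INTEGER;

PROCEDURE SELECTSORT (VAR DATA : ARRAYTYPE);
VAR
  I, J, MINDEX, TEMP : INTEGER;
BEGIN
  I := 1;
  (* Invariant: DATA[1]..DATA[I - 1] is sorted, and every value in *)
  (* DATA[I]..DATA[N] is >= every value in DATA[1]..DATA[I - 1].   *)
  WHILE I < N DO
    BEGIN
      MINDEX := I;
      J := I + 1;
      (* Invariant: DATA[MINDEX] <= every value in DATA[I]..DATA[J - 1]. *)
      WHILE J <= N DO
        BEGIN
          IF DATA[J] < DATA[MINDEX]
            THEN MINDEX := J;
          J := J + 1
        END;
      (* Swap the minimum into position I (possibly with itself). *)
      TEMP := DATA[I];
      DATA[I] := DATA[MINDEX];
      DATA[MINDEX] := TEMP;
      I := I + 1
    END
END; (* selectsort *)

BEGIN
  SAMPLE[1] := 36; SAMPLE[2] := 24; SAMPLE[3] := 10;
  SAMPLE[4] := 6;  SAMPLE[5] := 12;
  SELECTSORT(SAMPLE);
  FOR K := 1 TO N DO
    WRITE(SAMPLE[K], ' ');
  WRITELN
END.
```

At the exit of each WHILE loop the control variable has a known value (J = N + 1, I = N), which is what allows the invariants in the comments to be combined with the negated loop conditions.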
Now let's get back to the business of measuring the amount of work
required by this algorithm in terms of the number of comparisons made.
There are two loops, one nested within the other, and the comparison is in
the inner loop. The first time through the inner loop, there are N - 1
comparisons, the next time N - 2 comparisons, and so on until there is 1
comparison in the last iteration. This totals up to
(N - 1) + (N - 2) + (N - 3) + ... + 1 = N(N - 1)/2
To accomplish our goal of sorting an array of N elements, the straight
selection sort requires N(N - 1)/2 comparisons. For an array of 10
elements, for instance, 45 comparisons are needed. Note that doubling the
array size more than quadruples the number of comparisons. Also note that
the particular arrangement of values in the array does not affect the amount
of work done at all. Even if the array is in sorted order before the call to
SELECTSORT, the procedure will still make N(N - 1)/2 comparisons.
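The arithmetic behind these figures can be checked directly. The worked equation below restates the sum and evaluates it for N = 10 and for the doubled size N = 20; the second value is computed here for illustration and does not appear in the text.

```latex
\[
\sum_{k=1}^{N-1} k \;=\; (N-1) + (N-2) + \cdots + 1 \;=\; \frac{N(N-1)}{2}
\]
\[
N = 10:\ \frac{10 \cdot 9}{2} = 45, \qquad
N = 20:\ \frac{20 \cdot 19}{2} = 190, \qquad
\frac{190}{45} \approx 4.2
\]
```

Doubling N from 10 to 20 multiplies the number of comparisons by a little more than 4, as claimed.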
BUBBLE SORT
The identifying feature of a selection sort is that, on each pass through the
loop, one element is put in its proper place. In the straight selection sort,
each iteration found the smallest unsorted element and put it in its correct
place. If we had made the inner loop find the largest value, instead of the
smallest, the algorithm would have sorted in descending order. We could
also have made the loop go down from N to 1, putting the elements in the
bottom of the array first. All these are variations on the straight selection
sort. The variations do not change the basic way that the minimum (or
maximum) element is found.
The bubble sort is a selection sort that uses a different scheme for find-
ing the minimum (or maximum) value. Each iteration puts the smallest
unsorted element in its correct place, but it also makes changes in the
location of the other elements in the array. The first iteration will put the
smallest element in the array in the first position. We start with the element
in the Nth position, and compare successive pairs of elements, swapping
whenever the bottom element of the pair is smaller than the one above it.
In this way, the smallest element "bubbles" up to the top of the array. The
next iteration puts the smallest element in the unsorted part of the array
into the second position, using the same technique. As we walk through the
example in Figure 11-4, note that in addition to putting one element in its
proper place, each iteration causes some intermediate changes in the array.
The first iteration puts 6 in DATA[1]. Unlike the straight selection sort,
which would make only the single swap between 6 and 36, the bubble sort
causes two additional intermediate swaps. It first compares the values in
DATA[5] and DATA[4]. Since 12 >= 6, no swap occurs. Then DATA[4]
and DATA[3] are compared; 6 < 10, so the two are swapped. The new
value in DATA[3], which is 6, is compared to DATA[2]; 6 < 24, so they are
swapped. Finally, the new value in DATA[2] (6 again) is compared to
DATA[1]; 6 < 36, so a swap occurs. Now, the smallest value (6) is in the top
position in the array. The second iteration will put the next smallest value
(10) in the second position, causing one additional swap, and so on. Note
that each iteration stops comparing values at the Ith position; the first I - 1
values are already sorted, and all the elements in the unsorted part of the
array are greater than or equal to the sorted elements. A snapshot picture of
this algorithm is shown in Figure 11-5.
DATA   (a) Initial   (b) First     (c) Second    (d) Third     (e) Fourth
                     iteration     iteration     iteration     iteration
[1]    36            6             6             6             6
[2]    24            36            10            10            10
[3]    10            24            36            12            12
[4]    6             10            24            36            24
[5]    12            12            12            24            36
Figure 11-4. Example of the bubble sort.
DATA[I]..DATA[J - 1] is unexamined
DATA[J] is less than or equal to all the elements in DATA[J + 1]..DATA[N]
I <= J <= N + 1
(b) IF DATA[J] < DATA[J - 1] THEN swap them
(c) Decrement J by 1 and you are back at (a).
(d) When J = I, the unexamined portion is empty.
Figure 11-5. The algorithm for the bubble sort.
VAR I, J : INTEGER;
BEGIN (* bubble1 *)
(* Loop through the whole array. *)
I := 1;
WHILE I < N DO
BEGIN
(* Bubble up the smallest unsorted value. *)
J := N;
WHILE J > I DO
BEGIN
(* If the bottom value is smaller than *)
(* its predecessor, swap them. *)
IF DATA[J] < DATA[J - 1]
THEN SWAP(DATA[J], DATA[J - 1]);
J := J - 1
END; (* while J > I *)
I := I + 1
END (* while I < N *)
I, J : INTEGER;
SWAPPED : BOOLEAN;
(* Loop through array; stop when sorted. The array is sorted when there are no *)
(* values swapped in the inner loop. *)
WHILE (I < N) AND SWAPPED DO
BEGIN
(* Initialize. *)
J := N;
SWAPPED := FALSE;
J := J - 1
END; (* while J > I *)
I := I + 1
END (* while I < N *)
END; (* bubble2 *)
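The fragments above suggest how the pieces of BUBBLE2 fit together. The following complete program is a sketch under stated assumptions: the initialization of I and SWAPPED, the inline swap through TEMP, the type name ARRAYTYPE, and the sample data are guesses that match the surviving lines, not a transcription of the original procedure.

```pascal
PROGRAM Bubble2Demo;
(* Sketch only: a bubble sort that stops as soon as a pass makes no swap. *)
CONST
  N = 5;
TYPE
  ARRAYTYPE = ARRAY[1..N] OF INTEGER;
VAR
  SAMPLE : ARRAYTYPE;
  K : INTEGER;

PROCEDURE BUBBLE2 (VAR DATA : ARRAYTYPE);
VAR
  I, J, TEMP : INTEGER;
  SWAPPED : BOOLEAN;
BEGIN
  I := 1;
  SWAPPED := TRUE;   (* assumed: forces at least one pass *)
  WHILE (I < N) AND SWAPPED DO
    BEGIN
      (* Initialize. *)
      J := N;
      SWAPPED := FALSE;
      WHILE J > I DO
        BEGIN
          IF DATA[J] < DATA[J - 1]
            THEN
              BEGIN
                TEMP := DATA[J];
                DATA[J] := DATA[J - 1];
                DATA[J - 1] := TEMP;
                SWAPPED := TRUE
              END;
          J := J - 1
        END; (* while J > I *)
      I := I + 1
    END (* while I < N *)
END; (* bubble2 *)

BEGIN
  SAMPLE[1] := 36; SAMPLE[2] := 24; SAMPLE[3] := 10;
  SAMPLE[4] := 6;  SAMPLE[5] := 12;
  BUBBLE2(SAMPLE);
  FOR K := 1 TO N DO
    WRITE(SAMPLE[K], ' ');
  WRITELN
END.
```

If the array is already sorted, the first pass makes N - 1 comparisons and no swaps, SWAPPED stays FALSE, and the outer loop ends after that single pass.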
What if the original array were actually sorted in descending order before
the call to BUBBLE2? This is the worst possible case. In this case,
BUBBLE2 requires as many comparisons as BUBBLE1 and SELECTSORT.
Note that we are still defining work as the number of comparisons required.
This measure does not include such overhead as setting the Boolean flag in
each iteration of BUBBLE2, or all the extra swaps in both the bubble sorts.
So, is BUBBLE2 more efficient than BUBBLE1 or SELECTSORT?
It depends on the data. That is, the amount of work needed will vary,
depending on the order of the original data. Can we calculate an average
case? Note that the number of comparisons in iteration i is N - i. Let K
indicate the number of iterations executed before BUBBLE2 finishes its
work. The total number of comparisons required is
(N - 1) + (N - 2) + (N - 3) + ... + (N - K)
A little algebra changes this to
(2KN - K - K^2)/2
Since K is not greater than N, (2KN - K - K^2)/2 is less than or equal to
N(N - 1)/2. Therefore, BUBBLE2 is better than either BUBBLE1 or
SELECTSORT, right? Well, maybe. Remember that the overhead in
BUBBLE2 is greater. So even though we have a quantitative measure of how
much work is done, we are still having trouble actually comparing the
various algorithms. We have hedged by saying that efficiency is data
dependent. Can't we do better than that?
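For readers who want the "little algebra" spelled out, the steps can be written as follows. The last line substitutes K = N - 1, the largest number of passes the outer loop can make, to show that the worst case matches N(N - 1)/2.

```latex
\begin{align*}
\sum_{i=1}^{K} (N - i) &= KN - \sum_{i=1}^{K} i \\
  &= KN - \frac{K(K+1)}{2} \\
  &= \frac{2KN - K - K^2}{2} \\[6pt]
K = N - 1:\quad
\frac{2(N-1)N - (N-1) - (N-1)^2}{2}
  &= \frac{(N-1)\bigl(2N - 1 - (N-1)\bigr)}{2}
   = \frac{N(N-1)}{2}
\end{align*}
```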
THE BIG-O
N      log2 N   N log2 N   N^2      N^3        2^N
1      0        0          1        1          2
2      1        2          4        8          4
4      2        8          16       64         16
8      3        24         64       512        256
16     4        64         256      4096       65536
32     5        160        1024     32768      4294967296
64     6        384        4096     262144     About 5 years' worth of instructions
                                               on a supercomputer
128    7        896        16384    2097152    About 600,000 times greater than the
                                               age of the universe in nanoseconds
                                               (for a 6-billion-year estimate)
256    8        2048       65536    16777216   Don't ask
Figure 11-6.
As you can see in Figure 11-6, some of the computing times increase
very dramatically in relation to the size of N. In particular, note that
N log2 N grows much more slowly than N^2. (It is also interesting to note
that the values in the last column grow so quickly that the computation
time required for problems of this order may exceed the estimated life span
of the universe.)
that of BUBBLE2. In the best case, when the initial array is already sorted,
only one comparison is made on each pass, so the sort is O(N). In the worst
case (the array is sorted in reverse order), the number of comparisons is the
same as in the straight selection sort and BUBBLE1, namely, N(N - 1)/2, or
O(N^2). On the average, the insertion sort is O(N^2). It is better than the
straight selection sort, however, if the original file approaches sorted order.
Since the straight selection sort, both bubble sorts, and the insertion sort
are all on the order of N^2, for large values of N there is no difference
between their performances. BUBBLE2 may require somewhat fewer
comparisons, but the difference, on the average, is not significant.
ANALYZING QUICKSORT
Now that we have a measure (number of comparisons) and a way of saying
"much more efficient," let's analyze Quicksort.
On the first call, every element in the array is compared to the dividing
value, so the work done is O(N). The array is divided into two pieces,
which are then examined.
Each of these segments is then divided, making four pieces. This analysis
is illustrated in Figure 11-7. At each level, the number of pieces doubles.
At what level have we finished dividing the elements? If we split
each segment into approximately one half each time, it will take log2 N
splits. At each split, we make O(N) comparisons.
So quicksort is O(N log2 N) on the average, which is quicker than O(N^2).
2    1    O(N)
4    2    O(N)
8    3    O(N)
N         O(N)
Figure 11-7. Analysis of quicksort.
When is quicksort not quick? Consider a quicksort routine whose splitting
algorithm uses the first element of the array (or the segment of the array
under consideration) as a splitting value. What would happen if the array
were already sorted? The splits would be very lopsided, and the
subsequent calls to quicksort would sort into a segment of one element and a
segment containing all the rest of the array (or segment of the array).
Clearly, this situation would produce a sort that is not at all quick. In fact, in
this case quicksort would be O(N^2). The possibility of this situation
occurring by chance is very unlikely. By analogy, consider the odds of shuffling a
deck of cards and coming up with an ordered deck. On the other hand, in
some applications you may know that the original array is likely to be
sorted or nearly sorted. In such cases, you would want to use either a
different sort or a different splitting algorithm for quicksort.
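The lopsided behavior can be observed by counting comparisons. The program below is a sketch, not the quicksort from Chapter 7: the splitting scheme (first element as splitting value, one left-to-right scan), the counter COUNT, the array size, and both test inputs are assumptions chosen for illustration.

```pascal
PROGRAM QuickCount;
(* Sketch only: counts comparisons made by a quicksort that always *)
(* uses the first element of the segment as the splitting value.   *)
CONST
  N = 100;
VAR
  DATA : ARRAY[1..N] OF INTEGER;
  COUNT, K : INTEGER;

PROCEDURE QUICKSORT (FIRST, LAST : INTEGER);
VAR
  SPLITVAL, SPLITPOINT, J, TEMP : INTEGER;
BEGIN
  IF FIRST < LAST
    THEN
      BEGIN
        SPLITVAL := DATA[FIRST];
        SPLITPOINT := FIRST;
        FOR J := FIRST + 1 TO LAST DO
          BEGIN
            COUNT := COUNT + 1;   (* one comparison with the splitting value *)
            IF DATA[J] < SPLITVAL
              THEN
                BEGIN
                  SPLITPOINT := SPLITPOINT + 1;
                  TEMP := DATA[SPLITPOINT];
                  DATA[SPLITPOINT] := DATA[J];
                  DATA[J] := TEMP
                END
          END;
        (* Put the splitting value between the two segments. *)
        TEMP := DATA[FIRST];
        DATA[FIRST] := DATA[SPLITPOINT];
        DATA[SPLITPOINT] := TEMP;
        QUICKSORT(FIRST, SPLITPOINT - 1);
        QUICKSORT(SPLITPOINT + 1, LAST)
      END
END; (* quicksort *)

BEGIN
  (* Case 1: the array is already sorted. *)
  FOR K := 1 TO N DO
    DATA[K] := K;
  COUNT := 0;
  QUICKSORT(1, N);
  WRITELN('Sorted input:    ', COUNT, ' comparisons');

  (* Case 2: a scrambled arrangement of the values 1..N. *)
  FOR K := 1 TO N DO
    DATA[K] := (K * 37) MOD 101;
  COUNT := 0;
  QUICKSORT(1, N);
  WRITELN('Scrambled input: ', COUNT, ' comparisons')
END.
```

On the sorted input every split leaves an empty segment and a segment with all the remaining elements, so the count is 99 + 98 + ... + 1 = 4950, which is N(N - 1)/2 for N = 100. The second line prints the count for a scrambled arrangement so the two cases can be compared.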
ANALYZING HEAPSORT
It is hard to believe, from our small-sized example in Chapter 9, that
heapsort is really very efficient. It seems odd to move the largest value all the
way to the top before putting it in its place at the bottom. And, in fact, for
small values of N, heapsort is not very efficient.
For large arrays, however, heapsort can be very efficient indeed.
Consider that a complete binary tree with N nodes has log2 (N + 1) levels. Even
if each element were a leaf and had to pass through the entire tree, the sort
would still be O(N log2 N). So heapsort, unlike quicksort, is O(N log2 N)
regardless of the initial order of its elements.
SWAP(X, Y)
TEMP := X;
X := Y;
Y := TEMP
Programmer Time
Why then would you ever decide to use a recursive version of the sort? The
decision involves a choice between types of efficiency desired. Up until
now, we have only been concerned with minimizing computer time. How-
ever, while computers are becoming faster and cheaper, it is not at all clear
that computer programmers are following that trend. Therefore, in some
situations programmer time may be an important consideration in choosing
a sort algorithm and its implementation. In this respect, the recursive ver-
sion of quicksort is more desirable than its nonrecursive counterpart, which
requires the programmer to simulate the recursion explicitly.
Space Considerations
Another efficiency consideration is the amount of memory space required.
In general, this is not a very important factor in choosing a sorting
algorithm, since the space needed is usually closer to O(N) than to O(N^2). The
usual time versus space tradeoff applies to sorts: more space means less
time, and vice versa.
Since processing time is the factor that applies most often to sorting
algorithms, we have considered it in detail here. Of course, as in any appli-
cation, the programmer must determine goals and requirements before se-
lecting an algorithm and starting to code.
MORE ABOUT SORTING IN GENERAL
Keys
In our descriptions of the various sorts, we showed examples of sorting
arrays of integers. In reality we are more likely to be sorting arrays of
records that contain several fields of information. A sort key is the field in such
a record whose value is used to order the records. Each record must contain
some unique identifying key, such as an IDNUMBER field. In addition, a
record may contain secondary keys, which may or may not be unique. For
instance, a student record may contain the following fields:
If the data contain only single integers, it doesn't matter whether the
original order of duplicate values is kept. As we will see, preserving the
original order of records with identical key values may be desirable. If a
sort preserves this order, it is said to be stable.
Consider the following declarations for personnel records:
ADDRESSTYPE = RECORD
  NUMBER : INTEGER;
  STREET,
  CITY,
  STATE : STRING10;
  ZIP : INTEGER
END; (* record *)
PERSONTYPE = RECORD
  NAME : STRING20;
  ADDRESS : ADDRESSTYPE;
END; (* record *)
EMPLOYEES[I].NAME
EMPLOYEES[I].ADDRESS.CITY
EMPLOYEES[I].ADDRESS.ZIP
If the sort is stable, we can get a listing by zip code, with the names in
alphabetical order within each zip code, by sorting twice: the first time by
name and the second time by zip code. A stable sort preserves the order of
the records when there is a match on the key. The second sort, by zip code,
will produce many such matches, but the alphabetical order imposed by
the first sort will be preserved.
To get a listing by city, with the zip codes in order within each city and
the names alphabetically ordered within each zip code, we would sort
three times, on the following keys:
EMPLOYEES[I].NAME
EMPLOYEES[I].ADDRESS.ZIP
EMPLOYEES[I].ADDRESS.CITY
The file would first be put into alphabetical order by name. The output
from the first sort would be input to a sort on zip code. The output from this
sort would be input to a sort on city name. If the sorting algorithms used
were stable, the final sort would give us what we are looking for.
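The two-pass idea can be tried out with a short program. What follows is only a sketch under stated assumptions: it is written for a Free Pascal style compiler, it uses an insertion sort as the stable sort (the text does not prescribe one), and the simplified record, the procedure names SETEMP, GREATER, and STABLESORT, and the sample data are all hypothetical.

```pascal
PROGRAM STABLEDEMO;
(* Sketch only: simplified records and sample data are hypothetical. *)
CONST
  MAXEMP = 5;
TYPE
  PERSONTYPE = RECORD
    NAME : STRING[20];
    ZIP  : LONGINT
  END; (* record *)
  EMPLIST = ARRAY[1..MAXEMP] OF PERSONTYPE;
VAR
  EMPLOYEES : EMPLIST;
  I : INTEGER;

PROCEDURE SETEMP(I : INTEGER; N : STRING; Z : LONGINT);
BEGIN
  EMPLOYEES[I].NAME := N;
  EMPLOYEES[I].ZIP := Z
END;

(* Returns TRUE if A must come after B under the chosen key. *)
FUNCTION GREATER(VAR A, B : PERSONTYPE; BYZIP : BOOLEAN) : BOOLEAN;
BEGIN
  IF BYZIP
    THEN GREATER := A.ZIP > B.ZIP
    ELSE GREATER := A.NAME > B.NAME
END;

(* Insertion sort. It is stable because an element moves left only *)
(* past elements whose key is strictly greater than its own.       *)
PROCEDURE STABLESORT(VAR LIST : EMPLIST; BYZIP : BOOLEAN);
VAR
  I, J : INTEGER;
  TEMP : PERSONTYPE;
  MOVING : BOOLEAN;
BEGIN
  FOR I := 2 TO MAXEMP DO
    BEGIN
      TEMP := LIST[I];
      J := I - 1;
      MOVING := TRUE;
      WHILE (J >= 1) AND MOVING DO
        IF GREATER(LIST[J], TEMP, BYZIP)
          THEN
            BEGIN
              LIST[J + 1] := LIST[J];
              J := J - 1
            END
          ELSE MOVING := FALSE;
      LIST[J + 1] := TEMP
    END
END;

BEGIN
  SETEMP(1, 'SMITH', 78759);
  SETEMP(2, 'ADAMS', 78705);
  SETEMP(3, 'JONES', 78759);
  SETEMP(4, 'BAKER', 78759);
  SETEMP(5, 'CLARK', 78705);
  STABLESORT(EMPLOYEES, FALSE);  (* first pass: by name *)
  STABLESORT(EMPLOYEES, TRUE);   (* second pass: by zip code *)
  FOR I := 1 TO MAXEMP DO
    WRITELN(EMPLOYEES[I].ZIP, ' ', EMPLOYEES[I].NAME)
END.
```

With these data the listing comes out as ADAMS and CLARK under 78705, followed by BAKER, JONES, and SMITH under 78759. If the comparison in GREATER used >= instead of >, equal keys would be moved past one another and the alphabetical order within each zip code could be lost.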
Sorting Pointers
Sorting large records using some kind of exchange sort may require much
computer time just to move sections of memory from one place to another
every time we make a swap. This move time can be reduced by setting up
an array of pointers to the records and then sorting the pointers instead of
the actual records. This scheme is illustrated in Figure 11-8.
Figure 11-9.
Note that, after the sort, the records are still in the same physical arrange-
ment, but they may be accessed in order through the sorted array of point-
ers.
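A small program can make the scheme concrete. This is a sketch only, assuming a Free Pascal style compiler; the record layout, the array names DATA and NAMEORD, and the sample values are hypothetical, and array indexes play the role of the pointers.

```pascal
PROGRAM POINTERSORT;
(* Sketch only: the record layout and data are hypothetical. *)
CONST
  MAXREC = 4;
TYPE
  RECTYPE = RECORD
    IDNUM : INTEGER;
    NAME  : STRING[20]
  END; (* record *)
VAR
  DATA : ARRAY[1..MAXREC] OF RECTYPE;     (* records never move *)
  NAMEORD : ARRAY[1..MAXREC] OF INTEGER;  (* indexes into DATA *)
  I, J, TEMP : INTEGER;
BEGIN
  DATA[1].IDNUM := 101;  DATA[1].NAME := 'WILSON';
  DATA[2].IDNUM := 102;  DATA[2].NAME := 'BROWN';
  DATA[3].IDNUM := 103;  DATA[3].NAME := 'MILLER';
  DATA[4].IDNUM := 104;  DATA[4].NAME := 'DAVIS';
  (* Start with the pointers in physical order. *)
  FOR I := 1 TO MAXREC DO
    NAMEORD[I] := I;
  (* Simple bubble sort that compares records but swaps only pointers. *)
  FOR I := 1 TO MAXREC - 1 DO
    FOR J := 1 TO MAXREC - I DO
      IF DATA[NAMEORD[J]].NAME > DATA[NAMEORD[J + 1]].NAME
        THEN
          BEGIN
            TEMP := NAMEORD[J];
            NAMEORD[J] := NAMEORD[J + 1];
            NAMEORD[J + 1] := TEMP
          END;
  (* Access the records in name order through the pointer array. *)
  FOR I := 1 TO MAXREC DO
    WRITELN(DATA[NAMEORD[I]].NAME, ' ', DATA[NAMEORD[I]].IDNUM)
END.
```

Each swap moves a single integer rather than a whole record, and DATA itself stays in IDNUM order throughout.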
This scheme may also be extended to allow us to keep a large array of
data sorted on more than one key. For instance, with the declarations
the data in Figure 11-9 are physically stored according to the primary key,
IDNUM. The arrays NAMEORD and SALORD contain pointers (indexes)
to the records in the large array, EMPLOYEEDATA. In this way, we can
keep the array ordered with respect to the secondary keys, NAME and
SALARY, as well.
We have not attempted in this chapter to give every known sorting algo-
rithm. We have presented a few of the popular sorts, for which many varia-
tions exist. It should be clear from this discussion that no single sort is best
for all applications. The simpler, generally O(N^2) sorts work as well, and
sometimes better, for fairly small files. Since they are simple, they require
relatively little programmer time to write and maintain. As you add features
to improve these sorts, you also add to the complexity of the algorithms,
increasing both the work required by the routines and the programmer
time needed to maintain them.
Another consideration in choosing a sort algorithm is the order of the
original data. If the data are already ordered (or almost ordered), a simple
insertion sort or BUBBLE2 will only be O(N), while some versions of
quicksort will be O(N^2).
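The effect of the original order on an insertion sort can be seen by counting key comparisons. The program below is a sketch, assuming a Free Pascal style compiler; INSSORTCOUNT is a hypothetical routine written for this illustration and is not the INSSORT procedure of this chapter.

```pascal
PROGRAM ORDERDEMO;
(* Sketch only: counts key comparisons made by a plain insertion sort. *)
CONST
  N = 8;
TYPE
  LISTTYPE = ARRAY[1..N] OF INTEGER;
VAR
  LIST : LISTTYPE;
  I : INTEGER;

FUNCTION INSSORTCOUNT(VAR LIST : LISTTYPE) : INTEGER;
VAR
  I, J, TEMP, COUNT : INTEGER;
  MOVING : BOOLEAN;
BEGIN
  COUNT := 0;
  FOR I := 2 TO N DO
    BEGIN
      TEMP := LIST[I];
      J := I - 1;
      MOVING := TRUE;
      WHILE (J >= 1) AND MOVING DO
        BEGIN
          COUNT := COUNT + 1;           (* one key comparison *)
          IF LIST[J] > TEMP
            THEN
              BEGIN
                LIST[J + 1] := LIST[J];
                J := J - 1
              END
            ELSE MOVING := FALSE
        END;
      LIST[J + 1] := TEMP
    END;
  INSSORTCOUNT := COUNT
END;

BEGIN
  FOR I := 1 TO N DO
    LIST[I] := I;                       (* already in order *)
  WRITELN('Ordered input:  ', INSSORTCOUNT(LIST), ' comparisons');
  FOR I := 1 TO N DO
    LIST[I] := N - I + 1;               (* reverse order *)
  WRITELN('Reversed input: ', INSSORTCOUNT(LIST), ' comparisons')
END.
```

For N = 8 the ordered input needs N - 1 = 7 comparisons, while the reversed input needs N(N - 1)/2 = 28.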
As always, the first step in choosing an algorithm is to determine the
goals of the particular application. This step will usually narrow down the
options considerably. After that, knowledge of the strong and weak points
of the various algorithms will assist you in making a choice.
Exercises
1. Write the proofs of the loop invariants given for the loops in SELECTSORT
(shown in Figures 11-2 and 11-3), using the technique from Chapter 10.
2. Which sorts are O(N^2)? Which sorts are O(N log2N)?
3. What conditions can make an O(N^2) sort run faster than an O(N log2N) sort?
4. Given the array, DATA, containing the elements
INSSORT
SELECTSORT
BUBBLEl
BUBBLE2
QUICKSORT*
HEAPSORT
8. Big State U. needs a listing of the overall SAT percentiles of all the students it
has accepted in the past year (13,438 students). The data are in a file that con-
tains the student ID number, SAT overall percentile, math score, English score,
and high school GPA, one student to a line. There is at least one blank between
each value. The required output is a listing of all the percentile scores, one per
line, ordered from highest to lowest. (Duplicates should be printed.) Write a
procedure to produce the listing. The procedure should be O(N).
9. Procedure A does a particular task in a "time" of N^3 + 100, where N is the
number of elements processed. Procedure B does the same task in a time of
3N^2 + N + 500. What are the Big-O requirements of each task? Which task is
more efficient according to its Big-O notation? When does the less efficient
procedure (by Big-O standards) execute faster than the more efficient one?
10. Give arguments for and against using procedures to encapsulate frequently
used code (like SWAP) within a sorting routine.
11. What is meant by programmer time as an efficiency consideration?
1. Show what the array would look like after the Ith iteration using the given sort.
The Ith iteration means that the Ith element is in its proper place.
(a)
(after iteration 2)
(b)
(after iteration 4)
(c)
(after iteration 1)
3. Fill in the following table by putting a B (best case) or a W (worst case) in the
appropriate column if that original order describes either the best case or the
worst case for that sort. For example, if the best case for a selection sort is when
the input data are in ascending order, put a B under the ascending column
beside selection. If the order makes no difference, put DM (doesn't matter).
Ascending Descending Random
SELECTION
INSERTION
SIMPLE BUBBLE
QUICK
4. You are asked to print a list of the 100 best students in alphabetical order. Since
the file of students contains 49,000 records (ordered by Social Security num-
ber), it is not feasible to read them all in and sort them. You must devise another
strategy. The records are described as follows:
TYPE
  STURECORD = RECORD
    LAST,
    FIRST : PACKED ARRAY[1..10] OF CHAR;
    SSNUM : INTEGER;
    GPA : REAL
  END
You may use the following routines without writing the code for them. Just give
the formal and actual parameter lists.
Procedure READDATA, which reads in one student record.
Procedure SORT, which sorts a list, as specified by the parameter list. (If you
want this to be a specific sort algorithm, just state which one in comments.)
Chapter 12 Searching
SEQUENTIAL SEARCHING
The simplest search technique is the sequential search. You begin at the
head of the list and search for the desired record by examining each subse-
quent record, until either the search is successful or the list is exhausted.
This technique is appropriate for both sequential lists and linked lists. The
list does not have to be ordered, although the efficiency of the search may
be improved if the list is ordered.
The general algorithm for finding the record containing the key value,
KEYVAL, is
We will code this algorithm for searching an array of records that contain
a field called KEY. The records in the array are not necessarily ordered
with respect to the KEY field.
make N/2 comparisons; that is, on the average we will have to search half
the list.
High-Probability Ordering
The assumption of equal probability for every record in the list is not al-
ways valid. Sometimes certain list elements are in much greater demand
than others. This observation suggests a way to improve the search: Put the
most often desired elements at the beginning of the list. For instance, given
a command-driven program, you can order the elements in the command
table according to the frequency of their use. Using this scheme, you are
more likely to make a hit in the first few tries, and rarely will you have to
search the whole table.
If the elements in the list are not static as in a command table, or if you
cannot predict their relative demand, you need some scheme to keep the
most frequently used records at the front of the list. One way to accomplish
this goal is to move each record accessed to the front of the list. Of course,
there is no guarantee that this record will later be frequently used. How-
ever, if the record is not retrieved again, it will drift toward the end of the
list, as other records are moved to the front. This scheme is easy to imple-
ment for linked lists, requiring only a couple of pointer changes, but it is
less desirable for lists kept sequentially in arrays.
A second algorithm, which causes records to move toward the front of
the list gradually, is appropriate for either linked or sequential-array list
representations. As each element is accessed, it is swapped with the ele-
ment that precedes it. Over many list retrievals, the most frequently de-
sired elements will tend to be grouped at the front of the list.
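The second algorithm is easy to try on an array. The following is a sketch, assuming a Free Pascal style compiler; the procedure name FINDANDADVANCE and the sample data are hypothetical, and plain integers stand in for records.

```pascal
PROGRAM TRANSPOSEDEMO;
(* Sketch only: names and data are hypothetical. *)
CONST
  NUMELEMENTS = 5;
TYPE
  LISTTYPE = ARRAY[1..NUMELEMENTS] OF INTEGER;
VAR
  LIST : LISTTYPE;
  I, LOCATION : INTEGER;

(* Sequential search; a record that is found is swapped with the *)
(* element that precedes it. LOCATION is 0 if KEYVAL is absent;  *)
(* otherwise it is the element's position after the swap.        *)
PROCEDURE FINDANDADVANCE(VAR LIST : LISTTYPE; KEYVAL : INTEGER;
                         VAR LOCATION : INTEGER);
VAR
  I, TEMP : INTEGER;
BEGIN
  LOCATION := 0;
  I := 1;
  WHILE (I <= NUMELEMENTS) AND (LOCATION = 0) DO
    IF LIST[I] = KEYVAL
      THEN LOCATION := I
      ELSE I := I + 1;
  IF LOCATION > 1
    THEN
      BEGIN
        TEMP := LIST[LOCATION - 1];
        LIST[LOCATION - 1] := LIST[LOCATION];
        LIST[LOCATION] := TEMP;
        LOCATION := LOCATION - 1
      END
END;

BEGIN
  FOR I := 1 TO NUMELEMENTS DO
    LIST[I] := I * 10;                  (* 10 20 30 40 50 *)
  (* Ask for 40 three times; it moves one place forward each time. *)
  FOR I := 1 TO 3 DO
    BEGIN
      FINDANDADVANCE(LIST, 40, LOCATION);
      WRITELN('40 is now at position ', LOCATION)
    END
END.
```

The three retrievals report positions 3, 2, and 1 in turn, so a record that is requested often works its way to the front while a single stray request disturbs the list by only one place.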
Keeping the most active records at the front of the list will not affect the
worst case; it will still take N comparisons. However, the average perfor-
mance should be better. Both of these algorithms depend on the assump-
tion that some elements in the list are used much more often than others. If
this assumption is not applicable, a different ordering strategy is needed to
improve the efficiency of the search technique.
KEY ORDERING
Ordering a list according to the key value, as discussed in Chapter 11, can
greatly improve the efficiency of a searching scheme. In the case of a se-
quential search, it is no longer necessary to search the whole list to discover
that a record does not exist. You only need to search until you have passed
its logical place in the list, that is, until you come across a record with a
larger key value. If the list is stored in a linked representation, you can add
a trailer node with an impossibly large key value (e.g., a NAME field of
'ZZZZZZZZZZ') to ensure that there will always be some value larger than
the one for which you are searching.
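The role of the trailer node can be illustrated with a short program. This is a sketch under stated assumptions: it is written for a Free Pascal style compiler, it uses integer keys with MAXINT as the impossibly large trailer value rather than a NAME field, and the function name ORDEREDFIND is hypothetical.

```pascal
PROGRAM TRAILERDEMO;
(* Sketch only: integer keys, with MAXINT as the trailer value. *)
TYPE
  NODEPTR = ^NODETYPE;
  NODETYPE = RECORD
    KEY  : INTEGER;
    NEXT : NODEPTR
  END; (* record *)
VAR
  LIST, P : NODEPTR;
  I : INTEGER;

(* Searches an ordered list that ends with a trailer node. The loop *)
(* needs no test for NIL, because the trailer key stops it.         *)
FUNCTION ORDEREDFIND(LIST : NODEPTR; KEYVAL : INTEGER) : BOOLEAN;
VAR
  P : NODEPTR;
BEGIN
  P := LIST;
  WHILE P^.KEY < KEYVAL DO
    P := P^.NEXT;
  (* P now points to the first node whose key is >= KEYVAL. *)
  (* The trailer itself (NEXT = NIL) never counts as a hit.  *)
  ORDEREDFIND := (P^.KEY = KEYVAL) AND (P^.NEXT <> NIL)
END;

BEGIN
  (* Build the list 10, 20, 30, 40 followed by the trailer node. *)
  NEW(LIST);
  LIST^.KEY := MAXINT;
  LIST^.NEXT := NIL;
  FOR I := 4 DOWNTO 1 DO
    BEGIN
      NEW(P);
      P^.KEY := I * 10;
      P^.NEXT := LIST;
      LIST := P
    END;
  WRITELN('30 found? ', ORDEREDFIND(LIST, 30));
  WRITELN('25 found? ', ORDEREDFIND(LIST, 25))
END.
```

The search for 25 stops at the node containing 30, without examining 40 or the trailer, and reports that the value is absent.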
Procedure SEQFIND2 in the following program searches a linked list
implemented with pointer variables for the value KEYVAL. The list con-
tains a trailer node.
BEGIN (* seqfind2 *)
  FOUND := FALSE;
BINARY SEARCHING
The advantage of the sequential search is its simplicity. In the worst case,
you will have to make N comparisons, since you only examine one record at
a time and you might be searching for the last record in the list. If the list is
sorted and stored in an array, however, you can improve the search time to
a worst case of O(log2N). Of course, we improve efficiency at the expense of
simplicity.
The idea of a binary search is best described recursively.
Consider how this algorithm could be used to find the page with the
name "Dale" in the phone book. We open the phone book to the middle
and see that the names there begin with M. M is larger than D, so we binary
search from A to M. We turn to the midpoint and see that the names there
begin with G. G is larger than D, so we binary search from A to G. We turn
to the middle page, and find that the names there begin with C. C is smaller
than D, so we binary search the second half, that is, from C to G. And so
on, until we are down to the single page that contains the name "Dale."
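The recursive idea can be written down directly. The function below is a sketch, assuming a Free Pascal style compiler; the name BINSEARCH, its parameter list, and the sample data are hypothetical, and it searches an array of integers rather than a phone book.

```pascal
PROGRAM RECBINSEARCH;
(* Sketch only: a recursive form of the search; data are hypothetical. *)
CONST
  NUMELEMENTS = 7;
TYPE
  LISTTYPE = ARRAY[1..NUMELEMENTS] OF INTEGER;
VAR
  LIST : LISTTYPE;
  I : INTEGER;

(* Returns the index of KEYVAL in LIST[FIRST..LAST], or 0 if absent. *)
FUNCTION BINSEARCH(VAR LIST : LISTTYPE;
                   KEYVAL, FIRST, LAST : INTEGER) : INTEGER;
VAR
  MIDPOINT : INTEGER;
BEGIN
  IF FIRST > LAST
    THEN BINSEARCH := 0                  (* empty search area *)
    ELSE
      BEGIN
        MIDPOINT := (FIRST + LAST) DIV 2;
        IF LIST[MIDPOINT] = KEYVAL
          THEN BINSEARCH := MIDPOINT
          ELSE IF KEYVAL < LIST[MIDPOINT]
            THEN BINSEARCH := BINSEARCH(LIST, KEYVAL, FIRST, MIDPOINT - 1)
            ELSE BINSEARCH := BINSEARCH(LIST, KEYVAL, MIDPOINT + 1, LAST)
      END
END;

BEGIN
  FOR I := 1 TO NUMELEMENTS DO
    LIST[I] := I * 5;                    (* 5 10 15 20 25 30 35 *)
  WRITELN('25 is at index ', BINSEARCH(LIST, 25, 1, NUMELEMENTS));
  WRITELN('12 is at index ', BINSEARCH(LIST, 12, 1, NUMELEMENTS))
END.
```

The first call reports index 5 after examining three elements; the second reports 0 once the search area has shrunk to nothing.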
Although the recursive description is conceptually simpler, we know
that a nonrecursive solution will be more efficient. The Procedure
BINARYFIND performs a binary search on array LIST, returning the index
of the desired element in LOCATION (or 0 if the element is not found).
BEGIN (* binaryfind *)
  (* Initialize. *)
  FOUND := FALSE;
  FIRST := 1;
  LAST := NUMELEMENTS;
  (* Search until element found or there are no more elements. *)
  WHILE (FIRST <= LAST) AND NOT FOUND DO
    BEGIN
      (* Compare middle element in search area to *)
      (* the key value. *)
      MIDPOINT := (FIRST + LAST) DIV 2;
      IF LIST[MIDPOINT] = KEYVAL
        THEN FOUND := TRUE
        ELSE (* Move first and last indexes to cut search area in half. *)
KEYVAL = 25
Figure 12-1.
Note that the binary search discussed here is only appropriate for lists
stored in a sequential array representation. After all, how can you find the
midpoint of a linked list? But you already know of a structure that allows
you to perform a binary search on a linked data representation: the binary
search tree. The operations used to search a binary tree are discussed in
Chapter 8.
(Figure: number of comparisons plotted against N, the number of elements.)
HASHING
So far, we have succeeded in paring down our O(N) search to O(log2N) by
keeping the list ordered with respect to the value of the key field; that is,
the key in the first record is less than (or equal to) the key in the second
record, which is less than the key in the third, and so on. Can we do better
than that? Is it possible to design a search of O(1), that is, one that takes
the same search time to find any element in the list?
In theory, this is not an impossible dream. Consider a list of employees
of a fairly small company. Each of the 100 employees has an ID number in
the range 1 to 100. In this case, we can store the records in an array that is
indexed from 1 to 100, allowing us direct access to the record of any em-
ployee.
In practice, this perfect relationship between the key value and the ad-
dress of a record is not easy to establish or maintain. Consider a similar
company that uses its employees' Social Security numbers (SSN) as the
primary key. Now the range of key values is from 000000000 to 999999999.
Obviously, it is impractical (impossible) to set up an array of a billion rec-
ords, of which only one hundred will be needed, just to make sure that each
employee's record will be in a perfectly unique and predictable location.
What if we keep the array size down to the size that we actually need
(ARRAY[0..99]) and just use the last two digits of the key field to identify
each employee? For instance, the record of employee 467353374 will be in
EMPLOYEELIST[74], while the record of employee 587421235 will be in
EMPLOYEELIST[35]. Note that the records will not be ordered according
to the values in the key field, as they were in our earlier discussion; the
record of employee 587421235 precedes that of employee 467353374 in
this scheme. Instead, the records are ordered with respect to some function
of the key value.
The function has two uses. First, the result of the function is used as a
kind of key on which the list is sorted. That is, the function is used to
determine where in the array to store the record. Second, the function is
used as a method of accessing the record.
Figure 12-3. Using a hash function to determine the location of the element in the
array.
This function is called a hash function. In the case of the employee list
above, the hash function is KEY MOD 100. The key (SSN) is divided by
100, and the remainder is used as an index into the array of employee
records. This scheme is illustrated in Figure 12-3.
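The hash function itself takes only a line of code. The sketch below assumes a Free Pascal style compiler, where LONGINT is needed to hold a nine-digit key; the constant name MAXLIST is borrowed from later in this chapter, and the program around the function is hypothetical.

```pascal
PROGRAM HASHDEMO;
(* Sketch only. LONGINT is assumed here so that a nine-digit key fits. *)
CONST
  MAXLIST = 100;

(* Division-method hash: the remainder is the index into the array. *)
FUNCTION HASH(KEY : LONGINT) : INTEGER;
BEGIN
  HASH := KEY MOD MAXLIST
END;

BEGIN
  WRITELN('467353374 hashes to ', HASH(467353374));
  WRITELN('587421235 hashes to ', HASH(587421235))
END.
```

The two keys hash to 74 and 35, the same slots named in the discussion of EMPLOYEELIST above.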
Collisions
By now you are probably objecting to this scheme on the grounds that it
does not actually guarantee unique addresses. SSN 000001234 and SSN
999991234 both "hash" to the same address: EMPLOYEELIST[34]. The
problem of avoiding these collisions is the biggest challenge in designing a
good hash function. A good hash function minimizes collisions by spread-
ing the records uniformly throughout the table. Note that we say "mini-
mizes collisions," for it is extremely difficult to avoid them completely.
Assuming that there will be some collisions, where do you store the
records that produce them? We will briefly describe several popular colli-
sion-handling algorithms in the next sections. Note that the scheme that is
used to find the place to store a record determines the method of subse-
quently retrieving it.
Figure 12-4. Hash and search: the new record will go in the first free space,
EMPLOYEELIST[5].
the employee record with key SSN 556677003. This key is hashed into the
address (index) 03. But there is already a record stored in this slot in
the array. We increment the index and examine the next slot.
EMPLOYEELIST[4] is also in use, so we increment the index again. This
time we find a slot that is free, so we store the new record in
EMPLOYEELIST[5]. (Of course, all of the unused slots have been initial-
ized to a null or dummy key value, so it is easy to tell if the slot is free.)
What happens if the key hashes to the last index in the array and that space
is in use? We can consider the array as a circular structure and continue
looking for an empty slot at the beginning of the array.
Procedure HASHSTORE uses this algorithm to store a new element in a
list.
VAR STARTPLACE, (* starting location returned from HASH function *)
    TRYPLACE : INDEXTYPE;
BEGIN (* hashstore *)
  (* HASH is a function that returns the result of *)
  (* a hash function applied to NEWVALUE. *)
  STARTPLACE := HASH(NEWVALUE);
  (* Initialize for search. *)
  TRYPLACE := STARTPLACE;
  PLACEFOUND := FALSE;
  (* Search for place to insert NEWVALUE. *)
  REPEAT
    IF LIST[TRYPLACE] = DUMMYVAL
      THEN (* place found *)
        BEGIN
          LIST[TRYPLACE] := NEWVALUE;
          PLACEFOUND := TRUE;
        END
      ELSE (* try next place *)
        TRYPLACE := (TRYPLACE + 1) MOD MAXLIST
  UNTIL PLACEFOUND OR (TRYPLACE = STARTPLACE)
END; (* hashstore *)
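To watch hash and search at work, the situation of Figure 12-4 can be set up in a small, complete program. This is a sketch only, assuming a Free Pascal style compiler: STOREKEY is a hypothetical variant of the store routine that also reports where the value went, keys stand in for whole records, and the two keys that occupy slots 3 and 4 are invented for the illustration.

```pascal
PROGRAM PROBEDEMO;
(* Sketch only: keys stand in for whole records; 0 is the dummy value. *)
CONST
  MAXLIST = 100;
  DUMMYVAL = 0;
TYPE
  INDEXTYPE = 0..99;
  LISTTYPE = ARRAY[INDEXTYPE] OF LONGINT;
VAR
  LIST : LISTTYPE;
  I, WHERE : INTEGER;

FUNCTION HASH(KEY : LONGINT) : INTEGER;
BEGIN
  HASH := KEY MOD MAXLIST
END;

(* Stores NEWVALUE in the first free slot at or after its hash       *)
(* address, wrapping around at the end. WHERE is -1 if LIST is full. *)
PROCEDURE STOREKEY(VAR LIST : LISTTYPE; NEWVALUE : LONGINT;
                   VAR WHERE : INTEGER);
VAR
  STARTPLACE, TRYPLACE : INTEGER;
BEGIN
  STARTPLACE := HASH(NEWVALUE);
  TRYPLACE := STARTPLACE;
  WHERE := -1;
  REPEAT
    IF LIST[TRYPLACE] = DUMMYVAL
      THEN
        BEGIN
          LIST[TRYPLACE] := NEWVALUE;
          WHERE := TRYPLACE
        END
      ELSE TRYPLACE := (TRYPLACE + 1) MOD MAXLIST
  UNTIL (WHERE >= 0) OR (TRYPLACE = STARTPLACE)
END;

BEGIN
  FOR I := 0 TO MAXLIST - 1 DO
    LIST[I] := DUMMYVAL;
  STOREKEY(LIST, 123456703, WHERE);     (* hypothetical key, lands in 3 *)
  STOREKEY(LIST, 123456704, WHERE);     (* hypothetical key, lands in 4 *)
  STOREKEY(LIST, 556677003, WHERE);     (* collides at 3, then at 4 *)
  WRITELN('556677003 stored at index ', WHERE)
END.
```

The last key hashes to 3, finds slots 3 and 4 in use, and is stored at index 5.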
Rehashing
Another common technique for resolving collisions is rehashing. If the first
computation of the hash function produces a collision, you use the hash
address as the input to the rehash function and compute a new address. For
instance, if our original hash function is KEY MOD 100, we may use the
rehash function (KEY + 1) MOD 100 to produce a new address. (This par-
ticular choice is, in effect, hash and search.) Our original key (Figure
12-4), 556677003, produces the address 03, which is already in use. So we
apply the rehash function, using the first hash address as input: (03 + 1)
MOD 100 = 04. This address, EMPLOYEELIST[4], is also in use, so we
reapply the rehash function until we get an available slot. Each time, we
use the address computed from the previous rehash as input (KEY) to the
rehash function.
In fact, you can use any function (KEY + <constant>) MOD <number of
slots>, as long as <constant> and <number of slots> are relatively prime,
that is, they have no common divisor other than 1. These functions will
produce successive rehashes that will eventually cover every index in the
array.
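The effect of the relatively prime condition can be checked by counting the slots that a rehash sequence visits before it returns to its starting address. The program is a sketch, assuming a Free Pascal style compiler; the function name SLOTSVISITED and the choice of constants 1, 3, and 4 are hypothetical.

```pascal
PROGRAM REHASHCOVER;
(* Sketch only: counts how many distinct slots a rehash sequence visits. *)
CONST
  NUMSLOTS = 100;

(* Starting at START, applies (address + STEP) MOD NUMSLOTS until   *)
(* the sequence returns to START; returns the number of slots seen. *)
FUNCTION SLOTSVISITED(START, STEP : INTEGER) : INTEGER;
VAR
  ADDRESS, COUNT : INTEGER;
BEGIN
  ADDRESS := START;
  COUNT := 0;
  REPEAT
    COUNT := COUNT + 1;
    ADDRESS := (ADDRESS + STEP) MOD NUMSLOTS
  UNTIL ADDRESS = START;
  SLOTSVISITED := COUNT
END;

BEGIN
  WRITELN('Step 1: ', SLOTSVISITED(3, 1), ' of ', NUMSLOTS, ' slots');
  WRITELN('Step 3: ', SLOTSVISITED(3, 3), ' of ', NUMSLOTS, ' slots');
  WRITELN('Step 4: ', SLOTSVISITED(3, 4), ' of ', NUMSLOTS, ' slots')
END.
```

Constants 1 and 3 share no divisor with 100 and reach all 100 slots. The constant 4 shares the divisor 4 with 100, so its sequence visits only 25 slots and could miss a free slot even when the table is far from full.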
To access a record, use the hashing function to get an index into the
array. Compare the key of the record in this location to the desired key. If
it is not a match, rehash and compare again. Continue this cycle until you
find the desired record.
Chaining
A third technique for handling collisions uses the hash address not as the
actual location of the record, but as the index into an array of pointers called
buckets. Each bucket consists of a chain of records that share the same hash
address. Figure 12-5 illustrates this solution to the situation discussed in
the previous section. Rather than searching the array or rehashing, we sim-
ply allow both records to share hash address 03. The entry in the array at
this location contains a pointer to a chain that includes both records.
To search for a given record, first apply the hash function to the key, then
search the chain for the bucket indicated by the hash address. Note that you
have not eliminated searching, but you have limited the search to records
that actually share a hash address. Using the first hash and search tech-
nique we discussed, you may have to search through many additional rec-
ords if the slots following the hash address are filled with records from
collisions on other addresses.
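Chaining can be sketched with an array of pointers to linked lists. The program below assumes a Free Pascal style compiler; the names BUCKETS, CHAINSTORE, and CHAINFIND are hypothetical, keys stand in for whole records, and all keys other than 556677003 are invented for the illustration.

```pascal
PROGRAM CHAINDEMO;
(* Sketch only: keys stand in for whole records; names are hypothetical. *)
CONST
  MAXLIST = 100;
TYPE
  NODEPTR = ^NODETYPE;
  NODETYPE = RECORD
    KEY  : LONGINT;
    NEXT : NODEPTR
  END; (* record *)
VAR
  BUCKETS : ARRAY[0..99] OF NODEPTR;
  I : INTEGER;

FUNCTION HASH(KEY : LONGINT) : INTEGER;
BEGIN
  HASH := KEY MOD MAXLIST
END;

(* Adds KEY at the front of the chain for its bucket. *)
PROCEDURE CHAINSTORE(KEY : LONGINT);
VAR
  P : NODEPTR;
BEGIN
  NEW(P);
  P^.KEY := KEY;
  P^.NEXT := BUCKETS[HASH(KEY)];
  BUCKETS[HASH(KEY)] := P
END;

(* Searches only the chain that shares the hash address of KEY. *)
FUNCTION CHAINFIND(KEY : LONGINT) : BOOLEAN;
VAR
  P : NODEPTR;
  FOUND : BOOLEAN;
BEGIN
  P := BUCKETS[HASH(KEY)];
  FOUND := FALSE;
  WHILE (P <> NIL) AND NOT FOUND DO
    IF P^.KEY = KEY
      THEN FOUND := TRUE
      ELSE P := P^.NEXT;
  CHAINFIND := FOUND
END;

BEGIN
  FOR I := 0 TO MAXLIST - 1 DO
    BUCKETS[I] := NIL;
  CHAINSTORE(123456703);                 (* hypothetical key, bucket 3 *)
  CHAINSTORE(556677003);                 (* also bucket 3: same chain *)
  WRITELN('556677003 found? ', CHAINFIND(556677003));
  WRITELN('999999903 found? ', CHAINFIND(999999903))
END.
```

Both stored keys share bucket 3. The unsuccessful search for 999999903 examines only the two records on that chain and never touches the other 99 buckets.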
Figure 12-6.
That is, we will use the last two digits of the six-digit ID as our bucket
index. The planned hash scheme is shown in Figure 12-7(a).
Figure 12-7(b) shows the result after the hash scheme was implemented.
What happened? How could the distribution of the records have come out
so skewed? It turns out that the company's ID number is a concatenation of
three fields:
X X X X X X
Figure 12-7.
(a) Planned scheme: buckets [00] through [99]; average 5 records/bucket;
5 records X 100 buckets = 500 employees; expected search O(5).
(b) Actual result: 376 employees hired 1981, 97 employees hired 1982
(bucket [82], 97 records), and 27 employees hired 1983 (bucket [83],
27 records); the other buckets shown hold no records; 500 employees;
real search almost O(N).
Division Method
The most common hash functions use the division method (MOD) to com-
pute hash addresses. This is the type of function used in the preceding
examples. The general function is
Searching, like sorting, is a topic that is closely tied to the goal of efficiency.
We speak of a sequential search as an O(N) search, since it may require up
to N comparisons to locate an element. (N refers to the number of records in
the list.) Binary searches are considered O(log2N) and are appropriate only
for arrays. A binary search tree may be used to allow binary searches on a
linked structure. The goal of hashing is to produce a search that approaches
O(1). Because of collisions of hash addresses, some searching or rehashing
is usually necessary. A good hash function minimizes collisions and distrib-
utes the records randomly throughout the table.
For programmers, it is usually more interesting to create a new proce-
dure to solve some problem than to review someone else's solution. Why
Exercises
DATA
Sequential Ordered Search    Sequential Unordered Search    Binary Search
26
2
96
98
103
243
244
5. Write the search routine, Procedure HASHFIND, that finds an element that
was stored by Procedure HASHSTORE described in this chapter.
DATA
944
Goals
To be able to trace some of the stages in the life cycle of a
software project.
To be able to describe two kinds of situations that lead to
program modification.
To be able to discuss four goals for a good program.
To be able to describe software-engineering principles that
relate to attaining these goals.
To be able to construct a schedule for completing a
programming assignment on time.
To be able to name and describe a technique used in this book
for designing a program through multiple levels of abstraction.
To be able to explain what is meant by data encapsulation and
why it is an important principle of software engineering.
To be able to show how Pascal cannot enforce data
encapsulation, and to name a language that can.
To be able to show how a programmer can effect data
encapsulation in a Pascal program.
To be able to describe several guidelines for choosing an
appropriate data structure.
We have said that, over the life cycle of a computer program, the largest
effort is expended not in its design or coding, but in its maintenance after
the coding is finished. To understand this assertion, you must look past
your immediate student programming experience into the real-world pro-
gramming environment. In the university, you typically write a program of
some several hundred lines of source code, debug it, run it on a few test
cases, and then leave it. In this situation, your greatest effort is spent on the
design and coding of the program. Understanding the code is clearly not a
problem, since it was written by one person over a short interval of time;
maintaining it is usually unnecessary, since programming assignments are
generally discrete activities.
In your professional computing career, you will probably engage in a
wide variety of activities in addition to coding. In fact, the act of coding is
one of the least time consuming (and least creative) aspects of program
development and maintenance. Unlike your student program, which usu-
ally ceases to have a function after it is delivered to your professor, the
professionally produced program really begins its active life at the time of
delivery to the customer. Thus program maintenance is a significant part of
the program's life cycle.
THE ACME AUTOMATION COMPANY (A FICTION)
The idea that a product needs to be maintained over the course of its life
cycle is more obvious in the realm of hardware. Take, for instance, the case
of the Acme Automation Company and its automatic cookie-making ma-
chine. The New Business Department had the original idea: Bakeries
would be interested in buying a machine that automated the mixing, cut-
ting, baking, and bagging of their cookies. The idea was passed to the New
Product Development Department, which made up a detailed description
of the proposed machine. The department determined, for instance, that
the machine should have a set of digital controls to determine how much of
each ingredient should be added, as well as a dial for controlling how long
the cookies should bake. The department also decided that the machine
should have interchangeable cookie cutters. Finally, New Product Devel-
opment determined that the whole process should produce 10 dozen cook-
ies per minute.
The specifications were passed to a group of engineers in the Product
Design Department. The engineers designed a machine and built a proto-
type for testing. When they had a working design, the Factory Division
began building the automated cookie-making machines. Sales were good,
and soon several hundred of the machines were in use in bakeries.
But the story does not end here. Over time, some of the machines
needed servicing, and the company had to send out field engineers to re-
pair them. In addition, modifications to the original design were found to
be necessary. First, after a period of use in bakeries, it was found that, in
practice, the machines did not actually meet their advertised claim of pro-
ducing 10 dozen cookies a minute. The engineers were brought back in to
evaluate the problem and discovered a design flaw that made certain parts
of the machine wear down quickly, eroding the production rates. This
problem was solved with some clever redesign work; the machines were
modified accordingly, and the customers were happy. Second, after several
important customers had expressed a desire for an enhanced cookie-mak-
ing machine, the Acme Automation Company had its engineers make addi-
tional modifications to the original design. The new machine made 25
dozen cookies a minute, allowed a greater range of packaging options
(boxes as well as bags), and required fewer operators.
Note that modifications to the design were prompted by two situations:
first, when the design did not meet its specifications, and second, when the
specifications changed.
SOFTWARE ENGINEERING
The term programming emphasizes the act of writing a program for a com-
puter to execute: the generation of code in a computer language. As pro-
grammers, you know that this act also includes the design phase that pre-
cedes coding and the testing and maintenance phases that follow it. But the
stress of the term programming is on the act of coding.
Chapter 13 Welcome to the Real World
The term software engineering may more adequately reflect the set of
activities that control a program through its whole life cycle. As we said in
Chapter 1, the development of computer programs requires more than ar-
tistry, especially as the size and complexity of software products skyrocket.
your completion date as Friday, October 23. This leaves you room to over-
run your own schedule without missing the assignment's deadline. Given
this new completion date, you can make up the following schedule of mile-
stones:
Oct. 5 Assignment received. THINK.
Oct. 7 Specifications completely understood. Get answers to any ques-
tions about requirements, input, output, etc.
Oct. 9 Level 0 of top-down design completed.
Oct. 15 All levels of top-down design completed. (Save for external docu-
mentation.) All data structures determined.
Oct. 18 Coding completed.
Oct. 19 Code clean compiled. (All syntax errors out.) Begin testing.
Oct. 22 Testing completed. Make sure that program meets all standards
for delivery, and that external and internal documentation are
complete.
Oct. 23 Program ready for delivery.
If you schedule your progress from the start, you will avoid being surprised
on the night before the program is due when the hardware fails. Your
schedule will also give you a greater appreciation for the relative time
requirements of the various programming activities.
The Abstract Integer The idea of using a data type without knowing how
it is physically represented is not new to you. You have, for instance, been
using integers since your first program. The integer is a primitive type; that
is, it needs no further definition for program use. But what is an integer?
There is no single implementation of even this low-level data type. It may
be represented in the memory of one machine as a binary-coded decimal,
on a second as a sign and magnitude binary, and on a third computer in
one's-complement or two's-complement notation. All this time you've
probably been using integers without even knowing what any of these
terms mean.
Does the representation make any difference? Of course it does! The
implementation of all the operations on integers is directly dependent on
their machine representation. As Pascal programmers, you don't usually get
involved in matters at this level; you simply use integers. You know how to
define an integer variable in the syntax of the source language, and you
know what operations are allowed on integers: addition, subtraction, multi-
plication, division, and modulo. In fact, in a sense, the expression
X+Y
ADDINT(X, Y)
where the operator, +, has been defined in the lowest level of data abstrac-
tion. All you need to know is how to use the operator: + is an infix binary
operator that takes two integer-type operands. You also know that there is
another + that is used in an identical way (binary infix) with operands of
type REAL. What is invisible to the users is the fact that the REAL + is
actually implemented in a completely different way.
In fact, both + operators are usually implemented in the machine's hard-
ware. How they work is transparent to the programmer. It would be intoler-
able if, every time you wanted to add two numbers, you had to get down
to the machine-representation level. Your code would be completely
weighted down with low-level detail and very prone to error. Furthermore,
if you wanted to use the program on a different machine, you would have to
change every use of the + operator to fit the new representation of num-
bers. Your program would not be portable.
Clearly, then, the bit level of data representation needs to be hidden
from the programmer. So we draw a "black box" around the type that has
been defined (e.g., INTEGER). On the outside of the black box is the name
of the type and a list of its operations. Inside the black box are the details of
the representation of the data element and the implementation of the oper-
ations on it. The user can only see the outside of the box.
Figure 13-1 shows how a black box for type INTEGER might look to the
Pascal user.
[Figure 13-1. Black box for type INTEGER. Inside the box: the representation of INTEGER.]
[Figure: black box for STACKTYPE. Outside the box, the operations: CLEARSTACK (STACK); PUSH (STACK, EL); POP (STACK, EL); FULLSTACK (STACK) : Boolean; EMPTYSTACK (STACK) : Boolean. Inside the box: the definition of STACKTYPE, e.g., RECORD TOP : INDEXTYPE; ELEMENTS : ARRAY [1..MAX] OF ELTYPE END; (* record *), and the implementation of the operations (complete procedures and functions).]
X := STACK.ELEMENTS[5]
way to enforce the use of the interface. Accessing the data in the middle of
a stack completely denies its whole logical abstraction, but will not cause a
compile-time or a run-time error.
A truly abstract data type can be manipulated only through the opera-
tions defined for it. The user is unaware of its implementation. As such, the
abstract data type can only be simulated in Pascal, since there is no way to
enforce the prohibition against direct access of the implementation. The
ability to create enforceably abstract data types has been included in the
design of the programming language Ada.
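The gap between having an interface and enforcing it can be seen in a small sketch. The declarations follow the STACKTYPE record shown in the figure; the values MAX = 5 and ELTYPE = INTEGER, the program name, and the sample data are assumptions chosen for illustration, and only three of the stack operations are written out.

```pascal
program StackDemo(output);

const
  MAX = 5;                          (* assumed capacity, for illustration *)

type
  ELTYPE = integer;                 (* assumed element type *)
  INDEXTYPE = 0..MAX;
  STACKTYPE = record
                TOP : INDEXTYPE;
                ELEMENTS : array [1..MAX] of ELTYPE
              end;

var
  STACK : STACKTYPE;
  X : ELTYPE;

procedure CLEARSTACK (var STACK : STACKTYPE);
begin
  STACK.TOP := 0
end;

procedure PUSH (var STACK : STACKTYPE; EL : ELTYPE);
begin
  (* the caller is expected to check for a full stack first *)
  STACK.TOP := STACK.TOP + 1;
  STACK.ELEMENTS[STACK.TOP] := EL
end;

procedure POP (var STACK : STACKTYPE; var EL : ELTYPE);
begin
  (* the caller is expected to check for an empty stack first *)
  EL := STACK.ELEMENTS[STACK.TOP];
  STACK.TOP := STACK.TOP - 1
end;

begin
  CLEARSTACK(STACK);
  PUSH(STACK, 10);
  PUSH(STACK, 20);
  PUSH(STACK, 30);

  (* Legal Pascal, but it bypasses the interface: it reads the *)
  (* bottom element without popping anything.                  *)
  X := STACK.ELEMENTS[1];
  writeln('Direct access sees ', X);

  POP(STACK, X);
  writeln('POP returns ', X)
end.
```

The direct access reports the bottom element (10) while POP reports the top element (30). The compiler accepts both statements, which is exactly the weakness described above: nothing in Pascal stops the user from reaching inside the box.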
Ada In the early 1970s, the U.S. Department of Defense (DoD) experi-
enced a significant increase in its computer systems expense, even though
hardware costs were greatly decreasing. It was determined that software
technology (or its lack) was responsible for much of this increasing cost,
and great effort was expended to develop solutions to the problem. One
solution was the decision to select a single language in which to implement
the numerous DoD projects. At that time, more than 500 different high-
level and assembly languages were in use in DoD systems. A number of
these languages were considered, including Pascal, and it was determined
that none of them met the criteria that DoD had established. Therefore, an
international design competition took place.
The details of this competition are, to say the least, very colorful; eventu-
ally, the designs from four contractors were picked as semifinalists, and
were nicknamed Blue, Yellow, Red, and Green. It is interesting that all four
of these competing groups based their designs on the Pascal language. In
the end, the Green language, designed by Honeywell/Honeywell Bull, was
selected as the winner of the design competition. The resulting language
was renamed Ada, in honor of Augusta Ada Byron, Countess of Lovelace.
Ada Lovelace, the daughter of the poet Lord Byron, was a mathematician
who worked with Charles Babbage on his computing machines in the mid-
nineteenth century. She had suggested how Babbage's machines might be
programmed to do different tasks and for this reason is considered by some
to be the world's first programmer. (She was not, however, the world's first
software engineer.)
One of the things the Ada programming language does that Pascal does
not do is enforce data encapsulation. The Ada package enforces data encap-
sulation by letting the programmer put a black box around the data type
definitions. The package is split into two parts: the package specification
that gives the interface (the outside of the black box), and the package body
that contains the hidden implementation (the contents of the black box).
You can declare a stack using an Ada package and then allow use of the
stack only through the interface in the package specification. An example of
an Ada package specification is shown below.
package STACKPACK is
private
   MAXSTACK : constant INTEGER := 100;
   type STACKTYPE is
      record
         TOP : INTEGER range 0 .. MAXSTACK;
         ELEMENTS : array(1 .. MAXSTACK) of INTEGER;
      end record;
end STACKPACK;
Note first of all that, because of its Pascal base, the Ada syntax seems
very familiar and is readable to a Pascal programmer who has no Ada train-
ing. The double dashes (--) precede a comment, and there is the unfamil-
iar use of the words is, in, and out, but on the whole, the package specifica-
tion is understandable.
The package specification, which is available to the stack user, gives the
interface to each of the stack utilities. However, the declaration of STACK-
TYPE itself is in a special section called the private part. This section of the
specification, unlike the procedure and function declarations that precede
it, is not visible to the stack user. That is, this private information about the
stack's implementation cannot be used in programs that use the stack package.
Why was it necessary to put this information in the specification at all?
The information is included in the specification, even though it is not visi-
ble to the user, because the type STACKTYPE is used in the interfaces to
the procedures and functions. If a type is completely internal to the package,
it may be declared within the package body instead of the specification,
in which case it will be invisible to the package user.
The private declaration of STACKTYPE is visible, of course, to the pack-
age body associated with this package specification. The package body for
STACKPACK is shown below.
begin
   STACK.TOP := STACK.TOP + 1;
   STACK.ELEMENTS(STACK.TOP) := X;
end PUSH;
end STACKPACK;
STACKPACK.PUSH(STACK1, NUM);
A simpler way to use the stack routines is to tell the calling program that
you want to use this package. This is accomplished through the use clause:
use STACKPACK;
begin
   PUSH(STACK1, NUM);
end;
Either way, the user has access only to the visible part of the package
specification, which contains the interfaces to the stack routines.
Effecting Data Abstraction in Pascal How can you make Pascal pro-
grams enforce abstract data types? One way is to make programs look more
like pseudo-code. You can, for example, code a low-level procedure to in-
sert an element in a linked list that has been implemented with Pascal
pointer variables. This procedure may include references to PTR↑.INFO
and PTR↑.NEXT. The procedure that calls this insertion routine is a
higher-level subprogram, and shouldn't include the details of the list
implementation. Instead of using such statements as
BEGIN (* info *)
  INFO := P↑.INFO
END; (* info *)
The higher-level procedures that access list items still reference IN-
FO(PTR), since the implementation change is transparent at that level.
Similarly, NEXT and BACK pointer fields may be accessed through such
accessing functions. We discussed in Chapter 2 how the set of accessing
functions defines a data structure.
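One way to picture accessing functions is the sketch below. The node layout (an integer INFO field and a NEXT pointer) and the names NODEPTR, NODETYPE, and PRINTLIST are assumptions made for illustration; the character ^ is the keyboard form of the up-arrow.

```pascal
program AccessDemo(output);

type
  NODEPTR = ^NODETYPE;
  NODETYPE = record
               INFO : integer;
               NEXT : NODEPTR
             end;

var
  LIST, P : NODEPTR;
  I : integer;

(* Accessing functions: the only routines that mention the fields. *)
function INFO (PTR : NODEPTR) : integer;
begin
  INFO := PTR^.INFO
end;

function NEXT (PTR : NODEPTR) : NODEPTR;
begin
  NEXT := PTR^.NEXT
end;

(* Higher-level routine: no field names, no up-arrows. *)
procedure PRINTLIST (LIST : NODEPTR);
var
  PTR : NODEPTR;
begin
  PTR := LIST;
  while PTR <> nil do
    begin
      writeln(INFO(PTR));
      PTR := NEXT(PTR)
    end
end;

begin
  (* Low-level construction of a three-node list: 10, 20, 30. *)
  LIST := nil;
  for I := 3 downto 1 do
    begin
      new(P);
      P^.INFO := I * 10;
      P^.NEXT := LIST;
      LIST := P
    end;
  PRINTLIST(LIST)
end.
```

If the list were later reimplemented with an array of records, only INFO, NEXT, and the construction code would need to change; PRINTLIST would compile unchanged.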
Such a scheme seems very inefficient because of the tremendous overhead
of procedure and function calls. In fact, a greater degree of data
encapsulation will likely be somewhat inversely related to the (time) efficiency
of the program. As always, the identification of the goals of the program is
paramount. Does the program require fast execution above all, or is ease of
maintenance and modification a primary need? In most cases, ease of main-
tenance and modification will be more important, which means data encap-
sulation is the appropriate choice.
REGRESSION TESTING
Reexecuting tests of programs after modifications have been made to
ensure that the programs still work correctly.
Choosing a Data Structure
Finally, reliability is increased by making the design conform to the
logical picture and by limiting error-causing details to lower levels of the
program.
If one of the built-in data types (e.g., the primitive types, array, or record
in Pascal) fits the problem solution, it should be used.
If not, the data structure designed should mirror the conceptual picture
of the data. Think about how you would solve the problem by hand.
The implementation should be deferred to lower-level subprograms,
transparent to the application part of the program. This makes the
code more easily readable, as well as modifiable.
If efficiency is a major consideration, the choice of data structure and
implementation should reflect the source of the limiting factor. Lim-
ited memory may require a less complicated data structure-for exam-
ple, fewer link fields in a linked list. Stringent time requirements, on
the other hand, may call for data structures that are more complicated
(take more space) to speed the algorithm. Usually, time and space
requirements cannot both be completely satisfied. In many cases,
however, programmer efficiency may be the limiting factor, requiring
the data structure to support the simplification of the program design.
Finally, resist the temptation to over-design. In student programs, it is
tempting to want to use everything at once. If the program really only
requires a simple linear linked list, don't use a doubly linked list, with
headers and trailers for good measure. That clearly wastes space, time,
and your efforts. Always design your data structures, as well as your
algorithms, to reflect the specifications of the program.
In this chapter, we have tried to bring together the many ideas presented
throughout this book. The tools we have discussed have varied greatly,
from style and formatting considerations to data structures to techniques for
using recursion and for sorting and searching. We hope that the presenta-
tion has stressed that these are all pieces of a larger, growing body of
knowledge that programmers (software engineers) share. The judicious
use of these tools leads to a goal that we all share as well: the production of
high-quality computer software.
Appendixes*
* Some of these Appendixes are taken from Introduction to Pascal and Structured Design, by
Nell Dale and David Orshalick. Copyright © 1983.
Standard Constants
FALSE TRUE MAXINT
Standard Types
INTEGER BOOLEAN REAL CHAR TEXT
Standard Files
INPUT OUTPUT
Standard functions

Function    Parameter type                 Result type         Returns
ABS(X)      INTEGER or REAL                Same as parameter   Absolute value of X
ARCTAN(X)   INTEGER or REAL                REAL                Arctangent of X in radians
CHR(X)      INTEGER                        CHAR                Character whose ordinal number is X
COS(X)      INTEGER or REAL                REAL                Cosine of X (X is in radians)
EOF(F)      FILE                           BOOLEAN             End-of-file test of F
EOLN(F)     FILE                           BOOLEAN             End-of-line test of F
EXP(X)      REAL or INTEGER                REAL                e to the X power
LN(X)       REAL or INTEGER                REAL                Natural logarithm of X
ODD(X)      INTEGER                        BOOLEAN             Odd test of X
ORD(X)      Ordinal (scalar except REAL)   INTEGER             Ordinal number of X
PRED(X)     Ordinal (scalar except REAL)   Same as parameter   Unique predecessor of X (except when X is the first value)
ROUND(X)    REAL                           INTEGER             X rounded
SIN(X)      REAL or INTEGER                REAL                Sine of X (X is in radians)
SQR(X)      REAL or INTEGER                Same as parameter   Square of X
SQRT(X)     REAL or INTEGER                REAL                Square root of X
SUCC(X)     Ordinal (scalar except REAL)   Same as parameter   Unique successor of X (except when X is the last value)
TRUNC(X)    REAL                           INTEGER             X truncated
Standard Procedures
Description
DISPOSE(P) Destroys the dynamic variable referenced by
pointer P by returning it to the available space
list.
[Syntax diagrams: PROGRAM, IDENTIFIER, BLOCK, CONSTANT, TYPE, SIMPLE TYPE, FIELD LIST, PARAMETER LIST, STATEMENT, VARIABLE, EXPRESSION, SIMPLE EXPRESSION, TERM, and FACTOR.]
APPENDIX F PROGRAM FORMATTING*
ELSE
statement
* Reprinted with permission of Hayden Book Company from Pascal With Style: Programming
Proverbs, by Henry F. Ledgard, Paul A. Nagin, and John F. Hueras. Copyright © 1978.
Variant Records
Within a record type, there may be some fields which are mutually exclu-
sive. That is, fields A and B may never be in use at the same time. Instead of
declaring a record variable large enough to contain all the possible fields,
you can use the variant record provided in Pascal.
The variant record has two parts: the fixed part where the fields are
always the same and the variant part where the fields will vary. Since only a
portion of the variant fields are in use at any one time, they may share space
in memory. The compiler need allocate only enough space for the record
variable to include the largest variant.
If a record has both a fixed part and a variant part, the fixed part must be
defined first. The following is an example of a record definition that con-
tains a variant part.
PARTTYPE = RECORD
             ID : PACKED ARRAY[1..10] OF CHAR;
             QTY : INTEGER;
             TAG : ITEM;
             CASE ITEM OF
               ASSEMBLY : (DRAWINGID : PACKED ARRAY
                                         [1..5] OF CHAR;
                           CODE : 1..12;
                           CLASS : (A, B, C, D));
When using a variant record variable, the user is responsible for access-
ing fields that are consistent with the TAG field. For example, if the TAG
field is NUT, BOLT, or WASHER, only the fixed fields can be accessed. If
the TAG field is LOCK, PART.KEYNO is a legal field reference. If
the TAG field is ASSEMBLY, PART.DRAWINGID, PART.CODE, and
PART.CLASS are all legal references.
Several points should be made about defining and using variant records.
We will use the above definition to illustrate.
1. A record definition may contain only one variant part, although field
lists in the variant part may contain a variant part (nested variant).
2. All field identifiers within a record definition must be unique.
3. The tag field (TAG) is a separate field of a record (if present).
4. The tag field is used to indicate the variant used in a record variable.
5. The case clause in the variant part is not the same as a CASE statement.
(a) There is no matching END for the CASE; the END of the record
definition is used.
(b) The case selector is a type (ITEM).
(c) Each variant is a field list labeled by a case label list. Each
label is a constant of the tag type (ITEM).
(d) The field lists are in parentheses.
(e) The field lists define the fields and field types of that variant.
6. The tag type can be any ordinal type, but it must be a type identifier.
7. Several labels can be used for the same variant (NUT, BOLT,
WASHER).
8. A field list can be empty, which is denoted by "()".
9. The variant to be used is assigned at run time. The variant can be
changed by assignments to other variant fields and the tag field.
When a variant is used, data (if any) in a previous variant is lost.
10. The tag field does not appear in the field selectors for the variant
fields.
11. It is an error to access a field that is not part of the current variant.
The case clause in the variant part of the record definition is often
matched by a CASE statement in the body of the program. For example, the
following program fragment could be used to print data about a record.
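A sketch of such a fragment, built on a cut-down version of PARTTYPE, might look like the program below. The INTEGER type chosen for KEYNO, the sample values, and the procedure name PRINTPART are assumptions for illustration. The CLASS field is left out because class is a reserved word in some later Pascal dialects.

```pascal
program VariantDemo(output);

type
  ITEM = (NUT, BOLT, WASHER, LOCK, ASSEMBLY);
  PARTTYPE = record
               ID : packed array [1..10] of char;
               QTY : integer;
               TAG : ITEM;
               case ITEM of
                 NUT, BOLT, WASHER : ();
                 LOCK : (KEYNO : integer);   (* field type is an assumption *)
                 ASSEMBLY : (DRAWINGID : packed array [1..5] of char;
                             CODE : 1..12)
             end;

var
  PART : PARTTYPE;

procedure PRINTPART (PART : PARTTYPE);
begin
  writeln('ID:  ', PART.ID);
  writeln('QTY: ', PART.QTY);
  (* This CASE statement mirrors the case clause of the record: *)
  (* each branch touches only the fields of its own variant.    *)
  case PART.TAG of
    NUT, BOLT, WASHER : writeln('No additional data');
    LOCK : writeln('Key number: ', PART.KEYNO);
    ASSEMBLY : begin
                 writeln('Drawing: ', PART.DRAWINGID);
                 writeln('Code:    ', PART.CODE)
               end
  end
end;

begin
  PART.ID := 'FRONT LOCK';        (* exactly 10 characters *)
  PART.QTY := 4;
  PART.TAG := LOCK;
  PART.KEYNO := 1207;
  PRINTPART(PART)
end.
```

Because the tag is set to LOCK before KEYNO is assigned, the only variant field the program touches is the one that is consistent with the tag, as point 11 above requires.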
[Figure: graph of y = F(x) between x = FIRSTPT and x = LASTPT, with the MIN and MAX values of the function marked on the y axis.]
BEGIN
(* initialize MIN and MAX to first function value *)
MIN := F(FIRSTPT);
MAX := MIN;
EVALPT := FIRSTPT + INTERVAL;
The procedure calls the function specified for F in the call to MINMAX.
For example, the calls
MINMAX (SIN, 0.5, 0.8, 0.01, MIN, MAX);
MINMAX (RESPONSE, A, B, T, MIN, MAX);
MINMAX (POLY, D1, D2, S, MIN, MAX)
are all valid calls to the procedure if RESPONSE and POLY are declared
real functions, with one real formal parameter each (must be a value param-
eter). The other actual parameters must be real-all parameters must match
in type.
The call to function F within MINMAX would substitute the function
specified in the call to MINMAX. For example,
RESULT := F(EVALPT)
within MINMAX would be evaluated as
RESULT := RESPONSE(EVALPT)
if RESPONSE was specified in the call to MINMAX.
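The substitution can be tried out with a small complete program. This is a sketch under stated assumptions: the loop body of MINMAX, the test function POLY, and the variable names LOW and HIGH are invented for illustration, and the formal parameter list repeats the full function heading, which is the form many implementations require. The {$mode iso} line is a Free Pascal directive (an assumption about the compiler in use); a compiler that does not recognize it would normally treat it as a comment.

```pascal
{$mode iso}
program MinMaxDemo(output);

var
  LOW, HIGH : real;

function POLY (X : real) : real;
begin
  POLY := X * X - 2.0 * X           (* hypothetical test function *)
end;

procedure MINMAX (function F (X : real) : real;
                  FIRSTPT, LASTPT, INTERVAL : real;
                  var MIN, MAX : real);
var
  EVALPT, RESULT : real;
begin
  (* initialize MIN and MAX to first function value *)
  MIN := F(FIRSTPT);
  MAX := MIN;
  EVALPT := FIRSTPT + INTERVAL;
  while EVALPT <= LASTPT do
    begin
      RESULT := F(EVALPT);          (* calls whatever function was passed *)
      if RESULT < MIN then MIN := RESULT;
      if RESULT > MAX then MAX := RESULT;
      EVALPT := EVALPT + INTERVAL
    end
end;

begin
  MINMAX(POLY, 0.0, 2.0, 0.5, LOW, HIGH);
  writeln('MIN = ', LOW:8:3, '  MAX = ', HIGH:8:3)
end.
```

With these sample values the function is evaluated at 0.0, 0.5, 1.0, 1.5, and 2.0, so the minimum found is -1 (at X = 1.0) and the maximum is 0.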
The syntax diagram and the following example adhere to the Jensen and
Wirth standard. However, several popular implementations require that
the entire procedure or function heading be repeated in the formal parame-
ter list. The implementation that was used to test the mailing label program
was one of these. On page 235, Function XCOMPARE is defined in the
formal parameter list of Procedure INSERT. On pages 237 and 238, Proce-
dure INSERT is called with Function COMNAME or Function COMZIP
as actual parameters.
Forward Statement
Identifiers in Pascal must be defined before being used (the type identifier
in the pointer type definition is an exception). Recursion was defined as a
procedure or function calling itself. There are recursive situations where
one procedure or function calls another which in turn calls the first. This is
called mutual recursion.
(********************************************************)
PROCEDURE ONE (VAR A : ATYPE);
BEGIN
  TWO(X);
END;
(********************************************************)
PROCEDURE TWO (VAR B : BTYPE);
BEGIN
  ONE(Y);
END;
(********************************************************)
In the above example, the call to procedure TWO in the body of procedure
ONE is not allowed because procedure TWO has not yet been defined.
The solution to this problem is to make a forward reference to procedure
TWO by using the FORWARD statement.
(********************************************************)
PROCEDURE TWO (VAR B : BTYPE);
FORWARD;
(********************************************************)
PROCEDURE ONE (VAR A : ATYPE);
BEGIN
  TWO(X);
END;
(********************************************************)
PROCEDURE TWO;
BEGIN
  ONE(Y);
END;
(********************************************************)
Notice that the parameter list (and the result type for a function) is written
in the forward reference; it is not repeated in the actual declaration of the
procedure (or function). The compiler "remembers" the parameter declarations
when it encounters the actual procedure.
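A runnable illustration of the same pattern is sketched below. The functions ISEVEN and ISODD are hypothetical examples, not taken from the text; they are chosen only because each one needs to call the other. Some newer dialects insist that the heading be repeated in full at the actual declaration, so the short form shown here is an assumption about the compiler.

```pascal
program ForwardDemo(output);

(* Forward reference: full heading now, body later. *)
function ISODD (N : integer) : boolean; forward;

function ISEVEN (N : integer) : boolean;
begin
  if N = 0
    then ISEVEN := true
    else ISEVEN := ISODD(N - 1)     (* legal because of the forward reference *)
end;

function ISODD;                     (* parameter list is not repeated *)
begin
  if N = 0
    then ISODD := false
    else ISODD := ISEVEN(N - 1)
end;

begin
  (* both arguments are assumed to be nonnegative *)
  writeln('ISEVEN(10) = ', ISEVEN(10));
  writeln('ISODD(7)   = ', ISODD(7))
end.
```

Both calls report a true result: 10 is even and 7 is odd. Without the forward reference the call to ISODD inside ISEVEN would be rejected, for the reason given above.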
EBCDIC
Left Digit(s)    Right Digit: 0 1 2 3 4 5 6 7 8 9
6 0
7 ¢ < +
8 &
9 ! $ * l
10 %
11 > ?
12 # @ a
13 b c d e f g h
14 j k m n
15 o p q r
16 s t u v w x y z
17 {
18
19 A B C D E F G
20 H I J
21 K L M N O P Q R
22 S T U V
23 W X Y Z
24 0 1 2 3 4 5 6 7 8 9
CDC
Left Digit(s)    Right Digit: 0 1 2 3 4 5 6 7 8 9
0 A B C D E F G H I
1 J K L M N O P Q R S
2 T U V W X Y Z 0 1 2
3 3 4 5 6 7 8 9 + *
4 / ( ) $ 0 , - [
5 ] % # ~ V /\ t t < >
6 ~ ~ -,
APPENDIX I SETS
To declare a set type you use the following syntax:
SET OF base type
The type LETTERSET in the example above describes a set type in
which the base type is the letters of the alphabet. The statement
VAR VOWELS, CONSONANTS: LETTERSET;
actually creates two set variables of this type. VOWELS and CONSO-
NANTS are undefined (like all variables) until you initialize them in your
program. Be careful; each is a structure that can contain none, one, or a
combination of alphabetic characters. The set variables do not start out with
the letters in them.
To put elements into a set, you must use an assignment statement.
VOWELS := ['A', 'E', 'I', 'O', 'U']
puts the elements 'A', 'E', 'I', 'O', 'U' into the set variable VOWELS. Notice
that []s are used here, not ()s.
You cannot access the individual elements of a set, but you can ask if a
particular element is a member of a set variable. You can also do the stand-
ard set operations: union, intersection, and difference.
+ (Union): the union of two set variables is a set made up of those ele-
ments that are in either or both.
* (Intersection): the intersection of two set variables is a set made up of
those elements occurring in both set variables.
- (Difference): the difference between two set variables is a set made up
of the elements that are in the first set variable but not in the second.
The relational operators (=, <>, >=, <=, <, >) can all be applied to
sets. In addition, there is a test for set membership.

Expression      Returns TRUE if
SET1 = SET2     SET1 and SET2 are identical
SET1 < SET2     all the elements in SET1 are in SET2 and there
                is at least one element in SET2 not in SET1
SET1 > SET2     all the elements in SET2 are in SET1 and there
                is at least one element in SET1 not in SET2

IF CH IN VOWELS
   THEN . . .
if VOWELS had been initialized to ['A', 'E', 'I', 'O', 'U']. Testing for set
membership is a much faster operation than evaluating a long expression in
an IF statement, and it certainly is easier to read.
Ordering has no meaning in sets. The assignment
is the same as
so it doesn't matter how you list values within the set brackets.
We've used subranges to assign values to sets. As usual, the second value
must be greater than the first. For example,
LET1 := ['P'..'T', 'A'..'C', 'Z', 'K'..'M']
although it makes better sense to list the values in a more readable order.
Just like variables of other types, you need to initialize the value of a set
variable before you manipulate it. If you want a set to be empty before
adding elements to it, the assignment to the empty set,
LET1 := []
should be used.
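The set operations described in this appendix can be seen working together in a short sketch. The definition of LETTERSET as SET OF 'A'..'Z' is an assumption (the text describes the type but its declaration is not shown here), and the variable COUNT and the sample values are invented for illustration.

```pascal
program SetDemo(output);

type
  LETTERSET = set of 'A'..'Z';      (* assumed definition of LETTERSET *)

var
  VOWELS, CONSONANTS, LET1 : LETTERSET;
  CH : char;
  COUNT : integer;

begin
  VOWELS := ['A', 'E', 'I', 'O', 'U'];
  CONSONANTS := ['A'..'Z'] - VOWELS;          (* difference *)

  LET1 := [];                                 (* start from the empty set *)
  LET1 := LET1 + ['P'..'T'] + ['A'..'C'];     (* union *)

  COUNT := 0;
  for CH := 'A' to 'Z' do
    if CH in LET1 * VOWELS                    (* intersection, then membership *)
      then COUNT := COUNT + 1;
  writeln('Vowels in LET1: ', COUNT);

  if VOWELS <= VOWELS + CONSONANTS            (* subset test *)
    then writeln('VOWELS is a subset of the alphabet')
end.
```

LET1 ends up holding A, B, C, P, Q, R, S, and T, so the only vowel it shares with VOWELS is 'A' and the count printed is 1.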
APPENDIX J FILES
The definition of a file does not limit the type of the file components. You
can have files of any type, simple or structured.* Only INPUT and OUT-
PUT are predeclared as file variables (of type TEXT). All other file varia-
bles must be declared in the program. The following declarations are all
valid.
Notice that INFILE is a text file, MAIL is a file of arrays, WORDS is a file
of strings, and INVTRY is a file of records. Also declared are variables of
the component types in order to manipulate a component of each file.
File Buffers
You have access to only one component of a file at any one time. Whenever
you declare a file variable, you automatically create another variable
known as the file buffer variable. The buffer variable is denoted by the file
name followed by an up-arrow (↑). For example, the buffer variable for file
INFILE is written as
INFILE↑
*A file of files is not generally allowed. There may also be restrictions on the use of files as
elements of other structured types.
This buffer variable is the "window" through which you can either inspect
(read) or append (write) file components.
Whenever you do a READ or WRITE operation, you are actually manip-
ulating the file buffer variable. In Figure J-l(a) the character 'F' was read
last. The buffer variable, INFILE↑, contains the next component to be
read, in this case the character 'G'. The statement
READ(INFILE, CH)
actually assigns the value in INFILE↑ to CH, and INFILE↑ gets the next
component in the file. [See Figure J-1(b).] INFILE↑ now contains a blank
because it is accessing the <eoln> marker and the <eoln> character is
stored as a blank. EOLN is currently TRUE. The statement
READLN(INFILE)
would move the window to the component past the <eoln> marker, and
EOLN would be FALSE. [See Figure J-l(c).]
[Figure J-1. File INFILE and the position of the buffer variable INFILE↑ in parts (a), (b), and (c).]
READ(INFILE, CH)
is equivalent* to
CH := INFILE↑;
GET(INFILE)
If a
READ(INFILE, CH)
WRITE(OUTFILE, CH)
is equivalent to
OUTFILE↑ := CH;
PUT(OUTFILE)
OUTFILE↑ := CH
assigns a value to the buffer (assume CH contains the character 'C'). [See
Figure J-2(b).] Then the statement
PUT(OUTFILE)
*Note that this equivalence holds only if the variable CH is of the type CHAR.
[Figure J-2. File OUTFILE and the buffer variable OUTFILE↑ in parts (a), (b), and (c).]
would append the value in OUTFILE↑ to the file OUTFILE. [See Figure
J-2(c).] After the PUT operation, OUTFILE↑ is again undefined. Notice
that EOF is always TRUE for an output file. The foregoing example is
identical to what would have happened if the statement
WRITE(OUTFILE, CH)
(* uses READ & WRITE *)          (* uses GET & PUT *)
We still need the WRITELN to generate the <eoln> marker when dealing
with text files. However, the <eoln> marker does not exist for nontext files,
so, if FILEI and FILE2 are nontext files, the above code reduces to
The procedures READ and WRITE are not defined in standard Pascal
for nontext files, though some implementations do support this. So, for
nontext files, you must use GET and PUT and the file buffer variable. For
example, if INVTRY is a file of parts records, INVTRY↑ represents a record
in INVTRY, and
GET(INVTRY)
CHAPTER 1
1. This program will read in a series of student records, sort them by GPA,
and print out a listing of the student records by descending class rank.
Input:
The data for each student is free format, with each field separated by at
least one blank.
The fields for each record are in the following order:
Following the last record is a single 0 (in place of the next ID number),
to designate the end of the file.
Output:
The sorted listing should be output in the following format, ordered by
descending class rank, with appropriate headings:
4. Outline.
GET UP
GET DRESSED
IF HUNGRY
   GET BREAKFAST
GO TO CLASS

GET UP                                  LEVEL 1
TAKE SHOWER
GET CLOTHES
PUT ON CLOTHES

GET BREAKFAST                           LEVEL 1
GO TO KITCHEN
GET CEREAL
PUT CEREAL IN A BOWL
GET MILK
PUT MILK ON CEREAL
WHILE MORE CEREAL IN BOWL
   EAT
CLEAN UP MESS

GO TO CLASS                             LEVEL 1
LEAVE HOUSE
GO TO BUS STOP
WHILE NO BUS
   WAIT
GET ON BUS
WHILE NOT THERE
   STAY ON BUS
WALK TO CLASS
15. Logic errors are errors in the problem solution. These errors are found
during run time when your program either fails to run to completion or
gives the wrong answer.
Syntax errors are errors made in coding your solution in a programming
language. These errors are usually found during compile time.
16. F
17. Error 1: The parameter for Procedure INCREMENT should be a VAR
parameter.
Error 2: The BEGIN-END pair has been left off the body of the WHILE
loop. Only the WRITELN statement is actually within the WHILE
statement.
Error 3: The call to Procedure INCREMENT is within the comment.
18. Top-down testing makes use of stubs. Stubs are dummy procedures or
functions that stand in for procedures or functions that have not been
coded yet. You test the top levels of a program by running the program
with these stubs. At the next level of testing, you substitute the actual
procedures or functions for the stubs.
Drivers are used in bottom-up testing. Each procedure or function is
tested within a special program that invokes the procedure or function with
appropriate parameters and prints the results. These special programs are
called drivers.
19. This program will read in a set of records, each containing a name, a city,
and a zip code.
Sort the records in ascending order by last name, and print out the
information in table form.
Input:
The input data is in semifixed format, with one record per line. The fields
are:
col. 1-9 FIRSTNAME The name may begin anywhere in the
field. There may be embedded blanks,
hyphens, periods, etc.
col. 10 blank
col. 11-19 LASTNAME The name may begin anywhere in the
field. There may be embedded blanks,
hyphens, periods, etc.
col. 20 blank
col. 21-34 CITY Begins in col. 21. May include embedded
blanks or periods.
col. 35 blank
col. 36-40 ZIPCODE
You may assume that the input correctly conforms to this format.
The last line of the input has NOMORE beginning in col. 1 to designate
the end of the file.
Output:
The sorted records are to be printed in ascending order in the following
table format:
CHAPTER 2
1. CONST ARRAYLIMIT = 40;
TYPE ARRAYTYPE = ARRAY[1..ARRAYLIMIT] OF REAL;
VAR ONED : ARRAYTYPE;
2. The accessing function of a one-dimensional array consists of two parts: the
name of the collection of elements and an index that determines which
element in the collection is to be accessed.
For example, ONED[I] accesses the Ith element in the array defined in
question 1.
3. (a) 5
(b) 26
(c) 11
4. NUM[1] 0
NUM[3] 2
NUM[5] 4
LET['A'] 5
LET['N'] 18
LET['Z'] 30
FP[ - 4] 31
FP[O] 35
FP[6] 41
5. TYPE TWOD1TYPE = ARRAY[1..10, 'A'..'Z'] OF CHAR;
VAR TWOD1 : TWOD1TYPE;
6. TYPE VECTOR = ARRAY['A'..'Z'] OF CHAR;
CHAPTER 3
1. (a) 4     (b) 5       2. (a) 2     (b) 36
       2         X=1            4         25
       3         Y=5            6         16
       5                        8          9
       3                       10          1
3. OVERFLOW? UNDERFLOW?
S TOP = 4
[1] [2] [3] [4] [5] c= 'F'
4. OVERFLOW? UNDERFLOW?
S TOP = 5
[1] [2] [3] [4] [5] c= 'A'
5. OVERFLOW? UNDERFLOW?
s TOP = 0
[1] [2] [3] [4] C = 'B'
6. OVERFLOW? UNDERFLOW?
s TOP = 4
[1] [2] [3] [4] [5] C = 'B'
8. Two elements are PUSHed and three are POPed. POP (S, Y - 2) is illegal
because the second parameter is a VAR parameter.
9. VAR S1, S2 : STACKTYPE;
CH1, CH2 : CHAR;
COUNT : 0..100;
MATCHING : BOOLEAN;
I, HALFCT : 0..50;
CHAPTER 4
1. 8 2. 5
5 7
7 2
6 5
5
5. CLEAR
BEGIN
  UFLOW := EMPTYQ(QUEUE);
  IF NOT UFLOW
    THEN
      BEGIN
        IF QUEUE.FRONT = MAXQUEUE
          THEN QUEUE.FRONT := 1
          ELSE QUEUE.FRONT := QUEUE.FRONT + 1;
        DEQVAL := QUEUE.ELEMENTS[QUEUE.FRONT]
      END
END;
7. The procedures and functions are given below without comments unless
the code is not self-documenting. QUEUE.FRONT points to the front
element.
BEGIN
  QUEUE.FRONT := 1;
  QUEUE.REAR := MAXQUEUE;
  QUEUE.COUNT := 0
END;
(*******************************************)
FUNCTION FULLQ (QUEUE : QTYPE) : BOOLEAN;
BEGIN
  FULLQ := QUEUE.COUNT = MAXQUEUE
END;
(*******************************************)
FUNCTION EMPTYQ (QUEUE : QTYPE) : BOOLEAN;
BEGIN
  EMPTYQ := QUEUE.COUNT = 0
END;
(*******************************************)
PROCEDURE ENQ (VAR QUEUE : QTYPE;
               NEWVAL : ELTYPE);
BEGIN
  IF QUEUE.REAR = MAXQUEUE
    THEN QUEUE.REAR := 1
    ELSE QUEUE.REAR := QUEUE.REAR + 1;
  QUEUE.ELEMENTS[QUEUE.REAR] := NEWVAL;
  QUEUE.COUNT := QUEUE.COUNT + 1
END;
(*******************************************)
PROCEDURE DEQ (VAR QUEUE : QTYPE;
               VAR DEQVAL : ELTYPE);
BEGIN
  DEQVAL := QUEUE.ELEMENTS[QUEUE.FRONT];
  QUEUE.COUNT := QUEUE.COUNT - 1;
  IF QUEUE.FRONT = MAXQUEUE
    THEN QUEUE.FRONT := 1
    ELSE QUEUE.FRONT := QUEUE.FRONT + 1
END;
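In the spirit of the bottom-up testing described in the Chapter 1 answers, a small driver can exercise these routines. The sketch below makes several assumptions: the record layout of QTYPE, MAXQUEUE = 3 (kept small on purpose so that REAR and FRONT wrap around), ELTYPE = INTEGER, and the name CLEARQ for the initialization routine are all chosen for illustration.

```pascal
program QueueDemo(output);

const
  MAXQUEUE = 3;                     (* small on purpose, to force wraparound *)

type
  ELTYPE = integer;                 (* assumed element type *)
  QTYPE = record                    (* assumed layout of the queue record *)
            ELEMENTS : array [1..MAXQUEUE] of ELTYPE;
            FRONT, REAR : 1..MAXQUEUE;
            COUNT : 0..MAXQUEUE
          end;

var
  Q : QTYPE;
  X : ELTYPE;

procedure CLEARQ (var QUEUE : QTYPE);   (* hypothetical name *)
begin
  QUEUE.FRONT := 1;
  QUEUE.REAR := MAXQUEUE;
  QUEUE.COUNT := 0
end;

function EMPTYQ (QUEUE : QTYPE) : boolean;
begin
  EMPTYQ := QUEUE.COUNT = 0
end;

procedure ENQ (var QUEUE : QTYPE; NEWVAL : ELTYPE);
begin
  if QUEUE.REAR = MAXQUEUE
    then QUEUE.REAR := 1
    else QUEUE.REAR := QUEUE.REAR + 1;
  QUEUE.ELEMENTS[QUEUE.REAR] := NEWVAL;
  QUEUE.COUNT := QUEUE.COUNT + 1
end;

procedure DEQ (var QUEUE : QTYPE; var DEQVAL : ELTYPE);
begin
  DEQVAL := QUEUE.ELEMENTS[QUEUE.FRONT];
  QUEUE.COUNT := QUEUE.COUNT - 1;
  if QUEUE.FRONT = MAXQUEUE
    then QUEUE.FRONT := 1
    else QUEUE.FRONT := QUEUE.FRONT + 1
end;

begin
  CLEARQ(Q);
  ENQ(Q, 1);
  ENQ(Q, 2);
  ENQ(Q, 3);                        (* queue is now full *)
  DEQ(Q, X);
  writeln('Removed ', X);
  ENQ(Q, 4);                        (* REAR wraps around to position 1 *)
  while not EMPTYQ(Q) do
    begin
      DEQ(Q, X);
      writeln('Removed ', X)
    end
end.
```

The values come out in the order 1, 2, 3, 4, which confirms that first-in, first-out order survives the wraparound of both REAR and FRONT.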
BEGIN
  IF DEQUE.FRONT = 1
    THEN DEQUE.FRONT := MAXDEQUE
    ELSE DEQUE.FRONT := DEQUE.FRONT - 1;
  DEQUE.ELEMENTS[DEQUE.FRONT] := NEWVAL;
  DEQUE.COUNT := DEQUE.COUNT + 1
END;
(c) PROCEDURE INDEQUEREAR is the same as the ENQ routine for
question 7.
(d) PROCEDURE OUTDEQUEFRONT is the same as the DEQ routine
for question 7.
(e) PROCEDURE OUTDEQUEREAR (VAR DEQUE : DEQTYPE;
                            VAR DEQVAL : ELTYPE);
BEGIN
  DEQVAL := DEQUE.ELEMENTS[DEQUE.REAR];
  DEQUE.COUNT := DEQUE.COUNT - 1;
  IF DEQUE.REAR = 1
    THEN DEQUE.REAR := MAXDEQUE
    ELSE DEQUE.REAR := DEQUE.REAR - 1
END;
(f) Deque applications are usually generalizations of queue operations. For
example, the data structure used to simulate a terminal input is a
modified deque. Characters are entered into the deque as they are
typed, but they can be removed from the front by the read operation or
from the rear by the rubout or delete operation.
BEGIN
FOUND := FALSE;
COUNTER := 1;
WHILE NOT FOUND AND (COUNTER <= 8) DO
BEGIN
FOUND := NOT EMPTYQ(JOBS[COUNTER]);
IF FOUND
THEN DEQ(JOBS[COUNTER], TOKEN)
ELSE COUNTER := COUNTER + 1
END;
ERROR := NOT FOUND
END;
(c) PROCEDURE CLEANUPJOBS (VAR JOBS : JOBTYPE);
BEGIN
FOR COUNTER := 1 TO 8 DO
WHILE NOT EMPTYQ(JOBS[COUNTER]) DO
BEGIN
DEQ(JOBS[COUNTER], TOKEN);
NOTIFY(TOKEN, MESSAGE7)
END
END;
CHAPTER 5
1. (a) [3].INFO becomes 17
[3].NEXT becomes 6
[10].NEXT becomes 3
AVAIL becomes 2
(b) [3].NEXT becomes 0
[10].NEXT becomes 2
AVAIL becomes 10
FREENODE(Q)
END
ELSE WRITELN( 'NO ITH ELEMENT');
6.
PROCEDURE INSERT (VAR START : PTR; Q : PTR);
BEGIN
P := START;
IF NODES[P].INFO > NODES[Q].INFO
(* If Q goes before first node. *)
THEN
BEGIN
NODES[Q].NEXT := START;
START := Q
END
ELSE
BEGIN
PLACEFOUND := FALSE;
WHILE (NODES[P].NEXT <> NULL) AND NOT PLACEFOUND DO
IF NODES[NODES[P].NEXT].INFO < NODES[Q].INFO
THEN P := NODES[P].NEXT
ELSE PLACEFOUND := TRUE;
NODES[Q].NEXT := NODES[P].NEXT;
NODES[P].NEXT := Q
END
END; (* insert *)
7. (* P, TEMP: PTR *)
TEMP := START;
START := NODES[START].NEXT;
(* Initialize TEMP with first node from START *)
NODES[TEMP].NEXT := NULL;
WHILE START <> NULL DO (* while more elements in list *)
BEGIN
P := START; (* Put them in sorted list TEMP. *)
START := NODES[START].NEXT;
INSERT(TEMP, P)
END;
START := TEMP; (* TEMP pointed to sorted list. *)
8. Algorithm: Traverse both lists, comparing the first nodes in each list.
Move the node with the smaller INFO field to NEWLIST. We must add
to the end of NEWLIST. (Let LASTNODE be a pointer to the last node in
NEWLIST.) When one list is empty, move the rest of the other list to the
end of NEWLIST. It would be convenient to have a short procedure that
would move the designated node from the beginning of its list to the end
of NEWLIST. Let us call it MOVE:
PROCEDURE MOVE (VAR P, NEWLIST, LASTNODE : PTR);
BEGIN
IF NEWLIST = NULL (* If it will be first node in NEWLIST. *)
THEN NEWLIST := P
ELSE NODES[LASTNODE].NEXT := P;
LASTNODE := P;
P := NODES[P].NEXT
END; (* move *)
(* Assumes at least one element in each list. Saves duplicate nodes. *)
9. Both linked lists (LIST and AVAIL) end in the same place. Is NODE[7] on
the available space list or on the ordered linked list pointed to by LIST?
10. We need one NULL value to denote the last node in each linked list. This
array contains two linked lists: LIST and AVAIL.
VAR P : PTR;
POS : BOOLEAN;
BEGIN
P := LIST;
POS := FALSE;
IF P = NULL
THEN WRITELN( 'EMPTY LIST')
ELSE REPEAT
P := NODES[P].NEXT;
IF NODES[P].INFO > 0
THEN
BEGIN
WRITELN(NODES[P].INFO);
POS := TRUE
END
UNTIL P = LIST;
IF NOT POS THEN WRITELN( 'NO POSITIVE ELEMENTS')
END;
VAR P : PTR;
BEGIN
GETNODE(P);
NODES[P].INFO := X;
IF Q = NULL (* If Q is empty. *)
THEN NODES[P].NEXT := P
ELSE
BEGIN
NODES[P].NEXT := NODES[Q].NEXT;
NODES[Q].NEXT := P;
END;
Q := P
END;
13. PROCEDURE DEQ (VAR Q : PTR; VAR X : ELTYPE);
VAR P : PTR;
BEGIN
P := NODES[Q].NEXT;
X := NODES[P].INFO;
IF P = Q (* Only one node in queue. *)
THEN Q := NULL
ELSE NODES[Q].NEXT := NODES[P].NEXT;
FREENODE(P)
END; (* deq *)
CHAPTER 6
1. (a) P := P^.NEXT
(b) Q := P
(c) R := P^.NEXT
(f) R^.NEXT := P
(The pointer diagrams that accompany each part are not reproduced here.)
3. (a) 1 1
(b) 2 NULL
4. F
5. NULL NULL
6. (a) Employees are assigned EMPNOs from 1 to 1000. Use zero in header
node.
NEW(P);
P^.EMPNO := 0;
P^.NEXT := NIL;
EMPLOYEES := P;
(b) VAR I : INTEGER;
CH : CHAR;
DEPT : 1..20;
EMPNO : 1..1000;
NAME : ARRAY[1..25] OF CHAR;
SALARY : INTEGER;
NEW(P);
FOR I := 1 TO 25 DO
BEGIN
READ(CH);
NAME[I] := CH
END;
P^.NAME := NAME;
READ(DEPT);
P^.DEPTNO := DEPT;
READ(EMPNO);
P^.EMPNO := EMPNO;
READ(SALARY);
P^.SAL := SALARY;
(c) PROCEDURE INSERT (EMPLOYEES, EMP : PTR);
BEGIN
PLACEFOUND := FALSE;
BACK := NIL;
P := EMPLOYEES; (* Initialize pointers. *)
WHILE (P <> NIL) AND NOT PLACEFOUND DO
Since we are not using a header node, it is possible that we will try to
delete the first (or only) node in the list. So we will need to pass the
external pointer to the list as a parameter.
END;
DISPOSE (P) ;
END; (* delete *)
BEGIN
IF P^.BACK = NIL (* first or only node *)
THEN
BEGIN
LIST := P^.NEXT;
IF LIST <> NIL THEN LIST^.BACK := NIL
END
ELSE (* middle or last node *)
BEGIN
P^.BACK^.NEXT := P^.NEXT;
IF P^.NEXT <> NIL
THEN P^.NEXT^.BACK := P^.BACK
END;
DISPOSE(P)
END; (* delete2 *)
BEGIN
P := EMPLIST;
B := NIL;
(Figure not reproduced: P advances along the list EMPLIST while
EMP^.EMPNUM > P^.EMPNUM, stopping at the place where EMP is inserted.)
VAR P : PTR;
BEGIN
P := EMPLIST;
WHILE EMP^.EMPNUM > P^.EMPNUM DO
P := P^.NEXT; (* Finds insert place. *)
CHAPTER 7
1. FUNCTION LESSTHAN (EL1, EL2 : ETYPE) : BOOLEAN;
BEGIN
IF SUCC(EL1) = EL2
THEN LESSTHAN := TRUE
ELSE IF SUCC(EL1) = LASTEL
THEN LESSTHAN := FALSE
ELSE LESSTHAN := LESSTHAN(SUCC(EL1), EL2)
END; (* lessthan *)
2. PROCEDURE ORDERPRINT (LIST : PTR);
BEGIN
IF LIST <> NIL
THEN
BEGIN
WRITELN(LIST^.INFO);
ORDERPRINT(LIST^.NEXT)
END
END;
3. FUNCTION SUMSQRS (LIST : PTR) : INTEGER;
BEGIN
IF LIST = NIL
THEN SUMSQRS := 0
ELSE SUMSQRS := SQR(LIST^.INFO)
+ SUMSQRS(LIST^.NEXT)
END; (* sumsqrs *)
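To see the recursion in answer 3 run, the function can be wrapped in a short program. This is only a sketch: the node layout, the helper PUSH, and the sample values 1, 2, 3 are assumptions chosen for the demonstration. Each call handles one node and hands the rest of the list to the next call; the empty list is the trivial case.

```pascal
PROGRAM SUMDEMO;
(* Sketch only: the node layout, PUSH, and the sample data are assumptions. *)
TYPE
  PTR = ^NODE;
  NODE = RECORD
           INFO : INTEGER;
           NEXT : PTR
         END;
VAR
  LIST : PTR;
  I : INTEGER;

PROCEDURE PUSH (VAR LIST : PTR; V : INTEGER);
(* Adds a new node holding V to the front of LIST. *)
VAR
  P : PTR;
BEGIN
  NEW(P);
  P^.INFO := V;
  P^.NEXT := LIST;
  LIST := P
END;

FUNCTION SUMSQRS (LIST : PTR) : INTEGER;
BEGIN
  IF LIST = NIL
    THEN SUMSQRS := 0                       (* trivial case: empty list *)
    ELSE SUMSQRS := SQR(LIST^.INFO)
                    + SUMSQRS(LIST^.NEXT)   (* smaller case: rest of list *)
END;

BEGIN
  LIST := NIL;
  FOR I := 1 TO 3 DO
    PUSH(LIST, I);
  WRITELN(SUMSQRS(LIST))   (* 9 + 4 + 1 = 14 *)
END.
```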
4. (a) Will not work. A "smaller" case is called, but TEMP is reinitialized
each time to LIST. Therefore TEMP will never be nil.
(b) Calculates the product of the positive elements in the list.
(c) Works but does nothing. It always returns FALSE.
(d) Will not work. No trivial nonrecursive case exists.
5. (a) 4 (b) 0 (c) 1
6. (a) Trivial case: POWER:= 1
Recursive call: POWER(M, N - 1)
(b) Trivial case: P = NIL (* Nothing is done. *)
Recursive call: PRINT(P^.NEXT)
(c) Trivial case: FACT:= 1
Recursive call: FACT(N - 1)
(d) Trivial case: LL = UL (* Do nothing. *)
Recursive call: SORT(A, LL, UL - 1)
CHAPTER 8
1. (a) through (f): (Tree diagrams not reproduced. The surviving labels are
SAME and the node values 43, 55, 56, 33, 47, and 57.)
2. FOX
BOX ELF
3. If the data were ordered in ascending order, the tree would have no left
branches. If the data were ordered in descending order, the tree would
have no right branches. For example: 1 2 3 4 5 6 ...
3
4
BEGIN
IF ROOT <> NIL THEN
BEGIN
STANDING(ROOT^.RIGHT);
WRITE(ROOT^.GPA : 4 : 2);
WRITE(ROOT^.FIRSTNAME);
WRITE(ROOT^.LASTNAME);
STANDING(ROOT^.LEFT)
END
END;
5. PROCEDURE FEMPRINT (ROOT : TPTR);
BEGIN
IF ROOT <> NIL THEN
BEGIN
FEMPRINT(ROOT^.LEFT);
IF ROOT^.SEX = FEMALE
THEN WRITELN(ROOT^.FIRSTNAME, ROOT^.LASTNAME);
FEMPRINT(ROOT^.RIGHT)
END
END;
6. FUNCTION SUMSQRS (ROOT : PTR) : INTEGER;
BEGIN
IF ROOT = NIL
THEN SUMSQRS := 0
ELSE SUMSQRS := SQR(ROOT^.INFO)
+ SUMSQRS(ROOT^.LEFT)
+ SUMSQRS(ROOT^.RIGHT)
END;
7. VAR P : PTR;
BEGIN
P := ROOT;
WHILE P^.INFO <> NUM DO
BEGIN
WRITELN(P^.INFO);
IF NUM > P^.INFO
THEN P := P^.RIGHT
ELSE P := P^.LEFT
END
END;
8. (a) BEGIN
IF P^.INFO <> NUM THEN
BEGIN
WRITELN(P^.INFO);
IF NUM > P^.INFO
THEN ANCESTOR(P^.RIGHT, NUM)
ELSE ANCESTOR(P^.LEFT, NUM)
END
END;
(b) ANCESTOR (ROOT, 14);
9. BACK ← P
TEMP ← RIGHT(P) (* Move to the right. *)
(* Move as far left as possible. *)
WHILE LEFT(TEMP) <> NULL DO
  BACK ← TEMP
  TEMP ← LEFT(TEMP)
INFO(P) ← INFO(TEMP)
IF P = BACK (* P's right child has no left child. *)
THEN RIGHT(BACK) ← RIGHT(TEMP)
ELSE LEFT(BACK) ← RIGHT(TEMP)
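The design in answer 9 can be tried out in Pascal. What follows is a sketch under stated assumptions, not the book's code: the pointer-based node layout, the helper procedures INSERT and INORDER, the name DELETETWO, the sample keys, and the final DISPOSE are all additions made for illustration. The procedure assumes that P has two children, which is the case the design addresses.

```pascal
PROGRAM DELDEMO;
(* Sketch only: node layout, helper names, and sample keys are assumptions. *)
TYPE
  PTR = ^NODE;
  NODE = RECORD
           INFO : INTEGER;
           LEFT, RIGHT : PTR
         END;
VAR
  ROOT : PTR;

PROCEDURE INSERT (VAR T : PTR; V : INTEGER);
BEGIN
  IF T = NIL THEN
    BEGIN
      NEW(T);
      T^.INFO := V;
      T^.LEFT := NIL;
      T^.RIGHT := NIL
    END
  ELSE IF V < T^.INFO
    THEN INSERT(T^.LEFT, V)
    ELSE INSERT(T^.RIGHT, V)
END;

PROCEDURE DELETETWO (P : PTR);
(* Removes the value in node P, which must have two children, by
   overwriting it with its inorder successor and unlinking that node. *)
VAR
  BACK, TEMP : PTR;
BEGIN
  BACK := P;
  TEMP := P^.RIGHT;                  (* move to the right *)
  WHILE TEMP^.LEFT <> NIL DO         (* move as far left as possible *)
    BEGIN
      BACK := TEMP;
      TEMP := TEMP^.LEFT
    END;
  P^.INFO := TEMP^.INFO;
  IF P = BACK                        (* P's right child has no left child *)
    THEN BACK^.RIGHT := TEMP^.RIGHT
    ELSE BACK^.LEFT := TEMP^.RIGHT;
  DISPOSE(TEMP)                      (* not in the design; added here *)
END;

PROCEDURE INORDER (T : PTR);
BEGIN
  IF T <> NIL THEN
    BEGIN
      INORDER(T^.LEFT);
      WRITE(T^.INFO, ' ');
      INORDER(T^.RIGHT)
    END
END;

BEGIN
  ROOT := NIL;
  INSERT(ROOT, 50); INSERT(ROOT, 30); INSERT(ROOT, 70);
  INSERT(ROOT, 60); INSERT(ROOT, 80); INSERT(ROOT, 65);
  DELETETWO(ROOT);     (* 50 is replaced by its successor, 60 *)
  INORDER(ROOT);       (* keys in order: 30 60 65 70 80 *)
  WRITELN
END.
```

In this run the successor 60 has a right child (65), which is why the design reattaches RIGHT(TEMP) rather than simply setting the link to NULL.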
CHAPTER 9
1. (a) Preorder: + * a b / + c d e
Inorder: (a * b) + ((c + d) / e)
Postorder: a b * c d + e / +
(b) Preorder: * / + a b c d
Inorder: ((a + b) / c) * d
Postorder: a b + c / d *
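The three traversals in answer 1(a) can be checked mechanically. The program below is a minimal sketch: the node type, the helpers MAKE and LEAF, and the choice to print parentheses around every operator node during the inorder traversal are assumptions made for this illustration. It builds the tree for (a * b) + ((c + d) / e) by hand and prints each traversal on its own line.

```pascal
PROGRAM TRAVDEMO;
(* Sketch only: node layout and the helpers MAKE and LEAF are assumptions. *)
TYPE
  TREEPTR = ^TREENODE;
  TREENODE = RECORD
               SYM : CHAR;
               LEFT, RIGHT : TREEPTR
             END;
VAR
  T : TREEPTR;

FUNCTION MAKE (SYM : CHAR; L, R : TREEPTR) : TREEPTR;
VAR
  P : TREEPTR;
BEGIN
  NEW(P);
  P^.SYM := SYM;
  P^.LEFT := L;
  P^.RIGHT := R;
  MAKE := P
END;

FUNCTION LEAF (SYM : CHAR) : TREEPTR;
BEGIN
  LEAF := MAKE(SYM, NIL, NIL)
END;

PROCEDURE PREORDER (T : TREEPTR);
BEGIN
  IF T <> NIL THEN
    BEGIN
      WRITE(T^.SYM);
      PREORDER(T^.LEFT);
      PREORDER(T^.RIGHT)
    END
END;

PROCEDURE INORDER (T : TREEPTR);
(* Operator nodes are parenthesized so the printed form is unambiguous. *)
BEGIN
  IF T <> NIL THEN
    IF T^.LEFT = NIL
      THEN WRITE(T^.SYM)
      ELSE
        BEGIN
          WRITE('(');
          INORDER(T^.LEFT);
          WRITE(T^.SYM);
          INORDER(T^.RIGHT);
          WRITE(')')
        END
END;

PROCEDURE POSTORDER (T : TREEPTR);
BEGIN
  IF T <> NIL THEN
    BEGIN
      POSTORDER(T^.LEFT);
      POSTORDER(T^.RIGHT);
      WRITE(T^.SYM)
    END
END;

BEGIN
  T := MAKE('+',
            MAKE('*', LEAF('a'), LEAF('b')),
            MAKE('/', MAKE('+', LEAF('c'), LEAF('d')), LEAF('e')));
  PREORDER(T); WRITELN;    (* +*ab/+cde *)
  INORDER(T); WRITELN;     (* ((a*b)+((c+d)/e)) *)
  POSTORDER(T); WRITELN    (* ab*cd+e/+ *)
END.
```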
2. 5
3.
9
4. PROCEDURE PREORDERPRINT (TREE : TREEPTR);
BEGIN
IF TREE <> NIL
THEN
BEGIN
IF TREE^.CONTENTS = OPERAND
THEN WRITE(TREE^.VAL)
ELSE WRITE(TREE^.OPER);
PREORDERPRINT(TREE^.LEFT);
PREORDERPRINT(TREE^.RIGHT)
END
END;
6.
[1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
7. 95
52 13
35
35
26 47
9.
[1] [2] [3] [4] [5] [6] [7] [8] [9]
10.                 [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
Original             W   L   T   G   B   P   S   F   D   A
1 value in place     A   L   T   G   B   P   S   F   D   W
Reheap               T   L   S   G   B   P   A   F   D   W
2 values in place    D   L   S   G   B   P   A   F   T   W
Reheap               S   L   P   G   B   D   A   F   T   W
3 values in place    F   L   P   G   B   D   A   S   T   W
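The trace in answer 10 can be reproduced with a short program. This is a sketch, not the chapter's HEAPSORT: the procedure names REHEAP, SWAPAT, and SHOW, their parameter lists, and the use of the STRING type (a common extension, available in Free Pascal and Turbo Pascal, not standard Pascal) are assumptions. It starts from the heap W L T G B P S F D A and prints the array after each swap and each reheap, for the first three values only.

```pascal
PROGRAM HEAPTRACE;
(* Sketch only: names, parameters, and the STRING type are assumptions. *)
CONST
  N = 10;
TYPE
  LISTTYPE = ARRAY[1..N] OF CHAR;
VAR
  DATA : LISTTYPE;
  START : STRING;
  I, K, LAST : INTEGER;

PROCEDURE SHOW (VAR DATA : LISTTYPE);
VAR
  I : INTEGER;
BEGIN
  FOR I := 1 TO N DO
    WRITE(DATA[I], ' ');
  WRITELN
END;

PROCEDURE SWAPAT (VAR DATA : LISTTYPE; A, B : INTEGER);
VAR
  T : CHAR;
BEGIN
  T := DATA[A];
  DATA[A] := DATA[B];
  DATA[B] := T
END;

PROCEDURE REHEAP (VAR DATA : LISTTYPE; ROOT, BOTTOM : INTEGER);
(* Sifts DATA[ROOT] down until DATA[ROOT]..DATA[BOTTOM] is a heap again. *)
VAR
  CHILD : INTEGER;
  DONE : BOOLEAN;
BEGIN
  DONE := FALSE;
  WHILE (2 * ROOT <= BOTTOM) AND NOT DONE DO
    BEGIN
      CHILD := 2 * ROOT;
      IF CHILD < BOTTOM THEN
        IF DATA[CHILD + 1] > DATA[CHILD] THEN CHILD := CHILD + 1;
      IF DATA[CHILD] > DATA[ROOT]
        THEN
          BEGIN
            SWAPAT(DATA, ROOT, CHILD);
            ROOT := CHILD
          END
        ELSE DONE := TRUE
    END
END;

BEGIN
  START := 'WLTGBPSFDA';
  FOR I := 1 TO N DO
    DATA[I] := START[I];
  WRITE('ORIGINAL: '); SHOW(DATA);
  LAST := N;
  FOR K := 1 TO 3 DO
    BEGIN
      SWAPAT(DATA, 1, LAST);         (* largest value goes into place *)
      LAST := LAST - 1;
      WRITE('IN PLACE: '); SHOW(DATA);
      IF K < 3 THEN
        BEGIN
          REHEAP(DATA, 1, LAST);     (* restore the heap in DATA[1]..DATA[LAST] *)
          WRITE('REHEAP:   '); SHOW(DATA)
        END
    END
END.
```

Letting the loop run until LAST reaches 1 (with a reheap after every swap) would complete the sort.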
11. (a)
(b)
~
~
12. (a)
(b)
21
13. (a) MA BI SU GE JO SA
MARY 0 1 0 1 0 0
BILL 1 0 1 0 0 0
SUSAN 0 0 0 0 1 0
GEORGE 0 0 1 0 0 1
JOSH 1 0 0 0 0 0
SARAH 0 1 0 0 1 0
(b)       0   1   2   3   5   8  13  21
     0    0   1   0   0   0   0   0   0
     1    0   0   1   0   0   0   0   0
     2    0   0   0   1   0   0   0   0
     3    0   0   0   0   1   0   0   0
     5    0   0   0   0   0   1   0   0
     8    0   0   0   0   0   0   1   0
    13    0   0   0   0   0   0   0   1
    21    0   0   0   0   0   0   0   0
CHAPTER 10
1. {I = 4}
{K = 9}
{N = I}
{DATA[I] = J}
2. {TRUE}
IF X < Y
THEN {X < Y}{X = MIN(X, Y)}
MIN := X {MIN = MIN(X, Y)}
ELSE {X >= Y}{Y = MIN(X, Y)}
MIN := Y {MIN = MIN(X, Y)}
{MIN = MIN(X, Y)}
3. VALU := 1;
I := 2;
WHILE I <= N DO
BEGIN
VALU := VALU * DATA[I];
I := I + 1
END;
4. At initialization, at termination, and at the top of the loop (i.e., invariant
preserved)
5. {A = 5 AND B = 4}
BEGIN
TEMP := A; {B = 4 AND A = 5 AND TEMP = 5}
A := B; {B = 4 AND TEMP = 5 AND A = 4}
B := TEMP {B = 5 AND TEMP = 5 AND A = 4}
END;
CHAPTER 11
1. Inner loop of SELECTSORT:
DATA[MINDEX] <= DATA[I]..DATA[J - 1] and J <= N + 1
Initialization: MINDEX is initialized to I, and J is initialized to I + 1.
This gives DATA[I] <= DATA[I]..DATA[I], which is true. To show that
J <= N + 1, we again substitute for J and get I + 1 <= N + 1. To show
that this is true, we must know that I <= N. This is part of the outer loop
invariant, so we will assume it as a precondition.
Termination: The loop condition is no longer true; therefore we know
that J > N. Since our loop invariant says that J <= N + 1, J must be equal
to N + 1. Substituting this in our loop invariant, we have
DATA[MINDEX] <= DATA[I]..DATA[N], which is just what we want.
Preservation: We know that
DATA[MINDEX] <= DATA[I]..DATA[J - 1] because it is the loop
invariant. If DATA[J] is less than DATA[MINDEX], then MINDEX
becomes J and we now know that
DATA[MINDEX] <= DATA[I]..DATA[J]. If DATA[J] is not less than
DATA[MINDEX], we also can say that
DATA[MINDEX] <= DATA[I]..DATA[J].
J is incremented by 1. Using the assignment rule, we have our loop
invariant back: DATA[MINDEX] <= DATA[I]..DATA[J - 1].
Preservation (outer loop): The body of the outer loop contains three parts:
the inner loop, a call to a swapping procedure, and the statement that
increments the loop control counter. Since we have proved the inner loop, we
can take its terminating condition as an assertion here. Since the swapping
algorithm was proven as an exercise in Chapter 10, we will assume that it does
what we expect. (In later courses when you go into more depth on program
verification, you will learn rules about verifying procedures.)
Let us summarize what we know on exit from the inner loop:
DATA[1]..DATA[I - 1] is sorted. (outer loop invariant)
DATA[MINDEX] <= DATA[I]..DATA[N] (terminating condition of
inner loop)
DATA[1]..DATA[I - 1] <= DATA[I]..DATA[N] (outer loop invariant)
The next statement swaps the contents of DATA[MINDEX] and
DATA[I]. So now we can say that DATA[I] <= DATA[I]..DATA[N]. Since
DATA[I] is less than or equal to any value in DATA[I]..DATA[N], yet is
greater than or equal to any value in DATA[1]..DATA[I - 1], the following
statements are true:
DATA[1]..DATA[I] <= DATA[I + 1]..DATA[N]
DATA[1]..DATA[I] is sorted.
The final statement in the loop increments I by 1. Applying the rule of
assignment statements, we have just what we want:
DATA[1]..DATA[I - 1] <= DATA[I]..DATA[N]
DATA[1]..DATA[I - 1] is sorted.
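The invariants argued above can also be checked at run time. The following sketch is not the chapter's SELECTSORT: the WHILE-loop structure, the checking functions MINOK and SORTED, and the sample data are assumptions made for illustration. It tests the inner invariant after every pass through the inner loop body and the "sorted" part of the outer invariant after every pass through the outer loop body, printing a message only if one fails.

```pascal
PROGRAM SELDEMO;
(* Sketch only: loop structure, helper functions, and data are assumptions. *)
CONST
  N = 6;
TYPE
  LISTTYPE = ARRAY[1..N] OF INTEGER;
VAR
  DATA : LISTTYPE;
  I, J, MINDEX, TEMP : INTEGER;

FUNCTION MINOK (VAR DATA : LISTTYPE; MINDEX, LO, HI : INTEGER) : BOOLEAN;
(* TRUE if DATA[MINDEX] <= every value in DATA[LO]..DATA[HI]. *)
VAR
  K : INTEGER;
  OK : BOOLEAN;
BEGIN
  OK := TRUE;
  FOR K := LO TO HI DO
    IF DATA[K] < DATA[MINDEX] THEN OK := FALSE;
  MINOK := OK
END;

FUNCTION SORTED (VAR DATA : LISTTYPE; LO, HI : INTEGER) : BOOLEAN;
(* TRUE if DATA[LO]..DATA[HI] is in ascending order. *)
VAR
  K : INTEGER;
  OK : BOOLEAN;
BEGIN
  OK := TRUE;
  FOR K := LO TO HI - 1 DO
    IF DATA[K] > DATA[K + 1] THEN OK := FALSE;
  SORTED := OK
END;

BEGIN
  DATA[1] := 5; DATA[2] := 2; DATA[3] := 9;
  DATA[4] := 1; DATA[5] := 5; DATA[6] := 6;
  I := 1;
  WHILE I <= N - 1 DO
    BEGIN
      MINDEX := I;
      J := I + 1;
      WHILE J <= N DO
        BEGIN
          IF DATA[J] < DATA[MINDEX] THEN MINDEX := J;
          J := J + 1;
          (* inner invariant: DATA[MINDEX] <= DATA[I]..DATA[J - 1] *)
          IF NOT MINOK(DATA, MINDEX, I, J - 1)
            THEN WRITELN('INNER INVARIANT FAILED')
        END;
      TEMP := DATA[I];
      DATA[I] := DATA[MINDEX];
      DATA[MINDEX] := TEMP;
      I := I + 1;
      (* outer invariant: DATA[1]..DATA[I - 1] is sorted *)
      IF NOT SORTED(DATA, 1, I - 1)
        THEN WRITELN('OUTER INVARIANT FAILED')
    END;
  FOR I := 1 TO N DO
    WRITE(DATA[I], ' ');
  WRITELN
END.
```

A run-time check of this kind exercises only the data it is given; it complements the proof and does not replace it.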
2. O(N^2): insertion, selection, bubble
O(N log2 N): quicksort, heapsort
3. (1) Size of N: If the number of elements is small, an O(N^2) sort may run
faster than an O(N log2 N) sort because of the overhead involved in a
more complex algorithm.
(2) Original order of the elements: Some sorts vary widely in time
required depending on the original order of the data. For example, if
the elements are already in sorted order, the bubble sort, which
recognizes when elements are sorted, will be only O(N), but quicksort
will be O(N^2).
4. (a) Selection
(b) Bubble
(c) Insertion (Remember that the outer loop is initialized to 2, so the first
three elements are sorted among themselves after the second iteration.)
5. (a) 20 31 a1 58 58 15
(b) 15 20 31 a1 58 58
(c) 15 20 31 58 58 a1
6. The selection sort given in this chapter sorts from lowest to highest. You
can sort from highest to lowest by looking for the maximum element left in
the unsorted portion of the array each time instead of looking for the
minimum element.
In this problem, after the data have been read into an array of records,
use a selection sort ordering the data in decreasing order. When you have
gone through the outer loop 10 times, you will have the 10 winners.
7. (Column headings were not recoverable; entries are listed in their original order.)
Data are in completely random order:      DM    DM  BEST  DM
Data are ordered from lowest to highest:  BEST  DM  DM    BEST   WORST  DM
Data are ordered from highest to lowest:  WORST DM  DM    WORST  WORST  DM
8. The strategy here is to use the percentile as an index into an array where a
count is kept of the number of times that percentile has occurred. Since
only the percentiles are needed, the rest of the data record can be ignored.
PROCEDURE PERCENTILELIST;
BEGIN
(* Initialize counters to zero. *)
FOR CT := 0 TO 100 DO
COUNTS[CT] := 0;
(* Read and count SAT percentile scores. *)
WHILE NOT EOF(DATA) DO
BEGIN
READLN(DATA, ID, SAT);
COUNTS[SAT] := COUNTS[SAT] + 1
END;
(* Print the SAT percentile scores, including duplicates. *)
FOR SAT := 100 DOWNTO 0 DO
FOR CT := 1 TO COUNTS[SAT] DO
WRITELN(SAT)
END;
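The counting strategy in answer 8 can be tried without an input file. In the sketch below the scores come from a small in-memory array instead of the file DATA; the constant NUMSCORES, the array SCORES, and the sample values are assumptions made for this demonstration. No element is ever compared with another: each score is tallied once, and the tallies are then read back from 100 down to 0.

```pascal
PROGRAM PCTDEMO;
(* Sketch only: NUMSCORES, SCORES, and the sample values are assumptions. *)
CONST
  NUMSCORES = 6;
VAR
  COUNTS : ARRAY[0..100] OF INTEGER;
  SCORES : ARRAY[1..NUMSCORES] OF INTEGER;
  I, SAT, CT : INTEGER;
BEGIN
  SCORES[1] := 87; SCORES[2] := 42; SCORES[3] := 99;
  SCORES[4] := 87; SCORES[5] := 0;  SCORES[6] := 42;
  (* Initialize counters to zero. *)
  FOR CT := 0 TO 100 DO
    COUNTS[CT] := 0;
  (* Count how many times each percentile occurs. *)
  FOR I := 1 TO NUMSCORES DO
    COUNTS[SCORES[I]] := COUNTS[SCORES[I]] + 1;
  (* Print the percentiles from highest to lowest, including duplicates. *)
  FOR SAT := 100 DOWNTO 0 DO
    FOR CT := 1 TO COUNTS[SAT] DO
      WRITELN(SAT)         (* prints 99, 87, 87, 42, 42, 0 *)
END.
```

The work done is proportional to the number of scores plus the 101 possible percentile values, which is why this approach suits keys drawn from a small, known range.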
9. N^3 + 100 = O(N^3)
3N^2 + N + 500 = O(N^2)
O(N^2) is more efficient than O(N^3).
See answer to question 4.
10. Encapsulating frequently used code within a routine hides details and
makes the code easier to read and understand. However, execution time is
shorter when the code is expanded inline.
11. When determining efficiency, there are three resources to consider:
memory space, execution time, and programmer time.
It will take a programmer longer to write and debug a complex
algorithm. If the code is to be executed many times, the investment in a
more efficient algorithm may be worthwhile. If the code is to be executed
only once or twice, a simple algorithm that takes more time to run but less
time to write and debug would be more efficient.
CHAPTER 12
1. NUMBER OF COMPARISONS
KEYVAL   Sequential Ordered Search   Sequential Unordered Search   Binary Search
26       1                           1                             3
2        1                           9                             (illegible)
96       3                           3                             3
98       4                           9                             4
103      6                           9                             3
243      9                           9                             4
244      9                           9                             4
2.
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
3. Remember that the KEY in the rehash function is the hash address from
the previous hash function.
[0] [ 1] [2] [3] [4] [5] [6] [7] [8] [9] [ 10]
4. PROCEDURE MOVE1 (VAR DATA : ARRAYTYPE;
NUMEL,
KEYVAL : INTEGER;
VAR FOUND : BOOLEAN);
BEGIN
FOUND := FALSE;
CT := 1;
WHILE (CT <= NUMEL) AND NOT FOUND DO
IF DATA[CT] = KEYVAL
THEN FOUND := TRUE
ELSE CT := CT + 1;
IF FOUND
THEN
BEGIN
FOR CT := CT DOWNTO 2 DO
DATA[CT] := DATA[CT - 1];
DATA[1] := KEYVAL
END
END;
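A short driver makes the effect of answer 4 visible. This is a sketch only: the array size MAXEL, the sample values, and the separate loop index K (used here so that the FOR loop does not reuse the search index CT as its own starting value) are assumptions made for the demonstration.

```pascal
PROGRAM MTFDEMO;
(* Sketch only: MAXEL, the sample data, and the index K are assumptions. *)
CONST
  MAXEL = 5;
TYPE
  ARRAYTYPE = ARRAY[1..MAXEL] OF INTEGER;
VAR
  DATA : ARRAYTYPE;
  FOUND : BOOLEAN;
  I : INTEGER;

PROCEDURE MOVE1 (VAR DATA : ARRAYTYPE;
                 NUMEL, KEYVAL : INTEGER;
                 VAR FOUND : BOOLEAN);
VAR
  CT, K : INTEGER;
BEGIN
  FOUND := FALSE;
  CT := 1;
  (* Sequential search for KEYVAL. *)
  WHILE (CT <= NUMEL) AND NOT FOUND DO
    IF DATA[CT] = KEYVAL
      THEN FOUND := TRUE
      ELSE CT := CT + 1;
  IF FOUND THEN
    BEGIN
      (* Slide the elements before position CT down one place. *)
      FOR K := CT DOWNTO 2 DO
        DATA[K] := DATA[K - 1];
      DATA[1] := KEYVAL
    END
END;

BEGIN
  FOR I := 1 TO MAXEL DO
    DATA[I] := 10 * I;               (* 10 20 30 40 50 *)
  MOVE1(DATA, MAXEL, 40, FOUND);
  FOR I := 1 TO MAXEL DO
    WRITE(DATA[I], ' ');             (* 40 10 20 30 50 *)
  WRITELN
END.
```

Moving a found key to the front pays off when a few keys are requested far more often than the rest, since later searches for them stop after one or two comparisons.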
5. PROCEDURE HASHFIND (VAR LIST : ARRAYTYPE;
KEYVAL : INFOTYPE;
VAR FOUND : BOOLEAN);
BEGIN
STARTPLACE := HASH(KEYVAL);
TRYPLACE := STARTPLACE;
FOUND := FALSE;
REPEAT
IF LIST[TRYPLACE] = KEYVAL
THEN FOUND := TRUE
ELSE TRYPLACE := (TRYPLACE + 1) MOD MAXLIST
UNTIL FOUND OR (TRYPLACE = STARTPLACE)
END;
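The search in answer 5 needs a table to search. The sketch below supplies one under stated assumptions: a table of 11 slots indexed 0 to 10, the hash function KEYVAL MOD MAXLIST, the marker EMPTY = -1, and the procedure HASHSTORE are all hypothetical and were chosen only to make the example self-contained. Three keys that collide at slot 3 are stored by linear probing and then searched for.

```pascal
PROGRAM HASHDEMO;
(* Sketch only: table size, HASH, EMPTY, and HASHSTORE are assumptions. *)
CONST
  MAXLIST = 11;
  EMPTY = -1;
TYPE
  ARRAYTYPE = ARRAY[0..10] OF INTEGER;
VAR
  LIST : ARRAYTYPE;
  FOUND : BOOLEAN;
  I : INTEGER;

FUNCTION HASH (KEYVAL : INTEGER) : INTEGER;
BEGIN
  HASH := KEYVAL MOD MAXLIST
END;

PROCEDURE HASHSTORE (VAR LIST : ARRAYTYPE; KEYVAL : INTEGER);
(* Linear probing; assumes at least one slot is still EMPTY. *)
VAR
  TRYPLACE : INTEGER;
BEGIN
  TRYPLACE := HASH(KEYVAL);
  WHILE LIST[TRYPLACE] <> EMPTY DO
    TRYPLACE := (TRYPLACE + 1) MOD MAXLIST;
  LIST[TRYPLACE] := KEYVAL
END;

PROCEDURE HASHFIND (VAR LIST : ARRAYTYPE;
                    KEYVAL : INTEGER;
                    VAR FOUND : BOOLEAN);
VAR
  STARTPLACE, TRYPLACE : INTEGER;
BEGIN
  STARTPLACE := HASH(KEYVAL);
  TRYPLACE := STARTPLACE;
  FOUND := FALSE;
  REPEAT
    IF LIST[TRYPLACE] = KEYVAL
      THEN FOUND := TRUE
      ELSE TRYPLACE := (TRYPLACE + 1) MOD MAXLIST
  UNTIL FOUND OR (TRYPLACE = STARTPLACE)
END;

BEGIN
  FOR I := 0 TO 10 DO
    LIST[I] := EMPTY;
  HASHSTORE(LIST, 14);            (* 14 MOD 11 = 3: stored in slot 3 *)
  HASHSTORE(LIST, 25);            (* collides at 3: stored in slot 4 *)
  HASHSTORE(LIST, 36);            (* collides at 3 and 4: stored in slot 5 *)
  HASHFIND(LIST, 36, FOUND);
  WRITELN(FOUND);                 (* TRUE, after probing slots 3, 4, 5 *)
  HASHFIND(LIST, 47, FOUND);
  WRITELN(FOUND)                  (* FALSE, after probing every slot *)
END.
```

As written, an unsuccessful search probes the whole table before giving up. Stopping at the first EMPTY slot would be faster, but only if entries are never deleted.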
Glossary
explicit link field in each element rather than sequential order in mem-
ory.
listing See source listing.
literal A symbol that defines itself; a constant value such as a literal string
or number.
local identifier An identifier declared in the block where it is used. See
name precedence.
logical operator A symbol used in a Boolean expression whose operation
results in a Boolean value of TRUE or FALSE.
loop A control structure that allows a statement(s) to be executed more
than once (until a termination condition is reached).
loop control variable A variable (usually ordinal) used to control the num-
ber of times the body of a loop is executed.
machine code See machine language.
machine language The language used directly by the computer and com-
posed of binary coded instructions.
main storage Also main memory. See memory.
memory The ordered sequence of storage cells (locations, words, places)
in a computer that are accessed by address and used to temporarily hold
the instructions and variables of an executing program. See secondary
storage.
memory unit The internal data storage of a computer. See memory.
module An independent unit that is part of a whole; a logical part of a
design or program, such as a procedure.
multi-dimensional array An array of one or more arrays.
name precedence The priority of a local identifier over a more global iden-
tifier, where the identifiers have the same name. See scope.
nested logic A control structure contained within another control structure.
NIL A constant in Pascal that can be assigned to a pointer variable, indicat-
ing that the pointer points to nothing.
node Element in a list or tree.
object code The machine code produced by a compiler or assembler from a
source program. Also called object program.
operating system The set of programs that manage computer resources.
operator A symbol that indicates an operation to be performed.
operator precedence See precedence rules.
order of magnitude Way of expressing relationships between large num-
bers by using formal approximation. Used in computing to express
amount of work done.
ordinal type A set of distinct values that are ordered such that each value
(except the first) has a unique predecessor and each value (except the
last) has a unique successor; any scalar type except REAL.
output Data produced by a program and sent to an external file or device.
overflow A condition where the results of a calculation are too large to
represent on a given machine. See precision.
packed array An array which occupies as little memory space as possible
by having as many array components as possible packed into each memory
word.
packed option A Pascal feature allowing more efficient storage of records
and arrays.
palindrome String that reads the same backward or forward (e.g.,
RADAR).
parameter An expression passed in a procedure or function call. See actual
parameter, formal parameter.
parameter list See actual parameter list, formal parameter list.
path Sequence of vertices that connects two nodes in a graph.
peripheral device An input, output, or auxiliary storage device of a com-
puter.
pointer A simple data type, consisting of an unbounded set of values,
which addresses or otherwise indicates the location of a variable of a
given type.
portability The ability of software written for one computer to run success-
fully on different machines.
postconditions Output specifications of a routine which describe the trans-
formed data; tell what is true on exit from a subprogram.
postfix notation Notation for expressions in which the binary operator fol-
lows its operands.
postorder traversal Traversal of a binary tree in which each node is visited
after its left and right subtrees.
powerset See universal set.
precedence rules The order in which operations are performed in an ex-
pression.
precision The maximum number of significant digits.
preconditions Input specifications and allowable assumptions for a rou-
tine; tell what is true on entry to a subprogram.
prefix notation Notation for expressions in which the binary operator pre-
cedes its operands.
preorder traversal Traversal of a binary tree in which each node is visited
before its right and left subtrees.
prettyprinting Program formatting to make a program more readable.
procedure A subroutine that is executed when called. See subroutine, pa-
rameter.
procedure call See call.
program verification Demonstration of a program's correctness by formal
proof or logical argument.
programming The planning, scheduling, or performing of a task or an
event. See computer programming.
programming language A set of rules, symbols, and special words used to
construct a program.
pseudo-code A mixture of English and Pascal-like control structures used to
specify a design.
queue A data structure in which elements are entered at one end and
removed from the other; a "first in, first out" (FIFO) structure.
queuing system System made up of servers and queue(s) of objects to be
served.
random access The process of retrieving or storing elements in a data struc-
ture where the time required for such access is independent of the order
of the elements.
range The smallest and largest allowable values.
range-checking The automatic detection of an out-of-range value being as-
signed to a variable.
real number One of the numbers that has a whole and a fractional part and
no imaginary part.
record A structured data type with a fixed number of components (not
necessarily of the same type) that are accessed by name (not subscript).
recursion The ability of a procedure or function to call itself.
referenced variable A variable accessed not by name but through a pointer
variable; a dynamic variable; a variable created by the procedure NEW
in Pascal.
regression testing Re-execution of program tests after modifications have
been made in order to ensure that the program still works correctly.
relational operator A symbol that forms an expression with two values of
compatible types, and whose operation of comparing these values results
in a Boolean value of TRUE or FALSE.
repeat statement A looping control structure similar to a WHILE loop, ex-
cept that there will always be at least one execution of the loop since the
loop condition is tested after the body of the loop.
representational error An arithmetic error that occurs when the precision of
the result of an arithmetic operation is greater than the precision of a
given machine.
reserved word An identifier that has a specific meaning in a programming
language and may not be used for any other purpose in a program.
right child Node to the right of a given node in a binary tree; sometimes
called right son.
right subtree All the nodes to the right of a given node in a binary tree.
robustness The ability of a program to recover to a known state following
an error.
root node The external pointer to a tree data structure; the top or base node
of a tree.
round off To truncate (or make zero) one or more least significant digits of
a number, and to increase the remaining least significant digit by one if
the truncated value is more than half of the number base. Pascal provides
a function to round off a real value to the nearest integer.
run-time The phase of program execution during which program
instructions are performed.
scalar data type A set of distinct values (constants) that are ordered.
scientific notation A method of representing a number as an expression
consisting of a number between 1 and 10 multiplied by the appropriate
power of 10. Also called floating point notation.
scope The range or area within a program in which an identifier is known.
searching The locating of a particular element in a data structure.
secondary storage Backup storage for the main storage (memory) of a com-
puter, usually permanent in nature (such as tape or disk).
seed Global variable that initializes a random number generator.
selection A control structure that selects one of possibly several options
or paths in the flow of control, based upon the value of some expres-
sion.
self-documenting code A program containing meaningful identifier names,
as well as the judicious use of clarifying comments.
semantics The set of rules which give the meaning of a statement.
sentinel A special data value used to mark the end of a data file.
sequential access The process of retrieving or storing elements in a fixed
order in a data structure where the time required for such access is de-
pendent on the order of the elements.
set A structured data type composed of a collection of distinct elements
(members) chosen from the values of the base type.
siblings Nodes in a tree that have the same parent node; sometimes called
brothers.
side effects A change, within a procedure or function, to a variable that is
external to, but not passed to, the procedure or function.
significant digits Those digits that begin with the first non-zero digit on the
left and end with the last non-zero digit on the right (or a zero digit that is
exact).
simple type A scalar type; a type that is not structured; any of the Pascal
types INTEGER, REAL, BOOLEAN, CHAR or any user-defined (ordi-
nal) type.
software Computer programs; the set of all programs available to a com-
puter.
software engineering Disciplined approach to the design and maintenance
of computer programs using tools that help manage the size and complex-
ity of the resulting software products.
sorting Arrangement of elements in a list according to the increasing (or
decreasing) values of some key field of each element.
source code Also called source program; a program in its original form, in
the language in which it was written, prior to any compilation or transla-
tion.
source listing A printout of a source program processed by a compiler and
showing compiler messages, including any syntax errors in the program.
stack A data structure in which elements are entered and removed from
only one end; a "last in, first out" (LIFO) structure.
stack overflow The condition resulting from trying to push an element onto
a full stack.
stack underflow The condition resulting from trying to pop from an empty
stack.
statement An instruction in a programming language.
statement separator A symbol used to tell the compiler where one instruc-
1. Your assignment is to write a program for a computer dating service. Clients will
give you their names, phone numbers, and a list of interests. It will be your job to
maintain lists of men and women using the service and to match up the compati-
ble couples.
Data structures: The problem requires you to maintain two lists, men and
women. The lists must include the following information:
name (20 characters), phone number (8 characters), number of
interests (maximum number is 10), interests (10 characters;
must be in alphabetical order), and a variable that gives the
position of the client's current match (will be 0 if not matched).
When a new client is added to the list, he or she is added to
the bottom of the appropriate list. (You do not keep the names
of the clients in alphabetical order.)
Input: The first part of the input file contains the data base of current clients.
1. Number of current clients
2. For each current client, a record containing the following: sex (7
characters), name (20 characters), phone number (8 characters), num-
ber of interests, list of interests (10 characters for each one; must be
sorted; interests are separated by commas with a period after the final
interest.)
The rest of the file will include commands to the dating service. Each com-
mand will begin with a 10-character command word, on a new line of input.
Commands:
NEWCLIENT (sex) (name) (number of interests) (interests)
If the key word NEWCLIENT occurs, you should add the client to the appropri-
ate list by storing the appropriate information. Match him or her with a member
of the opposite sex. (A match occurs when three or more of the interests are the
same. Use the fact that interests are sorted to make the match process easier.)
Make sure you then designate both persons as matched, as described above.
Print the name of the new client, his or her match, and both phone numbers. If
no match is found, print an appropriate message.
OLDCLIENT (name)
Unmatch this name with his (or her) current match by setting the MATCH varia-
bles for this name and his (or her) match to O.
PRINTMATCH
Print a list of all matched pairs.
PRINTNOT
Print the names and phone numbers of clients who are not currently matched.
STOPPROG
This will be the last line in the input.
Output: Print information as described above with appropriate labels.
Original by (name illegible); modified by (name illegible).
*3. A small trucking firm acts as a broker between people who want things shipped
and independent drivers who own their own trucks. The company receives
requests from customers who want goods shipped from city A to city B. When
truckers are free, they call the company and give their location and the mile
radius they are willing to travel to pick up goods. The company looks at its
requests and arranges for as full a load as possible.
Input: 1. Mileages between the cities served by this trucking firm.
These entries are repeated, one per line. An asterisk (*) ends these entries.
2. A mixture of shipper and driver requests.
where (name) and (city) are character strings of up to ten characters with no
embedded blanks. There may be multiple blanks between components of each
individual request.
(pounds), (mile radius), and (truck capacity) are integer numbers.
(cost) is a real number (i.e., dollars and cents).
There will be at least one blank following the key words SHIPPER and
DRIVER.
Output: 1. Echoprint the mileage table.
2. Echoprint each shipper request.
3. For each driver request,
(a) Echoprint the driver request.
(b) Print a list of the shippers who have goods to be shipped from
the city where the driver is and from cities within the radius
designated as acceptable by the driver. This list should be
printed in order by cost. That is, the shipper whose load will
cost the most should be printed first.
(c) The names of those shippers whom the driver should service,
along with their locations and the destinations of their goods.
This list will be made up of as many of those shippers as the
driver can handle with his or her particular truck. That is, the
combined pounds must be within the capacity of the driver's
truck.
4. After all the shipper and driver requests have been processed, print
a list of those shipper requests still not picked up.
5. If a driver request comes in and there are no shipper requests for a
pick-up within the designated radius, or the driver's truck does not
have the capacity to service the requests that do exist, print a mes-
sage to the driver to take a day off.
Data structures: 1. A two-dimensional array representing mileages. A city's
place in the array of city names is used as an index into this
mileage table.
2. An array of records that represent shipper requests.
Assumptions: 1. Intra-city shipper requests are legal.
5. Add to Assignment 3 above the requirement that summary statistics be kept for
each driver. You will need one more item in the output specification, one more
data structure, and one more assumption.
Output: After all shipper and driver requests have been processed, print sum-
mary statistics for all drivers, showing the driver's name, the number of
pounds hauled, and the total cost.
Data structure: A one-dimensional array of records, in which each record rep-
resents a driver.
Assumption: A driver's name is not entered into the list of drivers until he or
she has actually picked up a shipment. (Contributor name illegible.)
*6. Your cousin-the noisy one-has been sitting in the corner for hours, quietly
absorbed with a new game. Being a little bored (and curious), you ask if you can
play. The game consists of a 2-dimensional black box with numbers between 0
and 39 on all sides.
There are five obstructions, called baffles, which you cannot see, placed in
the box. The object of the game is to find the baffles. You select a number be-
tween 0 and 39, which activates a laser beam originating at that location. You are
then told where the beam leaves the box. If the beam does not encounter a baffle,
it will exit directly opposite where it entered. If the beam encounters a baffle, it
will be deflected at right angles, either right or left, depending on the direction
of the baffle. You can locate the baffles by shooting beams into the box, using the
deflections of the beams as hints to the placement and direction of the baffles.
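The "exits directly opposite" rule can be sketched as a small function. This is only an illustration: it assumes the numbering that the printed boxes suggest (0 to 9 up the left side, 10 to 19 across the top, 20 to 29 down the right side, 30 to 39 along the bottom from right to left), and the name OppositeExit is hypothetical.

```pascal
program OppositeDemo;

{ Assumed numbering: 0..9 run up the left side, 10..19 across the top,
  20..29 down the right side, 30..39 along the bottom (right to left). }

function OppositeExit(Entry : Integer) : Integer;
{ Exit position of a beam that meets no baffle; -1 if Entry is illegal. }
begin
  if (Entry >= 0) and (Entry <= 9) then
    OppositeExit := 29 - Entry      { left side to right side, same row }
  else if (Entry >= 10) and (Entry <= 19) then
    OppositeExit := 49 - Entry      { top to bottom, same column }
  else if (Entry >= 20) and (Entry <= 29) then
    OppositeExit := 29 - Entry      { right side to left side, same row }
  else if (Entry >= 30) and (Entry <= 39) then
    OppositeExit := 49 - Entry      { bottom to top, same column }
  else
    OppositeExit := -1
end;

begin
  WriteLn(OppositeExit(7));     { the text's undeflected shot from 7 }
  WriteLn(OppositeExit(27))     { the text's undeflected shot from 27 }
end.
```

Under this numbering the two calls give 22 and 2, which agrees with the undeflected shots described in the text. A full simulation would also have to track the beam's direction as it meets each baffle.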
[Figure: two views of the black box. Positions 10-19 run across the top, 9 down to 0 along the left side, 20-29 down the right side, and 39 down to 30 across the bottom. The second view shows the hidden baffles and sample beam paths.]
Given the box on the preceding page, a beam shot from 7 comes out at 22. A
beam shot from 1 is deflected once and exits at 37. A beam shot from 27 exits at
2 without being deflected. A beam shot from 5 comes out at 17 after three deflec-
tions. A beam shot from 12 is deflected once, exiting at 24.
The game is scored by giving one point for each laser shot and two points for
each guess. A lower score, obviously, is more desirable.
When your cousin demands his game back, you decide to write a computer
program to simulate the game. (Note: Your program is not supposed to solve the
baffles problem; it is supposed to present the game to be played.) This clearly
ought to be an interactive program, where the baffles are set by a random number
generator, and a player either fires a laser beam or makes a guess as to the posi-
tion of a baffle. However, you decide to write the program and test it first. You
can add the prompts for interactive play later.
5 12 L sets
8 18 R sets
Error checking: You are not assured that each line of input
will be unique (just as the random number
generator may coincidentally come up with
the same coordinates for two baffles). You can
only get a baffle in a "free" position, i.e., one
that has not been previously set. You need to
set a total of five unique baffles.
When you have set five unique baffles, skip all input until
you encounter a *.
Continue processing until all five baffles are found. Output message of con-
gratulations and calculated score. Print the box showing the location and direc-
tion of all the baffles.
Error checking: In addition to the specific error checking mentioned above,
you must check all input. If an error is found in any line, that
line is not used, and an appropriate warning should be
printed. (Use your imagination.) You may assume that the
number of items on a line and their respective types are cor-
rect.
Example:
5 12 L
8 18 R
2 17 J
3 10 R
8 18 L
14 3 R
1 12 L
1 17 R
3 13 R
8 11 L
*L8
L12
G 5 12 L
P
L 30
G 30 20 R
G 8 18 R
R 17
L 17
G 5 12 L
S
G 1 10 R
L
G 12 L
Output:
L 12
LASER SHOT #2 EXITED THE BOX AT 2a.
G 5 12 L
THIS IS GUESS #1.
CONGRATULATIONS, YOU HAVE NOW FOUND 1 BAFFLE(S).
   10 11 12 13 14 15 16 17 18 19
 9 -+--+--+--+--+--+--+--+--+--+- 20
 8 -+--+--+--+--+--+--+--+--+--+- 21
 7 -+--+--+--+--+--+--+--+--+--+- 22
 6 -+--+--+--+--+--+--+--+--+--+- 23
 5 -+--+--L--+--+--+--+--+--+--+- 24
 4 -+--+--+--+--+--+--+--+--+--+- 25
 3 -+--+--+--+--+--+--+--+--+--+- 26
 2 -+--+--+--+--+--+--+--+--+--+- 27
 1 -+--+--+--+--+--+--+--+--+--+- 28
 0 -+--+--+--+--+--+--+--+--+--+- 29
   39 38 37 36 35 34 33 32 31 30
L 30
LASER SHOT #3 EXITED THE BOX AT 20.
G 30 20 R
K 17
*** ILLEGAL COMMAND -- TRY AGAIN ***
L 17
LASER SHOT #4 EXITED THE BOX AT 5.
G 5 12 L
THIS IS GUESS #3.
YOU HAVE ALREADY FOUND THIS BAFFLE.
S
NUMBER OF SHOTS: 4
NUMBER OF GUESSES: 3
CURRENT SCORE: 10
G 10 R
THIS IS GUESS #15.
SORRY, BETTER LUCK NEXT TIME.
L 1
LASER SHOT #26 EXITED THE BOX AT 37.
G 12 L
THIS IS GUESS #16.
CONGRATULATIONS, YOU HAVE NOW FOUND 5 BAFFLE(S).
   10 11 12 13 14 15 16 17 18 19
 9 -+--+--+--+--+--+--+--+--+--R- 20
 8 -+--+--+--+--+--+--+--+--+--+- 21
 7 -+--+--+--+--+--+--+--+--+--+- 22
 6 -+--+--+--+--+--+--+--+--+--+- 23
 5 -+--+--L--+--+--+--+--+--+--+- 24
 4 -+--+--+--+--+--+--+--+--+--+- 25
 3 -R--+--+--+--+--+--+--+--+--+- 26
 2 -+--+--+--+--+--+--+--+--+--+- 27
 1 -+--+--L--+--+--+--+--R--+--+- 28
 0 -+--+--+--+--+--+--+--+--+--+- 29
   39 38 37 36 35 34 33 32 31 30
Your output may differ in minor details from this example. (Different error
messages, slight format differences, etc.)
7. You are the manager of a team of ten programmers who have just completed a
seminar in structured programming and top-down design. To prove to your boss
that these techniques pay off, you decide to run the following contest: You number
the programmers 1 .. 10 based on their performance in the seminar (1 is
poorest, 10 is best) and monitor their work. As each does his or her part of your
project, you keep track of the number of lines of debugged code turned in by
each programmer. You record this number as a programmer turns in a debugged
module. The winner of the contest is the first person to reach 1000 lines of
debugged code. (You hope this will be programmer #9 or #10.) As further proof
of the value of these new techniques, you want to determine how many poor
programmers it takes to surpass the winner's figure; that is, find the smallest k
such that programmers 1 .. k have turned in more lines than the winner.
Input: The input consists of a sequence of pairs of integers. The first integer in
each pair is the programmer's number (an integer from 1 to 10), and the
second is the number of lines of code turned in. The pairs occur in the
same order as that in which the modules were turned in. (Incidentally,
there will be two integers per line, but this should make no difference.)
Output: Read in pairs of integers until someone's total goes over 1000. Print out
(echoprint) each pair as you read it. Ignore any input after someone's
total exceeds 1000. Then print out a table listing the ten programmers
and their totals, with the winner flagged as shown in the example
below. Finally, find the smallest k such that the sum of programmers
10 230
8 400
Programmer #   Lines
     1            51
     2           105
     3           308
     4           101
     5           215
     6             0
     7           333
     8           452
     9          1077   *** THE WINNER
    10           800
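The search for the smallest k can be sketched as a running sum over the totals. The fragment below is an illustration rather than the required solution: it hard-codes the totals from the sample table instead of reading the input pairs, and the identifiers (Totals, Winner, Sum) are hypothetical.

```pascal
program SmallestK;

const
  NumProgrammers = 10;

var
  Totals : array [1..NumProgrammers] of Integer;
  Winner, K, Sum : Integer;

begin
  { Totals copied from the sample table; programmer 9 is the flagged winner. }
  Totals[1] := 51;    Totals[2] := 105;   Totals[3] := 308;
  Totals[4] := 101;   Totals[5] := 215;   Totals[6] := 0;
  Totals[7] := 333;   Totals[8] := 452;   Totals[9] := 1077;
  Totals[10] := 800;
  Winner := 9;

  { Add up programmers 1, 2, ... until the sum passes the winner's total. }
  Sum := 0;
  K := 0;
  repeat
    K := K + 1;
    Sum := Sum + Totals[K]
  until (Sum > Totals[Winner]) or (K = NumProgrammers);

  if Sum > Totals[Winner] then
    WriteLn('Smallest k = ', K)
  else
    WriteLn('No k from 1 to 10 surpasses the winner')
end.
```

With the sample totals the running sums are 51, 156, 464, 565, 780, 780, and 1113, so the loop stops at k = 7.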
Chapter 2
Since no new structures are introduced in Chapter 2, the programming assign-
ments for Chapter 1 are also appropriate for Chapter 2.
1. Complete the command-driven tester described in this chapter and test the
string routines. (Let us know if you find an error!)
2. You have just bought a personal computer to use for writing your term papers. It
has a text editor, but it does not have a text formatter. You decide to write one that
will recognize a simple set of formatting commands.
Input: There are two kinds of inputs to your formatter. One consists of com-
mands to the formatter and the other is the text to be formatted. Format-
ting commands all begin with a period in the first character position of a
line. Formatting commands and the text to be formatted are interspersed
on the same file.
Commands:
.L (integer)  Set line length to (integer) characters.
.B (integer)  Write (integer) blank lines.
.J            The text to be formatted begins on the next line and continues
              until another formatting command is encountered. The text
              should be both left and right justified within the line length
              specified.
.Q            Quit processing.
ary. Each inquiry specifies a pattern of characters along with a restriction regard-
ing the pattern's relative position within the required word. The pattern may be
found at the beginning, end, or middle of the word, according to the request. For
each inquiry, either no such word exists in the given dictionary, exactly one such
word exists, or more than one is found to satisfy the requirement specified by the
inquiry. Corresponding to each of these three cases should be an appropriate
response from your program (see output section below).
Input: The first part is in fixed format, and its purpose is to define the diction-
ary; the second part is in free format, and it contains the sequence of
inquiries.
Part 1 is composed of a series of lines, each containing a word terminated by a
colon, followed by its meaning. There may be blanks following the colon and
preceding the meaning. The last line in this part is recognized by an asterisk in
column 1.
The words begin in column 1, and their corresponding meanings may be
stored and printed out, whenever required, in the same format as they appear in
the input file, with leading blanks removed.
Part 2 is composed of a sequence of four possible inquiries appearing in any
order. In each inquiry, (pattern) is a string of alphanumeric characters terminated
by a period.
STARTING (pattern)    Find the meaning of the word that starts with the
                      given pattern.
ENDING (pattern)      Find the meaning of the word that ends with the
                      given pattern.
CONTAINING (pattern)  Find the meaning of the word that contains the given
                      pattern (including the beginning or the end of the
                      word).
STOP (pattern)        Stop all processing immediately.
Data structure: The dictionary of words and their meanings must be stored in
an appropriate data structure.
Output: Echoprint all the input. For each inquiry there are three possible out-
puts:
1. If no word in the dictionary satisfies the inquiry, print an appropriate mes-
sage.
2. If exactly one such word exists, the meaning of this word should be printed
out.
3. If more than one word satisfies the inquiry, list these words without their
meanings.
Assumptions: You may assume that there are no more than 30 words in the
dictionary for testing this program.
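The three kinds of pattern test can be sketched with the built-in string operations. This is a minimal sketch that assumes a Turbo/Free Pascal style string type (Length, Copy, Pos) rather than the string routines developed in the chapter; the function names are hypothetical.

```pascal
program PatternDemo;

{ Assumes the Turbo/Free Pascal string type with Length, Copy, and Pos. }

function StartsWith(Pattern, Entry : string) : Boolean;
begin
  { Copy returns at most the whole entry, so a longer pattern never matches. }
  StartsWith := Copy(Entry, 1, Length(Pattern)) = Pattern
end;

function EndsWith(Pattern, Entry : string) : Boolean;
begin
  if Length(Pattern) > Length(Entry) then
    EndsWith := False
  else
    EndsWith := Copy(Entry, Length(Entry) - Length(Pattern) + 1,
                     Length(Pattern)) = Pattern
end;

function Contains(Pattern, Entry : string) : Boolean;
begin
  { Pos returns 0 when Pattern does not occur anywhere in Entry. }
  Contains := Pos(Pattern, Entry) > 0
end;

begin
  WriteLn(StartsWith('STA', 'STACK'));   { true }
  WriteLn(EndsWith('ACK', 'STACK'));     { true }
  WriteLn(Contains('TAC', 'STACK'));     { true }
  WriteLn(Contains('QUE', 'STACK'))      { false }
end.
```

To choose among the three required responses, a program could apply the appropriate test to every dictionary entry and count the matches: zero, exactly one, or more than one.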
*4. Groucho Marx used to have a game show called "You Bet Your Life" in which he
would have a conversation with the contestant and would try to get him or her to
say a specified "magic word." If the contestant said the magic word, a duck
would drop down, signaling that the contestant had won a prize. This problem
deals with a similar game (we could call it "You Bet Your Grade"), which will
have a specified set of magic words. The input will consist of several passages of
text. You will read in each passage, see if it includes all the magic words, and
then print whether or not the person wins the prize.
Input: The input is divided into sections. The last section is terminated by a $.
All other sections are terminated by a *. You may assume the $ and all
the *s are preceded by a blank.
A section is divided into words. A word is a sequence of letters and is
terminated by any nonalphabetic character. Words are at most ten char-
acters long. All nonalphabetic characters are completely ignored; their
only purpose is to terminate words and (for * and $) mark the end of a
section.
The first section gives all the magic words. The remaining sections
give the text for each contestant. The first two words in each section give
the contestant's name.
Output: Initially, print out the list of magic words, then echo each section as it
is read. Test for end-of-line so that the output will have its end-of-lines
in the same place as the input. For each section, keep track of which
magic words were said, and at the end of the section, print out a mes-
sage indicating whether the contestant won the prize. Suitable mes-
sages are
CONGRATULATIONS (name), YOU HAVE WON THE PRIZE
Note that you must print out the name and the list of unsaid magic
words if the contestant loses.
Jim Bitner
Chapter 3
1. You are to write a set of utility routines to manipulate a group of three stacks. The
specifications for these routines are given in terms of preconditions and postcon-
ditions. The preconditions to each routine tell you what the routine may assume
to be true on entry to the routine. The postconditions state what the routine
guarantees to be true on exit from the routine. The notation is as follows:
S the specified stack
S' the stack before the last operation on it
first(S) the most recent element put in S
length(S) the number of elements in S
|| concatenated with
Stack operations:
preconditions: TRUE
postconditions: S = X || S' AND OVERFLOW = NOT FULL(S')
OR S = S' AND OVERFLOW = FULLS(S)
preconditions: TRUE
postconditions: S' = X || S AND UNDERFLOW = EMPTYS(S)
precondition: TRUE
postconditions: S = emptystack
preconditions: TRUE
postconditions: S = S' and TOP = first(S) AND UNDERFLOW =
NOT EMPTYS(S)
OR S = S' AND UNDERFLOW = EMPTYS(S) AND TOP
IS UNDEFINED
FUNCTION EMPTYS (S : STACKTYPE) : BOOLEAN;
precondition: TRUE
postconditions: FULLS = (length(S) = MAXSTACK);
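One way to read these specifications is that every routine reports errors through a Boolean flag instead of requiring the caller to test first. The sketch below follows that reading for a single array-based stack. It is an assumption-laden illustration: the record layout, the small MaxStack, and the parameter lists are hypothetical, and OVERFLOW is taken to be true exactly when the push could not be done.

```pascal
program StackDemo;

const
  MaxStack = 3;        { deliberately small so that overflow is easy to show }

type
  ElementType = string[20];
  StackType = record
    Items : array [1..MaxStack] of ElementType;
    Top   : 0..MaxStack
  end;

var
  S : StackType;
  X : ElementType;
  Overflow, Underflow : Boolean;

procedure ClearS(var S : StackType);
begin
  S.Top := 0
end;

function EmptyS(S : StackType) : Boolean;
begin
  EmptyS := (S.Top = 0)
end;

function FullS(S : StackType) : Boolean;
begin
  FullS := (S.Top = MaxStack)
end;

procedure Push(var S : StackType; X : ElementType; var Overflow : Boolean);
{ Precondition TRUE: the full test is made here, not by the caller. }
begin
  Overflow := FullS(S);
  if not Overflow then
  begin
    S.Top := S.Top + 1;
    S.Items[S.Top] := X
  end
end;

procedure Pop(var S : StackType; var X : ElementType; var Underflow : Boolean);
{ Precondition TRUE: X is left unchanged when the stack is empty. }
begin
  Underflow := EmptyS(S);
  if not Underflow then
  begin
    X := S.Items[S.Top];
    S.Top := S.Top - 1
  end
end;

begin
  ClearS(S);
  Push(S, 'SALLY', Overflow);
  Push(S, 'GEORGE', Overflow);
  Pop(S, X, Underflow);
  WriteLn(X, ' ', Underflow);     { the most recently pushed element }
  Pop(S, X, Underflow);
  Pop(S, X, Underflow);
  WriteLn(Underflow)              { third pop finds the stack empty }
end.
```

Because the full and empty tests live only in FullS and EmptyS, Push and Pop call those functions rather than repeating the comparison.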
Testing: The input data consist of a series of commands to test your routines.
There are an arbitrary number of blanks between commands. Be sure
to echoprint each command before you execute it.
Stacks will be designated by the integers 1 through 3. Elements will be
strings of 1 to 20 characters, including embedded blanks, delimited by a period.
(The period is not part of the element.)
Commands:
PUSH (stacknumber) (element)
Execute Procedure PUSH, using (element) as the value to be put on the desig-
nated stack. If an error condition occurs, print an appropriate message.
POP (stacknumber)
Execute Procedure POP, using the designated stack, and print the value re-
turned. If an error condition occurs, print an appropriate message.
STACKTOP (stacknumber)
Execute Procedure STACKTOP, using the designated stack, and print the value
returned. If an error occurs, print an appropriate error message.
CLEARS (stacknumber)
Execute Procedure CLEARS, using the designated stack.
EMPTYS (stacknumber)
Execute Function EMPTYS, using the designated stack, and print the result.
FULLS (stacknumber)
Execute Function FULLS, using the designated stack, and print the result.
PRINT (stacknumber)
Print the elements in the designated stack. The stack must be returned to its
original state. (Hint: Use a temporary stack.)
DUMP
Print the elements in all the stacks.
STOP
Stop executing.
Note that all the preconditions are TRUE. This means that all the testing for
full and/or empty is being done in the utility routines themselves. If the testing
were done in the calling routine, this would be stated in the preconditions.
Which routines would this change? What would the preconditions look like?
Sample data:
CLEARS 2
PUSH 2 SALLY.
CLEARS
CLEARS 5
PUSH 1 GEORGE. PUSH 2
Jo ANN.
PUSH
5 SUSAN. CLEARS 3
CLEARS a POP 5 TOP 2 PUSH 5 JOE. PUSH 5
TOM. PUSH 5 MARY Jo. PUSH 3 HENRY. PUSH 3 HARRY.
PUSH 5 DICK.
TOP 5 EMPTYS 2 EMPTYS a PUSH
a ELLEN. PUSH 3 JANE. POP 2 POP 2 POP 2 DUMP
PUSH 1 HARRIET.
PUSH 1 STANLEY.
PUSH 1 LIVINGSTON.
FULLS 1
TOP a
POP 5
CLEARS 2
PUSH 2 LIZZIE. PUSH 1 JOSEPH. PUSH 1 ELLEN. TOP a
PUSH 1 SALLY ANN. PUSH 2 JOHN. PUSH 5 HARRY. POP
a
PUSH GEORGE II.PUSH ELIZABETH. PUSH SALLY.
PUSH DICK. POP 5
PUSH 1 WILLIAM. EMPTYS 5 TOP 1 EMPTYS a FULLS
CLEARS 5
DUMP
PUSH 1 TOM. PUSH 1 Jo ANN. POP 3 POP 3
PUSH 1 JAMIE. PUSH 1 DANIEL. TOP 1 PUSH
ELLEN. TOP a DUMP
CLEARS 2 CLEARS 3 PUSH JOHN. PUSH LIZZIE.
PUSH 1 RICKIE.
FULLS 1
FULLS 2
EMPTYS 1
CLEARS 3
EMPTYS 2 PUSH 1 JULIE. DUMP
STOP
2. Alter the maze program given in this chapter so that it prints out the path used to
exit the maze if a path is found.
*3. This problem requires you to write a program to convert an infix to a postfix
expression. The evaluation of an infix expression like A + B * C requires knowl-
edge of which of the two operations, + or *, should be performed first. In gen-
eral, A + B * C is to be interpreted as A + (B * C) unless otherwise specified. We
say that multiplication takes precedence over addition. Suppose that we would
now like to convert A + B * C to postfix. Applying the rules of precedence, we
first convert the portion of the expression that is evaluated first, namely the
multiplication. Doing this conversion in stages, we obtain:
A + B * C    Given infix form
A + BC *     Convert the multiplication
ABC * +      Convert the addition
The major rules to remember during the conversion process are that the oper-
ations with highest precedence are converted first and that after a portion of an
expression has been converted to postfix it is to be treated as a single operand.
Let us now consider the same example with the precedence of operators re-
versed by the deliberate insertion of parentheses.
(A + B) * C  Given infix form
AB + * C     Convert the addition
AB + C *     Convert the multiplication
Note that in the conversion from AB + * C to AB + C *, AB + was treated as a
single operand. The rules for converting from infix to postfix are simple, pro-
vided that you know the order of precedence.
We consider four binary operations: addition, subtraction, multiplication, and
division. These operations are denoted by the usual operators, +, -, *, and /,
respectively. There are two levels of operator precedence. Both * and / have
higher precedence than + and -. Furthermore, when unparenthesized operators
of the same precedence are scanned, the order is assumed to be left to right.
Parentheses may be used in infix expressions to override the default precedence.
As we discussed in this chapter, the postfix form requires no parentheses. The
order of the operators in the postfix expressions determines the actual order of
operations in evaluating the expression, making the use of parentheses unneces-
sary.
Input: A collection of error-free simple arithmetic expressions. Expressions are
separated by semicolons, and the final expression is followed by a pe-
riod.
The input is free format; an arbitrary number of blanks and end-of-lines may
occur between any two symbols. A symbol may be a letter (A .. Z), an operator
(+, -, *, or /), a left parenthesis, or a right parenthesis.
Each operand is composed of a single letter. The input expressions are in infix
notation.
Example: A + B - C;
A+B*C;
(A + B)/(C - D) ;
((A + B) * (C - D) + E)/(F + G) .
Output: Your output should consist of each input expression, followed by its
corresponding postfix expression. All output (including the original
infix expressions) lnust be clearly formatted (or refornlatted) and also
clearly labeled.
Example: (Only the four postfix expressions corresponding to the above sample
input are shown here.)
AB + C -
ABC * +
AB + CD - /
AB + CD - * E + FG + /
Rule 4: When an opening (left) parenthesis is seen, it must be pushed onto the
stack.
Rule 5: When a closing (right) parenthesis is seen, all operators down to the
most recently scanned left parenthesis must be popped and appended
to the postfix string. Furthermore, this pair of parentheses must be dis-
carded.
Rule 6: When the infix string is completely scanned, the stack may still contain
some operators. (No parentheses at this point. Why?) All these remain-
ing operators should be popped and appended to the postfix string.
Examples: Here are two examples to help you understand how the algorithm
works. Each line below demonstrates the state of the postfix string
and the stack when the corresponding next infix symbol is scanned.
The rightmost symbol of the stack is the top symbol. The rule num-
ber corresponding to each line demonstrates which of the six rules
was used to reach the current state from that of the previous line.
Example 1: Input expression is A + B * C/D - E.
Next symbol   Postfix string        Stack   Rule
A             A                             2
+             A                     +       3
B             A B                   +       2
*             A B                   + *     3
C             A B C                 + *     2
/             A B C *               + /     3
D             A B C * D             + /     2
-             A B C * D / +         -       3
E             A B C * D / + E       -       2
              A B C * D / + E -             6
Example 2: Input expression is (A + B * (C - D))/E.
Next symbol   Postfix string        Stack       Rule
(                                   (           4
A             A                     (           2
+             A                     ( +         3
B             A B                   ( +         2
*             A B                   ( + *       3
(             A B                   ( + * (     4
C             A B C                 ( + * (     2
-             A B C                 ( + * ( -   3
D             A B C D               ( + * ( -   2
)             A B C D -             ( + *       5
)             A B C D - * +                     5
/             A B C D - * +         /           3
E             A B C D - * + E       /           2
              A B C D - * + E /                 6
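The two traces can be reproduced with a short operator-stack conversion. The sketch below is not the book's own solution. It assumes an error-free expression, single-letter operands, and a Turbo/Free Pascal string type. It also assumes the usual handling of operators, in which an incoming operator first pops every stacked operator of equal or higher precedence. A sentinel left parenthesis in Stack[0] is a device of this sketch that spares an explicit empty-stack test.

```pascal
program InfixToPostfix;

{ Sketch only: error-free input, single-letter operands, blanks ignored. }

const
  MaxStack = 50;

var
  Stack : array [0..MaxStack] of Char;
  Top   : Integer;

function Precedence(Op : Char) : Integer;
begin
  if (Op = '*') or (Op = '/') then
    Precedence := 2
  else if (Op = '+') or (Op = '-') then
    Precedence := 1
  else
    Precedence := 0          { a left parenthesis is never popped by an operator }
end;

function Convert(Infix : string) : string;
var
  Postfix : string;
  I : Integer;
  Ch : Char;
begin
  Postfix := '';
  Stack[0] := '(';           { sentinel below the real stack }
  Top := 0;
  for I := 1 to Length(Infix) do
  begin
    Ch := Infix[I];
    if (Ch >= 'A') and (Ch <= 'Z') then
      Postfix := Postfix + Ch               { operands go straight to the output }
    else if Ch = '(' then
    begin
      Top := Top + 1;
      Stack[Top] := Ch
    end
    else if Ch = ')' then
    begin
      while Stack[Top] <> '(' do            { pop down to the matching parenthesis }
      begin
        Postfix := Postfix + Stack[Top];
        Top := Top - 1
      end;
      Top := Top - 1                        { discard the left parenthesis }
    end
    else if (Ch = '+') or (Ch = '-') or (Ch = '*') or (Ch = '/') then
    begin
      while Precedence(Stack[Top]) >= Precedence(Ch) do
      begin
        Postfix := Postfix + Stack[Top];
        Top := Top - 1
      end;
      Top := Top + 1;
      Stack[Top] := Ch
    end
  end;
  while Top > 0 do                          { flush the remaining operators }
  begin
    Postfix := Postfix + Stack[Top];
    Top := Top - 1
  end;
  Convert := Postfix
end;

begin
  WriteLn(Convert('A + B * C/D - E'));      { ABC*D/+E- }
  WriteLn(Convert('(A + B * (C - D))/E'))   { ABCD-*+E/ }
end.
```

The two output lines match the final postfix strings of Example 1 and Example 2, printed without blanks.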
4. This problem requires you to write a program to convert a prefix to a postfix
expression.
Example: + * A B / C D;
* A + B / C D;
- * A + B / C D E;
* + A B - C D.
Output: Your output should consist of the list of input expressions along with
the corresponding converted postfix forms. All output should be
clearly formatted and labeled.
Prefix                 Postfix
+ * A B / C D          A B * C D / +
* A + B / C D          A B C D / + *
- * A + B / C D E      A B C D / + * E -
* + A B - C D          A B + C D - *
Note: Although your program does not process or print infix, here are the four
infix expressions that correspond to the above forms:
A * B + C / D
A * ( B + C / D )
A * ( B + C / D ) - E
( A + B ) * ( C - D )
Discussion: The key idea here is that after a portion of an expression has been
converted, it is to be treated as a single operand.
Assume that every operator is associated with a flag. The flag is
initially off to signal that neither of the two operands correspond-
ing to the given operator has been processed yet. When the first
operand is processed, the flag is switched on. When the second
operand is processed, the given operator may be immediately ap-
pended to the output string.
Below is a description of a simple prefix-to-postfix conversion
algorithm that works on one input expression:
1. Initialize the input string, the output string, and the operators stack.
2. Repeat steps 3-5 until there are no more input symbols.
3. Get the next input symbol.
*      - *
A      -(*)           A
+      -(*)+          A
B      -(* +)         A B
/      -(* +)/        A B
C      -(* + /)       A B C
D      (-)            A B C D / + *
E                     A B C D / + * E -
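The idea that a converted portion is treated as a single operand can also be expressed recursively. The sketch below deliberately swaps in recursion for the flag-and-stack algorithm described above, so it illustrates the idea rather than the requested method. It assumes error-free input, single-letter operands, binary operators only, and a Turbo/Free Pascal string type; the names are hypothetical.

```pascal
program PrefixToPostfix;

{ Recursive sketch: NOT the flag-and-stack algorithm of the assignment. }

var
  Expr  : string;
  Index : Integer;

function Convert(var Expr : string; var Index : Integer) : string;
{ Converts the prefix expression that starts at Expr[Index] and leaves
  Index just past it. Assumes the expression is complete and error-free. }
var
  Symbol : Char;
  LeftPart, RightPart : string;
begin
  while Expr[Index] = ' ' do          { skip blanks between symbols }
    Index := Index + 1;
  Symbol := Expr[Index];
  Index := Index + 1;
  if (Symbol >= 'A') and (Symbol <= 'Z') then
    Convert := Symbol                 { an operand is already in postfix form }
  else
  begin
    LeftPart := Convert(Expr, Index);     { first operand, fully converted }
    RightPart := Convert(Expr, Index);    { second operand, fully converted }
    Convert := LeftPart + RightPart + Symbol
  end
end;

begin
  Expr := '- * A + B / C D E';
  Index := 1;
  WriteLn(Convert(Expr, Index))       { ABCD/+*E- }
end.
```

Each recursive call plays the role of one operator waiting for its two operands, which is the information the flags record in the stack-based version.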
\l\llljid \l\1sallal\l
1. You are to write a set of utility routines to manipulate a queue. The specifications
you are given are in terms of preconditions and postconditions, just as in assign-
ment 1, Chapter 3. The following notation is used in the description of the rou-
tines:
Q The current queue
Q' The queue before the last operation on it
first(Q) The element that has been in the queue for the longest time
length(Q) The number of elements in the queue
|| concatenation
Queue operations:
Note that all the preconditions are simply the one word TRUE, which means that
there are no preconditions and that all error checking must be done in the proce-
dures and functions themselves. The actual tests for full and empty queues must be
coded only once. If you need to make that test in another procedure, call the appro-
priate functions.
When a patient checks out, the doctor he or she was assigned to is available to
see the next patient if there is anyone in the waiting list.
Input: Since this will be an interactive system, your program should prompt
the users to input the correct information.
The initial prompt is
Output: The output for each request is in the form of messages to the user
according to the request.
Doctor check-in: confirmation that room is available or error message if room
is in use
Doctor check-out: goodbye message
Patient check-in: Message telling patient which room to go to and which doc-
tor has been assigned. If no doctor available, apologetic
message.
Patient check-out: goodbye message. At a later time we may add billing infor-
mation at this point.
Details and assumptions: 1. There are 100 examination rooms at the clinic,
each with a waiting room attached.
2. Specialty codes are:
PED Pediatrics
GP General practice
INT Internal medicine
CARD Cardiology
SUR Surgeon
OBS Obstetrics
PSY Psychiatry
NEUR Neurology
ORTH Orthopedics
DERM Dermatology
OPTH Ophthalmology
ENT Ear, Nose, and Throat
1. This program illustrates the benefits of pushing implementation details into low-
level subprograms. You are to revise Programming Assignment 1 of Chapter 3 to
implement the stacks as linked lists. Use the array of records implementation for
the linked lists. All five LISTS (and AVAIL) should be kept in the same array.
You may assume in testing the routines that the number of elements in all the
stacks together will not exceed 30, although they may all be in one stack. You
should use the following declarations:
Since you do not know whether singly linked lists or doubly linked lists will
be better for this program, you decide to write these routines for both. That is,
each task will have two routines, one for a list with a forward link only and one
for a list with both a forward and a backward link.
Testing: You should test your routines in two runs: one for the singly linked list
and one for the doubly linked version. To create your list, insert the
characters 'A' .. 'Z'.
Part of this assignment is to determine what test cases must be run
to completely test the routines you have written. For instance, to test
the exchange left routine, you should call it with a letter from the
middle of the list, like 'D', and with the letter from the beginning of
the list. Run enough test cases to completely demonstrate the function
of your routines and any error conditions.
Use your print routine to print the contents of the list after each of
the tests has been done.
*5. Your assignment is to track the corporate careers of some up-and-coming execu-
tives who are busily changing jobs, being promoted and demoted, and, of course,
getting paid.
In this version of the corporate world, people either belong to a company or
are unemployed. The list of people the program must deal with is not fixed;
initially there are none, and new people may be introduced by the JOIN com-
mand (see below).
Executives within a company are ordered according to a seniority system and
are numbered from 1 to N (the number of people in the company) to indicate
their rank: 1 is the lowest rank and N is the highest. A new employee always
enters at the bottom of the ladder and hence will always start with a rank of 1.
When a new person joins a company, the rank of everyone in the company is
increased by one, and when an employee quits, the rank of employees above him
or her in that company is decreased by one. Promotions can also occur and affect
the ranks in the obvious way.
Naturally, salaries are based on rank. An employee's salary is RANK * $1000.
Unemployed people draw a $50 "salary" in unemployment compensation.
Input: The input consists of a list of all possible companies, followed by the
word END, followed by a list of commands (see below), followed by the
word END.
The input is free format; an arbitrary number of blanks and end-of-
lines may occur between any two input words. There will be at most 20
companies. The name of every company and person will be at most ten
characters long. Names consist of letters only and do not have embed-
ded blanks.
Commands and output: Execution consists of two phases-processing the com-
mands and then printing out the total amount each person has earned during
execution of the program. The seven commands are listed below, along with the
required output. Note that if a person is currently employed, the command does
not tell you his or her current company, so you will have to search your data
structure to find the person.
JOIN (person) (company)
The given person joins the specified company. This may be the first reference to
this person, or he or she may be unemployed. The person will not currently
belong to another company. Remember that when a person joins a company he or
she always starts at the bottom.
QUIT (person)
The given person quits his or her job and becomes unemployed. You may as-
sume that the person is currently employed.
CHANGE (person) (newcompany)
The given person quits his or her job and joins the specified company. You may
assume that the person is currently employed.
PROMOTE (person)
The given person is moved up one step, ahead of his or her immediate superior.
If the person has highest rank within the company, no change occurs.
DEMOTE (person)
The given person is moved one step down, below his or her immediate subordi-
nate. If the person has lowest rank within the company, no change occurs.
PAYDAY
Each person is paid his or her salary as specified above. (You will have to keep
track of the amount each person has earned from the start of the program.)
DUMP
The current list of employees should be printed for each company. The employ-
ees must be printed in order of rank; either top to bottom or bottom to top is
appropriate. The list of unemployed people should also be printed. (Note that
DUMP may be very useful in debugging.)
After all the commands have been processed, print out one list consisting of
all the people who have been mentioned in any command and the total amount
of money they have accumulated. The list should be sorted by decreasing order
of total salary accumulated.
Testing: You may want to test your program on the following sample data:
*6. The problem is to keep track of the lists of members in a set of clubs as members
join, quit, and change clubs, as given by the input. Periodically, the input will
ask you to print all the members in a given club or to find all the clubs to which a
given person belongs.
Input: A list of all possible clubs, followed by the word END followed by a list
of commands (see next section), followed by the word END. The input
is free format; an arbitrary number of blanks and end-of-lines may occur
between any two input words.
Example:
There will be at most 20 clubs. The name of every club and person will be at
most 10 characters.
Commands and output: The six possible commands are listed below, together
with their semantics and required output.
JOIN (person) (club)
Add the given person to the given club. (If the person is already in it, do noth-
ing.)
Output: Either "(person) has been added to (club)" or "error- (person) is al-
ready in (club)", whichever is appropriate.
QUIT (person) (club)
Delete person from club.
Output: Either" (person) has been deleted from (club)" or "error- (person) is
not in (club)".
CHANGE (person) (club1) (club2)
Delete person from club1, then add person to club2.
Output: First print the appropriate deletion message (see QUIT), then the ap-
propriate insertion message (see JOIN).
PRINT (club)
Print all the members in the club.
Output: "The member(s) in club are:", followed by the list of members in
sorted order, one per line.
FIND (person)
Find and print all the clubs to which the person belongs.
Output: "(person) belongs to the following club(s):" followed by the list of
clubs, in any order, one per line.
MERGE (club1) (club2)
Merge the list of members for the two clubs. If a person is in both clubs, the
name should occur only once in the merged list. From now on, the merged club
can be referred to by either club name (!)
Output: "club1 and club2 have been merged."
Suggested data structure: The only requirement placed on your data structure
is that you must store the members of each club in a
linked list. The list can be doubly linked or circular
if you think that will make things easier.
One suggested data structure consists of an array, TABLE, of records that have
one field, NAME, to store the club name, and another field, NEXT, to point to a
list of members of this club. A suitable declaration is
Each club can be stored as a linked list (whose components are of type
BLOCK) that has a header (whose name is ignored) and a trailer (whose name is
ZZZZZZZZZZ, which you can assume is not the name of any person). The
linked list should be sorted so that the PRINT operation is easily done.
The header will make the insert, delete, and merge operations easier, and the
trailer will simplify insertion into the sorted list.
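The effect of the header and trailer on sorted insertion can be seen in a short sketch. It is an illustration under stated assumptions: the declarations are hypothetical stand-ins for the suggested ones, Pascal pointer variables are used for the links, and all names are assumed to be in uppercase letters so that ZZZZZZZZZZ compares greater than or equal to every name.

```pascal
program ClubListDemo;

{ Assumes uppercase names, so the trailer ZZZZZZZZZZ sorts last. }

type
  NameType = string[10];
  PointerType = ^Block;
  Block = record
    Name : NameType;
    Next : PointerType
  end;

var
  Club, P : PointerType;
  Added : Boolean;

procedure CreateList(var List : PointerType);
{ Builds an empty list: a header followed directly by the trailer. }
var
  Trailer : PointerType;
begin
  New(Trailer);
  Trailer^.Name := 'ZZZZZZZZZZ';
  Trailer^.Next := nil;
  New(List);
  List^.Name := '';                  { the header's name is ignored }
  List^.Next := Trailer
end;

procedure InsertMember(List : PointerType; Person : NameType;
                       var Added : Boolean);
{ Inserts Person in alphabetical order unless already present. }
var
  Prev, NewBlock : PointerType;
begin
  Prev := List;
  { The trailer stops the search, so no end-of-list test is needed;
    the header guarantees that a predecessor always exists. }
  while Prev^.Next^.Name < Person do
    Prev := Prev^.Next;
  Added := Prev^.Next^.Name <> Person;
  if Added then
  begin
    New(NewBlock);
    NewBlock^.Name := Person;
    NewBlock^.Next := Prev^.Next;
    Prev^.Next := NewBlock
  end
end;

begin
  CreateList(Club);
  InsertMember(Club, 'SMITH', Added);
  InsertMember(Club, 'JONES', Added);
  InsertMember(Club, 'SMITH', Added);   { already present: Added is False }
  P := Club^.Next;
  while P^.Next <> nil do               { stop at the trailer }
  begin
    WriteLn(P^.Name);
    P := P^.Next
  end
end.
```

The program prints JONES and then SMITH. Without the header, inserting before the first member would need a special case; without the trailer, the search loop would also need a test for the end of the list.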
A note on the merge operation: After the two lists are merged together, the
pointers for both clubs must be set to the
header of the merged list. In this way, we ac-
cess the merged list of members when the
name of either club is mentioned. If this club is
merged with another club, both these pointers
will have to be changed again.
LIONS LIONS
KIWANIS KIWANIS
Jim Bitner
*7. A friend of yours is opening a small grocery warehouse. He has offered you 10%
of the first year's profits if you will write an accounting system for the warehouse.
What can you lose?
Input:
Inventory master file (maximum of 20 items):
SOLD This (quantity) will be removed from the list of the (item name) and
the transaction will be recorded. If the item is a LIFO item, the (quan-
tity) will be recorded at the last purchase price. If the item is a FIFO
item, the (quantity) will be recorded at the earliest purchase price.
If the (quantity) sold is more than is currently in inventory, fill as
much of the order as you can and backorder the rest. This means put-
ting the item name and the quantity left to be filled into the list of
backordered items. This list should be kept in alphabetical order by
item name.
LOSS Remove from inventory as if it were sold. The quantity lost, the cost of
the loss, and the reason for the loss should be recorded on the daily
report.
ENDDAY Write the date and summary statistics for the day as specified under
OUTPUT.
STOP Write out the updated INVENTORY MASTER file to be used for the
next update run. The STOP will always follow the last ENDDAY.
Output: A daily report on OUTPUT showing the following information:
1. Value of the inventory at the beginning of the day.
2. Each transaction recorded as specified.
3. The following summary statistics:
(a) Dollar volume of items BOUGHT that day
(b) Dollar volume of items SOLD that day
(c) Dollar volume of LOSS that day
(d) Value of inventory at the end of that day
Data structures: 1. An array of records. Each record contains the item name,
whether it is LIFO or FIFO, and a pointer to a linked list of
nodes, each of which contains an amount and a price per
amount.
2. A linked list of backordered items and quantity kept, or-
dered by item name.
LETTUCE L
MILK L
POTATOES L
TUNA F
NAPKINS F
Note: You might find it useful to carry a pointer to the end of the list as well.
Assumptions: 1. You may assume that the master file is correct.
2. You may not assume that the transaction file is correct. It may
have invalid commands or misspelled item names. If you en-
counter an unknown (item name) or command, print an error
message on the daily report and skip to the next command.
You may assume that the (quantity price) is correct (i.e., you
may use a numeric read).
3. An item name will appear only once on the INVENTORY
MASTER file.
4. There will be no more than 20 item names.
Special cases: 1. Two nodes with the same price side by side can be com-
bined.
2. If the amount sold is more than the quantity in the node to be
removed first, you may have part of the order filled at one
price and part at another.
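The second special case, together with backordering, can be illustrated with a short sketch. It rests on one design assumption that the assignment does not state: LIFO purchases are pushed at the front of an item's list and FIFO purchases are appended at the rear, so that a sale always takes from the front. The names LOT, BUY, SELL, and LEFTOVER and the sample figures are hypothetical.

```pascal
program SellLots;
{ Sketch only: LOTPTR, LOT, BUY, SELL, and the sample figures are assumptions. }
type
  LOTPTR = ^LOT;
  LOT    = record
             AMOUNT : integer;
             PRICE  : real;
             NEXT   : LOTPTR
           end;
var
  TUNA     : LOTPTR;
  LEFTOVER : integer;

procedure BUY (var LIST : LOTPTR; AMOUNT : integer; PRICE : real;
               LIFO : boolean);
{ LIFO lots go on the front, FIFO lots on the rear. }
var
  NODE, P : LOTPTR;
begin
  new(NODE);
  NODE^.AMOUNT := AMOUNT;
  NODE^.PRICE := PRICE;
  NODE^.NEXT := nil;
  if LIFO or (LIST = nil) then
    begin
      NODE^.NEXT := LIST;
      LIST := NODE
    end
  else
    begin
      P := LIST;
      while P^.NEXT <> nil do
        P := P^.NEXT;
      P^.NEXT := NODE
    end
end;

procedure SELL (var LIST : LOTPTR; QUANTITY : integer;
                var BACKORDER : integer);
{ Fill the order lot by lot; whatever cannot be filled is returned
  in BACKORDER. }
var
  TAKE : integer;
  OLD  : LOTPTR;
begin
  while (QUANTITY > 0) and (LIST <> nil) do
    begin
      if LIST^.AMOUNT <= QUANTITY then
        TAKE := LIST^.AMOUNT
      else
        TAKE := QUANTITY;
      writeln('SOLD ', TAKE:1, ' AT ', LIST^.PRICE:1:2);
      QUANTITY := QUANTITY - TAKE;
      LIST^.AMOUNT := LIST^.AMOUNT - TAKE;
      if LIST^.AMOUNT = 0 then            { lot used up: unlink it }
        begin
          OLD := LIST;
          LIST := LIST^.NEXT;
          dispose(OLD)
        end
    end;
  BACKORDER := QUANTITY
end;

begin
  TUNA := nil;
  BUY(TUNA, 10, 0.50, false);
  BUY(TUNA, 5, 0.60, false);
  SELL(TUNA, 12, LEFTOVER);               { 10 at 0.50, then 2 at 0.60 }
  SELL(TUNA, 8, LEFTOVER);                { 3 at 0.60, 5 left unfilled }
  writeln('BACKORDERED: ', LEFTOVER:1)
end.
```

Carrying a pointer to the end of the list, as the note suggests, would remove the scan in BUY. The sketch does not combine adjacent lots that have the same price.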
*8. Your neighbor is an agent for a group of magicians. People call her to book magi-
cians for holidays. She would like to use her new computer to keep track of the
jobs she schedules for the magicians she manages, so she hires you to write the
program.
Input: The input comes in three parts. The first part is a list of magicians. The
names are in free format, separated by blanks and terminated by a *. You
may assume that none of the magicians has a name longer than ten char-
acters.
The second part of the input is a list of holidays. Again, the names are
in free format, separated by blanks and terminated by a *. Assume none
of the holidays has a name longer than ten characters.
The third part of the input is a series of commands (see below). Com-
mands are in free format. Names in a command are at most ten characters
long.
SCHEDULE (person) (holiday)
The given person wants to book a magician for the given holiday. Check to see if
there is a magician free for this holiday. You should sequence through the magi-
cians in the order in which you read them in. If a magician is available, book the
magician and print out the name of the magician, the holiday, and the name of
the person. If a magician is not available, put the person on a waiting list, and
print out a message indicating that the person and holiday have been put on a
waiting list.
CANCEL (person) (holiday)
The given person has canceled his or her booking of a magician for the listed
holiday. Delete the reservation for that holiday. Update the schedule of the ma-
gician who was going to perform for the occasion. This may allow someone on
the waiting list to be serviced. Sequence through the waiting list to see if some-
one wanted a booking on that holiday. If someone did want a magician on that
holiday, schedule the booking and print a message. If this person is on the wait-
ing list, delete the name from the waiting list.
QUIT (magician)
Occasionally a magician quits. When that happens, you have to try to redistribute
that magician's bookings to other magicians. If you reschedule a booking, print a
message. If you can't, print out a message and add the request to the front of the
waiting list.
STATUS (magician or holiday)
Print out either the schedule of the named magician or the schedule for a given
holiday.
END
End of input; quit processing.
Data structures: You will need data structures for storing each of the following:
1. a list of bookings for each holiday. There are at most ten
holidays you need to worry about, so use an array of ten
lists. Each list is the schedule on some holiday. Each rec-
ord in a list gives the name of the person who made the
booking and the name of the magician. Each list should be
stored in alphabetical order according to the person who
made the booking.
4. If a magician quits and you put someone on the waiting list, print
I'VE PUT (person) (holiday) AT THE FRONT OF THE
WAITING LIST.
Hick Alterman
Chapter 6
Any of the programming assignments for Chapter 5 may be written using Pascal
pointer variables instead of an array implementation.
*1. This is a program for the local libraries. It contains the fiction books available at
each of several branches. Each branch keeps its books in a linked list arranged in
alphabetical order by the author's name. Books by the same author are arranged
alphabetically by title. There may be several copies of the same book in a branch.
When a book is checked out from a branch, it is deleted from the linked list for
that branch. When it is returned to a branch, it is inserted in the appropriate
position in the list by its author and title. Patrons can ask the library to find the
number of copies of a book available at a particular branch at the present time.
Occasionally remodeling is done, and the library moves all the fiction books
from branch A to branch B, where branch B never has been and never will be
remodeled. When this happens, the list of books from the branch to be remod-
eled should be merged with the list of books at the branch that will now contain
those books. If a customer should try to return a book or check out a book from
the branch being remodeled, he or she will want to go to the branch now housing
the remodeled branch's books. Hence, you should keep a pointer from the
branch being remodeled to the branch that its books were moved to. (We will not
ask you to reopen a remodeled branch.)
The list structure can be one of three major designs:
[Figures: the candidate list-structure designs, each rooted at BRANCH entries.]
The list structure under option 2 performs the operations requested more effi-
ciently than the others. However, you may use any one of the three.
The program is horribly inefficient if it starts scanning at Adams when looking
for a book by Stevenson. To avoid this problem, you can create an array (LETAR-
RAY) of 26 pointers, each of which points to the first author in the list whose
name starts with that letter. Create the array so the indices are A through Z.
Hence, when looking for Stevenson, start with the pointer at LETARRAY['S'].
You can keep one LETARRAY for each branch, or you can have one element in
LETARRAY store a pointer for each branch, in which the first points to the posi-
tion of the letter for the first branch, the next for the second branch, etc.
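A letter index of this kind, for a single branch, might be built as in the sketch below. The type and procedure names (BOOKNODE, INDEXTYPE, PUSH, BUILDINDEX) and the sample books are assumptions; the sketch also assumes that every author name begins with an upper-case letter.

```pascal
program LetterIndex;
{ Sketch only: BOOKPTR, BOOKNODE, INDEXTYPE, PUSH, BUILDINDEX, and the
  sample books are assumptions. }
type
  NAMETYPE  = packed array [1..10] of char;
  BOOKPTR   = ^BOOKNODE;
  BOOKNODE  = record
                AUTHOR, TITLE : NAMETYPE;
                NEXT : BOOKPTR
              end;
  INDEXTYPE = array ['A'..'Z'] of BOOKPTR;
var
  BOOKS, P : BOOKPTR;
  LETARRAY : INDEXTYPE;

procedure PUSH (var LIST : BOOKPTR; AUTHOR, TITLE : NAMETYPE);
{ Adds at the front; the demonstration pushes in reverse
  alphabetical order so that the list ends up sorted. }
var
  NODE : BOOKPTR;
begin
  new(NODE);
  NODE^.AUTHOR := AUTHOR;
  NODE^.TITLE := TITLE;
  NODE^.NEXT := LIST;
  LIST := NODE
end;

procedure BUILDINDEX (LIST : BOOKPTR; var LETTERS : INDEXTYPE);
{ LETTERS[C] points to the first book whose author starts with C,
  or is nil if there is no such author. }
var
  CH : char;
begin
  for CH := 'A' to 'Z' do
    LETTERS[CH] := nil;
  while LIST <> nil do
    begin
      CH := LIST^.AUTHOR[1];
      if LETTERS[CH] = nil then
        LETTERS[CH] := LIST;
      LIST := LIST^.NEXT
    end
end;

begin
  BOOKS := nil;
  PUSH(BOOKS, 'WIRTH     ', 'PROGRAMS  ');
  PUSH(BOOKS, 'STEVENSON ', 'KIDNAPPED ');
  PUSH(BOOKS, 'ADAMS     ', 'LIFE      ');
  BUILDINDEX(BOOKS, LETARRAY);
  P := LETARRAY['S'];                     { start here, not at ADAMS }
  if P <> nil then
    writeln(P^.AUTHOR, ' ', P^.TITLE)
end.
```

In the assignment itself the index must be kept current as books are inserted and deleted; rebuilding it from scratch, as here, is only the simplest way to show the idea.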
You must use Pascal pointers for this program!
Input: Every input string (the strings for (title), (author), and (branch)) is a max-
imum of 10 alphabetic characters ended with a blank. There may be
multiple blanks between strings. Every command appears on a single
line of input.
To set up each branch of the library:
BRANCH (branch)
BOOK (author) (title)
ENDBRANCHES
Several books will be read in after each branch command. There will be at most
10 branches and 200 books total. ENDBRANCHES separates the initialization
from the requests.
A customer can make one of the following requests:
These last four commands can come in any order and are terminated by an end-
of-file.
Output: 1. Echoprint the input.
2. For CHECKOUT: If the branch is being remodeled, print the mes-
sage, "This branch is temporarily closed and please go to branch
(branch2)." If there is no copy of the book at that branch, print the
message "Sorry, (title) by (author) is not in stock at branch
(branch)." If there is a copy available, print the message that "Book
(title) by (author) is available at branch (branch)."
3. For RETURN: If the branch is being remodeled, print the message
"Please return book (title) by (author) to branch (branch2)." Other-
wise, accept the book and print "Book (title) by (author) was re-
turned to branch (branch)."
4. For NUMBER: If the branch is being remodeled, print "Branch
(branch1) is being remodeled; please check with branch
(branch2)." Otherwise, find the number of books at the branch and
print "Branch (branch) has (number) copies of (title) by (author)."
5. For REMODEL: Print "Branch (branch1)'s books have been moved
to (branch2). Branch (branch2) now contains: (and print the list of
books at (branch2) in alphabetical order by author's name)."
6. At the end of the input, print out the books in each branch in the
order in which the branches were read in, the books arranged in
alphabetical order by author, with the number of books at each
branch. If the branch is being remodeled, state where the books are
now kept.
Sample:
Input:
BRANCH BURNET
BOOK ADAMS LIFE
BOOK ADAMS LIFE
BOOK ADAMS WORKS
BRANCH MANCHACA
BOOK BURNS POEMS
BOOK WIRTH PROGRAMS
BOOK FLON STRUCTURES
BRANCH MLK
BOOK AHO ALGORITHMS
BOOK ROUSSEAU LIFE
BRANCH RESEARCH
BOOK YEH CHEERS
ENDBRANCHES
RETURN EVANS DANCES MANCHACA
CHECKOUT AHO ALGORITHMS MLK
CHECKOUT AHO ALGORITHMS MLK
NUMBER AHO ALGORITHMS MLK
REMODEL BURNET TO RESEARCH
Output: Without echoprinting, output is as follows:
Book DANCES by EVANS was returned to branch MANCHACA.
Book ALGORITHMS by AHO is available at branch MLK.
*2. During the holiday season last winter, you spent hours standing in line at the
cosmetics counter of your favorite department store. You marched right up to the
president and complained. She apologized and asked if you had any suggestions
of a better way to handle the crowds at peak periods, such as Christmas, Valen-
tine's Day, Father's Day, and Mother's Day.
Being a good computer science major, you decided to take the challenge.
Here is the scheme you suggested:
When customers come into the cosmetics department of the store at peak
times, have them sign in, stating which cosmetic line they are interested in. If
that counter is free, they go right up to it. If it is not free, they put their name on
a waiting list for that cosmetic line. They then can do other shopping or browse
around until they are paged. Customers who say that any cosmetics line will do
are put on the waiting list for all the men's or all the women's lines, whichever
they stipulate. They will be paged for the first opening. (They will of course have
to be removed from all other lines when they are serviced at one.)
You convince the president that this plan not only will work but will increase
sales, because people will buy while they are waiting, rather than fume. How-
ever, before instituting this scheme, the president wants you to do a computer
simulation of this process.
This will be a simulation of a queuing system. Such a system has servers and
queues. The servers are the cosmetic counters; the queues are the people on the
waiting lists.
What we want to find out from the simulation is the average waiting time and
the average queue length for each cosmetic line.
The parameters of the system are as follows:
1. The number of servers: This store carries no more than 20 lines of cosmetics.
Some are for women and some are for men (but not both).
2. The events of the simulation and their effect on the system:
2. A sequence of names and cosmetic lines followed by a time marker.
These are repeated until an end-of-simulation marker (a period) is
encountered.
WOMEN,
MEN,
cosmetic line
2. If the desired counter is empty, assign the customer to it. If the counter is
busy, enter the customer's name in the queue.
3. If the customer's preference is WOMEN or MEN, put the customer at any
open WOMEN's or MEN's counter. If there are no open counters, put the
customer in every appropriate queue.
4. When a time marker is encountered, do the following:
(a) Decrement the service time of the customer being served at each
counter.
(b) When the service time of a customer becomes 0, that customer is finished
being served. The customer's name is inserted into the alphabetical list
of customers, and the person at the front of the queue becomes the cus-
tomer being serviced.
(c) If the person leaving the queue (starting to be serviced) had no counter
preference, that customer must also be removed from the remaining
queues.
(d) Print out the name of the person being served at each counter and the
number of people in the queue for each counter.
5. When the end-of-simulation marker is encountered, print the average wait
time for each counter, then the average queue length. This time will be
measured in minutes; i.e., each time mark is one minute.
Tips: 1. Average Queue Length: There is one such value for each cosmetic
line. To obtain this value, sum up the queue lengths at each hash mark
(#) and then divide this sum by the number of hash marks.
2. Average Wait Time: There is one such value for each cosmetic line.
This is only for those customers who have left the service counter. To
obtain this value, sum up the wait times of the customers who left the
service counter and then divide this sum by the number of customers
who left the counter.
3. Service Time and Wait Time: Service time for each customer is obtained
using the random number generator. Wait time is the time each cus-
tomer spent in a queue before entering a service counter. At each hash
mark, decrement the "time left" at each service counter by 1 and in-
crement the wait time of each customer in the queue by 1.
4. Time Interval Between Two Consecutive Hash Marks Is 1 Minute: If a
customer (service time = 5 minutes) has just reached a counter, the
customer in the front of the queue can reach this counter only after five
hash marks.
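The bookkeeping behind tips 1 and 2 can be sketched for one cosmetic line as follows. The record STATS, the routines TICK, FINISHED, and AVERAGE, and the sample figures are assumptions made for illustration.

```pascal
program QueueStats;
{ Sketch only: STATS, TICK, FINISHED, AVERAGE, and the sample
  figures are assumptions. }
type
  STATS = record
            QSUM    : integer;   { sum of queue lengths over all hash marks }
            MARKS   : integer;   { number of hash marks seen }
            WAITSUM : integer;   { total wait of customers who have left }
            SERVED  : integer    { number of customers who have left }
          end;
var
  LINE : STATS;

procedure TICK (var S : STATS; QLENGTH : integer);
{ Call once per hash mark with the current queue length. }
begin
  S.QSUM := S.QSUM + QLENGTH;
  S.MARKS := S.MARKS + 1
end;

procedure FINISHED (var S : STATS; WAIT : integer);
{ Call when a customer leaves the service counter. }
begin
  S.WAITSUM := S.WAITSUM + WAIT;
  S.SERVED := S.SERVED + 1
end;

function AVERAGE (TOTAL, COUNT : integer) : real;
begin
  if COUNT = 0 then
    AVERAGE := 0.0                        { nothing to average yet }
  else
    AVERAGE := TOTAL / COUNT
end;

begin
  LINE.QSUM := 0;
  LINE.MARKS := 0;
  LINE.WAITSUM := 0;
  LINE.SERVED := 0;
  TICK(LINE, 2);
  TICK(LINE, 3);
  TICK(LINE, 1);
  FINISHED(LINE, 0);
  FINISHED(LINE, 4);
  writeln('AVERAGE QUEUE LENGTH: ', AVERAGE(LINE.QSUM, LINE.MARKS):1:2);
  writeln('AVERAGE WAIT TIME:    ', AVERAGE(LINE.WAITSUM, LINE.SERVED):1:2)
end.
```

With these sample figures both averages are 2.00 (6/3 and 4/2). The guard in AVERAGE covers a line that saw no hash marks or served no customers.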
Sample data:
*3. You have been hired to write a program to manage the ticket office of a theater.
Your program will sell tickets and assign people seats, based on input containing
ticket requests and cancellations of ticket orders. Periodically, you are asked to
display the current status of all tickets and customers.
Input/output: The input consists of two different parts. The first part consists of
two numbers-the number of rows in the theater and the num-
ber of seats per row. The rows are numbered starting at 1. You
may assume there are at most 100 rows. Seats are numbered from
left to right. You may not assume any bound on the number of
seats per row.
The second part of the input contains ticket requests and can-
cellations. There are four different commands (see below). Com-
mands are free format; an arbitrary number of blanks and end-of-
lines may occur between fields in a command. Names in a com-
mand are at most 10 characters long.
Commands:
REQUEST (person) (number)
The given person requests the indicated number of tickets. Further, our theater-
goers are a little finicky; they require that all tickets in each particular request be
for consecutive seats in the same row. In addition, they must be given the row
with the smallest possible number. If there are several blocks in the row that are
large enough, the theater-goers must be given the seats that are as far to the left
as possible (see an example later). You must follow this seat allocation policy. If
there is no block of available seats large enough, the person's request goes onto a
waiting list. It is possible that later cancellations will free up a suitable block of
seats. After processing the person's request, print a message indicating which
seats (if any) he or she was allocated or whether the request was put on the
waiting list.
CANCEL (person)
The given person cancels a ticket request. If this person has made several re-
quests, all of them are canceled. If this person is on the waiting list, his or her
name is crossed off the list. If this person has been allocated a block of tickets,
the tickets are freed. This may allow people on the waiting list to be serviced.
Before reading in any more requests, we go down the waiting list, allocating
tickets until someone cannot be serviced (or until we reach the end of the list). If
someone cannot be serviced, we do not skip over him or her and look at people
lower on the list; we read in the next request instead. After processing the can-
cellation, print out a message that the person's order was canceled and which (if
any) people were allocated seats.
STATUS
Print out in a readable format: (1) the names of the people who have been allo-
cated seats and which seats they were allocated. (This list does not have to be
sorted; the people can occur in any order. If a person has several blocks of tick-
ets, they do not have to be printed out consecutively.) (2) the names of people on
the waiting list and the sizes of their requests.
END
End of input; quit processing.
(Note that Pascal pointers must be used.) This section suggests possible data
structures and outlines some restrictions on the data structures you may use.
1. Storing what people have been given which tickets. You may not use an array
for this. Use a linked list in which each block in the list contains the person's
name, the row number of his or her tickets, the number of the first and last seat
in the block, and a pointer to the next block in the list. If a person has made
several requests, just have separate blocks for each. A more complicated struc-
ture in which the person only occurs once is not necessary (but may be used).
2. Storing what seats are available. For this you may not use a big two-dimen-
sional array or any other structure requiring an excessive amount of space. A
possible structure is an array of 100 lists, in which the ith list describes the
seats available in row i. However, a list may not have one entry per seat (this
would be as bad as using a two-dimensional array). Seats must be grouped
together into blocks of adjacent seats as shown below.
[Figure: a theater of 3 rows and 7 seats, with X marking seats that are not available, beside the list of available-seat blocks for each row.]
Each record in the ith list describes a block of available adjacent seats in that
row. For example, the first block in the list for row 1 says that seats 2 through 6
are available. Each list is sorted in order of increasing seat number. You may
find it useful to use a doubly linked list and/or headers and trailers.
3. Storing the waiting list. You may use an array to store the waiting list (which
is actually a queue) in sequential allocation. If you do, you must check for
overflow. You may use another linked list instead if you wish.
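The seat allocation policy (lowest-numbered row, leftmost block that is large enough) applied to the block lists of item 2 might look like the sketch below. It covers allocation only; the names SEATBLOCK, AVAIL, and ALLOCATE are assumptions, and the lists here are singly linked with no header or trailer.

```pascal
program SeatBlocks;
{ Sketch only: BLOCKPTR, SEATBLOCK, AVAIL, and ALLOCATE are assumptions. }
const
  MAXROWS = 100;
type
  BLOCKPTR  = ^SEATBLOCK;
  SEATBLOCK = record
                FIRST, LAST : integer;    { seats FIRST..LAST are available }
                NEXT : BLOCKPTR
              end;
var
  AVAIL : array [1..MAXROWS] of BLOCKPTR;
  ROWS, R, ROW, SEAT : integer;

procedure ALLOCATE (COUNT : integer; var ROW, SEAT : integer);
{ On return, ROW = 0 means that no block is large enough; otherwise
  seats SEAT..SEAT+COUNT-1 of row ROW have been taken. }
var
  R : integer;
  PREV, CUR : BLOCKPTR;
  FOUND : boolean;
begin
  ROW := 0;
  SEAT := 0;
  R := 1;
  FOUND := false;
  while (R <= ROWS) and not FOUND do
    begin
      PREV := nil;
      CUR := AVAIL[R];
      while (CUR <> nil) and not FOUND do
        if CUR^.LAST - CUR^.FIRST + 1 >= COUNT then
          FOUND := true
        else
          begin
            PREV := CUR;
            CUR := CUR^.NEXT
          end;
      if not FOUND then
        R := R + 1
    end;
  if FOUND then
    begin
      ROW := R;
      SEAT := CUR^.FIRST;
      CUR^.FIRST := CUR^.FIRST + COUNT;   { take seats from the left end }
      if CUR^.FIRST > CUR^.LAST then      { block used up: unlink it }
        begin
          if PREV = nil then
            AVAIL[R] := CUR^.NEXT
          else
            PREV^.NEXT := CUR^.NEXT;
          dispose(CUR)
        end
    end
end;

begin
  ROWS := 3;
  for R := 1 to ROWS do                   { every row starts as one block 1..7 }
    begin
      new(AVAIL[R]);
      AVAIL[R]^.FIRST := 1;
      AVAIL[R]^.LAST := 7;
      AVAIL[R]^.NEXT := nil
    end;
  ALLOCATE(4, ROW, SEAT);
  writeln('ROW ', ROW:1, ' SEAT ', SEAT:1);
  ALLOCATE(5, ROW, SEAT);
  writeln('ROW ', ROW:1, ' SEAT ', SEAT:1);
  ALLOCATE(3, ROW, SEAT);
  writeln('ROW ', ROW:1, ' SEAT ', SEAT:1)
end.
```

In this run the requests for 4, 5, and 3 seats are placed at row 1 seat 1, row 2 seat 1, and row 1 seat 5. A cancellation must do the reverse: insert the freed block in seat order and combine it with any adjacent block, which is where a doubly linked list or headers and trailers may help.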
Sample:
Input:
3 7
REQUEST WAGNER 4
REQUEST BRAHMS 5
REQUEST WAGNER 4
REQUEST BEETHOVEN 1
REQUEST HANDEL 3
REQUEST LISZT 7
REQUEST WAGNER 4
REQUEST BERLIOZ 3
STATUS
CANCEL WAGNER
STATUS
REQUEST VIVALDI 2
CANCEL BEETHOVEN
REQUEST BACH 4
STATUS
CANCEL BRAHMS
STATUS
END
Selected output: The output of the STATUS commands is shown below. (The
diagrams of the theater are for purposes of illustration; you
are not required to print them.)
3 X X X X X X X
2 X X X X X
1 X X X X X
  1 2 3 4 5 6 7
WAITING LIST:
LISZT 7
WAGNER 4
BERLIOZ 3
3         X X X
2 X X X X X
1         X
  1 2 3 4 5 6 7
WAITING LIST:
LISZT 7
BERLIOZ 3
3         X X X
2 X X X X X
1 X X X X X X
  1 2 3 4 5 6 7
WAITING LIST:
LISZT 7
BERLIOZ 3
3 X X X   X X X
2 X X X X X X X
1 X X X X X X
  1 2 3 4 5 6 7
WAITING LIST:
Jim Bitner
*4. You are to write a genealogy program that will accept data about the various
relations among a set of people and, when requested, print out information about
the relations.
Sample output: The output below reflects the input statements and commands
by echoprinting the input. The information output as a result of
PRINT is indented below the command.
*5. Your assignment is to write a program for a computer dating service. (Assignment
1, Chapter 1, is a simplified version of this problem.) Each customer will give
you his or her name and a list of phrases (which we will call attributes) that
describe the person and his or her interests. It will be your job to maintain lists of
men and women using the service and to match up compatible people (those
with similar qualities and interests).
Data structures: The problem requires you to maintain three lists: one of un-
matched men, one of unmatched women, and a third of pairs
of people who have been matched. (Actually, it may be sim-
plest to store the pairs as two parallel lists.) When a person
starts using your service, he or she is put on one of the un-
matched lists. Later, he or she may be put on the list of
matched pairs (if a match is found). For simplicity, you will
assume that once a person is put on the list of matched pairs,
he or she remains there forever. (The list of matched pairs is
maintained so that you can give prospective clients a list of
satisfied customers.)
Basic Components of the commands: The basic components of the input file
are words, strings, and lists of attributes.
These are defined as follows:
1. A word is a sequence of letters. (Words are like reserved words in Pascal and
tell us what kind of data is coming.) A word is at most 20 characters long.
2. A string is a sequence of letters, digits, and blanks. It must start with a letter,
and therefore any blanks before the string are not part of the string. It can,
however, have embedded blanks, which are considered significant (i.e.,
"JOE GREEN" is different from "JOE  GREEN"). Each name and attri-
bute will be a string. The maximum length for a string is 20 characters.
3. A list of attributes is a sequence of strings separated by commas and termi-
nated by a period. A list is at most ten items long and will contain no dupli-
cate attributes.
Important: The list of attributes will be sorted to make it easier to tell if two
people are compatible.
Commands: The following describes the seven possible commands to the sys-
tem. Commands are separated by an arbitrary number of blanks
and end-of-lines. The syntax diagrams at the end of the assignment
indicate the parameters of each command.
In addition to providing the output described below, you must echoprint each
command as it is read in. A blank line should be output after each command to
improve readability.
ADD
The specified person should be added to either the list of men or the list of
women, whichever is appropriate. You may assume no two people will have
the same name.
DELETE
The specified person has become dissatisfied with our service, and his or
her name should be deleted from the appropriate list. You are guaranteed
that the person is actually in the list.
MATCHUP
This command requires you to find all compatible pairs of unmatched peo-
ple in the database and put them on the pairs list. (People must be of oppo-
site sex.) A pair is compatible if they have at least three attributes in com-
mon. If there are several ways of pairing up compatible people, you may
choose any way you like as long as two compatible people are not left on the
unpaired lists after the matching has been done. (Remember that the attri-
bute lists are sorted. Therefore, testing to see if two people are compatible
can be done much more quickly.)
PRINT
Print the names and attributes of all the unmatched people of the given sex
who match the given list of attributes. (This service is provided to people for
an extra charge.) Of course, match means having at least three attributes that
are on the list. The list of attributes can be printed in any format (vertically or
horizontally), as long as it is understandable.
PRINTALL
Print the names and attributes of all men, women, or matched people (de-
pending on the word after PRINTALL). The list of pairs may be printed with
the men first, followed by all the women (or vice versa), or with the men and
women interleaved.
DUMP
Print the names and attributes of all people in the database. Label each list
you output so that it is clear to which class people belong. (Note: This is
useful for debugging.)
END
Stop processing.
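The MATCHUP and PRINT commands both rest on counting the attributes that two sorted lists have in common. Because the lists are sorted, one pass over both suffices, as in the sketch below. The names STRING20, ATTRLIST, and COMMON and the sample attributes are assumptions; the attribute lists are held in arrays here only to keep the example short.

```pascal
program Compatible;
{ Sketch only: STRING20, ATTRLIST, COMMON, and the sample
  attributes are assumptions. }
const
  MAXATTR = 10;
type
  STRING20 = packed array [1..20] of char;
  ATTRLIST = record
               COUNT : integer;
               ITEM  : array [1..MAXATTR] of STRING20
             end;
var
  HIS, HERS : ATTRLIST;
  N : integer;

function COMMON (var A, B : ATTRLIST) : integer;
{ Both lists must be in ascending order. Advance past whichever
  current item is smaller; equal items count as a match. }
var
  I, J, N : integer;
begin
  I := 1;
  J := 1;
  N := 0;
  while (I <= A.COUNT) and (J <= B.COUNT) do
    if A.ITEM[I] = B.ITEM[J] then
      begin
        N := N + 1;
        I := I + 1;
        J := J + 1
      end
    else if A.ITEM[I] < B.ITEM[J] then
      I := I + 1
    else
      J := J + 1;
  COMMON := N
end;

begin
  HIS.COUNT := 4;
  HIS.ITEM[1] := 'HATES SNOW          ';
  HIS.ITEM[2] := 'LIKES JAZZ          ';
  HIS.ITEM[3] := 'OWNS A CAT          ';
  HIS.ITEM[4] := 'PLAYS GOLF          ';
  HERS.COUNT := 4;
  HERS.ITEM[1] := 'LIKES FILM          ';
  HERS.ITEM[2] := 'LIKES JAZZ          ';
  HERS.ITEM[3] := 'OWNS A CAT          ';
  HERS.ITEM[4] := 'PLAYS GOLF          ';
  N := COMMON(HIS, HERS);
  if N >= 3 then
    writeln(N:1, ' IN COMMON: COMPATIBLE')
  else
    writeln(N:1, ' IN COMMON: NOT COMPATIBLE')
end.
```

The two sample lists share three attributes, so the pair is reported as compatible. With lists of at most ten items each, the loop makes at most twenty steps, compared with up to a hundred comparisons for unsorted lists.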
Sample output: The upper-case letters show the original input file. They have
been echoprinted.
[Syntax diagrams for (string), (list of attributes), and (command).]
Sample input:
Chapter 7
*1. A toy that many children play with is a base with three pegs and five disks of
different diameters. The disks begin on one peg, with the largest disk on the
bottom and the other four disks added on by order of size. The idea is to move the
disks from the peg they are on to another peg by moving only one disk at a time
and without ever putting a larger disk on top of a smaller one.
This child's toy is actually an example of a classic mathematical puzzle called
the Towers of Hanoi problem.
Write a recursive solution to this problem. Yes, one exists. It may take you a
while to see the solution, but the program itself is quite short.
*2. Another classic problem that lends itself to a recursive solution is the Eight
Queens problem. The problem is to place eight queens on a chess board in such
a way that no queen is attacking any other queen.
Represent a chess board as an 8 x 8 array of Boolean. If a square is occupied
by a queen, the position is TRUE. Otherwise the square is FALSE. The status of
the chess board when all eight queens have been placed is the solution.
3. The maze problem in Chapter 3 illustrated the use of the stack data structure.
Rewrite the same problem using a recursive algorithm.
*4. In this program you will write a function that computes an approximation to a
definite integral that is within a given tolerance of the exact answer. You will also
write a sorting procedure that will help monitor the behavior of the integration
function. You will then test your function on several prescribed integrals.
Test data: Test your integration function by computing the following integrals:
f             A      B     EPS     MAXFUN
1             0.0    1.0   10^-3   5
x            -1.0    3.0   10^-5   3
x            -1.0    3.0   10^-5   2
e^(-x*x)      0.0    5.0   10^-3   1000
(i-x)^(1/3)   0.0    1.0   10^-2   10
(i-x)^(1/3)   0.0    1.0   10^-3   1000
4
0.0 1.0 10- 3 1000
Note: Standard Pascal does not require that an exponential function be pro-
vided. If your version of Pascal does not include an exponential function,
you should enter the code for an exponential function as a procedure in
the main program or as an external function.
Also, most exponential functions will not allow you to raise a negative
number to a real number power. This problem occurs with the fifth and
sixth functions above. However, you can take the absolute value of the
base, raise it to a power, and then append the correct sign.
Alan Cline and
David Scott
asterisk is encountered, you should make a copy of the tree as it is at that time
and print the copy.
8. You are to write a program that creates and maintains a pair of binary trees, OAK
and ELM.
Data structures: You will need two binary trees, OAK and ELM. Each node on
the trees contains the following information: name, phone
number, and pointer fields. The nodes on each tree are to be
ordered by name.
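A node layout and an insertion routine for such trees might be sketched as follows. The type names, the procedures ADD and PRINT as written here, and the sample entries are assumptions; deletion, which is the harder operation, is not shown.

```pascal
program PhoneTrees;
{ Sketch only: NAMETYPE, PHONETYPE, TREENODE, and the sample
  entries are assumptions. }
type
  NAMETYPE  = packed array [1..20] of char;
  PHONETYPE = packed array [1..10] of char;
  TREEPTR   = ^TREENODE;
  TREENODE  = record
                NAME  : NAMETYPE;
                PHONE : PHONETYPE;
                LEFT, RIGHT : TREEPTR
              end;
var
  OAK, ELM : TREEPTR;

procedure ADD (var TREE : TREEPTR; WHO : NAMETYPE; NUMBER : PHONETYPE);
{ Recursive insertion ordered by name; names are assumed unique. }
begin
  if TREE = nil then
    begin
      new(TREE);
      TREE^.NAME := WHO;
      TREE^.PHONE := NUMBER;
      TREE^.LEFT := nil;
      TREE^.RIGHT := nil
    end
  else if WHO < TREE^.NAME then
    ADD(TREE^.LEFT, WHO, NUMBER)
  else
    ADD(TREE^.RIGHT, WHO, NUMBER)
end;

procedure PRINT (TREE : TREEPTR);
{ An inorder traversal lists the names alphabetically. }
begin
  if TREE <> nil then
    begin
      PRINT(TREE^.LEFT);
      writeln(TREE^.NAME, ' ', TREE^.PHONE);
      PRINT(TREE^.RIGHT)
    end
end;

begin
  OAK := nil;
  ELM := nil;
  ADD(OAK, 'JOE GREEN           ', '438-5555  ');
  ADD(OAK, 'ANN BROWN           ', '471-0000  ');
  ADD(OAK, 'TOM WHITE           ', '452-1234  ');
  ADD(ELM, 'SUE BLACK           ', '438-5555  ');
  writeln('OAK');
  PRINT(OAK);
  writeln('ELM');
  PRINT(ELM)
end.
```

Passing the tree pointer as a var parameter lets ADD attach the new node directly to its parent's LEFT or RIGHT field without a separate search for the parent.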
Input: The input to the program is a series of commands. All input is free for-
mat, with an arbitrary number of blanks and end-of-lines occurring any-
where.
Basic command components:
1. A word is a sequence of at most 10 nonblank characters, delimited by a
blank. Words include phone numbers (e.g., 438-5555) and tree names (OAK
or ELM).
2. A string is a sequence of at most 20 characters, delimited by a period or
comma, including blanks, beginning with a letter. Leading blanks should be
skipped, but embedded blanks are considered to be significant. Names are
strings.
Commands:
These are the commands to the system. In addition to generating the specific
output required, you must echoprint each command as it is read in. A blank line
should be output after each command to improve readability.
ADD (treename) (name), (phonenumber)
The specified person should be added to the tree designated by (treename). You
may assume that no two people have the same name.
DELETE (treename) (name)
Delete the specified (name) from the designated tree. You may assume that the
person is actually on the tree.
CHECK (treename) (name)
Check whether the specified (name) is on the designated tree, and print an ap-
propriate message.
MATCHCHECK (name1), (name2)
Check to see if (name1) (in the ELM tree) and (name2) (in the OAK tree) have the
same phone number. If so, delete them both from their respective trees. You
cannot assume that the names are in the tree. If one or the other is not found,
print an appropriate message.
PRINT (treename)
Print the names and phone numbers of all the nodes in the designated tree. The
names should be ordered alphabetically.
DUMP
Print all the names and phone numbers in both trees. Each tree should have an
appropriate heading and have its names in alphabetical order.
END
Stop processing.
Input syntax diagram:
[Syntax diagram: STRING (maximum 20 characters), preceded by optional blanks.]
Sample data:
*9. Your assignment is to write a program for a police department that has collected a
database of information on various suspects for a given crime. (Luckily for you,
the department is only investigating one crime at a time.) Each suspect has a set
of attributes, such as shifty eyes, a limp, a parrot on shoulder, etc. The maximum
number of such attributes for any suspect is not known. Your program will accept
commands to manipulate the database in order to narrow down the list of sus-
pects, in the hope of pinpointing the villain.
Input: The input consists of the initial database followed by *, then a set of
inquiries, each separated by *, and finally terminated by *. This is
shown in the following diagram:
initial database
inquiry 1
inquiry 2
inquiry N
*
An inquiry consists of a set of commands about a single crime. At the end of
each inquiry, the crime is assumed to be solved. We begin working on a new
crime with the next inquiry. Therefore, we start over with the entire original list
of suspects (as given in the initial database).
Assume that each * is preceded by at least one blank space. The syntax of the
initial database and of the inquiries is given in the subsequent sections.
Initial database: The initial database consists of the attributes for a list of sus-
pects.
(attribute) sequence of 20 or fewer nonblank characters.
(name) sequence of 20 or fewer nonblank characters. (You may assume that
no name is the same as any attribute.)
[Syntax diagram: the initial database is a sequence of entries, each consisting of SUSPECT, a name, and one or more attributes.]
The set of attributes for each suspect starts with SUSPECT, followed by the
suspect's name and attributes.
Inquiry:
inquiry
Examples of a tip-info:
Sample data:
SUSPECT
QUICKDRAW-MCGRAW TALKS-WITH-DRAWL WALKS-WITH-LIMP
HAS-LONG-HAIR
SUSPECT
TWINGUN-MORGAN TALKS-WITH-LISP IS-BEARDED SMOKES-CIGARS
SUSPECT
JACKDA-RIPPER WALKS-WITH-LIMP BITES-FINGERNAILS
CARRIES-KNIFE HAS-LONG-HAIR
SUSPECT
SON-OF-SAM TALKS-WITH-LISP IS-BEARDED EATS-FRITOS
SMOKES-CIGARS
SUSPECT
SLOWDRAWL-RAUL TALKS-WITH-DRAWL CARRIES-KNIFE
HAS-LONG-HAIR EATS-FRITOS
SUSPECT
SLOAN-DE-UPTAKE WALKS-WITH-LIMP IS-BEARDED
BITES-FINGERNAILS HAS-LONG-HAIR
*
TIP THE CRIMINAL TALKS-WITH-LISP
TIP THE CRIMINAL HAS-LONG-HAIR
CHECK QUICKDRAW-MCGRAW
CHECK SON-OF-SAM
PRINT
*
TIP THE CRIMINAL SMOKES-CIGARS
PRINT
TIP THE CRIMINAL IS-BEARDED
CHECK TWINGUN-MORGAN
TIP THE CRIMINAL TALKS-WITH-LISP
PRINT
*
TIP THE CRIMINAL TALKS-WITH-DRAWL
CHECK SLOWDRAWL-RAUL
TIP THE CRIMINAL HAS-LONG-HAIR
CHECK SLOAN-DE-UPTAKE
PRINT
*
TIP THE CRIMINAL BITES-FINGERNAILS
CHECK TWINGUN-MORGAN
TIP THE CRIMINAL WALKS-WITH-LIMP
CHECK SLOWDRAWL-RAUL
TIP THE CRIMINAL HAS-LONG-HAIR
PRINT
*
PRINT
TIP THE CRIMINAL SMOKES-CIGARS
PRINT
TIP THE CRIMINAL IS-BEARDED
PRINT
TIP THE CRIMINAL BITES-FINGERNAILS
PRINT
*
Jim Bitner
10. There is a real program developed by a computer company that reads in written
reports, issues warnings on bad style, and partially corrects the style. This as-
signment is to create a simplified version of this program. It will contain a tree
of words that can be annoying if used in a report. The program will caution the
writer on these tendencies, and then correct the report. The output will be the
original text, the corrected text, a list of slight tendencies the writer has (1 to 4
occurrences of an annoying word), and a list of extreme tendencies that the
author should avoid (5 or more occurrences of an annoying word). These last
two lists will be printed in alphabetical order.
After an author sees which words the program has changed, the author may
decide to tell the program not to change certain words using DELETEWORD
(badword). This command will remove (badword) from the search tree so that
all future input paragraphs will not have this word replaced in the paragraph.
The author may also find other words annoying, and will tell the program to add
them to the list by ADDWORD (word), followed on the same line by several
synonyms to insert in place of this word. ADDWORD (badword) (syn1) (syn2)
... adds the new (badword) to the search tree, and the computer will replace
any occurrence of that word in a future input paragraph with some synonym
given for that word.
The program's main data structure holds the annoying words and is arranged
as a search tree (we term this BADTREE). BADTREE is complicated by
the fact that when the program finds an annoying word in the text, it must
replace it by a more acceptable synonym. It is nearly as annoying to have the
same word repeated every time for the annoying word, and so the program has a
list of several synonyms to use for every node in BADTREE (to a maximum of
5 synonyms). These are arranged in a circular list, so that the program cycles the
words around for every occurrence of a particular annoying word (see example).
There are at most 20 annoying words in BADTREE at any one time.
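One way to picture a BADTREE node and its cycling synonyms is sketched below in Python. The names BadNode, insert, and find are hypothetical, and the circular list is modeled here by an index that wraps around rather than by a linked ring of nodes, which is a substitution made for brevity.

```python
class BadNode:
    """One BADTREE node: an annoying word, its synonyms, and a use count."""

    def __init__(self, word, synonyms):
        self.word = word
        self.synonyms = list(synonyms)[:5]  # at most 5 synonyms per word
        self.next_index = 0                 # current position in the circle
        self.count = 0                      # occurrences replaced so far
        self.left = None
        self.right = None

    def next_synonym(self):
        """Return the current synonym and advance around the circle.

        Assumes at least one synonym was supplied for the word.
        """
        synonym = self.synonyms[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.synonyms)
        self.count += 1
        return synonym


def insert(root, word, synonyms):
    """Insert into the search tree ordered by word; return the subtree root."""
    if root is None:
        return BadNode(word, synonyms)
    if word < root.word:
        root.left = insert(root.left, word, synonyms)
    elif word > root.word:
        root.right = insert(root.right, word, synonyms)
    return root                             # duplicates are ignored


def find(root, word):
    """Binary search tree lookup; returns None if word is not in the tree."""
    while root is not None and root.word != word:
        root = root.left if word < root.word else root.right
    return root


tree = None
tree = insert(tree, "grungy", ["dirty", "soiled", "grimy", "encrusted"])
tree = insert(tree, "awesome", ["amazing", "incredible"])
node = find(tree, "grungy")
replacements = [node.next_synonym() for _ in range(5)]
```

Five successive occurrences of grungy yield dirty, soiled, grimy, encrusted, and then dirty again, which is the cycling behavior the assignment describes.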
Input:
1. A list of annoying words, in which the first word on each line is the annoying
word, followed by a maximum of 5 synonyms. Each word is a maximum of 15
characters and is terminated by a blank. There may be multiple blanks
between words. The entry for an annoying word is ended by an EOLN.
2. A paragraph of text over several lines, in which words are separated from
one another by blanks, commas, periods, or quotes. There need not be a blank
before the word at the beginning of the line. ENDPARAGRAPH appears on
a separate line to end the paragraph.
3. Any number of the following two commands in any order, one per line (a
blank ends the command):
ADDWORD (badword) (syn1) (syn2) ...
DELETEWORD (badword)
Output:
1. After BADTREE has been initialized, print out the badwords and
their synonyms in alphabetical order by badword.
2. For a paragraph of text:
(a) Echoprint the input text.
(b) Print the corrected output text.
(c) Print
SLIGHT TENDENCIES TO USE ANNOYING WORDS:
(a list in alphabetical order of the words used 1 to 4 times)
EXTREME OVERUSE OF ANNOYING WORDS:
(a list in alphabetical order of the words used 5 or more
times)
3. For the list of commands altering BADTREE, echoprint the input.
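The two tendency lists can be produced from the per-word occurrence counts as sketched below. This Python fragment is illustrative only; classify_tendencies is a hypothetical name, and it assumes the thresholds given in the assignment statement (1 to 4 occurrences is slight, 5 or more is extreme), which is also what the sample output reflects.

```python
def classify_tendencies(counts, extreme_from=5):
    """Split {word: occurrences} into two alphabetically sorted lists.

    Words used 1 .. extreme_from-1 times are 'slight' tendencies; words
    used extreme_from or more times are 'extreme'. Unused words are omitted.
    """
    slight = sorted(w for w, n in counts.items() if 1 <= n < extreme_from)
    extreme = sorted(w for w, n in counts.items() if n >= extreme_from)
    return slight, extreme


slight, extreme = classify_tendencies({"grungy": 5, "awesome": 2, "teeny": 0})
```

In a tree-based solution the same alphabetical order falls out of an inorder traversal of BADTREE, so no separate sort is needed.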
Sample:
Input:
grungy dirty soiled grimy encrusted
awesome amazing incredible
teeny small tiny
ENDWORDS
The apartment was so grungy it was totally awesome. The sofa
was grungy, the floor was grungy, the fridge was grungy, and
even the grass outside was grungy. When I think that could
have been where I would live this year, all I could say was,
'Totally awesome.'
ENDPARAGRAPH
ADDWORD fridge refrigerator
DELETEWORD awesome
ENDALTER
The apartment was so grungy it was totally awesome. The sofa
was grungy, the floor was grungy, the fridge was grungy, and
even the grass outside was grungy. When I think that could
have been where I would live this year, all I could say was,
'Totally awesome.'
ENDPARAGRAPH
Output:
Annoying words and synonyms:
awesome amazing incredible
grungy dirty soiled grimy encrusted
teeny small tiny
(echoprint of input paragraph)
The apartment was so dirty it was totally amazing. The sofa
was soiled, the floor was grimy, the fridge was encrusted,
and even the grass outside was dirty. When I think that could
have been where I would live this year, all I could say was,
'Totally incredible.'
SLIGHT TENDENCIES TO USE ANNOYING WORDS:
awesome
EXTREME OVERUSE OF ANNOYING WORDS:
grungy
ADDWORD fridge refrigerator
DELETEWORD awesome
(echoprint of input paragraph)
The apartment was so dirty it was totally awesome. The sofa
was soiled, the floor was grimy, the refrigerator was
encrusted, and even the grass outside was dirty. When I think
that could have been where I would live this year, all I could
say was, 'Totally awesome.'
SLIGHT TENDENCIES TO USE ANNOYING WORDS:
fridge
EXTREME OVERUSE OF ANNOYING WORDS:
grungy
Cael Btll'klL'~'
You can see that this is a recursive process, since DIFF is defined in terms of other
uses of DIFF. The base cases are DIFF(C) = 0 and DIFF(X) = 1 (when X is the
variable with respect to which you are differentiating).
The recursive DIFF function itself is similar to the function that evaluates an ex-
pression stored in a binary expression tree. The basic processing, given a pointer to
a node in the tree, PTR, and a variable with respect to which we are differentiating,
V, is:
IF INFO(PTR) = V
  THEN
    DIFF ← 1   (* Rule 1 *)
  ELSE
    IF INFO(PTR) = a constant OR
       INFO(PTR) = another variable
      THEN
        DIFF ← 0   (* Rule 2 *)
      ELSE
S / 0 = 'DIVISION BY ZERO'
0 / 0 = 'UNDEFINED'
Basically your program has three main tasks per expression/variable pair.
1. Build a binary expression tree representing the expression.
2. Differentiate the expression in the tree with respect to the variable.
3. Simplify the expression representing the derivative and print result.
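The second task, the recursive differentiation of an expression tree, can be sketched as follows in Python. The names Node, diff, and to_infix are hypothetical. The two base cases are the ones stated above; the operator cases shown are only the standard sum, difference, and product rules of calculus and are not copied from the assignment's own rule table.

```python
class Node:
    """Binary expression tree node; leaves hold a constant or variable name."""

    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def diff(ptr, v):
    """Return a new tree for the derivative of ptr with respect to v."""
    if ptr.left is None and ptr.right is None:
        # Base cases: DIFF(X) = 1 for the variable v, DIFF(C) = 0 otherwise.
        return Node("1") if ptr.info == v else Node("0")
    d_left, d_right = diff(ptr.left, v), diff(ptr.right, v)
    if ptr.info in ("+", "-"):
        # (f + g)' = f' + g'   and   (f - g)' = f' - g'
        return Node(ptr.info, d_left, d_right)
    if ptr.info == "*":
        # (f * g)' = f * g' + f' * g
        return Node("+",
                    Node("*", ptr.left, d_right),
                    Node("*", d_left, ptr.right))
    raise ValueError("operator not handled in this sketch: " + ptr.info)


def to_infix(ptr):
    """Fully parenthesized infix form, for printing an unsimplified tree."""
    if ptr.left is None and ptr.right is None:
        return ptr.info
    return "(" + to_infix(ptr.left) + " " + ptr.info + " " + to_infix(ptr.right) + ")"


# X * X + 3, differentiated with respect to X
expr = Node("+", Node("*", Node("X"), Node("X")), Node("3"))
result = to_infix(diff(expr, "X"))
```

The result here is (((X * 1) + (1 * X)) + 0), which is correct but cluttered; that clutter is the reason the third task, simplification, is needed before printing.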
1. The object of this programming assignment is twofold. First, you are to compare
the relative performance of different sorting algorithms on the same data set.
Second, you are to compare the relative performance of the same algorithm on
two different data sets.
Five sorting algorithms are to be tested. You are to code and run the following
sorts:
1. Insertion sort. You are to run two different versions of the standard insertion
sort. One will use a singly linked list and the other will use an array imple-
mentation with elements being shifted down as necessary.
2. Binary tree sort.
3. Quicksort. You may use either a recursive or a nonrecursive version of quick-
sort. Be sure to indicate which it is on the output.
4. Any sort of your choice. This can be any sort you choose. It can even be the
other version of quicksort if you wish.
You must include a counter in the inner loop of each sort which counts comparisons.
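As an illustration of where such a counter might sit, here is a Python sketch of the array version of insertion sort. The function name and the convention of counting one comparison per key test in the inner loop are assumptions; other counting conventions are possible as long as they are applied consistently to every sort.

```python
def insertion_sort(values):
    """Array insertion sort; returns (sorted copy, number of key comparisons)."""
    data = list(values)
    comparisons = 0
    for i in range(1, len(data)):
        item = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one key comparison per inner-loop test
            if data[j] <= item:
                break                 # insertion point found
            data[j + 1] = data[j]     # shift the larger element down
            j -= 1
        data[j + 1] = item
    return data, comparisons


ordered, count_sorted = insertion_sort([1, 2, 3])
_, count_reversed = insertion_sort([3, 2, 1])
```

Already sorted input of three values costs 2 comparisons, while reversed input costs 3, which hints at the kind of difference the summary table is meant to expose.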
Input: Two files of integers to be sorted. A maximum of 100 integers in the first
data set and a maximum of 1000 integers in the second data set.
Output: The following output should be repeated for each sort:
1. The name of the sort
2. Echoprint of the input
Your final output should be a summary table that lists the type of sort and number of
comparisons, by data set.
1. The object of this assignment is twofold. First, you are to compare the relative
performance of different searching algorithms on the same data set. Second, you
are to compare the performance of the same algorithm on data sets of different
sizes.
Code the following three search strategies:
1. linear search in an unordered list
2. linear search in an ordered list
3. binary search
INPUT: Create a data set of 100 integers. Do ten searches with each algorithm.
Be sure the searches include values not in the list as well as those in
the list.
Create a second data set made up of three different sets of data to be searched.
The first set of data should have 6 values, the second 50, and the third 150. Run
each routine with five searches within each of the three data sets.
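The three search strategies, each returning a success flag and a comparison count, might look like the following Python sketch. The function names are hypothetical, and the count charges one comparison per list element examined, which is an assumed convention.

```python
def linear_search_unordered(data, target):
    """Examine every element until the target is found or the list ends."""
    comparisons = 0
    for value in data:
        comparisons += 1
        if value == target:
            return True, comparisons
    return False, comparisons


def linear_search_ordered(data, target):
    """data must be ascending; stop at the first value >= target."""
    comparisons = 0
    for value in data:
        comparisons += 1
        if value >= target:
            return value == target, comparisons
    return False, comparisons


def binary_search(data, target):
    """data must be ascending; halve the search range on every probe."""
    comparisons = 0
    low, high = 0, len(data) - 1
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1
        if data[mid] == target:
            return True, comparisons
        if data[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False, comparisons


data = [2, 4, 6, 8, 10, 12]
found_linear = linear_search_unordered(data, 8)
missing_ordered = linear_search_ordered(data, 5)
found_binary = binary_search(data, 10)
```

Searching for a value that is absent shows the benefit of ordering: the ordered linear search can stop early, whereas the unordered version must scan the whole list.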
2. Take the data from the previous assignment and use the division method of hash-
ing to store the data values. Use table sizes of 7, 51, and 151. Use the linear
method of collision resolution. Print out the tables after the data have been
stored. Run the same five searches within each of the three data sets, counting
the number of comparisons necessary. Print out the number of comparisons nec-
essary in each case.
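A minimal Python sketch of the division method with linear collision resolution follows. The names hash_insert and hash_search are hypothetical, and the count includes only comparisons against stored keys, not the tests for an empty slot; that convention is an assumption.

```python
EMPTY = None  # marker for an unused table slot


def hash_insert(table, key):
    """Store key at key mod table size, probing linearly on collision."""
    size = len(table)
    index = key % size                    # division method
    for _ in range(size):
        if table[index] is EMPTY:
            table[index] = key
            return index
        index = (index + 1) % size        # linear probe, wrapping around
    raise OverflowError("hash table is full")


def hash_search(table, key):
    """Return (found, key comparisons) using the same probe sequence."""
    size = len(table)
    index = key % size
    comparisons = 0
    for _ in range(size):
        if table[index] is EMPTY:         # an empty slot ends the search
            return False, comparisons
        comparisons += 1
        if table[index] == key:
            return True, comparisons
        index = (index + 1) % size
    return False, comparisons


table = [EMPTY] * 7
for key in (10, 17, 24):                  # all three hash to slot 3
    hash_insert(table, key)
```

With a table of size 7, the keys 10, 17, and 24 all collide at slot 3 and end up in slots 3, 4, and 5, so finding 24 takes three comparisons.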
OUTPUT: The following should be printed for each data set.
1. Echoprint the input data.
2. For each search request,
(a) print the value being searched for.
(b) For each algorithm, print
(i) the algorithm name.
(ii) 'YES' if the search is successful; 'NO' otherwise.
(iii) the number of comparisons made.
Note: The input will have to be sorted before algorithms 2 and 3 can be run.