2. Software Engineering vs structured programmingSoftware Engineering vs structured programming
• Not really a course about software engineering…
1. Software engineering
– Mostly about concepts,
2. Structured programming
– Revision, coding in C and Matlab, functions
3. Data structures
– structures, classes
4. Object oriented programming
– objects, object-oriented concepts like inheritance, polymorphism,
patterns and the standard template library
3. Learning OutcomesLearning Outcomes
• The course will aim to give a good understanding of basic design
methods, and emphasize the need to produce well-structured
maintainable computer software. The course will concentrate on
principles, but these will be reinforced with examples in Matlab and
C/C++ programming languages. Specifically, by the end of the
course students should:
– understand concepts of basic program design techniques that can
be applied to a variety of programming languages, in particular
Matlab and C/C++
– understand the need for structured programming in software
projects
– be able to recognise and to produce and/or maintain well
structured programs
– have a basic understanding of the role of and advantages of
object oriented design
4. TextsTexts
• Sommerville, Software Engineering, Addison-Wesley (8th
edition), 2007.
• Wirth, Algorithms + Data Structures = Programs,
Prentice-Hall, 1975
• Leveson, Safeware: System Safety and Computers,
Addison-Wesley, 1995.
• Lipmann and Lajoie, C++ Primer, Addison-Wesley, 2005.
• Goodrich et al., Data structures and algorithms in C++,
Wiley, 2004
5. The Role of Computing in EngineeringThe Role of Computing in Engineering
• Computing is ubiquitous in engineering. Why?
• Awesome speed of modern, everyday computers a makes complicated
analysis and simulation possible across all domains.
• Applications in design and modelling. Far beyond the reach of the
mortal human engineer. Indeed many modelling problems are utterly
infeasible without modern computers and software.
• In embedded systems, computers can provide a level of power, speed,
flexibility and control not otherwise possible (eg mobile phone)
• Computing is “cheap” (but exercise this argument with care)
• Software is the key…
some examples…
6. Example: mobile phoneExample: mobile phone
• Even simple mobile phones
rely on software
• Typical phone has a
microcontroller (SIM card) with
a small program
– Drive GUI
– Control devices (keypad,
microphone, a/d, dsp,
decoder)
7. Example: Sizewell BExample: Sizewell B
• Nuclear power station (PWR), onstream in 1995
• Software used extensively in the design
• Software for control!
– first UK reactor to use software in its Primary Protection
System)
8. Example: A380Example: A380
• A380
• 1400 separate
programs
• There is a
software project
just to manage all
the software!
• Clearly safety-
critical features of
the software
9. Example: NPfITExample: NPfIT
• NHS National Plan for IT
• Plan to provide electronic care records for patients
• Connect 30000 GPs and 300 hospitals
• Provide secure access to records for healthcare
professionals
• Provide access for patients to their own records via
“Healthspace”
10. Software engineering versus programmingSoftware engineering versus programming
• Software engineering is about more than just
programming/coding
• It is about design principles and methodologies that yield
programs that are
– Robust
– Manageable
– Reusable
11. Software vs “other” engineeringSoftware vs “other” engineering
• How is software engineering similar to other
engineering?
• Abstraction and Modularity
– Consider free-body diagram
– Thevenin/Norton
– Low output impedance / High input impedance
– Digital computer
• We return to these concepts later…
13. Modularity: Op-amp bufferModularity: Op-amp buffer
• Unity gain buffer
• Vout = Vin
• Very high input impedance, very low output impedance
+
-
In
Out
14. Software vs “other” engineeringSoftware vs “other” engineering
• How is software different to other engineering?
• Pure, weightless, flexible
• Capacity to incorporate massive complexity
• No manufacturing defects, corrosion, aging
15. Intrinsic difficulties with softwareIntrinsic difficulties with software
• Analogue versus discrete state systems
• The “curse” of flexibility
– Can encourage unnecessary complexity
– Redefinition of tasks late in development – shifting goal-post
• Complexity and invisible interfaces
– Standard way of dealing with complexity is via modularity
– But this alone is not enough because interfaces can be subtle
and invisible, and here too there is a need to control complexity
• Historical usage information
– Unlike physical systems, there is a limited amount of experience
about standard designs
16. When software projects go wrongWhen software projects go wrong
• A320, Habsheim and Strasbourg
17. When software projects go wrongWhen software projects go wrong
• London Ambulance Service
– 1992, computerised ambulance despatch system fails
• Therac-25
– 2 people died and several others exposed to dangerous levels of
radiation because of software flaws in radiotherapy device
• OSIRIS
– £5M University financial package
– Expenditure to date more like £20-25M
• NPfIT?
– NHS £12 billion IT project
• comp.risks is a great source of others...
18. NHS National programme for IT: NPfITNHS National programme for IT: NPfIT
• Plan to provide electronic care records for patients
• Connect 30000 GPs and 300 hospitals
• Provide secure access to records for healthcare
professionals
• Provide access for patients to their own records via
“Healthspace”
• Laudable?
• Realistic?
– Software Engineering specialists have their doubts
– Ross Anderson (Prof of Security Engineering, Cambridge
Computing Laboratory) wrtes in his blog “I fear the whole project
will just continue on its slow slide towards becoming the biggest IT
disaster ever”.
19. Software life-cycleSoftware life-cycle
• Software development stages
– Specification
– Design
– Implementation
– Integration
– Validation
– Operation/Maintenance/Evolution
• Different types of system organise these generic
activities in different ways
• Waterfall approach treats them as distinct stages to be
signed off chronologically
• In practice usually an iteration of various steps
20. RequirementsRequirements
• Vague initial goals
• Iterative refinement
• Leading to more precise specification
• Example
– Calculate the n-bounce trajectory of a lossy bouncing ball.
– Refine this to consider
• What does the statement actually mean?
• Physics
• Initial conditions
• Air-resistance?
• Stopping criterion (criteria)?
– Now, think about how to design/implement
22. Extreme programming (XP)Extreme programming (XP)
• Proposed in the late 90s as a reaction to problems with
“traditional” development processes
• Takes extreme position compared with waterfall
approach
• Appropriate for small-medium sized projects
– Teams of pairs of programmer, programming together
– Incremental development, frequent system releases
– Code constantly refined, improved, made as simple as possible
– Do not design for change; instead change reactively
23. Top down designTop down design
• Here want to keep in mind the general principles
– Abstraction
– Modularity
• Architectural design: identifying the building blocks
• Abstract specification: describe the data/functions and
their constraints
• Interfaces: define how the modules fit together
• Component design: recursively design each block
24. Modular designModular design
• Procedural programming: focus on algorithms
• Object-oriented programming: focus on data structures
Algorithms Data structures
Programs
25. Structured programmingStructured programming
• Top-down vs bottom-up
• Both are useful as a means to understand the relations
between high-level and low-level views of a program
• Top-down
– Code high level parts using “stubs” with assumed functionality
for low-level dependencies
– Iteratively descend to lower-level modules
• Bottom-up
– Code and test each low-level component
– Need “test harness” so that low-level can be tested in its correct
context
– Integrate components
• Not hard-fast rules; combination often best
26. Simple design toolsSimple design tools
• Flow chart • Pseudo-code
– Wait for alarm
– Count = 1
– While (not ready to get up
and count <= 3)
• Hit snooze button
• Increment count
– Climb out of bed
27. Data flowsData flows
• Data flow diagram
• Simple example, VTOL simulator
Controller
Simulator
Display
state
state thrust
29. Basic coding techniquesBasic coding techniques
• Pretty much any program can be specified using:
– Sequences of instructions
• { Do A; Do B; Do C }
– Conditional instructions
• If (condition) Do A
– Repetitions (loops)
• While (condition) Do A
• These semantic concepts are implemented in different
high-level programming languages using different syntax
30. Implementation in Matlab and CImplementation in Matlab and C
N= 10;
tot = 0;
totsq = 0;
for i=1:N
tot = tot+i;
totsq =
totsq+i^2;
end
tot
totsq
int i;
int tot = 0;
int totsq = 0;
for (i=1; i<N; i++) {
tot += i;
totsq += i*i;
}
cout << tot << endl;
cout << totsq << endl;
31. Notes on coding styleNotes on coding style
• Use meaningful variable names
• Use comments to supplement the meaning
• Indent code for each block/loop
• Encapsulate groups of statements sensibly in functions
• Encapsulate related data sensibly in data structures
• Design top down
• Code bottom-up or top-down, or a combination
32. Matlab vs CMatlab vs C
• Matlab and C are both procedural languages
• Matlab is an interpreted language
– each statement decoded and executed in turn
• C is a compiled language
– each module (.c file) is converted into assembly language
– The interfaces between the modules are
• Shared global data
• Function calls from one module to another
– This is resolved at link time when the modules are linked
together into an executable
33. Procedural programmingProcedural programming
• Aim is to break program down into functional units
– procedures or functions
– Set of inputs, set of outputs
• In Matlab and C this procedural building block is the
function
• Understanding functions…
34. Organisation of Matlab programsOrganisation of Matlab programs
• A Matlab “program” may be a script or function
– i.e. a sequence of instructions
• This script or function will typically call a bunch of other
functions
• Functions are stored in .m files
• Multiple functions can be stored in one .m file, but only
first is visible outside
– The others are local functions
– Part of the recursive subdivision of the problem
36. Organisation of C programsOrganisation of C programs
Source code
.c .cc
Object file
.o
compilation
Source code
.c .cc
Object file
.o
compilation
…..
…..
linking
executable
37. FunctionsFunctions
• Function definition
• Function call
• Function prototype
• Scope (local versus global data)
• Parameters and return value(s)
• Function call
• Low-level implementation of function calls
• Recursion
38. Function definitionFunction definition
% compute factorial
function z = fact(n)
% function body
z = 1;
for i=1:n
z = z*i;
end
// compute factorial
int fact(int n)
{
int i, val = 1;
for (i=1; i<=n; i++) {
val *= i;
}
return val;
}
39. Function callFunction call
• Distinguish between
– The function definition
• Defines the set of operations that will be executed when the
function is called
• The inputs
• The outputs
– And the function call
• i.e. actually using the function
• Formal vs Actual parameters
• Return value(s)
– The value of a function evaluation is the return value
fact(10)
a = 6;
z = fact(a);
[V,D] = eig(A);
40. Function prototypeFunction prototype
• The function prototype provides enough information to
the compiler so that it can check that it is being called
correctly
• Defines the interface
– Input (parameter), output (return value)
myexp.c file
float myexp(float x)
{
const float precision = 1.0e-6;
float term=1.0, res=0.0;
int i=0;
while (fabs(term)>precision) {
res += term;
i++;
term = pow(x,i)/fact(i);
}
return res;
}
myexp.h file
float myexp(float x);
41. Scope: local variablesScope: local variables
• Variables which are declared inside a function are local
variables
• They cannot be “seen” outside the function (block) in
which they are declared
• A local variable exists only for the duration of the current
function execution
• It is declared as a new variable every time the function is
called
• It ceases to exist when the function returns
• It does not “remember” its value between calls
42. Scope: global variablesScope: global variables
• Global variables exist outside all functions
• A global variable is visible inside functions
• If there exist two variables, one local, one global, with
the same name, then the local one takes precedence
within its local scope
• C and Matlab behave differently
– C will use a global if no local exists
– Matlab only uses a global if the programmer explicitly requests it
• Globals should be used with caution because their use
inside a function compromises its encapsulation
43. EncapsulationEncapsulation
• Want the function to behave in the same way for the
same inputs
– encapsulate particular functional relationship
• But if the function depends on a global it could behave
differently for the same inputs
• Live example using myexp
45. Side-effectsSide-effects
• Could set value of a global variable in a function
• Again this compromises the function’s encapsulation
– Causes a side-effect
– An implicit output, not captured by the interface
• Makes it difficult to re-use code with confidence
• c.f. C and Matlab function libraries
– Set of re-usable routines with well defined interfaces
• In small projects maybe not a big problem
• Hugely problematic in bigger projects, especially when
multiple programmers working as a team
• Complicates interfaces between components, possibly in
unintended ways
46. Low-level implementation of function callLow-level implementation of function call
Memory
CODE
DATA
machine code
global variables
STACK
local variable m
local variable 1
return location
return value n
return value 1
parameter x
parameter 1
………
Activation
record
47. Pass by value/referencePass by value/reference
int i=5, j=10;
swap(i,j);
cout << i << “ “ << j << endl;
Pass by value Pass by reference
void swap(int a, int b)
{
int temp = a;
a = b;
b = temp;
return;
}
void swap(int& a, int& b)
{
int temp = a;
a = b;
b = temp;
return;
}
48. RecursionRecursion
• Recursion is the programming analogue of induction:
– If p(0) and p(n) implies p(n+1)
– Then p(n) for all n
• Define a function in terms of
– Itself
– Boundary conditions
• For example
– Factorial: n! = n * (n-1)!, 0! = 1
50. Data types and data structuresData types and data structures
• C/C++ predefine a set of atomic types
– bool, char, int, float, double
• C/C++ provides machanism for building compound data
structures
– struct (class)
– Array
• Matlab supports arrays/matrices (of course)
• Matlab also supports structures
51. • A class (struct in C) is a compound data type which
encapsulates related data into a single entity
class Complex {
public:
double re, im;
};
• Defines how a variable of this type will look
int i;
Complex z;
C/C++: struct and classC/C++: struct and class
Class definition
Create a variable (an instance) of this type
52. Example: VTOL stateExample: VTOL state
• Represent current state as, say, a
triple of numbers and a bool,
(position, velocity, mass, landed)
• Single variable represents all
numbers
– Better abstraction!
class State {
double pos, vel, mass;
bool landed;
};
State s;
Controller
Simulator
Display
state
state thrust
53. Accessing class membersAccessing class members
State s;
s.pos = 1.0;
s.vel = -20.0;
s.mass = 1000.0;
s.landed = false;
s.pos = s.pos +
s.vel*deltat;
Thrust =
ComputeThrust(s);
• In Matlab introduce
structure fields
without declaration
s.pos = 1.0;
s.vel = -20.0;
…
Thrust =
ComputeThrust(s);
54. Output parametersOutput parameters
Image ReadImage(const string filename, bool& flag);
bool ReadImage(const string filename, Image& im);
• Input: filename (type string)
• Output:
– im (type Image)
– boolean flag indicating success/failure
function [Image, errflag] = ReadImage(filename)
• Basically the same, but cleaner in Matlab!
55. ArraysArrays
• An array is a data structure containing a numbered
(indexed) collection of items of a single data type
int a[10];
res = a[0] + a[1] + a[2];
Complex z[20];
State s[100];
for (t=1; t<100; t++) {
s[t].pos = s[t-1].pos + s[t-1].vel + 0.5*g;
s[t].vel = s[t-1].vel + g – GetThrust(s[t-1],
burnrate)/s[t-1].mass;
s[t].mass = s[t-1].mass – burnrate*escapevel;
}
57. MethodsMethods
• In C++ a class encapsulates related data and functions
• A class has both data fields and functions that operate
on the data
• A class member function is called a method in the
object-oriented programming literature
59. ConstructorConstructor
• Whenever a variable is created (declared), memory
space is allocated for it
• It might be initialised
– int i;
– int i=10;
– int i(10);
• In general this is the work of a constructor
• The constructor is a special function with the same name
as the class and no return type
– Complex(double x, double y) {
{ re = x; im = y; }
60. Information hiding / encapsulationInformation hiding / encapsulation
• Principle of encapsulation is that software components
hide the internal details of their implementation
• In procedural programming, treat a function as black
boxes with a well-defined interface
– Need to avoid side-effects
– Use these functions as building blocks to create programs
• In object-oriented programming, a class defines a black
box data structure, which has
– Public interface
– Private data
• Other software components in the program can only
access class through well-defined interface, minimising
side-effects
63. C++ program organisationC++ program organisation
• Complex.h
class Complex {
public:
Complex(double x, double y);
double Re();
double Im();
double Mag();
double Phase();
private:
double re, im;
};
64. C++ program organisationC++ program organisation
• Complex.cpp
#include “Complex.h”
Complex::Complex(double x, double y) {
re = x; im = y;
}
double Complex::Re() { return re; }
double Complex::Im() { return im; }
double Complex::Mag() {
return sqrt(re*re+im*im);
}
double Complex::Phase() { return atan2(im,re); }
65. Object-oriented programmingObject-oriented programming
• An object in a programming context is an instance of a
class
• Object-oriented programming concerns itself primarily
with the design of classes and the interfaces between
these classes
• The design stage breaks the problem down into classes
and their interfaces
• OOP also includes two important ideas concerned with
hierarchies of objects
– Inheritance
– polymorphism
66. InheritanceInheritance
• Hierarchical relationships often arise between classes
• Object-oriented design supports this through inheritance
• An derived class is one that has the functionality of its
“parent” class but with some extra data or methods
• In C++
class A : public B {
…
};
67. ExampleExample
class Window
Data: width, height
posx, posy
Methods: raise(), hide()
select(), iconify()
class TextWindow
Data: cursor_x, cursor_y
Methods: redraw(), clear()
backspace(), delete()
class GraphicsWindow
Data: background_colour
Methods: redraw(), clear()
fill()
class InteractiveGraphicsWindow
Data:
Methods: MouseClick(), MouseDrag()
68. PolymorphismPolymorphism
• Polymorphism, Greek for “many forms”
• One of the most powerful object-oriented concepts
• Ability to hide alternative implementations behind a
common interface
• Ability of objects of different types to respond in different
ways to a similar event
• Example
– TextWindow and GraphicsWindow, redraw()
70. ExampleExample
• Class A is base class, B and C both inherit from A
• If the object is of type A then call A’s func()
• If the object is of type B then call B’s func()
• If the object is of type C then call C’s func()
• If class A defines func() as
virtual void func() = 0;
then A has no implementation of func()
• class A is then an abstract base class
– It is not possible to create an instance of class A, only instances derived
classes, B and C
– class A defines an interface to which all derived classes must conform
• Use this idea in designing program components
– Specify interface, then have a guarantee of compatibility of all derived
objects
71. Another exampleAnother example
• Consider a vector graphics drawing package
• Consider base class “Drawable”
– A graphics object that knows how to draw itself on the screen
– Class hierarchy may comprise lines, curves, points, images, etc
• Program keeps a list of objects that have been created
and on redraw, displays them one by one
• This is implemented easily by a loop
for (int i=0; i<N; i++) {
obj[i]->Draw();
}
72. TemplatesTemplates
• Templating is a mechanism in C++ to create classes in which one or
more types are parameterised
• Example of compile-time polymnorphism
class BoundedArray {
public:
float GetElement(int i) {
if (i<0 || i>=10) {
cerr << “Access out of boundsn”;
return 0.0;
} else {
return a[i];
}
}
private:
float a[10];
};
73. TemplatesTemplates
template <class Type>
class BoundedArray {
public:
Type GetElement(int i) {
if (i<0 || i>=10) {
cerr << “Access out of boundsn”;
return Type(0);
} else {
return a[i];
}
}
private:
Type a[10];
};
BoundedArray<int> x;
BoundedArray<Complex> z;
74. Design patternsDesign patterns
• Programs regularly employ similar design solutions
• Idea is to standardise the way these are implemented
– Code re-use
– Increased reliability
– Fewer errors, shorter development time
• An array is special case of a container type
– Way of storing a collection of possibly ordered elements.
– List, stack, queue, double-ended list, etc
• Templates in C++ offer a way of providing libraries to
implement these standard containers
75. Standard Template LibraryStandard Template Library
• C++ provides a set of container classes
– Standard way of representing and manipulating container types
– eg, methods insert(), append(), size(), etc
• STL supports
– Stack (FILO structure)
– List (efficient insertion and deletion, ordered but not indexed)
– Vector (extendible array)
– others
76. STL exampleSTL example
• std::vector<Type> is an extendible array
• It can increase its size as the program needs it to
• It can be accessed like an ordinary array (eg v[2])
• It can report its current size
– v.size()
• You can add an item to the end without needing to know
how big it is
– v.push_back(x) #include<vector>
int main() {
std::vector<int> v;
for (int i=0; i<20; i++) v.push_back(i);
for (int i=0; i<v.size(); i++)
std::cout << v[i] << std::endl;
}
77. STL, continuedSTL, continued
• To create a new STL vector of a size specified at run-
time
int size;
std::vector<Complex> z;
std::cin >> size;
z.resize(size);
z[5] = Complex(2.0,3.0);
78. STL, continuedSTL, continued
• To create a two dimensional array at run-time
int width, height;
std::vector< std::vector<int> > x;
x.resisze(height);
for (int i=0; i<height; i++)
x[i].resize(width);
x[2][3] = 10;
…
79. IteratorsIterators
• A standard thing to want to do with a collection of data
elements is to iterate over each
– for (int i=0; i<v.size(); i++)
• Not all container types support indexing
– A linked list has order, but only relative order
• An iterator is a class that supports the standard
programming pattern of iterating over a container type
std::vector<int> v;
std::vector<int>::iterator i;
for (it=v.begin(); it!=v.end(); it++) …
• An iterator encapsulates the internal structure of how the
iteration occurs
80. Complete exampleComplete example
• Design a program to compute a maze
– User-specified size
– Print it out at the end
• Algorithm
– Mark all cells unvisited
– Choose a start cell
– While current cell has unvisited neighbours
• Choose one at random
• Break wall between it and current cell
• Recursively enter the chosen cell
81. Design data structuresDesign data structures
• Maze class
– Compute method
– Print method
– Two dimensional array of Cells
• Cell class
– Accessor methods
– Break wall methods
– Wall flags
– Visited flag
83. Maze class interfaceMaze class interface
class Maze {
public:
Maze(int width, int height);
void Compute(int x, int y);
void Print();
private:
int Rand(int n);
int H, W;
std::vector< std::vector<Cell> > cells;
};
84. Main programMain program
int main(int argc, char* argv[])
{
int width, height;
cerr << "Enter maze width: ";
cin >> width;
cerr << "Enter maze height: ";
cin >> height;
Maze m(width, height);
m.Compute(height-1,0);
m.Print();
return 0;
}
Davd Parnas (“Structured programming: a minor part of software engineering”, Information Processing Letters, 88(1-2), 2003), one of the leading researchers on software engineering in the world has argued that “Software Engineering is … a branch of Engineering specialising in software intensive products. ……..it is argued that writing programs, though obviously an essential step in the process, is only a small part of Software Engineering.” That said, it is this essential step that this course is primarily looking at.
I assume you have some skill in writing computer programs in Matlab, less in C, and maybe some knowledge of C++/Java, which will have been derived mostly from the 1st and 2nd year Computing practicals. There will be some brief revision of these during the course of the 4 lectures. The overall aim of this course is to take you beyond the narrow syntactical knowledge you acquired in these labs. To do this we will look in quite close detail at some things you have already seen, like functions, because a crystal clear understanding of these will greatly help in understanding how and why various programming practices are “good” and some (many!) are “bad”. I will try to use a mixture of both Matlab and C/C++ to illustrate the general principles.
The first is a comprehensive “classic” text book on software engineering, now in its 8th edition.
Wirth is a classic. Some aspects are outdated, but it remains very sound.
Leveson is very interesting and contains valuable lessons, but is really for peripheral reading.
Lipmann and Lajoie, and Goodrich et al, are C++ text books. The C++ bible is by Stroustrup.
Computing is cheap: well, we have to be careful here. Leveson in her seminal book on computers and safety (see course texts) debunks this “myth”, along with several others. Computer hardware is cheap relative to other elctromechanical devices; but the cost of writing and certifying reliable and safe software may be enormous.
Sizewell B was Britain’s first (only?) Pressurised Water Reactor. It is also notable because the Primary Protection System uses software extensively. This sofwtare is thus an (extreme!) example of so=called safety-critical software; i.e. unexpected operation could lead to disaster. Over 100,000 lines of code were validated in a process that was contraversial, hugely time-consuming and at greatly expensive.
The A380 incorporates 1400 separate computer programs. These govern everything from XXXXX to XXXX &lt;&gt;fixme&gt;. Indeed there are so many components to the software that there is a separate software project just to manage all the software! (cite)&lt;fixme&gt;
CAD software, of course, has played a maor role in the design, but has in fact been at the root of major delays of as much as two years I production. Two of the partners (French and German) used different versions of the CAD software which were unable to interface to one-another. Sigh! (cite)&lt;fixme&gt;
Like its predecessors (the first was the A320), the A380 is a fly-by-wire aircraft. This means that some of the software is safety-critical…
Like its predecessors (the first f-b-w commercial airliner was the A320) it uses fbw for weight savings and also implemnents a “Flight Envelope Protection System” (i.e. the computer will overrule attempts to take the aircraft outside its safe flying limits).
The A320 used 7 computers running sofwtare from two separate vendors (the aim being to minimise the possibility that the same bug would appear simultaneously.
See comp.risks archives for extensive discussion of the A380, and if you go back to the archives from late 1980s you will see a lot of discussion about several A320 incidents; more on this in a bit.
NPfIT is one of the biggest, if not the biggest ever, civil information technology programmes. Sounds like a great plan. Or does it? See later for a little discussion about this project.
Programmers of course want to produce code that is correct; i.e. it produces the right output for all expected inputs. But beyond this software should ideally be robust, in that it is capable of gracefully handling inputs not explicitly defined for its application.
Code that is well designed and implemented will facilitate current and future maintenance, make debugging more easy, and potentially expedite future changes.
Well designed software, which has clearly defined interfaces to other code, can be reused to improve correctness, save time, etc. For example, you would not reimplement Matlab&apos;s trig functions since these are already implemented in an effecient manner in the Matlab function library, and have a well defined interface (eg put in an angle in radians and get out sine of the angle). Re-inventing the wheel here introduces unnecessary complexity and the possibility of errors. A well-encapsulated module can be re-used.
In many respects software engineering is no different from other forms of engineering. In designing a bridge, or a building, or a machine, or an electronic circuit, or a control system, the engineer uses a variety of techniques to make the analysis/design of an apparently complex system more tractable. Two key ideas that transcends all engineering disciplines are those of abstraction and modularity. In statics and dynamics the engineer will consider the behaviour of a component by by use of a free-body diagram in which the role of neighbouring/connected components is abstracted as a set of forces. The component can then be considered as a module. In control theory (for that matter any dynamic modelling) we use transfer functions to model dynamic behaviour and a block diagram is an abstract representation of a set of modules dynamic modules. In linear, passive component circuit analysis Norton’s and Thevenin’s theorems enable a complex circuit to be abstracted as a voltage and/or current source and a resistance. And when dealing with transistors and op-amps, one tried to ensure that parts of a circuit have high input impedance and low output impedance so that they can be considered as separate modules with simple interactions (eg output voltage of circuit 1 equals input voltage of circuit two). Of course this reaches an extreme in the design of a digital computer (see A2) which lives in (and is built with) an analogue world, but which is built using logic gates, an abstraction of underlying analogue electronics. In P2/A2 you saw how the transistors/op-amps could be combined to yield flip-flops and and/or (etc) gates. Then how these could be combined into registers. And how registers (a higher level abstraction) could be connected to form a control unit and a data unit, and ultimately to yield a micro-processor.
Use of software is attractive for these reasons.
Consider Airbus A320, huge weight savings in fly-by-wire design. Wires carrying control signals will not fatigue like cables in traditional design.
The most complex engineering systems in existence today are almost either software, or heavily reliant on software.
A computer&apos;s behaviour can be easily changed by modifying its software. In principle, flexibility is a good thing, since major changes can be effected quickly and at apparently low cost.
Software does not age in the sense that it will always do the same thing, so it will not fatigue/fail like other engineering systems (distinguish this from the hardware on which the software runs which is of course built from digital circuits).
Software cannot be made more reliable by having multiple copies of it. Each copy will behave identically and have exactly the same failure modes.
Furthermore, it has been observed that software teams writing code independently to perform the same tasks will often make the same mistakes!
Discrete behaviour means we cannot interpolate. A bridge can be tested at extreme loads and we can assume it will work for lesser loads, but with software, every input is a separate, distinct situation. Further, the failure behaviour need not be related in any way to the normal behaviour.
A computer&apos;s behaviour can be easily changed by modifying its software. Though potentially advantageous (see previous slide), ease of change can encourage unnecessary complexity and introduce errors, because it blurs the distinction between what can be achieved versus what should be achieved. It can encourage a shifting of goalposts in specifications.
As mentioned earlier, a standard way in all engineering disciplines is to attach complexity via modularity. In physical systems, the physical separation of functions can provide a useful guide for decomposition into modules: the spatial separation of these modules limits their interactions,. Makes their interactions easy to trace and makes introducing new interactions difficult. With software there are no such barriers: complex interfaces are as “easy” to produce as simple ones, and the interfaces/interactions between software components can be invisible or not obvious.
Historical data: Consider aeroplane, or bridge design. The starting point for such designs is usually based on sound historical designs, and data collected about such designs over a long period [c.f. Millenium bridge, a new design of bridge that met with unexpected problems]. With much software it is specially constructed and therefore there is very little in the way of historical data to go on.
Air France Flight 296 was a chartered flight of a newly-delivered fly-by-wire Airbus A320 operated by Air France. On June 26, 1988, as part of an air show it was scheduled to fly over Mulhouse-Habsheim Airport at a low speed with landing gear down at an altitude of 100 feet, but instead slowly descended to 30 feet before crashing into the tops of trees beyond the runway. Three passengers were killed. The cause of the accident is disputed, as many irregularities were later revealed by the accident investigation. This was the first ever crash involving an Airbus A320.
On 20th Jan 1992 another incident involving an A320 in which the aircraft descended much too rapidly without the flight crew noticing until it was too late. The official report states that from the moment of descent until impact, the aircraft&apos;s high descent rate was adopted and maintained. The official conclusion on the cause of the accident was &quot;pilot error&quot;. The “pilot error” in this case was the confusion of the &quot;flight-path angle&quot; (FPA) and &quot;vertical speed&quot; (V/S) modes of descent. These were selectable on the Flight Management and Guidance System (FMGS) console. The pilots were inadvertently in V/S when they should have been in FPA mode. The error was not noticed on the console itself, due to the similarity of the number format display in the two modes. The other cues on the Primary Flight Display (PFD) screen and elsewhere (e.g., altitude and vertical speed indicator) were not noticed since the pilots were overloaded following a last-minute change of flight plan, and presumably were concentrating on the Navigational Display.
The report criticised the ergonomic design of the descent-mode selector for being too easy to misread or mis-set.
Officially neither of these accidents was caused by software. However the additional complexity introduced – which also of course has advantages – in these cases contributed to the accidents.
LAS: project to computerise ambulance despatch. To be completed in 6 months. A consortium of Apricot, Systems Options and Datatrak made the cheapest bid and was awarded the contract. Because of time pressures, and the lack of experience of the development team in dealing with safety-critical systems, fundamental flaws in system design, and inadequate consultation with users, the system went “live” even though there were 81 known errors. It ran for a day and a half before being shut down. A further 10-day trial was abandoned and the LAS reverted to manual operation. From the Independent, 30 Oct 1992, “Computer specialists yesterday said that the system blamed for this week&apos;s crisis at the London Ambulance Service appeared to ignore basic tenets for software where breakdown would put lives at risk. The failure of the computer system over 36 hours on Monday and Tuesday, which was said to have cost between 10 and 20 lives, raised serious questions about the way it was designed and tested, experts said. Yesterday, the software company involved, Systems Options, refused to comment.”
Therac-25: See Leveson, especially Appendix A
OSIRIS: Classic failures of trying to build a systemn to serve two different needs and neglecting one. Commissioned in 01/02. In Nov 2003 departmental administrators highlight 10 points critical to operation not adequately addressed, in partic the ability to generate reports central ro management of finances. System goes live in April 2004 with major omissions/flaws and University lucky to escape without serious financial meltdown.
NPfIT is one of the biggest, if not the biggest ever, civil information technology programmes. Original budget of £2.3billion over 3 years, by June 2006 this was estimated to be £12billion over 10 years. In April and June 2006 in an Open Letters, 23 leading professors of Computer Science wrote to the Health Select Committee calling for a public enquiry:
“As computer scientists, engineers and informaticians, we question the wisdom of continuing NPfIT without an independent assessment of its basic technical viability. We suggest an assessment should ask challenging questions and issue concrete recommendations where appropriate, e.g.:
- Does NPfIT have a comprehensive, robust:
- Technical architecture? - Project plan? - Detailed design?
Have these documents been reviewed by experts of calibre appropriate to the scope of NPfIT?
- Are the architecture and components of NPfIT likely to:
- Meet the current and future needs of stakeholders? - Support the need for continuous (i.e., 24/7) healthcare IT support and fully address patient safety and organisational continuity issues? - Conform to guidance from the Information Commissioner in respect to patient confidentiality and the Data Protection Act?
- Have realistic assessments been carried out about the:
- Volumes of data and traffic that a fully functioning NPfIT will have to support across the 1000s of healthcare organisations in England?
- Need for responsiveness, reliability, resilience and recovery under routine and full system load?
In April 2007, the Public Accounts Committee of the House of Commons issued a 175-page damning report on the programme. The Committee chairman, Edward Leigh, claimed &quot;This is the biggest IT project in the world and it is turning into the biggest disaster.&quot; The report concluded that, despite a probable expenditure of 20 billion pounds &quot;at the present rate of progress it is unlikely that significant clinical benefits will be delivered by the end of the contract period.&quot;
Specification: engineers and customers collaborate to specify objectives (what should it do?) and constraints. This process would typically be iterative, refining vague ambiguous objectives into more formal, tightly specified ones that clarify assumptions
System/Software Design: the requirements specification is then mapped onto an overall architecture of the system (both hardware and software, probably including choices of programming language). Software design involves identifying and describing the fundamental sofwtare abstractions asnd their relationships (i.e. identifying individual software components and designing their interfaces).
Implementation: software design is realised as a set of units or programs or modules. Unit/modules are tested to ensure each meets its specification.
Integration and Validation: components brought together and tested rigorously to ensure conformity to user-requirements and correctness
Operation/Maintenance/Evolution: system installed, modifications to adapt to changing user-requirements; maintenancce involves correcting errors that were not discovered earlier, imoroving the implementation and enahncing the services as new requirements are discovered.
Because software is invisible, signoff will often involve production of copious documentation .
What does the statement n-bounce trajectory mean? We could either assume that this means “given a set of initial conditions, work out n”, or it could mean, “given n, work out what the initial conditions would need to be”. This may inform how we set about the task, or how we represent the trajectory (or if we even represent it at all).
Verification checks the system against its specification, while validation checks that it actually meets the real needs; the two may be different if there were problems moving from the vague mission statement to the formal requirements in the design process.
Testing at the module/unit level should happen concurrently with implementation so that errors are found as quickly as possible.
Top down testing involves coding top-level components first and substituting trivial “stubs” for the lower-level functionality that has not yet been implemented (eg writing a simple function that returns a number corrupted by a random element to simulate a sensor module). Bottom up testing involves building the lowest level components first and testing them using a “test harness”.
Black-box testing checks a component by examining its outputs for a set of controlled inputs to check they match the specification. White-box, on the other hand, examines the internal structure of a module exercising every line of code and every decision point. It can be useful for checking limiting conditions which are a common source of error and may be missed in black-box testing.
As mentioned previously interpolation is not possible in the discrete world of computer hardware and software, but exhaustive testing is impossible for all but the simplest (trivial) systems.
Incremental development supported through small, frequent releases of code; requirements specified via a set of customer scenarios forming the basis for planning;
Customer involvement in the development cycle, defining acceptance tasks, etc
Pairs of programmers work very closely together, collective ownership of code
Change supported through regular releases, and continuous integration
Simplicity maintained by use of simple designs that deliberately do not anticipate future requirements. Instead constant refactoring to improve code quality.
Extreme programming takes the position that most design for change is wasted effort. Instead, code is re-factored: the programming team is constantly looking for improvements and implements them immediately; the software is then ideally easy to understand, facilitating change as/when necessary.
Beck, 1999, “Embracing change with extreme programming”, IEEE Computer 32((10).
Top down design means breaking the problem down into components (modules) recursively.
Each module should comprise related data and functions, and the designer needs to specify how these components interact – what their dependencies are, and what the interfaces between them are. Minimising dependencies, and making interfaces as simple as possible are both desirable to facilitate modularity.
By minimising the ways in which modules can interact, we greatly limit the overall complexity, and hence limit unexpected behaviour, increasing robustness.
Because a particular module interacts with other modules in a carefully defined manner, it becomes easier to test/validate, and can become a reusable component.
Note the reference to the general engineering principles of abstraction and modularity. A word about each ion its software context:
Abstraction: the idea here is to distil the software down to its fundamental parts, and describe these parts precisely, but without cluttering the description with unnecessary details such as exactly how it is implemented. The abstraction specifies what operations a module is for, without specifying how the operations are performed. A function header with a sensible name is an example of such an abstraction. The Matlab function header:
function y = sin(x)
tells us almost enough about this function to use it without ever needing to know how it is implemented (of course it doesn’t quite, since this specification does not expose the units of the angle x).
Modularity: the aim is to define a set of modules each of which transcribes, or encapsulates particular functionality, and which interacts with other modules in well defined ways. We will explore this in the context of functions when we look at side-effects and global variables. The ore complicated the set of possible interactions between modules, the harder it will be to understand. The bottom line is that humans are only capable of understanding and managing a certain degree of complexity, and it is quite easy (but bad practice) to write software that exceeds this capability.
In 1975 Niklaus Wirth, famous Swiss computer scientist (investor of Pascal, Modula 2), wrote a seminal text called Algorithms + Data Structures = Programs.
While the top-down design methodology is a general tool, how we approach it will potentially be language dependent.
In procedural programming the design concentrates on the functional requirements and recursively subdivides the functionality into procedures/functions/subroutines until each is a simple, easily understood entity.
Examples of procedural languages are C, Pascal, Fortran, Matlab
In object-oriented programming the design emphasis shifts to the data structures and to the interfaces between objects.
Examples of object-oriented languages are C++ and Java.
Donald Knuth interview, 7/12/1993:
“You can create the parts in whatever order is psychologically best for you. Sometimes you can create them from the bottom up. Bottom-up means that you know somehow that you probably need a subroutine that will do something, so you write it now while you&apos;re ready, while you&apos;re psyched for it. With this bottom- up programming, your pencil gets more powerful every page, because on page nine you&apos;ve developed more tools that you can use on page ten... your pencil is stronger.
With top-down programming you start at the beginning and say &quot;I&apos;m going to do this first and then this, and then this&quot;... but then you have to spell out what those are--- you can wind up gasping for breath a hundred pages later when you finally figure out how you&apos;re actually going to do those things!
Top-down programming tends to look very nice for the first few pages and then it becomes a little hard to keep the threads going. Bottom-up programming also tends to look nice for a while, your pencil is more powerful, but that means you can also do more tricky stuff. If you mix the two in a good psychological way, then it works, even at the end.
I did this with TeX, a very large program: 500+ pages of code in the book . Throughout that entire program, all those lines of code, there was always one thing that had to be the next thing I did. I didn&apos;t really have much choice; each step was based on what I&apos;d done so far. No methodology would teach me how to write a piece of software like that, if I followed it rigorously. But if I imagined myself explaining the program to a good competent programmer, all that this long program was, then there was just this one natural way to do it. The order in which the code appears in the book is the order in which I wrote it.”
Traditional flow charts are a rather cumbersome and outmoded way of designing code. There is a strong emphasis on the sequences of operations and decisions that is probably better represented using pseudo code [picture from EDraw website]
Pseudo-code is representation of program structure that is an intermediate between natural language and proper programming language. The structure of the program is expressed by the formatting (note indentation, etc).
Data flows,on the other hand, can be very useful in helping to abstracting the system architecture, and begin to define interfaces. One notation uses rounded rectangles to represent functional processing, rectangles to represent data stored, and arrows between boxes to represent data movement.
Sommerville p175
State machines represents how a system responds to internal and external stimuli, by showing how the internal state of the system changes in response to events. This representation should be familiar from P2/A2 Digital Logic and Computer Architecture courses.
Unlike the data-flow model, the state machine representation does not show the flow of data within the system. State machines are often useful abstractions when dealing with real-time systems, because these are often driven by external events.
In the example the diagram (taken from Sommerville p176) shows how the states of a simple microwave oven vary.
Of course a much more efficient way of doing this in Matlab would be to make use of the fact that the in-built data type is a matrix and that this could be achieved as:
x = [1:N];
tot = sum(x);
totsq = sum(x.*x);
Things to note are sequences of instructions. Note the different role played by the semicolon in the two languages. In C/C++ the ; denotes sequential statements. In contrast in Matlab the sequence is implicit from the different lines and the semicolon suppresses the output which would (implicitly) print the value of the variable to the console.
In Matlab a group of sequential statements is bracketed by an initial statement (in this case “for” and and “end” statement. In C/C++ a group of sequential statements is denoted using “{ }”.
C/C++ are strongly typed languages. This means variables must be decakred before use, and given an explicit type. In Matlab when a new variable name is encountered it is dynamically created. All variables in matlab are matrices of real numbers. In this case “tot” and “totsq” (and for that matter “i”) are simply 1x1 matrices (i.e. Scalars).
Note the unfamilar “&lt;&lt;” notation used in the C code. This is actually C++ code; C++ implements i/o using “streams” and the streaming operators “&lt;&lt;” and “&gt;&gt;”. “cout” is the standard output stream (the console) and so the last two lines should be interpreted as writing the value of tot followed by a new line to the console.
All of these points are aimed at producing code which is clear and as simple as it needs to be. This in turn will aid with debugging, modification and maintenance of the code.
Encapsulation is an important concept and we will discuss this in more detail with regard to functions and to data structures later in the course.
Note that Matlab does provide object-oriented support, but I do not discuss the details in this course.
Matlab is an interpreted language (or at any rate, it is best thought of as being interpreted). Each statement is decoded and executed in turn.
C is a compiled language
Confusingly in Matlab, (and in my view irritatingly) the name of the function in the function call is the filename, not the name that appears in the function header. Work that out!
Each .c file contains the implementation details of one or more functions and global variables
Each is compiled independently of other code modules isolation to produce an object file (.o)
In Matlab this would live in a .m file. In C/C++ this would live in a .c or .cpp file
The formal parameters are the parameters that appear in the function definition. They are placeholders for the values of the inputs; in this way they are analogous to the use of a variable in an algebraic expression. In the case of fact there is only one formal parameter, n.
The actual parameters are the values that get passed in at the time the function is called. In the examples above the actual parameters are 10, and 6 (since this is the value of “a” at the time fact(a) is called.
Carrying the analogy of algebraic expression forward, the function call is an evaluation of the expression at a particular value, given by the current value of the actual parameter(s).
C permits only one return value (can be a significant limitation!). This value is indicated in the definition using the “return …” statement. This return value becomes the “value of the function”.
Matlab, in contrast, allows multiple return values. Matlab does not use the “return …” notation. It is rather more elegant: the placeholders to be used for the outputs are indicated in the function header as “function [a,b,c] = myfunc(d,e,f)”. Here d,e,f are input placeholders and a,b,c are output placeholders. In the eig example, the first output is assigned to the matrix V, and the second to the matrix D.
The implementation detail of myexp is in the .c file.
In order to use the function, we do not need to know the internal detail from the .c file, just the function prototype in the header (.h) file. If you design a module that uses the function myexp, you need to include the prototype at the top of the code; more usually you put the prototype into a file with a .h extension and use the compiler directive #include “myexp.h”
The whole C standard library is built like this, and to use functions from it you issue #include &lt;math.h&gt;. During the linking phase of Modules are then linked together to resolve undefined references.
The prototype encapsulates the function interface and hides the implementation detail
Though Matlab does not enshrine the prototype in quite the same explicit way, if it helps, think of the descriptions returned by “help xxxx” in matlab as being the prototypes: you find out the interface without being exposed to unnecessary implementation detail.
Just for completeness, here is a version of myexp in Matlab:
function y = myexp(x)
precision = 1e-6;
term=1;
y=0;
i=0;
while abs(term)&gt;precision
y = y + term;
i = i+1;
term = x^i/factorial(i);
end
end
Note that in a Matlab context, variables are not declared explicitly. When a variable is used for the first time it is declared. If this use involves looking up the variable’s value it will cause an error: it must be an assignment, which implicitly declares the variable and sets its initial value. A common source of error in C programs is to forget to initialise the value of a variable before using it – the declaration does not set the value, it just allocates some memory.
Global variables are usually the resort of the lazy programmer. In general good practice dictates that you declare variables as close to where they are used as possible, and restrict their scope as much as possible.
Matlab guards against “indiscriminate” use of globals within functions by assuming that all variables are local and giving an error if there is a reference to an unknown local variable – unless the variable has been explicitly declared global inside the function, viz:
global varname;
In contrast, if there is no local variable that matches a variable in an expression, then C checks to see if there is a global one and uses that if there is.
In the example, the inputs are the same to function “myexp”, but the output is different.
We have used a global variable “precision” to govern the roundoff error in our exp function (we might think this is a good idea because we could use it for all functions involving a calculation of this sort, like sin and cos, etc).
Later, we quite understandably used the term precision for the reciprocal of variance (as is common terminology), forgetting to declare it as a local variable in function “_tmain”. Resetting this to 1/variance drastically affects the output of myexp!
It is easy to say “you made a silly mistake not declaring precision in _tmain”, but this misses the point.
One way around this would be to make the global precision a const, viz:
const precision = 1.0e-6;
Then the assignment in _tmain would generate an error at compile time.
However again the main point is that by making exp depend on something global, we effectively make its interface less transparent.
There is a temptation to use “easy” techniques like global variables rather than rigorously defining function interfaces, but this is the resort of the lazy programmer and can lead to very obscure errors. Do not be fooled by the simple examples in the lecture. The complexity of a program even for relatively simple tasks grows rapidly to the point where it is very easy to make unknowing errors in coding.
In large projects, with multiple programmers working in teams this type of error can create enormous problems – one team may have no conception of the consequences of its implementation choices on another team. Errors relating to poor interface design may be obscure and hard to trace and eliminate. At best this creates expense, at worst serious risk of operational failure.
Because of the way so-called object-oriented programming encapsulates related data and functions together, emphasizing the interfaces between objects, it has become the paradigm of choice for large projects. Implement an object correctly, with the right interface, and it can be used with confidence as a building block.
Whenever a function is called, the stack is use to store:
return location (JSR, RTS machine code)
parameter values
local variables
return values
When a function returns the stack shrinks as the local variables are removed. Execution returns to wherever the function was called from (as given by the return location), and the parameters are removed from the stack. Finally the return values are removed from the stack.
Usually best to think of parameters as being values. All parameters in Matlab are passed by value.
If you change the values of the formal parameters in the function, this has no effect (see example of “swap” when a and b are passed by value).
However in C (well, actually, C++) you can specify that parameters are passed by reference. This is done with the & operator. Instead of placing the value of an actual parameter on the stack, the location of the actual parameter is placed on the stack. The compiled machine code uses indirect addressing to access the parameters (recall from A2?)
One reason for this is to overcome the limitation that C only permits one return value.
If not exercised carefully and in a completely transparent way, changing pass-by-reference parameters in a function can destroy its encapsulation. One sensible way of managing this would be to have all input parameters first in the list, followed by any “output parameters”, and make this clear in any comments and.or documentation.
Note, also, that arrays in C/C++ are always passed by reference. This means that changes to array elements in a function “stick” – they remain after the function returns. This is not the case in Matlab.
When either designing recursive functions, or considering if a particular problem can (or should!) be solved recursively, it is useful to think about ways in which the problem can be subdivided into one or more problems with the same structure as the original. In some cases there are no gains to be had over a simple iterative version, and in other cases a naïve recursive version can be very inefficient (consider a simple implementation to compute the nth fibonacci number, fib(n) = fib(n-1) + fib(n-2) -- a naïve implementation would end up computing fib(2) many times. Exercise: can you think of a smart way of doing this in Matlab?).
However the power of recursion lies in (i) the fact that recursive implementation are often beautifully elegant and map cleanly onto a problem specificiation; (ii) significant gains can be had using a divide-and-conquer approach. Typically one starts with a problem and then divides it into two wit the same structure, recursing, then combines the two in a clean way.
Examples of this are the sorting algorithm due to C A R Hoare (QuickSort), with a run-time of O(NlogN), and the Fast Fourier Transform due to Cooley and Tukey.
The data elements that make up the class/struct are known as fields
We could represent the state as a set of four “atomic” variables. However this would not capture the conceptual relationship between the variables, and would result in code has a more complicated interface. The controller would need to take 4 input variables instead of one! Likewise the simulator output would be four quantities. Harder to understand, less close to our data flow diagram.
Data fields are accessed using the “.” (dot) operator.
“s” is a variable of type “State”. In object-oriented terminology, “s” is an instance of “State”.
As shown on the previous slide, the use of classes is another way of returning multiple values from a function in C. Recall that an alternative is to use reference parameters.
How do you decide which to use? If the data really are conceptually related, then put the data together in a class/struct. If they are not then consider using reference parameters. An example of when this might happen would be if you want to return some data unless there is an error, in which case you want to flag the error. Two possibilities are shown in the slide. I would recommend the second form:
bool ReadImage(const string filename, Image& im);
The informed reader would see this as one input parameter (indicated by the const keyword which means the function is not allowed to change its value), and one output parameter (Image& im is a reference to an image).
This function will return an image, read from the file specified, but also return a boolean value to indicate if the read was successful.
In Matlab, the cleaner looking function header (prototype) would be that makes it absolutely clear what the inputs and outputs are:
function [Image, errflag] = ReadImage(filename)
Of course arrays are supported pretty much as atomic types in Matlab.
C/C++ provides no inbuilt runtime checking of array bounds, so beware – it is possible to access, eg d[1][7]. Because of the way arrays are stored, this will actually return the value stored in d[2][2] (why??). If you are lucky, your program will crash with a “Segmentation Fault” or something nasty like this. If you are unlucky, it will apparenrly run but give nonsensical results and be very difficult to trace.
Methods, like data fields, are accessed using the dot operator.
The third form above int i(10) is the C++ style of initialisation. This form makes it clearer that the constructor is being called.
The constructor is defined in the class definition like any other method. It is called when an instance of the class is created (ie a variable of that class is declared)
Live example.
The private data fields of an instance of the Complex class, cannot be accessed by other software components. Instead, these components must use the so-called accessor methods Re() and Im().
Here, the internal representation of Complex has been changed. This has necessitated a change to the internal workings of each of the methods, but the interface, the way other software components use this class, has remained unaltered.
This is the essence of encapsulation and it is the reason why it is such an important programming concept. The interface captures all the functionality that is required for other program components to use the class.
In C++ programs the header file (.h) plays a much more important role than in C. Every program element that uses the Complex class needs to know how it is defined. Well, actually, more specifically, each program element needs to know the interface. This part, which is, captured in the header file is then incuded in every module that uses Complex:
#include “Complex.h”
The implementation detail is in the .cc or .cpp file
The implementation of each method in the Complex class appears in the .cpp file. The class scoping operator “::” is used to inform the compiler that each function is a method belonging to class Complex.
Strictly speaking, the first three points apply to so-called object-based programming. They are general design principles.
Object-oriented programming includes further powerful mechanism that help organise hierarchies of objects (inheritance), and enable objects in the same hierarchy to respond in tailored ways to the same events (polymorphism).
Object-oriented programming has become the programming paradigm of choice for large projects. The reason is that OOP actively encourages encapsulation. By specifying the object interfaces clearly in the design phase of a project, teams of programmers can then implement the components with some assurance that the components will all work together.
A class in C++ inherits from another class using the : operator in the class definition. The code above reads “class A inherits from class B”.
The class at the top is known as the base class. A GraphicsWindow, for example, has the data fields
width, height, posx, posy, background_colour
And the methods
raise(), hide(), select(), iconify(), clear() and fill()
An InteractiveGraphicsWindow does not add any data fields, but does add methods that deal with mouse interactions.
Notice that in the previous example, both TextWindow and GraphicsWindow have a method redraw(). Indeed it will probably be the case that we want a redraw() method for any type of window. We could get this, of course, by putting the method redraw() into the base class, but what if the way to redraw a TextWIndow is different from the way to redraw a GraphicsWindow? For example a TextWidnow redraw() method will probably render all the characters that are currently in the window in the right font, etc, while the GraphicsWindow redraw will display an image, or render a pipeline of graphics drawing commands.
The idea behind polymorphism is to permit the two different window types to share the interface redraw(), but implement it in different ways.
Thus different objects (TextWindow and GraphicsWindow) respond to the same event (eg a raise() event which calls a redraw() event) in different ways.
#include &lt;iostream&gt;
using namespace std;
class A {
public:
virtual void func() { cout &lt;&lt; &quot;A\n&quot;; }
};
class B : public A {
public:
void func() { cout &lt;&lt; &quot;B\n&quot;; }
};
class C: public A {
public:
void func() { cout &lt;&lt; &quot;C\n&quot;; }
};
void callfunc(A param)
{
param.func();
}
int main(int argc, char* argv[])
{
A x;
B y;
C z;
x.func();
y.func();
z.func();
callfunc(x);
callfunc(y);
callfunc(z);
return 0;
}
Slightly technical point to do with C++ (rather than being a general principle): an array can only store objects of the same type. In this case it stores pointers to (ie the memory address of) each object. The -&gt; operator does the same thing as the dot operator, but dereferences the pointer (looks up the object via the memory address) first.
Programmers regularly use similar, or even identical design solutions for problems.
The Standard Template Library is a suite of code that implements (among other things) classes of container types with standard ways of accessing elements, inserting, adding, deleting elements, etc, as well as a set of standard algorithms that operate on these classes (eg searching, iterating over all elemnets, etc).
While these could all be coded from scratch by a competent programmer, code re-use in this way speeds development and reduces the possibility of introducing errors in the implementation. This addresses one of the issue with Software Engineering discussed in lecture 1, that of the lack of historical design data to rely on.
The point here is that though different applications will have to represent different data, and will therefore have custom classes, it is common for the organisation of multiple instances of these classes to be done in standard ways (such as in an array). The use of custom classes is at first thought problematic if we want to implement these standard containers until we realise that templates enable us to write the generic, templated code, and specialise it to our custom classes at compile-time.
An iterator is a class that supports the standard programming pattern of iterating over a container type without exposing the underlying representation of the container type itself (cf iterating using an integer index – this relies specifically on the ability of the representation to index).
Refine specification.
Not much to do here: How will the size be specified? How to print it out? What algorithm to use?
Now design data structures…
Now design the data structures. In particular, think about the classes and the class interfaces.