The production of software is a labour intensive activity. Given the scale of software projects
related to current and future HEP experiments, it is worth trying to improve the knowledge of
the PEOPLE involved, the organization of the software development PROCESS and the
TECHNOLOGY used in the various aspects of this activity. The goal is better systems at lower
cost and happier users of the software.
Published in the proceedings of the 1994 CERN School of Computing, Sopron, Hungary.
1 Introduction
Software is expensive. Building it and maintaining it are labour intensive activities, but delays in delivery
can be very costly and any undetected problems may cause loss of performance and frustrate users. This
also applies to High Energy Physics, even though physicists were at the forefront of computing in the
early days and still are in several areas.
Today the use of computers for physics is much more pervasive, and the size, complexity and lifetime of
the software are much larger. For most of the physicists and engineers working in experiments, the
business is physics or engineering, not computing, and many technical managers are not sufficiently
computer literate. The use of software is growing, and often it seems that the process of producing it is out of control.
There is a feeling that too few people have an adequate software engineering background to
improve the situation.
The software development teams around experiments are traditionally very independent. Aspects of the
software development culture are local to various centers, although the same application packages are
widely used in the community for some aspects of the work.
Three key components drive all improvements in software development productivity: the PEOPLE
involved, the organization of the development PROCESS, and the TECHNOLOGY used. Talented people
are a very important element in any software organization, but even the best professionals need an
organized environment in which to do cooperative work. The same applies to advanced technology:
without an organizational framework, it cannot be fully effective, and there is evidence that going for new
technology instead of improving the process can make things worse.
2 People
It is obvious that any software system is dependent on the quality of the people who build it. People
working in Particle Physics have been selected for their intellectual abilities, so nobody could argue that
our software problems are due to the lack of talented people. Our people have produced many excellent
programs and packages, and superior programs derive from superior design.
Superior design, on the other hand, is always the work of people who understand the application domain.
This is a necessary condition of success: anybody who does not understand statistics cannot build a good
statistical system. Of course, he or she may profit from the help of those who are familiar with the technical aspects of
software production. We can confidently assume that in our field we have bright people who know their
application domain. Let us see what they do in the scientific discovery process, and what their
involvement with software is.
and similar systems, CAVIAR, the histogramming package HBOOK and its more recent interactive version
PAW, and so on.
3 Process
Many large organizations are actively trying to find ways of producing better software at lower cost, within
predictable resource allocations and time estimates.
In recent years, it has been realized that the way to progress is to study and improve the way software is
produced, while better technology only helps once the organizational framework is set. This discipline
goes under the name of “Software Process Improvement” (SPI), and the reference for it is the work of the
Software Engineering Institute [1] at Carnegie Mellon University.
The book by W. S. Humphrey, “Managing the Software Process” [2], is a good starting point to learn about
the basic ideas behind this approach. SPI is based on statistical process control, which was defined in the 1930s,
adopted by Japanese industry after WW2, and is by now used in all industrial countries of the world. It is Total Quality
Management applied to software.
important to distinguish them logically, and identify documents that are the outcome of the various
phases. For software that comes in successive versions, the life cycles of the versions usually overlap: one
may be designing version 3 while delivering version 2 to users who run version 1.
3.3.1 Assessment
Before a successful attempt can be made to improve the software process, it is essential to carry out an
assessment in order to rate the current status, formulate goals and an improvement program to reach those
goals. The Bootstrap Project [7] by ESPRIT in Europe has been in operation for a few years. Starting from an
extended SEI model, an assessment procedure has been defined, and tried out on a number of
organizations in different countries. The assessment is based on two questionnaires: one addresses
software quality attributes while the other deals with management issues that reflect the organization’s
maturity. The replies are then analyzed and the results are presented showing the key attributes grouped
under three main areas: Organization, Methodology and Technology.
The weakest aspects will point in the direction of improvement actions that must be identified by the
assessment team and given a priority that takes into account also external conditions such as cost, resource
availability etc. Actions must be followed up, and the assessment/actions cycle repeated regularly, if the
organization intends to improve its maturity. A statistical survey of results from different types of
organizations and countries has shown that most of the European software producers are between levels 1
and 2. Telecom companies are almost at 2, banks just above 1. National differences are also visible.
4 Technology
Various activities and tasks are performed during the software development process to go from user
requirements to the delivered product. The machine can often help the human, sometimes even
automatically, to avoid errors, work faster and concentrate on the conceptual level.
The programs that help to build software are called software development tools and are very
specialized. Some are designed for one activity during the whole life cycle, while others support a specific action
limited to one phase of the life cycle. In these notes we cannot cover all aspects of the technology concerned
with software development. We illustrate our presentation with a few tools that can help to introduce the
concepts and that we have experience with.
and output of each task. Once the process is modelled, ProcessWeaver helps to follow its execution by
automating the start of each task within the process when all required input is available. It can also start the
appropriate tools needed for the execution of the task. On the other hand, MS-Project (figure 4) is a project
management tool to handle resources allocated to the project in terms of manpower, time and other
icon represents a task he/she must perform. Clicking on an icon opens a work context (3) with a description of the task
(3a), icons for all input and output documents to be produced (3b, 3c) and completion/decision buttons (3d). Clicking on
an input/output document will start the appropriate tool (e.g. FrameMaker, C/C++ programming environment ...).
Similarly, clicking on a completion/decision button will end the task and send it to the right project member.
The manager uses PetriNets (1) to model the process, describing the tasks to be performed, by whom, in which order,
and the input and output of each task. Each place (1a) models a task under execution by one or more project members.
Each transition (1b) models the submission of a task to one or more project members after the completion of the
previous task(s). Team members can open a read-only dynamic view of this process model to consult its state
and see which tasks are in progress and who is handling them.
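The place/transition mechanics described in this caption can be sketched in a few lines of Python. This is only an illustration of how a Petri net hands work from one task to the next: the class, its methods and the task names (`write_spec`, `review_spec`, `done`) are hypothetical and do not reflect ProcessWeaver's actual data model or interface.

```python
from collections import Counter


class PetriNet:
    """Minimal place/transition net: places hold tokens, transitions move them."""

    def __init__(self):
        self.marking = Counter()   # place name -> number of tokens
        self.transitions = {}      # transition name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        needed = Counter(inputs)
        return all(self.marking[place] >= n for place, n in needed.items())

    def fire(self, name):
        """Consume one token per input place and produce one per output place."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] += 1

    def tasks_in_progress(self):
        """Places that currently hold a token, i.e. tasks under execution."""
        return sorted(place for place, n in self.marking.items() if n > 0)


# Hypothetical two-step process: write a specification, then review it.
net = PetriNet()
net.add_transition("submit_for_review", ["write_spec"], ["review_spec"])
net.add_transition("accept", ["review_spec"], ["done"])
net.marking["write_spec"] = 1      # the author is working on the specification

net.fire("submit_for_review")      # completing the task hands it to the reviewer
print(net.tasks_in_progress())     # ['review_spec']
```

The read-only view mentioned in the caption corresponds here to `tasks_in_progress()`, which only inspects the marking and never changes it.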
figure 4 The MS-Project window is divided into three sections. Section (1) is a list of tasks and sub-tasks with the start time, end
time and the resources allocated to them. Section (2) shows a time flow of all tasks with their dependencies and
milestones. Section (3) shows for each resource when it is in use, free or over-allocated. Unlike ProcessWeaver
(figure 3), MS-Project only displays the information statically: it must be fed with information on the state of each task,
and it then reschedules all the dependent tasks automatically.
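The automatic rescheduling of dependent tasks can be illustrated with a small forward-scheduling sketch. It assumes durations in days, a dependency graph without cycles, and invented task names; it is not a description of how MS-Project itself computes its schedule.

```python
def schedule(durations, depends_on):
    """Earliest start and end day of each task, given durations and dependencies.

    Assumes the dependency graph has no cycles.
    """
    start, end = {}, {}

    def place(task):
        if task not in end:
            # A task starts when the latest of its predecessors has finished.
            start[task] = max((place(dep) for dep in depends_on.get(task, [])),
                              default=0)
            end[task] = start[task] + durations[task]
        return end[task]

    for task in durations:
        place(task)
    return start, end


# Hypothetical project plan.
durations = {"design": 5, "code": 10, "write_manual": 4, "integrate": 2}
depends_on = {"code": ["design"], "write_manual": ["design"],
              "integrate": ["code", "write_manual"]}

_, planned = schedule(durations, depends_on)

durations["code"] = 13            # status update: coding takes three days longer
_, revised = schedule(durations, depends_on)
print(planned["integrate"], revised["integrate"])   # 17 20
```

Only the tasks downstream of the delayed one move: `write_manual` keeps its dates, while `integrate` slips by three days.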
ProcessWeaver and MS-Project are complementary and can be coupled: ProcessWeaver can read the MS-
Project description file to generate the process skeleton, and MS-Project can read runtime information on
the process status generated by ProcessWeaver.
Another domain where tools help is the software process assessment, to measure the maturity of the
process and set improvement goals. The ESPRIT project AMI (Application of Metrics in Industry)
produced a 12-step method for the measurement of the software development process, which starts with an
SEI self-assessment. It ensures that quantitative approaches are used to achieve company objectives and
improvement in the software development process. AMItool (figure 5) implements the AMI method and
takes the user from the SEI questionnaire to the measurement plan.
figure 5 AMItool implements the AMI method. To begin with, one answers an SEI questionnaire (1). AMItool compiles and
presents the results of the questionnaire in a Kiviat graph (2). On this Kiviat graph each axis represents one domain of
the process (e.g. Training, Organization) and is divided into the 5 SEI CMM levels. A dot is placed at the corresponding
level of maturity on each axis. The process manager, using the results of the questionnaire, identifies primary goals
(e.g. ‘Gain Better Understanding of Project Costing’) and decomposes them, in the goal tree (3), down to basic sub-
goals. He then defines metrics and interpretation rules and associates them with goals and basic sub-goals (4). AMItool
then generates a plan (5) for the measurement of the process. After a while, another AMI iteration can start.
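The step from goal tree to measurement plan can be sketched as a simple tree walk. The primary goal below is the one quoted in the caption; the sub-goals, the metrics and the `Goal` class are invented for illustration and are not taken from AMItool.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    metrics: list = field(default_factory=list)    # metrics attached to this goal
    subgoals: list = field(default_factory=list)   # decomposition into sub-goals


def measurement_plan(goal, path=()):
    """Walk the goal tree and list (goal path, metric) pairs to be measured."""
    here = path + (goal.name,)
    plan = [(" > ".join(here), metric) for metric in goal.metrics]
    for sub in goal.subgoals:
        plan.extend(measurement_plan(sub, here))
    return plan


# Sub-goals and metrics are hypothetical examples.
tree = Goal(
    "Gain Better Understanding of Project Costing",
    subgoals=[
        Goal("Know effort per phase",
             metrics=["person-days per life cycle phase"]),
        Goal("Know cost of rework",
             metrics=["defects found after delivery",
                      "person-days spent on fixes"]),
    ],
)

for goal_path, metric in measurement_plan(tree):
    print(f"{goal_path}: {metric}")
```

Keeping the path from the primary goal to each metric makes it possible to trace every measurement back to the objective that motivated it.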
Communication and information flow in a project can be improved by suitable tools. For example, a
discussion over the network involving more than two people can be difficult to follow because more than
one idea is discussed at the same time. This is where WIT (World Wide Web Interactive Talk) [11] can help.
WIT (figure 6) allows discussions through the WWW and displays discussion items in a structured fashion.
Lotus Notes is a commercially available tool that addresses the same problem and works across UNIX
machines, PCs and Macs.
figure 6 A WIT system is composed of Discussion Areas. A Discussion Area is a general subject, which has specific aspects,
called Topics (1), and each Topic is composed of Proposals (2) which generate an exchange of messages (3). The big
advantage of WIT over the news groups is the structuring. When looking at a Proposal one can see what has been
discussed, who answered whom, who agrees/disagrees with whom, etc. WIT is not only useful for browsing through
discussions: it is also possible to agree or disagree with any statement in the Proposal using the tool itself.
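The structure described in this caption (Discussion Area, Topics, Proposals, messages) maps naturally onto nested records. The sketch below is an assumed data model written for illustration, with invented names and content; it does not describe how WIT stores its discussions.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    author: str
    text: str
    stance: str = "comment"        # "agree", "disagree" or "comment"
    replies: list = field(default_factory=list)


@dataclass
class Proposal:
    title: str
    messages: list = field(default_factory=list)


@dataclass
class Topic:
    title: str
    proposals: list = field(default_factory=list)


@dataclass
class DiscussionArea:
    subject: str
    topics: list = field(default_factory=list)


def tally(proposal):
    """Count agreements and disagreements anywhere in a proposal's thread."""
    counts = {"agree": 0, "disagree": 0}
    pending = list(proposal.messages)
    while pending:
        message = pending.pop()
        if message.stance in counts:
            counts[message.stance] += 1
        pending.extend(message.replies)
    return counts


# Hypothetical discussion.
proposal = Proposal("Adopt a common coding standard", messages=[
    Message("anna", "Saves review time.", "agree",
            replies=[Message("ben", "Too rigid for prototypes.", "disagree")]),
    Message("carla", "Which standard?"),
])
area = DiscussionArea("Software process", [Topic("Coding rules", [proposal])])
print(tally(proposal))   # {'agree': 1, 'disagree': 1}
```

Because each message keeps its own replies, who answered whom can be recovered directly from the nesting, which is the advantage over a flat news group thread.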
At each stage of the life cycle many documents are produced by different people: requirements
specification, analysis, source code, user manuals, etc. These documents are heavily structured and cross-
referenced. LIGHT (LIfecycle Global HyperText) [12] integrates these documents and allows navigation
(figure 7) between and inside requirements, diagrams, code, manuals, etc.
figure 7 The LIGHT concept is illustrated by this ADAMO [13] example. The navigation map (1) shows all documents that are
linked here: pictures, requirements, analysis diagrams (2), DDL (3), FORTRAN source code (4) and ADAMO reference
manuals (5). Clicking on any of the document symbols will go to the corresponding document, which is itself linked to
the others. For example clicking on the “track” class in the diagram will open the “track” declaration in the DDL. Similarly
clicking on an ADAMO routine call within the code will open the ADAMO manual entry for that routine.
The verification software must itself be specified and written. Many commercial tools are available to help
organize verification and validation activities. For example, Purify and TestCenter (figure 8) detect memory
access errors and memory leaks in C and C++ programs. Similarly, TestCenter and Logiscope (figure 9) can
visualize which part of the code has been executed during tests (test coverage). Programming in a
higher-level language is another way of preventing memory access errors and memory leaks.
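As a rough illustration of test coverage at the level of function calls, the sketch below uses Python's standard tracing hook to record which functions a test actually enters. It is far simpler than the commercial tools named above and is not modelled on them; the functions `parse` and `report_error` are hypothetical.

```python
import sys


def call_coverage(test, functions):
    """Run `test` and report which of `functions` were entered at least once."""
    watched = {f.__code__: f.__name__ for f in functions}
    executed = set()

    def tracer(frame, event, arg):
        if event == "call" and frame.f_code in watched:
            executed.add(watched[frame.f_code])
        return None                 # no line-by-line tracing needed

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        test()
    finally:
        sys.settrace(previous)      # restore whatever tracer was active before
    return {name: name in executed for name in watched.values()}


# Hypothetical code under test.
def parse(text):
    return int(text)


def report_error(text):
    return f"bad input: {text}"


def test_parse():
    assert parse("42") == 42


print(call_coverage(test_parse, [parse, report_error]))
# {'parse': True, 'report_error': False}
```

The `False` entry is the useful result: it shows that no test exercises the error path, so a test with invalid input is missing.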
figure 8 TestCenter is an environment for C/C++ testing and debugging. It shows memory leaks (memory with no pointer to it)
and possible memory leaks (memory with a pointer to its middle). It can also detect runtime errors such as: pointer and
array boundary errors, use of uninitialized memory, attempted write to read-only memory, errors using free and
unanticipated termination signals. For memory leaks and runtime errors, TestCenter gives information on the
problem in the leak browser (1) and the error browser (2) respectively. It also shows the source code line where the
error was introduced, and can help with test coverage analysis to some extent.
figure 9 Logiscope performs test coverage analysis for programs. It displays the call graph tree (1) and highlights the function
calls not executed; similarly for each function, it shows the control graph (2) and highlights the instruction blocks not
executed.
The verification and validation will introduce changes and new releases of the software. It is important to
be able to perform defect logging and tracking with a flexible source code manager with a database and a
query system.
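A defect log with a database and a query system can be prototyped with very little code. The sketch below uses an in-memory SQLite database; the table layout, column names and sample defects are assumptions made for illustration, not a description of any particular tool.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE defects (
        id INTEGER PRIMARY KEY,
        found_in TEXT NOT NULL,    -- release in which the defect was found
        module TEXT NOT NULL,
        summary TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'open',
        fixed_in TEXT              -- release that contains the fix, if any
    )
""")


def log_defect(found_in, module, summary):
    """Record a new defect and return its identifier."""
    cursor = db.execute(
        "INSERT INTO defects (found_in, module, summary) VALUES (?, ?, ?)",
        (found_in, module, summary))
    return cursor.lastrowid


def close_defect(defect_id, fixed_in):
    """Mark a defect as fixed and remember which release carries the fix."""
    db.execute("UPDATE defects SET status = 'fixed', fixed_in = ? WHERE id = ?",
               (fixed_in, defect_id))


def open_defects(module):
    """Query: all defects of a module that are still open."""
    rows = db.execute(
        "SELECT id, summary FROM defects "
        "WHERE module = ? AND status = 'open' ORDER BY id",
        (module,))
    return rows.fetchall()


# Hypothetical defects.
first = log_defect("1.0", "io", "crash on empty input file")
second = log_defect("1.0", "io", "wrong units in header")
close_defect(first, fixed_in="1.1")
print(open_defects("io"))   # [(2, 'wrong units in header')]
```

Recording both the release where a defect was found and the release where it was fixed is what links the defect log to the source code manager.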
figure 10 For each function Logiscope calculates a set of metrics (size, comment rate, cyclomatic number, etc.) and displays the
results in a Kiviat graph (1). Each axis is one metric. The plot is the value of that metric for this function. The two circles
represent the minimum and maximum values acceptable. The metric distribution graph (2) is the detail of one metric
(here the cyclomatic number, an indicator of local code complexity, high for “spaghetti” code). Clicking on a bar of the
histogram will display the list of the functions for which the metric has the corresponding value. The configuration
manager can consult the control graph of each function. Logiscope also measures and displays metrics on the call
graph (reachability of a function, testability of a function, etc.). It can also check programming rules (banned keywords,
consistent usage of case, misuse of brackets, etc.). These rules and reference values for the metrics are defined by
the quality manager, and problems pointed out by Logiscope can be analyzed by the quality manager or by the
programmer himself.
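To make the cyclomatic number concrete, the sketch below computes a simplified version of it for Python functions: the number of decision points plus one. Which constructs count as decision points is an assumption of this sketch, so the values are not guaranteed to match those reported by Logiscope, and the acceptable range used at the end is invented.

```python
import ast

# Constructs counted as one decision point each (a simplifying assumption).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_number(source):
    """Return {function name: decision points + 1} for each function in `source`."""
    results = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            decisions = 0
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    decisions += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two decision points
                    decisions += len(child.values) - 1
            results[node.name] = decisions + 1
    return results


def out_of_range(metrics, minimum, maximum):
    """Names of the functions whose metric falls outside the accepted range."""
    return sorted(name for name, value in metrics.items()
                  if not minimum <= value <= maximum)


SAMPLE = '''
def straight(x):
    return x + 1

def tangled(xs):
    total = 0
    for x in xs:
        if x > 0 and x % 2 == 0:
            total += x
        elif x < -10:
            total -= x
    return total
'''

metrics = cyclomatic_number(SAMPLE)
print(metrics)                       # {'straight': 1, 'tangled': 5}
print(out_of_range(metrics, 1, 4))   # ['tangled']
```

The two bounds passed to `out_of_range` play the role of the two circles on the Kiviat graph: they are reference values chosen by the quality manager, not properties of the code.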
4.1.4 Maintenance
Once the software is developed, tested and delivered, users will start to work with it, and suggestions and
bug reports will come in. The maintenance team will sort this input and set priorities. After improving the
software, the development team carries out regression tests (i.e. checks that old tests still pass and that no new bugs have been
introduced). They may also have to introduce new test programs for the newly detected bugs.
A CASE tool can help in the application of the method. For example, Artifex (figure 11) supports the
PROTOB method based on High Level Colored Concurrent PetriNets with simulation and C/ADA code
generation. Similarly, Objecteering (figure 12) implements the Class Relation method with C++ code
generation. For a given method one usually finds many tools that support it and needs to choose the
appropriate one. IEEE has produced recommendations for the evaluation and selection of CASE Tools [30]
and suggests criteria such as: how well the methodology is supported, are all the concepts covered, how
much consistency checking is performed, how much the tool can be customized and the quality of its
graphical user interface.
figure 11 Artifex checks that the PetriNets are consistent and complete. It also allows debugging and prototyping with its
interactive simulation tool. Artifex generates distributed code along with the Makefiles. Developers can even compile,
start the remote code and monitor the execution using the interface. The logic of the PetriNet and the inter-process
communications are all handled by the tool. Developers can concentrate on the model and specification of the actions
in a transition.
figure 12 With Objecteering, the developer performs the analysis using the CR notation (1). He can enter the method
specification in C++ through the graphical user interface (2). Objecteering can then generate the C++ code and
automatically integrate the user specified part in the right place (3). All the syntax of the C++ files, the test statements
(IFDEFs) and the relationship operators are automatically generated and taken care of by the tool. Developers can
concentrate on the method and the model specification.
figure 13 ObjectCenter is a complete C/C++ programming environment. Its main window (1) has a source code browser area
(1a) and an input area (1b). The input area can be used as a C/C++ interpreter where all the data types and classes
defined in the source code are accessible. The error browser (2) can detect loading/compilation and run time errors.
Clicking on the error will highlight the offending line in the source code area. The inheritance browser (3) can display
a class tree. The programmer can browse through all the classes/structures defined in the loaded code and defined in
the external libraries. Selecting a class from the inheritance tree will display the class browser (4) where all the data and
functions defined in the class or inherited are listed. The programmer can filter on attributes (e.g. private, public, static,
virtual ...). Selecting a function from the class browser will display the cross-reference browser (5) where the
programmer can navigate in the call tree and identify the callers/called functions of any function.
The programmer can use the main window to run the program, set breakpoints and step through it. At run time the
programmer can display an object in the data browser (6) and see the values of its fields. If the object points to another
object then clicking on the pointer record next to the address field will display the pointed object, building dynamically
the data network. During execution the displayed objects are dynamically updated. The browsers presented here are
callable in any order and from one another through handy pop-up menus.
There exist other development environments such as VisualWorks for Smalltalk and LispWorks for LISP
with objects. Some of these can be combined with testing tools. For example, TestCenter and ObjectCenter
can be coupled to form a complete environment for editing, browsing, testing and debugging C/C++
such as STOP [32]. They usually include a table of contents, a map of the document, a glossary, an index
and a bibliography. Graphics is a must to make the document easier to understand and more readable.
Often the same information can have many incarnations that must be kept coherent. For example a
Reference Guide that exists on paper can also be published electronically on the World-Wide Web [20] and
be integrated in the on-line help. The paper “An Automated System for the Maintenance of Multiform
Documents” [33] (figure 14) shows how this was achieved for one specific product.
figure 14 As an example, the paper “An Automated System for the Maintenance of Multiform Documents,” shows how the
ADAMO documentation is automatically maintained on paper, World-Wide Web and KUIP. The system uses
FrameMaker (1) as the master format and to produce the printed version. Automatic converters generate HTML for
WWW (2) and CDF for KUIP (3). Examples are dynamically compiled, run and included, to ensure that they are in line
with new features.
4.2.6 Integration
The integration phase is where all the pieces of the system are put together. These may consist of different
executables, libraries, include files, documentation (see 4.2.5) and the installation scripts. The integration is
a detailed process where deliverables must be put in the right place and tested. The failure of any test leads
to further development followed by another integration phase. This means that one may need to go
through the integration process many times and follow the same steps each time. This is a typical example
where a tool such as ProcessWeaver can help in automating the repeatable part of the process. As
mentioned in 4.1.1 (figure 3), ProcessWeaver is a management tool that helps to automate the process
(figure 15) in terms of tasks, dependencies between tasks and inputs and outputs of each task.
figure 15 Like Artifex (see 4.2.3) ProcessWeaver uses PetriNets for modelling. The above model is the “WebMaker TAR file
verification process model.” The “TE_TAR_Collect_TS_TAR_Creation” is activated when a previous process (not
shown) leads to the collection of all deliverables. Then the TAR file is created. With the transition
“TE_TAR_Creation_TS_TAR_Review” all testers concerned are asked to test and accept/refuse it. Depending on the
number of refusals “CollectAnswer” will either go back to produce a new tar file with “TAR_Review_Incomplete” and
restart the process or continue with “TAR_Review_Complete_TS_TAR_MAIL” to mail the announcement of the new
version to the user. The actions performed in each transition (creation of the TAR, test of the TAR, delivery of the mail)
are not modelled as part of the process, but ProcessWeaver can call external tools. For example it can call a shell script
for the creation of the TAR or to run tests and analyze the output.
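The kind of external action such a process model might call, creating the TAR file and checking that all deliverables are in it, can be sketched as follows. The caption mentions a shell script; this is an equivalent written in Python for illustration, and the file names and the list of expected deliverables are invented.

```python
import tarfile
import tempfile
from pathlib import Path


def create_tar(source_dir, tar_path):
    """Pack every file under `source_dir` into a gzip-compressed TAR file."""
    source_dir = Path(source_dir)
    with tarfile.open(str(tar_path), "w:gz") as tar:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                tar.add(str(path),
                        arcname=path.relative_to(source_dir).as_posix())


def missing_deliverables(tar_path, expected):
    """Return the expected files that are absent from the TAR file."""
    with tarfile.open(str(tar_path), "r:gz") as tar:
        present = set(tar.getnames())
    return sorted(set(expected) - present)


# Hypothetical delivery kit built in a temporary directory.
with tempfile.TemporaryDirectory() as work:
    kit = Path(work) / "kit"
    (kit / "bin").mkdir(parents=True)
    (kit / "README").write_text("Installation instructions\n")
    (kit / "bin" / "install.sh").write_text("#!/bin/sh\necho installing\n")

    tar_path = Path(work) / "kit.tar.gz"
    create_tar(kit, tar_path)
    missing = missing_deliverables(
        tar_path, ["README", "bin/install.sh", "doc/manual.ps"])

print(missing)   # ['doc/manual.ps']
```

A non-empty result would correspond to a refusal in the review step, sending the process back to the creation of a new TAR file.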
4.2.7 Delivery
Once the TAR file of deliverables is ready the delivery can start. It may consist of several steps (figure 16):
inform the user of the availability of the software, distribute it, provide installation instructions and finally
run the installation scripts. Nothing magic, but a non-negligible amount of work.
figure 16 The files above are from WebMaker, a current development of the CERN ECP Programming Techniques Group. The
availability of the software is announced to the users registered on a mailing list via a Bulletin (1). The user may ask to
retrieve the software by ftp and is given instructions by E-mail (2). In the delivery kit, README files (3) explain the
contents of each file/directory and lead the user through the installation process. A large part of the installation is done
with shell scripts (4).
always the latest versions. Some of the WWW information can be derived from the printed version (4.2.5
figure 14).
figure 17 This is the WebMaker information Web. From the entry page (1) WebMaker users can consult the User’s Manual (2)
(generated as explained in 4.2.5), the Frequently Asked Questions (3), interesting examples, and a list of bugs and
suggestions, to name but a few.
Documentation and public information will never be perfect, and usually many user queries will need to
be handled, preferably via electronic mail. A system such as MH-mail (figure 18), a highly configurable mail
handler, can be of great help by filing messages automatically.
figure 18 MH-mail and its X interface EXMH can file the mail in a separate folder (2), reply to the mail, register the sender in a
mailing list etc. All this according to rules specified in a configuration file and depending on the subject, sender or any
header information in the mail. It also acts as a standard mail tool with a GUI (1) to read messages, and reply manually
to them.
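Rule-based filing of this kind can be illustrated with Python's standard `email` package. The rule format, folder names and addresses below are invented for the sketch and are not the configuration syntax used by MH-mail.

```python
from email import message_from_string

# Hypothetical filing rules: (header, substring to look for, target folder).
RULES = [
    ("Subject", "bug", "bugs"),
    ("Subject", "subscribe", "mailing-list"),
    ("From", "@cern.ch", "internal"),
]


def choose_folder(raw_mail, rules=RULES, default="inbox"):
    """Return the folder of the first rule whose header contains the pattern."""
    message = message_from_string(raw_mail)
    for header, pattern, folder in rules:
        value = message.get(header, "")
        if pattern.lower() in value.lower():
            return folder
    return default


mail = (
    "From: user@example.org\n"
    "Subject: Bug in the HTML converter\n"
    "\n"
    "The table of contents is empty.\n"
)
print(choose_folder(mail))   # bugs
```

The rules are tried in order, so the most specific ones should come first; anything that matches no rule stays in the default folder for manual handling.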
We would like to thank Ian Hannell for his contribution and advice concerning the process part and help with
this document, Pierrick Pinasseau for setting up the cluster we used in Sopron and for his daily assistance
during the school, Joel Closier for the installation of all the software tools used, and the suppliers of the
tools, who kindly provided us with free demonstration licences and information leaflets for those who
attended the lectures.
