Design & Build An Access Database PDF
Database Systems
Mark Gregory
Page 1 of 144
Designing and Building Access
Database Systems
Mark Gregory
École Supérieure de Commerce de Rennes
Previously,
1. INTRODUCTION: WHO IS THIS DOCUMENT FOR?
1.1. Preface
2.1. Databases (bases de données) and how they are designed
2.5. Background
2.10. Basic concepts
2.13. Attribute
2.18. Queries
3.0.8. Business Process Modelling
3.0.9. SSADM
3.0.10. MERISE
3.4. Decide the purpose and basic contents of the database – Data Modelling
3.4.1. Basic Constructs of ER Modelling
3.4.2. Deciding entity types
3.4.3. Entities
3.4.4. Relationships
3.4.5. Fields: What are the attributes of each entity?
3.4.6. Data type: Domain
3.4.7. Identify Domains
3.4.8. Classifying Relationships
3.4.9. Keys: primary and secondary (“foreign”)
3.4.10. Normalisation
3.4.11. ER Notation
3.4.12. Online tutorial
3.4.13. DFDs and ERDs – why both? How are they linked?
3.4.14. Why BOTH Data and Process models?
3.8. How will Input / Update be carried out (Forms etc.)?
4.2. An Exercise
7.4. Attribute types in MS Access
7.7. Relationships
7.7.2. Relationships and linking: Enforcing referential integrity where appropriate
8.1. Sample databases and applications included with Microsoft Access
8.1.1. NorthWind Traders sample database (English edition) / Les Comptoirs (édition française)
8.1.2. Database Wizards (Assistants)
12. THE PROCESS OF DECIDING WHAT HAPPENS TO STUDENTS
16. PROCESSES
25. ANYTOWN ER DIAGRAM
27. TERMINOLOGY ASSOCIATED WITH DATA MODELLING AND DATABASE DESIGN
0. REFERENCES
1.2. What to do if a use case diagram won’t fit on a single page?
1.6. Using Use Cases to identify System Inputs and Outputs
2.8. The elements of a DFD
2.10. First List the Elements of the Data Flow Diagram
5.3. Some difficulties associated with forms and subforms and how to overcome them
5.4. Subform not updated
5.5. Detail subform does not show the subset of records based on the value of the current master form record
8. APPENDIX 8 STRUCTURED WALKTHROUGHS, A WAY TO IMPROVE THE QUALITY OF ANALYSIS
8.1. How to seek for perfection! Improving the quality of our work
1. Introduction: Who is this document for?
1.1. Preface
This booklet aims to help you learn how to design and build applications using Microsoft
Access. It is written to be read and understood as you work on your own design and build
experiments.
This Access database design and implementation document is a higher-level self-instruction
booklet; it is assumed that you are already a fairly competent Access user.
If you need to learn how to use Microsoft Access, please see section 1.6 for further advice.
and not just a theoretical one. (Corresponds to Level 1 above)
♦ If you are aspiring to general competence in business studies:
to give you reasonable skills in the analysis and
construction of effective, albeit small-scale, computerised
business information systems. (Corresponds to Level 2
above)
♦ If you really want to exploit the power of databases and
systems and / or you aspire to the challenge of managing
information systems professionals: To help you reach the
point where you can analyse a user's requirements, design
a solution for them, and refine the solution by
building a working prototype in MS Access. (Corresponds to
Level 3 above)
♦ If you are a budding IS professional, or wish to become a
systems analyst or consultant: This document is a starting point
only – you will need specific additional training and
experience. (Corresponds to Level 4 above)
Note that material which only applies to Level 3 or above is shown in grey-background Arial Narrow, like
this paragraph.
databases are a very powerful way to structure data and to be able to get the
information you need as a future manager.
∗ Creating a report based on a query
♦ Relationships
∗ Creating a relationship between tables
∗ Creating a query which uses linked tables
∗ Forms, sub-forms and their use with one-to-
many (1:M) relationships
∗ Many-to-many relationships and multi-part
primary keys
♦ Forms – more advanced use
∗ Using list and combo boxes
∗ Combo boxes (zones de liste déroulantes)
and subforms (sous-formulaires)
∗ Creating a subform (sous-formulaire)
∗ Inserting a subform into a main form
∗ Subforms of subforms
∗ Adding record navigation buttons
1.9. Acknowledgements
I should like to thank:
♦ Former Huddersfield colleagues Dr. Steve Wade and Dr. Ken
Lunn
♦ ESC Rennes colleagues, notably Dr. Renaud Macgilchrist
♦ Previous ESC Rennes students
The following students gave me permission to reuse parts of their excellent
work on the Anytown Business School group case. I have incorporated this
case as a worked example in this document, and made significant use of these
students’ work:
Marine CORRE; Marie GALATAUD; Emmanuelle HAMEURY;
Naïla MALTI
SECTION 1 - THE PRINCIPLES OF DATABASE
[Diagram labels: Information; Store Data; Retrieve Data; Database]
This diagram, which shows the structure of a data processing system (a synonym for business
information system), highlights the central importance of the database as the place where data is
stored and from which it is retrieved.
2.4. Why study Databases?
2.5. Background
Some understanding of what a database is, how it is used, and (to a greater or lesser extent) how
databases are designed is essential to understanding electronic business.
Businesses are systems; they use Information Systems, which are based on Information and
Communications Technology.
Example: any e-commerce company provides a Web window onto its internal catalogue: a web
page connected to a database.
Every stakeholder needs information from the business. They generally obtain this as
information presented on forms (screens), reports and dynamic web pages (web pages which
show the current contents of a database and permit stakeholders to update that database).
Relative strengths and weaknesses of Word, Excel and Access for storing data
♦ Word processing, e.g. Word
∗ Advantage: Simple, well understood by people with weak computing skills
∗ Advantage: Excellent formatting options
∗ Disadvantage: No formulae (or only very rudimentary ones)
∗ Disadvantage: Tables are not related in any way
∗ Disadvantage: Can only be updated by one person at a time
∗ Disadvantage: The data in a table has no “structure” known to the computer
♦ Spreadsheet, e.g. Excel
∗ Advantage: Some degree of structure: cells organised into rows and columns,
with links possible between the cells
∗ Advantage: Very powerful data manipulation using formulae
∗ Advantage: Separate tables can be held in different worksheets
∗ Advantage: Items of data can be related together using lookup formulae such as
VLOOKUP (RECHERCHEV) and HLOOKUP (RECHERCHEH)
∗ Disadvantage: Persistent data is not safe
∗ Disadvantage: Size limits: 65,536 rows (until Office 2007)
∗ Disadvantage: No design methodology or coherence: it is possible and easy to mix
data up in a way which makes it impossible to find, update and relate
∗ Disadvantage: Poor support for queries: searching is slow, and the lookup
formulae are far from being intuitive
∗ Disadvantage: Can only be updated by one person at a time
♦ Database, e.g. Access
∗ Advantage: Each kind of data is stored by the database management system (DBMS)
in its own separate table. The tables are related together in accordance with the
Relational data model; this gives coherence to the collection of tables, which is
the whole database
∗ Advantage: Very powerful data structuring and querying. In fact a query is just
a results table which combines together selected data from more than one stored
table. The database program enables the user to say what data they need: they
construct a query which precisely specifies what data is to be retrieved into the
results table
∗ Advantage: Safer persistent data (though less safe than bigger, more powerful
DBMS programs like Microsoft SQL Server, ORACLE etc)
∗ Advantage: Is multi-user: that is, more than one person at a time can change
(update) the database
∗ Advantage: Since every record in a table has the same basic structure, it is
much easier and / or more cost-effective to process complete sets of records under
program control
∗ Disadvantage: More difficult to use and to learn (at first)
∗ Disadvantage: Requires thoughtful use and advance planning
∗ Disadvantage: Access databases are not directly web-accessible
∗ Disadvantage: The programming language within Access, VBA, is too difficult
and / or inappropriate for most business users to learn
Figure 1 Comparative strengths and weaknesses of data storage in two dimensional tables: Microsoft
Office tools
Order number | Customer name | Customer address | Product code | Product description | Unit of sale | Price per unit | Quantity | Amount | Note
O001 | GREGORY Mark | 1 La Rue | P001 | Apples | kg | 0,80 € | 2,5 | 2,00 € |
O002 | GREGORY Mark | 1 La Rue | P876 | Oranges | kg | 0,90 € | 1 | 0,90 € |
O003 | MACGILCHRIST Renaud | 1 La Croix | P001 | Apples | kg | 0,80 € | 3 | 2,40 € |
O004 | GREGORY Mark | 1 La Rue | P001 | Apples | kg | 0,80 € | 2 | 1,60 € |
O005 | MACGILCHRIST Renaud | 1 La Croix | P876 | Oranges | kg | 1,05 € | 1,5 | 1,58 € |
O006 | GREGORY Mark | 11 La Rue | P001 | Apples | kg | 0,90 € | 2 | 1,80 € | Mistyped address
O007 | GOT Guillaume | 1 L’Avenue | P001 | Apples | kg | 0,90 € | 3 | 2,70 € |
O008 | GREGORY Mark | 99 Le Chemin | P876 | Oranges | kg | 0,90 € | 1,5 | 1,35 € | Changed address
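The problem with the single flat table can be made concrete with a short sketch. The Python below is only an illustration (the booklet itself works in Access, not Python): it copies the order number, customer name and customer address columns from the table above and lists every customer who appears with more than one address. The function names are my own assumptions.

```python
from collections import defaultdict

# (order number, customer name, customer address), copied from the flat table above
FLAT_ORDERS = [
    ("O001", "GREGORY Mark", "1 La Rue"),
    ("O002", "GREGORY Mark", "1 La Rue"),
    ("O003", "MACGILCHRIST Renaud", "1 La Croix"),
    ("O004", "GREGORY Mark", "1 La Rue"),
    ("O005", "MACGILCHRIST Renaud", "1 La Croix"),
    ("O006", "GREGORY Mark", "11 La Rue"),
    ("O007", "GOT Guillaume", "1 L'Avenue"),
    ("O008", "GREGORY Mark", "99 Le Chemin"),
]


def addresses_by_customer(rows):
    """Collect the distinct addresses recorded against each customer name."""
    found = defaultdict(set)
    for _order_number, name, address in rows:
        found[name].add(address)
    return found


def inconsistent_customers(rows):
    """Return only the customers whose orders carry more than one address."""
    return {
        name: sorted(addresses)
        for name, addresses in addresses_by_customer(rows).items()
        if len(addresses) > 1
    }


if __name__ == "__main__":
    for name, addresses in inconsistent_customers(FLAT_ORDERS).items():
        print(name, "->", addresses)
```

Because the address is repeated on every order, the data alone cannot tell us which of the three addresses is the right one. Storing each customer once, in a table of their own, removes the question.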
∗ PRODUCT table
Product code | Product description | Unit of sale | Standard price per unit
P001 | Apples | kg | 0,80 €
P876 | Oranges | kg | 0,90 €
Note: Price per unit is on both tables! One is standard, the other order-specific.
∗ ORDER table
Order number | Customer number | Product code | Actual price per unit | Quantity | Amount
O001 | C001 | P001 | 0,80 € | 2,5 | 2,00 €
O002 | C001 | P876 | 0,90 € | 1 | 0,90 €
O003 | C002 | P001 | 0,80 € | 3 | 2,40 €
O004 | C001 | P001 | 0,80 € | 2 | 1,60 €
O005 | C002 | P876 | 1,05 € | 1,5 | 1,58 €
O006 | C001 | P001 | 0,90 € | 2 | 1,80 €
O007 | C003 | P001 | 0,90 € | 3 | 2,70 €
O008 | C001 | P876 | 0,90 € | 1,5 | 1,35 €
♦ This still isn’t perfect, since Orders and their Details continue to
be mixed together in one table. 1
1
The solution here includes the introduction of a link or intersection entity, called Order
Detail. See section 2.20 for a general description of what must be done.
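As a rough illustration of how the separated PRODUCT and ORDER tables work together, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. Several things are assumptions made for the sketch: the table is named orders because ORDER is a reserved word in SQL, prices are held as whole cents to avoid rounding problems, the Amount column is left out, and customer number is kept as plain text without a link to a customer table. The query lists the orders whose actual price differs from the product's standard price.

```python
import sqlite3

# (product code, description, unit of sale, standard price in cents)
PRODUCTS = [
    ("P001", "Apples", "kg", 80),
    ("P876", "Oranges", "kg", 90),
]

# (order number, customer number, product code, actual price in cents, quantity)
ORDERS = [
    ("O001", "C001", "P001", 80, 2.5),
    ("O002", "C001", "P876", 90, 1.0),
    ("O003", "C002", "P001", 80, 3.0),
    ("O004", "C001", "P001", 80, 2.0),
    ("O005", "C002", "P876", 105, 1.5),
    ("O006", "C001", "P001", 90, 2.0),
    ("O007", "C003", "P001", 90, 3.0),
    ("O008", "C001", "P876", 90, 1.5),
]


def build_shop_db():
    """Create an in-memory database holding the two tables shown above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only checks foreign keys when asked
    conn.executescript(
        """
        CREATE TABLE product (
            product_code         TEXT PRIMARY KEY,
            description          TEXT NOT NULL,
            unit_of_sale         TEXT NOT NULL,
            standard_price_cents INTEGER NOT NULL
        );
        CREATE TABLE orders (
            order_number       TEXT PRIMARY KEY,
            customer_number    TEXT NOT NULL,
            product_code       TEXT NOT NULL REFERENCES product(product_code),
            actual_price_cents INTEGER NOT NULL,
            quantity           REAL NOT NULL
        );
        """
    )
    conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", PRODUCTS)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", ORDERS)
    conn.commit()
    return conn


def orders_off_standard_price(conn):
    """Join each order to its product and keep those sold at a non-standard price."""
    return conn.execute(
        """
        SELECT o.order_number, p.description,
               p.standard_price_cents, o.actual_price_cents
        FROM orders AS o
        JOIN product AS p ON p.product_code = o.product_code
        WHERE o.actual_price_cents <> p.standard_price_cents
        ORDER BY o.order_number
        """
    ).fetchall()
```

The product description is stored once, in product, and is fetched through the product code whenever an order needs it. This is why the standard price and the order-specific price can sit on different tables without contradicting each other.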
an entity
Example: student number uniquely identifies a Student
♦ Relationship: a logical connection or dependency between
two entities
Example: any one programme has many students; any one student is on
precisely one programme: we say that a one to many relationship exists
between programme and student
2.12. An example: Students by Programme
In the diagram, the two rectangular boxes represent entity types. Here, they are programme and
student. They are represented as different entity types because they represent different things in
the real world. At least in theory, a programme could exist without any students. Almost by
definition, a student is on a programme of some kind, but it is clear that programme and student
are not the same things. It is equally clear that they are related. The diagram represents this
relationship by using a line with a crow's foot at one end of it. The crow's foot
marks the many end of a one to many relationship, often written simply as 1:M.
It is necessary to have an additional attribute on the student which links the student to its
owning programme. On the sample data provided with the diagram, we have shown Annabelle
Leuchars as being a student on the Executive MBA, by including the Programme code in the
Student table. Programme code is a foreign key, which links the Student back to her
Programme.
2.13. Attribute
An attribute is a property of an entity, a single fact about the entity. An entity type will
normally have several different attributes, one (or occasionally more) of which uniquely
identifies every instance of the entity type. The identifying attribute or group of attributes is
called the primary key for the entity type.
The Attributes of Programme are Programme Code (primary key), Programme Name, and
Programme leader
The Attributes of Student are Number (primary key), First name, Last name, Programme Code
(foreign key)
Programme code has to be present as a foreign key in the student entity in order to represent the
relationship which exists between programme and student.
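The attribute lists above can be written down as table definitions. The following is a minimal sketch in Python's sqlite3 module, not Access itself; the column names, the programme code EMBA, the student number S001 and the programme leader's name are all made-up values for illustration. It shows the primary key on each table and the foreign key on student, and it shows that a student cannot be added for a programme code which does not exist.

```python
import sqlite3


def build_school_db():
    """Create programme and student tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only checks foreign keys when asked
    conn.executescript(
        """
        CREATE TABLE programme (
            programme_code   TEXT PRIMARY KEY,
            programme_name   TEXT NOT NULL,
            programme_leader TEXT
        );
        CREATE TABLE student (
            student_number TEXT PRIMARY KEY,
            first_name     TEXT NOT NULL,
            last_name      TEXT NOT NULL,
            programme_code TEXT NOT NULL REFERENCES programme(programme_code)
        );
        """
    )
    # Hypothetical sample values: only the names come from the text above.
    conn.execute(
        "INSERT INTO programme VALUES (?, ?, ?)",
        ("EMBA", "Executive MBA", "Dr Example"),
    )
    conn.execute(
        "INSERT INTO student VALUES (?, ?, ?, ?)",
        ("S001", "Annabelle", "Leuchars", "EMBA"),
    )
    conn.commit()
    return conn


def try_enrol(conn, student_number, first_name, last_name, programme_code):
    """Try to add a student; return False if a key rule rejects the record."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO student VALUES (?, ?, ?, ?)",
                (student_number, first_name, last_name, programme_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A call such as try_enrol(conn, "S003", "No", "Programme", "XXXX") returns False because no programme has the code XXXX, and reusing the student number S001 is rejected by the primary key.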
2.18. Queries
The purpose of a database is to enable users to get the specific information they need. This can
be done using queries. Queries are useful in themselves and also serve as the basis for
reports and for forms.
♦ To answer a question like “Who is programme leader for a
given student?”, we can get all the necessary information by a
query on both tables: programme and student
♦ Note that the name of the programme leader should be an
attribute of programme, and definitely NOT of student!
Answering such a question is the work of the relational database management system
software (RDBMS). A user of the database formulates a query, and the RDBMS looks up
details of occurrences in both entity types, joining the answers together as a result
presented to the user.
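The programme leader question can be sketched as a single query over both tables. The example uses Python's sqlite3 module to stand in for Access; the table and column names, the student number S001 and the leader's name are assumptions made for the illustration.

```python
import sqlite3


def build_query_demo_db():
    """Create a tiny programme/student database for the query below."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE programme (
            programme_code   TEXT PRIMARY KEY,
            programme_name   TEXT NOT NULL,
            programme_leader TEXT
        );
        CREATE TABLE student (
            student_number TEXT PRIMARY KEY,
            first_name     TEXT NOT NULL,
            last_name      TEXT NOT NULL,
            programme_code TEXT NOT NULL REFERENCES programme(programme_code)
        );
        INSERT INTO programme VALUES ('EMBA', 'Executive MBA', 'Dr Example');
        INSERT INTO student VALUES ('S001', 'Annabelle', 'Leuchars', 'EMBA');
        """
    )
    return conn


def programme_leader_for(conn, student_number):
    """Follow the student's foreign key to the programme and return its leader."""
    row = conn.execute(
        """
        SELECT p.programme_leader
        FROM student AS s
        JOIN programme AS p ON p.programme_code = s.programme_code
        WHERE s.student_number = ?
        """,
        (student_number,),
    ).fetchone()
    return row[0] if row else None
```

The leader's name is stored once, on programme. If it were stored on student, it would have to be corrected on every student record whenever the programme leader changed.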
[Diagram: Module and Student entity types]
In this diagram, the two rectangular boxes represent entity types. Here, they are Module and
Student. The relationship is Many-to-Many. The diagram represents this relationship by using a
line with a crow's foot at both ends of it. Each crow's foot marks a many end of
the many to many relationship, often written simply as M:M or M:N.
This model reflects the empirical observations that:
1. Any one student studies many modules
2. Any one module has many students
Many-to-Many relationships are very common. They are also problematical – this is because
actual database management systems like Access (and almost all others) cannot support Many-
to-Many relationships directly.
However, by following simple rules, it is possible to eliminate many-to-many relationships.
Resolving Many-to-Many Relationships
[Diagram: entity types and the attributes recoverable from the figure]
Module: Module name; Module leader
Module Registration: Module code (PK, FK1); Student no (PK, FK2); Module result
Student: Student no (PK); Surname; Forenames
Sample Module Registration data:
Module code | Student no | Module result
IS402E | 20099234 | A
IS402E | 20099235 | B
IS402E | 20099897 | C
OB401E | 20099234 | Fx
♦ Note that there is only one primary key, made up of two
attributes
∗ Neither Module code nor Student no is unique in the
Module registration table – but the combination is
unique
Unless a student is allowed to do a module a second time, in which
case it is necessary to add a further attribute, usually a date, to the
compound primary key in order to make it unique again:
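The compound key idea can be tried out in a few lines. This is a minimal sketch in Python's sqlite3 module, not the figure from the original booklet. The first table uses Module code plus Student no as its primary key, with the sample registrations shown earlier; the second adds a registration_year column to the key. That column, its name and the years used are assumptions chosen for illustration, and the links back to Module and Student are left out to keep the sketch short.

```python
import sqlite3

# (module code, student no, module result), from the sample data shown earlier
REGISTRATIONS = [
    ("IS402E", "20099234", "A"),
    ("IS402E", "20099235", "B"),
    ("IS402E", "20099897", "C"),
    ("OB401E", "20099234", "Fx"),
]


def build_registration_db():
    """Create one link table with a two-part key and one with a three-part key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE module_registration (
            module_code   TEXT NOT NULL,
            student_no    TEXT NOT NULL,
            module_result TEXT,
            PRIMARY KEY (module_code, student_no)
        );
        CREATE TABLE module_registration_dated (
            module_code       TEXT NOT NULL,
            student_no        TEXT NOT NULL,
            registration_year INTEGER NOT NULL,
            module_result     TEXT,
            PRIMARY KEY (module_code, student_no, registration_year)
        );
        """
    )
    conn.executemany("INSERT INTO module_registration VALUES (?, ?, ?)", REGISTRATIONS)
    conn.commit()
    return conn


def can_insert(conn, sql, params):
    """Run an INSERT; return False if the primary key rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With the two-part key, a second OB401E record for student 20099234 is refused. With the three-part key the same student can hold one record per year for the module, so a failed attempt and a later retake can both be kept.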
2.22. Towards a more complete entity relationship attribute model
As analysis proceeds, the model is gradually refined and improved. Still incomplete, it might
look like this:
[Diagram: entity relationship attribute model; the entity types and attributes recoverable from the figure are listed here]
Programme: Programme name; LMD level; Programme leader surname; Programme leader forenames
Qualification
Student: Student no (PK); Student surname; Student forenames; Programme code (FK1); Student gender; Student birthdate
Award: Qualification (PK, FK1); Student no (PK, FK2); Award result
Module Registration: Module grade; Module mark
Module Operation: Module leader
Module: Module code (PK); Module name
Note that this model has introduced a number of changes:
♦ There is greater precision in the attribute names chosen
♦ We wish to record a student’s qualifications, so we have
introduced Qualification
♦ Because a many-to-many relationship exists between student
and qualification, an intermediate (link) entity has been
introduced; we observe that in the real world a specific award is
given to each student who qualifies in something, so we’ve
called the link entity Award
♦ We observe that many modules are offered and “run” (that is,
they occur and are taught) for several years in succession (and
sometimes in more than one semester in a year), and further
that in some cases students take a module one year, fail it, and
do it again in a subsequent year; therefore we introduce a
Module Operation, the run of a module in a given year and
semester
♦ The model remains incomplete but it’s now good enough to be
worth prototyping (building and testing) in Access – so that we
can check that it meets our needs for storing data and (above
all) retrieving information in a very flexible way
I created it using the query design wizard (assistant) in Access. In the simple query wizard, I
specified fields from the student table and from the programme table which I wished to appear
in the result. Here, I wanted a list of students with details of the programme they are following.
The screen shot shows the resulting query: it indicates that there are two tables which are joined
together in the preparation of the result, and it also indicates which fields take part in the result.
How has Access created this result? Probably something like this: it reads each record in the
student table. One of the attributes of student is the programme code. Programme code is the
foreign key in the student table; it is also the primary key in the programme table. Access looks
up the details from the programme corresponding to the programme code for each student
record. In effect, it joins together the two tables on the basis of the linking foreign key.
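The lookup just described can be imitated in plain Python. This is a sketch of the idea only, with made-up records and field names; it is not a description of how Access is actually implemented internally. Each student record is read in turn, and its programme code is used to look up the matching programme record.

```python
def join_students_to_programmes(students, programmes):
    """Pair each student with the programme named by its foreign key."""
    # Index the programmes by their primary key so each lookup is a single step.
    programme_by_code = {p["programme_code"]: p for p in programmes}

    result = []
    for student in students:
        programme = programme_by_code.get(student["programme_code"])
        if programme is None:
            continue  # no matching programme: the student is left out of the result
        result.append(
            {
                "student_number": student["student_number"],
                "last_name": student["last_name"],
                "programme_name": programme["programme_name"],
            }
        )
    return result


# Hypothetical sample records, for illustration only
PROGRAMMES = [
    {"programme_code": "EMBA", "programme_name": "Executive MBA"},
    {"programme_code": "MSC1", "programme_name": "Example Masters"},
]
STUDENTS = [
    {"student_number": "S001", "last_name": "Leuchars", "programme_code": "EMBA"},
    {"student_number": "S002", "last_name": "Sample", "programme_code": "MSC1"},
    {"student_number": "S003", "last_name": "Orphan", "programme_code": "NONE"},
]

if __name__ == "__main__":
    for row in join_students_to_programmes(STUDENTS, PROGRAMMES):
        print(row)
```

The third student has a programme code with no matching programme and so does not appear in the result. Enforcing referential integrity in the database prevents such records from being stored in the first place.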
2.28. Another example database: NorthWind / Les Comptoirs
♦ Provided with Microsoft Access – usually to be found under
the Help menu, Sample databases. It includes tables such as:
∗ Product
∗ Order
∗ Customer
∗ Supplier
∗ Purchase order
The screen-shot shows a slightly-improved version of NorthWind.
telephone numbers, for example, must be stored as text, as indeed should code numbers like
student numbers.
Note the usefulness of a simple data dictionary here. If you decide that a customer number is to
be five letters (as it is in NorthWind), then it needs to be five letters everywhere it is used. You
record that decision in the data dictionary.
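Two of these points can be shown in a few lines of Python. This is only a sketch: the telephone number is invented, and the function names are my own. The first function shows what is lost when a telephone number is treated as a number; the second applies the five-letter rule for customer numbers as a data dictionary might record it.

```python
def stored_as_number(raw):
    """Convert to an integer and back, as a numeric field effectively would."""
    return str(int(raw))


def stored_as_text(raw):
    """Keep the value exactly as typed, apart from surrounding spaces."""
    return raw.strip()


def is_valid_customer_number(value):
    """Data dictionary rule assumed here: a customer number is exactly five letters."""
    return isinstance(value, str) and len(value) == 5 and value.isalpha()


if __name__ == "__main__":
    phone = "0299000000"  # invented number with a leading zero
    print(stored_as_number(phone))  # the leading zero has gone
    print(stored_as_text(phone))    # the number survives intact
    print(is_valid_customer_number("ALFKI"), is_valid_customer_number("12345"))
```

A rule written once in the data dictionary can then be checked in every table and form where the customer number appears.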
It is absolutely critical to identify the primary key for each entity type, and to ensure that
there is a foreign key at the many end of any one to many relationship which is discovered as
you think about how the entity types are related.
2
UML is the Unified Modelling Language, a set of notations largely used by information systems professionals
and particularly associated with a style of programming called Object Oriented or OO. The only UML notation
we employ in this module is the Use Case diagram, UCD.
3
However, it is a serious error to use a spreadsheet when a database is necessary. Please see appendices 3 and 4
for a discussion of reasons why a database is often superior to a spreadsheet.
A methodology is a set of methods for tackling a particular class of problem; the
methods should be linked by a coherent philosophy and be consistent with one
another. Formal (mathematical) and semi-formal (strictly defined) methodologies have
been defined for the analysis, design and construction of information systems.
However, they are often too rigid, too prescriptive or quite simply too long-winded to
be useful for people who are still learning the basics of the craft and who are tackling
relatively small problems. The approach adopted in the rest of this document is
methodical, but does NOT follow any one specific methodology; instead, it follows a
simplified methodology of my own. 4 So, if you think a full-blown database is
appropriate, you need to consider the steps outlined in the main parts of this section,
many of them associated with a particular method.5
3.0.3. Assumptions
The approach described in this document is applicable only to relatively small
applications: such as proof-of-concept prototype systems or perhaps end user
computing systems. So:
∗ The requirement is relatively small scale
E.g. the specific needs of the department in which you work; or
(part or all) of a small business
∗ A prototype (and perhaps target system) can be
implemented using Microsoft Access or a similar end-
user orientated database
Even if it’s too large for Access, people often create an initial (or
“prototype”) system in Microsoft Access. This is then used to
establish the complete requirements for an eventual full, or “target”
system. Or the target system may be sufficiently small to be
realisable using Microsoft Access.
∗ You are acting as the Analyst or System Designer
This document exists to help people design an effective database
application. In business, it is normal to distinguish between those
who use a system, the so-called users, and those who analyse,
design and implement a system – the developers.
This document treats you throughout as though you were acting in
the developer role.
What if you are the user as well as the developer?
Then you are in the situation sometimes described as end user
development, where a business person or student develops a system
for their own use and perhaps also for the use of other members of
their team or department.
Wherever possible, get someone else – e.g. a member of your team
– to act in the role of a true system user. Their perspective may be
different but also complementary.
Then you need to know how to analyse and build what you
need
∗ You’re an entrepreneur and you want to build a
business
3.0.7. Why have we chosen the techniques we have?
♦ Entity relationship attribute model
The ER model maps directly to tables and fields in commonly-used database
management systems. It is much less complex than classes, the parallel in the
more recent (but very technical) object oriented (OO) approach.
♦ Use case model
This technique is intended specifically for use with business users, and it is
reasonably visual. It is therefore a very good basis for a dialogue between
you as system users and IS professionals.
♦ Dataflow diagrams
This technique is intended specifically for use with business users, and it is
reasonably visual. It also breaks large problems down into smaller, more-
manageable ones. It is therefore a very good basis for a dialogue between
you as system users and IS professionals.
3.0.9. SSADM
SSADM itself is less widely used than once it was but remains important, not least
because it is relatively easy for business people to understand when compared with
more modern techniques.
For a good worked example of all SSADM techniques, please see
http://www.systemsanalysis.org.uk/ accessed 24/11/2008.
Wikipedia (accessed 26/02/2008) has a useful summary of SSADM (Structured
Systems Analysis and Design Method):
http://en.wikipedia.org/wiki/Structured%20Systems%20Analysis%20and%20Design%20Method
The following material was found at http://www.edrawsoft.com/SSADM.php accessed
03/01/2009.
♦ Introduction - Structured Systems Analysis and Design
Methodology (SSADM)
SSADM (Structured Systems Analysis and Design Method) is another method dealing with
information systems design. It was developed in the UK by the CCTA (Central Computer and
Telecommunications Agency) in the early 1980s. It is the UK government's standard method
for carrying out the systems analysis and design stages of an information technology project.
SSADM has traditionally been used for the development of medium or large systems. However,
one variant of SSADM, 'Micro SSADM', is intended for small systems. SSADM starts by
defining the information system strategy and then develops a feasibility study module. These
are followed by requirements analysis, requirements specification, logical system specification
and a final physical system design.
♦ Structured Systems Analysis and Design Methodology
(SSADM) Stages
SSADM consists of 5 main stages (which are broken down into several sub-stages). The 5 main
stages are:
♦ Feasibility Study
The Feasibility Study involves a high level analysis of a business area to determine
whether it’s feasible to develop a particular system. Data Flow Modelling and (high-
level) Logical Data Modelling can be used as techniques during this stage.
♦ Requirements Analysis
In the Requirements Analysis stage, requirements are identified, the current
business environment is modelled, and business system options are produced and
presented. One of these options will be chosen and then refined. Data Flow Modelling
and Logical Data Modelling can be used as techniques during this stage.
♦ Requirements Specification
In the Requirements Specification, the functional and non-functional requirements are
specified as a result of the previous stage. Data Flow Modelling, Logical Data
Modelling and Entity Event Modelling can be used as techniques during this stage.
♦ Logical System Specification
In the Logical System Specification the development and implementation
environment are specified, and the logical design of update and enquiry processing
and system dialogues are carried out.
♦ Physical Design
During the Physical Design, the logical system specification and technical
specification are used to create a physical design and a set of program specifications.
♦ Applicability of SSADM
Unlike rapid application development, which conducts steps in parallel, SSADM builds each
step on the work that was prescribed in the previous step with no deviation from the model.
Because of the rigid structure of the methodology, SSADM is praised for its control over
projects and its ability to develop better quality systems. Most current developers find it too
onerous in its application, however.
3.0.10. MERISE
This is a French equivalent to SSADM.
See, for example, http://www.commentcamarche.net/merise/concintro.php3 accessed 24/11/2008.
3.4.3. Entities6
Entities are the principal data objects about which information is to be collected.
Entities are usually recognizable concepts, either concrete or abstract, such as people,
places, things, or events which have relevance to the database. Some specific examples
of entities are Employees and Projects. An entity is analogous to a table in the
relational model.
Entities can be classified as independent or dependent (in some methodologies, the terms
used are strong and weak, respectively). An independent entity is one that does not rely on
another for identification. A dependent entity is one that relies on another for identification.
6
This material was in part found at http://www.edrawsoft.com/datamodel.php checked 18/10/2009.
An entity occurrence (also called an instance) is an individual occurrence of an entity.
An occurrence is analogous to a row in the relational model.
♦ Special Entity Types
∗ Associative entities (also known as link or intersection
entities) are entities used to associate two or more
entities in order to reconcile a many-to-many
relationship.
∗ Subtype entities are used in generalisation hierarchies
to represent a subset of instances of their parent entity,
called the supertype, but which have attributes or
relationships that apply only to the subset.
An example is B2B Customer, a specialisation of Customer. The
Customer entity has the main attributes. A B2B Customer entity then
has additional attributes specific to B2B, for example, credit
arrangements or contact details. Customer and B2B Customer have a
one to one relationship.
Associative entities and generalisation hierarchies are discussed in more
detail below.
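The Customer and B2B Customer example can be sketched as two tables which share the same primary key. The code below uses Python's sqlite3 module as a stand-in for Access; the column names, the credit_terms attribute and the sample customers are all invented for the illustration.

```python
import sqlite3


def build_customer_db():
    """Create a supertype table and a subtype table sharing one primary key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only checks foreign keys when asked
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_number TEXT PRIMARY KEY,
            name            TEXT NOT NULL
        );
        -- The subtype's primary key is also a foreign key to the supertype,
        -- so each customer can have at most one B2B record (one to one).
        CREATE TABLE b2b_customer (
            customer_number TEXT PRIMARY KEY REFERENCES customer(customer_number),
            credit_terms    TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [("C001", "Alpha Retail"), ("C002", "Jane Private")],
    )
    conn.execute("INSERT INTO b2b_customer VALUES (?, ?)", ("C001", "30 days"))
    conn.commit()
    return conn


def customers_with_b2b_details(conn):
    """List every customer, with B2B attributes where they exist."""
    return conn.execute(
        """
        SELECT c.customer_number, c.name, b.credit_terms
        FROM customer AS c
        LEFT JOIN b2b_customer AS b ON b.customer_number = c.customer_number
        ORDER BY c.customer_number
        """
    ).fetchall()
```

Customers who are not B2B customers simply have no row in b2b_customer, so the B2B-only attributes never need to be stored as empty fields on the main table.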
♦ What are the main entities / tables?
We now go on to decide which tables are necessary and how they link
together. There should be a table for each class of real-world thing, or 'entity'.
3.4.4. Relationships
A relationship represents an association between two or more entities. Examples of
such relationships might be:
1. Employees are assigned to projects
2. Projects have subtasks
3. Departments manage one or more projects
Relationships are classified in terms of degree, connectivity, cardinality, and existence.
These concepts are discussed below.
♦ Relationships and linking
How are the entity types inter-related? There are three basic possibilities,
sometimes referred to as the cardinality of the relationship. Cardinality
specifies how many instances of an entity relate to one instance of another
entity.
Ordinality is also closely linked to cardinality. While cardinality specifies the
number of occurrences of a relationship, ordinality describes the relationship
as either mandatory or optional. In other words, cardinality specifies the
maximum number of related records and ordinality specifies the absolute
minimum number of related records. When the minimum number is zero, the
relationship is usually called optional and when the minimum number is one
or more, the relationship is usually called mandatory.
∗ 1:1 (one to one)
In a one-to-one relationship, each record in Table A can have only
one matching record in Table B, and each record in Table B can
have only one matching record in Table A. This type of relationship
is not common, because most information related in this way would
be in one table. For example, it may not be necessary to have a
separate credit reference entity; instead, its attributes could appear
on the customer entity.
You might use a one-to-one relationship to divide a table with many
fields, to isolate part of a table for security reasons, or to store
information that applies only to a subset of the main table. For
example, you might want to create a table to track employees
participating in a fundraising soccer game. The additional attributes
for employees who are also football players would be stored in a
football player table, linked one-to-one to employee. This is done
because the vast majority of employees will not be football players.
Similarly, you might have a general customer table, and then link it
to a B2B table (for B2B-specific elements) and a B2C one. See also
generalisation hierarchies below.
∗ 1:M (one to many)
A one-to-many relationship is the most common type of
relationship. In a one-to-many relationship, a record in Table A can
have many matching records in Table B, but a record in Table B has
only one matching record in Table A.
∗ M:N (many to many), and its resolution into two
one-to-many relationships (1:M and 1:N) to a new link entity
In a many-to-many relationship, a record in Table A can have many
matching records in Table B, and a record in Table B can have
many matching records in Table A. This type of relationship can
only be stored in a database by defining a third table (called a
junction table, or a link or intersection entity) whose primary key
consists of or includes two fields - the foreign keys from both
Tables A and B. A many-to-many relationship is really two one-to-
many relationships with a third table. For example, an Orders table
and a Products table have a many-to-many relationship that's
defined by creating two one-to-many relationships to an Order
Details table.
It is occasionally necessary to add another attribute to the key to
ensure uniqueness – often this is a date/time field.
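The Orders / Products / Order Details example can be sketched in code. This is only an illustration, using Python's built-in sqlite3 module as a stand-in for Access; the table names, column names and sample rows are assumptions made for the sketch, not part of any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to
conn.executescript("""
CREATE TABLE orders   (order_id   INTEGER PRIMARY KEY, order_date   TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
-- Junction table: its primary key is the pair of foreign keys
CREATE TABLE order_details (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
""")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "2009-10-18"), (2, "2009-10-19")])
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(10, "Tea"), (20, "Coffee")])
conn.executemany("INSERT INTO order_details VALUES (?, ?, ?)",
                 [(1, 10, 2), (1, 20, 1), (2, 10, 5)])

# One order holds many products ...
products_on_order_1 = [row[0] for row in conn.execute(
    "SELECT p.product_name FROM order_details AS d "
    "JOIN products AS p ON p.product_id = d.product_id "
    "WHERE d.order_id = 1 ORDER BY p.product_name")]

# ... and one product appears on many orders
orders_with_tea = [row[0] for row in conn.execute(
    "SELECT order_id FROM order_details WHERE product_id = 10 ORDER BY order_id")]

# The compound key stops the same product being listed twice on one order
try:
    conn.execute("INSERT INTO order_details VALUES (1, 10, 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(products_on_order_1, orders_with_tea, duplicate_rejected)
```

Each side of the many-to-many relationship is read through the junction table, which is exactly the "two one-to-many relationships" described above.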
AVOID “repeating fields”: fields whose name is in
the plural, or which imply a plural, and which therefore
almost certainly require a list of values. This is almost
invariably a sign that an extra table is needed.
As an example in a students database, do not make qualifications into fields of Student
– a student already has many qualifications and will gain more. The very fact that the
word qualifications is in the plural is an indication that the relationship between
student and qualification is in fact many-to-many. So a good database design is:
Note that, in accordance with the rule that the primary key of the one end of a one-
to-many relationship becomes an attribute of the many end – where it is known as a
foreign key – the entity type Award has attributes Qualification and Student no. Very
frequently, the combination of the foreign keys is the best primary key for the new
entity type. However, it is sometimes necessary to add a date or time attribute to make
the key unique – this is arguably necessary here because it is possible to envisage a
student achieving a qualification on more than one date. However, for simplicity, we
have ignored this rare possibility here.
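If the rare case is not ignored, the key of Award simply gains a date. The sketch below is an assumption-laden illustration in Python's sqlite3 (not Access): the column names, the date_awarded attribute and the sample data are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student       (student_no    TEXT NOT NULL PRIMARY KEY, name TEXT);
CREATE TABLE qualification (qualification TEXT NOT NULL PRIMARY KEY);
-- Award resolves the many-to-many relationship; the date is part of the key
CREATE TABLE award (
    student_no    TEXT NOT NULL REFERENCES student(student_no),
    qualification TEXT NOT NULL REFERENCES qualification(qualification),
    date_awarded  TEXT NOT NULL,
    PRIMARY KEY (student_no, qualification, date_awarded)
);
""")
conn.execute("INSERT INTO student VALUES ('S1', 'Ann')")
conn.executemany("INSERT INTO qualification VALUES (?)",
                 [("BA",), ("First Aid Certificate",)])
conn.executemany("INSERT INTO award VALUES (?, ?, ?)", [
    ("S1", "BA", "2005-07-01"),
    ("S1", "First Aid Certificate", "2006-03-10"),
    ("S1", "First Aid Certificate", "2009-03-10"),  # same qualification, later date
])

award_count = conn.execute("SELECT COUNT(*) FROM award").fetchone()[0]
qualifications_for_s1 = [row[0] for row in conn.execute(
    "SELECT DISTINCT qualification FROM award "
    "WHERE student_no = 'S1' ORDER BY qualification")]
print(award_count, qualifications_for_s1)
```

Without date_awarded in the key, the third row would be rejected as a duplicate of the second.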
two or more binary relationships. They are sufficiently rare to be ignored in
the remainder of this document.
♦ Direction
The direction of a relationship indicates the originating entity of a binary
relationship. The entity from which a relationship originates is the parent
entity; the entity where the relationship terminates is the child entity.
The direction of a relationship is determined by its connectivity. In a one-to-
one relationship the direction is from the independent entity to a dependent
entity. If both entities are independent, the direction is arbitrary. With one-to-
many relationships, the entity occurring once is the parent. The direction of
many-to-many relationships is arbitrary.
♦ Type of relationship
An identifying relationship is one in which one of the child entities is also a dependent
entity. A non-identifying relationship is one in which both entities are independent.
♦ Existence
Existence denotes whether the existence of an entity instance is dependent upon the
existence of another, related, entity instance. The existence of an entity in a
relationship is defined as either mandatory or optional. If an instance of an entity
must always occur for an entity to be included in a relationship, then it is mandatory.
An example of mandatory existence is the statement "every project must be
managed by a single department". If the instance of the entity is not required, it is
optional. An example of optional existence is the statement, "employees may be
assigned to work on projects".
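One common way to implement mandatory and optional existence is through whether the foreign key may be left empty. The following sketch assumes this approach and uses Python's sqlite3 rather than Access; the table and column names are hypothetical, and for brevity an employee is linked to at most one project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE project (
    project_id    INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    -- mandatory: every project must be managed by a single department
    department_id INTEGER NOT NULL REFERENCES department(department_id)
);
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    -- optional: an employee may (or may not) be assigned to a project
    project_id  INTEGER REFERENCES project(project_id)
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO project VALUES (1, 'New website', 1)")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 1)")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', NULL)")  # accepted: the link is optional

# A project with no department breaks the mandatory rule
try:
    conn.execute("INSERT INTO project VALUES (2, 'Orphan project', NULL)")
    orphan_project_rejected = False
except sqlite3.IntegrityError:
    orphan_project_rejected = True

unassigned = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE project_id IS NULL")]
print(orphan_project_rejected, unassigned)
```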
♦ Generalisation Hierarchies
A generalisation hierarchy is a form of abstraction that specifies that two or more
entities that share common attributes can be generalized into a higher level entity
type called a supertype or generic entity. The lower-level entities become the
subtypes, or categories, of the supertype. Subtypes are dependent entities.
treatment-name attribute in Treatment. The
combination of patient-code and treatment-name is
not, however, sufficient in this case to act as the
primary key – it is also necessary to include the date in
order to create a unique key. As a rule of thumb if
there are two or more columns within a given table
which together are the logical way to identify that row
(and the way you would always join to the table), then
use those as a compound key, otherwise assign a
separate auto increment column as a primary key.
∗ Candidate keys
There may be more than one possible candidate for use as the
primary key of a table. For example, in an employee table, you
could use either the company generated employee number, or the
Social Security number. In this situation, we say that there are two
candidate primary keys.
∗ Choosing a primary key
One primary key must be selected for the table. A primary key can
sometimes be a compound key, that is it may consist of two or more
elements, which in combination uniquely identify the entity
occurrence.
There may be several candidates, but each entity has one and only
one primary key.
∗ Entity integrity rule
The entity integrity rule says that no field participating in the primary
key of an entity may be null. Null means that no value has been entered;
it is not the same as zero or as a string of spaces.
∗ Multi-part primary keys
Where a link or intersection entity is used to resolve a many to many
relationship into two one to many relationships, it is common for
each foreign key in the child entity to form a part of a compound
primary key. Sometimes it may be necessary to add an additional
part to ensure that the primary key is unique for each instance; most
commonly, it is necessary to add a Date.
∗ Foreign keys
In order to create a one to many relationship between two entity
types, the primary key of the parent entity (or, much more rarely,
another candidate key) is replicated in the child entity as the so-
called foreign key.
Foreign keys implement one to many (1:M) relationships in the
following way. If two entity types are related 1:M, then the primary
key attribute(s) (or, rarely, the alternate key attribute(s)) of the one
entity MUST appear as attribute(s) of the many entity. This is
because this is the only way in which the database software can
“join” the many records to the one. Consider a situation in which
students are on a programme. The entity types are Programme and
Student, related 1:M. If the primary key of Programme is
Programme_Code, then Student must also have a Programme_Code
attribute.
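The key rules in this list can be tried out in a few lines. The sketch below uses Python's sqlite3 as a stand-in for Access; apart from Programme_Code, the column names and sample values are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_number        TEXT NOT NULL PRIMARY KEY,  -- the chosen primary key
    social_security_number TEXT NOT NULL UNIQUE,       -- the other candidate key
    last_name              TEXT
);
CREATE TABLE programme (
    programme_code TEXT NOT NULL PRIMARY KEY,
    title          TEXT
);
CREATE TABLE student (
    student_no     TEXT NOT NULL PRIMARY KEY,
    name           TEXT,
    -- foreign key: the primary key of the "one" end, repeated at the "many" end
    programme_code TEXT NOT NULL REFERENCES programme(programme_code)
);
""")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn.execute("INSERT INTO employee VALUES ('EMP1234', '1234567890123', 'Green')")
# Entity integrity: no part of a primary key may be null
null_key_rejected = rejected("INSERT INTO employee VALUES (NULL, '999', 'Nobody')")
# A candidate key must stay unique even when it is not the primary key
duplicate_candidate_rejected = rejected(
    "INSERT INTO employee VALUES ('EMP9999', '1234567890123', 'Clone')")

conn.executemany("INSERT INTO programme VALUES (?, ?)", [
    ("MBA", "Master of Business Administration"), ("BA", "BA Business Studies")])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    ("S1", "Ann", "MBA"), ("S2", "Bob", "MBA"), ("S3", "Cy", "BA")])

# The foreign key is what lets the software join the many records to the one
rows = conn.execute(
    "SELECT s.name, p.title FROM student AS s "
    "JOIN programme AS p ON p.programme_code = s.programme_code "
    "ORDER BY s.student_no").fetchall()
print(null_key_rejected, duplicate_candidate_rejected, rows)
```

Note that SQLite needs NOT NULL to be stated explicitly on a text primary key; this is a quirk of SQLite and not something the entity integrity rule itself leaves open.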
♦ Ensuring referential integrity
The terms referential integrity, linking, Primary and Foreign keys and
relationship can be described in this way: Two tables can be linked by a
relationship. This link can be one-to-one (e.g. husband to wife), or one-to-
many (e.g. one brand of car gives rise to many models of car - but each
model has one and only one brand). Coordination is accomplished with
relationships between tables. A relationship works by matching data in key
fields - usually a field with the same name in both tables. In most cases, these
matching fields are the primary key from one table, which provides a unique
identifier for each record, and a foreign key in the other table. For example,
employees can be associated with orders they're responsible for by creating a
relationship between the Employee table and the Order table using the
EmployeeID fields. You can ask Microsoft Access to enforce referential
integrity: if a table such as patient is related to another table animal type, and
referential integrity is enforced, then Access will only allow a new patient to
be introduced if the animal type already exists in the animal type table.
When you create the properties of a new relationship, you can specify the
behaviour to be followed:
∗ Insert
∗ Update
If a primary key is changed in an owning table, should the system
automatically change the related foreign keys? The answer will
usually be yes, and the option should be set.
To be certain, it is necessary to model the ordinality of a relationship, as
mentioned in section 3.4.4 and again in section 3.4.11.
∗ Delete
If a parent record is deleted, should the system automatically delete
all the associated child records? Setting this option should only be
done after careful thought!
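The insert, update and delete behaviours can be sketched as follows, again using Python's sqlite3 as a stand-in for Access. The patient and animal type tables come from the example above; the column names and data are assumptions. Here updates cascade, while deletes are blocked rather than cascaded, which is the cautious choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # referential integrity is off by default in SQLite
conn.executescript("""
CREATE TABLE animal_type (animal_type TEXT NOT NULL PRIMARY KEY);
CREATE TABLE patient (
    patient_code TEXT NOT NULL PRIMARY KEY,
    name         TEXT,
    animal_type  TEXT NOT NULL
        REFERENCES animal_type(animal_type)
        ON UPDATE CASCADE     -- a changed parent key is copied to the children
        ON DELETE RESTRICT    -- a parent with children cannot be deleted
);
""")
conn.execute("INSERT INTO animal_type VALUES ('Dog')")
conn.execute("INSERT INTO patient VALUES ('P1', 'Rex', 'Dog')")

def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Insert: a patient whose animal type does not exist is refused
unknown_type_rejected = rejected("INSERT INTO patient VALUES ('P2', 'Smaug', 'Dragon')")

# Update: renaming the parent key is propagated to the related foreign keys
conn.execute("UPDATE animal_type SET animal_type = 'Canine' WHERE animal_type = 'Dog'")
patient_type_after_rename = conn.execute(
    "SELECT animal_type FROM patient WHERE patient_code = 'P1'").fetchone()[0]

# Delete: the parent cannot be removed while a child record still refers to it
delete_blocked = rejected("DELETE FROM animal_type WHERE animal_type = 'Canine'")
print(unknown_type_rejected, patient_type_after_rename, delete_blocked)
```

Replacing ON DELETE RESTRICT with ON DELETE CASCADE would delete Rex along with the animal type, which is why that option deserves careful thought.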
3.4.10. Normalisation
We should now go back and check each attribute list is:
∗ Complete
∗ Has the right attributes on the right entities
We may choose to use the formal relational data analysis technique called normalisation. This
technique is described in appendix 6. It is a useful cross-check, and is not essential.
3.4.11. ER Notation
There is no standard for representing data objects in ER diagrams. Each modelling
methodology uses its own notation.
The original notation used by Chen is widely used in academic texts and journals but rarely
seen in either CASE (Computer Aided Software Engineering) tools or publications by non-
academics. Today, there are a number of notations used, among the more common being
Bachman, crow's foot, IDEF1X and SSADM.
Source: http://en.wikipedia.org/wiki/File:ERD_Representation.svg accessed
18/10/2009.
All notational styles represent entities as rectangular boxes and relationships as lines
connecting boxes. Each style uses a special set of symbols to represent the cardinality
of a connection.
♦ Showing relationships diagrammatically using the crow’s foot
notation
The symbols used in this document for the basic ER constructs are taken
from the American Information Engineering tradition and are also called the
crow’s foot notation (in French, patte d’oie).
∗ Entities are represented by labelled rectangles. The
label is the name of the entity. Entity names should be
singular nouns.
∗ Relationships are represented by a solid line connecting
two entities. The name of the relationship is written
above the line. Relationship names should be verbs.
∗ Attributes, when included, are listed inside the entity
rectangle. Attributes which are identifiers are underlined.
Attribute names should be singular nouns.
∗ Cardinality of many is represented by a line ending in a
crow's foot. If the crow's foot is omitted, the cardinality is
one.
∗ Existence is represented by placing a circle or a
perpendicular bar on the line. Mandatory existence is
shown by the bar (which looks like a 1) next to the entity
of which an instance is required. Optional existence is
shown by placing a circle next to the entity that is
optional.
There are many different ways of drawing entity-relationship diagrams. In
most of this document, we show one-to-many relationships using the crow’s
foot notation without particular concern for the ordinality.
Where it is desirable or necessary to consider ordinality (whether or not a relationship
is mandatory) we can use an extended set of symbols:
3.4.12. Online tutorial
For an additional online tutorial about entity relationship modelling, see
http://www.cems.uwe.ac.uk/~tdrewry/lds.htm checked 24/11/2008. Note that this
tutorial sticks rigidly to the SSADM modelling conventions and names and makes
reference to Logical Data Structures, LDS. As it makes clear, “Logical data structures
are data models, and are sometimes called entity-relationship (ER) models or even
entity-attribute-relationship models.” In other words, LDS is a synonym for Entity
Relationship Model.
one or more entity types
♦ ERM is a data model
∗ Used to analyse data requirements, and to design
database tables, attributes and relationships
indicate the basic interactions between the system and its users - some of
these will be data input / update actions, others will involve information
output. They can also be used to derive a list of system inputs and outputs –
e.g. forms, reports and/or webpages. The technique is described in Appendix
1.
such as pseudocode. The language supported within Access is Visual Basic for Applications.
The use of pseudocode and of programming languages is beyond the scope of this document.
♦ The user, inspired by seeing the implemented system, decides
that they would like the system to do more
Great! A business opportunity! You enter the required additional
functionality on a document which you may grandly term the Enhancement
Register, work out the implications in terms of additional design and
implementation effort, and tell the user what the enhancements will cost - in
terms of later delivery and / or an increased bill. You should never allow
yourself to get dragged into a cycle of continuously responding to such
changes as you go along, without explicit renegotiation of the terms of
reference agreed at the outset of the project.
4.2. An Exercise
Assume that you are the people who originally designed the University of Anytown database used as an
example later in this booklet. Now that you know about the various stages required to analyse user needs
and design a database solution, carry out those steps for yourself for a business school. Go through the
various stages and carefully document what you do at each stage. Or, if you are responsible for database
design in an assignment I have set, do the same thing for that database.
This is a significant piece of work - it will probably take you at least a few hours of effort, and may well
take you a week of on-and-off effort.
When you have finished the University of Anytown database design, compare the results of your work
with those of the original analyst / designers. You should find that you have reached similar or better
conclusions.
Be aware that this is a fairly difficult example to tackle - for example, how will your
database design cope with an article by many authors (NOT just two or three, maybe
five?).
5.4. What is a database management system?
♦ Software which manages a database
♦ Implements entities as tables, maintaining and enforcing
relationships
♦ Deals with all the component disc files
♦ Provides functions such as
∗ Table creation and structural updating
∗ Insert, update and delete operations, on individual
records and on complete sets of records
∗ Queries, reports and forms
SECTION 2 – USING MICROSOFT ACCESS TO BUILD GOOD
DATABASES
♦ Integrate with the CASE (Computer Aided Software Engineering) tool which
created and maintains the data dictionary
♦ Implement resilience and recovery mechanisms
These things include roll-forward and / or roll-back mechanisms so that complete transactions
(only) are carried out. Such mechanisms are essential to prevent situations where, for example,
money leaves one company’s bank account, but never reaches another company’s.
♦ Enforce security
Only privileged users should be able to see things like payroll data.
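The roll-back idea can be illustrated with the bank transfer mentioned above. This is a minimal sketch in Python's sqlite3, not a description of how Access or any bank implements it; the account table, the balances and the rule that a balance may not go negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT NOT NULL PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("Company A", 100), ("Company B", 50)])
conn.commit()

def transfer(amount, source, target):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, target))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(30, "Company A", "Company B")       # both updates are applied
failed = transfer(500, "Company A", "Company B")  # debit fails, so the credit is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to Company B has already been made when the debit is refused; the roll-back removes it, so no money appears from nowhere or vanishes.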
6.4.2. MS Access is easily obtained
It forms a part of the Microsoft Office Professional and Premium office suites
(although it is included neither in the Small Business Edition nor the Student edition).
MS Access is available in French on ESC Rennes student workstations for students
who do not have a copy on their own personal machine. Alternatively a free copy can
be obtained by means of the MSDN Academic Alliance membership of the School.
15 decimal places.
Date/Time (Date/Heure)
Dates and times.
Size: 8 bytes.

Currency (Monétaire)
Currency values. Use the Currency data type to prevent rounding off during calculations.
Accurate to 15 digits to the left of the decimal point and 4 digits to the right.
Size: 8 bytes.

AutoNumber (Numérotation automatique)
Unique sequential (incrementing by 1) or random numbers automatically inserted when a
record is added.
NB: if you use an automatically numbered field as part of the primary key of a table, and
you also have to use it as the foreign key in a linked table, the data type required in the
many end is long integer, which is how in fact an AutoNumber field is stored.
Size: 4 bytes.

Yes/No (Oui/Non)
Fields that will contain only one of two values, such as Yes/No, True/False, On/Off.
Size: 1 bit.

OLE Object (Liaison OLE)
Objects (such as Microsoft Word documents, Microsoft Excel spreadsheets, pictures, sounds,
or other binary data), created in other programs using the OLE protocol, that can be linked
to or embedded in a Microsoft Access table. You must use a bound object frame in a form or
report to display the OLE object.
Size: up to one gigabyte (subject to disc space!).

Hyperlink (Hyperlien)
Field that will store hyperlinks. A hyperlink can be a UNC (Universal Naming Convention)
path to a file, or a URL.
Size: up to 64,000 characters.

Assistant for choosing from a list (Assistant Liste de choix)
Creates a field which permits you to choose, from a scrolling list, a value which comes
either from another table or from a specified list of permitted values. If you choose this
option, a wizard appears to help you to define the field.
Size: the same size as the primary key of the corresponding table. In the (common) case
where this is an AutoNumber field, it will be 4 bytes in length.
French format. You can set a specific format for the input mask, and select another
format so that the same data is displayed differently.
For full details of masks and how to use them, please refer to Microsoft Access
documentation available online:
http://office.microsoft.com/en-us/access/HA100964521033.aspx#2
7.6. Keys
7.7. Relationships
Defining relationships in Access involves you in adding the tables you want to relate to the
Relationships window, and then dragging the primary key field from one table and dropping it
on the foreign key field in the other table.
The kind of relationship that Microsoft Access creates depends on how the related fields are
defined:
♦ One-to-many relationship
A one-to-many relationship is created if only one of the related fields is a
primary key or has a unique index. This is usually the case.
♦ One-to-one relationship
A one-to-one relationship is created if both of the related fields are primary
keys and / or have unique indexes.
Sometimes Access recognises this automatically, as here, when a B2B
customer table is being created to hold fields specific to B2B customers:
Sometimes Access will not automatically recognise a one-to-one relationship
and you may need to force one; you do this by setting the Indexed property
of the foreign key field to Yes (No Duplicates).
So, if we have this B2B table which we want to link back to Customer:
♦ Many-to-many relationship
A many-to-many relationship is really two one-to-many relationships with a
third table whose primary key consists of⁷ two fields - the foreign keys from
the two other tables. This has already been discussed in section 2.21.
7
Or, includes them, along with another attribute which ensures uniqueness, usually a date.
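The effect of a "no duplicates" index on the foreign key, as described for the one-to-one case above, can be sketched outside Access. The example below uses Python's sqlite3; the Customer and B2B customer tables follow the text, while the column names, index name and data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE b2b_customer (
    b2b_id       INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    credit_limit INTEGER
);
-- The unique index on the foreign key turns one-to-many into one-to-one
CREATE UNIQUE INDEX idx_b2b_customer ON b2b_customer(customer_id);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Acme Ltd"), (2, "Jane Doe")])
conn.execute("INSERT INTO b2b_customer VALUES (1, 1, 5000)")

# A second B2B record for the same customer is refused by the unique index
try:
    conn.execute("INSERT INTO b2b_customer VALUES (2, 1, 9000)")
    second_b2b_rejected = False
except sqlite3.IntegrityError:
    second_b2b_rejected = True

# Customers with no B2B record simply have no matching row
rows = conn.execute(
    "SELECT c.name, b.credit_limit FROM customer AS c "
    "LEFT JOIN b2b_customer AS b ON b.customer_id = c.customer_id "
    "ORDER BY c.customer_id").fetchall()
print(second_b2b_rejected, rows)
```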
7.8.1. Queries
A query is a temporary results table resulting from joining together fields taken from
one or more database tables. A query can also include calculated fields.
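As an illustration, here is a query that joins two tables and adds a calculated field. It is a sketch written for Python's sqlite3 rather than for the Access query designer; the tables, the prices (held in cents to avoid rounding problems) and the field names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY, product_name TEXT, unit_price_cents INTEGER);
CREATE TABLE order_details (order_id INTEGER, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES (10, 'Tea', 300), (20, 'Coffee', 450);
INSERT INTO order_details VALUES (1, 10, 2), (1, 20, 1), (2, 10, 5);
""")

query = """
SELECT d.order_id,
       p.product_name,
       d.quantity,
       d.quantity * p.unit_price_cents AS line_total_cents  -- calculated field
FROM order_details AS d
JOIN products AS p ON p.product_id = d.product_id
ORDER BY d.order_id, p.product_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The result exists only while the query runs; nothing called line_total_cents is stored in either table.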
7.8.2. Reports
Reports are comprehensive summaries of a situation, and normally involve data from
several tables. As such, a report is usually based on a query rather than on a single
table. A report is frequently intended to be printed, rather than viewed on-screen.
7.8.3. Forms
Forms are used to get data into a system, and may also be used to get information out;
see the next section.
7.9.3. Using referential integrity to carry out inter-table validation checks
Where two tables are linked in a one to many relationship, it is usually good practice
to enforce referential integrity. See section 7.7.2. This makes it impossible to introduce
a child record for a non-existent parent; this is often of considerable value in
improving the design of a database.
A variant of this technique involves the specific identification of so-called lookup
tables. A lookup table contains the valid values of an attribute. By making the lookup
table a parent entity to the table whose values are to be verified, it becomes impossible
to enter “bad” data, that is, data not authorised by the lookup table. In this example,
the grade attribute of a student’s result in a module has been made into a lookup based
on the valid values stored in the parent Grade table. It is therefore impossible to record
an invalid grade.
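A lookup table of this kind can be sketched as follows, using Python's sqlite3 in place of Access. The Grade table follows the text; the permitted values (Pass and Fail), the module_result table and its columns are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Lookup table: holds the valid values of the grade attribute
CREATE TABLE grade (grade TEXT NOT NULL PRIMARY KEY);
INSERT INTO grade VALUES ('Pass'), ('Fail');
CREATE TABLE module_result (
    student_no  TEXT NOT NULL,
    module_code TEXT NOT NULL,
    grade       TEXT NOT NULL REFERENCES grade(grade),
    PRIMARY KEY (student_no, module_code)
);
""")

def record_result(student_no, module_code, grade):
    """Store a result; return False if the grade is not in the lookup table."""
    try:
        conn.execute("INSERT INTO module_result VALUES (?, ?, ?)",
                     (student_no, module_code, grade))
        return True
    except sqlite3.IntegrityError:
        return False

valid_accepted = record_result("S1", "M001", "Pass")
invalid_rejected = not record_result("S1", "M567", "Excellent")
print(valid_accepted, invalid_rejected)
```

Adding a new permitted grade then means adding one row to the lookup table, with no change to the design.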
7.11.2. Macros
Macros are stored sequences of user commands.
8
However, please note that this Contact Management system will NOT meet the requirement set out in section 4.3.1!
SECTION 3 – THE ANYTOWN DISTANCE LEARNING BUSINESS
SCHOOL EXAMPLE
9. Example scenario: Anytown Distance Learning Business
School
The Anytown Distance Learning Business School offers general business courses at undergraduate and
postgraduate levels. The undergraduate course is a Bachelor of Arts (BA) course called Business
Studies. The postgraduate course is a Master of Business Administration (MBA). Each course is
administered by a Course Coordinator.
Students apply for a course, BA or MBA.⁹ They send in an application form containing their personal
details, and their desired course. On behalf of the School, the appropriate Course Coordinator checks
whether the course is available and that the student has already obtained the necessary academic
qualifications. If the course is available (not yet full) and the student is qualified, he or she is enrolled in
the course, and the School confirms the enrolment by sending a confirmation letter to the student. If the
course is unavailable or the student is not sufficiently qualified, the student is sent a rejection letter.
9
Note that course in this case study is neither programme nor module – but, as we will see, it is closer to
programme than module.
11. A Closer Look into "Managing Students"
This section focuses on the part of the system that supports the administration of information about
students on courses and modules.
• Students study modules drawn from two lists of modules held for the School, one of
undergraduate modules, the other of postgraduate ones.
• Modules have titles and a unique identifying code. Each module has a pre-defined value
expressed as a number of credits.
• Modules are of two kinds - some modules are core, some are electives, that is they are optional.
• Core modules must be taken by all students on the course. The course regulations will specify
how many optional modules a student can take and what these options might be.
• Students who pass a module are awarded the number of credits specified as that module’s value.
If they fail, they get zero credits.
• Students construct a programme of study by doing core modules, to which are added the optional
(elective) modules they select from those available.
• Every course defines a maximum period of enrolment within which time the course must be
completed. This is normally five years for an undergraduate course and three years for a
postgraduate course. If a student does not complete within this time, the decision of the next
exam board will be that they have failed the course.
• Students may suspend studies or withdraw from the course. The date on which this happens must
be recorded.
• Each component (coursework or exam) of a module has a certain percentage weighting and a
student’s overall mark for a module is calculated by combining the marks for each component.
• An exam board (jury) meets after each semester to consider the marks obtained by students and
to determine whether they have passed or failed the modules they were registered for, and what
their status on the course now is. This process is described in more detail in section 12.
1. Passed all the necessary credits, including all the core modules for the course and at least the
necessary number of options from the course’s collection of optional modules; in this case, the
decision is that they have succeeded in the course and they are awarded a BA or MBA.
2. Not yet passed all the necessary credits, but are making satisfactory progress: the decision is
that they may proceed, taking further credits as necessary.
3. Are not making satisfactory progress, that is, they are failing to complete too many modules or
have exceeded the maximum time they may stay on the course: the decision is that they have
failed in the course as a whole.
After the exam board, a revised version of the Student Record is printed and sent to the students.
Each module has only one teacher, and that teacher is the module leader. One teacher may however be
the module leader for a number of modules.
10
For a more thoughtful approach to how to manage qualifications, please see section 2.22
16.6. Review Course
This is described in section 13.
17. Documents
The Course Coordinators currently produce and maintain the following documents:
17.1.1. List of Modules
For each module, the following data has to be kept:
♦ Module Code
♦ Module Title
♦ Course Code – The Course on which the module is used – a
module is used either on the BA or the MBA
♦ The Lecturer who is the Module Leader
♦ Elective or Core?
♦ Examination weighting %
Module  Credits  Core or   Module Name                  Teacher         Course work  Course work  Exam        Exam  Overall  Result
Code             elective                                               proportion   mark         proportion  mark  mark
M001    15       C         Statistics                   BENNETT Gordon  40%          55           60%         33    42       Pass
M567    15       C         Electronic Business Systems  GREGORY Mark    50%          63           50%         25    44       Pass
M999    15       E         Ethics                       ELLUL Jacques   100%         33           0%                33       Fail
(etc.)

Year 1 Dissertation

Module  Credits  Core or   Module Name                  Teacher         Course work  Course work  Exam              Overall  Result
Code             elective                                               proportion   mark         proportion        mark
M234    60       C         Dissertation                 GREEN David     100%         56           0%                56       Pass
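The overall marks in the sample rows above are consistent with a simple weighted sum of the two components, rounded to a whole number. The sketch below assumes that this is the rule; the scenario does not state the formula or the rounding convention explicitly, and the function name is invented for the example.

```python
def overall_mark(coursework_pct, coursework_mark, exam_pct, exam_mark):
    """Combine component marks using their percentage weightings.

    Assumed rule: weighted sum, rounded to a whole number. Python's round()
    sends exact halves to the nearest even number; the scenario does not say
    how halves should be treated.
    """
    if coursework_pct + exam_pct != 100:
        raise ValueError("the weightings must add up to 100%")
    # A component weighted 0% may have no mark recorded at all
    weighted = coursework_pct * (coursework_mark or 0) + exam_pct * (exam_mark or 0)
    return round(weighted / 100)

# (module code, course work %, course work mark, exam %, exam mark) from the sample rows
sample_rows = [
    ("M001", 40, 55, 60, 33),
    ("M567", 50, 63, 50, 25),
    ("M999", 100, 33, 0, None),
    ("M234", 100, 56, 0, None),
]
marks = {code: overall_mark(cw_pct, cw, ex_pct, ex)
         for code, cw_pct, cw, ex_pct, ex in sample_rows}
print(marks)
```

For M001 this gives 0.40 x 55 + 0.60 x 33 = 41.8, which rounds to the 42 shown in the table.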
20. Anytown high-level Use Case diagram
Please note that the label <<include>> can also be written « include ». Note also that Microsoft Visio
employs <<uses>> or « uses » instead of << include >> - they mean the same thing.
21. Anytown: Context diagram
22. Level 1 DFD
23. Example Level 2 DFD
24. Data dictionary
We now need to move towards a good ERA model by means of top-down entity attribute modelling.
The approach I have adopted here is to work on the basis of the list of "obvious" entities which I identified in section 18, put them into a spreadsheet, and gradually add the
appropriate attributes. The spreadsheet is an extended example of what is sometimes called a Data Dictionary.
Description
External entities
Applicant
Student
Course Coordinator
Module Leader
Dean
Admit students to course                            P  2    Admit students to course                            2
Register students on core and elective modules      P  3    Register students on core and elective modules      3
Prepare for and hold exam board                     P  4    Prepare for and hold exam board                     4
Collect module results and produce student profile  S  4 1  Collect module results and produce student profile  4.1
Review module results                               S  4 2  Review module results                               4.2  The scaling factor (if any) is applied to the recorded student results before the Student Results Summary is reprinted
Review programmes and modules                       P  6    Review programmes and modules                       6
Produce management reports                          P  7    Produce management reports                          7
1 Applicants
2 Students
3 Module registrations
4 Module results
5 Module specifications
6 Student profiles
7 Course specifications
Data Flows

External entity     Process No  Direction  Name of flow                                Process name
Applicant           1           Inward     Application                                 Process applications
Applicant           1           Outward    Acceptance or rejection                     Process applications
Course Coordinator  1           Outward    Application                                 Process applications
Course Coordinator  1           Inward     Decision                                    Process applications
Course Coordinator  6           Inward     Course description                          Review programmes and modules
Student             3           Inward     Module choices                              Register students on core and elective modules
Student             4.2         Inward     Coursework and exams for assessment         Review module results
Student             5           Outward    Module results letters                      Teach and assess module
Student             4.4         Outward    Student results letters                     Print results letters
Module Leader       6           Inward     Proposed changes to module specification    Review programmes and modules
Module Leader       6           Outward    Module specification as revised and agreed  Review programmes and modules
Module Leader       6           Inward     Module results                              Review programmes and modules
Dean                7           Outward    Management reports                          Produce management reports (analysis of the results of each module; course description; list of modules)
Dean                6           Inward     Proposed changes to programme               Review programmes and modules
Application date             Date/time     20  Date  00/00/0000
Enrolment date               Date/time     20  Date  00/00/0000
Finishing date               Date/time     20  Date  00/00/0000
Status                       Text          12  Applicant / Enrolled / Passed / Failed / Withdrawn / Suspended / Progressing
Term address line 1          Text          20
Term address line 2          Text          20
Term city                    Text          50  >  Defaults to Anytown
Term postcode                Text          8
Contact details              Text          60
Previous qualifications      Memo

Employee
Employee number           Y  Text          7   > LLL0000  e.g. EMP1234
Employee forenames           Text          30
Employee last name           Text          20
Employee role                Text          20  Course coordinator / module leader / Dean
Social security number       Text          16
Employee address line 1      Text          20
Employee address line 2      Text          20
Employee city                Text          50
Employee postcode            Text          8
Employee country             Text          32
Employee contact details     Memo              E.g. telephone numbers, etc.

Programme Level           Y  Text          1   > L  P/U
Credits per module           Integer           10 if undergrad; 15 if postgrad
Project credits              Integer           60 if postgrad
Credits required             Integer           360 if undergrad; 180 if postgrad

Course
Course code               Y  Text          3   MBA / BA
Course name                  Text          40  BA Business Studies or Master of Business Administration
Course coordinator        Y  Long Integer      Employee who manages the course
Level                     Y  Text          1
Required qualifications      Memo
Max number of students       Integer
Normal number of years       Integer
Max number of years          Integer
Module value                 Integer
Modules per semester         Integer
Taught semesters per year    Integer
Description                  Memo

Module
Module code               Y  Text          4
Module title                 Text          1

Registration Result
Module code             C Y  Text          4   L000
Student number          C Y  Text          11  > LLL00000000
Date course work received    Date/time
Course work mark             Integer           %
Page 91 of 144
Applicant / Student | Is Registered On | Registration Result | 1:M | Registration Result resolves the many-to-many relationship between Module and Student
Module | Runs as | Module operation | 1:M |
System Outputs

Reports | Description
Analysis of the results of each module: Average mark, standard deviation, percentage of students who have not passed
Student Results Summary: See section 10 of scenario

Forms | Sub-Forms | Description

Queries | Description

System Inputs

Forms | Sub-Forms | Description
Applicant details
Student details
Record student module choices: Programme and module details
Update course structure: Module and Module Operation details
Update module
Update student Registration results
Update member of staff
Coursework receipt
25. Anytown ER diagram
26. Anytown system implementation
To build on the analysis and design work already undertaken, you would begin by translating the
ERA model (data model) into equivalent Access objects: entities become tables, attributes become
fields, and relationships are defined as relationships! Similarly, the Use Case diagram has already
been used to identify the inputs and outputs listed in the dictionary above. Implementation in
Access involves converting these into equivalent forms and subforms.
You might like to try this for yourself (because we have not uploaded an Anytown database).
Over to you to try…
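As a starting point for that exercise, the translation from entities to tables can be tried out before opening Access. What follows is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Access. The table and column names are adapted from the data dictionary above, and the sample values (module codes B101 and B102, student number ABC00000001, the module titles and the marks) are invented for illustration. It shows Registration Result, with its compound key, resolving the many-to-many relationship between Student and Module.

```python
import sqlite3

# Assumption: SQLite stands in for Access; names are adapted from the data dictionary.
SCHEMA = """
CREATE TABLE Student (
    StudentNumber TEXT PRIMARY KEY,   -- Text 11 in the data dictionary
    Status        TEXT
);
CREATE TABLE Module (
    ModuleCode  TEXT PRIMARY KEY,     -- Text 4 in the data dictionary
    ModuleTitle TEXT
);
CREATE TABLE RegistrationResult (
    ModuleCode     TEXT NOT NULL REFERENCES Module(ModuleCode),
    StudentNumber  TEXT NOT NULL REFERENCES Student(StudentNumber),
    CourseWorkMark INTEGER,           -- a percentage
    PRIMARY KEY (ModuleCode, StudentNumber)  -- compound key: one row per student per module
);
"""


def build_database():
    """Create an empty in-memory database containing the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def modules_for_student(conn, student_number):
    """Return (module code, course work mark) pairs for one student."""
    rows = conn.execute(
        "SELECT ModuleCode, CourseWorkMark FROM RegistrationResult "
        "WHERE StudentNumber = ? ORDER BY ModuleCode",
        (student_number,),
    )
    return rows.fetchall()


conn = build_database()
# Invented sample data, for illustration only.
conn.execute("INSERT INTO Student VALUES ('ABC00000001', 'Enrolled')")
conn.executemany("INSERT INTO Module VALUES (?, ?)",
                 [("B101", "Marketing"), ("B102", "Finance")])
conn.executemany("INSERT INTO RegistrationResult VALUES (?, ?, ?)",
                 [("B101", "ABC00000001", 62), ("B102", "ABC00000001", 55)])
```

In Access itself the same structure would be built in table Design view and the Relationships window rather than typed as SQL; the sketch is only meant to make the shape of the three tables concrete.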
0. References
This book is frankly difficult at first encounter, but it remains the classic reference on relational
databases.
1. Appendix 1 Business Process Analysis using Use Case Analysis
With thanks to Dr. Ken Lunn, former colleague at the University of Huddersfield, whose material
has formed the main basis for this section.
A Use Case is a definition of a meaningful interaction with a computer system. If you have used
the internet to buy things, an example of a Use Case would be choosing something from an online
catalogue, and another might be paying for the goods.
Use Case modelling is part of requirements definition and systems analysis. At the high level, a
set of Use Case diagrams define the presentation of the system, and these are excellent tools for
discussion with stakeholders of a system, such as users and sponsors. At a more detailed level,
Use Cases are used to fully specify the external functionality of a system.
Use Cases are part of the information required by developers to design and implement a system.
Use Case diagrams say "what" a system does. The detailed analysis of Use Cases begins to say
something of "how" the system behaves in an environment. However, it does not say "how" a
system is structured internally to provide that behaviour. In computer system development you
will frequently see this separation emphasised. Before you decide how a system works, you need
to determine what it does first - a simple and obvious rule, but one so often forgotten to many
people's ultimate regret. That’s why Use Case diagrams (UCDs) and Use Case models (UCDs
with supporting text documents) can be so useful.
We draw a Use Case as an ellipse with the name of the Use Case underneath:
[Figure: an ellipse with the label "A Use Case" underneath]
Sometimes the name is put inside:
[Figure: an ellipse with the label "A Use Case" inside]
The Use Case name is a concise, active description of the behaviour carried out by the
Use Case, such as "print invoice". Do not write mini-essays to describe the behaviour of
the Use Case - we shall use a more elaborate means for describing the behaviour in full.
[Figure: the Actor symbol, labelled "An Actor"]
This is rather an unusual choice of notation when it is an external computer system, but
you will get used to it. An Actor is really a role, not a person. One person may use the
system under many different roles. When finding actors, you are looking for the roles
that people adopt, not the people or even the job titles.
Relationships are drawn as lines, usually with an arrow:
[Figure: "An Actor" joined to "A Use Case" by a line with an arrow]
This means that an Actor uses the Use Case. In any relationship there will be two way
communications. The direction of the arrow indicates who initiates the interaction. Often
in an interactive system, it is the Actor that initiates the dialogue, but it can be the Use
Case. Sometimes the arrow is left out.
A Use Case can use another Use Case. If you have a piece of well-defined functionality,
it makes sense to re-use this wherever possible. Also, sometimes a Use Case gets too big
to manage sensibly and it makes sense to break this down into smaller Use Cases.
There are two ways Use Cases can relate. The first is where a Use Case "includes"
another Use Case. In this case the second Use Case is always invoked as part of the
execution of the first. This is drawn with an arrow pointing to the Use Case that is
included, with the label <<include>> tagged to the line:
[Figure: two Use Cases joined by an arrow labelled <<include>>]
Please note that the label <<include>> can also be written « include ». Note also that
Microsoft Visio employs <<uses>> or « uses » instead of << include >>.
Sometimes a Use Case is only called occasionally from another Use Case. From the
scenario analysis of the business, this will often be to support an alternative path or an
exception. We draw this with an arrow pointing the other way (yes it is confusing at
first) where the arrow points to the calling Use Case. So below, Chase Payment
sometimes calls Issue Warning Letter.
[Figure: Use Case diagram showing the Actor Credit Control Clerk and the Use Cases Receive
Payment, Process Payment, Telephone Reminder, Correct Invoice and Correct Delivery, joined by
one <<include>> and three <<extend>> relationships]
With a Use Case Diagram like the one above, you are getting a clear picture of who uses
a system, and what they can do with it. You also have forced some decisions, and
provided some external structure to the system.
The use of these techniques is not taught on this module, nor is it described in
this document. That is because the scope of activity for the kind of systems
described in the rest of the document is assumed to be relatively small scale, in
relatively clear-cut situations. So you should already know what the business
process is, and you are simply trying to improve it. Once you have a business
process fully defined, you go around all the activities asking the simple
question "is there a potential use of a computer system here?" Sometimes an
activity in a business process needs no system support, sometimes it needs one
Use Case, and sometimes it needs many Use Cases.
We are now beginning to see the rudiments of a methodology emerging. It starts in the
business arena, describing the business in some detail. Then it starts to think about
where computer systems are used. The first thing to worry about is what the system
does, and how it fits in to the business, not how it does it in detailed technical terms.
NB: this is NOT a DFD; this diagram shows the STRUCTURE of a DFD!
♦ A Process box, labelled with a number (e.g. 1) and a Process Description. The Number
and Description are the same as in the Elements List (data dictionary).
♦ A Data Store, labelled with a number (e.g. D1) and the Name of Data. The Number and
Description are the same as in the Elements List.
♦ A Source or Destination, labelled Source/Destination.
♦ Arrows show DATA FLOWS.
3.1. Introduction
We normally store data on computers when we have many occurrences of a specific kind
of record, and we want to process specific records, or the complete set of records. For
example, we may want to maintain a list of companies. For purposes of comparison, we
will normally choose to store the same items of data about each occurrence. For
example, we will store the name of each company, its principal sector of activity, and
the address of its global headquarters. A widely accepted way of storing data, indeed, we
may even refer to it as the “natural” way to store such data, is by means of two-
dimensional tables.
Many widely used office productivity programs provide good facilities for storing two-
dimensional tables. We can use a word processing program, such as Microsoft Word; a
spreadsheet, such as Microsoft Excel; or a database, such as Microsoft Access. However, each
program has specific strengths and weaknesses when it stores data in this way. Refer back to
section 2.6 for more on this.
3.2.4. Summary
When you are manipulating data for yourself alone, or as part of a small team,
or in a very small business, spreadsheets are likely to be more intuitive, initially
more productive and easier to get started with. However, as the volumes of
data, or the number of users, increase, databases become by far the preferable
option. It is often a sensible and viable option to prototype the requirements for
Since the days of Lotus 1-2-3, people have used spreadsheet programs for everything
from word processing to data management. Doing the former is silly. Doing the latter,
however, is viable, especially in the latest version of Microsoft Excel. But though you
may be more comfortable with Excel, a real relational database program like Microsoft
Access is a better choice for managing data—for a number of reasons.
♦ Databases are safer. Excel, for example, does everything
in memory, so that any unsaved data may be lost if your
system crashes. Databases write data to the hard drive
immediately.
♦ Databases can handle more data. Sure, Excel can
technically handle more than 65,000 rows of data, but
doing so will likely bog down even the fastest PC.
♦ Databases can easily link tables of related data together,
such as customers and orders or musical groups and
albums (as well as the songs on each album). This is
where the words relational and database come together.
Storing related data together in a single table or
spreadsheet can be unwieldy and invite errors.
We'll look at a situation for which Access is a better tool than Excel and show you how an
Access solution works. If you've never used Access before, that's okay; we'll walk you
through how to create everything from scratch. We used Access 2002 for the instructions,
but you'll find the process is similar in all versions of Access. We chose Access because
so many users have it already, but you can do the same things in other relational databases
such as FileMaker or Microsoft SQL Server. For more on picking the right database, see
"Databases for All Reasons" in our issue of January 2003 at
http://www.pcmag.com/article2/0,1759,760886,00.asp (checked 24/11/2008).
If you want, you can add a description for each field to explain its contents as well as a
caption. The caption is a name that is used in place of the field name in reports and
forms. If you use shortened or cryptic field names, captions are a good idea.
To set a primary key, right-click on the area to the left of the Owner Nr field and choose
Primary Key. A key icon will appear, indicating that the field is the primary key. Save
the file with the name Owner, and click on the table's Close button.
Repeat this process to create a second table for pets with these fields:
Now you can enter data into the tables. Click on Tables in the Objects bar and double-
click on Clients to open it in datasheet view. Type the following data into the table (the
number in the Owner Nr field will be entered automatically):
Close the table and then repeat the process to add the following data to the Pet table (the
patient no will be added automatically):
patient no | owner code | animal type | patient name | condition | treatment date | leave date | date of birth
1 | 2 | Cat | Peaches | fever | 30/04/2004 | 01/05/2004 | 01/04/2003
2 | 1 | Dog | Sam | | | | 01/04/2003
3 | 3 | Horse | Dobbin | | | | 03/03/1999
4 | 3 | Cat | Ginger | | | | 01/04/2003
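The link between the two tables is the owner code stored on each pet record, which matches the Owner Nr of the owner. The following is a minimal sketch of that link, assuming Python's built-in sqlite3 module in place of Access. The pet rows are taken from the table above; the owner names ("Owner A" and so on) and the column names without spaces are invented placeholders, because the owner data is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Owner (
    OwnerNr   INTEGER PRIMARY KEY,
    OwnerName TEXT
);
CREATE TABLE Pet (
    PatientNo   INTEGER PRIMARY KEY,
    OwnerCode   INTEGER REFERENCES Owner(OwnerNr),  -- the link to the owner
    AnimalType  TEXT,
    PatientName TEXT
);
""")

# Owner names are hypothetical placeholders; pet rows follow the table above.
conn.executemany("INSERT INTO Owner VALUES (?, ?)",
                 [(1, "Owner A"), (2, "Owner B"), (3, "Owner C")])
conn.executemany("INSERT INTO Pet VALUES (?, ?, ?, ?)",
                 [(1, 2, "Cat", "Peaches"), (2, 1, "Dog", "Sam"),
                  (3, 3, "Horse", "Dobbin"), (4, 3, "Cat", "Ginger")])


def pets_of(owner_nr):
    """Return the names of all pets belonging to one owner, in patient order."""
    rows = conn.execute(
        "SELECT Pet.PatientName FROM Owner "
        "JOIN Pet ON Pet.OwnerCode = Owner.OwnerNr "
        "WHERE Owner.OwnerNr = ? ORDER BY Pet.PatientNo",
        (owner_nr,),
    )
    return [name for (name,) in rows]
```

Because each owner is stored once and each pet only carries the owner's number, an owner with several pets (owner 3 here) does not have their details repeated on every pet row.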
I found material helpful to my writing in the first three sites which Google displayed! They are:
♦ http://www.microsoft.com/communities/newsgroups/en-us/default.aspx?dg=microsoft.public.access.queries&tid=462cbebf-5bef-437b-88f6-fbf70e774da0&cat=&lang=&cr=&sloc=&p=1
♦ http://articles.techrepublic.com.com/5100-10878_11-5285168.html
♦ http://www.access-programmers.co.uk/forums/showthread.php?t=170227
11
You don’t need to create this SQL statement yourself. Instead, separately create a query that combines
the fields that you need in the usual way, using Design mode (mode création). Test that it works, then
display it in SQL mode. Copy the SQL SELECT statement that Access has generated and use it to replace
the SELECT statement in the Row Source mentioned above.
5.3. Some difficulties associated with forms and subforms and how to overcome them
The examples here are based on the following database structure:
5.5. Detail subform does not show the subset of records based on the value of the current master form record
Unless you take specific action, a detail subform does not show the subset of records
based on the value of the current master form record when that master form record
changes.
SOLUTION
Solving this problem requires both SQL and some simple VBA, which deals with certain
events.
Event programming is a very powerful tool that you can use within your VBA code to
monitor user actions, take appropriate action when a user does something, or monitor
the state of the application as it changes.
An Event is an action initiated either by user action or by other VBA code. An Event
Procedure is a Sub procedure that you write, according to the specification of the event,
which is called automatically by Access when an event of that particular type occurs.
This is a test form which shows how the ProductID combo box displays values based on
the CategoryID selected.
The content, the RowSource for the ProductID, is obtained using the SQL statement:
SELECT DISTINCT Products.ProductID, Products.ProductName
FROM Products
WHERE (((Products.CategoryID)=[forms]![frmComboTest]![CategoryID]))
UNION
SELECT DISTINCT Null, Null FROM Products
ORDER BY Products.ProductName;
The ProductID combo box is requeried on both the OnCurrent event for the Form as
well as the Change event for the CategoryID combo box. The code required for the
OnCurrent event procedure is:
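Private Sub Form_Current()
    ' Refresh the product list each time the form moves to a different record
    ProductID.Requery
End Sub
The code required for the Change event on the parent (master) is:
Private Sub CategoryID_Change()
    ' Clear the previous product choice, then rebuild the list for the new category
    ProductID.Value = Null
    ProductID.Requery
End Sub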
Pay close attention to the RowSource for the ProductID combo box. The RowSource is
an SQL statement. It is based on a UNION query with the appropriate Product table
records as well as a row that contains null values. When the CategoryID combo box is
changed, the ProductID combo box receives the null value. This is how the contents of
the ProductID combo box are cleared.
Unfortunately, the SQL used as the RowSource has to be written by you, the user – this
particular kind of SQL statement cannot automatically be generated on the basis of a
user-defined query.
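The effect of the UNION in the RowSource can be tried out away from Access. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in: the `?` parameter plays the part of the reference to [forms]![frmComboTest]![CategoryID], and the product rows are invented for illustration. Unlike Access, SQLite allows the second SELECT to omit its FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    "ProductID INTEGER PRIMARY KEY, ProductName TEXT, CategoryID INTEGER)"
)
# Invented sample rows: two products in category 10, one in category 20.
conn.executemany("INSERT INTO Products VALUES (?, ?, ?)",
                 [(1, "Tea", 10), (2, "Coffee", 10), (3, "Bread", 20)])

# Same shape as the RowSource: matching products plus one all-null row.
ROW_SOURCE = """
SELECT DISTINCT ProductID, ProductName FROM Products WHERE CategoryID = ?
UNION
SELECT NULL, NULL
ORDER BY ProductName
"""


def product_choices(category_id):
    """Return the rows the product combo box would list for one category."""
    return conn.execute(ROW_SOURCE, (category_id,)).fetchall()
```

Calling `product_choices(10)` yields the blank row first, followed by the products of category 10 in name order; a category with no products yields the blank row alone. Re-running the function after the category changes corresponds to the Requery calls in the two event procedures.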
Note The syntax for referring to objects, such as forms and controls, is not completely
straightforward.
Use either of the following syntax statements to reference a control on a main form:
6.2. Introduction
♦ The relational database has a mathematical basis in Set
Theory
♦ It is possible to exploit the mathematical basis for
relational database design to improve the quality of the
actual design. Normalisation is a formal technique for
ensuring that the right attributes appear on the right
entities
♦ Also called relational data analysis, the technique of
normalisation is based on a property of data called
dependency or functional dependency.
♦ Normalisation aims to yield a set of entities designed to
∗ Minimise data redundancy
∗ Avoid consistency problems
♦ Normalisation is a “Bottom up” technique
Instead of starting with a top-down analysis of user requirements, this
technique starts with the existing situation: the technique examines business
documents as they are currently used in existing business processes. From
this, it induces the necessary database entities. For example, the starting
point might be an existing purchase order form. As we saw above in section
5, normalisation enables us to deduce the need for several entity types,
including purchase order, supplier, product and order detail line.
♦ It is applied to attributes discovered on paper and
computer forms, viewed as a table (cf. Spreadsheet view)
6.4. Terminology
6.4.1. Records
Data tends to be held in groups of items - each individual item of data is a field, and the
group of fields constitutes a record.
6.4.3. Keys
♦ Introduction
Before we can store details of (facts about) things in a database, we need
unique labels, that is, identifiers or names, for the entities about which
attributes are to be stored. These identifiers, or keys, need to be chosen with
precision and consistency.
Candidate keys are possible labels / names / identifiers.
Where there is more than one candidate key, we need to choose one as the
primary key.
Usually we choose numeric / short keys.
Often, we deliberately create a unique key (perhaps intended to be computer-
generated), such as a student enrolment number
∗ Candidate keys
There may be more than one possible key in a given situation. For
example, we might identify an Employee by her payroll number, or
by her National Insurance (NI) number in the UK, or Social Security
(US) number.
∗ Choose numeric / short keys
This may imply encoding, e.g. BAIB for BA (Honours) International
Business.
∗ Often need to create a unique key (perhaps
computer-generated)
Microsoft Access offers the AutoNumber facility to assist in
generating unique keys.
♦ Key types
Keys may be:
∗ Simple: single attribute
∗ Secondary - identifies a group of linked
occurrences
This document does not discuss secondary keys.
∗ Compound
Answer: one
This involves representing the data in the Purchase Order in the following format:
Remove repeating groups, i.e. groups of data fields (or a single data field) that may have multiple values for
a single value of the key.
Set such groups up as a separate entity.
The key to this new entity will be a compound key comprising the original key plus additional information to
identify individual occurrences.
Applying this to the above example gives us the following:
We now have two entities, purchase order and purchase order detail.
There must be only one value per cell (row/column intersection) in the entity. Put another way, an entity is
in first normal form (1NF) if there are no repeating groups of attributes.
A 1NF entity is also in 2NF if every non-key attribute depends on the whole of the key.
Any attributes that are dependent on only a part of the key should be removed and stored in their own
entity along with the part-key on which they depend.
Applying this rule to our example leads us to produce the following representation:
Part Number
Product Name
Packet size
What was wrong with 1NF, and what have we gained by moving to 2NF?
The answer is that the 1NF representation contains unnecessary repetition of "Part Description" and
"Packet size" information for every part ordered.
The same part may be ordered many hundreds of times so that storing the data in 1NF could represent a
waste of disk space. More importantly, this amount of redundancy in the way data is stored could lead to
significant update problems.
Another problem with our 1NF representation is that there is nowhere in the database to store information
about Parts which are not currently on order.
So to summarise, by normalising, we have discovered a third entity type, which is going to be called
something like Product, or Stock item.
To avoid the possibility of the database becoming inconsistent (with some copies of the same data being
updated whilst other copies are overlooked) we would ideally like to store each piece of data only once.
This is really what normalisation is all about.
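The step from 1NF to 2NF can be illustrated with a few lines of code. The following is only a sketch in plain Python, with invented part numbers, names and quantities: the 1NF rows are keyed on order number plus part number, and the function moves the attributes that depend on the part number alone (product name and packet size) into a separate product entity.

```python
# 1NF order detail rows: the key is (order_no, part_no). All values are invented.
ORDER_DETAIL_1NF = [
    {"order_no": 1, "part_no": "P1", "product_name": "Washers",
     "packet_size": 100, "quantity": 5},
    {"order_no": 1, "part_no": "P2", "product_name": "Bolts",
     "packet_size": 50, "quantity": 2},
    {"order_no": 2, "part_no": "P1", "product_name": "Washers",
     "packet_size": 100, "quantity": 7},
]


def to_2nf(rows):
    """Split 1NF rows into a product entity and an order detail entity."""
    products = {}  # keyed on part_no alone: each product is stored once
    details = []   # keyed on (order_no, part_no): only whole-key attributes remain
    for row in rows:
        products[row["part_no"]] = {
            "product_name": row["product_name"],
            "packet_size": row["packet_size"],
        }
        details.append({
            "order_no": row["order_no"],
            "part_no": row["part_no"],
            "quantity": row["quantity"],
        })
    return products, details


products, details = to_2nf(ORDER_DETAIL_1NF)
```

After the split, the name "Washers" is held once in `products` however many times part P1 is ordered, so a change of name is a single update, and a product can exist in `products` without appearing on any order.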
Applying this rule to our example would give the following representation:
Note: The items in CAPITALS above are suggested names for the entities now identified
Again we should ask the question: "What's wrong with 2NF"?
Our 2NF representation included unnecessary repetition of "Supplier Name" and "Supplier Address" for
every purchase order associated with the same supplier.
This corresponds to the first problem that we discussed with regard to 1NF.
The second problem corresponds to the fact that in the 2NF representation there is nowhere to store
information about suppliers from whom nothing is currently on order.
So normalisation here has identified the existence of a supplier entity.
There is another possible reason for interdependency of attributes. This is that one attribute is calculated
from others. It is very wise not to make such calculated attributes part of the database structure. Instead, it
is better simply to remove them, and to create them as calculated fields on queries, reports or forms as
they are needed. In this example, it would be unwise to have a subtotal attribute calculated as quantity
required times purchase price. Instead, as the subtotal is needed – e.g. on a report or form – it should
normally be recalculated by a formula. The exception to this advice is where it genuinely is necessary to
store the value calculated at one point in time, typically for accounting reasons.
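A calculated field of this kind might look as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of Access; the table layout, the column names and the values are invented for illustration. The subtotal is not stored anywhere: the query recalculates it from quantity required and purchase price each time it runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetail ("
    "OrderNo INTEGER, PartNo TEXT, QuantityRequired INTEGER, PurchasePrice REAL, "
    "PRIMARY KEY (OrderNo, PartNo))"
)
# Invented sample rows; note that no subtotal column exists in the table.
conn.executemany("INSERT INTO OrderDetail VALUES (?, ?, ?, ?)",
                 [(1, "P1", 5, 2.5), (1, "P2", 2, 10.0)])


def order_lines(order_no):
    """Return (part number, subtotal) pairs, computing the subtotal in the query."""
    rows = conn.execute(
        "SELECT PartNo, QuantityRequired * PurchasePrice AS Subtotal "
        "FROM OrderDetail WHERE OrderNo = ? ORDER BY PartNo",
        (order_no,),
    )
    return rows.fetchall()


def change_quantity(order_no, part_no, quantity):
    """Update the stored quantity; there is no stored subtotal to keep in step."""
    conn.execute(
        "UPDATE OrderDetail SET QuantityRequired = ? "
        "WHERE OrderNo = ? AND PartNo = ?",
        (quantity, order_no, part_no),
    )
```

Because the subtotal is derived on demand, changing a quantity with `change_quantity` can never leave an out-of-date subtotal behind. In Access the equivalent is a calculated field in a query, form or report.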
Emp. No. | Emp. Name | Job Code | Job Title | Start Date | Finish Date
2. Given the following table, are the following statements true or false?
3. Consider a retail company that stores sales information. The information is currently stored in a single file. The
company has several stores (shops) and the file has a record for every product line on sale at each store. The
file also contains details of future price changes and the effective date for which these have been scheduled.
The file therefore has the following structure:
The entries marked * have values that have already been entered, i.e. * represents 'ditto'.
(I) What problems might result from storing this data in a single table?
(II) Take the data in the file through to third normal form.
(III) Does the new file structure address all the problems identified in (I)?
(IV) If the sales manager wanted to add the following data to the files:
- Supplier Name for each item
- Name of store manager for each store
- Maximum quantity of each item to be stored in each store
Where would the data be stored in your 3NF model? Would any new tables be
required?
4. The following table shows the breakdown of student marks on different courses by assignment number. In this
example we have a repeating group inside a repeating group. For each course there is repeating student data
and for each student there is repeating assignment data.
i) Take the data in this report through to 3NF. What are the benefits of storing this data in third normal
form?
5. The Natural Yoghurt Company sells many products. Each product is composed of several raw ingredients that
are supplied by various vendors. A particular ingredient is always supplied by the same vendor; however a
vendor may supply more than one ingredient. The product line (product offering) is divided up so that only one
department is responsible for a particular product. However, each department is responsible for more than one
product. Each manager manages exactly one department. The following data items must be stored in the
Natural Yoghurt Company’s database:
Derive an entity relationship model and a set of 3NF tables from the above description.
7.1. Introduction
MS Visio is now available to ESC students via the Microsoft Developers’ Network
Academic Alliance (MSDNAA) Electronic Licence Management System (ELMS). You
should by now have received an email from e-academy telling you how you can benefit
from this scheme.
In order to create a drawing of a particular kind, you use both a template file and a
stencil file. These together tell Visio what kind of symbols can be used. The equivalent
terms in French are un modèle and un gabarit.
Microsoft Office Visio 2007 makes it easy for business and ICT professionals to
visualise, explore, and communicate complex information. Rather than complicated text
and tables that are hard to understand, you can use Visio diagrams that communicate
information at a glance. Instead of static pictures, you can create data-connected Visio
diagrams that display data, are easy to refresh, and dramatically increase your
productivity. You can use the wide variety of diagrams in Office Visio 2007 to
understand, act on, and share information about organizational systems, resources, and
processes throughout an enterprise.
Office Visio 2007 is available in two stand-alone editions: Office Visio Professional and
Office Visio Standard. Office Visio Standard 2007 has the same basic functionality as
Visio Professional 2007 and includes a subset of its features and templates. Office Visio
Professional 2007 offers advanced functionality, such as data connectivity and
visualization features, that Office Visio Standard 2007 does not.
12
Please note: the zero-defect ideal is emphatically not expected in the work that you do for assessment.
Instead, we are aiming for “good enough”! This appendix is included only because of the extremely useful
technique it illustrates.
that the meeting concern itself only with the identification of problems or serious errors
of style. The designer of the artefact should correct the problems subsequently.
Keys to success in the use of structured walk-through include:
Correctly assembling the right group of colleagues.
Distributing the artefact to participants before the meeting.
Total concentration on the artefact itself, rather than the person -- individual
criticism should be avoided.
The meeting should be scheduled in advance and of fixed duration.
The benefits of structured walk-throughs can be summarised as:
The quality of the artefact is improved because more faults are found, and
because errors of style -- which can lead subsequently to errors of interpretation
by others -- are eliminated.
Misunderstandings of the original requirements are more likely to be detected.
The earlier a problem is found with an artefact, the cheaper it will be to fix it.
But there are obvious problems in using this technique in an organisational culture that
is not collaborative and supportive.