AIS - Chapter 3 Relational Database

Chapter Three

RELATIONAL DATABASES
Files versus Databases
To fully appreciate the power of databases, it is important to understand some basic principles
about how data are stored in computer systems. Information about the attributes of an entity, such
as a customer’s name and address, is stored in fields. All the fields containing data about one
entity (e.g., one customer) form a record. A set of related records, such as all customer records,
forms a file (e.g., the customer file). A set of interrelated, centrally coordinated files forms a
database.
Figure 3.1: Basic Elements of Data Hierarchy
Database
   Files: Customer File, Sales File, Inventory File
      Records (Customer File): Record 1 (Customer 1), Record 2 (Customer 2), Record 3 (Customer 3), ..., Record 1000 (Customer 1000)
         Fields (each record): Field 1 Customer Number, Field 2 Customer Name, Field 3 Street Address, Field 4 City, Field 5 State, Field 6 Zip Code, Field 7 Customer Age

Database systems were developed to address the problems associated with the proliferation of
master files. For many years, companies created new files and programs each time an information
need arose. The result was a significant increase in the number of master files. For example, Bank
of America at one time had 36 million customer accounts in 23 separate systems. One
governmental agency had data stored in 22 separate systems.

This proliferation of master files created problems. Often, the same data were stored in two or
more separate (or multiple) master files. This made it more difficult to effectively integrate data
stored in different files and to obtain an organization-wide view of the data. It also created
problems because the specific data values stored in the different files may not have been
consistent. For example, a customer’s address may have been correctly updated in the master file
used to ship merchandise, but the old address may still be stored in the master file used for billing.
Figure 3.2: File-Oriented Versus Database Systems
File approach:
   Master File 1 (Facts A, B, C, D) -> Sales Program
   Master File 2 (Facts A, C, E, F) -> Shipping Program
   Master File 3 (Facts A, D, E, G) -> Billing Program
Database approach:
   Database (Facts A, B, C, D, E, F, G) -> Database Management System -> Sales Program, Shipping Program, Billing Program

A database is a set of inter-related, centrally coordinated files. The database approach treats data
as an organizational resource that should be used by and managed for the entire organization, not
just the originating department or function. A database management system (DBMS) acts as an
interface between the database and the various application programs. The purpose of the DBMS
is to provide controlled access to the database. The DBMS is a special software system that is
programmed to know which data elements each user is authorized to access. The user’s program
sends requests for data to the DBMS, which validates and authorizes access to the database in
accordance with the user’s level of authority. The DBMS will deny requests for data that the user
is unauthorized to access. As one might imagine, the organization’s criteria, rules, and procedures
for assigning user authority are important control issues for accountants to consider.
The combination of the database, the DBMS, and the application programs that access the database
through the DBMS is referred to as the database system. The person responsible for the database
is the database administrator (DBA). The DBA is responsible for managing the database resource.

Multiple users sharing a common database require organization, coordination, rules, and guidelines
to protect the integrity of the database.
In large organizations the DBA function may consist of an entire department of technical personnel
under the database administrator. In smaller organizations someone within the computer services
group may assume DBA responsibility. The duties of the DBA fall into the following areas:
database planning, database design, database implementation, database operation and
maintenance, and database change and growth.
Functions of the Database Administrator

1. Database Planning
✓ Develop organization’s database strategy
✓ Define database environment
✓ Define data requirements
✓ Develop data dictionary
2. Design
✓ Logical database (schema)
✓ External user views (subschemas)
✓ Internal view of database
✓ Database controls
3. Implementation
✓ Determine access policy
✓ Implement security controls
✓ Specify test procedures
✓ Establish programming standards
4. Operation and Maintenance
✓ Evaluate database performance
✓ Reorganize database as user needs demand
✓ Review standards and procedures
5. Change and Growth
✓ Plan for change and growth
✓ Evaluate new technology

As technology improves, many large companies are developing very large databases called
data warehouses. For example, Bank of America created a customer information database to
provide customer service, marketing analysis, and managerial information. At the time it was
created, it was the largest in the banking industry, with over 600 billion characters of data. It
contained copies of all bank data on checking and savings accounts; real estate, consumer, and
commercial loans; ATMs; and bank cards. Although it cost the bank $14 million a year, it was
well worth the cost.

THE ADVANTAGES OF DATABASE SYSTEMS


Database technology is everywhere, and everyone is or will be affected by it. Most new AISs
are implemented using a database approach. Virtually all mainframe computer sites use
database technology, and database use in personal computers (PCs) is growing rapidly.
Most accounting students will audit or work for a company that uses database technology to
store, process, and report accounting transactions. Many accountants work directly with
databases since they are directly involved in entering, processing, and querying the data in
databases. They are also responsible for developing and evaluating the internal controls
necessary to ensure database integrity. Some will become involved as designers and managers
of databases.

Database technology provides the following benefits to organizations:


• Data integration: Integration is achieved by combining master files into larger pools
of data that many application programs can access.
• Data sharing: Integrating data makes it easier to share data with all authorized users.
• Reporting flexibility: Reports can be revised easily and generated as needed, and the
database can be easily browsed to research a problem or obtain detailed information
underlying a summary report.
• Minimal data redundancy and data inconsistencies: Because data items are usually
stored only once, data redundancy and data inconsistencies are minimized.
• Data independence: Because data and the programs that use them are independent of
each other, each can be changed without having to change the other. This makes
programming easier and simplifies data management.
• Central management of data: Data management is more efficient because a database
administrator is responsible for coordinating, controlling, and managing data.
• Cross-functional analysis: In a database system, relationships, such as the association
between selling costs and promotional campaigns, can be explicitly defined and used
in the preparation of management reports.
The Importance of Good Data
While database systems provide many advantages, bad or incorrect data stored in the database
can lead to:
• Bad decisions
• Embarrassment
• Angry users of the data

Database Systems
Logical and physical views of data
In file-oriented systems, programmers must know the physical location and layout of records
used in an application program. Suppose a programmer wants a credit report showing the
customer number, credit limit, and current balance. To write the program, the programmer must
understand the location and length of the fields needed (e.g., record positions 1 through 10 for
customer number) and the format of each field (alphanumeric or numeric). The process
becomes more complex if data from several files are used.
Database systems overcome this problem by separating the storage and use of data elements.
The database approach provides two separate views of the data: the physical view and the
logical view.
The logical view is how the user or programmer conceptually organizes and understands the
data. For example, a sales manager may conceptualize all information about customers as being
stored in the form of a table.
The physical view refers to how and where the data are physically arranged and stored in the
computer system. Separating the logical and physical views of data facilitates developing new
applications because programmers can concentrate on coding the application logic (what the
program will do) and do not need to focus on how and where the various data items are stored
and accessed.
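The contrast can be illustrated with a minimal Python sketch of the credit report example. Only the positions 1 through 10 for the customer number come from the text; the positions assumed for credit limit and balance, the table and column names, and the sample values are hypothetical.

```python
import sqlite3

# File-oriented approach: the program must know the physical record layout.
# Assumed layout: positions 1-10 customer number, 11-18 credit limit,
# 19-26 current balance (only the first range is given in the text).
RECORD = "0000012345" + "00005000" + "00001250"

def parse_fixed_width(record):
    """Slice each field out of the record by its position and length."""
    return {
        "customer_number": record[0:10],
        "credit_limit": int(record[10:18]),
        "balance": int(record[18:26]),
    }

file_view = parse_fixed_width(RECORD)

# Database approach: the program names the data it wants (logical view);
# the DBMS decides how and where the data are physically stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_number TEXT PRIMARY KEY, "
    "name TEXT, credit_limit INTEGER, balance INTEGER)"
)
conn.execute("INSERT INTO customer VALUES ('0000012345', 'Acme', 5000, 1250)")
db_view = conn.execute(
    "SELECT customer_number, credit_limit, balance FROM customer"
).fetchone()
```

If the record layout changes, `parse_fixed_width` must be rewritten, whereas the query keeps working as long as the column names are unchanged.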
Discussion Question 3.1:
A database allows two distinct views of data: a logical view and a physical view. Contrast the
two views, and discuss why separate views are necessary in database applications. Describe
which perspective is most useful for each of the following employees: a programmer, a
manager, and an internal auditor. How will understanding logical data structures assist
accountants in designing and using database systems?
Answer:
Databases are possible because of their database management system (DBMS). The DBMS is
a software program that sits between the actual data stored in the system and the application
programs that use the data. This allows users to separate the way they view the data (called
the logical view) from the way the data is actually stored (the physical view). The DBMS
interprets the users' requests and retrieves, manipulates, or stores the data as needed. The two
distinct views separate the applications from the physical information, providing increased
flexibility in applications, improved data security, and ease of use.
In a database system, the manager will rarely need to understand or be familiar with the
physical view of the data. Nor, in most instances, will the internal auditor or the programmer,
since almost everything they do involves the logical view of the data.
If accountants understand logical data structures and the logical view of the data, they are better
able to manage, use, and audit a database and its data.
Schemas
A schema describes the logical structure of a database. There are three levels of schema: the
conceptual, the external, and the internal.
The conceptual-level schema is the organization-wide view of the entire database—i.e., the
big picture. It lists all data elements and the relationships among them.

The external-level schema consists of a set of individual user views of portions of the database,
each of which is referred to as a subschema. A subschema shows how a user sees the portion of
the system with which he or she interacts.
The internal level schema provides a low-level view of the database. It describes how the data
are actually stored and accessed, including information about record layouts, definitions,
addresses, and indexes.
Accountants are frequently involved in developing the conceptual- and external-level schema,
so it is important to understand the difference between the two.
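As a rough illustration of the conceptual and external levels, the sketch below uses Python's built-in sqlite3 module. The table stands in for part of a conceptual-level schema and the SQL view for one subschema. The table name, view name, columns, and the idea of a shipping clerk's view are assumptions made for the example. The internal level (file layout and indexes) is handled by SQLite itself and is not visible in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the organization-wide definition of the customer data.
conn.execute(
    "CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, "
    "name TEXT, credit_limit INTEGER, balance INTEGER)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme Co.', 5000, 1250)")

# External level: a subschema that exposes only what one user needs.
# A hypothetical shipping clerk sees names but not credit information.
conn.execute(
    "CREATE VIEW shipping_view AS SELECT customer_number, name FROM customer"
)

cursor = conn.execute("SELECT * FROM shipping_view")
shipping_columns = [column[0] for column in cursor.description]
shipping_rows = cursor.fetchall()
```

The view contains no data of its own. It is a stored definition of which portion of the conceptual schema a particular user is allowed to see.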

The Data Dictionary


A key component of a DBMS is the data dictionary, which contains information about the
structure of the database. For each data element stored in the database, such as the customer
number, there is a corresponding record in the data dictionary describing that element. The data
dictionary describes every data element in the database. This enables all users (and
programmers) to share a common view of the data resource and greatly facilitates the analysis
of user needs.
Information provided for each element includes:
✓ A description or explanation of the element.
✓ The records in which it is contained.
✓ Its source.
✓ The length and type of the field in which it is stored.
✓ The programs in which it is used.
✓ The outputs in which it is contained.
✓ The authorized users of the element.
✓ Other names for the element.
Table 3.1: Example of a Data Dictionary

Data element name: Customer number
   Description: Unique identifier of each customer
   Records in which contained: A/R record, customer record, sales analysis record
   Source: Customer number listing
   Field length: 10
   Field type: Numeric
   Programs in which used: A/R update, customer file update, sales analysis update, credit analysis
   Outputs in which contained: A/R aging report, customer status report, sales analysis report, credit report
   Authorized users: No restrictions
   Other data element names: None

Data element name: Customer name
   Description: Complete name of customer
   Records in which contained: Customer record
   Source: Initial customer order
   Field length: 20
   Field type: Alphanumeric
   Programs in which used: Customer file update, statement processing
   Outputs in which contained: Customer status report, monthly statement
   Authorized users: No restrictions
   Other data element names: None

Data element name: Address
   Description: Street, city, state, and zip code
   Records in which contained: Customer record
   Source: Credit application
   Field length: 30
   Field type: Alphanumeric
   Programs in which used: Customer file update, statement processing
   Outputs in which contained: Customer status report, monthly statement
   Authorized users: No restrictions
   Other data element names: None

Data element name: Credit limit
   Description: Maximum credit that can be extended to customer
   Records in which contained: Customer record, A/R record
   Source: Credit application
   Field length: 8
   Field type: Numeric
   Programs in which used: Customer file update, A/R update, credit analysis
   Outputs in which contained: Customer status report, A/R aging report, credit report
   Authorized users: D. Dean, R. Dalebout & H. Heaton
   Other data element names: Credit limit

Data element name: Balance
   Description: Balance due from customer on credit purchases
   Records in which contained: A/R record, sales analysis record
   Source: Various sales and payment transactions
   Field length: 8
   Field type: Numeric
   Programs in which used: A/R update, sales analysis update, statement processing, credit analysis
   Outputs in which contained: A/R aging report, sales analysis report, monthly report, credit report
   Authorized users: D. Burton, B. Heninger, S. Summaer
   Other data element names: Customer balance

Accountants often participate in the development of the data dictionary because they
understand the data elements that exist in a business organization, where they originate, and
where they are used.
The DBMS usually maintains the data dictionary. In fact, this is often one of the first
applications of a newly implemented database system.
Inputs to the data dictionary include:
• Records of any new or deleted data elements.
• Changes in names, descriptions, or uses of existing elements.
Outputs include a variety of reports useful to programmers, database designers, and users of
the information system. Sample reports include (1) a list of all programs in which a data item
is used, (2) a list of all synonyms for the data elements in a particular file, (3) a list of all data
elements used by a particular user, and (4) a list of all output reports in which a data element
is used. These reports are useful in:
✓ Designing and implementing a database system.
✓ Documenting the system.
✓ Creating an audit trail.
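The sample reports can be pictured as simple lookups against the dictionary. The following Python sketch holds two abbreviated entries from Table 3.1 in an ordinary dictionary. The structure and the function names are hypothetical and are not how any particular DBMS stores its data dictionary.

```python
# Hypothetical in-memory data dictionary; entries abbreviated from Table 3.1.
DATA_DICTIONARY = {
    "Customer number": {
        "programs": ["A/R update", "customer file update",
                     "sales analysis update", "credit analysis"],
        "outputs": ["A/R aging report", "customer status report",
                    "sales analysis report", "credit report"],
    },
    "Credit limit": {
        "programs": ["customer file update", "A/R update", "credit analysis"],
        "outputs": ["customer status report", "A/R aging report",
                    "credit report"],
    },
}

def programs_using(element):
    """Report (1): all programs in which a data element is used."""
    return DATA_DICTIONARY[element]["programs"]

def outputs_containing(element):
    """Report (4): all output reports in which a data element is used."""
    return DATA_DICTIONARY[element]["outputs"]

def elements_in_output(report_name):
    """Reverse lookup (an extra, not listed in the text): elements in a report."""
    return sorted(
        name for name, entry in DATA_DICTIONARY.items()
        if report_name in entry["outputs"]
    )
```

A lookup such as `programs_using("Credit limit")` shows which programs would need review before the credit limit field is changed, which is one way such reports support an audit trail.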
DBMS Languages

Every DBMS must provide a means of performing the three basic functions of creating,
changing, and querying a database. The sets of commands used to perform these functions are
referred to as the data definition, data manipulation, and data query languages, respectively.
The data definition language (DDL) is used to:
▪ Build the data dictionary
▪ Initialize or create the database
▪ Describe the logical views for each individual user or programmer, and
▪ Specify any limitations or constraints on security imposed on database records or
fields
The data manipulation language (DML) is used for data maintenance, which includes such
operations as
• Updating data
• Inserting data
• Deleting portions of the database.
The data query language (DQL) is used to interrogate (request) the database. Whereas the
DML is used to change the contents of the database, the DQL retrieves, sorts, orders, and
presents subsets of the database in response to user queries. Most DQLs contain a fairly
powerful but easy-to-use set of commands that enables users to satisfy many of their
information needs without a programmer’s assistance.
Many DBMS packages also include a report writer, which is a language that simplifies report
creation. Typically, users need only specify which data elements they want printed and how
the report should be formatted. The report writer then searches the database, extracts the
specified data items, and prints them out according to the user-specified format.
All users generally have access to both the DQL and report writer. Access to DDL and DML,
however, should be restricted to those employees with administrative and programming
responsibilities. This helps to limit the number of people who can make changes to the
database.
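In SQL-based systems the three roles are filled by different statements of the same language. The sketch below, written with Python's built-in sqlite3 module, shows one statement of each kind. The table, columns, and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
conn.execute(
    "CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, credit_limit INTEGER)"
)

# DML: insert, update, and delete data.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Acme Co.", 5000), (2, "Baker Inc.", 2000), (3, "Cole LLC", 8000)],
)
conn.execute("UPDATE customer SET credit_limit = 6000 WHERE customer_number = 1")
conn.execute("DELETE FROM customer WHERE customer_number = 2")

# DQL: retrieve and order a subset of the data without changing it.
high_limit_customers = conn.execute(
    "SELECT name, credit_limit FROM customer "
    "WHERE credit_limit >= 6000 ORDER BY credit_limit DESC"
).fetchall()
```

Only the SELECT statement is something a typical end user would need. The CREATE, INSERT, UPDATE, and DELETE statements correspond to the DDL and DML access that the text recommends restricting.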

RELATIONAL DATABASES
A relational database is one managed by a DBMS that uses the relational data model
developed by Dr. E. F. Codd in 1970. A DBMS is characterized by the type of logical data
model on which it is based. A data model is an abstract representation of the contents of a
database. Most new DBMSs are called relational databases because they use the relational
model.
The relational data model represents everything in the database as being stored in the form
of tables. Technically, these tables are called relations (hence the name
relational data model), but we will use the two words interchangeably. Moreover, keep in mind
that the relational data model only describes how the data appear in the conceptual- and
external-level schemas. The data are not actually stored in tables but rather in the manner
described in the internal-level schema.
Each row in a relation, called a tuple (which rhymes with couple), contains data about a
specific occurrence of the type of entity represented by that table.
Types of Attributes

Tables in a relational database have several types of attributes. A primary key is the attribute,
or combination of attributes, that uniquely identifies a specific row in a table. Usually, the
primary key is a single attribute. In some tables, however, two or more attributes jointly form
the primary key. For example, the primary key of the Sales-Inventory table is the combination
of Sales Invoice # and Item #.
A foreign key is an attribute in a table that is a primary key in another table. Foreign keys are
used to link tables. An example is the attribute Customer #. It is the primary key in the
Customer table and, as a foreign key in the Sales table, it is used to link the data about a
particular customer in the Sales table to the Customer table.
Other non-key attributes in each table store important information about that entity. For
example, the inventory table also contains information about the description, color, vendor
number, quantity on hand, and price of each item.
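The three kinds of attributes can be seen in a small schema. The following sketch uses Python's built-in sqlite3 module. The column names and the sample rows are assumptions, and only the key structure (Customer #, Sales Invoice #, and the Sales Invoice # plus Item # combination) follows the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_no INTEGER PRIMARY KEY,           -- single-attribute primary key
    name        TEXT                           -- non-key attribute
);
CREATE TABLE sales (
    invoice_no  INTEGER PRIMARY KEY,
    customer_no INTEGER REFERENCES customer(customer_no),  -- foreign key
    amount      REAL
);
CREATE TABLE sales_inventory (
    invoice_no  INTEGER REFERENCES sales(invoice_no),
    item_no     INTEGER,
    quantity    INTEGER,
    PRIMARY KEY (invoice_no, item_no)          -- composite primary key
);
INSERT INTO customer VALUES (151, 'Sample Customer');
INSERT INTO sales VALUES (101, 151, 1447.0);
INSERT INTO sales_inventory VALUES (101, 10, 2), (101, 50, 1);
""")

# The foreign key lets a query link each sale to the matching customer row.
sale_with_customer = conn.execute(
    "SELECT s.invoice_no, c.name, s.amount "
    "FROM sales AS s JOIN customer AS c ON s.customer_no = c.customer_no"
).fetchall()
```

Invoice 101 appears twice in `sales_inventory`, once per item, which is why neither column alone can serve as that table's primary key.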
Basic Requirements of a Relational Database
The relational data model imposes several requirements on the structure of tables. A set of
tables that follows these constraints represents a well-structured (normalized) database.
1. Every column in a row must be single valued.
In a relational database there is one, and only one, value in any given cell.
• In the student table, you couldn’t have an attribute named “Phone Number” if a
student could have multiple phone numbers.
• There might instead be an attribute named “local phone number” and an attribute named
“permanent phone number.”
• You could not have an attribute named “Class” in the student table, because a
student could take multiple classes.
2. Primary keys cannot be null.
The primary key is the attribute, or combination of attributes, that uniquely identifies a specific
row in a table. For this to be true, the primary key of any row in a relation cannot be null
(blank), for then there would be no way to uniquely identify that row and retrieve the data
stored there. A non-null value for the primary key indicates that a specific object exists and can
be identified by reference to its primary key value. This is referred to as the entity integrity
rule, because it ensures that every row in every relation must represent data about some specific
object in the real world. For example, in the Sales-Inventory table there is no single field that
uniquely identifies each row. However, the Sales Invoice # and Item #, taken together, do
uniquely identify each row. Therefore, both attributes are combined to form the primary key.

For example,
STUDENTS
Student ID    Last Name   First Name   Phone Number
333-33-3333   Simpson     Alice        333-3333
111-11-1111   Sanders     Ned          444-4444
123-45-6789   Moore       Artie        555-5555

COURSES
Course ID   Course      Section   Day   Time
1234        ACCT-3603   1         MWF   8:30
1235        ACCT-3603   2         TR    9:30
1236        MGMT-2103   1         MW    8:30

STUDENT X COURSE (SCID)


333333333-1234
333333333-1236
333333333-1235
333333333-1236

• Note that within each table, there are no duplicate primary keys and no null primary
keys.
• This is consistent with the entity integrity rule.
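A DBMS can enforce the entity integrity rule automatically. The sketch below declares the STUDENTS table with Python's built-in sqlite3 module and attempts three inserts. The helper function `try_insert` and the column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly because SQLite, for historical reasons,
# otherwise tolerates NULL in a non-integer PRIMARY KEY column.
conn.execute(
    "CREATE TABLE students (student_id TEXT PRIMARY KEY NOT NULL, "
    "last_name TEXT, first_name TEXT, phone TEXT)"
)
conn.execute(
    "INSERT INTO students VALUES ('333-33-3333', 'Simpson', 'Alice', '333-3333')"
)

def try_insert(row):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# A null primary key and a duplicate primary key are both rejected.
null_key_ok = try_insert((None, "Sanders", "Ned", "444-4444"))
duplicate_key_ok = try_insert(("333-33-3333", "Moore", "Artie", "555-5555"))
# A new, non-null key is accepted.
valid_key_ok = try_insert(("111-11-1111", "Sanders", "Ned", "444-4444"))
```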
3. Foreign keys, if not null, must have values that correspond to the value of a primary key
in another table.
Foreign keys are used to link rows in one table to rows in another table. For example, the
Customer # is a foreign key in the sales table and links each sales transaction with the customer
who participated in that event. This is only possible, however, if the customer number values
in the sales table correspond to actual customer numbers in the customer table. This constraint
is referred to as the referential integrity rule because it ensures the consistency of the database.
Foreign keys can contain null values, however. For example, some customers pay cash and
for privacy reasons do not want to give the company any way to identify and track them. Therefore,
for such cash sales, the Customer # field in the sales table would be blank.
For example,
STUDENTS
Student ID    Last Name   First Name   Phone Number   Advisor Number
333-33-3333   Simpson     Alice        333-3333       1418
111-11-1111   Sanders     Ned          444-4444       1418
123-45-6789   Moore       Artie        555-5555       1503

ADVISORS
Advisor Number   Last Name   First Name   Office Number
1418             Howard      Glen         420
1419             Melton      Amy          316
1503             Zhang       Xi           202
1506             Radowski    J.D.         203

Advisor Number is a foreign key in the STUDENTS table. Every occurrence of Advisor Number
in the STUDENTS table either matches an instance of the primary key in the ADVISORS table
or is null.
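The referential integrity rule can also be enforced by the DBMS. The following sketch uses Python's built-in sqlite3 module with two of the advisors from the ADVISORS table. The students with IDs beginning 000-00 are hypothetical rows added only to show the null and non-matching cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE advisors (
    advisor_no INTEGER PRIMARY KEY,
    last_name  TEXT, first_name TEXT, office_no INTEGER
);
CREATE TABLE students (
    student_id TEXT PRIMARY KEY NOT NULL,
    last_name  TEXT, first_name TEXT,
    advisor_no INTEGER REFERENCES advisors(advisor_no)
);
INSERT INTO advisors VALUES (1418, 'Howard', 'Glen', 420), (1503, 'Zhang', 'Xi', 202);
INSERT INTO students VALUES ('333-33-3333', 'Simpson', 'Alice', 1418);
""")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO students VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Foreign key matches an existing advisor: accepted.
matching_fk_ok = try_insert(("111-11-1111", "Sanders", "Ned", 1418))
# Foreign key is null (hypothetical student with no advisor yet): accepted.
null_fk_ok = try_insert(("000-00-0001", "Doe", "Pat", None))
# Foreign key points at a nonexistent advisor (hypothetical): rejected.
dangling_fk_ok = try_insert(("000-00-0002", "Roe", "Sam", 9999))
```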
4. All non-key attributes in a table should describe a characteristic about the object
identified by the primary key.
Most tables contain other attributes in addition to primary and foreign keys. Consider the sales
table. Sales Invoice # is the primary key. Both Customer # and Salesperson are foreign keys,
although the salesperson table is not shown. The remaining attributes (date and sales amount)
are other important facts about the sales event. Details about the customer or salesperson who
participated in that transaction, or about the items purchased, however, are stored in those tables,
not in the sales table.
• Could nationality be a non-key attribute in the student table?
• Could advisor’s nationality be a non-key attribute in the student table?
These four constraints produce a well-structured (normalized) database in which data are
consistent and redundancy is minimized and controlled. First, notice that redundancy is
greatly reduced. For example, all non-key attributes, such as customer addresses and unit
prices, are stored just once. This avoids the potential of update anomaly problems. Note that
redundancy is not entirely eliminated. Certain items, such as Sales Invoice # and Item #, appear
in more than one table. These attributes appear more than once (multiple times) only when they
function as foreign keys, however. Consequently, the referential integrity rule ensures that
there will be no update anomaly problems with the foreign keys.
PROBLEMS ASSOCIATED WITH STORING ALL DATA IN ONE TABLE
One problem with trying to store all the data in one table is that it creates a great deal of
redundancy. Three specific types of problems can occur. The first is called an update anomaly,
because changes (updates) to data values are not correctly recorded. For example, changing a
customer’s address requires searching the entire table and changing every occurrence of that
customer’s address. Overlooking even one row would create inconsistency in the database,
because multiple rows, each with a different address, would then exist for the same customer.
This could result in unnecessary duplicate mailings. It could also create errors in analyses such
as counting the number of different customers who purchased something during a specific time
period. The second problem is an insert anomaly. To demonstrate the effects of the insertion
anomaly, assume that a new vendor has entered the marketplace. The organization does not
yet purchase from the vendor, but may wish to do so in the future. In the meantime, the
organization wants to add the vendor to the database. This is not possible, however, because
the primary key for the Inventory table is ITEM (PART) NUMBER. Because the vendor does
not supply the organization with any inventory items, the supplier data cannot be added to the
table. The third problem is a delete anomaly. The deletion anomaly involves the unintentional
deletion of data from a table. If a customer had made only one purchase, consisting of a single
item, deleting that row from the table would result in the loss of all information about the
customer.
The presence of the deletion anomaly is less conspicuous, but potentially more serious than
the update and insertion anomalies. A flawed database design that prevents the insertion of
records or requires the user to perform excessive updates attracts attention quickly. The
deletion anomaly, however, may go undetected, leaving the user unaware of the loss of
important data until it is too late. This can result in the unintentional loss of critical accounting
records and the destruction of audit trails. Table design, therefore, is not just an operational
efficiency issue; it carries internal control significance that accountants need to recognize.
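All three anomalies can be reproduced with a single wide table. The sketch below uses Python's built-in sqlite3 module. The table `sales_flat`, its columns, and the customers are invented for the demonstration, and the insert case uses a prospective customer rather than the vendor example from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One wide table: customer facts are repeated on every sales line.
conn.execute(
    "CREATE TABLE sales_flat ("
    "invoice_no INTEGER NOT NULL, item_no INTEGER NOT NULL, "
    "customer_name TEXT, customer_address TEXT, "
    "PRIMARY KEY (invoice_no, item_no))"
)
conn.executemany(
    "INSERT INTO sales_flat VALUES (?, ?, ?, ?)",
    [
        (101, 10, "Customer A", "12 Old Road"),
        (102, 20, "Customer A", "12 Old Road"),
        (103, 30, "Customer B", "9 Elm Street"),
    ],
)

# Update anomaly: the address change reaches only one of Customer A's rows.
conn.execute(
    "UPDATE sales_flat SET customer_address = '77 New Avenue' "
    "WHERE invoice_no = 101"
)
addresses_for_a = conn.execute(
    "SELECT COUNT(DISTINCT customer_address) FROM sales_flat "
    "WHERE customer_name = 'Customer A'"
).fetchone()[0]

# Insert anomaly: a prospective customer has no invoice, so no key exists.
try:
    conn.execute(
        "INSERT INTO sales_flat VALUES (NULL, NULL, 'Customer C', '5 Oak Lane')"
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Delete anomaly: removing Customer B's only sale erases the customer too.
conn.execute("DELETE FROM sales_flat WHERE invoice_no = 103")
customer_b_rows = conn.execute(
    "SELECT COUNT(*) FROM sales_flat WHERE customer_name = 'Customer B'"
).fetchone()[0]
```

After the update, Customer A has two different addresses on file, and after the delete, nothing in the database records that Customer B ever existed.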
The solution: A set of tables
The problems associated with trying to store all the data in one table can be avoided by creating
a set of tables, with one table for each separate entity of interest. First, notice that
redundancy is greatly reduced. For example, all non-key attributes, such as customer
addresses and inventory item unit prices, are stored just once. This avoids the potential of
update anomaly problems. Note that the redundancy is not entirely eliminated, however.
Certain items, such as sales invoice number and item number, appear in more than one table.
These attributes appear more than once only when they function as foreign keys, however.
Consequently, the referential integrity rule ensures that there will be no update anomaly
problems with the foreign keys.
An important feature of the schema is that data about various things of interest (customers,
inventory, and sales transactions) are stored in separate tables. This makes it easier to add new
data to the system. For example, information about prospective customers can be stored simply
by adding another row in the customer table. Thus, the set of tables avoids the problem of an
insert anomaly.
Two Approaches to Database Design
There are two basic ways to design well-structured relational databases:
• Normalization
• Semantic data modeling
Normalization
One approach, called normalization, starts with the assumption that everything is initially
stored in one large table. A set of rules is then followed to decompose that initial table into a
set of normalized tables. The objective is to produce a set of tables in what is called third
normal form (3NF), because such tables are free of the types of update, insert, and delete
anomaly problems described previously.
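The basic idea of decomposition can be sketched in a few lines of Python. This is not the formal normalization procedure, which proceeds through first, second, and third normal form. It only shows one step: attributes that describe the customer rather than the sale are moved to their own table, and the customer number stays behind as a foreign key. All names and values are hypothetical.

```python
# Flat table: one row per sale, with customer facts repeated on each row.
flat_rows = [
    {"invoice_no": 101, "customer_no": 151, "customer_name": "Customer A", "amount": 1447},
    {"invoice_no": 102, "customer_no": 152, "customer_name": "Customer B", "amount": 4394},
    {"invoice_no": 103, "customer_no": 151, "customer_name": "Customer A", "amount": 898},
]

def decompose(rows):
    """Split the flat rows into a customer table and a sales table."""
    customers = {}  # keyed by the primary key, customer_no
    sales = []      # keeps customer_no as a foreign key
    for row in rows:
        # Each customer's name is now stored exactly once.
        customers[row["customer_no"]] = {"customer_name": row["customer_name"]}
        sales.append({
            "invoice_no": row["invoice_no"],
            "customer_no": row["customer_no"],
            "amount": row["amount"],
        })
    return customers, sales

customers, sales = decompose(flat_rows)
```

Three flat rows become two customer rows and three sales rows, and a change to a customer's name now touches a single row.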
Semantic data modeling
An alternative way to design well-structured relational databases involves semantic data
modeling. Under this approach, the database designer uses knowledge about how business
processes typically work and about the information needs associated with transaction
processing to draw a graphical picture of what should be included in the database. The resulting
figure can then be directly used to create a set of relational tables that are in 3NF.
Semantic data modeling has two significant advantages over simply following rules of
normalization. First, because it makes use of the system designer’s knowledge about business
processes and practices, it facilitates the efficient design of transaction processing databases.
Second, because the resulting graphical model explicitly represents information about the
organization’s business processes and policies, it facilitates communicating with the intended
users of the system. Such communication is extremely important in ensuring that the resulting
system meets the actual needs of users.

Database Systems and the Future of Accounting
Database systems may profoundly affect the fundamental nature of accounting. For example,
database systems may lead to the abandonment of the double-entry accounting model. The
basic rationale for the double-entry model is that the redundancy of recording the amount of a
transaction twice provides a check on the accuracy of data processing. Every transaction
generates equal debit and credit entries, and the equality of debits and credits is checked and
rechecked at numerous points in the accounting process. Data redundancy, however, is the
antithesis of the database concept. If the amounts associated with a transaction are entered into
a database system correctly, then it is necessary to store them only once, not twice. Computer
data processing is sufficiently accurate to make unnecessary the elaborate system of checks
and double checks that characterizes the double-entry accounting model.
Database systems also have the potential to significantly alter the nature of external
reporting. For example, external users could have access to the company’s database and
manipulate the data to meet their own reporting needs. Considerable time and effort are
currently invested in defining how companies should summarize and report accounting
information to external users.
Perhaps the most significant effect of database systems will be in the way accounting
information is used in decision making. The difficulty of formulating ad hoc queries (i.e.,
nonrepetitive requests for reports or answers to specific questions about the contents of the
system’s data files) in accounting systems based on traditional files or nonrelational DBMSs
meant that accountants acted, in effect, as information gatekeepers. Financial information was
readily available only in predefined formats and at specified times. Relational databases,
however, provide query languages that are powerful and easy to use. Thus, managers need not
get bogged down in procedural details about how to retrieve information. Instead, they can
concentrate solely on specifying what information they want. As a result, financial reports can
be easily prepared to cover whatever time periods managers want to examine, not just the time
frames accountants traditionally use.
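An ad hoc query of this kind might look like the following sketch, written with Python's built-in sqlite3 module. The sales table, its columns, the dates, and the amounts are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (invoice_no INTEGER PRIMARY KEY, "
    "sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (101, "2024-01-15", 1000.0),
        (102, "2024-02-03", 250.0),
        (103, "2024-02-20", 750.0),
        (104, "2024-03-09", 400.0),
    ],
)

def total_sales(start_date, end_date):
    """Total sales for any period the manager chooses (inclusive ISO dates)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales "
        "WHERE sale_date BETWEEN ? AND ?",
        (start_date, end_date),
    ).fetchone()
    return row[0]

# A calendar month, then an arbitrary period that crosses a month boundary.
february_total = total_sales("2024-02-01", "2024-02-29")
early_total = total_sales("2024-01-10", "2024-02-10")
```

The query states only what is wanted (a sum over a date range). How the rows are located and read is left to the DBMS.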
Relational DBMSs can also accommodate multiple views of the same underlying
phenomenon. For example, tables storing information about assets can include columns not
only for historical costs but also for current replacement costs and market values. Thus,
managers will no longer be forced to look at data in ways predefined by accountants.
Finally, relational DBMSs provide the capability of integrating financial and operational
data. For example, data about customer satisfaction, collected by surveys or interviews, could
be stored in the same database used to store information about current account balances and
credit limits. Managers would thus have access to a richer set of data for making tactical and
strategic decisions.
In all these ways, relational DBMSs have the potential to increase the use and value of
accounting information for making the tactical and strategic decisions involved in running an
organization. Accountants, however, must become knowledgeable about database systems so
that they can participate in designing the accounting information systems of the future. Such
participation is important for ensuring that adequate controls are included in those systems to
safeguard the data and ensure the reliability of the information produced.
DATABASE DESIGN USING THE REA DATA MODEL

Database Design Process
The five basic steps in database design include the following:
The first stage (system analysis) consists of initial planning to determine the need for and
feasibility of developing a new system. This stage includes preliminary judgments about the
proposal’s technological (i.e., whether a proposed system can be developed given the available
technology) and economic feasibility (i.e., whether the benefits of a proposed system will
exceed the cost).
It also involves identifying user information needs, defining the scope of the proposed new
system, and using information about the expected number of users and transaction volume
to make preliminary decisions about hardware and software requirements. Thus, system
analysis includes both initial planning and requirement analysis stages.
The second stage (conceptual design) includes developing the different schemas for the new
system, at the conceptual, external, and internal levels.
The third stage (physical design or coding) consists of translating the internal-level schema
into the actual database structures that will be implemented in the new system. This is also the
stage when new applications are developed.
The fourth stage (implementation and conversion) includes all the activities associated with
transferring data from existing systems to the new database AIS, testing the new system, and
training employees how to use it.
The final stage (operation and maintenance) includes using and maintaining the new system.
This includes carefully monitoring system performance and user satisfaction to determine the
need for making system enhancements and modifications.
Eventually, changes in business strategies and practices or significant new developments in
information technology initiate investigation into the feasibility of developing a new system,
and the entire process starts again.
Figure 4.1: Data Modeling in the Database Design Process
[Figure: the five stages in sequence: Systems Analysis, Conceptual Design, Physical Design,
Implementation and Conversion, Operation and Maintenance. The figure carries two annotations,
“Data Modeling Occurs Here” and “Data Modeling Used Here”.]

Accountants can and should participate in every stage of the database design process, although
the level of their involvement is likely to vary across stages.
During the systems analysis phase, accountants help evaluate project feasibility and
identify user information needs.
In the conceptual design stage, accountants participate in developing the logical schemas,
designing the data dictionary, and specifying important controls.
Accountants with good database skills may directly participate in implementing the data
model during the physical design stage.
During the implementation and conversion stage, accountants should be involved in testing
the accuracy of the new database and the application programs that will use that data, as
well as assessing the adequacy of controls.
Finally, many accountants are regular users of the organization’s database system to
process transactions and sometimes even have responsibility for its management.
Accountants may provide the greatest value to their organizations by taking responsibility for
data modeling. Data modeling is the process of defining a database so that it faithfully
represents all aspects of the organization, including its interactions with the external
environment. Data modeling occurs during both the systems analysis and conceptual design
stages of database design.
Two important tools that accountants can use to perform data modeling are:
1. Entity-relationship diagramming and
2. The REA data model
Entity-Relationship (E-R) Diagrams
An entity-relationship (E-R) diagram is a graphical technique for portraying a database
schema. It is called an E-R diagram because it shows the various entities being modeled and
the important relationships among them.
An entity is anything about which the organization wants to collect and store information. For
example, your university collects and stores information about students, courses, enrollment
activity, etc. In a relational database, separate tables would be created to store information
about each distinct entity; in an object-oriented database, separate classes would be created for
each distinct entity.
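The two representations can be sketched side by side. The student and course entities come from the example above; the column names, field names, and sample row are assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass

# Relational form: one table per distinct entity (column names are assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'A. Student')")


# Object-oriented form: one class per distinct entity.
@dataclass
class Student:
    student_id: int
    name: str


@dataclass
class Course:
    course_id: str
    title: str


# One table row, or one object, is a single instance of the entity.
row = conn.execute("SELECT student_id, name FROM student").fetchone()
student = Student(*row)
table_names = sorted(
    r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```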
In an E-R diagram, entities are depicted as rectangles. Unfortunately, however, there are no
industry standards for other aspects of E-R diagrams. Some data modelers, tools, and authors
use diamonds to depict relationships whereas others do not.
Sometimes the attributes associated with each entity are depicted as named ovals connected to
each rectangle, whereas other times the attributes associated with each entity are listed in a
separate table.
E-R diagrams can be used to represent the contents of any kind of database. Our focus is on
databases designed to support an organization’s business activities. Consequently, the E-R
diagrams we develop not only depict the contents of a database but also graphically model an
organization’s business processes.
In fact, E-R diagrams can be used not only to design databases but also to:
✓ Document and understand existing databases and
✓ Reengineer business processes

E-R diagrams can include many different kinds of entities and relationships among those
entities. An important step in database design, therefore, entails deciding which entities need
to be modeled. The REA data model is useful for making that decision.
The REA Data Model
The REA data model was developed specifically for use in designing accounting information
systems. The REA data model focuses on the business semantics underlying an organization’s
value chain activities. It provides guidance for database design by:
✓ Identifying what entities should be included in the AIS database.
✓ Prescribing how to structure relationships among the entities in that database.
REA data models are usually depicted in the form of E-R diagrams. Therefore, we refer to E-
R diagrams developed according to the REA data model as REA diagrams.
Three Basic Types of Entities
The REA data model is so named because it classifies entities into three distinct categories:
✓ The resources that the organization acquires and uses,
✓ The events (business activities) in which the organization engages,
✓ The agents participating in these events.
Resources are those things that have economic value to the organization. They are defined as
objects that are both scarce and under the control of the enterprise. Resources are used in
economic exchanges with trading partners and are either increased or decreased by the
exchange.
Events are the various business activities about which management wants to collect
information for planning or control purposes. The REA data model helps people design
databases that support the management of an organization’s value chain activities. Therefore,
most of the events in an REA data model fall into one of two categories: economic exchanges
or commitments. Economic exchanges are the value chain activities that directly affect the
quantity of resources. For example, the sales event decreases the quantity of inventory and
the cash receipts event increases the amount of cash. Commitments represent promises to
engage in future economic exchanges. For example, customer orders are commitments that
lead to future sales. Often such commitments are necessary precursors to the subsequent
economic exchange. Moreover, management needs to track commitments for planning
purposes. For example, manufacturing firms often use information from customer orders to
plan production.
Agents are the people and organizations that participate in events and about whom information
is desired for planning, control, and evaluation purposes.
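One way to picture the three categories is to tag each entity with its REA type. The sketch below is an assumed representation, not part of the REA model itself: the class names, the helper function, and the particular list of revenue cycle entities are chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntityType(Enum):
    RESOURCE = "resource"
    EVENT = "event"
    AGENT = "agent"


class EventKind(Enum):
    ECONOMIC_EXCHANGE = "economic exchange"
    COMMITMENT = "commitment"


@dataclass(frozen=True)
class Entity:
    name: str
    entity_type: EntityType
    event_kind: Optional[EventKind] = None  # meaningful only for events


# Revenue cycle entities, tagged by REA category.
REVENUE_CYCLE = [
    Entity("Inventory", EntityType.RESOURCE),
    Entity("Cash", EntityType.RESOURCE),
    Entity("Take Customer Order", EntityType.EVENT, EventKind.COMMITMENT),
    Entity("Sale", EntityType.EVENT, EventKind.ECONOMIC_EXCHANGE),
    Entity("Receive Cash", EntityType.EVENT, EventKind.ECONOMIC_EXCHANGE),
    Entity("Employee", EntityType.AGENT),
    Entity("Customer", EntityType.AGENT),
]


def names_of(entity_type):
    """List the names of all entities in one REA category."""
    return [e.name for e in REVENUE_CYCLE if e.entity_type is entity_type]
```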
Differences between ER and REA Diagrams
ER and REA diagrams differ visually in a significant way. Entities in ER diagrams are of one
class, and their proximity to other entities is determined by their cardinality and by what is
visually pleasing to keep the diagrams readable. Entities in an REA diagram, however, are
divided into three classes (resources, events, and agents) and organized into constellations by
class on the diagram.

Structuring Relationships: The Basic REA Template

The REA data model prescribes a basic pattern for how the three types of entities (resources,
events, and agents) should relate to one another. The essential features of the pattern are:
1. Each event is linked to at least one resource that it affects.
2. Each event is linked to at least one other event.
3. Each event is linked to at least two participating agents.
Rule 1: Every event entity must be linked to at least one resource entity
Events must be linked to at least one resource that they affect. Events, such as the sale of
merchandise, that change the quantity of a resource are linked to that resource in what is called
a stockflow relationship. Other events, such as taking a customer order, that represent future
commitments are linked to resources in what are called reserve relationships.
Some events affect the quantity of a resource:
• If an event increases the quantity of a resource, it is called a “get” event.
• If an event decreases the quantity of a resource, it is called a “give” event.
• For example, if you purchase inventory for cash:
✓ The get event is that you receive inventory.
✓ The give event is that you pay cash.
Relationships that affect the quantity of a resource are sometimes referred to as stockflow
relationships because they represent either an inflow or outflow of that resource.
Not every event directly alters the quantity of a resource, however. For example, orders from
customers represent commitments that will eventually result in a future sale of merchandise,
just as orders to suppliers represent commitments that will eventually result in the subsequent
purchase of inventory.
Organizations do, however, need to track the effects of commitments, both to provide better
service and for planning purposes. For example, customer orders reduce the quantity available
of the specific inventory items being ordered. Sales staffs need to know this information to be
able to properly respond to subsequent customer inquiries and orders. Manufacturing
companies may use information about customer orders to plan production.
Rule 2: Every event entity must be linked to at least one other event entity.
Give and get events are linked together in what is labeled an economic duality relationship.
Such give-to-get duality relationships reflect the basic business principle that organizations
typically engage in activities that use up resources only in the hopes of acquiring other resources
in exchange. For example, the sales event, which requires giving up (decreasing) inventory,
is related to the receive cash event, which requires getting (increasing) the amount of cash.
Each accounting cycle can be described in terms of give-to-get economic duality relationships.
✓ The revenue cycle involves interactions with your customers. You sell goods or
services and get cash.
✓ The expenditure cycle involves interactions with your suppliers. You buy goods
or services and pay cash.
✓ In the production cycle, raw materials, labor, and machinery and equipment
time are transformed into finished goods.
✓ The human resources cycle involves interactions with your employees.
Employees are hired, trained, paid, evaluated, promoted, and terminated.

✓ The financing cycle involves interactions with investors and creditors. You
raise capital (through stock or debt), repay the capital, and pay a return on it
(interest or dividends).
Not every relationship between two events represents a give-to-get economic duality, however.
Commitment events are linked to other events to reflect sequential cause-effect relationships.
For example, the take customer order event would be linked to the sales event to reflect the
fact that such orders precede and result in sales. Similarly, the order inventory (purchase)
event would be linked to the receive inventory event to reflect another sequential cause-effect
relationship.
Rule 3: Every event entity must be linked to at least two participating agents
For accountability, organizations need to be able to track actions of employees. Organizations
also need to monitor the status of commitments and the economic duality exchanges they engage
in with outside parties.
For events that involve transactions with external parties, the internal agent is the employee
who is responsible for the resource affected by that event and the external agent is the outside
party to the transaction. For internal events, such as transfer of raw materials from the
storeroom to the production floor, the internal agent is the employee who is giving up
responsibility for or custody of the resource and the external agent is the employee who
is receiving custody of or assuming responsibility for that resource.
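The three rules of the basic REA template can be checked mechanically. The following is a minimal sketch under assumed names: the Event class, its fields, and the template_violations function are hypothetical and serve only to restate the rules.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    resources: set = field(default_factory=set)      # linked resource entities
    linked_events: set = field(default_factory=set)  # linked event entities
    agents: set = field(default_factory=set)         # participating agents


def template_violations(event):
    """Return the basic REA template rules that this event does not satisfy."""
    problems = []
    if len(event.resources) < 1:
        problems.append("Rule 1: no linked resource")
    if len(event.linked_events) < 1:
        problems.append("Rule 2: no linked event")
    if len(event.agents) < 2:
        problems.append("Rule 3: fewer than two agents")
    return problems


# A sale linked to inventory, to two other events, and to two agents.
sale = Event(
    "Sale",
    resources={"Inventory"},
    linked_events={"Receive Cash", "Take Customer Order"},
    agents={"Salesperson", "Customer"},
)

# A draft event that is missing its internal agent.
draft_receipt = Event(
    "Receive Cash",
    resources={"Cash"},
    linked_events={"Sale"},
    agents={"Customer"},
)
```

Running template_violations on a draft model is one way to spot entities that still need a relationship before cardinalities are assigned.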
DEVELOPING AN REA DIAGRAM
To design an REA diagram for an entire AIS, one would develop a model for each transaction
cycle and then integrate the separate diagrams into an enterprise-wide model.
In this chapter, we focus on the individual transaction cycles.
Developing an REA diagram for a specific transaction cycle consists of the following three
steps:
Step 1: Identify the events about which management wants to collect information.
Step 2: Identify the resources affected by each event and the agents who participate in
those events.
Step 3: Determine the cardinalities of each relationship.
• Let’s walk through an example.
Step 1: Identify Relevant Events
The first step in developing an REA model of a transaction cycle is to identify the events of
interest to management. At a minimum, every REA model must include the two events that
represent the basic “give-to-get” economic exchange performed in that particular transaction
cycle.
The “give” event represents an activity that reduces the organization’s stock of a resource
that has economic value; conversely, the “get” event represents an activity that increases
the organization’s stock of an economic resource. Usually there are other events that
management is interested in planning, controlling, and monitoring that also need to be included
in the REA model.
A solid understanding of basic business activities is needed to identify which events comprise
the basic give-to-get economic duality relationships.
For example, typical activities in the revenue cycle include:

1. Take customer order
2. Fill customer orders
3. Bill customers
4. Collect payment from customers
• Analysis of the first activity, taking the customer order, indicates that it does not
involve either the acquisition of resources from or provision of resources to an external
party. It is only a commitment to perform such actions in the future.
• The second activity, fill the customer orders, involves a reduction in the company’s
inventory. It is a give event.
• The third activity, billing customers, involves the exchange of information with an
external party but does not directly increase or reduce the quantity of any economic
resource. Printing an invoice and mailing it to a customer does not increase or decrease
the amount of any resource. Neither does it represent the organization’s commitment
to engage in a future economic exchange. The customer’s obligation to pay the
organization arises not from the billing activity, but from the delivery of the
merchandise. The billing activity is simply an information-processing event that merely
retrieves information from the database about previous customer orders and sales
events. Organizations build databases to collect, process, and store information about
their value chain activities. Printing documents and reports or querying the database,
however, are just different ways of retrieving information about those activities for use
in making decisions. Such information-processing activities do not change the contents
of the database and, therefore, are not modeled as events in an REA diagram.
Consequently, the activity of printing and mailing invoices does not need to appear in
an REA diagram of an organization’s revenue cycle.
• Finally, the analysis of the fourth activity, collect payment from customers, indicates
that it results in an increase in the organization’s supply of an economic resource (cash)
as a result of receiving it from an external party (the customer).
Consequently, analysis of the basic business activities performed in the revenue cycle indicates
that the basic give-to-get economic exchange consists of two events: fill customer order
(usually referred to as the “sale event”) and collect payments from customers (often called
the “receive cash event”). Similar analyses of the business activities performed in the other
transaction cycles result in identification of the other give-to-get economic duality
relationships.
Step 2: Identify Resources and Agents
Once the relevant events have been specified, the resources that are affected by those events
need to be identified.
This involves answering three questions:
1. What economic resource is reduced by the “give” event?
2. What economic resource is acquired by the “get” event?
3. What economic resource is affected by a commitment event?
Again, a solid understanding of business processes makes it easy to answer these questions.
For example, the sales event involves giving inventory to customers and the cash receipts
event involves receiving cash (whether in the form of money, checks, credit card, or debit card)
from customers. Although organizations typically use multiple accounts to track cash and cash
equivalents (e.g., operating checking account, petty cash, and short-term investments), these
are all summarized in one balance sheet account called cash. Thus, in a relational
database, the “cash” table would contain a separate row for each specific account (e.g., petty
cash, checking account, etc.). Finally, the take customer order event involves setting aside
merchandise for a specific customer. To maintain accurate inventory records, and to facilitate
timely reordering to avoid stockouts, each take customer order event should result in reducing
the quantity available of that particular inventory item.
What about accounts receivable? Accounts receivable is not modeled as a separate entity
because it is not an independent object, but simply represents a timing difference between two
events: sales and cash receipts. That is, accounts receivable simply represents sales for which
customer payments have not yet been received. Consequently, if data about both sales and cash
collections are already stored in the database, all the information needed to calculate accounts
receivable can be derived from the information stored about those two events.
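The derivation can be sketched as a query over the two event tables. The table layout, column names, and amounts below are assumptions made for illustration; in particular, cash receipts are matched to sales only at the customer level.

```python
import sqlite3

# Hypothetical event tables; no accounts receivable table is stored.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    CREATE TABLE cash_receipts (receipt_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO sales VALUES (1, 10, 300.0), (2, 10, 200.0), (3, 20, 150.0);
    INSERT INTO cash_receipts VALUES (1, 10, 300.0), (2, 20, 50.0);
    """
)


def accounts_receivable():
    """Derive each customer's receivable as total sales minus total cash received."""
    query = """
        SELECT s.customer_id,
               s.total_sales - COALESCE(r.total_received, 0) AS balance
        FROM (SELECT customer_id, SUM(amount) AS total_sales
              FROM sales GROUP BY customer_id) AS s
        LEFT JOIN (SELECT customer_id, SUM(amount) AS total_received
                   FROM cash_receipts GROUP BY customer_id) AS r
               ON r.customer_id = s.customer_id
        ORDER BY s.customer_id
    """
    return dict(conn.execute(query).fetchall())


receivables = accounts_receivable()
```

Because the balance is computed on demand, it cannot drift out of step with the underlying sales and cash receipt rows.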
In addition to specifying the resources affected by each event, it is also necessary to identify
the agents who participate in those events. There will always be at least one internal agent
(employee) and, in most cases, an external agent (customer or supplier) who participate in
each event.
It is important to understand that the agents in an REA data model represent functions, not
specific people. For example, in a cash sale, the salesperson also may act as the cashier and
collect payment from the customer. The REA diagram would still include two agents to
model this situation, however.
Step 3: Determine Cardinalities of Relationships
The final step in drawing an REA diagram for one transaction cycle is to add information about
the nature of the relationships between the various entities (cardinalities). The degree of the
relationship, called cardinality, is the numerical mapping between entity instances. Cardinality
reflects normal business rules as well as organizational policy. Cardinalities describe the nature
of the relationship between two entities by indicating how many instances of one entity can be
linked to each specific instance of another entity.
Consider the relationship between the customer agent entity and the sales event entity. Each
entity in an REA diagram represents a set. For example, the customer entity represents the set
of the organization’s customers, and the sales entity represents the set of individual sales
transactions that occur during the current fiscal period. Each individual customer or sales
transaction represents a specific instance of that entity. Thus, in a relational database, each row
in the “customer” table would store information about a particular customer and each row in
the “sales” table would store information about a specific sales transaction. Cardinalities
define how many sales transactions (instances of the sales entity) can be associated with each
customer (instance of the customer entity) and, conversely, how many customers can be
associated with each sales transaction. Thus, cardinality is the degree of association between
two entities. Simply stated, cardinality describes the number of possible occurrences in one
table that are associated with a single occurrence in a related table.

Unfortunately, no universal standard exists for representing information about cardinalities in
REA diagrams. We adopt the graphical “crow’s feet” notation style for representing
cardinality information because:
✓ It is becoming increasingly popular.
✓ It is used by many software design tools.
Using the crow’s feet notation:
✓ The symbol for zero is a circle: O
✓ The symbol for one is a single stroke: |
✓ The symbol for many is the crow’s foot, a three-pronged fork.
Cardinalities are represented by the pair of symbols (or numbers) next to an entity. The first
number is the minimum cardinality. It indicates whether a row in this table must be linked to at
least one row in the table on the opposite side of that relationship. The minimum cardinality
can be either zero (0) or one (1), depending upon whether the relationship between the two
entities is optional (the minimum cardinality is zero) or mandatory (the minimum cardinality
is one).
It asks the following question:
❖ Must each row in this table be linked to at least one row in the other table?
✓ Yes = 1
✓ No = 0
A minimum cardinality of zero (0) means that a new row can be added to that table without
being linked to any specific rows in the table on the other side of the relationship. In contrast,
a minimum cardinality of one (1) means that each row in that table must be linked to at least
one row in the other table participating in that relationship.
The second number in each cardinality pair is the maximum cardinality. It indicates whether
one row in that table can be linked to more than one row in the other table. The maximum
cardinality can be either one or many (the crow’s feet symbol), depending upon whether each
instance in that table can be linked to at most one instance in the table on the other side of
the relationship.
It asks the following question:
❖ Can a row in this table be linked to more than one row in the other table?
✓ Yes = N
✓ No = 1
A maximum cardinality of 1 means that each row in that table can be linked to, at most, only
one row in the other table. A maximum cardinality of N (which stands for many) means that
each row in that table can be linked to more than one row in the other table.
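A cardinality pair can be written down as a small data structure. The sketch below is an assumed representation (the Cardinality class, the MANY constant, and the allows method are hypothetical names) that tests whether a given number of linked rows is permitted.

```python
from dataclasses import dataclass

MANY = "N"  # stands for "many"


@dataclass(frozen=True)
class Cardinality:
    minimum: int     # 0 = optional, 1 = mandatory
    maximum: object  # 1 or MANY

    def __post_init__(self):
        if self.minimum not in (0, 1):
            raise ValueError("minimum cardinality must be 0 or 1")
        if self.maximum not in (1, MANY):
            raise ValueError("maximum cardinality must be 1 or N")

    def allows(self, linked_rows):
        """True if one row may be linked to `linked_rows` rows in the other table."""
        if linked_rows < self.minimum:
            return False
        return self.maximum == MANY or linked_rows <= 1


optional_many = Cardinality(0, MANY)   # (0,N)
mandatory_one = Cardinality(1, 1)      # (1,1)
```

For example, mandatory_one.allows(0) is False because the minimum of one is not met, while optional_many.allows(0) is True.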
Three types of relationships
Three basic types of relationships between entities are possible, depending on the maximum
cardinality associated with each entity (the minimum cardinality does not matter):
1. A one-to-one (1:1) relationship exists when the maximum cardinality for each entity in
the relationship is one.

For example: Purchase (1,1) ---- (1,1) Cash Payment   [1:1]

This would reflect a policy that purchases are made for cash and each invoice is settled
independently.
2. A one-to-many (1:N) relationship exists when the maximum cardinality of one entity in
the relationship is 1 and the maximum cardinality for the other entity in that relationship is
many.
For example: Sale (0,1) ---- (1,n) Cash Collection   [1:n]

This would reflect a policy that sales are made on credit, installment collections are not
allowed, and several invoices can be settled at once.
3. A many-to-many (M:N) relationship exists when the maximum cardinality for both
entities in the relationship is many.

For example: Sale (0,n) ---- (1,n) Receive Cash   [m:n]

It shows that each sales event may be linked to one or more cash receipt events and that each
cash receipt event may in turn be linked to one or more sales events. This reflects an
organization that has business policies that allow customers to make installment payments and
also permits customers to accumulate a balance representing a set of sales transactions over a
period of time. Keep in mind, however, that maximum cardinalities of N do not represent
mandatory practice.
Caution: Do not confuse the notation used for minimum and maximum cardinalities (a pair of
numbers separated by a comma) with the notation used to describe the cardinality of a
relationship between two entities (a pair of numbers separated by a colon).
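Since the relationship type depends only on the two maximum cardinalities, the classification can be sketched as a short function. The function name and the tuple representation of a (minimum, maximum) pair are assumptions for illustration; the three calls reproduce the three examples given above.

```python
MANY = "N"  # stands for "many"


def relationship_type(card_a, card_b):
    """Classify a relationship from two (minimum, maximum) cardinality pairs.

    Only the maximums matter; the minimums are ignored.
    """
    maximums = (card_a[1], card_b[1])
    for m in maximums:
        if m not in (1, MANY):
            raise ValueError(f"maximum cardinality must be 1 or N, got {m!r}")
    many_count = maximums.count(MANY)
    return {0: "1:1", 1: "1:N", 2: "M:N"}[many_count]


purchase_payment = relationship_type((1, 1), (1, 1))         # Purchase / Cash Payment
sale_collection = relationship_type((0, 1), (1, MANY))       # Sale / Cash Collection
sale_receive_cash = relationship_type((0, MANY), (1, MANY))  # Sale / Receive Cash
```

Note that the comma-separated pairs are the inputs and the colon-separated label is the output, which mirrors the caution about the two notations.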
Business Meaning of Cardinalities (Rules for Specifying Cardinalities)
The choice of cardinalities is not arbitrary, but reflects facts about the organization being
modeled and its business practices. This information is obtained during the systems analysis
and conceptual design stages of the database design process.

Cardinality rules for agent-event relationships


In relationships between events and agents, the minimum and maximum cardinalities
associated with the event entity are almost always both one (1). The minimum
cardinality associated with the event entity is 1 because
there must be some agent who participates in that event. For example, when a sale occurs:
• There is usually one and only one customer.
• There is usually one and only one salesperson. This practice makes it more feasible
for the organization to establish employee accountability for the event.
The maximum cardinality is usually also 1, because the organization wants to be able to hold
some specific agent responsible for that event.

There is also a general principle concerning the cardinalities associated with the agent entity
in agent-event relationships. The cardinalities associated with each agent in the agent-event
relationships all have zero minimums and N maximums. The maximum cardinality associated
with internal agent entities in agent-event relationships is almost always N, because
organizations expect that their employees will participate in numerous events. It is also usually
N for external agents because organizations often engage in repeat transactions with the same
suppliers and customers.
There are two reasons why the minimum cardinality associated with agent entities in agent-
event relationships is usually zero (0).
a) Organizations want to be able to add information about potential customers and suppliers
even though those agents may not have participated (yet) in any business transactions.
b) Event entities are analogous to transaction files, whereas agent entities are analogous to
master files. At the end of a fiscal year, the contents of event tables are typically archived
and the new fiscal year begins with no rows in the various event tables. In contrast,
information about agents is permanent in nature and is carried over from one fiscal period
to the next. Therefore, at the beginning of a new fiscal year, customers may not be linked
to any current sales events.
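These typical agent-event cardinalities can be sketched with table constraints. This is one possible mapping, assumed for illustration: the table and column names are hypothetical, a required foreign key stands in for the (1,1) on the event side, and the absence of any constraint on the customer table reflects the (0,N) on the agent side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE sale (
           sale_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           amount REAL NOT NULL)"""
)

# Minimum 0 on the agent side: a potential customer with no sales is allowed.
conn.execute("INSERT INTO customer VALUES (1, 'Prospect Co.')")
# Maximum N on the agent side: the same customer can appear in many sales.
conn.execute("INSERT INTO customer VALUES (2, 'Repeat Buyer Inc.')")
conn.execute("INSERT INTO sale VALUES (100, 2, 75.0)")
conn.execute("INSERT INTO sale VALUES (101, 2, 20.0)")


def try_insert_sale(sale_id, customer_id, amount):
    """Return True if the sale is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO sale VALUES (?, ?, ?)", (sale_id, customer_id, amount))
        return True
    except sqlite3.IntegrityError:
        return False


sales_for_repeat_buyer = conn.execute(
    "SELECT COUNT(*) FROM sale WHERE customer_id = 2"
).fetchone()[0]
```

A sale with no customer, or with a customer that does not exist, is rejected, while a customer row with no sales is stored without complaint.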
Cardinality rules for resource-event relationships
The minimum and maximum cardinalities associated with each resource in resource-event
relationships are zero (0) and N, respectively. This is typical for most organizations for most
resources, for the same reasons given earlier when explaining the typical minimum and
maximum cardinalities associated with agents in agent-event relationships.
One exception to this general rule is that the maximum cardinality associated with the inventory
resource is sometimes one (1). A company that sells low-cost, mass-produced items usually tracks
inventory by product type, so one inventory row can be linked to many sales and the maximum
cardinality is N. Sometimes, however, organizations do track specific physical inventory items.
Examples include original artwork, vehicles, or houses. For such merchandise, each row in
the inventory table would represent a specific painting or house and would be identified by
a primary key that is some type of serial ID number. In such cases, a given row in the
inventory table could be associated with at most one sales transaction and, accordingly,
would have a maximum cardinality of 1 instead of N.
The minimum cardinality associated with event entities in resource-event
relationships is usually 1. For example, each sales event involves at least one row in the
inventory table (for a sale to occur, a company has to sell something). Similarly, each payment
received from a customer must be deposited into some cash account. The only exception to this
general rule arises if an event potentially can be linked to more than one resource entity.
Consider an automobile repair business. Some services, such as tire rotations, may not
include the sale of any parts; whereas other services, such as a brake repair, include both
labor and parts. Thus, the sales event for such an auto repair business could be linked to an
inventory entity, or to a repair services entity; or to both types of resources. Consequently,
the minimum cardinality for the sales event would be 0 in both of those relationships. Note
that, in rare situations, an event might be linked to one of several unique agent entities. In
such cases, the minimum cardinality associated with the event entity again would be 0
instead of the normal 1.

There are no general principles concerning the maximum cardinality associated with event
entities in resource-event relationships, however. Instead, the maximum cardinality for an
event depends on the nature of the resource affected by that event and by the organization’s
business policies.
Cardinality rules for event-event relationships
Almost any kind of cardinality pair is possible for each event entity in event-event
relationships. The organization’s business practices and policies must be understood to decide
which possibility is correct.
The only general modeling principle that applies to event-event relationships is that for two
temporally ordered events, the minimum cardinality for the first event is zero, because at the
time it occurs, the other event has not yet happened. Often, but not always, the minimum
cardinality for the event that happens second is one, indicating that the first event had to have
already occurred. For example, for companies that sell to customers either through catalogs
or on the web, customer orders (event 1) precede shipments to customers (event 2).
Sometimes, however, both of the events need not occur.
Uniqueness of REA diagrams
Each organization will have its own unique REA diagram. At a minimum, because business
practices differ across companies, cardinalities and relationships will differ. In fact, an REA
diagram for a given organization will have to change to reflect changes to existing business
practices. Sometimes, differences in business practices can result in different entities being
modeled.
Data modeling can be complex and repetitive. Frequently, data modelers develop an initial
REA diagram that reflects their understanding of the organization’s business processes, only to
learn when showing it to intended users that they had omitted key dimensions or misunderstood
some operating procedures. Thus, data modelers must discuss their drafts of models with
intended users to ensure that:
• Key dimensions are not omitted or misunderstood.
• Terminology is consistent.
