Module 1 - Database Systems
INFORMATION
MANAGEMENT
Module 1 – DATABASE CONCEPTS
(Database Systems)
Description
This course covers information management,
database design, data modeling, SQL, and
implementation using a relational database system.
Why Databases?
Imagine trying to operate a business without knowing who your customers are, what products you are
selling, who is working for you, who owes you money, and whom you owe money. All businesses have to keep
this type of data and much more; and just as importantly, they must have those data available to decision
makers when they need them. It can be argued that the ultimate purpose of all business information systems
is to help businesses use information as an organizational resource. At the heart of all of these systems are the
collection, storage, aggregation, manipulation, dissemination, and management of data.
Depending on the type of information system and the characteristics of the business, these data could vary
from a few megabytes on just one or two topics to terabytes covering hundreds of topics within the business’s
internal and external environment. Telecommunications companies such as Sprint and AT&T are known to have
systems that keep data on trillions of phone calls, with new data being added to the system at speeds up to
70,000 calls per second! Not only do these companies have to store and manage these immense collections of
data, they have to be able to find any given fact in that data quickly. Consider the case of Internet search staple
Google.
While Google is reluctant to disclose many details about its data storage specifications, it is estimated that
the company responds to over 91 million searches per day across a collection of data that is several terabytes
in size. Impressively, the results of these searches are available nearly instantly.
How can these businesses process this much data? How can they store it all, and then quickly retrieve just
the facts that decision makers want to know, just when they want to know it?
The answer is that they use databases. Databases are specialized structures that allow
computer-based systems to store, manage, and retrieve data very quickly. Virtually all modern
business systems rely on databases; therefore, a good understanding of how these structures are
created and their proper use is vital for any information systems professional.
Even if your career does not take you down the amazing path of database design and development,
databases will be a key component underpinning the systems that you work with. In any case, it is very likely
that, in your career, you will be making decisions based on information generated from data. Thus, it is
important that you know the difference between data and information.
Answer the Progress Assessment question on the Learning
Management System.
To understand what drives database design, you must understand the difference between data and
information. Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal their
meaning.
For example, suppose that you want to know what the users of a computer lab think of its services. Typically,
you would begin by surveying users to assess the computer lab’s performance.
Figure 1.1, Panel A, shows the Web survey form that enables users to respond to your questions. When
the survey form has been completed, the form’s raw data are saved to a data repository, such as the
one shown in Figure 1.1, Panel B. Although you now have the facts in hand, they are not particularly
useful in this format—reading page after page of zeros and ones is not likely to provide much insight.
Remember: To make the facts useful, you transform the raw data into a data summary like the one shown in
Figure 1.1, Panel C. Now it’s possible to get quick answers to questions such as “What is the
composition of our lab’s customer base?” In this case, you can quickly determine that most of
your customers are juniors (24.59%) and seniors (53.01%).
Because graphics can enhance your ability to quickly extract meaning from data, you show the data
summary bar graph in Figure 1.1, Panel D.
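The step from raw data to a data summary can be sketched in a few lines of Python. The responses below are hypothetical sample values, not the survey data behind Figure 1.1, and the function name summarize is an assumption chosen for illustration.

```python
from collections import Counter

# Hypothetical raw data: one student classification per survey response.
raw_responses = [
    "Senior", "Junior", "Senior", "Freshman", "Senior",
    "Junior", "Graduate Student", "Senior", "Sophomore", "Junior",
]

def summarize(responses):
    """Return each classification's share of the responses as a percentage."""
    counts = Counter(responses)
    total = len(responses)
    return {label: round(100 * n / total, 2) for label, n in counts.items()}

summary = summarize(raw_responses)

# Print the largest groups first, much like a data summary table.
for label, pct in sorted(summary.items(), key=lambda item: -item[1]):
    print(f"{label}: {pct}%")
```

The list of individual answers is the raw data; the percentages are the information a decision maker would actually read.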
Information is the result of processing raw data to reveal its meaning. Data processing can be as simple
as organizing data to reveal patterns or as complex as making forecasts or drawing inferences using statistical
modeling. To reveal meaning, information requires context.
For example, an average temperature reading of 105 degrees does not mean much unless you also
know its context: Is this in degrees Fahrenheit or Celsius? Is this a machine temperature, a body
temperature, or an outside air temperature?
For example, the data summary for each question on the survey form can point out the lab’s strengths
and weaknesses, helping you to make informed decisions to better meet the needs of lab customers.
Keep in mind that raw data must be properly formatted for storage, processing, and presentation.
For example, in Panel C of Figure 1.1, the student classification is formatted to show the results
based on the classifications Freshman, Sophomore, Junior, Senior, and Graduate Student. The
respondents’ yes/no responses might need to be converted to a Y/N format for data storage.
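As a minimal sketch of the formatting step described above, the following Python function converts free-form yes/no answers into a Y/N storage format. The function name to_yn and the accepted spellings are assumptions made for this example.

```python
def to_yn(answer):
    """Convert a free-form yes/no answer to a single-character Y/N code."""
    normalized = answer.strip().lower()
    if normalized in {"yes", "y"}:
        return "Y"
    if normalized in {"no", "n"}:
        return "N"
    # Reject anything else rather than silently storing bad data.
    raise ValueError(f"Unrecognized answer: {answer!r}")

# Hypothetical responses as they might arrive from a survey form.
stored = [to_yn(a) for a in ["Yes", " no", "YES", "n"]]
print(stored)
```

Rejecting unrecognized answers, instead of guessing, is one simple way to keep inaccurate data out of storage.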
Timely and useful information requires accurate data. Such data must be properly generated and stored
in a format that is easy to access and process. And, like any basic resource, the data environment must be
managed carefully. Data management is a discipline that focuses on the proper generation, storage, and retrieval
of data. Given the crucial role that data play, it should not surprise you that data management is a core activity
for any business, government agency, service organization, or charity.
Efficient data management typically requires the use of a computer database. A database is a shared, integrated
computer structure that stores a collection of:
• End-user data, that is, raw facts of interest to the end user.
• Metadata, or data about data, through which the end-user data are integrated and managed.
The metadata provide a description of the data characteristics and the set of relationships that links the data
found within the database.
For example, the metadata component stores information such as the name of each data element, the
type of values (numeric, dates, or text) stored on each data element, whether or not the data element
can be left empty, and so on.
Remember: The metadata provide information that complements and expands the value and
use of the data. In short, metadata present a more complete picture of the data in the database.
Given the characteristics of metadata, you might hear a database described as a “collection of
self-describing data.”
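One way to see metadata in practice is to ask a DBMS to describe one of its own tables. The sketch below uses SQLite through Python's standard sqlite3 module; the customer table and its column names are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        cus_code    INTEGER NOT NULL,
        cus_name    TEXT NOT NULL,
        cus_since   DATE,
        cus_balance NUMERIC
    )
    """
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
metadata = conn.execute("PRAGMA table_info(customer)").fetchall()

for _cid, name, col_type, notnull, _default, _pk in metadata:
    can_be_empty = "no" if notnull else "yes"
    print(f"{name}: type={col_type}, can be left empty={can_be_empty}")
```

No end-user data has been stored yet, but the database can already report the name of each data element, its type, and whether it may be left empty.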
A database management system (DBMS) is a collection of programs that manages the database
structure and controls access to the data stored in the database. In a sense, a database resembles a very well-
organized electronic filing cabinet in which powerful software, known as a database management system, helps
manage the cabinet’s contents.
The DBMS serves as the intermediary between the user and the database. The database structure itself
is stored as a collection of files, and the only way to access the data in those files is through the DBMS.
FIGURE 1.2 - The DBMS manages the interaction between the end user and the database
• Figure 1.2 emphasizes the point that the DBMS presents the end user (or application program) with
a single, integrated view of the data in the database.
• The DBMS receives all application requests and translates them into the complex operations required
to fulfill those requests.
• The DBMS hides much of the database’s internal complexity from the application programs and
users.
• The application program might be written by a programmer using a programming language such
as Visual Basic.NET, Java, or C#, or it might be created through a DBMS utility program.
Remember: Having a DBMS between the end user’s applications and the database offers some
important advantages. First, the DBMS enables the data in the database to be shared among
multiple applications or users. Second, the DBMS integrates the many different users’ views of the
data into a single all-encompassing data repository.
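The intermediary role described above can be illustrated with a small Python sketch in which SQLite, accessed through the standard sqlite3 module, stands in for the DBMS. The two "applications" are hypothetical functions, and the product table and its columns are assumptions made for this example.

```python
import sqlite3

# One shared database; sqlite3 plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (p_code TEXT PRIMARY KEY, p_descript TEXT, p_qoh INTEGER)"
)

def receiving_app(db, code, description, quantity):
    """Hypothetical application that records newly received stock."""
    with db:  # commits on success, rolls back if an error occurs
        db.execute(
            "INSERT INTO product VALUES (?, ?, ?)", (code, description, quantity)
        )

def reporting_app(db):
    """Hypothetical application that reads the same data for a report."""
    return db.execute(
        "SELECT p_code, p_qoh FROM product ORDER BY p_code"
    ).fetchall()

receiving_app(conn, "P-100", "Claw hammer", 23)
receiving_app(conn, "P-200", "Hand saw", 8)
report = reporting_app(conn)
print(report)
```

Neither function opens or parses the underlying storage itself. Both issue requests to the DBMS, which is what allows them to share one consistent copy of the data.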
Because data are the crucial raw material from which information is derived, you must have a good method to
manage such data. The DBMS helps make data management more efficient and effective. In particular, a DBMS
provides advantages such as:
• Improved data sharing. The DBMS helps create an environment in which end users have better
access to more and better-managed data. Such access makes it possible for end users to
respond quickly to changes in their environment.
• Improved data security. The more users access the data, the greater the risks of data security
breaches. A DBMS provides a framework for better enforcement of data privacy and security
policies.
• Better data integration. Wider access to well-managed data promotes an integrated view of the
organization’s operations and a clearer view of the big picture. It becomes much easier to see
how actions in one segment of the company affect other segments.
• Minimized data inconsistency. Data inconsistency exists when different versions of the same
data appear in different places. The probability of data inconsistency is greatly reduced in a
properly designed database.
• Improved data access. The DBMS makes it possible to produce quick answers to ad hoc queries.
From a database perspective, a query is a specific request issued to the DBMS for data
manipulation—for example, to read or update the data. Simply put, a query is a question, and
an ad hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the
query result set) to the application.
For example, end users, when dealing with large amounts of sales data, might want
quick answers to questions (ad hoc queries) such as:
1. What was the dollar volume of sales by product during the past six months?
2. What is the sales bonus figure for each of our salespeople during the past three
months?
3. How many of our customers have credit balances of $3,000 or more?
• Improved decision making. Better-managed data and improved data access make it possible
to generate better-quality information, on which better decisions are based. The quality of the
information generated depends on the quality of the underlying data. Data quality is a
comprehensive approach to promoting the accuracy, validity, and timeliness of the data. While
the DBMS does not guarantee data quality, it provides a framework to facilitate data quality
initiatives.
• Increased end-user productivity. The availability of data, combined with the tools that transform
data into usable information, empowers end users to make quick, informed decisions that can
make the difference between success and failure in the global economy.
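Two of the ad hoc queries listed under improved data access can be sketched with SQLite and Python's standard sqlite3 module. The tables, column names, and rows are hypothetical sample data, and the "past six months" date filter is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cus_name TEXT, cus_balance REAL);
    CREATE TABLE sale (product TEXT, amount REAL);
    INSERT INTO customer VALUES
        ('Ramos', 4200.00), ('Orlando', 150.75), ('Smith', 3000.00);
    INSERT INTO sale VALUES
        ('Hammer', 40.00), ('Saw', 25.50), ('Hammer', 60.00);
    """
)

# Ad hoc query: how many customers have balances of $3,000 or more?
(high_balance_count,) = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE cus_balance >= 3000"
).fetchone()

# Ad hoc query: what was the dollar volume of sales by product?
sales_by_product = conn.execute(
    "SELECT product, SUM(amount) FROM sale GROUP BY product ORDER BY product"
).fetchall()

print(high_balance_count)
print(sales_by_product)
```

Each value returned by the DBMS is a query result set in the sense used above: the answer to one spur-of-the-moment question.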
Types of Databases
A DBMS can support many different types of databases. Databases can be classified according to the number
of users, the database location(s), and the expected type and extent of use.
The number of users determines whether the database is classified as single-user or multiuser.
• A single-user database supports only one user at a time. In other words, if user A is using
the database, users B and C must wait until user A is done.
• A single-user database that runs on a personal computer is called a desktop database.
• In contrast, a multiuser database supports multiple users at the same time.
• When the multiuser database supports a relatively small number of users (usually fewer than
50) or a specific department within an organization, it is called a workgroup database.
• When the database is used by the entire organization and supports many users (more than
50, usually hundreds) across many departments, the database is known as an enterprise
database.
• A database that supports data located at a single site is called a centralized database.
• A database that supports data distributed across several different sites is called a
distributed database.
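The classification rules in the list above can be expressed as a short Python function. This is a simplified sketch: the function and parameter names are assumptions, the "specific department" criterion for workgroup databases and the desktop case are not modeled, and treating exactly 50 users as enterprise is a choice made here because the text gives only approximate thresholds.

```python
def classify_database(max_concurrent_users, total_users, site_count):
    """Return the classification labels that apply to a database."""
    labels = []
    if max_concurrent_users <= 1:
        labels.append("single-user")
    else:
        labels.append("multiuser")
        # The text puts the boundary at roughly 50 users; counting exactly
        # 50 as "enterprise" is an assumption made for this sketch.
        labels.append("workgroup" if total_users < 50 else "enterprise")
    # Location: one site is centralized, several sites is distributed.
    labels.append("centralized" if site_count == 1 else "distributed")
    return labels

print(classify_database(max_concurrent_users=1, total_users=1, site_count=1))
print(classify_database(max_concurrent_users=20, total_users=30, site_count=1))
print(classify_database(max_concurrent_users=400, total_users=800, site_count=3))
```

Note that the user-count and location labels are independent of each other, so a single database receives one label from each dimension.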
The most popular way of classifying databases today, however, is based on how they will be used and
on the time sensitivity of the information gathered from them. For example, transactions such as
product or service sales, payments, and supply purchases reflect critical day-to-day operations. Such
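transactions must be recorded accurately and immediately. A database designed primarily to support such day-to-day operations is commonly called an operational database (also known as a transactional or production database).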
Databases can also be classified to reflect the degree to which the data are structured.
• Unstructured data are data that exist in their original (raw) state, that is, in the format in which
they were collected. Therefore, unstructured data exist in a format that does not lend itself to
the processing that yields information.
• Structured data are the result of taking unstructured data and formatting (structuring) such data
to facilitate storage, use, and the generation of information.
• Semistructured data are data that have already been processed to some extent. For example, if
you look at a typical Web page, the data are presented to you in a prearranged format to convey
some information.
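The difference between unstructured and structured data can be illustrated by structuring a single free-text note in Python. The note, its wording, and the field names are hypothetical; real unstructured data rarely follows one fixed pattern this neatly.

```python
import re

# Hypothetical unstructured data: a note kept exactly as it was collected.
raw_note = "Invoice 1045 for Ramos, total $312.50, issued 2024-03-18"

pattern = re.compile(
    r"Invoice (?P<number>\d+) for (?P<customer>\w+), "
    r"total \$(?P<total>\d+\.\d{2}), issued (?P<date>\d{4}-\d{2}-\d{2})"
)

def structure(note):
    """Turn one free-text invoice note into a record with typed fields."""
    match = pattern.fullmatch(note)
    if match is None:
        raise ValueError("note does not follow the expected wording")
    return {
        "number": int(match["number"]),
        "customer": match["customer"],
        "total": float(match["total"]),
        "date": match["date"],
    }

record = structure(raw_note)
print(record)
```

As plain text, the invoice total is only a run of characters. Once it is stored as a number in its own field, it can be summed, averaged, or compared, which is the kind of processing that yields information.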
Remember: The database types mentioned thus far focus on the storage and management of
highly structured data. However, corporations are not limited to the use of structured data. They
also use semistructured and unstructured data.
Questions or Task
1. Why is database design important? Discuss five (5) reasons why database design is important
in developing a system.
V. References:
• Adrienne Watt and Nelson Eng, Database Design, 2nd Edition
• Carlos Coronel, Steven Morris, and Peter Rob, Database Systems: Design, Implementation, and
Management, 9th Edition
• Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 6th Edition