TM - TSI - 05 Foundation of BI (Part 1)


Chapter 4

Foundations of
Business Intelligence
(part 1)
DATABASES & INFORMATION MANAGEMENT

Information and Communication Management


2020
Learning Objectives

• What are the problems of managing data resources in a traditional
file environment?
• What are the major capabilities of database management systems
(DBMS), and why is a relational DBMS so powerful?
• What are the principal tools and technologies for accessing
information from databases to improve business performance and
decision making?
• Why are information policy, data administration, and data quality
assurance essential for managing the firm’s data resources?
Managing Data in a
Traditional File
Environment
The Data Hierarchy
A computer system organizes data in a hierarchy that
starts with bits and bytes and progresses to fields,
records, files, and databases.
• Bit = The smallest unit of data a computer can
handle (0 or 1)
• Byte = A group of bits, usually eight, that represents
a single character, which can be a letter, a number,
or another symbol
• Field = A group of characters forming a word, a
group of words, or a complete number
• Record = A group of related fields, such as the
student’s name, the course taken, the date, and
the grade
• File = A group of records of the same type
• Database = A group of related files
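The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the student, course, and grade values are hypothetical and do not come from the slides.

```python
# A minimal sketch of the data hierarchy; all sample values are hypothetical.

bit = 1                                   # smallest unit: 0 or 1
byte = "A".encode("ascii")                # one byte stores one character
bits_in_byte = format(byte[0], "08b")     # the eight bits behind the letter "A"

field = "Smith"                           # a group of characters

record = {                                # a group of related fields
    "student_name": "Smith",
    "course": "IS 101",
    "date": "2020-09-01",
    "grade": "B+",
}

course_file = [                           # a group of records of the same type
    record,
    {"student_name": "Lee", "course": "IS 101",
     "date": "2020-09-01", "grade": "A"},
]

database = {                              # a group of related files
    "course": course_file,
    "student": [{"student_name": "Smith"}, {"student_name": "Lee"}],
}

print(bits_in_byte, len(course_file), sorted(database))
```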
Traditional File Processing
• The use of a traditional approach to file processing encourages each
functional area in a corporation to develop specialized applications
(files maintained separately by different departments)
• Each application requires a unique data file that is likely to be a
subset of the master file. These subsets of the master file lead to data
redundancy and inconsistency, processing inflexibility, and wasted storage
Problems with the Traditional File Environment
• Data redundancy: the presence of duplicate data in multiple data files
• Data inconsistency: the same attribute may have different values in
different files
• Program-data dependence: any change in a software program could require
a change in the data accessed by that program
• Lack of flexibility: the system can deliver routine scheduled reports,
but it cannot deliver ad hoc reports or respond to unanticipated
information requirements
• Poor security: with little control or management of data, access to and
dissemination of information may be out of control
• Lack of data sharing and availability: because pieces of information in
different files of the organization cannot be related to one another, it
is impossible for information to be shared or accessed in a timely manner
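Redundancy and inconsistency are easy to reproduce. The sketch below assumes two hypothetical departmental files (sales and billing) that each keep their own copy of a customer's address; the file names, customer ID, and addresses are invented for illustration.

```python
# Two departments keep separate copies of the same customer data
# (hypothetical files and values).
sales_file = {"C001": {"name": "Acme Ltd", "address": "12 Oak St"}}
billing_file = {"C001": {"name": "Acme Ltd", "address": "12 Oak St"}}  # redundant copy

# The customer moves, and only the sales department updates its file.
sales_file["C001"]["address"] = "99 Elm Ave"


def find_inconsistencies(file_a, file_b, attribute):
    """Return the keys whose attribute value differs between two files."""
    shared_keys = sorted(file_a.keys() & file_b.keys())
    return [k for k in shared_keys if file_a[k][attribute] != file_b[k][attribute]]


conflicts = find_inconsistencies(sales_file, billing_file, "address")
print(conflicts)  # customers whose address now differs between departments
```

With a single shared copy of the address, the update would happen once and there would be nothing to reconcile.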
The Database Approach
to Data Management
Database Management Systems (DBMS)

• Software that permits an organization to centralize data, manage
them efficiently, and provide access to the stored data by
application programs
• Interfaces between applications and physical data files
• The DBMS makes the physical database available for different
logical views required by users


Human Resources Database with Multiple Views

A single human resources database provides many different views of data,
depending on the information requirements of the user (Benefits View and
Payroll View)
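One way to see multiple logical views over one physical table is with SQL views. The sketch below uses Python's built-in `sqlite3` module; the `employee` table, its columns, and the sample row are assumptions made for illustration and are not taken from the slide's figure.

```python
import sqlite3

# In-memory database; table layout and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id    INTEGER PRIMARY KEY,
    name           TEXT,
    gross_pay      REAL,
    life_insurance TEXT,
    health_plan    TEXT
);
INSERT INTO employee VALUES (1, 'Ana', 5200.0, 'Plan A', 'HMO');

-- Benefits staff see only benefit-related columns.
CREATE VIEW benefits_view AS
    SELECT name, life_insurance, health_plan FROM employee;

-- Payroll staff see only pay-related columns.
CREATE VIEW payroll_view AS
    SELECT name, gross_pay FROM employee;
""")

benefits = conn.execute("SELECT * FROM benefits_view").fetchall()
payroll = conn.execute("SELECT * FROM payroll_view").fetchall()
print(benefits)
print(payroll)
```

Both views read from the same stored row, so a change to the underlying table is reflected in each view without copying data.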
How a DBMS Solves the Problems of the Traditional File Environment

• Controls data redundancy
• Eliminates data inconsistency
• Enables organization to centrally manage data, their use, and security
Relational DBMS

• The most popular type of DBMS today for PCs as well as for larger
computers and mainframes
• A relational database organizes data in the form of two-dimensional
tables (called relations)
• Each table contains data on an entity and its attributes


Relational Database Tables

• Illustrated here are tables for the entities SUPPLIER and PART showing
how they represent each entity and its attributes.
• Supplier_Number is a primary key for the SUPPLIER table and a foreign key
for the PART table.
Operations of a Relational DBMS
Three basic operations that enable data from two different tables to be combined and only selected
attributes to be displayed:

1. SELECT: Creates a subset of data of all records that meet stated criteria
2. JOIN: Combines relational tables to provide the user with more information than is available in individual tables
3. PROJECT: Creates a subset of columns in a table, creating tables with only the information specified
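The three operations can be sketched in plain Python over lists of dictionaries. This is an illustration of the idea, not how a DBMS implements it; the sample rows and helper names (`select`, `join`, `project`) are assumptions.

```python
# Hypothetical sample rows for the PART and SUPPLIER tables.
PART = [
    {"Part_Number": 137, "Part_Name": "Door latch", "Supplier_Number": 8259},
    {"Part_Number": 145, "Part_Name": "Side mirror", "Supplier_Number": 8444},
    {"Part_Number": 150, "Part_Name": "Door molding", "Supplier_Number": 8263},
]
SUPPLIER = [
    {"Supplier_Number": 8259, "Supplier_Name": "CBM Inc.", "City": "Dayton"},
    {"Supplier_Number": 8263, "Supplier_Name": "Jackson Composites", "City": "Lexington"},
    {"Supplier_Number": 8444, "Supplier_Name": "Bryant Corporation", "City": "Cleveland"},
]


def select(rows, predicate):
    """SELECT: keep only the records that meet the stated criteria."""
    return [row for row in rows if predicate(row)]


def join(left, right, key):
    """JOIN: combine records from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def project(rows, columns):
    """PROJECT: keep only the named columns."""
    return [{c: row[c] for c in columns} for row in rows]


wanted = select(PART, lambda row: row["Part_Number"] in (137, 150))
combined = join(wanted, SUPPLIER, "Supplier_Number")
result = project(
    combined,
    ["Part_Number", "Part_Name", "Supplier_Number", "Supplier_Name"],
)
for row in result:
    print(row)
```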
Capabilities of DBMSs

• Data definition capability
• Specifies structure of database content, used to create tables and define
characteristics of the fields in each table
• Data dictionary
• An automated or manual file that stores definitions of data elements and their
characteristics
• Data dictionaries may capture additional information, such as usage, ownership,
authorization, security, and the individuals, business functions, programs, and
reports that use each data element.
• Querying and Reporting
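Data definition and a simple automated data dictionary can be sketched with Python's built-in `sqlite3` module. The `part` table and its columns are hypothetical; `PRAGMA table_info` is SQLite's own way of reporting column definitions, and other DBMSs expose this metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: create a table and define the characteristics of its fields.
# The table layout is an assumption made for this example.
conn.execute("""
CREATE TABLE part (
    Part_Number INTEGER PRIMARY KEY,
    Part_Name   TEXT NOT NULL,
    Unit_Price  REAL
)
""")

# Data dictionary: read the stored definitions back from the DBMS.
# Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [
    {
        "field": name,
        "type": col_type,
        "required": bool(notnull),
        "primary_key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(part)")
]

for entry in data_dictionary:
    print(entry)
```

A fuller data dictionary would also record usage, ownership, and authorization, which SQLite does not track by itself.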
Querying and Reporting

• Most DBMSs have a specialized language called a data
manipulation language that is used to add, change, delete, and
retrieve the data in the database
• The most prominent data manipulation language is Structured Query
Language (SQL)
• Microsoft Access provides user tools for generating SQL
Example of an SQL Query

Illustrated here are the SQL statements for a query to select suppliers for parts 137 or 150.
They produce a list with the same results as slide 14.
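The SQL statements themselves appear only in the slide's figure, so the sketch below is a plausible reconstruction rather than the slide's exact query. It runs against an in-memory SQLite database through Python's `sqlite3` module; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (
    Supplier_Number INTEGER PRIMARY KEY,
    Supplier_Name   TEXT
);
CREATE TABLE PART (
    Part_Number     INTEGER PRIMARY KEY,
    Part_Name       TEXT,
    Supplier_Number INTEGER REFERENCES SUPPLIER(Supplier_Number)
);
-- Hypothetical sample data.
INSERT INTO SUPPLIER VALUES
    (8259, 'CBM Inc.'),
    (8263, 'Jackson Composites'),
    (8444, 'Bryant Corporation');
INSERT INTO PART VALUES
    (137, 'Door latch', 8259),
    (145, 'Side mirror', 8444),
    (150, 'Door molding', 8263);
""")

# Select suppliers for parts 137 or 150: join the tables on the shared key,
# filter the rows, and list only the columns of interest.
query = """
SELECT PART.Part_Number, PART.Part_Name,
       SUPPLIER.Supplier_Number, SUPPLIER.Supplier_Name
FROM PART
JOIN SUPPLIER ON PART.Supplier_Number = SUPPLIER.Supplier_Number
WHERE PART.Part_Number IN (137, 150)
ORDER BY PART.Part_Number;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```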
An Access Query

Illustrated here is how the query in slide 16 would be constructed using
Microsoft Access query-building tools. It shows the tables, fields,
and selection criteria used for the query.
Designing Databases

• To create a database, you must understand:
• the relationships among the data,
• the type of data that will be maintained in the database,
• how the data will be used, and
• how the organization will need to change to manage data from a
company-wide perspective.
• The database requires both a conceptual (logical) design and a
physical design
Normalization

• Design process identifies:
• Relationships among data elements and redundant database elements
• Most efficient way to group data elements to meet business information
requirements and the needs of specific application programs
• Normalization
• Streamlining complex groups of data to minimize redundant data elements
and awkward many-to-many relationships
• The process of creating small, stable, yet flexible and adaptive data structures
from complex groups of data
An Unnormalized Relation for Order

An unnormalized relation contains repeating groups. For example, there can be many parts
and suppliers for each order. There is only a one-to-one correspondence between
Order_Number and Order_Date.
Normalized Tables Created from Order

After normalization, the original relation ORDER has been broken down into four smaller
relations. The relation ORDER is left with only two attributes, and the relation LINE_ITEM has a
combined, or concatenated, key consisting of Order_Number and Part_Number.
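The four normalized relations could be declared roughly as follows, using Python's `sqlite3` module. Only the keys named on the slide (Order_Number, Order_Date, Part_Number) are from the source; the remaining columns, such as Part_Quantity and Supplier_Name, are assumptions added so the tables are usable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (
    Supplier_Number INTEGER PRIMARY KEY,
    Supplier_Name   TEXT                      -- assumed column
);
CREATE TABLE PART (
    Part_Number     INTEGER PRIMARY KEY,
    Part_Name       TEXT,                     -- assumed column
    Supplier_Number INTEGER REFERENCES SUPPLIER(Supplier_Number)
);
-- ORDER is a reserved word in SQL, so the table name is quoted.
CREATE TABLE "ORDER" (
    Order_Number INTEGER PRIMARY KEY,
    Order_Date   TEXT
);
-- LINE_ITEM resolves the repeating group: one row per part per order,
-- identified by the concatenated key (Order_Number, Part_Number).
CREATE TABLE LINE_ITEM (
    Order_Number  INTEGER REFERENCES "ORDER"(Order_Number),
    Part_Number   INTEGER REFERENCES PART(Part_Number),
    Part_Quantity INTEGER,                    -- assumed column
    PRIMARY KEY (Order_Number, Part_Number)
);
""")

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
# Column 5 of PRAGMA table_info is the column's position in the primary key
# (0 means the column is not part of the key).
key_columns = [
    row[1] for row in conn.execute("PRAGMA table_info(LINE_ITEM)") if row[5] > 0
]
print(tables)
print(key_columns)
```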
An Entity-Relationship Diagram

• Referential integrity rules
• Used by RDBMS to ensure relationships between tables remain
consistent
• Entity-relationship diagram
• Used by database designers to document the data model
• Illustrates relationships between entities
• If the business doesn’t get its data model right, the system won’t be
able to serve the business well
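Referential integrity can be demonstrated with a foreign key that the DBMS enforces. The sketch below uses Python's `sqlite3` module, where enforcement must be switched on with `PRAGMA foreign_keys = ON`; the tables and rows are hypothetical, and other RDBMSs enforce foreign keys by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
CREATE TABLE SUPPLIER (
    Supplier_Number INTEGER PRIMARY KEY,
    Supplier_Name   TEXT
)
""")
conn.execute("""
CREATE TABLE PART (
    Part_Number     INTEGER PRIMARY KEY,
    Part_Name       TEXT,
    Supplier_Number INTEGER NOT NULL REFERENCES SUPPLIER(Supplier_Number)
)
""")

# Hypothetical data: one supplier and one part that refers to it.
conn.execute("INSERT INTO SUPPLIER VALUES (8259, 'CBM Inc.')")
conn.execute("INSERT INTO PART VALUES (137, 'Door latch', 8259)")

# A part that points to a supplier that does not exist is rejected.
try:
    conn.execute("INSERT INTO PART VALUES (150, 'Door molding', 9999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM PART").fetchone()[0]
print(rejected, part_count)
```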
An Entity-Relationship Diagram

• This diagram shows the relationships between the entities SUPPLIER, PART, LINE_ITEM, and
ORDER that might be used to model the database in slide 24
• The boxes represent entities.
• The lines connecting the boxes represent relationships.
• A line connecting two entities that ends in two short marks designates a one-to-one
relationship.
• A line connecting two entities that ends with a crow’s foot topped by a short mark
indicates a one-to-many relationship.
Non-relational Databases and Databases in
the Cloud

• Non-relational databases: “NoSQL”
• Managing large data sets stored across many distributed
machines
• Easily scalable, flexible and simple to use as they have no rigid
schema
• Able to handle large volumes of structured, semi-structured, and
unstructured data (Web, social media, graphics)
• Ideal for applications with no specific schema definitions such as
content management systems, big data applications, real-time
analytics
• Tools: Oracle NoSQL, MongoDB
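The "no rigid schema" idea can be illustrated without a database server. The sketch below imitates a document collection with plain Python dictionaries; it does not use the MongoDB or Oracle NoSQL APIs, and the documents and the `find` helper are hypothetical.

```python
import json

# Each document carries its own fields; no shared table layout is required.
collection = [
    {"_id": 1, "type": "article", "title": "Q3 results", "tags": ["finance"]},
    {"_id": 2, "type": "image", "url": "https://example.com/logo.png", "width": 640},
    {"_id": 3, "type": "post", "text": "Hello", "likes": 12,
     "author": {"name": "Ana"}},
]


def find(documents, **criteria):
    """Return documents whose fields equal every given criterion.

    Documents that lack a field simply do not match; nothing fails.
    """
    return [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


images = find(collection, type="image")
popular = find(collection, likes=12)

# Documents are commonly stored and exchanged in a JSON-like form.
serialized = json.dumps(collection)
print(images)
print(popular)
```

The trade-off is that the application, not the database, must cope with documents that are missing a field or store it in a different form.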
Non-relational Databases and Databases in
the Cloud

• A cloud database
• Serves many of the same functions as a traditional database with the
added flexibility of cloud computing.
• A database service built and accessed through a cloud platform
• Enables enterprise users to host databases without buying dedicated
hardware
• Can be managed by the user or offered as a service and managed by a
provider
• Can support relational databases (including MySQL and PostgreSQL)
and NoSQL databases (including MongoDB and Apache CouchDB)
• Accessed through a web interface or vendor-provided API
