Unit 3 of MIS
Database Administrator:
A database administrator (DBA) is the person responsible for controlling, maintaining, coordinating,
and operating a database management system. Managing, securing, and taking care of the database
system is the DBA's prime responsibility.
A DBA is in charge of authorizing access to the database, coordinating its use, capacity planning,
installation, and monitoring usage, and of acquiring software and hardware resources as and when
needed.
Types of Database Administrator (DBA):
Administrative DBA –
They maintain the server and keep it functional. They are concerned with data backups and
security.
Data Warehouse DBA –
They are accountable for merging data from various sources into the data warehouse.
Development DBA –
They build and develop queries, stored procedures, etc.
Application DBA –
They manage all requirements of the application components that interact with the database
and carry out activities such as application installation and coordination, application upgrades,
database cloning, data load process management, etc.
o A database management system (DBMS) is software used to manage a database. For
example, MySQL and Oracle are popular database systems used in many different
applications.
o A DBMS provides an interface for operations such as creating a database, creating tables,
storing data, updating data, and more.
o It provides protection and security for the database. When there are multiple users, it also
maintains data consistency.
o Data Definition: It is used for creation, modification, and removal of definition that defines the
organization of data in the database.
o Data Updation: It is used for the insertion, modification, and deletion of the actual data in the
database.
o Data Retrieval: It is used to retrieve the data from the database which can be used by
applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintaining data integrity,
enforcing data security, handling concurrency control, monitoring performance, and
recovering information corrupted by unexpected failure.
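The tasks above can be sketched with Python's built-in sqlite3 module. This is only a minimal
illustration: the student table and its columns are hypothetical, and SQLite has no user accounts,
so user administration appears only as a comment.

```python
import sqlite3

# In-memory database; the "student" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that organizes the data.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")

# Data updation: insert, modify, and delete actual data.
conn.executemany(
    "INSERT INTO student (id, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 72), (2, "Ravi", 65), (3, "Meena", 80)],
)
conn.execute("UPDATE student SET marks = 70 WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 3")
conn.commit()

# Data retrieval: read the data back for use by an application.
rows = conn.execute("SELECT id, name, marks FROM student ORDER BY id").fetchall()
print(rows)

# User administration (accounts, privileges) is not shown: SQLite has no
# user management, whereas server systems such as MySQL or Oracle do.
```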
Advantages of DBMS
o Controls data redundancy: Because all the data is stored in one central database rather than
in separate files, the DBMS can keep the same data from being recorded more than once.
o Data sharing: Authorized users of an organization can share the data among
themselves.
o Easy maintenance: The centralized nature of the database system makes it easy to
maintain.
o Reduced time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems which automatically back up data
and restore it after hardware or software failures.
o Multiple user interfaces: It provides different types of user interfaces, such as graphical user
interfaces and application program interfaces.
Disadvantages of DBMS
o Cost of hardware and software: Running DBMS software requires a high-speed processor and
a large amount of memory.
o Size: It needs a large amount of disk space and memory to run efficiently.
o Higher impact of failure: A failure has a large impact because most organizations store all
their data in a single database; if that database is damaged by a power failure or
corruption, the data may be lost forever.
Data Independence
o Data independence can be explained using the three-schema architecture. A schema is the overall
description of the database.
o Data independence refers to the ability to modify the schema at one level of the
database system without altering the schema at the next higher level.
o Logical data independence refers to the ability to change the conceptual schema
without having to change the external schema.
o Logical data independence separates the external level from the conceptual level.
o If we make changes to the conceptual view of the data, the user view of the data is
not affected.
o Logical data independence occurs at the user interface level.
o Physical data independence is the capacity to change the internal schema
without having to change the conceptual schema.
o If we change the storage size of the database system server, the conceptual
structure of the database is not affected.
o Physical data independence separates the conceptual level from the internal level.
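Both kinds of data independence can be tried out in SQLite. In this sketch a view stands in for the
external schema, the table for the conceptual schema, and an index for an internal storage
structure; the employee table, the view name, and the index name are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: a hypothetical "employee" table.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'Sales'), (2, 'Ravi', 'HR')")

# External level: a view that one user group works with.
conn.execute("CREATE VIEW employee_names AS SELECT id, name FROM employee")

def read_view():
    """Run the user's query against the external view."""
    return conn.execute("SELECT id, name FROM employee_names ORDER BY id").fetchall()

before = read_view()

# Logical data independence: change the conceptual schema (add a column);
# the view and the query against it are untouched.
conn.execute("ALTER TABLE employee ADD COLUMN salary INTEGER")
after_logical_change = read_view()

# Physical data independence: change a storage structure (add an index);
# the table definition and the queries are untouched.
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
after_physical_change = read_view()

print(before == after_logical_change == after_physical_change)
```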
1-Tier Architecture
o In this architecture, the database is directly available to the user: the user works directly
on the DBMS.
o Any changes made here are applied directly to the database itself. It doesn't provide a handy
tool for end users.
2-Tier Architecture
o The 2-Tier architecture is the same as a basic client-server architecture. Applications
on the client end communicate directly with the database on the server side. For this
interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing and
transaction management.
o To communicate with the DBMS, the client-side application establishes a connection with the
server side.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In this
architecture, the client can't communicate directly with the database server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
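The separation of the three tiers can be sketched in a single Python process. In a real deployment
each tier would normally run on its own machine; here plain functions stand in for the tiers, and
the account table and function names are hypothetical.

```python
import sqlite3

# Database tier: an in-memory SQLite database stands in for the database server.
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")
_db.execute("INSERT INTO account VALUES (1, 'Asha', 500)")

# Application tier: the only code that talks to the database.
def get_balance(account_id):
    row = _db.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no account with id {account_id}")
    return row[0]

# Client tier: knows only the application-tier function, not SQL or the schema.
def client_show_balance(account_id):
    return f"Balance: {get_balance(account_id)}"

print(client_show_balance(1))
```

The client function never sees the table or the SQL, which mirrors the point that the end user has
no idea about the database behind the application server.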
Three schema Architecture
o The three-schema architecture contains three levels. It breaks the database down into three
different categories.
o Mapping is used to transform requests and responses between the various levels of the
architecture.
o Mapping is not well suited to a small DBMS because it takes more time.
o In External / Conceptual mapping, the request is transformed from the external level to the
conceptual schema.
o In Conceptual / Internal mapping, the DBMS transforms the request from the conceptual to the
internal level.
1. Internal Level
o The internal level has an internal schema which describes the physical storage structure of the
database.
o The internal schema is also known as a physical schema.
o It uses the physical data model and defines how the data will be stored in blocks.
2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual level. Conceptual
level is also known as logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the database and also describes
what relationship exists among those data.
3. External Level
o At the external level, a database contains several schemas, sometimes called subschemas.
A subschema describes one particular view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user group is interested
in and hides the remaining database from that user group.
Types of DBMS
The types of DBMS based on data model are as follows −
Relational database.
Object oriented database.
Hierarchical database.
Network database.
Relational Database
A relational database management system (RDBMS) is a system where data is organized in
two-dimensional tables using rows and columns.
This is one of the most popular data models used in industry. Relational systems are normally
queried with SQL.
Every table in a database has a key field which uniquely identifies each record.
This type of system is the most widely used DBMS.
Hierarchical Database
It is a system where the data elements have a one-to-many relationship (1:N). Here data is organized
like a tree, similar to the folder structure on your computer.
The hierarchy starts from the root node, and every child node is connected to exactly one parent node.
It is used in industry on mainframe platforms.
Network database
A network database management system is a system where the data elements can maintain one-to-one
relationships (1:1) or many-to-many relationships (M:N).
It resembles the hierarchical structure, but the data is organized like a graph, and a child record
is allowed to have more than one parent.
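The difference between the hierarchical and network models comes down to how many parents a record
may have. The following sketch uses plain Python dictionaries rather than a real hierarchical or
network DBMS, and all record names are hypothetical.

```python
# Hierarchical model: each child record has exactly one parent (a tree).
# All record names here are hypothetical.
hierarchical_parent = {
    "Sales": "Company",
    "HR": "Company",
    "Asha": "Sales",
    "Ravi": "HR",
}

# Network model: a child record may have several parents (a graph).
network_parents = {
    "Sales": ["Company"],
    "HR": ["Company"],
    "Asha": ["Sales", "ProjectX"],  # belongs to a department and a project
    "Ravi": ["HR", "ProjectX"],
}

def path_to_root(record):
    """Follow the single parent link up to the root of the hierarchy."""
    path = [record]
    while path[-1] in hierarchical_parent:
        path.append(hierarchical_parent[path[-1]])
    return path

def children_of(parent):
    """List the records that name `parent` among their parents."""
    return sorted(c for c, ps in network_parents.items() if parent in ps)

print(path_to_root("Asha"))
print(children_of("ProjectX"))
```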
What is a Table?
In the relational database model, a table is a collection of data elements organised in terms of rows
and columns. A table is also considered a convenient representation of a relation.
What is a Tuple?
A single entry in a table is called a Tuple or Record or Row. A tuple in a table represents a set of
related data.
What is an Attribute?
A table consists of several records (rows); each record can be broken down into several smaller parts
of data known as attributes.
Attribute Domain
When an attribute is defined in a relation (table), it is defined to hold only a certain type of value;
that set of permitted values is known as the attribute domain.
For example, an attribute Name will hold the name of an employee for every tuple.
What is a Relation Schema?
A relation schema describes the structure of the relation: the name of the relation (name of the
table) and the names and types of its attributes.
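The terms above map directly onto a SQLite table. In this sketch the employee relation is
hypothetical, and the CHECK constraint is an assumed way of narrowing an attribute's domain beyond
its declared type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relation schema: relation name, attribute names, and attribute types.
# The "employee" relation is hypothetical. The CHECK constraint narrows the
# domain of "age" beyond its declared type.
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age BETWEEN 18 AND 65))"
)

# Each inserted row is one tuple; each value in it belongs to one attribute.
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 30)")

# A value outside the attribute domain is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 12)")
    domain_enforced = False
except sqlite3.IntegrityError:
    domain_enforced = True

# PRAGMA table_info returns one row per attribute: (cid, name, type, ...).
schema = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(employee)")]
tuples = conn.execute("SELECT * FROM employee").fetchall()
print(schema, tuples, domain_enforced)
```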
A relation key is an attribute which can uniquely identify a particular tuple (row) in a relation (table).
Query
A query is a question you define and send to the data source to retrieve the data.
Report
A report is an organized and formatted view of the data the query retrieved.
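The distinction between a query and a report can be shown in a few lines. This is a sketch under
assumptions: the sale table, its data, and the report layout are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("North", 120), ("South", 80), ("North", 60)],
)

# Query: the question sent to the data source.
result = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()

# Report: an organized, formatted view of what the query returned.
def format_report(rows):
    lines = ["Region  Total", "------  -----"]
    for region, total in rows:
        lines.append(f"{region:<6}  {total:>5}")
    return "\n".join(lines)

print(format_report(result))
```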
Types of keys:
1. Primary key
o It is the key chosen to identify one and only one instance of an entity uniquely.
2. Candidate key
o A candidate key is a minimal attribute or set of attributes that can uniquely identify a tuple.
o The primary key is chosen from the candidate keys. The remaining candidate
keys are as strong as the primary key.
3. Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a candidate key.
4. Foreign key
o A foreign key is a column of a table that points to the primary key of another table.
5. Alternate key
There may be one or more attributes or a combination of attributes that uniquely identify each tuple in a
relation. These attributes or combinations of attributes are called the candidate keys. One key is chosen
as the primary key from these candidate keys, and the remaining candidate keys, if any, are termed
alternate keys.
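The key types above can be declared and checked in SQLite. The schema in this sketch is
hypothetical: id and email are both assumed to be candidate keys of employee, with id chosen as the
primary key. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: "id" and "email" are both candidate keys of employee.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"        # candidate key chosen as the primary key
    " email TEXT NOT NULL UNIQUE,"    # remaining candidate key: the alternate key
    " dept_id INTEGER REFERENCES department (dept_id))"  # foreign key
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'asha@example.com', 10)")

def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_primary = rejected("INSERT INTO employee VALUES (1, 'ravi@example.com', 10)")
duplicate_alternate = rejected("INSERT INTO employee VALUES (2, 'asha@example.com', 10)")
dangling_foreign = rejected("INSERT INTO employee VALUES (3, 'ravi@example.com', 99)")
print(duplicate_primary, duplicate_alternate, dangling_foreign)
```

In this sketch the pair (id, email) would be a super key: it identifies a tuple uniquely but is not
minimal, since id alone already does so.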
Data Warehousing
A data warehouse is a place where data is stored so that it can be mined for useful information. It
is like a fast computer system with exceptionally large data storage capacity. Data from the
organization's various systems is copied to the warehouse, where it can be retrieved and conformed
to remove errors. A data warehouse combines data from numerous sources in a way that ensures data
quality, accuracy, and consistency.
Data flows into a data warehouse from different databases. A data warehouse works by organizing data
into a schema that describes the format and types of the data.
A data warehouse is built to store a huge amount of historical data and enables fast queries over all
the data, typically using Online Analytical Processing (OLAP). A database is built to store current
transactions and allow quick access to specific transactions for ongoing business processes, commonly
known as Online Transaction Processing (OLTP).
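The contrast between the two access patterns can be illustrated on one small table. This is only a
sketch: real OLTP and OLAP systems are usually separate products, and the orders table here is
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical order history.
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, year INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 2021, 100), (2, 2021, 250), (3, 2022, 300), (4, 2022, 50)],
)

# OLTP-style access: fetch one specific transaction by its key.
one_order = conn.execute(
    "SELECT order_id, year, amount FROM orders WHERE order_id = ?", (3,)
).fetchone()

# OLAP-style access: summarize the whole history to support analysis.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()

print(one_order)
print(yearly_totals)
```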
1. Subject Oriented
A data warehouse is subject-oriented. It provides useful data about a subject instead of the company's
ongoing operations, and these subjects can be customers, suppliers, marketing, product, promotion, etc.
A data warehouse usually focuses on modeling and analysis of data that helps the business organization
to make data-driven decisions.
2. Time-Variant:
The data in a data warehouse is associated with a specific period of time.
3. Integrated
A data warehouse is built by joining data from heterogeneous sources, such as relational databases,
flat files, etc.
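Joining data from heterogeneous sources can be sketched as follows. The two sources (a relational
table and a small CSV flat file), the column names, and the dim_customer warehouse table are all
hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Source 1: an operational relational database (hypothetical "customer" table).
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE customer (id INTEGER, full_name TEXT)")
source_db.execute("INSERT INTO customer VALUES (1, 'Asha Rao')")

# Source 2: a flat file with different column names and untidy values.
flat_file = io.StringIO("cust_no,name\n2, ravi kumar \n")

# Warehouse: one agreed format for every source.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE dim_customer (customer_id INTEGER, name TEXT, source TEXT)")

for cust_id, name in source_db.execute("SELECT id, full_name FROM customer"):
    warehouse.execute("INSERT INTO dim_customer VALUES (?, ?, 'crm')", (cust_id, name))

for record in csv.DictReader(flat_file):
    # Conform the flat-file values to the warehouse format before loading.
    cleaned_name = record["name"].strip().title()
    warehouse.execute(
        "INSERT INTO dim_customer VALUES (?, ?, 'file')",
        (int(record["cust_no"]), cleaned_name),
    )

loaded = warehouse.execute("SELECT * FROM dim_customer ORDER BY customer_id").fetchall()
print(loaded)
```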
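4. Non-Volatile
Data in the warehouse is not changed or erased once it has been loaded; new data is added alongside
the existing data.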
Data Mining:
Data mining refers to the analysis of data. It is the computer-supported process of analyzing huge sets
of data that have either been compiled by computer systems or have been downloaded into the
computer. In the data mining process, the computer analyzes the data and extracts useful information
from it. It looks for hidden patterns within the data set and tries to predict future behavior. Data
mining is primarily used to discover and indicate relationships among the data sets.
i. Market Analysis:
Data mining can predict market behavior, which helps the business make decisions. For example, it
predicts who is keen to purchase what type of products.
ii. Fraud Detection:
Data mining methods can help to find which cellular phone calls, insurance claims, or credit or debit
card purchases are likely to be fraudulent.
iii. Financial Market Analysis:
Data mining techniques are widely used to help model financial markets.
iv. Trend Analysis:
Analyzing the current trends in the marketplace is a strategic benefit because it helps in reducing
costs and in matching the manufacturing process to market demand.
Differences between Data Mining and Data Warehousing:
Data Mining | Data Warehousing
Data mining is the process of determining data patterns. | A data warehouse is a database system designed for analytics.
Data mining is generally considered the process of extracting useful data from a large set of data. | Data warehousing is the process of combining all the relevant data.
Business entrepreneurs carry out data mining with the help of engineers. | Data warehousing is entirely carried out by the engineers.
In data mining, data is analyzed repeatedly. | In data warehousing, data is stored periodically.
1. Classification:
This technique is used to obtain important and relevant information about data and metadata. It
helps to classify data into different classes.
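Classification can be illustrated with a very small nearest-mean classifier. This is only one of
many possible classification methods and is not prescribed by the text; the training data and the
class labels are hypothetical.

```python
# Hypothetical labeled training data: (monthly_spend, class label).
training = [(10, "low"), (15, "low"), (20, "low"), (80, "high"), (90, "high"), (100, "high")]

def class_means(samples):
    """Average feature value per class."""
    totals = {}
    for value, label in samples:
        total, count = totals.get(label, (0, 0))
        totals[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}

def classify(value, means):
    """Assign the class whose mean is closest to `value`."""
    return min(means, key=lambda label: abs(value - means[label]))

means = class_means(training)
print(classify(25, means), classify(70, means))
```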
2. Clustering:
Clustering is the division of information into groups of related objects. Describing the data by a
few clusters inevitably loses certain fine details, but achieves simplification. It models data by
its clusters.
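As a sketch of how clustering trades detail for simplicity, the following is a minimal k-means for
one-dimensional data. k-means is an assumed choice of algorithm (the text does not name one), and
the purchase amounts and starting centers are hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data (illustrative only)."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical purchase amounts with two obvious groups.
amounts = [10, 12, 11, 95, 100, 105]
centers, clusters = k_means_1d(amounts, centers=[0, 50])
print(centers)   # each cluster is summarized by one number
print(clusters)
```

Six values are summarized by two centers: the individual amounts are lost, but the overall
structure of the data becomes easy to see.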
3. Regression:
Regression analysis is the data mining process used to identify and analyze the relationship between
variables in the presence of other factors. It is used to estimate the probable value of a specific
variable.
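A minimal example of regression is fitting a straight line by ordinary least squares. Simple linear
regression is an assumed choice here, and the spend and sales figures are hypothetical, constructed
so that sales = 2 * spend + 1 exactly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (x) and sales (y), exactly y = 2x + 1.
spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]
slope, intercept = fit_line(spend, sales)

# Use the fitted relationship to estimate sales for an unseen spend value.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```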
4. Association Rules:
This data mining technique helps to discover a link between two or more items. It finds a hidden
pattern in the data set.
Association rules are if-then statements that help to show the probability of interactions between
data items within large data sets in different types of databases.
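The probability attached to an if-then rule is usually expressed through two standard measures,
support and confidence. The sketch below computes both for the rule "if bread then butter"; the
market baskets are hypothetical.

```python
# Hypothetical market baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`."""
    return support(antecedent | consequent) / support(antecedent)

# Rule: if bread then butter.
rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here bread and butter appear together in 2 of 4 baskets (support 0.5), and 2 of the 3 baskets that
contain bread also contain butter (confidence about 0.67).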
5. Outlier detection:
This type of data mining technique relates to the observation of data items in the data set which do
not match an expected pattern or expected behavior. This technique may be used in various domains
such as intrusion detection, fraud detection, etc.
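One simple way to flag data items that do not match the expected pattern is a z-score test. This is
an assumed method chosen for brevity, not the only one; the threshold of 2.0 and the purchase
amounts are hypothetical.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds `threshold`."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical card purchase amounts; one is far from the rest.
purchases = [20, 22, 19, 21, 20, 23, 18, 500]
outliers = find_outliers(purchases)
print(outliers)
```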
6. Sequential Patterns:
The sequential pattern is a data mining technique specialized for evaluating sequential data to
discover sequential patterns.
7. Prediction:
Prediction uses a combination of other data mining techniques such as trend analysis, clustering,
classification, etc. It analyzes past events or instances in the right sequence to predict a future event.
Business intelligence (BI) is a collection of applications and techniques used to transform data into
actionable information. BI involves enterprise-level data analysis that pinpoints areas for operational
improvement and external expansion. In addition, business intelligence can incorporate data
visualization, which further facilitates strategic business decisions.
Aside from internal data analysis, companies can use BI on third party databases to obtain insights
about rivals or potential business partners. Ultimately, companies use business intelligence to make
decisions that better serve and target customers while simultaneously increasing cost savings.