Lecture 3 Databases and DBMS Relational Data Model

The document discusses databases and database management systems (DBMS). It explains that a DBMS allows users to create, organize, and interact with databases. Key benefits of using a DBMS include supporting large datasets, maintaining data integrity, allowing concurrent access, providing a query language like SQL, and reducing data redundancy. The document also discusses technological trends related to hardware, software, and networks that have increased demand for database technologies.

Uploaded by

Prod Brown
GS 121 FUNDAMENTAL OF GIS

DATABASES AND DBMS


DR. DOROTHEA DEUS & MR. MICHAEL MAVURA
OBJECTIVES

At the end of this lecture, you should be able to:


• Explain the situations in which database technology can be used;
• Explain the notions of database, DBMS, data model and schema;
• Design simple SQL queries for data extraction from an existing
database;
• Design simple updates for changing an existing data set.
TOPICS COVERED

• Introduction to Database Technology


• The relational data model
• The relational query language (SQL)
INTRODUCTION TO DATABASE
TECHNOLOGY
• Objectives of this lecture:
• GIS and databases
• General overview of database(db) technology
• Know the most important terms
• Understand where/when db technology can be applied, and
where/when it should be applied
LECTURE OVERVIEW

▪ Technological trends
▪ GIS
▪ Databases
▪ Database and DBMS
▪ Users & Interaction with a database (incl. GIS and DBMS)
▪ Spatial DBMS

▪ GIS and DBMS


▪ Spatial database
TECHNOLOGICAL TRENDS
HDW & SFW & NET

• 1st PC (in 1977): Commodore PET 2001 series (source: Wikipedia)
  • CPU: 1 MHz
  • Memory: 4 KB
  • Storage capacity: real storage device (cassette tape), 100 KB per 30-minute side
▪ Notebook for students: HP 8530W (source: ITC's intranet)
  ▪ CPU: 2.8 GHz
  ▪ Memory: 4096 MB
  ▪ Storage capacity: 320 GB
TECHNOLOGICAL TRENDS
HDW & SFW & NET

▪ software with more functionality


▪ consuming more memory

▪ Wirth's law: “Software is getting slower more rapidly than hardware becomes faster.” (source: Wikipedia)
TECHNOLOGICAL TRENDS
HDW & SFW & NET

▪ Wired
▪ from 56 kbps (standard dial-up modem) to 1Gbps (fibre optic cable)

▪ Wireless
▪ Bluetooth (up to 3 Mbps)
▪ Wifi (throughput up to 54 Mbps)

▪ Mobile
▪ protocol (up to 20 Mbps)
TECHNOLOGICAL TRENDS
HDW & SFW & NET

• from ‘computer hall’ to operator's pocket

1946 2010
GEOGRAPHIC INFORMATION SYSTEM (GIS)
WHAT IS GIS?

▪ A geographic information system (GIS) is a computer-based system that provides the following four sets of capabilities to handle georeferenced data:
1. Data capture and preparation
2. Data management, including storage and maintenance
3. Data manipulation and analysis
4. Data presentation
GEOGRAPHIC INFORMATION SYSTEM (GIS)
WHAT IS GIS?

▪ A GIS should not be called a GIS if any of these four components is missing.
GEOGRAPHIC INFORMATION SYSTEM (GIS)
GIS SOFTWARE ARCHITECTURE

User interface
• web browser
• GIS application GUI

Application logic
• GIS server/desktop application

Data storage
• GIS files
• DBMS
DATABASES

• Databases have been in use since 1960 for many applications:
  • Bank accounts
  • Stock monitoring
  • Flight booking
  • Library systems
  • etc.
• Typical characteristics of these applications:
  • Large amounts of data
  • Simple & regular structure
DATABASE

• A database is a large, computerized collection of structured data.
• The data is structured in tables.
• One such table is called a relation.
• Relationships connect tables into structure.
DATA, INFORMATION AND
KNOWLEDGE
• Data
• A resource held on paper or in digital format
that serves to record or administer some facts
and description of phenomena of interest.

• Information
• Data resource(s) as above, interpreted by humans. (This is a fuzzy distinction, often ignored in day-to-day human communication.)

• Knowledge
• Deeper understanding by humans, as (usually)
derived from various information sources.
DATA SETS
• Data sets
  • A homogeneous data collection, normally describing a single kind of phenomenon.
Some examples:
  • The soil areas of a district: their soil class, spatial extent, geological history, fitness for agricultural use. Their number may be in the 100s in the data set.
  • The land use laws of a country: their date of enactment, number of the act, and full texts. May be 10s.
  • The land tax officers: their personal details, area of constituency, date hired, and total revenue collected per annum. May be 10s to a few 100s.
  • The agricultural crops grown in the district: their name and growth & harvest season, and fertilizers used for them. May be 10s.
25/04/2023

DATASET, DATABASE, DBMS, DATABASE SYSTEM
• Database
• A collection of interrelated data sets properly structured by means of, and stored through
a DBMS.
• Typically, a resource shared by many users.
• Normal cycle of use: store, maintain and retrieve.

• Database management system (DBMS)
  • A software package that allows users to set up, maintain and exploit one or more databases.
  • A DBMS is to a database what MS Word is to a text document.

• Database system
• Combination of a database and its DBMS.
DATABASE MANAGEMENT SYSTEM
(DBMS)
• In setting up a database we can distinguish a few phases:
  1. Database design
  2. Data entry
  3. Database maintenance
  4. Use of the database (data query)
[Figure: Database design, Database structure, Data entry, Database, Maintenance, Use - Query]
DATABASE MANAGEMENT SYSTEM
(DBMS)
• A database management system (DBMS) is a software package that allows users to set up, use and maintain databases.
• A DBMS is to a database what MS Word is to a text document.
• You can compare a DBMS to a GIS software package.
WHY USE A DBMS?

Reasons for using a DBMS:
• Supports very large datasets
• Maintains data correctness
• Supports concurrent use
• Provides a high-level query language
• Uses a data model
• Provides backup and recovery functions
• Reduces data redundancy
WHY DBMS? – LARGE DATASETS

Supports very large datasets
• A DBMS utilizes many different algorithms to compute the result of a search statement.
• The DBMS will produce a plan of how to execute the query, which is generated by estimating the run times of the different algorithms and selecting the quickest.
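
The plan a DBMS chooses can be inspected. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for a full DBMS; the parcels table, its columns and the index name idx_owner are hypothetical and only serve as illustration.

```python
import sqlite3

# Hypothetical table for illustration; SQLite stands in for a full DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (pid INTEGER PRIMARY KEY, owner TEXT, area REAL)")
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [(i, f"owner{i % 50}", float(i)) for i in range(1, 1001)],
)


def plan(sql):
    """Return the plan steps SQLite chose for a statement, as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is a human-readable description of the step.
    return " | ".join(row[-1] for row in rows)


# Search on the key: the DBMS can go straight to the matching tuple.
plan_by_key = plan("SELECT * FROM parcels WHERE pid = 500")

# Search on an attribute without an index: the whole table is read.
plan_by_owner = plan("SELECT * FROM parcels WHERE owner = 'owner7'")

# After an index is added, the same query gets a different plan.
conn.execute("CREATE INDEX idx_owner ON parcels(owner)")
plan_by_owner_indexed = plan("SELECT * FROM parcels WHERE owner = 'owner7'")

print(plan_by_key)
print(plan_by_owner)
print(plan_by_owner_indexed)
```

The query text is identical before and after the index is created; only the plan selected by the DBMS changes.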
WHY DBMS? - CORRECTNESS

Maintains data correctness.
• The DBMS can check the integrity of the data in many ways.
• Correctness is maintained with integrity constraints.
[Figure: bank account example, captioned "Whoops!"]
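
An integrity constraint can be sketched as follows, assuming Python's built-in sqlite3 module. The accounts table and the rule that a balance may not become negative are hypothetical, chosen to match the bank account illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        account_no INTEGER PRIMARY KEY,
        holder     TEXT NOT NULL,
        -- Integrity constraint (assumed rule): no negative balances.
        balance    REAL NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO accounts VALUES (1, 'A. Holder', 100.0)")

# Withdrawing more than the balance violates the constraint.
try:
    conn.execute("UPDATE accounts SET balance = balance - 250 WHERE account_no = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The stored value is unchanged because the DBMS refused the update.
balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_no = 1"
).fetchone()[0]
print(rejected, balance)
```

The check lives in the database definition, so every application that uses the table is subject to the same rule.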
WHY DBMS? – CONCURRENT USE

Supports concurrent use
• Many users can use the same data at the same time.
• The DBMS makes sure that data can be safely shared without generating conflicts.
• This DBMS function is called concurrency control.
WHY DBMS? - QUERY

Provides a high-level query language.
• A query is a computer program that extracts data from the database that meets the conditions in the query.
• The most common way is through SQL.
• SQL stands for Structured Query Language.
• Queries specify what must be retrieved rather than how it must be retrieved.
[Figure: a user sending a query to the DBMS]
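
The difference between "what" and "how" can be sketched as below, assuming Python's built-in sqlite3 module; the parcels table and its values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (pid INTEGER, owner TEXT, area_size REAL)")
rows = [(2020, "Smith", 850.0), (2003, "Jones", 1200.0), (1462, "Smith", 1500.0)]
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)

# "What": the query only states the condition and the desired order.
declarative = conn.execute(
    "SELECT pid FROM parcels WHERE area_size > 1000 ORDER BY pid"
).fetchall()

# "How": the same result computed by hand, spelling out the loop and the sort.
procedural = sorted((pid,) for pid, _owner, area in rows if area > 1000)

print(declarative)
print(procedural)
```

Both variants give the same answer, but only the hand-written one fixes the retrieval strategy in the program.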
WHY DBMS? – DATA MODEL

Uses a data model.
• A data model is a language for defining data structure and manipulation of the data.
• The most prominent model is the relational data model.
• A relation is a table, which contains a set of tuples.
• Tables contain records or rows, which are called tuples.
• Every tuple in a relation has the same set of attributes.
• Attributes form the columns of a table.
[Figure: a relation with its attributes, tuples and attribute values]
WHY DBMS? - RECOVERY

Provides backup and recovery functions
• Users rely on the data.
• The data must be safeguarded against possible calamities.
• A DBMS has mechanisms for protecting the data.
[Figure: crash, backup, recovery]
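
A backup and recovery cycle might look like the sketch below. It assumes Python's built-in sqlite3 module and its Connection.backup method; the in-memory databases and the parcels data are hypothetical, and a production DBMS offers far more elaborate mechanisms.

```python
import sqlite3

# The "live" database that users rely on.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE parcels (pid INTEGER PRIMARY KEY, area_size REAL)")
live.executemany("INSERT INTO parcels VALUES (?, ?)", [(1462, 1500.0), (2001, 900.0)])
live.commit()

# Backup: copy the complete database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulated calamity: closing an in-memory database discards its contents.
live.close()

# Recovery: rebuild a working database from the backup copy.
restored = sqlite3.connect(":memory:")
backup.backup(restored)
rows_after_recovery = restored.execute(
    "SELECT pid, area_size FROM parcels ORDER BY pid"
).fetchall()
print(rows_after_recovery)
```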
WHY DBMS? - REDUNDANCY

Reduces data redundancy
• A well-designed database takes care of storing single facts only.
• Storing the same fact more than once, a phenomenon known as data redundancy, may lead to contradictions or maintenance problems.
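
How redundancy leads to a contradiction can be sketched as follows, assuming Python's built-in sqlite3 module. All table names, tax identifiers and surnames are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Redundant design: the owner's surname is repeated on every deed.
conn.execute("CREATE TABLE deed_flat (plot INTEGER, tax_id TEXT, surname TEXT)")
conn.executemany(
    "INSERT INTO deed_flat VALUES (?, ?, ?)",
    [(1462, "T-101", "Smith"), (2001, "T-101", "Smith")],
)
# A name change that reaches only one of the copies.
conn.execute("UPDATE deed_flat SET surname = 'Smith-Jones' WHERE plot = 1462")
surnames_flat = conn.execute(
    "SELECT DISTINCT surname FROM deed_flat WHERE tax_id = 'T-101' ORDER BY surname"
).fetchall()

# Single-fact design: the surname is stored once and referenced by key.
conn.execute("CREATE TABLE person (tax_id TEXT PRIMARY KEY, surname TEXT)")
conn.execute("CREATE TABLE deed (plot INTEGER, tax_id TEXT)")
conn.execute("INSERT INTO person VALUES ('T-101', 'Smith')")
conn.executemany("INSERT INTO deed VALUES (?, ?)", [(1462, "T-101"), (2001, "T-101")])
conn.execute("UPDATE person SET surname = 'Smith-Jones' WHERE tax_id = 'T-101'")
surnames_normalized = conn.execute(
    "SELECT DISTINCT p.surname FROM deed AS d JOIN person AS p ON p.tax_id = d.tax_id"
).fetchall()

print(surnames_flat)        # two surnames for one person: a contradiction
print(surnames_normalized)  # one surname, updated in a single place
```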
WHEN TO USE A DBMS?

• Text files are good for text.
• Spreadsheets support some forms of data analysis.
• However, computations on text files and spreadsheets are usually restricted to a single table.
• A DBMS is good at relating values of different nature in multiple tables.
[Figure: a text file, a spreadsheet and a database]
END USERS

• The users of a database system
  • End-user
    • A person who is part of the UoD (universe of discourse), and wants to store, maintain or retrieve data in that context.
    • Does not need to be a database expert, connoisseur or specialist.
    • Can actually also be another computer system with such needs, in case of connected database systems.
  • End-user group
    • Group of end-users with similar or even identical needs with respect to database systems.
KINDS OF END-USERS

• Casual end-user: uses the database every now and then, to answer an incidental question. Examples: mid-level and higher managers.
• Naïve end-user: uses the database regularly and routinely, often via predefined functions (so-called canned transactions) for standard tasks like reservations, bookings, and other standard administrative processes. Examples: desk clerks.
• Sophisticated end-user: uses the database regularly for all kinds of purposes, and masters the full db query and manipulation facilities. Examples: engineers, business process analysts, accountants . . .
DATABASE PROFESSIONALS

• Database administrator (DBA):
  • A person in charge of database housekeeping tasks, such as:
    • Ensuring the system continues to perform optimally.
    • Watching over data correctness and data completeness.
    • Ensuring that end-users can do what they need to do, and cannot do what they are not supposed to do.
    • In general: optimize and guarantee the continued use of the valuable data resource.
  • Must be a database expert, connoisseur and specialist, and a good communicator too.
DATABASE PROFESSIONALS

• Database designer:
• A person, normally operating in a team of designers, who develops a
database system.
• Discovers the requirements for the system through interviews with future users and the organization's management.
• Documents the design decisions.
• Proposes (alternative) solutions for the information needs.
• Implements solutions.
• Prepares end-user documentation: manuals.
DATABASE PROFESSIONALS

• Systems analyst / Application programmer:


• Works out user requirements.
• Develops canned transactions for naïve
end-users.
• Develops user interfaces to facilitate
and ease database use.
HOW DO END-USERS INTERACT
WITH THE DATABASE?
[Figure: naive end-users, sophisticated end-users, casual users and database professionals interacting with the database]
SPATIAL DATABASE

• A database that combines spatial data with thematic (attribute) data is called a spatial database.
• Spatial data is stored as a geometry data type.
• A spatial database is not constrained by the need to present data visually.
[Figure: visual representation and spatial database]
SPATIAL DATABASE

• With vector representation, every spatial object (point, line, polygon) has a unique identifier.
• This identifier, called the Object ID, can be used to link the spatial entity with its spatial data.
• In this example, Parcel is a table with thematic and spatial attributes.
• Here, Location is a spatial attribute.
• With raster representation, every raster cell has a link to the attribute (thematic) data.
[Figure: spatial data (parcels with Object IDs 2020, 2003, 2323, 1462 and 2001) linked to the Parcel table through the Object ID; Location is the spatial attribute]
SPATIAL DBMS

• A spatial DBMS will also provide special functions for exploring spatial relationships such as 'area', 'buffer', 'distance', 'adjacency', etc.
• Example query: Who are the owners younger than 40 years of the parcels adjacent to the parcel with location number 1462?
• 'Adjacent' is a spatial property and cannot be determined from the thematic data!
[Figure: parcels 2020, 2003, 2323, 1462 and 2001]
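
The adjacency question can be sketched with a toy example, assuming Python's built-in sqlite3 module. A real spatial DBMS ships its own geometry types and operators; here parcels are simplified to axis-aligned rectangles, and the function adjacent, the coordinates, the owner names and the ages are all hypothetical.

```python
import sqlite3


def adjacent(ax1, ay1, ax2, ay2, bx1, by1, bx2, by2):
    """Return 1 if two axis-aligned rectangles share a boundary segment, else 0."""
    x_overlap = min(ax2, bx2) - max(ax1, bx1)
    y_overlap = min(ay2, by2) - max(ay1, by1)
    shares_vertical_edge = x_overlap == 0 and y_overlap > 0
    shares_horizontal_edge = y_overlap == 0 and x_overlap > 0
    return int(shares_vertical_edge or shares_horizontal_edge)


conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL (a stand-in for a spatial operator).
conn.create_function("adjacent", 8, adjacent)

conn.execute(
    "CREATE TABLE parcels (pid INTEGER PRIMARY KEY, "
    "xmin REAL, ymin REAL, xmax REAL, ymax REAL)"
)
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?, ?, ?)",
    [
        (1462, 0, 0, 10, 10),
        (2001, 10, 0, 20, 10),   # shares an edge with 1462
        (2003, 0, 10, 10, 20),   # shares an edge with 1462
        (2020, 10, 10, 20, 20),  # touches 1462 only at a corner
        (2323, 30, 0, 40, 10),   # far away
    ],
)
conn.execute("CREATE TABLE owners (name TEXT, age INTEGER, pid INTEGER)")
conn.executemany(
    "INSERT INTO owners VALUES (?, ?, ?)",
    [("Asha", 35, 2001), ("Baraka", 52, 2003), ("Neema", 28, 2020), ("Juma", 31, 2323)],
)

# Thematic condition (age) combined with a spatial condition (adjacency).
young_neighbours = conn.execute(
    """
    SELECT o.name
    FROM owners AS o
    JOIN parcels AS p ON p.pid = o.pid
    JOIN parcels AS t ON t.pid = 1462
    WHERE o.age < 40
      AND p.pid <> t.pid
      AND adjacent(p.xmin, p.ymin, p.xmax, p.ymax,
                   t.xmin, t.ymin, t.xmax, t.ymax) = 1
    ORDER BY o.name
    """
).fetchall()
print(young_neighbours)
```

The age filter uses thematic data only, while the adjacency test needs the geometry of both parcels.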
SPATIAL DBMS

A spatial DBMS will provide:
1. Spatial data types
2. Spatial indexing
3. Spatial join
4. Spatial operators
• All these will be integrated in the relational model.

"Find all the Indian restaurants within 2 km of my hotel":

SELECT R.name
FROM Restaurants AS R,
     Hotels AS H
WHERE R.type = 'Indian'
  AND H.name = 'Hilton'
  AND Intersect(R.Geometry, Buffer(H.Geometry, 2))

• The part of the query shown in red on the slide creates a spatial join between restaurants and hotels.
• 'Geometry' carries the spatial data.
• 'Intersect' and 'Buffer' are spatial operators.
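
The restaurant query can be imitated without a spatial DBMS. The sketch below assumes Python's built-in sqlite3 module, point geometries stored as planar x/y coordinates in kilometres, and a hypothetical function within_distance that plays the role of Intersect combined with Buffer; the restaurant names and positions are invented.

```python
import math
import sqlite3


def within_distance(x1, y1, x2, y2, radius):
    """Return 1 if two points are at most `radius` apart (planar distance)."""
    return int(math.hypot(x1 - x2, y1 - y2) <= radius)


conn = sqlite3.connect(":memory:")
# For point data, "intersects a buffer of 2 km" equals "lies within 2 km".
conn.create_function("within_distance", 5, within_distance)

conn.execute("CREATE TABLE Restaurants (name TEXT, type TEXT, x REAL, y REAL)")
conn.execute("CREATE TABLE Hotels (name TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO Restaurants VALUES (?, ?, ?, ?)",
    [
        ("Taj", "Indian", 1.0, 1.0),          # about 1.4 km from the hotel
        ("Spice Route", "Indian", 5.0, 5.0),  # about 7.1 km from the hotel
        ("Roma", "Italian", 0.5, 0.5),        # close by, but not Indian
    ],
)
conn.execute("INSERT INTO Hotels VALUES ('Hilton', 0.0, 0.0)")

nearby = conn.execute(
    """
    SELECT R.name
    FROM Restaurants AS R, Hotels AS H
    WHERE R.type = 'Indian'
      AND H.name = 'Hilton'
      AND within_distance(R.x, R.y, H.x, H.y, 2)
    ORDER BY R.name
    """
).fetchall()
print(nearby)
```

The last condition is the spatial join: it relates tuples of the two tables through their positions rather than through a shared attribute value.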
SPATIAL DBMS

• A S(patial)DBMS is a software package that:
  • can work with an underlying DBMS (Oracle, SQL Server, Informix, DB2, MySQL, PostgreSQL, etc.)
  • supports spatial data models (vector, raster), e.g. point, line, or polygon
  • provides special functions for querying and manipulating spatial data using structured query language (SQL), e.g. SELECT p.* FROM parcels AS p WHERE Area(p.geom) > 1000
  • supports spatial indexing, efficient algorithms for processing spatial operations, and domain-specific rules for query optimization.
GIS AND DBMS

• Large scale GIS applications will require a DBMS for data storage and a
GIS for spatial functionality.

• DBMS focuses on storage, querying and sharing large data sets.

[Figure: a GIS application on top of a DBMS (Oracle, SQL Server, IBM DB2, Informix) that stores tables with columns such as GDO_GID, NAM, ID and GDO_GID, X1, Y1]
THE RELATIONAL
DATA MODEL
OVERVIEW OF THE LECTURE

• Essential terminology used in relational data model


• Domain, attribute, relation

• Constraints
• Keys
RELATIONAL DATA MODEL


WHY STUDY RELATIONAL DATA MODEL?

• The introduction of the relational data model (by Ted Codd in 1970) is considered the most important event in the history of the database field.
• It is based on a simple and uniform data structure (the relation).
• It is based on predicate logic and set theory.
RELATIONAL DATA MODEL


WHY STUDY RELATIONAL DATA MODEL?

• The DBMS used in most GIS makes use of the relational data model.
[Figure: GIS software architecture with user interface (web browser, GIS application GUI), application logic (GIS server/desktop application) and data storage (GIS files, DBMS)]
RELATIONAL DATA MODEL


WHY STUDY RELATIONAL DATA MODEL?

• A recent development in DBMS is the object-relational data model:
  • a relational model that supports complex (also user-defined) data types, e.g. geometry
  • allows users to define functions to use with these data types
  • Examples of object-relational DBMS: PostgreSQL, Oracle
TERMINOLOGY
UNIVERSE OF DISCOURSE

• We always aim at representing only a part of the real world.
• The universe of discourse is the part of the real world that is of interest (e.g. to users of a database)
  • e.g.: a system that will allow analysis of crop production and yields in relation to the population constitution in different countries of the world.
• We use data models for representing the universe of discourse and storing that representation in a database.
TERMINOLOGY
DATA MODEL

A data model is an integrated collection of:
▪ Data structuring primitives
▪ Rules of how to structure
▪ Mechanisms to handle the data in a database.

In other words, a data model is a toolbox that allows us to create and manipulate databases.
TERMINOLOGY
RELATIONAL DATA MODEL

The relational data model is an integrated collection of:
• data structuring primitives = attributes, tuples, and relations,
• rules of how to structure = data definition language,

CREATE TABLE Productions(
    cid single,
    crop varchar(255),
    annum integer,
    score integer,
    quality varchar(255)
)
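
The data definition statement can be tried out as sketched below, assuming Python's built-in sqlite3 module rather than the DBMS used in the lecture. One deviation is an assumption of this sketch: cid is declared as INTEGER instead of single, since the relation schema for Productions elsewhere in the lecture lists it as an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: name of the relation, its attributes and their domains.
conn.execute(
    """
    CREATE TABLE Productions (
        cid     INTEGER,       -- assumption: INTEGER instead of the slide's 'single'
        crop    VARCHAR(255),
        annum   INTEGER,
        score   INTEGER,
        quality VARCHAR(255)
    )
    """
)

# Ask the DBMS to describe the structure it has just stored.
# Each row of table_info is (position, name, declared type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Productions)")]
for name, declared_type in columns:
    print(name, declared_type)
```

At this point the relation has a schema but no tuples; filling and querying it is the job of the data manipulation language.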
TERMINOLOGY
RELATIONAL DATA MODEL

The relational data model is an integrated collection of:
• data structuring primitives = attributes, tuples, and relations,
• rules of how to structure = data definition language, and
• mechanisms to handle the data in a database = data manipulation language.

SELECT *
FROM Productions AS p, Countries AS c
WHERE c.ID = p.cid AND p.crop = 'Potatoes' AND c.CNAME = 'Slovakia'
TERMINOLOGY
THE LANGUAGE USED IN RELATIONAL DATA MODEL

• Structured Query Language (SQL), the relational database language:
  • ISO standard (from 1987)
  • Powerful and natural language based on relational calculus (specific version of predicate logic and set theory)
• Which countries produced potatoes last year?

SELECT DISTINCT c.CNAME
FROM Countries AS c, Productions AS p
WHERE c.ID = p.cid AND p.crop = 'Potatoes' AND
      p.annum = 2009 AND p.score > 0
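
The potato query can be run end to end as in the sketch below, assuming Python's built-in sqlite3 module. The table and attribute names follow the slides, but every row of data is invented for illustration and does not come from the FAO database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Countries (ID INTEGER PRIMARY KEY, CNAME TEXT)")
conn.execute(
    "CREATE TABLE Productions "
    "(cid INTEGER, crop TEXT, annum INTEGER, score INTEGER, quality TEXT)"
)

# Made-up sample data (not real production figures).
conn.executemany(
    "INSERT INTO Countries VALUES (?, ?)",
    [(1, "Slovakia"), (2, "Tanzania"), (3, "Iceland")],
)
conn.executemany(
    "INSERT INTO Productions VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Potatoes", 2009, 500, "*"),
        (1, "Potatoes", 2008, 450, "*"),  # wrong year
        (2, "Potatoes", 2009, 700, "F"),
        (2, "Maize", 2009, 900, "*"),     # wrong crop
        (3, "Potatoes", 2009, 0, "M"),    # nothing produced
    ],
)

potato_countries = conn.execute(
    """
    SELECT DISTINCT c.CNAME
    FROM Countries AS c, Productions AS p
    WHERE c.ID = p.cid AND p.crop = 'Potatoes' AND
          p.annum = 2009 AND p.score > 0
    ORDER BY c.CNAME
    """
).fetchall()
print(potato_countries)
```

The condition c.ID = p.cid is what relates the two tables; the remaining conditions select the tuples of interest.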
TERMINOLOGY
TUPLES

▪ A table or relation is a collection of tuples (or records).
▪ A tuple has a fixed number of named fields.
▪ An attribute is a named field of a tuple.
[Figure: a relation with four tuples or records. All the tuples of this relation contain three attributes: PId, Location and AreaSize.]
TERMINOLOGY
ATTRIBUTES

▪ Each tuple has a value (attribute value) for an attribute.
▪ This value must be atomic, i.e. it holds no more than one value.
▪ An attribute’s domain is a set of values, for example string, integer, real or date.
▪ The attribute’s domain describes the type of data that can be stored in an attribute.
[Figure: the attribute value is “1462”; the attribute domains are number, polygon and number]
TERMINOLOGY
DOMAIN

• Domain
  • A set of atomic values, e.g.: the domain of the real numbers, the domain of the dates, the domain of character strings with maximal length 10…
• In database technology, a domain is simply a data type:
  • a system-defined type, e.g.: INTEGER, CHAR, VARCHAR, DATE/TIME . . .
  • a user-defined type, e.g.: there are only three possible values (*, F and M) for the ‘quality’ attribute in the Productions table. This is known as an enumerated type.
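
An enumerated domain can be approximated as sketched below, assuming Python's built-in sqlite3 module. SQLite has no dedicated enumerated type, so a CHECK constraint is used here as a stand-in; the inserted rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Productions (
        cid     INTEGER,
        crop    VARCHAR(255),
        annum   INTEGER,
        score   INTEGER,
        -- Enumerated domain: only the three values named on the slide.
        quality VARCHAR(5) CHECK (quality IN ('*', 'F', 'M'))
    )
    """
)

# A value from the domain is accepted.
conn.execute("INSERT INTO Productions VALUES (1, 'Potatoes', 2009, 500, 'F')")

# A value outside the domain is refused by the DBMS.
try:
    conn.execute("INSERT INTO Productions VALUES (1, 'Maize', 2009, 300, 'X')")
    outside_domain_rejected = False
except sqlite3.IntegrityError:
    outside_domain_rejected = True

stored_rows = conn.execute("SELECT COUNT(*) FROM Productions").fetchall()[0][0]
print(outside_domain_rejected, stored_rows)
```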
TERMINOLOGY
RELATION

• Relation, tuple and attribute in the Productions relation in FAOcrops.mdb
[Figure: the Productions relation]
TERMINOLOGY
RELATION

• A relation always consists of two parts:
  • relation schema: a bracketed list of attributes with their domains, e.g. Productions (cid: integer, crop: varchar(255), annum: integer, score: integer, quality: varchar(5)), and
  • relation instance: a set of tuples, e.g. the contents of the relation on 18.10.2010 – 15:47.
RELATIONS

• Relational DBMS products store data about entities in relations, which are a special type of table
• A relation is a two-dimensional table that has the following characteristics:
• Rows contain data about an entity
• Columns contain data about attributes of the entity
• All entries in a column are of the same kind
• Each column has a unique name
• Cells of the table hold a single value
• The order of the columns is unimportant
• The order of the rows is unimportant
• No two rows may be identical
ALTERNATIVE TERMINOLOGY

• Although not all tables are relations, the terms table and relation are normally used interchangeably
• The following sets of terms are equivalent:
  • Table = Relation
  • Row = Record = Tuple
  • Column = Field = Attribute
THE RELATIONAL DATA MODEL

• For the relational data model, the structures used to define the database are:
  • Relations
  • Tuples
  • Attributes
  • Attribute domains
• Relations are commonly known as tables.

Example:
Three tables or relations called PrivatePerson, Parcel and TitleDeed.
THE RELATION SCHEMA

• When a relation is created we have to indicate:
  1. Name of the relation
  2. Attributes
  3. Domain for each attribute
• The definition of a relation is also known as the relation schema.

Relation schema:
PrivatePerson (TaxId: string, Surname: string, Birthdate: date)
[Figure: PrivatePerson is the name of the relation; TaxId, Surname and Birthdate are the attributes; string and date are the attributes’ domains]
THE RELATIONAL DATA MODEL
• The relation schemas of all tables (relations) together make up the database schema.
• The database schema is an important part of the database design.

Database schema:
PrivatePerson (TaxId: string, Surname: string, Birthdate: date)
Parcel (Pid: number, Location: polygon, AreaSize: number)
TitleDeed (Plot: number, Owner: string, DeedDate: date)
RELATIONS
RELATIONAL DATABASE

Relational database:
• A collection of relations, where each relation stores facts of a certain type.

A database schema:
• A collection of relation schemata (with no duplicate relation names), e.g.: FAOcrop (Productions, Yields, Countries, Population)
RELATIONS
DATABASE INSTANCE

• A database instance for a database schema is a collection of relation instances, one for each relation in the database schema, e.g.: FAOcrops.mdb on 18.10.2010 – 15:47
[Figure: four relation instances with 373729, 6930, 231 and 6930 records]
KEYS
• The DBMS must support quick searches amongst many tuples.
• This is why it uses the notion of key.
• A key of a relation comprises one or more attributes.
• A value for this/these attributes uniquely identifies a tuple.
[Figure: a key of one attribute, and a key that is a combination of two attributes]
KEY

• When a key is comprised of two attributes, a single attribute is not enough to uniquely identify the tuple.
• In our example, an owner can own many plots, so owner alone is not enough.*
• A plot can be owned by many people, so plot alone is also not enough.*
[Figure: the same owner owns two different plots]

*The relationship between Person and Parcel is called a many-to-many relationship.
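
A two-attribute key can be sketched as follows, assuming Python's built-in sqlite3 module. The TitleDeed attributes follow the slides; the tax identifiers and dates are invented, and dates are stored as text because SQLite has no separate date type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE TitleDeed (
        Plot     INTEGER,
        Owner    TEXT,
        DeedDate TEXT,
        PRIMARY KEY (Plot, Owner)  -- the key is the combination of both attributes
    )
    """
)

conn.executemany(
    "INSERT INTO TitleDeed VALUES (?, ?, ?)",
    [
        (1462, "T-101", "2005-03-01"),
        (2001, "T-101", "2010-07-15"),  # same owner, different plot: allowed
        (1462, "T-202", "2005-03-01"),  # same plot, different owner: allowed
    ],
)

# Repeating an existing (Plot, Owner) combination violates the key.
try:
    conn.execute("INSERT INTO TitleDeed VALUES (1462, 'T-101', '2012-01-01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

deed_count = conn.execute("SELECT COUNT(*) FROM TitleDeed").fetchall()[0][0]
print(duplicate_rejected, deed_count)
```

Neither Plot nor Owner is unique on its own in this data; only the pair identifies a tuple.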
PRIMARY KEY

• A primary key is a key selected as the primary means of identifying rows in a relation:
• There is one and only one primary key per relation
• The primary key may be a composite key
• The ideal primary key is short, numeric and never changes
PRIMARY KEY
• The primary key of a relation is indicated in the database schema by an underline of the attribute name.
• All the attributes that together make up the primary key are underlined.

PrivatePerson (TaxId: string, Surname: string, Birthdate: date), primary key: TaxId
Parcel (Pid: number, Location: polygon, AreaSize: number), primary key: Pid
TitleDeed (Plot: number, Owner: string, DeedDate: date), primary key: Plot and Owner
FOREIGN KEY
• A tuple can refer to another tuple by storing that other tuple’s key value.
• This makes it possible to link the two relations that store the same key value.
• The attribute is called a foreign key when it refers to a primary key.

The table TitleDeed has a foreign key in its attribute Plot. This attribute refers to the primary key of the Parcel relation.
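
The link between TitleDeed and Parcel can be sketched as below, assuming Python's built-in sqlite3 module. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON. The polygon attribute Location is left out because SQLite has no geometry type, and all data values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

conn.execute("CREATE TABLE Parcel (Pid INTEGER PRIMARY KEY, AreaSize REAL)")
conn.execute(
    """
    CREATE TABLE TitleDeed (
        Plot     INTEGER REFERENCES Parcel (Pid),  -- foreign key to Parcel
        Owner    TEXT,
        DeedDate TEXT,
        PRIMARY KEY (Plot, Owner)
    )
    """
)

conn.execute("INSERT INTO Parcel VALUES (1462, 1500.0)")
conn.execute("INSERT INTO TitleDeed VALUES (1462, 'T-101', '2005-03-01')")

# A deed for a parcel that does not exist is refused.
try:
    conn.execute("INSERT INTO TitleDeed VALUES (9999, 'T-101', '2011-01-01')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

# The shared key value links the two relations in a query.
linked = conn.execute(
    """
    SELECT d.Owner, p.Pid, p.AreaSize
    FROM TitleDeed AS d, Parcel AS p
    WHERE d.Plot = p.Pid
    """
).fetchall()
print(dangling_rejected, linked)
```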
