
2021 L4 GGY283 Data Models 1


2021/03/26

GGY 283
Theme 2: Data models
L4 – Data models Part 1

©2021

Introduction to GIS ©2021 1

Theme 2 – Relevant texts


Prescribed
1. Bolstad - Chapter 2

Recommended
1. Di Biase – Chapter 1
2. De Smith et al. – Chapter 2
3. Essentials of GIS - pp 61-70, 76-93, 104-106.
4. GIS Commons. Chapter 1 and Chapter 2 no. 1-4
5. Olaya, V. – p25 to 36;
6. Sutton et al. - Topics 2, 3, 5 and 6.


Spatial Data Models Part 1


Study goals (L4)
• Explain in broad terms what a GIS model is and how it represents real-world
entities on a computer.
• Explain how coordinates are used to describe the location of features on the earth.
• Discuss the terms “georeferenced” and “scale” in relation to spatial data.
• Name and describe the two common spatial data models used in a GIS.
• Explain how a vector data model represents real world entities in a GIS.
• Discuss the importance of vector topology and be able to interpret and briefly discuss,
and differentiate between, basic topological constructs.
• Scale (level) of measurement of attribute data: Explain how attribute data are used to
describe spatial features in a GIS and differentiate between the scale/levels of
measurement.


GIS as a Real-world model


• To bring the real world into GIS, one has to make use of
simplified models of the real world.
• Uniform phenomena can be classified and described
in the real-world model, then converted into a data
model by applying elements of geometry and quality.
• The data model is stored in a database that can handle
digital data, from where the data can be presented in
selected ways, e.g.
– Based on a categorized classification
– Based on certain attributes
– Based on relationships …
[Figure: a real-world example next to its simplified model]

Data Models
A model facilitates the study of a selected area of application by reducing the number of
complexities considered. (It simplifies reality and limits unnecessary detail)
Computers and digital data models
• Unlike humans, computers cannot “learn” all the characteristics of manholes, property
lines, lakes etc.
• What computers can do is manipulate geometric objects, such as points, lines and
polygons, or manage cells in rows and columns.

To use GIS, the real world must be abstracted using a model of presentation, e.g.
Vector points / lines / polygons
OR
Raster cells


Example:
How is a “Real-world” model established in GIS?
• The basic carrier of information is the entity, which is defined as a
real-world phenomenon that is not divisible into phenomena of the
same kind.

• An entity in a GIS:

Real world   Real world model   Data model       Database         Maps/reports
             Entity:            Object:          Object:          Symbols, lines, text …
             - Type             - Type           - Type
             - Attribute        - Attribute      - Attribute
             - Relationship     - Relationship   - Relationship
                                - Geometry       - Geometry
                                - Quality        - Quality


Typical vector data model: Entities and Objects


• Objects are characterized by:
– Type
– Attributes
– Geometry
– Relations
– Quality
[Figure: one entity represented by several objects]
• A single entity may comprise several objects.
– E.g. a road may be represented as a compilation of several connected road sections.

• An object may belong to a class (feature class), e.g. a line feature class for roads.
• A class contains a set of objects with specific themes or types, e.g. all objects are types of roads.
• Each object has attributes, e.g. for roads: ID number, type, route number, length, etc.
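
As a minimal sketch (not taken from the prescribed texts), the idea that one entity may be stored as several objects in a feature class can be written in Python. The class name RoadSection, the helper sections_of and all attribute values are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass
class RoadSection:
    """One object in a hypothetical 'roads' line feature class."""
    object_id: int          # unique ID number
    road_type: str          # attribute: type
    route_number: str       # attribute: route number
    geometry: list          # ordered (x, y) coordinate pairs
    quality: str = "unknown"


# One real-world entity (a road) stored as two connected objects.
roads = [
    RoadSection(1, "national", "N1", [(0.0, 0.0), (3.0, 4.0)]),
    RoadSection(2, "national", "N1", [(3.0, 4.0), (6.0, 8.0)]),
]


def sections_of(route_number, feature_class):
    """Return every object that belongs to the same road entity."""
    return [obj for obj in feature_class if obj.route_number == route_number]


n1 = sections_of("N1", roads)
```

Here the two sections share the coordinate (3.0, 4.0), which is what makes them connected road sections of the same entity.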


Data models: Object relations


• Relations between objects are often encountered.
– Pertains / belongs to
– Comprises / contains
– Located in / on
– Borders on, etc.
• Relations may be calculated from:
– The coordinates of an object, e.g. lines that intersect.
– Object structure, such as the beginning and end points of a line, lines that
form a polygon, etc.
– Relations that can be entered as attributes, such as the division of a country
into provinces.
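
A relation calculated from coordinates, such as "lines that intersect", can be sketched as follows. This is an illustrative test for two straight line segments only; the function names are assumptions made for this example and are not part of any GIS package.

```python
def orientation(p, q, r):
    """Cross product of (q - p) and (r - p): > 0 left turn, < 0 right turn, 0 collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segment a-b properly crosses segment c-d.

    Touching end points and collinear overlaps are deliberately not counted.
    """
    d1 = orientation(a, b, c)
    d2 = orientation(a, b, d)
    d3 = orientation(c, d, a)
    d4 = orientation(c, d, b)
    return d1 * d2 < 0 and d3 * d4 < 0


crossing = segments_cross((0, 0), (2, 2), (0, 2), (2, 0))
parallel = segments_cross((0, 0), (1, 0), (0, 1), (1, 1))
```

The two end points of one segment must lie on opposite sides of the other segment, and vice versa, for the segments to cross.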


Geographic data components (Example – vector data)


Geographic data objects have geometry. Each geometric object is linked to attributes.

Geographic data
– Geometry
   • Points
   • Lines
   • Polygons
– Attributes
   • Qualitative data values: Nominal, Ordinal
   • Quantitative data values: Interval, Ratio

Data models: Data quality


• The true value of any description of reality depends on the
quality of all the data it contains: e.g. the quality of its
geometry, attributes and relations.
• Accurate geometric data obviously describe reality more
faithfully than data of a lower accuracy.
• Recently updated data are preferable to older data (temporal
factors).
Note:
In the Practical 2 crowdsourcing exercise, you experienced one example of data collection (Task 1) and the associated
data quality concerns (Task 2).
We will discuss various concepts around data quality in more detail later in the course.


Data *Abstraction (from reality to a model)


When collecting or displaying the real world features in a GIS, it
is important to identify what is necessary and what is not.

• Scale matters!
• Intended use matters!

* Abstraction refers to the process of taking away, or
hiding, certain characteristics of something in order to
simplify it by showing only the essential bits.

Important properties of spatial data

1. Georeferenced
- It is referenced to a geographic space.

2. Scale
- It can be collected and represented at a variety of scales.


1. First important property of geospatial data (It is georeferenced)


In GIS, the data is referenced to a geographic space (a location).
• Data is registered to an accepted geographic coordinate system (mostly on/above/below the
Earth’s surface).
Note: When georeferencing (placing) features on a screen or map – there must be a transformation of
coordinate locations from the earth’s curved surface onto a flat surface – this is referred to as map
projection (more detail later).


2. Second important property of geospatial data


Data can be collected / represented at a variety of
scales:

• Small scale representing large area (less detail)
– 1:1,000,000
– Generalization and symbolization required
• Large scale representing small area (more detail)
– 1:10,000
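
The difference between the two example scales can be made concrete with a short calculation. The sketch below assumes the usual reading of a representative fraction (1 unit on the map equals the scale denominator in the same units on the ground); the function name is made up for this example.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (cm) to a ground distance (m)."""
    return map_distance_cm * scale_denominator / 100.0


small_scale = ground_distance_m(1.0, 1_000_000)  # 1 cm on a 1:1,000,000 map
large_scale = ground_distance_m(1.0, 10_000)     # 1 cm on a 1:10,000 map
```

One centimetre on the small-scale map covers 10 km on the ground, while one centimetre on the large-scale map covers only 100 m, which is why the large-scale map can show more detail.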


Example: Map scales

Source and additional reading:
http://myscienceschool.org/index.php?/archives/11783-WHAT-IS-THE-SCALE-OF-A-MAP.html

Test your understanding of scale:


[Figure: Map A and Map B of the Falkland Islands. Source: OpenStreetMap, www.openstreetmap.org]
1. Which map has more detail? B
2. Which map has fewer line segments? A
3. Which map is more suitable for a large scale map application? B
4. Which map will take up more data storage space? B
Note the spatial data collection/presentation vs scale – the number of line segments vs faithful approximation.

Discrete versus continuous spatial data


Spatial data can be:
– Discrete: represents identifiable discrete real world features with well
defined boundaries, e.g. roads, buildings, parks, provinces, etc.
– Continuous: represents a continuous phenomenon for which it is difficult
to identify exact boundaries, e.g. elevation, temperature, air pressure …

Municipal Demarcation Board http://www.demarcation.org.za/


TWO Spatial Data Models used to represent the real world in GIS

In a GIS, spatial data is mainly stored in the following two representation models:
1. Vector (Points, lines & polygons)
2. Raster cells (Tessellation)

Raster, not Rasta!


Example: Vector versus Raster data models

Source: Bolstad, Chapter 2


Vector Data types


Vector Data:
1 Points (Zero-dimensional)
• Uses single coordinate pair to represent the location of an entity.
• Example: accident location, gas wells.
• Attribute data attached to each point.

2 Lines (One-dimensional)
• Linear features, often represented by arcs.
• Most often represented as an ordered set of coordinate pairs.
• Curved line -> collection of short straight line segments.

3 Polygons (Two-dimensional)
• Has interior region.
• Lake, province, building, etc.
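
The three vector types can be sketched as plain coordinate structures. This is an illustration only: the coordinates are made up, and the helper functions assume planar (projected) coordinates rather than latitude and longitude.

```python
import math

# Hypothetical coordinates in a projected (planar) coordinate system.
point = (2.0, 5.0)                                           # zero-dimensional
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                 # ordered coordinate pairs
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # ring, closed implicitly


def line_length(coords):
    """Sum the lengths of the short straight segments that make up a line."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of the interior region, using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


length = line_length(line)
area = polygon_area(polygon)
```

A point has neither length nor area, a line has length only, and a polygon has an interior region whose area can be computed.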


Vector data types (example)


– Examples

Station - point

Rail line - line

Stadium - polygon


Vector data – Impact of the Scale of interest

City - Point

Scale of interest determines the choice of geometry type – e.g. on
a smaller scale Bloemfontein (Mangaung Metro) may be represented
with a vector point.

Source: AfriGIS http://maps.afrigis.co.za


Vector data - Scale of interest (cont.)

City - Polygon

Scale of interest determines the choice of geometry type,
e.g. on a larger scale a polygon may be used to represent the
city of Bloemfontein (Mangaung Metro).

Source: AfriGIS http://maps.afrigis.co.za


Vector data – discrete versus continuous


Although vector data are mostly used to show discrete phenomena, they can also be used to
represent continuous geographic data. A function / interpolation determines the value at
each point in space, e.g. isolines (like contour lines, isohyets, isotherms, isobars, etc.)
[Figure: contours and isohyets]

https://www.researchgate.net/figure/304201389_fig1_Figure-4-Climatic-conditions-of-the-Southern-African-coastline-The-Kalkkop-Crater-falls


Vector data geometry and attributes


Vector geometry is made up of one or more interconnected vertices. A vertex describes a position in space using
an x and a y axis (adding a ‘z’ axis, e.g. elevation, is optional).

Source: QGIS documentation @ https://docs.qgis.org/2.8/en/docs/gentle_gis_introduction/vector_data.html



Vector features and Attribute tables


• Vector objects (features) are associated with non-spatial characteristics.
• A data table is used to organize the attributes.
[Figure: provincial vector data layer linked to entries in the attribute data]


• The attribute table has unique values in the ‘identifier’ column, e.g.
the FID column below.
• This column is usually listed first and used to identify each feature
(i.e. province) uniquely.
• Other attributes are organised in fields (columns) with field names
and values.
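
The role of the identifier column can be sketched with a small attribute table held as a list of records. The field names NAME and CAPITAL and the three rows are illustrative assumptions, not the contents of the table shown on the slide.

```python
# Each record (row) describes one feature; each key is a field (column).
attribute_table = [
    {"FID": 0, "NAME": "Gauteng", "CAPITAL": "Johannesburg"},
    {"FID": 1, "NAME": "Free State", "CAPITAL": "Bloemfontein"},
    {"FID": 2, "NAME": "Limpopo", "CAPITAL": "Polokwane"},
]


def index_by_fid(table):
    """Build a lookup from identifier to record, rejecting duplicate identifiers."""
    index = {}
    for record in table:
        fid = record["FID"]
        if fid in index:
            raise ValueError(f"duplicate identifier: {fid}")
        index[fid] = record
    return index


features = index_by_fid(attribute_table)
free_state = features[1]
```

Because every FID value is unique, a feature selected on the map can be matched to exactly one row in the table.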


Relationship between Spatial & Attribute Data (Example)
(Diagram from Bolstad Pg. 32)

• Point object showing the approximate location of fire hydrant entity
• The fire hydrants have characteristics (attributes) which are organized in fields (columns)
• In this example a vector point layer (Feature class) has objects (fire hydrants) which
are arranged as records (rows) in the Attribute table
• Each cell has an attribute value

Image source: Bolstad Chapter 2



Vector Topology
Topology refers to data structures which retain certain
geometric properties and relationships when the forms are bent,
stretched or undergo similar transformations.
It involves a set of rules on how objects relate to each other.
[Figure: network A and its transformed version B, each with nodes X, Y and Z and arcs 1, 2 and 3]
E.g. Figure B clearly underwent transformation from Fig A, but the relations
between the components are still the same, e.g. Arc 1
still connects vertices X and Y and the arcs 2 & 3 still
both connect to node Z, etc.
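
The point of the example can be sketched in code: stretching changes the coordinates, but the table that records which arc connects which nodes does not change. The coordinates are made up, and the assignment of arc 2 to Y-Z and arc 3 to X-Z is an assumption; the slide text only states that arc 1 connects X and Y and that arcs 2 and 3 both connect to Z.

```python
# Node coordinates for figure A (hypothetical values).
nodes_a = {"X": (0.0, 2.0), "Y": (4.0, 2.0), "Z": (2.0, 0.0)}

# Topology: which nodes each arc connects. No coordinates are stored here.
arcs = {1: ("X", "Y"), 2: ("Y", "Z"), 3: ("X", "Z")}


def stretch(nodes, sx, sy):
    """Transform the geometry by scaling x and y with different factors."""
    return {name: (x * sx, y * sy) for name, (x, y) in nodes.items()}


def arcs_at(node, arc_table):
    """List the arcs that connect to a given node."""
    return sorted(arc for arc, ends in arc_table.items() if node in ends)


nodes_b = stretch(nodes_a, 3.0, 0.5)   # figure B: same network, distorted shape
at_z = arcs_at("Z", arcs)              # identical for figure A and figure B
```

The geometry in nodes_b differs from nodes_a, yet every question about connectivity is answered from the unchanged arcs table.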


Topological versus non-topological data structures


Topological data structure:
– Stores topological characteristics (spatial relationships) of spatial features and ensures
that, for instance, lines that should connect or meet at a specific point, do so.
Non-topological data structure (also referred to as “Spaghetti” structures):
– Does not store such relationships and errors may not be easily detected.

Source: Bolstad, Chapter 2


What are the main advantages of a topological data structure in a vector data set?

• Topology rules help in Error Detection, by identifying
– open polygons
– unlabeled polygons
– slivers
– polygons that cannot exist next to each other, etc.
• Topology rules are important for Network Modeling
– Flow of resources, routing, faster adjacency analysis, …
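
One of the error checks listed above, finding open polygons, can be sketched with an arc-node table. The rule used here (a node touched by only one arc is a dangling end) is a simplified assumption for illustration; real GIS software applies more elaborate topology rules.

```python
from collections import Counter


def dangling_nodes(arc_table):
    """Return nodes that are touched by exactly one arc end."""
    counts = Counter(node for ends in arc_table.values() for node in ends)
    return sorted(node for node, count in counts.items() if count == 1)


# Hypothetical boundaries: arc number -> (from-node, to-node).
closed_ring = {1: ("a", "b"), 2: ("b", "c"), 3: ("c", "a")}
open_ring = {1: ("a", "b"), 2: ("b", "c"), 3: ("c", "d")}

closed_errors = dangling_nodes(closed_ring)
open_errors = dangling_nodes(open_ring)
```

The closed ring has no dangling nodes, while the open ring reports the two ends that fail to meet, which a spaghetti structure would not reveal.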


Example: Network topology (one-dimensional)


• Topological relationships among points and polylines
• Nodes + edge primitives (non-intersecting)
• Network topology requires nodes at the start and end
of all edges (lines), but lines may cross without them.
• Nodes may (or may not) occur where lines cross, but
nodes are essential at intersections for network
analysis like routing.
• There are no polygons in one-dimensional topology.
[Figure: edges 1, 2 and 3 with a from-node, a to-node and a node (topological junction)]

Note: edge=arc=chain=segment=line

Example: Network topology (one-dimensional)

• Planar: when each intersection is recorded as a node

node (planar)


Example: Network topology (one-dimensional)


• Non-planar: when edges cross without producing a node

no node (non-planar)


Example: Network topology (one-dimensional)


• Allows connectivity checks, network computations, shortest
path, etc.
[Figure: points A, B and C; shortest path from B to C where the crossing has no node]

Example: Network topology (one-dimensional)


• Allows connectivity checks, network computations, shortest path, etc.

[Figure: shortest path from B to C where the crossing has a node]
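
The effect of a node at a crossing on the shortest path can be sketched with Dijkstra's algorithm. The network below is hypothetical and does not reproduce the slide figure: the extra end point D, the crossing node M and all edge lengths are assumptions chosen for illustration.

```python
import heapq


def shortest_path(edges, start, goal):
    """Dijkstra's algorithm on an undirected network of (node, node, length) edges."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, []).append((v, length))
        graph.setdefault(v, []).append((u, length))
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + length, neighbour, path + [neighbour]))
    return None


# Non-planar: roads B-D and A-C cross, but no node is recorded at the crossing.
non_planar = [("B", "D", 10.0), ("A", "C", 10.0), ("D", "C", 8.0)]

# Planar: the same roads, split at the crossing by a node M.
planar = [("B", "M", 5.0), ("M", "D", 5.0), ("A", "M", 5.0),
          ("M", "C", 5.0), ("D", "C", 8.0)]

without_node = shortest_path(non_planar, "B", "C")
with_node = shortest_path(planar, "B", "C")
```

Without a node at the crossing the route has to go around via D; with the node M recorded, a turn at the crossing becomes possible and the route is shorter.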

Examples of networks
– Road networks; utility networks (e.g. water and sewage)

E.g. Route options from UP to UNISA


Planar & Polygon topology (two-dimensional)


Planar polygon topology requires that all features occur on a 2D surface.
There can be no overlaps among lines or polygons in the same layer.
When planar topology is enforced, lines may not cross over or under other lines.
At each line crossing there must be an intersection.

[Figure: planar network of polygons, showing a node, a from-node and a to-node, polygon 1 and polygon 2, an edge with a polygon left and a polygon right, and an isolated point whose list of arcs is empty]


Planar topology (two-dimensional)


Example: Vector
features and topology
tables as discussed in
Bolstad, Chapter 2.


Characteristics of Topological data structures


Advantages:
• No redundancy (less storage)
– Each point/line stored only once
• Efficient computation
– topological queries
– spatial analysis
• Basis for more powerful/complex data structures
• Fast and efficient updates
– Change in boundary once for all polygons
• Geometrically more correct (single boundaries, etc.)
• Aesthetically more pleasing

Disadvantages:
• Complex data structure slows down some operations
• More effort to maintain (keep data ‘clean’)


Characteristics of Non-topological data structures (spaghetti data)

• Early vector models


• Lines are captured individually with explicit starting and ending nodes,
and intervening vertices used to define the shape of the line.
• Records each line separately.
• The model does not explicitly enforce or record connections of line
segments when they cross, nor when two line ends meet.
• A shared polygon boundary may be represented twice, with a line for
each polygon on either side of the boundary.


Non-topological data structures (spaghetti data) – cont.


Advantages:
• Cartographic data structure (for map displays, pretty picture).
• Each object described independently.
• Topological relationships are computed on demand.
• Simple!
• Easy to add objects

BUT

Disadvantages:
• Redundancy: shared border stored for both polygons
• Inconsistency: adjacent border with slightly different coordinates, etc.
• Errors may go undetected and no network analysis will be possible


Non-topological data structures (spaghetti data)


Typical examples:
– Rivers, nature tracks, cadastre, etc.

Source: DWAF


L4 - Class exercise 1

1. Explain what topology is.

Topology refers to geometric properties/relationships that
______________________ when geographic data undergo
transformations.

2. Topology is important for _______ detection and ______________ modelling


3. Find the error in the following Left-Right Topology list. Just give
the number of the Arc where the error occurred.

http://2012books.lardbucket.org/books/geographic-information-system-basics


4. Create an Arc-Node List for the following figure


The arc-node list identifies the from and to nodes for each arc. Connected arcs are determined by searching through the
list for common node numbers. In the following example, it is possible to determine that arcs 1, 2, and 3 all intersect
because they share node 11. The computer can determine that it is possible to travel along arc 1 and turn onto arc 3
because they share a common node (11), but it’s not possible to turn directly from arc 1 onto arc 5 because they don’t
share any common node.

[Figure: network with nodes 10 to 16 connected by arcs 1 to 6]

Arc   From Node   To Node
1     10          11
2     11          12
3     11          13
4     13          14
5     14          15
6     14          16
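
The search through the arc-node list described above can be sketched in Python. The table values are copied from the list in this exercise; the function names are assumptions made for this example.

```python
# Arc-node list: arc number -> (from-node, to-node).
arc_node = {
    1: (10, 11),
    2: (11, 12),
    3: (11, 13),
    4: (13, 14),
    5: (14, 15),
    6: (14, 16),
}


def shared_node(arc_a, arc_b, table):
    """Return the node two arcs have in common, or None if they do not connect."""
    common = set(table[arc_a]) & set(table[arc_b])
    return min(common) if common else None


def arcs_at_node(node, table):
    """Return every arc that starts or ends at the given node."""
    return sorted(arc for arc, ends in table.items() if node in ends)


turn_1_to_3 = shared_node(1, 3, arc_node)   # a turn is possible at the shared node
turn_1_to_5 = shared_node(1, 5, arc_node)   # no common node, so no direct turn
meeting_at_11 = arcs_at_node(11, arc_node)
```

This mirrors the reasoning in the text: arcs 1, 2 and 3 meet at node 11, while arcs 1 and 5 share no node.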

5. Create a Left-right Topology table for the following figure

[Figure: polygons A, B, C and D bounded by arcs 1 to 6]

Arc   Left Polygon   Right Polygon
1     A              B
2     A              C
3     B              C
4     B              C
5     C              D
6     B              D
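
A left-right topology table makes adjacency queries simple, as the following sketch shows. The table values are copied from this exercise as they appear here and have not been checked against the figure; the function name is hypothetical.

```python
# Left-right topology: arc number -> (left polygon, right polygon).
left_right = {
    1: ("A", "B"),
    2: ("A", "C"),
    3: ("B", "C"),
    4: ("B", "C"),
    5: ("C", "D"),
    6: ("B", "D"),
}


def neighbours(polygon, table):
    """Polygons that share at least one arc with the given polygon."""
    result = set()
    for left, right in table.values():
        if left == polygon:
            result.add(right)
        elif right == polygon:
            result.add(left)
    return sorted(result)


next_to_a = neighbours("A", left_right)
next_to_d = neighbours("D", left_right)
```

No coordinates are needed for this query: adjacency is read directly from the polygons recorded on either side of each arc.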


Attribute data values:


Four fundamental measurement levels / scales
Numeric values can represent four types of information: nominal data (class), ordinal
data (rank), interval data (ordered scale), or ratio data (continuous scale). The type
of information represented influences how the values can be interpreted or used.

Source: http://training.esri.com/Courses/Rasters

Examples of measurement scales


Nominal - Unordered categories
• Land cover types like Forest, Savanna, Grassland, Waterbodies, Built-up areas
• Census ward names or ID codes (Any code that identifies something)
• Soil types, Vegetation types, etc.
Ordinal - Ordered Categories
• Flood risk: low, medium, high
• Income: low, medium, high
Interval - Continuous data without an inherent 0 value:
• Temperature (0° is arbitrary and different in °C or °F)
• pH scale
• Time of day / calendar years
Ratio - Continuous data with an inherent meaningful 0 value
• Precipitation (rainfall)
• Population total / Population density
• Elevation above / below sea level (if sea level is indicated as 0)
• Actual number of vehicles, trees, people, plants, etc.

Note:
The measurement scale determines the type of calculations / function
that are appropriate.
E.g. You cannot multiply Census ward ID numbers and get a sensible
result.
In this introductory course we will primarily use nominal and ratio data
scales.
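
The idea that the measurement scale determines which calculations are appropriate can be sketched as follows. The cumulative list of summaries follows a common textbook convention (mode for nominal, median from ordinal, mean from interval, ratios only for ratio data) and is not taken from this lecture; the sample values are made up.

```python
import statistics

LEVELS = ["nominal", "ordinal", "interval", "ratio"]


def allowed_summaries(level):
    """Summaries commonly treated as meaningful at each level (cumulative)."""
    rank = LEVELS.index(level)
    allowed = ["mode"]
    if rank >= 1:
        allowed.append("median")
    if rank >= 2:
        allowed.append("mean")
    if rank >= 3:
        allowed.append("ratio")
    return allowed


land_cover = ["Forest", "Grassland", "Forest", "Savanna"]  # nominal values
rainfall_mm = [400.0, 800.0, 600.0]                        # ratio values

most_common_cover = statistics.mode(land_cover)   # counting is fine for categories
mean_rainfall = statistics.mean(rainfall_mm)      # arithmetic is fine for ratio data
```

Taking a mean of land cover codes or census ward IDs would run without error if they were stored as numbers, but the result would have no meaning.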


L4 - Class exercise 2

Attribute values: Identify the level of measurement


Level of measurement?
Cell values represent the relative frequency of occurrence of
a certain plant species in the area represented by each cell

[Figure: 3 × 3 raster with cell values from 1 to 4 and a legend with the classes Abundant, Common, Rare and None]

Hint: The cell values have relative meaning, but the numeric difference between the values is not
meaningful (it is a code that represents how commonly the plants occur in that cell). For
instance, you can say that cells with a value of 2 have fewer plants than the cells where the value is
1, but you cannot say that they have half as many as cells with the value of 1.
Adapted from: http://training.esri.com/Courses/Rasters


Level of measurement?
Cell values represent the total number of wild garlic
plants in the areas represented by each cell.

58   81   189
117  162  254
345  410  473

Legend classes: < 150; 151 - 300; 301 - 450; > 450

Hint: The cell values represent an actual count of plants in each cell.
Cell values have been grouped (classified) for visualization purposes.
Adapted from: http://training.esri.com/Courses/Rasters
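
Grouping counts into classes for display can be sketched as below. The class labels and cell values are taken from this slide; treating a value of exactly 150, 300 or 450 as belonging to the lower class is an assumption, since the legend does not say where those values fall.

```python
import bisect

breaks = [150, 300, 450]                               # upper limits of the first three classes
labels = ["< 150", "151 - 300", "301 - 450", "> 450"]


def classify(value):
    """Return the legend class for a plant count."""
    return labels[bisect.bisect_left(breaks, value)]


grid = [
    [58, 81, 189],
    [117, 162, 254],
    [345, 410, 473],
]

classified = [[classify(value) for value in row] for row in grid]
```

The classes are only a display choice: the underlying cell values remain actual counts, which is what matters when deciding their level of measurement.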


Level of measurement?
Cell values represent a scale from 1-10 to represent
crop damage after a hail storm.

9.5  7.1  5.2
8.8  6.2  4.3
5.9  3.6  1.8

Legend: from Unrecoverable to Fully recoverable

HINT: The numbers represent a scale ranging from 1 to 10 to represent crop
health in a field (indicated from red to green). The value 10 represents
unrecoverable damage and the value 1 means that there is little damage and
the plants can recover.
Adapted from: http://training.esri.com/Courses/Rasters


Level of measurement?
Cell values represent the soil and sun conditions.
[Figure: raster of two-digit cell codes with legend]

Hint: The cell values are codes where the first digit indicates soil
moisture and the second digit represents the amount of sunlight.
Source: training.esri.com/Courses/Rasters


References
• Bernhardsen, 2002. Geographic Information Systems an Introduction 3rd Edition
• Bolstad, Paul. 2012: GIS Fundamentals, A First Text on Geographic Information Systems,
• Clark, 2003. Getting Started with Geographic Information Systems 4th Edition,
• Chang, 2004. Introduction to Geographic Information Systems 2nd Edition
• Sutton, T. Dassau, A and Sutton, M., 2009. A Gentle Introduction to GIS.

Online sources:

Vector Topology Types. TNTgis – Advanced software for Geospatial Analysis, MicroImages, Inc.
(2003) https://www.microimages.com/documentation/TechGuides/68vtopo1.pdf

QGIS online documentation, https://docs.qgis.org/2.8/en/docs/gentle_gis_introduction/topology.html

ESRI online documentation,
http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=Values_and_what_they_represent


WHAT IS NEXT?
Theme 2 – P3 & L5

Please let us know if you have other questions!


Use the GIS chats discussion board on clickUP
