RSG 311 (2 Units) : Course Contents
Concept of Information. Characteristics and types of GIS data.
Principles of cartography. Maps and projections: ellipsoids,
cartographic projections, coordinate systems, types of scales;
accuracy of maps. Inputs to GIS; GIS components (hardware and
software); GIS models; Database design and organization; integration
in GIS; querying; GIS outputs and visualization; accuracy of data and
integration errors. Overview of GIS computer packages. Cartographic
communication and Visualization (Graphic Symbology and Graphic
Variables). Issues in Map Design. Visualization of 3D Data.
Introduction to GIS
• GIS is a computer-based system that collects, stores, checks, analyzes, manipulates and presents data that are spatially
referenced to a geographical location.
• It can also be defined as a computer system capable of assembling, storing, manipulating, and displaying geographically
referenced information.
• A third definition describes it as a system of computer software, hardware, data and personnel that helps to manipulate,
analyze and present information that is tied to a spatial location.
• Geographic connotes the geographic coordinates in terms of latitude and longitude that define the locations of the spatial
data.
• Information suggests that in a GIS, data are organized to produce much more meaningful knowledge.
• System indicates that a GIS is constituted of interrelated and linked components with different functions.
• Geographical information used by GIS includes water quality, rock structures, land use, sea surface temperature, mineral
distribution and others.
• GIS is data driven. Thus, GIS principles involve data gathering, data processing, database management, data analysis and
modeling
• Data gathering involves data input and output – data input is the procedure of encoding data into a computer readable
format and writing the data to the GIS database
• GIS uses two types of data: spatial data and attribute data. Spatial data represent the geographic location of features
using points, lines and polygons. Attribute data describe the characteristics of the spatial data.
• The spatial data and attribute data need to be entered and correctly linked.
• The five types of data entry systems are keyboard entry, coordinate geometry, digitizing, scanning and input of existing
digitized files.
• Data output is the procedure by which information from the GIS is presented in a form suitable to the user. The three
formats for data output are hardcopy, softcopy and electronic.
• Data processing: data collected often cannot be used in their original form, so pre-processing and post-processing
are necessary.
• Database management: data storage requires the definition of a data structure.
• Data structure is the organization of data so that basic
relationships among them can be derived from the database.
Two classes of data structure are raster and vector.
• Data analysis and modeling – involves conventional
principles of image classification such as distance to mean,
nearest neighbor, cluster analysis, map overlaying, proximity
analysis, reclassification, cartographic modeling and others.
Concepts of Data and Information
• GIS stores, edits, processes, and presents data and
information
• Data refer to facts, measurements, characteristics, or traits of an
object of interest
• Data is the plural form of datum. Different kinds of data can be collected about all kinds of objects, for example
the number of buildings submerged by floods, the total area of cocoa plantation encroached by urban expansion, or
the value of ecosystem services lost to anthropogenic activities around coastal areas.
• Information, on the other hand, refers to the knowledge of value obtained through the collection, interpretation,
and/or analysis of data. In other words, data become information once they are put into context, used to answer
questions, situated within analytical frameworks, or used to obtain insights.
• You may not need a computer to collect, record, manipulate, process, or
visualize data, or to process it into information.
• But computers can automate repetitive tasks, store data efficiently in
terms of space and cost, and provide a range of tools for analyzing data
from spreadsheets to GIS.
• Information technology plays a vital role in handling the high volume of data collected over time by satellites,
grocery store product scanners, traffic sensors, temperature gauges, and mobile phone carriers.
• Geographic data or spatial data refer to geographic facts, measurements,
or characteristics of an object that permit us to define its location on the
surface of the earth
• Spatial data refers to the shape, size and location of the feature.
• Such data include but are not restricted to the latitude and longitude
coordinates of points of interest, street addresses, postal codes, political
boundaries, and even the names of places of interest.
• Whereas geographic data are concerned with defining the location of an object of interest, attribute data are
concerned with its nongeographic traits and characteristics.
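The split between spatial and attribute data can be sketched as two sets of records that share an identifier. This is only an illustration: the feature IDs, field names and coordinate values below are assumptions, not taken from any particular GIS package.

```python
# Spatial data: where each feature is (coordinates given as longitude, latitude).
spatial = {
    101: {"type": "point", "coords": (3.90, 7.44)},
    102: {"type": "polygon",
          "coords": [(3.0, 7.0), (3.1, 7.0), (3.1, 7.1), (3.0, 7.1), (3.0, 7.0)]},
}

# Attribute data: what each feature is (nongeographic characteristics).
attributes = {
    101: {"name": "Borehole A", "water_quality": "good"},
    102: {"name": "Cocoa plot", "land_use": "plantation"},
}


def describe(feature_id):
    """Combine the geometry and the attributes of one feature through its ID."""
    return {"id": feature_id, **spatial[feature_id], **attributes[feature_id]}
```

Calling `describe(101)` returns one record that holds both the location and the description of the feature, which is the link that a GIS must maintain between the two kinds of data.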
Characteristics of spatial data
• Raster data is grid-based (e.g. remotely sensed data). Raster data can be continuous or discrete. Continuous raster
data include temperature and elevation measurements, while discrete raster data include categorical themes such as
land-use classes. There are also three types of raster datasets, namely thematic data (DEM), spectral data
(spectral signatures of objects) and pictures (imagery).
• Tabular data refers to attribute data. It provides information about the
characteristics of spatial features.
GIS DATA MODELS
• A data model organizes data elements and standardizes how the data elements relate to one another. A
data model can be sometimes referred to as a data structure.
• A data structure is a particular way of organizing data in a GIS so that it can be used efficiently.
• There are two broad types of data model: the spatial data model and the attribute data model.
• Spatial data model: three basic types of spatial data models have evolved for storing geographic
data digitally, namely the raster data model, the vector data model and the image data model.
• Raster data model
• Raster data models incorporate the use of a grid-cell data structure where the geographic area is
divided into cells identified by row and column.
• This data structure is commonly called raster. While the term raster implies a regularly spaced grid,
other tessellated data structures do exist in grid-based GIS systems.
• In particular, the quadtree data structure has found some acceptance as an alternative raster data
model. The size of cells in a tessellated data structure is selected on the basis of the data accuracy and
the resolution needed by the user.
• There is no explicit coding of geographic coordinates required since that is implicit in the layout of
the cells.
• A raster data structure is in fact a matrix where any coordinate can be quickly calculated if the origin
point is known, and the size of the grid cells is known.
• Since grid cells can be handled as two-dimensional arrays in computer encoding, many analytical
operations are easy to program. This makes tessellated data structures a popular choice for many GIS
software packages.
• Topology is not a relevant concept with tessellated structures since adjacency and connectivity are
implicit in the location of a particular cell in the data matrix.
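The statement that any coordinate can be calculated from the origin and the cell size can be sketched as follows. The sketch assumes a bottom-left origin with row 0 at the bottom of the grid; many file formats use a top-left origin instead, in which case the row arithmetic is reversed. The function names are illustrative.

```python
def cell_center(origin_x, origin_y, cell_size, row, col):
    """Return the map coordinates of the centre of cell (row, col).

    Assumes the origin is the bottom-left corner of the grid and that
    row 0 is the bottom row.
    """
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y + (row + 0.5) * cell_size
    return x, y


def cell_index(origin_x, origin_y, cell_size, x, y):
    """Return the (row, col) of the cell that contains the map point (x, y)."""
    col = int((x - origin_x) // cell_size)
    row = int((y - origin_y) // cell_size)
    return row, col
```

No coordinates are stored per cell; only the origin and cell size are needed, which is why the location is said to be implicit in the layout of the cells.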
• Vector data model
• Vector storage implies the use of vectors (directional lines) to represent a geographic feature.
• Vector data is characterized by the use of sequential points or vertices to define a linear segment.
Each vertex consists of an X coordinate and a Y coordinate.
• Vector lines are often referred to as arcs and consist of a string of vertices terminated by a node. A
node is defined as a vertex that starts or ends an arc segment.
• Point features are defined by one coordinate pair, a vertex.
• Polygonal features are defined by a set of closed coordinate pairs.
• In vector representation, the storage of the vertices for each feature is important, as well as the
connectivity between features, e.g. the sharing of common vertices where features connect.
• The topologic data structure is often referred to as an intelligent data structure because spatial
relationships between geographic features are easily derived when using them.
• Primarily for this reason the topologic model is the dominant vector data structure currently used in
GIS technology.
• Many of the complex data analysis functions cannot effectively be undertaken without a topologic
vector data structure.
• The secondary vector data structure that is common among GIS software is the computer-aided
drafting (CAD) data structure. This structure consists of listing elements, not features, defined by
strings of vertices, to define geographic features, e.g. points, lines, or areas.
• There is considerable redundancy with this data model since the boundary segment between two
polygons can be stored twice, once for each feature. The CAD structure emerged from the
development of computer graphic systems without specific considerations of processing
geographic features.
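The vector primitives described above (a point as one coordinate pair, an arc as a string of vertices ending in nodes, a polygon as a closed set of coordinate pairs) can be sketched with plain Python lists. The coordinate values and helper names are illustrative assumptions.

```python
import math

# A point is one coordinate pair (a single vertex).
point = (2.0, 3.0)

# An arc is a string of vertices; its first and last vertices are nodes.
arc = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]

# A polygon is a closed ring: the last vertex repeats the first.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]


def nodes(vertices):
    """Return the start node and end node of an arc."""
    return vertices[0], vertices[-1]


def length(vertices):
    """Sum the straight-line lengths of the segments between vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def is_closed(ring):
    """A ring is closed when it has at least 4 vertices and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]
```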
Spatial Data Models
• Spatial data are often stored and presented in the form of a map.
• There are three basic data models for storing geographic data digitally: raster, vector and image.
• The selection of a particular data model, vector or raster, largely depends on
the source of data, the type of data and the intended use of the data.
• Certain analytical procedures require raster data while others are better
suited to vector data.
• Raster Data Model:
• A simple raster data set is a regular grid of cells divided into rows and
columns, with a value stored in each cell.
• These values may represent an elevation in meters above sea level, a land use
class, a plant biomass in grams per square meter, and so forth.
• The spatial resolution of the raster data set is determined by the size of the
cell
• Since geographic data is rarely distinguished by regularly spaced shapes, cells must be
classified as to the most common attribute for the cell.
• The problem of determining the proper resolution for a particular data layer can be a
concern. If one selects too coarse a cell size then data may be overly generalized.
• If one selects too fine a cell size then too many cells may be created resulting in a large
data volume, slower processing times, and a more cumbersome data set
• Because most data is captured in a vector format, e.g. by digitizing, it must be converted to the
raster data structure. This is called vector-to-raster conversion.
• Most raster-based GIS software requires that each raster cell contain only a single discrete
value.
• Accordingly, a data layer, e.g. forest inventory stands, may be broken down into a series of raster
maps, each representing one attribute type, e.g. a species map, a height map, a density
map, etc. These are referred to as one-attribute maps.
• This is in contrast to most conventional vector data models, which maintain data as
multiple-attribute maps.
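Vector-to-raster conversion can be sketched as follows: each cell takes the polygon's value if the cell centre falls inside the polygon. This is a minimal sketch of one common rule (cell-centre sampling with a ray-casting test), assuming a bottom-left origin with row 0 at the bottom; real GIS packages offer other assignment rules, and the function names are illustrative.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, origin_x, origin_y, cell_size, n_rows, n_cols, value=1, nodata=0):
    """Return a grid (list of rows) where cells whose centre lies in the ring get `value`."""
    grid = []
    for row in range(n_rows):
        y = origin_y + (row + 0.5) * cell_size
        grid.append([
            value if point_in_polygon(origin_x + (col + 0.5) * cell_size, y, ring) else nodata
            for col in range(n_cols)
        ])
    return grid


# A 2 x 3 rectangle rasterized onto a 4 x 4 grid of unit cells.
stand = [(0.0, 0.0), (2.0, 0.0), (2.0, 3.0), (0.0, 3.0)]
grid = rasterize(stand, 0.0, 0.0, 1.0, 4, 4)
```

Repeating the call with a larger `cell_size` shows the generalization effect: fewer cells are produced, and the polygon outline is followed less closely.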
Advantages of Raster Data
• The geographic location of each cell is implied by its
position in the cell matrix. Accordingly, other than an
origin point, e.g. bottom left corner, no geographic
coordinates are stored.
• Due to the nature of the data storage technique, data
analysis is usually easy to program and quick to
perform.
• The inherent nature of raster maps, e.g. one attribute
maps, is ideally suited for mathematical modeling and
quantitative analysis.
• Discrete data, e.g. forestry stands, is accommodated
equally well as continuous data, e.g. elevation data, and
facilitates the integrating of the two data types.
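The suitability of one-attribute raster maps for quantitative analysis can be illustrated with a cell-by-cell overlay of a continuous layer and a discrete layer. The grids, the class codes and the threshold below are made-up example values.

```python
# Continuous layer: elevation in metres.
elevation = [
    [120, 135, 150],
    [110, 128, 160],
    [100, 115, 140],
]

# Discrete layer: land-use class codes (assumed: 1 = forest, 2 = farmland, 3 = water).
land_use = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]


def overlay(grid_a, grid_b, func):
    """Apply func to each pair of cells that occupy the same row and column."""
    return [
        [func(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(grid_a, grid_b)
    ]


# Cells that are forest AND above 115 m are coded 1, all others 0.
high_forest = overlay(elevation, land_use, lambda e, lu: 1 if e > 115 and lu == 1 else 0)
```

Because both layers share the same grid, the overlay needs no geometric computation at all; it is a loop over matching array positions.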
Disadvantages
• The cell size determines the resolution at which the data is
represented
• It is especially difficult to adequately represent linear
features depending on the cell resolution. Accordingly,
network linkages are difficult to establish
• Processing of associated attribute data may be cumbersome
if a large amount of data exists. Raster maps inherently
reflect only one attribute or characteristic for an area.
• Since most input data is in vector form, data must undergo
vector-to-raster conversion
• Besides increased processing requirements this may
introduce data integrity concerns due to generalization and
choice of inappropriate cell size
Vector Data Models
• The vector data model is based upon vectors as opposed to space occupancy of raster data
structures.
• The fundamental primitive of the vector model is a point.
• The various objects are created by connecting the points with straight lines, but some systems
allow the points to be connected using arcs of circles.
• The areas are defined in this model by sets of lines.
• Polygon is synonymous with area in vector databases because of the set of straight-line
connections between points
• Two vector data models are commonly used in GIS data storage: the topologic data structure and the
computer-aided drafting (CAD) data structure.
• The topologic data structure is often referred to as an intelligent data structure because spatial
relationships between geographic features are easily derived when using it.
• The secondary vector data structure that is common among GIS software is the computer-aided
drafting (CAD) data structure
• This structure consists of listing elements, not features, defined by strings of vertices, to define
geographic features, e.g. points, lines, or areas
• There is considerable redundancy with this data model since the boundary segment between two
polygons can be stored twice, once for each feature.
• The CAD vector model lacks the definition of spatial relationships between features that is defined
by the topologic data model.
Advantages of Vector data models
• Data can be represented at its original resolution
without generalization
• Graphic output is usually more aesthetically
pleasing.
• Since most data, e.g. hard copy maps, are in vector
form, no conversion is required.
• Accurate geographic location of data is maintained.
• Allows for efficient encoding of topology, and as a
result more efficient operations that require
topological information, e.g. proximity, network
analysis
Disadvantages
• The location of each vertex needs to be stored
explicitly
• Algorithms for manipulative and analysis
functions are complex and may be processing
intensive.
• Continuous data, such as elevation data, is not
effectively represented in vector form.
• Spatial analysis and filtering within polygons
are impossible.
Topology
• The explicit nature of the relationships in vector GIS requires 'topology'.
• It also allows much easier analysis of these kinds of relationships,
especially connectivity between locations (points), which is done with
lines.
• If a map is stretched and distorted, some of its properties change,
including: Distances, Angles and Relative proximities
• Other properties remain constant: areas remain areas, lines remain lines,
and points remain points; Adjacencies and other relationships, such as "is
contained in", "crosses" (intersecting arcs) are maintained
• Strictly, topological properties are those that remain unchanged after
distortion.
• A spatial database is often called "topological" if one or more of the
following relationships have been computed and stored: connectedness
of arcs at intersections, the ordered set of arcs forming each polygon
boundary, and adjacency relationships between areas.
• In general, "topological" implies that certain relationships are stored,
making the data more useful for various kinds of spatial analysis
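A stored topology can be sketched as an arc table in which each arc records its end nodes and the polygons on its left and right. The table layout and the identifiers below are illustrative assumptions (two polygons, A and B, that share arc 1; `None` stands for the area outside both).

```python
arcs = [
    {"id": 1, "from_node": "n1", "to_node": "n2", "left": "A", "right": "B"},
    {"id": 2, "from_node": "n2", "to_node": "n1", "left": "A", "right": None},
    {"id": 3, "from_node": "n1", "to_node": "n2", "left": "B", "right": None},
]


def adjacent_polygons(arc_table):
    """Adjacency: two polygons are neighbours if some arc has one on each side."""
    pairs = set()
    for arc in arc_table:
        if arc["left"] is not None and arc["right"] is not None:
            pairs.add(frozenset((arc["left"], arc["right"])))
    return pairs


def boundary_arcs(arc_table, polygon):
    """The arcs that form the boundary of one polygon."""
    return [arc["id"] for arc in arc_table if polygon in (arc["left"], arc["right"])]


def arcs_at_node(arc_table, node):
    """Connectedness: the arcs that meet at a given node."""
    return [arc["id"] for arc in arc_table if node in (arc["from_node"], arc["to_node"])]
```

The shared boundary (arc 1) is stored once, and adjacency is answered by a table lookup rather than by comparing coordinates.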
Image Data Model
• Image data is most often used to represent graphic or pictorial data
• The term image inherently reflects a graphic representation, and in the GIS world,
differs significantly from raster data.
• Most often, image data is used to store remotely sensed imagery, e.g. satellite scenes
or orthophotos, or ancillary graphics such as photographs, scanned plan documents,
etc.
• Image data is typically used in GIS systems as background display data (if the image has
been rectified and georeferenced); or as a graphic attribute.
• Remote sensing software makes use of image data for image classification and
processing.
• Typically, this data must be converted into a raster format (and perhaps vector) to be
used analytically with the GIS
• Image data is typically stored in a variety of de facto industry standard proprietary
formats. These often reflect the most popular image processing systems.
• Other graphic image formats, such as TIFF, GIF, PCX, etc., are used to store ancillary
image data.
• Most GIS software will read such formats and allow you to display this data.
ATTRIBUTE DATA MODELS (DBMS Models used in GIS):
• A separate data model is used to store and maintain attribute
data for GIS software.
• These data models may exist internally within the GIS software,
or may be reflected in external commercial Database
Management Software (DBMS).
• A variety of different data models exist for the storage and
management of attribute data.
• The most common are:
• Tabular Model : This type of data model is outdated in the GIS
arena. It lacks any method of checking data integrity, as well as
being inefficient with respect to data storage, e.g. limited
indexing capability for attributes or records, etc.
• Hierarchical Model:
• The hierarchical database organizes data in a tree structure.
• Data is structured downward in a hierarchy of tables.
• Any level in the hierarchy can have unlimited children, but any child can have
only one parent.
• Hierarchical DBMS have not gained any noticeable acceptance for use within GIS.
• They are oriented for data sets that are very stable, where primary relationships
among the data change infrequently or never at all.
• Also, the limitation on the number of parents that an element may have is not
always conducive to actual geographic phenomena.
• Network Model :
• The network database organizes data in a network or plex structure (Any data
structure that permits data to be organized into interconnected and interrelated
groupings)
• Any column in a plex structure can be linked to any other
• Like a tree structure, a plex structure can be described in terms of parents and
children.
• This model allows for children to have more than one parent
• Relational Model:
• The relational database organizes data in tables.
• Each table is identified by a unique table name
and is organized by rows and columns.
• Each column within a table also has a unique
name.
• Columns store the values for a specific
attribute, e.g. cover group, tree height.
• Data is often stored in several tables
• Tables can be joined or referenced to each other by
common columns (relational fields).
• The common column is typically an identification number for a
selected geographic feature.
• This identification number acts as the primary key for the
table.
• The ability to join tables through the use of a common
column is the essence of the relational model.
• Such relational joins are usually ad hoc in nature and
form the basis of querying in a relational GIS product.
• The relational database model is the most widely
accepted for managing the attributes of geographic data.
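A relational join on a common column can be sketched with Python's built-in `sqlite3` module. The table names, column names and values are invented for illustration; `stand_id` plays the role of the identification number that acts as the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE stands (stand_id INTEGER PRIMARY KEY, area_ha REAL);
CREATE TABLE cover  (stand_id INTEGER PRIMARY KEY, cover_group TEXT, tree_height_m REAL);
INSERT INTO stands VALUES (1, 12.5), (2, 8.0), (3, 20.1);
INSERT INTO cover  VALUES (1, 'deciduous', 18.0), (2, 'coniferous', 25.5), (3, 'deciduous', 22.0);
""")

# Ad hoc query: join the two tables on the common column and filter by attribute.
rows = conn.execute("""
SELECT s.stand_id, s.area_ha, c.tree_height_m
FROM stands AS s
JOIN cover AS c ON s.stand_id = c.stand_id
WHERE c.cover_group = 'deciduous'
ORDER BY s.stand_id
""").fetchall()
```

The query names only tables and columns; it does not need to know how the rows are physically stored, which is the independence from internal organization that the relational model offers.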
The relational DBMS has the following advantages
• Simplicity in organization and data modeling
• Flexibility - data can be manipulated in an ad
hoc manner by joining tables
• Queries do not need to take into account the
internal organization of data
• Efficiency of storage: proper design of data
tables can reduce redundancy.
• The relational DBMS has emerged as the
dominant commercial data management tool
in GIS implementation and application.
Object Oriented Model
• The object-oriented database model manages data through
objects
• An object is a collection of data elements and operations that
together are considered a single entity
• The object-oriented database is a relatively new model
• This approach has the attraction that querying is very natural, as
features can be bundled together with attributes at the database
administrator's discretion.
• Only a few GIS packages promote the use of this attribute
data model.
• However, initial impressions indicate that this approach may
hold many operational benefits with respect to geographic data
processing.
• Fulfilment of this promise with a commercial GIS product
remains to be seen.
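The idea of an object as data elements and operations treated as a single entity can be sketched with a small Python class. The class name, fields and threshold are hypothetical and do not describe any specific object-oriented GIS product.

```python
class ForestStand:
    """A forest stand that carries its geometry, attributes and operations together."""

    def __init__(self, stand_id, ring, species, height_m):
        self.stand_id = stand_id
        self.ring = ring          # closed list of (x, y) vertices
        self.species = species
        self.height_m = height_m

    def area(self):
        """Planar area from the shoelace formula (units follow the coordinates)."""
        total = 0.0
        for (x1, y1), (x2, y2) in zip(self.ring, self.ring[1:]):
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0

    def is_tall(self, threshold_m=20.0):
        """Example query method bundled with the feature itself."""
        return self.height_m >= threshold_m


stand = ForestStand(7, [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)], "teak", 22.5)
```

Here a query such as `stand.area()` is asked of the feature directly, instead of joining a geometry table to an attribute table.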
GIS database design and organization
• There is a need for developing error statements for data contained within geographic information systems (Vitek et al, 1984).
• The integration of data from different sources and in different original formats (e.g. points, lines, and areas), at different original
scales, and possessing inherent errors can yield a product of questionable accuracy (Vitek et al, 1984).
• The accuracy of a GIS-derived product is dependent on characteristics inherent in the source products, and on user requirements,
such as scale of the desired output products and the method and resolution of data encoding (Marble, Peuquet, 1983).
• The highest accuracy of any GIS output product can only be as accurate as the least accurate data theme of information involved in
the analysis (Newcomer, Szajgin, 1984).
• Accuracy of the data decreases as spatial resolution becomes more coarse (Walsh et al, 1987).
• As the number of layers in an analysis increases, the number of possible opportunities for error increases (Newcomer, Szajgin,
1984).
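The effect of combining layers can be illustrated with elementary probability bounds. The sketch assumes that each layer's accuracy is the probability that a location is correct in that layer. Under that assumption the composite accuracy can never exceed the accuracy of the least accurate layer, and it equals the product of the layer accuracies only if the errors are independent. These are general probability results stated as an illustration, not figures quoted from the cited papers.

```python
import math


def composite_accuracy_bounds(layer_accuracies):
    """Return (lower bound, value under independence, upper bound) for an overlay.

    layer_accuracies: probabilities (0 to 1) that each layer is correct at a location.
    """
    upper = min(layer_accuracies)  # limited by the least accurate layer
    lower = max(0.0, 1.0 - sum(1.0 - p for p in layer_accuracies))  # errors never coincide
    independent = math.prod(layer_accuracies)  # errors in the layers are unrelated
    return lower, independent, upper


lower, independent, upper = composite_accuracy_bounds([0.9, 0.8])
```

Adding a third layer to the list lowers all three figures or leaves them unchanged, which mirrors the point that more layers mean more opportunities for error.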
Coordinate system and its usefulness in GIS functionality
• A coordinate system is a reference system used to represent the locations of geographic
features.
• It enables geographic datasets to use common locations for integration.
• Coordinate systems enable you to transform vector maps from one coordinate system to the
other.
• For general analysis purposes, it is advised to use one coordinate system for all your maps.
This coordinate system should be wide enough to cover all X- and Y-coordinates that should
be stored in your maps
• Each coordinate system is defined by its measurement framework, which is either geographic or
planimetric. In a geographic framework, spherical coordinates are measured from the earth's center.
In a planimetric framework, the earth's coordinates are projected onto a two-dimensional planar surface.
• There are two types of coordinate systems. Geographic coordinate systems, otherwise known as global or
spherical coordinate systems, use latitude and longitude.
• Projected coordinate systems are often referred to as map projections.
• They are based on map projections such as transverse Mercator, Albers equal area, or Robinson.
• Map projections provide various mechanisms to project maps of the earth's spherical surface onto a
two-dimensional Cartesian coordinate plane.
Global and local coordinate systems in GIS
• Global systems (Latitude, Longitude, and Height)
• The Prime Meridian and the Equator are the reference planes used to define latitude and longitude
• The geodetic latitude (there are many other defined latitudes) of a point is the angle from the
equatorial plane to the vertical direction of a line normal to the reference ellipsoid.
• The geodetic longitude of a point is the angle between a reference plane and a plane passing
through the point, both planes being perpendicular to the equatorial plane.
• The geodetic height at a point is the distance from the reference ellipsoid to the point in a direction
normal to the ellipsoid.
• Examples: Earth Centered, Earth Fixed Cartesian coordinates, Universal Transverse Mercator (UTM)
and World Geographic Reference System
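The relationship between geodetic latitude, longitude and height and Earth Centered, Earth Fixed Cartesian coordinates can be sketched as follows. The sketch assumes the WGS 84 reference ellipsoid (semi-major axis 6378137 m, flattening 1/298.257223563); the notes do not name an ellipsoid, so a different one would change the constants.

```python
import math

WGS84_A = 6378137.0            # semi-major axis in metres (assumed ellipsoid)
WGS84_F = 1 / 298.257223563    # flattening (assumed ellipsoid)


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic latitude, longitude (degrees) and height (m) to ECEF X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z
```

A point on the equator at the Prime Meridian with zero height lies one semi-major axis from the earth's centre along the X axis, and a point at the pole lies one semi-minor axis along the Z axis.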
• Local coordinate systems:
• Universal Polar Stereographic (UPS): UPS is defined above 84 degrees north latitude and south of
80 degrees south latitude. The eastings and northings are computed using a polar aspect
stereographic projection. Zones are computed using a different character set for south and north
Polar Regions.
• National Grid Systems: Many nations have defined grid systems based on coordinates that cover
their territory. The British National Grid (BNG) is based on the National Grid System of England,
administered by the British Ordnance Survey.
• State Plane Coordinates: In the United States, the State Plane System was developed in the 1930s
and was based on the North American Datum 1927 (NAD27).
Geographic coordinate systems
• A geographic coordinate system enables every location on the Earth to be specified by a set
of numbers or letters.
• The coordinates are often chosen such that one of the numbers represents vertical position, and two or
three of the numbers represent horizontal position.
• A common choice of coordinates is latitude, longitude and elevation.
• Lines joining points of the same latitude trace circles on the surface of the Earth called
parallels, as they are parallel to the equator and to each other.
• The North Pole is 90° N; the South Pole is 90° S.
• The 0° parallel of latitude is designated the equator, the fundamental plane of all
geographic coordinate systems. The equator divides the globe into Northern and
Southern Hemispheres.
• The Longitude of a point on the Earth's surface is the angle east or west from a
reference meridian to another meridian that passes through that point
• All meridians are halves of great ellipses which converge at the north and south poles.
• A line passing near the Royal Observatory, Greenwich (near London in the UK) has
been chosen as the international zero-longitude reference line, the Prime Meridian.
• The antipodal meridian of Greenwich is both 180°W and 180°E.
Projected coordinate systems
• A projected coordinate system is a flat, two-dimensional representation of the Earth.
• It is based on a sphere or spheroid geographic coordinate system, but it uses linear
units of measure for coordinates, so that calculations of distance and
area are easily done.
• The latitude and longitude coordinates are converted to x, y
coordinates on the flat projection.
• Mathematical formulas are used to convert a three-dimensional
geographic coordinate system to a two-dimensional flat projected coordinate system. The
transformation is referred to as a map projection.
• Depending on the projection used, different spatial properties will
appear distorted.
• The most common types of map projections include: Equal area
projections, Conformal projections and equidistant projections.
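The conversion of latitude and longitude to x, y coordinates can be illustrated with the spherical Mercator projection, a conformal projection. This is a sketch under stated assumptions: the earth is treated as a sphere with a mean radius of 6371000 m and the central meridian is 0°; production software would normally use an ellipsoidal formula.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean spherical radius


def mercator_forward(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Project latitude/longitude (degrees) to Mercator x, y (metres)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def mercator_inverse(x, y, radius=EARTH_RADIUS_M):
    """Recover latitude/longitude (degrees) from Mercator x, y (metres)."""
    lon_deg = math.degrees(x / radius)
    lat_deg = math.degrees(2 * math.atan(math.exp(y / radius)) - math.pi / 2)
    return lat_deg, lon_deg
```

Because the output is in linear units, distances near the equator can be measured directly on the plane. Away from the equator the Mercator scale grows, which is an example of the distortion that depends on the projection used.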
Map and Projection
• A map is a flat representation of a curved surface.
• Maps are more useful than globes in many situations:
• Maps are more compact and easier to store
• Readily accommodate a large range of scales
• Viewed easily on computer displays
• Facilitate measuring properties of the terrain being mapped
• Have capability to show larger portions of the earth’s surface at once
• Maps are products of map projections
• A map projection transforms the surface of the earth, or a portion of it,
onto a flat surface.
• It is a systematic transformation of the latitudes and longitudes of locations on
the surface of a sphere or an ellipsoid into locations on a plane.
• Map projections are necessary for creating maps.
• Map projections preserve one or more of these properties: area, shape,
direction, bearing, distance and scale.
• Some projections minimize distortions in some of these properties at the expense of
maximizing errors in others.
• Conformality: a projection is conformal when the scale at any point on the map
is the same in every direction.
• Shape is preserved locally on conformal maps.
• Distance: an equidistant map shows true distances from the centre of the projection to any other
place on the map.
• Scale
• Area: an equal-area map preserves area.
Choice of Projections
• Concept of a developable surface: a surface that can be unfolded or unrolled into a plane or
sheet without stretching, tearing or shrinking.
• The cylinder, cone and plane are good examples.
• However, the sphere and the ellipsoid are not developable surfaces.
• Hence, projecting such surfaces onto a plane distorts the image.
• Azimuthal preserves direction from one or two points to every other point.
• Conformal or orthomorphic preserves shape locally
• Equal-area or equiareal preserves area.
• Equidistant preserves distance.
• Gnomonic preserves shortest route