Basic data structures for both vector and raster

Public Participatory GIS and Spatial Data Infrastructure in Disaster Management


Presenters: Alberto Vavassori and Chiara Gerosa, Politecnico di Milano
Lecture Outline

1. A short introduction: from real World to Raster and Vector Data Models
2. Raster Data Model
• Raster Data Model: generalities
• Raster Data Model: formats
• How to import a Raster File in QGIS 3.10
3. Vector Data Model
• Vector Data Model: generalities
• Vector Data Model: formats
• How to import a Vector Layer on QGIS 3.10
4. Vector or Raster Data models: how to choose?
5. GeoPackage format
• GeoPackage format in QGIS 3.10

A short introduction: from real World to Raster and
Vector Data Models

GIS is dedicated to georeferenced information.

The spatial component of the information can be structured according to two different models:

• Raster Model

• Vector Model

The two models represent real-world objects differently. The real world itself can be abstracted in two ways:
• Continuous fields (e.g. elevation, land use): the real world is described by a number of variables, each one defined at every position on a continuous surface
• Discrete objects (e.g. a tree, a river): objects can be modelled as points, lines or polygons

Raster data model:


The information is represented on a regular grid (matrix) of cells (pixels). To
each raster cell a numeric value is assigned, which can represent any kind of
information about that geographic location, for example height in meters.

Vector data model:


Features are defined by x,y coordinate pairs (or x,y,z triplets), which refer to
a location in the real world.
• Points are defined by a single x,y coordinate pair
• Lines are defined by two or more linked x,y coordinate pairs
• Polygons are defined by a closed line that forms the polygon boundaries
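
As a minimal sketch of the three definitions above, the geometries can be held in plain Python structures. The names `Coord`, `tree`, `road`, `park` and `is_closed`, and all coordinate values, are illustrative assumptions and do not come from any GIS library.

```python
from typing import List, Tuple

# A coordinate is an (x, y) pair; a z value could be appended for 3D data.
Coord = Tuple[float, float]

# Point: a single coordinate pair.
tree: Coord = (2.0, 1.0)

# Line: two or more linked coordinate pairs.
road: List[Coord] = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]

# Polygon: a closed line, so the first and the last vertex coincide.
park: List[Coord] = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def is_closed(ring: List[Coord]) -> bool:
    """Return True if the ring has at least 4 vertices and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]
```

Here `is_closed(park)` holds while `is_closed(road)` does not, which is the only structural difference between a polygon boundary and an ordinary line in this sketch.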

Both vector and raster data models are useful for representing geographic data, but one may be
more appropriate than the other when it comes to representing a particular type of geographic data
or answering different types of questions.

Raster Data Model: generalities

• The information is stored in a matrix made up of cells called “pixels” (smallest unit of the
image).
• Each pixel stores a single value, or a combination of values (according to the RGB
composition) in case of a raster used to represent an image.
• The raster georeferencing is defined when the coordinates of a vertex and the spatial
orientation of the matrix (i.e. a single coordinate of another vertex) are known: the location
of every pixel can be derived from its row and column numbers and pixel dimensions.
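
The last point can be sketched in a few lines. This is only an illustration under stated assumptions: a north-up raster (no rotation), square pixels, and a known upper-left corner. The function name `pixel_center` and the sample values are hypothetical.

```python
def pixel_center(row, col, x_origin, y_origin, pixel_size):
    """Return the (x, y) map coordinates of the center of cell (row, col).

    Assumes a north-up raster whose origin is the upper-left corner,
    with square pixels: rows grow southwards, columns grow eastwards.
    """
    x = x_origin + (col + 0.5) * pixel_size
    y = y_origin - (row + 0.5) * pixel_size
    return x, y


# Upper-left corner at (500000, 7100000), 30 m pixels (illustrative values).
first_cell = pixel_center(0, 0, 500000.0, 7100000.0, 30.0)
other_cell = pixel_center(2, 3, 500000.0, 7100000.0, 30.0)
```

A rotated raster would need the orientation term mentioned above in addition to the origin and the pixel size.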

The raster model is used to represent:

1. Digital images: each pixel stores information used to represent a picture or photograph (three
different values, according to the RGB color composition) or a satellite image (a value for
each spectral band).

2. Thematic maps: each pixel stores information used to represent a geographic element (e.g. the
land cover) or a phenomenon that has spatial variability (e.g. the temperature distribution).

3. Digital Terrain Models (DTM): each pixel stores the information relative to the height of a
portion of territory.

[Figure: example DTM with an elevation legend ranging from 0 m to 123 m]

• The spatial resolution is an important property of the raster model, associated with the
number of pixels per unit of length.
• The higher the spatial resolution, the higher the level of spatial detail of the raster as well
as the memory required to store the information.
• In order to reduce the physical space of memory necessary to store a raster, compression
techniques may be applied. There exist lossy compression (with loss of information) and
lossless compression (without loss of information) formats.
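
The relation between resolution, memory and lossless compression can be illustrated with a short sketch. The function `raster_size_bytes`, the 4 bytes per cell and the extent are assumptions chosen for the example; `zlib` (Deflate) is used only as a stand-in for whatever compression a given raster format applies.

```python
import zlib


def raster_size_bytes(width_m, height_m, cell_size_m, bytes_per_cell=4):
    """Uncompressed size of a single-band raster covering the given extent."""
    cols = int(width_m // cell_size_m)
    rows = int(height_m // cell_size_m)
    return rows * cols * bytes_per_cell


# Halving the cell size quadruples the number of cells, hence the memory.
coarse = raster_size_bytes(10_000, 10_000, 100)  # 100 x 100 cells
fine = raster_size_bytes(10_000, 10_000, 50)     # 200 x 200 cells

# Lossless compression: the decompressed bytes equal the original bytes.
band = bytes([0] * 900 + [255] * 100)            # a very uniform 1000-cell band
packed = zlib.compress(band)
restored = zlib.decompress(packed)
```

With a lossy technique the final comparison `restored == band` would in general no longer hold, which is the practical difference between the two families.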

Raster Data Model: formats
Esri Grid

There exist two Esri Grid formats:

1. Proprietary binary format (ARC/INFO GRID)


• The raster is stored in several files contained in at least two directories (Name and Info directories).
• The grid name must begin with an alphabetic character and must only include alphanumeric characters and the
underscore.

2. Non-proprietary ASCII format (ARC/INFO ASCII GRID)


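The grid is stored in text format. A header provides the following data:
• number of columns and rows (ncols, nrows),
• (x, y) coordinates of a vertex, generally the lower-left corner of the grid (xllcorner, yllcorner),
• single cell size (cellsize),
• value used to encode NoData (NODATA_value, generally -9999).
The header is followed by the numeric value of each cell, listed row after row, from north to south.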
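
A minimal reader for this text layout could look as follows. It is a sketch, not a complete implementation: it assumes the six conventional header keywords (`ncols`, `nrows`, `xllcorner`, `yllcorner`, `cellsize`, `NODATA_value`) are all present and that each grid row sits on its own line. The function name `parse_ascii_grid` and the sample grid are made up for the example.

```python
def parse_ascii_grid(text):
    """Parse an ARC/INFO ASCII GRID string into (header, rows)."""
    lines = text.strip().splitlines()
    header = {}
    # The header is assumed to be six "keyword value" lines.
    for line in lines[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)
    nodata = header["nodata_value"]
    rows = []
    # Cell values follow, one grid row per line, from north to south.
    for line in lines[6:]:
        rows.append([None if float(v) == nodata else float(v) for v in line.split()])
    return header, rows


sample = """ncols 3
nrows 2
xllcorner 0.0
yllcorner 0.0
cellsize 10.0
NODATA_value -9999
1 2 3
4 -9999 6
"""
header, rows = parse_ascii_grid(sample)
```

NoData cells are returned as `None` so that they cannot be mistaken for a real measurement of -9999.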
GeoTIFF (Geo Tagged Image File Format)

• It is a format extension used to store georeferencing information (projection, coordinate system, ellipsoid and
datum) in a compliant TIFF file.

• It is very often used to store satellite images.

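• It includes header tags that define the image geometry (size, definition, image-data arrangement, image
compression).

• It is usually stored uncompressed or with lossless compression (e.g. LZW or Deflate), so no information is lost;
lossy JPEG compression inside a TIFF file is also possible.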

• It supports black-and-white, greyscale, pseudo-color and true color images.

ERDAS (Earth Resources Data Analysis System) IMAGINE IMG

• Proprietary file format originally developed to be used with ERDAS IMAGINE software for processing remote
sensing data.

• It distinguishes between two types of raster layer:

1. a continuous raster layer: used to represent an image acquired by a sensor, thus the value stored in
each pixel may vary in a continuous range (temperature, elevation, radiance, etc.);
2. a thematic raster layer: used to represent a specific theme or subject, thus each pixel stores a single
value (numerical code for a particular category).

• It has its own binary format (.img) for images.

• It is linked to a commercial package (not an open format): interoperability and accessibility are limited. It is
recommended to export the data to an open and wider-supported format, such as GeoTIFF.

NetCDF-CF (Network Common Data Form CF)

• It is a NetCDF file format with CF (Climate and Forecast) metadata conventions.

• Originally developed for model-generated data (particularly for atmospheric models), it is becoming important
to represent observational data and for environmental applications.

• It consists of binary storage in an open format with optional compression.

The file consists of two sections:

1. The CF header: it contains dimensions, variables (data type, units, content description, missing data values) and
global attributes.

2. The data section: it contains the actual values for each dimension and variable listed above.

JPEG 2000

• It is an open standard raster format (ISO/IEC 15444).

• It is an image compression standard developed by the Joint Photographic Experts Group committee as a
successor to the widespread JPEG standard.

• It corresponds to a compression technique for maintaining the quality of large imagery (it allows for a high
compression ratio and fast access to large amounts of data at any scale).

• It is used for both color and black-and-white images.

• It allows both lossy and lossless compression.

How to import a Raster File in QGIS 3.10

It is possible to choose the working directory and the raster file format.

The raster file is added as a new layer and displayed on the right.

Vector Data Model: generalities

Vector features:
• Points: for small geographic features (locations of trees, depth measurements,
points of interest). Each point is just a pair (or triplet) of x,y(,z) coordinates
representing a single location. A point has no dimension.
• Lines or Polylines: for linear features such as rivers, roads,
railroads, pipelines. A path links all coordinates of the points
(vertices) of the line. Its length can be measured.
• Polygons: for geographic features that have an area (lakes, parks,
buildings). Polygon borders are lines whose first and last vertex
coincide. Both perimeter and area of a polygon can be measured.
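
The measurements mentioned above can be sketched with the standard library only. The sketch assumes planar (projected) coordinates, straight segments between vertices, and Python 3.8 or later for `math.dist`; the names `length` and `area` and the sample geometries are illustrative.

```python
import math


def length(coords):
    """Sum of the straight segment lengths along a list of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def area(ring):
    """Planar area of a closed ring (first vertex == last), shoelace formula."""
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice) / 2.0


road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]
park = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

road_length = length(road)     # length of a line
park_perimeter = length(park)  # the perimeter is the length of the closed border
park_area = area(park)
```

On geographic (longitude, latitude) coordinates these planar formulas would not give metric lengths or areas; a projection or geodesic computation would be needed instead.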

Vector features are grouped into layers: features in the same layer
have the same geometry type and generally represent similar thematic
information

Vector Data Model: Basic format elements

Each feature is stored in a database, along with its coordinates and its attributes (ID,
thematic attributes…).

[Figure: Mozambique administrative provinces boundaries layer with its open attribute table. Each province
is a feature with its attributes (such as name, unique code, nation, area, length, …) stored in the database.]

Vector Data Model: formats
Shapefile

Vector data can be stored in different data formats.

• The Shapefile is one of the most popular vector GIS data formats, developed and regulated by Esri.

• It was introduced with ArcView GIS version 2 in the early 1990s, and it is now regulated as a mostly
open specification for data interoperability among Esri and other GIS software products.
• It can be easily read and written by a wide variety of software.
• The shapefile format can spatially describe vector features as points, lines, and polygons with their
location. Each item usually has attributes that describe it. However, the shapefile format lacks the
capacity to store topological information.
• The Esri Shapefile is a binary format.


A shapefile consists of a collection of files with the same base name, stored in the same directory.

Three files are mandatory:
• .shp: the main file, which stores the geometry data: each record describes a shape with
the list of its vertices.
• .shx: the shape index file, which contains a positional index for each feature geometry to
allow seeking forward and backward quickly.
• .dbf: the dBASE table file, which stores the attributes: one record per feature, with the
one-to-one relationship between attributes and feature maintained by the record order.
The main file (.shp) alone is incomplete for distribution as the other supporting files are
required.
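
Before distributing a shapefile, the presence of the three mandatory files can be checked with a few lines. This is a minimal sketch: the helper `missing_components` is hypothetical, and it assumes lower-case extensions and that all components share the base name of the .shp file.

```python
from pathlib import Path

# The three mandatory components listed above.
MANDATORY = (".shp", ".shx", ".dbf")


def missing_components(shp_path):
    """Return the mandatory extensions that are missing next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in MANDATORY if not base.with_suffix(ext).exists()]
```

An empty list means the dataset is complete as far as the mandatory files are concerned; optional files are not checked here.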

In addition to the mandatory .shp, .shx and .dbf, other files can be used, such as .prj (for
projection description, readable in a text editor), .cpg (to specify the code page for the .dbf) or
.xml (for storing metadata).

[Figure: the files of the Mozambique administrative provinces boundaries shapefile, highlighting .prj
(projection description) and .xml (metadata)]

Shapefile: Limitations

• Spatial representation
The edges of a polyline or polygon are composed of points. The spacing of the points implicitly determines the scale
at which the feature is represented. Additional points could be required to achieve smoother shapes or curves,
increasing the data storage size. The shapefile format does not support splines.

• Data storage
The size of both .shp and .dbf component files cannot exceed 2 GB. The table format for the .dbf component file is
based on an older dBase standard. Supported field types are: floating point (13 character storage), integer (4 or 9
character storage), date (no time storage; 8 character storage), and text (maximum 254 character storage).

• Topology
Shapefile does not store topological information.

• Style
The style is not stored in the shapefile, so when a shapefile is exchanged the receiver cannot see how the sender
intended the vector data to be visualized.
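
The data storage limits listed above have practical consequences when attributes are written to the .dbf table. The sketch below shows two of them under simplifying assumptions: the helpers `encode_text` and `encode_date` are hypothetical, and the text width is counted in characters rather than in encoded bytes.

```python
from datetime import date, datetime

MAX_TEXT = 254  # maximum text field width quoted above


def encode_text(value):
    """Truncate a string to the maximum text width."""
    return value[:MAX_TEXT]


def encode_date(value):
    """Encode a date as 8 characters (YYYYMMDD); any time part is dropped."""
    return value.strftime("%Y%m%d")


long_note = "x" * 300
stored_note = encode_text(long_note)
stored_date = encode_date(datetime(2020, 10, 1, 14, 30))
```

Both conversions lose information silently, which is one reason to prefer a format without these limits when long texts or timestamps matter.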

KML

KML (Keyhole Markup Language) is an XML-based format, primarily used for Google Earth.
• It was developed by Keyhole Inc., which was later acquired by
Google. KML is an international open standard of the OGC.
• As an XML file, it can be easily manipulated with any text editor.
• KML was born as a format for spatial 2D and 3D
visualisation. It specifies a set of features such as
placemarks, images, paths, polygons, 3D models and
textual descriptions, and it can be displayed in various
geospatial software.
• For its reference system, KML uses 3D geographic
coordinates: longitude, latitude (both defined by WGS84) and
altitude (distance in meters from the EGM96 geoid vertical
datum). Longitudes west of Greenwich and latitudes south of
the Equator are negative.
• KML files have the extension .kml; the zipped version,
KMZ (KML-Zipped), has the extension .kmz.
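
As an illustration of the points above, a minimal KML document with one placemark can be written with the Python standard library and then zipped into a KMZ. The function names `placemark_kml` and `to_kmz`, the placemark name and the approximate coordinates are assumptions made for the example; naming the archive entry `doc.kml` follows a common convention.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name, lon, lat, alt=0.0):
    """Build a minimal KML document with one point Placemark."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    pm = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML orders coordinates as longitude,latitude,altitude.
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},{alt}"
    return ET.tostring(kml, encoding="unicode")


def to_kmz(kml_text):
    """Zip the KML text into an in-memory KMZ archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("doc.kml", kml_text)
    return buffer.getvalue()


# Approximate position in southern Mozambique: east is positive, south is negative.
doc = placemark_kml("Sample place", 32.57, -25.97)
kmz = to_kmz(doc)
```

Because the result is plain XML, the string in `doc` can be inspected or edited in any text editor before zipping.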

GML

GML (Geography Markup Language) is an XML-based open standard format
for GIS data exchange. It is often used in WebGIS and cartographic services.
• It is defined by the OGC.
• As an XML file, it can be easily manipulated with any text editor.
• It enables the definition of spatial entities: not only conventional
"vector" or discrete objects, but also coverages and sensor data. It can also
carry multimedia content such as text, video, and audio, along with a stylization of the
spatial phenomena, and properties such as coordinate reference systems,
geometry, topology, and time.
• It defines Features as distinct from geometry objects (Points, LineStrings
and Polygons). A feature is an application object that represents a physical
entity, e.g. a building, a river, or a person, and it can have geometric aspects.
A geometry object instead defines a location (coordinates) or region rather than a
physical entity, and hence is different and separated from a feature.
• The desired coordinate system must be specified explicitly.
• GML can be a very heavy format to process.
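
The explicit coordinate system mentioned above is typically carried by an `srsName` attribute on the geometry. The following is a small sketch of a GML point built with the standard library; the function `gml_point`, the identifier `p1`, the coordinates and the choice of a projected CRS (EPSG:32736) are assumptions made for the example, and the GML 3.2 namespace is used.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"


def gml_point(x, y, srs_name, gml_id="p1"):
    """Serialize a point as a gml:Point element with an explicit CRS."""
    ET.register_namespace("gml", GML_NS)
    attributes = {f"{{{GML_NS}}}id": gml_id, "srsName": srs_name}
    point = ET.Element(f"{{{GML_NS}}}Point", attributes)
    # gml:pos holds the coordinates separated by spaces.
    ET.SubElement(point, f"{{{GML_NS}}}pos").text = f"{x} {y}"
    return ET.tostring(point, encoding="unicode")


xml_text = gml_point(457000.0, 7127000.0, "EPSG:32736")
```

Even this single point needs a namespace declaration, an identifier and a CRS reference, which hints at why GML files tend to be verbose.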

GeoJSON

GeoJSON is a lightweight geospatial data interchange format based on
JavaScript Object Notation (JSON), used by many open source GIS packages.

• It is an open standard format designed for representing simple
geographical features, along with their non-spatial attributes.

• GeoJSON supports basic geometric primitives: points, lines, and
polygons, plus multi-part collections of these types (groups containing
multiple points, lines, or polygons treated as single entities, labelled
MultiPoint, MultiLineString, and MultiPolygon, respectively).
Geometric objects with additional properties are Feature objects
(spatially bounded entities). Sets of features are contained by
FeatureCollection objects.

[Figure: structure of GeoJSON Features, with the labels Feature, type, Properties (Attributes) and
Geometry (Geometric primitive, Coordinates)]

• In a FeatureCollection, the Features are not required to share the
same geometry type.

• The GeoJSON format differs from other GIS standards because it was
originally written and maintained not by a formal standards organization, but by
an Internet working group of developers; it was later published by the IETF as RFC 7946.
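
The structure described above can be reproduced with the standard `json` module. This is a minimal sketch: the helper `feature`, the property names and the coordinates are illustrative. It deliberately mixes geometry types in one FeatureCollection, and the polygon ring is closed (first and last position identical).

```python
import json


def feature(geometry_type, coordinates, **properties):
    """Build a GeoJSON Feature: a geometry plus its non-spatial attributes."""
    return {
        "type": "Feature",
        "geometry": {"type": geometry_type, "coordinates": coordinates},
        "properties": properties,
    }


collection = {
    "type": "FeatureCollection",
    "features": [
        feature("Point", [32.57, -25.97], name="sample point"),
        feature("LineString", [[32.5, -25.9], [32.6, -26.0]], name="sample line"),
        feature(
            "Polygon",
            # One exterior ring, listed counterclockwise and closed.
            [[[32.0, -26.0], [33.0, -26.0], [33.0, -25.0], [32.0, -26.0]]],
            name="sample area",
        ),
    ],
}

text = json.dumps(collection, indent=2)
```

Positions are written as [longitude, latitude], and the resulting text can be saved with a .geojson extension and opened in a GIS.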

How to import a Vector Layer on QGIS 3.10

It works for shapefiles but also for other vector formats.

Insert the chosen file path.

A vector layer visualized in QGIS: the Mozambique administrative provinces boundaries layer.

Vector or Raster Data models: how to choose?
Advantages and Disadvantages

Raster Data Models

Advantages:
✓ Better for storing image data
✓ Good for surface analysis
✓ Simple data structure
✓ Same grid cell for several attributes

Disadvantages:
× Dataset can be large: storage space can be a problem
× Network analysis is difficult to perform
× Loss of information when large cells are used
× Impossible to define topological information

Vector Data Models

Advantages:
✓ Since most information is in vector format, no data conversion is required
✓ Compact data structure: less space for storing data
✓ Efficient for network analysis
✓ Suitable for storing topological information

Disadvantages:
× Location of each vertex has to be stored explicitly
× Inefficient to represent high spatial variability, difficult to perform filtering
× Complex data structure

It depends on the input and on the goal: displaying or analyzing data?

• Vector models offer higher resolution while requiring less storage
• The vector model can be used to define the topological relationships among elements
• Vector data are recommended for network analysis
• Raster models allow easier and faster data processing
• Raster models allow an easier comparison between different maps of the same territory
• Raster models allow effective representation of spatially distributed phenomena

General recommendations:
• If the raster model is chosen, the pixel size has to be chosen carefully, depending on the analysis to perform.
• Conversions from one model to the other should be kept to a minimum, because every conversion adds
errors to the map.
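
The two recommendations can be made concrete with a small experiment: a point is converted to the raster cell that contains it and then back to a point at the cell center. The sketch assumes a north-up grid with its origin at the upper-left corner; all names and values are hypothetical.

```python
import math


def to_cell(x, y, x_origin, y_origin, cell_size):
    """Vector to raster: (row, col) of the cell that contains the point."""
    col = int((x - x_origin) // cell_size)
    row = int((y_origin - y) // cell_size)
    return row, col


def to_point(row, col, x_origin, y_origin, cell_size):
    """Raster to vector: the cell is turned back into its center point."""
    x = x_origin + (col + 0.5) * cell_size
    y = y_origin - (row + 0.5) * cell_size
    return x, y


def round_trip_error(point, cell_size, x_origin=0.0, y_origin=100.0):
    """Distance between a point and its position after both conversions."""
    cell = to_cell(point[0], point[1], x_origin, y_origin, cell_size)
    back = to_point(cell[0], cell[1], x_origin, y_origin, cell_size)
    return math.dist(point, back)


well = (12.0, 88.0)  # hypothetical point feature
small_cells = round_trip_error(well, 10.0)
large_cells = round_trip_error(well, 50.0)
```

In this example the positional error grows with the pixel size; in general it is bounded by half the cell diagonal, which is why both the cell size and the number of conversions matter.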

GeoPackage format
Characteristics

• It is an open, standard-based, platform-independent, portable, self-describing, compact format created to
share and transfer spatial data (both vector and raster).

• It is an alternative to the traditional raster and vector formats.

• It is a sort of «container» of geographic and tabular information.

• It allows the storage of the following elements within a SQLite database: vector features, tile matrix sets of
imagery and raster maps, extensions.

• The GeoPackage standard defines the schema for a GeoPackage file: table definitions, integrity assertions,
format limitations, content constraints.
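
Since a GeoPackage is a SQLite database, its content can be inspected with any SQLite client. The sketch below uses Python's `sqlite3` module on an in-memory stand-in: the real `gpkg_contents` table defined by the standard has more columns than the two created here, and the table names `provinces` and `imagery` are invented for the example.

```python
import sqlite3


def list_contents(connection):
    """Return (table_name, data_type) rows from the gpkg_contents table."""
    cursor = connection.execute(
        "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
    )
    return cursor.fetchall()


# Stand-in for sqlite3.connect("some_file.gpkg"): an in-memory database with
# a reduced gpkg_contents table (assumption: only two of its columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?)",
    [("provinces", "features"), ("imagery", "tiles")],
)
contents = list_contents(conn)
```

Vector layers are registered with the data type `features` and tiled rasters with `tiles`, so a single query is enough to see what the container holds.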

Advantages

• Open format, based on OGC standards. It is the «default format» for QGIS 3.0.

• It is a single file (.gpkg), thus preferable for geographic data transfer: it facilitates interoperability.

• It supports direct use: data can be directly accessed and modified in a «native» format.

• It is designed to store complex and large data.

• It allows field names to be longer than 10 characters and textual attributes to be longer than 255
characters.

• It is possible to store the style (symbology) of vector and raster files inside the same GeoPackage single file.

GeoPackage format in QGIS 3.10
How to create a GeoPackage layer

Full path of the file(s), including name and extension

Table name in the database

Geometry type (polygon, polyline, point, etc)

Projection

Add a connection to the new GeoPackage.

How to import a raster and a vector file in a GeoPackage

1. Add the raster file (e.g. a GeoTIFF) and the vector file (e.g. a shapefile) as new layers.
2. Drag and drop the layers into the GeoPackage.
3. If needed, define and save the symbology of the layers into the GeoPackage file.

[Screenshot callouts: GeoTIFF and shapefile added as new layers; drag and drop to the GeoPackage file;
define and save the symbology; save the symbology inside the GeoPackage]

Example of a GeoPackage containing a shapefile (administrative boundaries of Maputo Province) and a GeoTIFF file (digital
elevation model of the Maputo area).

