Basic Data Structures For Both Vector and Raster
1. A short introduction: from the Real World to Raster and Vector Data Models
2. Raster Data Model
• Raster Data Model: generalities
• Raster Data Model: formats
• How to import a Raster File in QGIS 3.10
3. Vector Data Model
• Vector Data Model: generalities
• Vector Data Model: formats
• How to import a Vector Layer in QGIS 3.10
4. Vector or Raster Data models: how to choose?
5. GeoPackage format
• GeoPackage format in QGIS 3.10
• Raster Model
• Vector Model
These models represent real objects differently. Real objects can be divided into two abstractions:
• Discrete objects (e.g. buildings, roads)
• Continuous fields (e.g. elevation, land use)
In the continuous-field view, the real world is described by a number of variables, each one
defined at every position on a continuous surface.
Both vector and raster data models are useful for representing geographic data, but one may be
more appropriate than the other when it comes to representing a particular type of geographic data
or answering different types of questions.
• The information is stored in a matrix made up of cells called “pixels” (smallest unit of the
image).
• Each pixel stores a single value or, when the raster represents an image, a combination of
values (e.g. the components of an RGB composition).
• The raster georeferencing is defined when the coordinates of a vertex and the spatial
orientation of the matrix (i.e. a single coordinate of another vertex) are known: the location
of every pixel can be derived from its row and column numbers and pixel dimensions.
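The derivation of a pixel location from its row and column numbers can be sketched as follows. This is a minimal illustration, assuming a north-up (unrotated) raster whose known vertex is the upper-left corner; the function name, the origin coordinates and the 25 m pixel size are hypothetical values chosen for the example.

```python
def pixel_center(row, col, x_origin, y_origin, pixel_width, pixel_height):
    """Return the map coordinates (x, y) of the centre of pixel (row, col).

    Assumptions (not from the slides): the raster is north-up and unrotated,
    (x_origin, y_origin) is the upper-left corner of the upper-left pixel,
    and rows are counted downwards from the top.
    """
    x = x_origin + (col + 0.5) * pixel_width
    y = y_origin - (row + 0.5) * pixel_height
    return x, y


# Hypothetical raster: upper-left corner at (500000, 7200000), 25 m pixels.
print(pixel_center(0, 0, 500000.0, 7200000.0, 25.0, 25.0))
print(pixel_center(2, 3, 500000.0, 7200000.0, 25.0, 25.0))
```

A rotated raster would need two additional terms per coordinate (a full affine transform); that case is left out of this sketch.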
[Figure: portion of territory; labels 0 m, 25 m, 50 m, 100 m]
• GeoTIFF is a format extension that stores georeferencing information (projection, coordinate system, ellipsoid and
datum) in a TIFF-compliant file.
• It includes header tags that define the image geometry (size, definition, image-data arrangement, image
compression).
• Proprietary file format originally developed for use with the ERDAS IMAGINE software for processing remote
sensing data.
• It is tied to a commercial package (it is not an open format), so interoperability and accessibility are limited.
Exporting the data to an open and more widely supported format, such as GeoTIFF, is recommended.
• NetCDF: originally developed for model-generated data (particularly for atmospheric models), it is becoming
important for representing observational data and for environmental applications.
A file consists of two parts:
1. The CF header: it contains dimensions, variables (data type, units, content description, missing data values) and
global attributes.
2. The data section: it contains the actual values for each dimension and variable listed in the header.
• Image compression standard (currently the most widespread), developed by the Joint Photographic Experts
Group (JPEG) committee.
• It is a compression technique designed to maintain the quality of large imagery: it allows a high
compression ratio and fast access to large amounts of data at any scale.
During the import, it is possible to choose the working directory and the raster file format.
Vector features:
• Points: for small geographic features (locations of trees, depth,
points of interest). Each point is a pair (or triplet) of coordinates
x, y (, z) representing a single location. A point has no dimension.
• Lines or Polylines: for linear features such as rivers, roads,
railroads, pipelines. A path links the coordinates of all the points
(vertices) of the line. Its length can be measured.
• Polygons: for geographic features that have an area (lakes, parks,
buildings). Polygon borders are lines whose first and last vertices
coincide. Both the perimeter and the area of a polygon can be measured.
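The measurements mentioned above (length of a line, perimeter and area of a polygon) can be sketched in a few lines. This is an illustration only, assuming planar (projected) coordinates and a simple, non-self-intersecting polygon; the function names and the sample road and park are hypothetical.

```python
import math


def polyline_length(vertices):
    """Sum of the distances between consecutive vertices of a line."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a closed ring (first vertex == last vertex), shoelace formula."""
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical features, coordinates in metres.
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
park = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]

print("road length:", polyline_length(road))
print("park perimeter:", polyline_length(park))
print("park area:", polygon_area(park))
```

Because the border of a polygon is itself a closed line, the same length function gives its perimeter.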
Vector features are grouped into layers: features in the same layer
have the same geometry type and generally represent similar thematic
information.
Each feature is stored in a database, along with its coordinates and its attributes (ID,
thematic attributes…).
[Figure: Mozambique administrative province boundaries layer]
• The Shapefile is one of the most popular GIS vector data formats, developed and regulated by Esri.
• It was introduced with ArcView GIS version 2 in the early 1990s, and it is now regulated as a mostly
open specification for data interoperability among Esri and other GIS software products.
• It can be easily read and written by a wide variety of software.
• The Shapefile format can spatially describe vector features as points, lines, and polygons with their
location. Each item usually has attributes that describe it. However, the Shapefile format lacks the
capacity to store topological information.
• The Esri Shapefile is a binary format.
In addition to the mandatory .shp, .shx and .dbf files, other files can be used, such as .prj (for the
projection description, readable in a text editor), .cpg (to specify the code page for the .dbf) or
.xml (for storing metadata).
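Since a shapefile is a set of files sharing the same base name, it can be useful to check that none of the mandatory components is missing before sending or loading it. The following is a minimal sketch, assuming lower-case extensions and that the mandatory components are .shp, .shx and .dbf, as in the Esri specification; the function name and the file names are hypothetical.

```python
from pathlib import Path

MANDATORY = (".shp", ".shx", ".dbf")  # main file, index file, attribute table
OPTIONAL = (".prj", ".cpg")           # projection description, .dbf code page


def shapefile_components(shp_path):
    """Report which component files exist next to the given .shp path.

    Returns (found, missing): a dict extension -> bool for every checked
    component, and the list of mandatory extensions that are absent.
    """
    base = Path(shp_path)
    found = {ext: base.with_suffix(ext).exists() for ext in MANDATORY + OPTIONAL}
    missing = [ext for ext in MANDATORY if not found[ext]]
    return found, missing


# Hypothetical usage: "provinces.shp" is an illustrative file name.
found, missing = shapefile_components("provinces.shp")
print("missing mandatory components:", missing)
```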
• Spatial representation
The edges of a polyline or polygon are composed of points. The spacing of the points implicitly determines the scale
at which the feature is represented. Additional points could be required to achieve smoother shapes or curves,
increasing the data storage size. The shapefile format does not support splines.
• Data storage
The size of both .shp and .dbf component files cannot exceed 2 GB. The table format for the .dbf component file is
based on an older dBase standard. Supported field types are: floating point (13 character storage), integer (4 or 9
character storage), date (no time storage; 8 character storage), and text (maximum 254 character storage).
• Topology
Shapefile does not store topological information.
• Style
The style is not stored in the shapefile files: when a shapefile is sent, the receiver cannot see how the sender
intended the vector data to be visualized.
• The GeoJSON format differs from other GIS standards because it was
written and is maintained not by a formal standards organization, but by
an Internet working group of developers.
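To give an idea of the format, the sketch below builds a small GeoJSON document with the standard Python json module. The structure (FeatureCollection, Feature, geometry, properties, longitude before latitude) follows RFC 7946; the attribute names and the approximate coordinates of Maputo are illustrative values, not data from these slides.

```python
import json

# One point feature: a geometry plus its attributes ("properties").
# RFC 7946 orders coordinates as [longitude, latitude].
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [32.58, -25.97]},
    "properties": {"id": 1, "name": "Maputo"},  # illustrative attributes
}

collection = {"type": "FeatureCollection", "features": [feature]}

text = json.dumps(collection, indent=2)  # GeoJSON is plain, readable text
parsed = json.loads(text)                # and can be read back as-is

print(text)
```

Unlike the binary Shapefile, the result is a single text file that can be opened in any text editor.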
[Figure: a vector layer visualized in QGIS (Mozambique administrative province boundaries layer)]
• Vector models offer higher spatial precision while requiring less storage
• Vector models can be used to define the topological relationships among elements
• Vector data are recommended for network analysis
• Raster models allow easier and faster data processing
• Raster models allow an easier comparison between different maps of the same territory
• Raster models allow an effective representation of spatially distributed phenomena
General recommendations:
• If a raster model is chosen, the pixel size must be selected carefully, depending on the analysis to be performed.
• Conversions from one data model to the other should be kept to a minimum, because every conversion adds
errors to the map.
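The effect of the pixel size on a vector-to-raster conversion can be illustrated with a toy example. The sketch below rasterizes a circular feature (a hypothetical lake with a 100 m radius) and compares the resulting area with the exact one. It assumes a centre-sampling rule, in which a cell belongs to the feature when its centre falls inside it; real GIS tools may apply other rules.

```python
import math


def rasterized_area(radius, pixel_size):
    """Area of a circle of the given radius after rasterization.

    The circle is centred on a cell corner; a cell is kept when its
    centre lies inside the circle (centre-sampling rule, an assumption).
    """
    n = math.ceil(radius / pixel_size)  # cells needed on each side of the centre
    inside = 0
    for row in range(-n, n):
        for col in range(-n, n):
            x = (col + 0.5) * pixel_size
            y = (row + 0.5) * pixel_size
            if x * x + y * y <= radius * radius:
                inside += 1
    return inside * pixel_size ** 2


true_area = math.pi * 100.0 ** 2
for size in (100.0, 50.0, 25.0):
    area = rasterized_area(100.0, size)
    error = 100.0 * (area - true_area) / true_area
    print(f"pixel size {size:5.0f} m: area {area:8.0f} m2, error {error:+.1f} %")
```

The rasterized area never matches the vector area exactly, and the size of the error depends on the pixel size, which is why the pixel size should be matched to the analysis and repeated conversions avoided.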
• GeoPackage allows the storage of the following elements within a SQLite database: vector features, tile matrix
sets of imagery and raster maps, extensions.
• The GeoPackage standard defines the schema for a GeoPackage file: table definitions, integrity assertions,
format limitations, content constraints.
• It is an open format, based on OGC standards. It is the «default format» in QGIS 3.
• It is a single file (.gpkg), thus preferable for geographic data transfer: it facilitates interoperability.
• It supports direct use: data can be directly accessed and modified in a «native» format.
• It allows field names longer than 10 characters and text attributes longer than 255
characters.
• It is possible to store the style (symbology) of vector and raster layers inside the same single GeoPackage file.
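Because a GeoPackage is a SQLite database, its content can be inspected with any SQLite client. The sketch below uses the Python sqlite3 module to read the gpkg_contents table, which the GeoPackage standard uses to list the stored layers; the function name is hypothetical.

```python
import sqlite3


def list_geopackage_layers(path):
    """Return (table_name, data_type) for every layer listed in a GeoPackage.

    In gpkg_contents, data_type is for instance "features" for a vector
    layer and "tiles" for a tiled raster layer.
    """
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    finally:
        conn.close()
    return rows
```

For a hypothetical file maputo.gpkg, calling list_geopackage_layers("maputo.gpkg") would return one row per stored layer.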
1. Add the raster file (e.g. a GeoTIFF) and the vector file (e.g. a shapefile) as new layers.
2. Drag and drop the layers into the GeoPackage.
3. If needed, define and save the symbology of the layers into the GeoPackage file.
Example of a GeoPackage containing a shapefile (administrative boundaries of Maputo Province) and a GeoTIFF file (digital
elevation model of the Maputo area).