Getting Started With PostGIS: Topic
In this first part of the lesson, you'll get an introduction to Postgres's graphical interface
called pgAdmin. You'll also import a shapefile, load data from a text file, and see how
queries are performed in pgAdmin.
1. Open pgAdmin 4. The application should open in your default web browser with a
pane on the left side of the window labeled Browser. Within the Browser, you
should see a tree with Servers at the top. Beneath Servers, you should see
a Postgres server.
2. Double-click on that server to open a connection to it. You will be logging in with the
default user name of postgres.
Enter the password you defined earlier for the postgres account when you installed
the software. You should now see 3 nodes beneath the localhost
server: Databases, Login/Group Roles, and Tablespaces.
3. Expand the Databases list. You should see at least one "starter"
database: postgres. It was created when you installed Postgres.
We want to create a new database dedicated to working with the PostGIS
functionality.
4. Right-click on the Databases list, and choose Create > Database.
5. In the Create - Database dialog, set the Database to ITECH2004DB, and from
the Owner list, select the postgres user name.
Hit Save.
6. Now, click on the ITECH2004DB database to expand its list of contents.
Right-click on Extensions, and select Create > Extension.
7. In the Create Extension dialog under the General tab, set the Name to postgis.
In the same dialog, select the Definition tab, and set the Version to 3.0.
(The settings you just established are reflected under the SQL tab.)
Click Save to dismiss the Create Extension dialog.
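For reference, the extension can also be enabled from a query window instead of the dialog. The following is a minimal sketch, assuming you are connected to the ITECH2004DB database as the postgres user; it does not pin a specific version the way the Definition tab does.

```sql
-- Enable PostGIS in the current database (no-op if it is already installed).
CREATE EXTENSION IF NOT EXISTS postgis;

-- Report the installed PostGIS version and the libraries it was built against.
SELECT PostGIS_Full_Version();
```

If the second statement returns a version string, the extension is available in this database.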
We will next concern ourselves with schemas. In Postgres, schemas are the
containers for a set of related tables. Generally speaking when you begin a new
project, you'll want to create a new schema.
8. Expand the Schemas list. At this point, you should see only one
schema: public. We'll have a look at the public schema soon; but for now, let's
create a new schema. This schema will store data for the United States that we will
use in the next two lessons.
9. Right-click on Schemas, and select Create > Schema.
10. In the Create Schema dialog, specify a name of usa.
11. Set the Owner of the schema to postgres.
12. Click Save to create the schema.
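The same schema can be created with a single SQL statement. This is a sketch of the equivalent command, assuming it is run in a query window connected to ITECH2004DB; it is an alternative to the dialog, not an additional step.

```sql
-- Create the usa schema and make the postgres role its owner.
-- IF NOT EXISTS avoids an error if the schema was already created in the GUI.
CREATE SCHEMA IF NOT EXISTS usa AUTHORIZATION postgres;
```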
A common workflow for PostGIS users is to convert their data from Esri shapefile format
to PostGIS tables. Fortunately, some PostGIS developers have created a Shapefile
Import/Export Manager that makes this conversion easy. In prior versions of pgAdmin, the
shapefile importer was accessible as a plug-in. In pgAdmin 4, it must be run as a separate
application.
Note:
If you encounter an error that the file "libintl-8.dll is missing," the easiest fix for this
problem is to navigate up to the bin folder where libintl-8.dll is found, copy it, and
paste it into the postgisgui folder.
Since we will be using this executable several times, I suggest that you make a
desktop shortcut for it.
3. At the top of the application window, click the View connection details button.
4. Confirm that the PostGIS Connection parameters are set as follows:
Username: postgres
Password: <the password you set when installing the software>
Server Host: localhost (port 5432)
Database: ITECH2004DB
Before performing the import, let's spend a moment discussing the SRID
(Spatial Reference IDentifier) setting. This ID is set to 0 by default, a value that
indicates the spatial reference of the shapefile is unknown. As with other GIS
applications, defining a dataset's spatial reference is critical in enabling most of the
functionality we typically need. We'll talk more about SRIDs later. For now, it's
sufficient to know that 4269 is the ID associated with the decimal degree/NAD83
coordinate system used by the Lesson 3 data.
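If you want to see what a given SRID refers to, PostGIS keeps its catalog of spatial references in the spatial_ref_sys table. A minimal sketch, assuming the postgis extension has been created in the current database:

```sql
-- Look up the definition that PostGIS stores for SRID 4269 (NAD83, decimal degrees).
SELECT srid, auth_name, auth_srid, srtext
FROM spatial_ref_sys
WHERE srid = 4269;
```

The srtext column holds the well-known text description of the coordinate system.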
8. Click Import. After just a moment, the Log Window area of the dialog should report
that the import process has been completed.
9. Close the Import/Export Manager application by clicking the X button in the upper
right of the dialog or on the Cancel button.
10. Back in pgAdmin, expand the object list associated with the usa schema.
11. Click on Tables. You should now see the newly imported states table.
Note:
It's sometimes necessary to refresh the GUI after creating new objects like this. This
can be done by right-clicking on the schema or Tables node in the Browser and
selecting Refresh (or hitting F5 on the keyboard).
12. Right-click on the states table and select View/Edit Data > First 100 Rows.
Note the other options in this context menu which are rather straightforward to
understand.
While looking at the table, note that the column headers include not just the names
of the columns but also their data types. The gid column is an auto-incrementing
integer column that was added by the importer. The presence of [PK] in its header
indicates that it was also designated as the table's primary key.
Also note that the geom column header contains an "eye" icon. Clicking it allows
you to preview the column's geometries in pgAdmin's built-in geometry viewer.
13. Repeat these steps to import the us_cities shapefile. Truncate its name to
just cities.
Loading data from a comma-delimited text file is a common workflow for database
developers. Let's see how this can be done in Postgres by loading some state
demographic info from the 2010 Census. We'll begin by creating a new blank table.
1. To create a new table, right-click on Tables under the usa schema and
select Create > Table.
2. Under the General tab, set the table's Name to census2010 and
the Owner to postgres.
3. Under the Columns tab, click the + button. You should see an entry appear for
setting the new column's properties.
4. Set the column's Name to state, its Data type to character varying and
its Length to 50. Finally, set its Primary key value to Yes.
5. Repeat the last two steps to add a column with the Name total and Data type
of integer. The length property need not be set for the integer type and the column
should not be defined as the primary key.
6. Add the following additional columns to the table, all as integer data type.
Note:
Instead of having to expand the Data Type pick list, you can start typing the
word integer, and the slot will let you auto fill with choices. After you type inte, you
can pick "integer." Be sure to add the columns in this order, otherwise the data load
will not work properly.
male
female
white
black
amind
asian
hawaiian
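The table built through the dialog corresponds roughly to the statement below. This is a sketch covering only the columns named above, in the same order; use it instead of the GUI steps, not in addition to them.

```sql
-- SQL equivalent of the census2010 table defined in the Create Table dialog.
-- Column order matters because the data load maps file columns by position.
CREATE TABLE IF NOT EXISTS usa.census2010 (
    state    character varying(50) PRIMARY KEY,
    total    integer,
    male     integer,
    female   integer,
    white    integer,
    black    integer,
    amind    integer,
    asian    integer,
    hawaiian integer
);
```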
Before executing the command that will import the data into the table let's have a look at
the data file in a plain text editor and also note its location.
Let's look for a moment at the options set in the WITH clause. If our input file were
tab-delimited, we would use a FORMAT setting of text rather than csv. We set the
HEADER option to True since our file contains a header row. And we set the
QUOTE option to the double-quote character to indicate that the input file encloses
text strings with that character. A number of other options are available and can be
found on the COPY command's page [5] in the documentation. Among the other
options that you may need to set is DELIMITER. This defaults to the comma for csv
files and to tab for text files. If your file uses another delimiter, such as the pipe
character (|), you can indicate that using the DELIMITER option.
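The options discussed above fit together as in the sketch below. The file path and file name are hypothetical placeholders; substitute the location of your own data file.

```sql
-- Load a comma-delimited file that has a header row and double-quoted strings.
-- 'C:/temp/census2010.csv' is an assumed path used only for illustration.
COPY usa.census2010
FROM 'C:/temp/census2010.csv'
WITH (FORMAT csv, HEADER true, QUOTE '"');

-- Variant for a pipe-delimited file (again with a hypothetical path).
-- COPY usa.census2010
-- FROM 'C:/temp/census2010.txt'
-- WITH (FORMAT csv, HEADER true, DELIMITER '|');
```

Note that COPY reads the file from the database server's file system, which is why the permissions of the postgres account matter.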
The COPY command attempts to insert values from the first column of the input file
into the first column of the table, values from the second column of the input file into
the second column of the table, etc. The HEADER option simply tells Postgres to
skip the first line; it does not read the column headers or try to match the
columns of the input file to the columns of the table by name. If your table happens to have
more columns than the input file, and/or the columns are in a different order, you
can deal with this by supplying to the COPY command a list of column names that
matches the input file after the table name. For example:
COPY usa.census2010 (state, total, male....) FROM ....
Note:
If you encounter a "permission denied" error, it means the "postgres" database login
doesn't have permission to read the csv file where it is currently located. Try copying
it to a sub-directory belonging to the "Public" user (e.g., 'C:\Users\Public\Public
Documents') or to a location that has no permission restrictions (e.g., 'C:\temp').
You could also reset the permissions on the folder that stores the CSV file as
outlined in this Stack Overflow thread:
http://stackoverflow.com/questions/14083311/permission-denied-when-trying-to-import-a-csv-file-from-pgadmin [6]
7. Confirm that the data loaded properly using the method you used for the states
table.
1. Click on Tools > Query Tool again to open a new query tab.
2. In the SQL box, enter the following query to identify the states where most of the
population uses the term 'Soda' when referring to soft drinks:
SELECT name, sub_region
FROM states
WHERE sub_region = 'Soda';
5. Run the query by clicking the Execute button on the toolbar. You should receive an
error message that the relation "states" does not exist.
6. The reason for this error has to do with pgAdmin's search path. Among other things
the search path determines which schema(s) will be scanned when tables are
specified using unqualified names (e.g., like we just did with "states"). There are two
solutions to this problem. The first is to qualify all table names with their parent
schema. For example:
SELECT name, sub_region
FROM usa.states
WHERE sub_region = 'Soda';
The second solution is to reset the search path so that the schema you're
using is part of that path. By default, only the public schema (and a schema
named after the current user, if one exists) is searched.
We will take this second approach since it allows us to omit the schema qualifier.
9. So, highlight the text of your query and cut it out of the editor window. We'll be
pasting it back in momentarily.
10. Enter the following statement into the SQL Editor:
SET search_path TO usa, public;
11. Run this query by clicking the Execute button. You should receive a message
that the query returned successfully, though you should expect no tabular output
from a query like this. pgAdmin will now look for unqualified tables first in
the usa schema, then in the public schema. We include the public schema
because the search path is used not just for searching for tables but also for
functions. When we move on to spatial queries, we'll need to have access to some
of the functions available in the public schema.
If you are curious, you can run the following query to find out what the search path
is set to.
SHOW search_path;
12. You can now retry the query that you cut out. You should see a list of 17
states in the Output pane at the bottom of the window.
Note:
You may still be receiving an error if you left the table's name set to States rather
than states during the import process; pgAdmin converts all table/column names to
lower-case prior to execution by default. Thus, even if your FROM clause reads
"FROM States", it will be evaluated as "FROM states". And if your table is
named States, pgAdmin won't find a matching table. To override this case
conversion, you can put the table/column name in double quotes like this:
SELECT name, sub_region
FROM "States"
WHERE sub_region = 'Soda';
To avoid having to qualify your table/column names in this way, it's best to use lower
case in your naming.
13. While on the subject of case, you may have noticed that my examples place
all SQL keywords in upper case and table/column names in lower case. This is a
convention that is followed by many SQL developers because it makes it easy to tell
at a glance which parts of the statement are SQL keywords and which are schema
elements. This is just a convention not a requirement, so you should feel free to
deviate from it if you prefer. For example, this query will produce the same results:
select name, sub_region
from states
where sub_region = 'Soda';
To help you get oriented to writing SQL queries on the pgAdmin command line, try your
hand at the following exercises. Recall that the 2008 population data, soft drink data, and
geometries are in the states table, and that the 2010 data are in the census2010 table.
Solutions [7] (This link takes you to the bottom of the Lesson.)
What sets spatial databases apart from their non-spatial counterparts is their support for
answering geometric and topological questions. Let's have a look at some simple
examples.
1. Return to the Query dialog in pgAdmin and execute the following query:
SELECT name, ST_Centroid(geom) AS centroid
FROM states
WHERE sub_region = 'Soda';
The obvious difference between this and our earlier queries is that it calls upon a
function called ST_Centroid(). Like the functions we worked with in Lesson 1, the
ST_Centroid() function accepts inputs and returns outputs. Here we supply
the geom column as an input to the function, and it returns the geometric centers of
the shapes stored in that column.
You've probably noticed that the output from ST_Centroid() is not human friendly. It
contains the coordinates of a point in the coordinate system of the input column, but
expressed in hexadecimal notation [8]. To display the coordinate values in a more
readable form, we can nest the call to the ST_Centroid() function within a call to a
function named ST_AsText().
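The nesting described above can be sketched as follows. The table is schema-qualified here so the query works regardless of the current search path.

```sql
-- Wrap ST_Centroid() in ST_AsText() to get readable coordinates,
-- e.g. a value of the form POINT(longitude latitude).
SELECT name, ST_AsText(ST_Centroid(geom)) AS centroid
FROM usa.states
WHERE sub_region = 'Soda';
```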
Now, let's try retrieving the areas of the states using ST_Area().
In the area column, take note of the values returned by ST_Area(). They are in the
units of the input geometry, squared. Recall that the Lesson 3 shapefiles are in
latitude/longitude coordinates, which means the area values we're seeing are in
square degrees. Hopefully, you recognize that this is a poor way to compute area
since a square degree represents a different area depending on the part of the
globe in which it lies.
Take note of the values now displayed in the area column. In this version of the
query, the ST_Transform() function is first used to re-project the geometry into the
spatial reference 2163 before ST_Area() is called. That spatial reference is an
equal-area projection in meters that is suitable for the continental US. Don't worry,
we'll discuss how you'd find that information later in this lesson.
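The two area computations discussed in this step can be compared side by side. This is a sketch of what such queries might look like, assuming the states table imported earlier with SRID 4269; the column aliases are illustrative.

```sql
-- Area in the units of the stored geometry (square degrees for SRID 4269),
-- next to the area after re-projecting to SRID 2163 (square meters).
SELECT name,
       ST_Area(geom) AS area_sq_degrees,
       ST_Area(ST_Transform(geom, 2163)) AS area_sq_meters
FROM usa.states
ORDER BY name;
```

Dividing area_sq_meters by 1,000,000 would express the result in square kilometers.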
We'll spend much more time discussing the spatial functions that are available in PostGIS
later. Right now, let's go over the geometry types that are supported.
In the last section, we worked with a table – usa.states – containing geometries of the
type POLYGON. The other basic geometry types are POINT and LINESTRING. As we'll
see momentarily, there are numerous other geometry types available in PostGIS that
allow for the storage of multipart shapes, 3-dimensional shapes, and shapes that have a
measure (or M value) associated with their vertices. If keeping all of the various types
straight becomes difficult, it may help to remember that the simple geometries we deal
with most often are POINT, LINESTRING, and POLYGON.
To demonstrate some of the concepts in this section, we're going to create a new schema
to store points of interest in New York City. Unlike the last schema where we used the
Shapefile Import/Export Manager to both create and populate a table at the same time,
here we'll carry out those steps separately.
1. In the Browser pane within pgAdmin, right-click on the Schemas node beneath
the ITECH2004DB database and select Create > Schema.
The last column we want to add to the table is one that will hold the geometries.
While it's possible to add a column of type 'point' through the GUI, there are a
number of other important settings that should be made when adding a geometry
column (such as its spatial reference ID, or SRID). These settings are all handled by
a PostGIS maintenance function called AddGeometryColumn(), so that is the
route we will take.
9. Click Save to dismiss the dialog and create the table. Before adding the geometry
column to the table, let's recall the pgAdmin search path. It's not set to include the
nyc_poi schema, so let's do that first.
10. Reset the search path by executing the following statement in
a Query window:
SET search_path TO nyc_poi, public;
11. Now add a geometry column called geom to the table by executing this
statement. (You can re-use the Query window you already have open for this and
subsequent queries.)
SELECT AddGeometryColumn('nyc_poi','pts','geom',4269,'POINT',2);
First, let's address the unusual syntax of this statement. You've no doubt grown
accustomed to listing column names (or *) in the SELECT clause, but here we're
plugging in a function without any columns. We're forced to use this awkward syntax
because SQL rules don't allow for invoking functions directly. Function calls must be
made in one of the statement types we've encountered so far (SELECT, INSERT,
UPDATE, or DELETE). In this situation, a SELECT statement is the most
appropriate.
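To confirm what AddGeometryColumn() registered, you can query the geometry_columns view that PostGIS maintains. A minimal sketch, assuming the pts table lives in the nyc_poi schema as described above:

```sql
-- List the geometry columns PostGIS knows about in the nyc_poi schema,
-- along with their SRID, geometry type, and coordinate dimension.
SELECT f_table_schema, f_table_name, f_geometry_column,
       coord_dimension, srid, type
FROM geometry_columns
WHERE f_table_schema = 'nyc_poi';

-- Alternative technique (not the one used in this lesson): recent PostGIS
-- versions also accept a typed column definition directly, for example
-- ALTER TABLE nyc_poi.pts ADD COLUMN geom geometry(Point, 4269);
```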
We're about to add rows to our pts table through a series of INSERT statements. You'll
find it much easier to copy and paste these statements rather than typing them manually,
if not now, then certainly when we insert polygons later using long strings of coordinates.
1. Execute the following statement to insert a row into the pts table.
INSERT INTO pts (name, geom)
VALUES ('Empire State Building', ST_GeomFromText('POINT(-73.985744 40.748549)',4269));
The key point to take away from this statement (no pun intended) is the call to
the ST_GeomFromText() function. This function converts a geometry supplied in
text format to the hexadecimal form that PostGIS geometries are stored in. The
other argument is the spatial reference of the geometry. This argument is required in
this case because when we created the geom column
using AddGeometryColumn(), it added a constraint that values in that column
must be in a particular spatial reference (which we specified as 4269).
3. Execute the statements below to add a couple more rows to the table. Note that
while we've executed single statements thus far in the lesson, you are also allowed
to execute multiple statements in succession.
INSERT INTO pts (name, geom)
VALUES ('Statue of Liberty', ST_GeomFromText('POINT(-74.044508 40.689229)',4269));

INSERT INTO pts (name, geom)
VALUES ('World Trade Center', ST_GeomFromText('POINT(-74.013371 40.711549)',4269));
10. Finally, add two more rows using the statement below. Note that in this step
you're adding multiple rows using a single statement.
INSERT INTO pts (name, geom)
VALUES ('Radio City Music Hall', ST_GeomFromText('POINT(-73.97988 40.760171)',4269)),
('Madison Square Garden', ST_GeomFromText('POINT(-73.993544 40.750541)',4269));
13. In the pgAdmin window, right-click on the pts table and select View/Edit
Data > All Rows to confirm that the INSERT statements executed properly.
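The rows can also be checked with a query that converts the stored geometries back to text. This is a small sketch; the aliases wkt and srid are illustrative names.

```sql
-- Show each point in readable form together with its spatial reference ID.
SELECT name,
       ST_AsText(geom) AS wkt,
       ST_SRID(geom)   AS srid
FROM nyc_poi.pts
ORDER BY name;
```

Every row should report the SRID that was supplied to ST_GeomFromText().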
1. Repeat the steps (in Part A above) to create a new table within the nyc_poi schema,
that will hold NYC line features. Pay particular attention to these differences:
o Give the table a name of lines.
o The table should have the same column definitions, with the exception that
the geometry type should be set to LINESTRING rather than POINT.
o No need to set the search path again as it will already include the nyc_poi
schema.
2. Execute the following statement to insert 3 new rows into the lines table:
INSERT INTO lines (name, geom)
VALUES ('Holland Tunnel',ST_GeomFromText('LINESTRING(
-74.036486 40.730121,
-74.03125 40.72882,
-74.011123 40.725958)',4269)),
('Lincoln Tunnel',ST_GeomFromText('LINESTRING(
-74.019921 40.767119,
-74.002841 40.759773)',4269)),
('Brooklyn Bridge',ST_GeomFromText('LINESTRING(
-73.99945 40.708231,
-73.9937 40.703676)',4269));
Note that I've split this statement across several lines to improve its readability, not
for any syntax reasons. You should feel welcome to format your statements
however you see fit.
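A quick way to inspect the inserted lines is to ask PostGIS for a few of their properties. This is a sketch, assuming the lines table was created in the nyc_poi schema as instructed; the aliases are illustrative.

```sql
-- Report the geometry type, vertex count, and end points of each line.
SELECT name,
       ST_GeometryType(geom)           AS geom_type,
       ST_NumPoints(geom)              AS vertices,
       ST_AsText(ST_StartPoint(geom))  AS start_point,
       ST_AsText(ST_EndPoint(geom))    AS end_point
FROM nyc_poi.lines;
```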
1. Repeat the steps (in Part A) to create a new table within the nyc_poi schema, with
the following exceptions:
o Give the table a name of polys.
o Set the geometry type of the geom column to POLYGON rather than
LINESTRING.
2. Execute the following statement to add a row to your polys table:
INSERT INTO polys (name, geom)
VALUES ('Central Park',ST_GeomFromText('POLYGON((
-73.973057 40.764356,
-73.981898 40.768094,
-73.958209 40.800621,
-73.949282 40.796853,
-73.973057 40.764356))',4269));
While the syntax for constructing a polygon looks very similar to that of a linestring,
there are two important differences:
o The first X/Y (lon/lat) pair should be the same as the last (to close the
polygon).
o Note that the coordinate list is enclosed in an additional set of parentheses.
This set of parentheses is required because polygons are actually composed
of potentially multiple rings. Every polygon has a ring that defines its exterior.
Some polygons also have additional rings that define holes in the interior.
When constructing a polygon with holes, the exterior ring is supplied first
followed by the interior rings. Each ring is enclosed in a set of parentheses,
and the rings are separated by commas.
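The ring syntax can be illustrated with a toy polygon that is unrelated to the New York data: a 10 x 10 square with a 4 x 4 square hole. The coordinates are made up purely to show the structure.

```sql
-- Exterior ring first, then one interior ring (the hole), each in its own parentheses.
SELECT ST_NumInteriorRings(g) AS holes,
       ST_Area(g)             AS area
FROM (
    SELECT ST_GeomFromText(
        'POLYGON((0 0, 10 0, 10 10, 0 10, 0 0),
                 (3 3, 7 3, 7 7, 3 7, 3 3))') AS g
) AS t;
```

The query should report one interior ring and an area of 84, that is, the 100-unit square minus the 16-unit hole.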
To see an example, let's add Central Park again, this time cutting out the large
reservoir near its center.
9. First, let's remove the original Central Park row. In pgAdmin, right-click on
the polys table and select Truncate > Truncate. Note that this deletes all rows from
the table.
Earlier in this section, we discussed 3-dimensional (XYZ and XYM) and 4-dimensional
(XYZM) geometries in the context of properly specifying the dimension argument to the
AddGeometryColumn() function. We won't be doing so in this course, but let's look for a
moment at the syntax used for creating these geometries.
To define a column that can store M values as part of the geometry, use the POINTM,
LINESTRINGM, and POLYGONM data types. When specifying objects of these types, the
M value should appear last. For example, an M value of 9999 is attached to each
coordinate in these features from our nyc_poi schema:
POINTM(-73.985744 40.748549 9999)
Perhaps the most common usage of M coordinates is in linear referencing (e.g., to store
the distance from the start of a road, power line, pipeline, etc.). This Wikipedia article
on Linear Referencing [9] provides a good starting point if you're interested in learning
more.
To define a column capable of storing Z values along with X and Y, use the "plain" POINT,
LINESTRING and POLYGON data types rather than their "M" counterparts. The syntax for
specifying an XYZ coordinate is the same as that for an XYM coordinate. The "plain" data
type name tells PostGIS that the third coordinate is a Z value rather than an M value. For
example, we could include sea level elevation in the coordinates for the Empire State
Building (in feet):
POINT(-73.985744 40.748549 190)
Finally, in the event you want to store both Z and M values, again use the "plain" POINT,
LINESTRING and POLYGON data types. The Z value should be listed third and the M
value last. For example:
POINT(-73.985744 40.748549 190 9999)
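To see how PostGIS interprets these coordinate lists, you can parse the text and read the individual ordinates back. A minimal sketch using the example values above; nothing is written to any table.

```sql
-- Parse a four-ordinate point and extract its Z and M values and dimension count.
SELECT ST_Z(g)     AS z_value,
       ST_M(g)     AS m_value,
       ST_NDims(g) AS dimensions
FROM (
    SELECT ST_GeomFromText('POINT(-73.985744 40.748549 190 9999)', 4269) AS g
) AS t;

-- With the POINTM type, the third ordinate is read as M rather than Z.
SELECT ST_M(ST_GeomFromText('POINTM(-73.985744 40.748549 9999)', 4269)) AS m_value;
```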
F. Multipart geometries
PostGIS provides support for features with multiple parts through the MULTIPOINT,
MULTILINESTRING, and MULTIPOLYGON data types. A classic example of multipart
geometry is the state of Hawaii which is composed of multiple disconnected islands. The
syntax for specifying a MULTIPOLYGON builds upon the rules for a regular POLYGON;
the parts are separated by commas and an additional set of parentheses is used to
enclose the full coordinate list. The footprints of the World Trade Center Towers 1 and 2
(now fountains in the 9/11 Memorial) can be represented as a single multipart polygon as
follows:
MULTIPOLYGON(((-74.013751 40.711976,
-74.01344 40.712439,
-74.012834 40.712191,
-74.013145 40.711732,
-74.013751 40.711976)),
((-74.013622 40.710772,
-74.013311 40.711236,
-74.012699 40.710992,
-74.013021 40.710532,
-74.013622 40.710772)))
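The multipart structure can be verified by counting the parts and pulling one of them out. This sketch parses the text shown above without storing it; the aliases are illustrative.

```sql
-- Count the parts of the multipolygon and check the type of its first part.
SELECT ST_NumGeometries(g)                  AS parts,
       ST_GeometryType(ST_GeometryN(g, 1))  AS first_part_type
FROM (
    SELECT ST_GeomFromText(
        'MULTIPOLYGON(((-74.013751 40.711976, -74.01344 40.712439,
                        -74.012834 40.712191, -74.013145 40.711732,
                        -74.013751 40.711976)),
                      ((-74.013622 40.710772, -74.013311 40.711236,
                        -74.012699 40.710992, -74.013021 40.710532,
                        -74.013622 40.710772)))', 4269) AS g
) AS t;
```

ST_GeometryN() uses 1-based indexing, so passing 2 would return the second footprint.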
G. Mixing geometries
The tables we've created so far reflect a bias toward Esri-centric design with each table
storing a single column of homogeneous geometries (i.e., all points, or all lines, or all
polygons, but not a mix). However, PostGIS supports two design approaches that are
good to keep in mind when putting together a database:
Let's see how this heterogeneous column approach can be used to store all of our
nyc_poi data in the same table.
1. Repeat the steps (in Part A) to create a new table within the nyc_poi schema. Pay
particular attention to these differences:
o Give the table a name of mixed.
o The table should have the same column definitions with the exception that the
geometry type should be set to GEOMETRY.
2. Add the same features to this new table by executing the following statement:
INSERT INTO mixed (name, geom)
VALUES ('Empire State Building', ST_GeomFromText('POINT(-73.985744 40.748549)',4269)),
('Statue of Liberty', ST_GeomFromText('POINT(-74.044508 40.689229)',4269)),
('World Trade Center', ST_GeomFromText('POINT(-74.013371 40.711549)',4269)),
('Radio City Music Hall', ST_GeomFromText('POINT(-73.97988 40.760171)',4269)),
('Madison Square Garden', ST_GeomFromText('POINT(-73.993544 40.750541)',4269)),
('Holland Tunnel',ST_GeomFromText('LINESTRING(
-74.036486 40.730121,
-74.03125 40.72882,
-74.011123 40.725958)',4269)),
('Lincoln Tunnel',ST_GeomFromText('LINESTRING(
-74.019921 40.767119,
-74.002841 40.759773)',4269)),
('Brooklyn Bridge',ST_GeomFromText('LINESTRING(
-73.99945 40.708231,
-73.9937 40.703676)',4269)),
('Central Park',ST_GeomFromText('POLYGON((
-73.973057 40.764356,
-73.981898 40.768094,
-73.958209 40.800621,
-73.949282 40.796853,
-73.973057 40.764356))',4269));
24. In the pgAdmin window right-click on the mixed table and select View/Edit
Data > All Rows to confirm that the INSERT statement executed properly.
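Because the geom column of the mixed table accepts any geometry type, it can be useful to summarize what it actually contains. A minimal sketch, assuming the table sits in the nyc_poi schema:

```sql
-- Count the features of each geometry type stored in the heterogeneous column.
SELECT ST_GeometryType(geom) AS geom_type,
       count(*)              AS features
FROM nyc_poi.mixed
GROUP BY ST_GeometryType(geom)
ORDER BY geom_type;
```

The same ST_GeometryType() test can be used in a WHERE clause to pull out only the points, lines, or polygons.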
At some point in this lesson, you probably thought to yourself, "This is fine, but what if I
want to see the geometries?" That is the focus of the next part of the lesson where we will
use the third-party application Quantum GIS (QGIS) to view our PostGIS data.
1. Open QGIS. It'll be the QGIS Desktop X.x choice from the QGIS folder in your Start
menu.
The basic elements of the application GUI are similar to ArcMap's. The Layers
panel in the lower left of the window lists the project layers and their symbology,
while the much wider pane to the right displays the layer features themselves.
Across the top of the window is a set of toolbars that can be moved to custom
positions by the user.
Above the Layers panel is the Browser panel, which provides an interface for
browsing data sources. Moving from top to bottom:
o Favorites - for enabling easy access to frequently used folders on your file
system
o Home - for accessing data located within your folder in C:\Users
o C:\ - for accessing data anywhere on your hard drive
You're welcome to choose either method, though for the purposes of this
class, it should be fine to select the Basic method.
Figure 3.1: The Create a New PostGIS connection dialog, showing the correct
information entered.
4. You should click the Test Connection button to make sure you have typed things
correctly.
Now let's take a quick tour of how some common GIS operations are performed in
QGIS.
1. Zoom to the full extent of all the layers by selecting View > Zoom Full.
2. Click and drag to rearrange the layers. Put them in the following order, from top to
bottom: pts, lines, polys, and then the mixed pts, lines, and polys layers.
Note the various features.
3. The pts, lines, and polys layers are redundant with the three layers based on
the mixed table, so turn off pts, lines, and polys by clicking the x next to each
layer.
While playing with layer visibility, you may note that the polys layer contains a hole
in the Central Park polygon, whereas the "mixed" version of that polygon does not.
This is to be expected, since we didn't bother to create that inner ring in the mixed
version.
4. Each layer based on the mixed table has the same name. Go ahead and re-name
these layers: mixed - lines, mixed - pts, and mixed - polys by right-clicking on the
layers one at a time and selecting the Rename command.
The Symbology tab is where you'd go to change the way a layer is symbolized.
Note the pick list at the very top of the dialog which provides Single Symbol,
Categorized, and Graduated options, etc.
The Actions tab provides functionality similar to ArcGIS's Hyperlink settings for
launching external applications to view data found in the attribute table such as
images and URLs.
The Joins tab is where you'd go if you need to join data from another table to the
layer's attribute table.
The Diagrams tab provides settings for creating pie chart and histogram (bar) chart
overlays from numeric data in the attribute table.
You can investigate the other elements of the Layer Properties at your leisure.
6. Now, still in the dialog for the mixed - lines Layer Properties, note the locations of
the three line features, then go to the Source tab. Recall that the three line features
represent two tunnels (the longer line features) and a bridge. We are going to
restrict the layer to showing just tunnels by doing the following:
o Click on the Query Builder button at the lower right of the Layer Properties
dialog.
o In the Provider specific filter expression box, compose the expression
shown below, or just copy-paste it.
o Click the Test button to verify the veracity of your expression, then OK after
confirming that it returns 2 rows.
o Click OK to dismiss the Query Builder dialog.
You'll see the expression mirrored in the Provider feature filter box of the
Layer Properties dialog.
o Click OK to dismiss the Layer Properties dialog.
You should see that the Brooklyn Bridge is no longer displayed as part of the
layer.
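A provider filter is effectively the WHERE clause of a query against the layer's table, so a candidate expression can be tried in pgAdmin first. The expression below is an assumption that fits the names inserted earlier; it is not necessarily the one intended in the step above.

```sql
-- Keep only features whose name ends in 'Tunnel' (assumed filter expression).
SELECT gid, name
FROM nyc_poi.mixed
WHERE name LIKE '%Tunnel';
```

In the Query Builder, only the part after WHERE (here, name LIKE '%Tunnel') would be entered.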
7. An intuitive set of zoom/pan tools can be found on the Map
Navigation Toolbar (the bar containing the "hand" icon). Clicking on the Pan tool
activates it; you can then click-and-drag in the map display area to alter the visible
extent.
gid < 4
Note that you can build the expression graphically by expanding the Fields and
Values and Operators lists, then double-clicking on items in those lists.
Under the General tab, you can: set the selection and background colors, specify
whether references to data sources should be stored with relative or absolute paths,
and set the display units of the project.
Under the CRS (Coordinate Reference System) tab, you can specify on-the-fly
coordinate system transformation settings. Let's re-project our data into the New
York East State Plane system.
15. Click on the CRS tab.
16. The Coordinate reference systems of the world section of the dialog
(where you pick a coordinate system) is quite small, unfortunately. Click and drag
the scrollbar up to the top of the list and click the minus sign [-] boxes to
collapse the lists. Note that the options are categorized into Geographic
Coordinate Systems and Projected Coordinate Systems.
In Postgres and other sophisticated RDBMSs, stored SQL statements like these are
called views. In this section, we'll see how views can be created in Postgres.
1. In the pgAdmin Query dialog, execute the following query (which identifies the state
capitals):
2. After confirming that the query returns the correct rows, copy the SQL to your
computer's clipboard (Ctrl-C).
3. Back in the main pgAdmin window, navigate to the usa schema.
4. Right-click on the Views node and select Create > View.
5. Set the view's Name to vw_capitals and its Owner to postgres.
6. Click on the Code tab, and paste the SQL statement held on the clipboard (Ctrl-V)
into the text box.
7. Click Save to complete creation of the view.
You could now use this view any time you want to work with state capitals. It's
important to note that the output from the view is not a snapshot from the moment
you created the view; if the underlying source table were updated, perhaps to add
Montpelier, those updates would automatically be reflected in the view.
Just as we saw in MS-Access, the records returned by views can be used as the source
for a query.
1. Open a new Query Tool tab and execute the following query, which identifies the 18
relatively small capitals. Note the use of the view we just created.
Views can also include spatial functions, or a combination of spatial and non-spatial
criteria, in their definition. To demonstrate this, let's create views that re-project
our states and cities data on the fly.
2. Follow the procedure outlined above to create a new view based on this query.
Assign a name of vw_states_2163 to this view.
3. Again, repeat this process to create an on-the-fly re-projection of the cities data
called vw_cities_2163. Define the view using the following query:
Note: The steps below are intended to expose you to the projection behavior of
QGIS while confirming that the coordinate transformations programmed into the views
created above do in fact work properly. At Postgres v12/PostGIS v3/QGIS v3.10, the
transformations appear correct when geometries are previewed in pgAdmin. The steps
instruct you to turn off QGIS's re-projection behavior so that you can see the differently
projected geometries (some in 4269, some in 2163) in different coordinate spaces (i.e.,
not lining up). However, QGIS's behavior appears to have changed since earlier
versions. It no longer seems possible to add layers in different projections
without QGIS attempting to align them through on-the-fly re-projection, though I have
found it hard to pin down a pattern in its behavior. I'm leaving the steps
below unchanged in case this is a QGIS bug that will be corrected. In any case,
feel free to define views with geometries in multiple projections as outlined
above if that makes your work in PostGIS easier, but be advised that you may
experience "flakiness" when bringing data from such views into QGIS.
Click the arrow next to your Lesson3 connection to view the available
schemas.
If an Enter Credentials dialog pops up, just supply the postgres user name and the
password you established for it.
5. Expand the object list associated with the usa schema. You should see the
original cities and states tables. You should also
see vw_capitals, vw_states_2163 and two versions of vw_cities_2163.
You see two versions of vw_cities_2163 because that view outputs all of the
columns from the cities table, including geom, plus a column of geometries re-
projected into SRID 2163 (geom_2163).
6. Add the six usa schema layers to the QGIS project.
7. Go back to Project > Properties. You should see that the check box for No
projection is now unchecked. Apparently, QGIS detected the fact that the spatial
reference property settings for the layers we added are not all the same, so it engaged
the on-the-fly projection capability.
8. Check the No projection box again, and hit the Apply button.
Close the Project Properties dialog.
This section showed how to save queries as views which can then be utilized in the same
way as tables. In the next section, we'll go into a bit more detail on the topic of spatial
references.
As we’ve seen, populating a geometry column with usable data requires specifying the
spatial reference of the data. We also saw that geometries can be re-projected from one
spatial reference to another using the ST_Transform() function. In both cases, it is
necessary to refer to spatial reference systems by an SRID (Spatial Reference ID). So
where do these IDs come from, and where can a list of them be found?
The answer to the question of where the IDs come from is that PostGIS uses the spatial
reference IDs defined by the European Petroleum Survey Group (EPSG). As for finding
the ID for a spatial reference you want to use, there are a few different options.
Using pgAdmin
All of the spatial reference IDs are stored in a Postgres table in the public schema
called spatial_ref_sys.
1. In pgAdmin, navigate to the spatial_ref_sys table and view its first 100 rows. Make
particular note of the srid and srtext columns.
Using QGIS
Under the CRS tab, recall that the various coordinate systems are categorized
as Geographic or Projected. If you’re an Esri ArcMap user, this sort of interface
should feel familiar. Let’s say you wanted to find the ID for UTM zone 18N, NAD83.
2. Expand the Projected Coordinate Systems category, then expand the Universal
Transverse Mercator (UTM) category.
3. To easily get to this sublist, you might want to use the Filter capability.
4. Scroll down through the list and find NAD83 / UTM zone 18N.
5. On the right side of the dialog, you should see a column called Authority ID. Note
that most of the Authority ID values are prefixed with EPSG, which means those are
the values you should use in PostGIS. In this case, you would find that the desired
SRID for UTM NAD83 Zone 18N is 26918.
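As a cross-check on the value found in the dialog, the UTM zone can be worked out from a longitude. The sketch below is a rough aid, not a replacement for looking the ID up: the function names are hypothetical, and the 26900 + zone pattern for "NAD83 / UTM zone N" codes is assumed here only for zones 10 through 19, which cover the conterminous U.S.

```python
import math


def utm_zone(longitude_deg):
    """Return the UTM zone number (1-60) containing a longitude in degrees."""
    # Zones are 6 degrees wide and zone 1 starts at 180 degrees west.
    return int(math.floor((longitude_deg + 180.0) / 6.0)) % 60 + 1


def nad83_utm_north_srid(longitude_deg):
    """Guess the EPSG code for 'NAD83 / UTM zone <zone>N'.

    Assumption: the code is 26900 + zone. This sketch accepts only
    zones 10-19 (conterminous U.S.) and refuses anything else.
    """
    zone = utm_zone(longitude_deg)
    if not 10 <= zone <= 19:
        raise ValueError("zone %d is outside the range this sketch covers" % zone)
    return 26900 + zone


# A longitude of about 75 degrees west falls in zone 18.
print(utm_zone(-75.16), nad83_utm_north_srid(-75.16))
```

A value computed this way should still be confirmed against the Authority ID column or the spatial_ref_sys table before it is used in PostGIS.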
The Prj2EPSG website [12] provides an easy-to-use interface for finding EPSG IDs. As its
name implies, it allows the user to upload a .prj file (used by Esri to store projection
metadata) and get back the matching EPSG ID. The site also makes it possible to enter
search terms. My test search for ‘pennsylvania state plane’ yielded some garbage
matches but also the ones that I would expect.
We’ve seen that the public schema contains a table called spatial_ref_sys that stores all
of the spatial references supported by PostGIS. Another important item in that schema is
the geometry_columns view. Have a look at the data returned by that view and note that
it includes a row for each geometry column in the database. Among the metadata stored
here are the parent schema, the parent table, the geometry column’s name, the
coordinate dimension, the SRID and the geometry type (e.g., POINT, LINESTRING, etc.).
Being able to conduct spatial analysis with PostGIS requires accurate geometry column
information, so the PostGIS developers have made these data accessible through a read-
only view rather than a table.
We’ll talk more about measuring lengths, distances, and areas in the next lesson, but
while we’re on the topic of spatial references it makes sense to consider 2D Cartesian
measurement in the context of planimetric map data versus measurement in the context
of the spherical surface of the Earth. For example, the PostGIS
function ST_Distance() can be used to calculate the distance between two geometries.
When applied to geometries of the type we’ve dealt with so far, ST_Distance() will
calculate distances in 2D Cartesian space. This is fine at a local or regional scale, since
the impact of the curvature of the earth at those scales is negligible, but over a continental
or global scale, a significant error would result.
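To see why the Cartesian treatment breaks down for longitude/latitude data, compare it with a spherical calculation. The sketch below is plain Python, not PostGIS: planar_degrees() mimics treating longitude and latitude as plane x/y values, great_circle_km() uses the haversine formula on a sphere with an assumed mean radius, and both function names are hypothetical.

```python
import math

# Assumed mean earth radius in kilometers (a sphere, not a spheroid).
EARTH_RADIUS_KM = 6371.0088


def planar_degrees(lon1, lat1, lon2, lat2):
    """Cartesian distance with lon/lat treated as plane x/y; result is in degrees."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine great-circle distance in kilometers on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two point pairs, each 10 degrees of longitude apart: one pair on the
# equator and one at 60 degrees north.
equator = (0.0, 0.0, 10.0, 0.0)
north = (0.0, 60.0, 10.0, 60.0)

print(planar_degrees(*equator), planar_degrees(*north))
print(great_circle_km(*equator), great_circle_km(*north))
```

Both pairs are 10 "degrees" apart in the planar calculation, yet on the sphere the equatorial pair is roughly 1,112 km apart and the pair at 60 degrees north only about 555 km, which is the kind of error the paragraph above warns about.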
PostGIS offers a couple of alternative approaches to taking the earth’s curvature into
account. Let’s assume that we wanted to measure the distance between points in the
(U.S.) cities table that we created earlier in the lesson. We could use a version of the
ST_Distance() function called ST_DistanceSpheroid() (named ST_Distance_Spheroid()
before PostGIS 2.2). As its name implies, this function calculates the minimum distance
between two longitude/latitude geometries over a spheroid model of the earth rather than
in a plane.
The other approach is to store the features using a data type introduced in PostGIS 1.5
called geography. Unlike the geometry data type, the geography data type is meant for
storing only latitude/longitude coordinates. The advantage of the geography data type is
that measurement functions like ST_Distance(), ST_Length() and ST_Area() will return
measures calculated over the curved surface of the earth, in meters or square meters,
rather than in a 2D plane. The disadvantage is that
the geography data type is compatible with a significantly smaller subset of functions as
compared to the geometry type. Calculating spherical measures can also take longer
than Cartesian measures, since the mathematics involved is more complex.
The take-away message is that the geography data type can simplify data handling for
projects that cover a continental-to-global scale. For projects covering a smaller portion of
the earth’s surface, you are probably better off sticking with the geometry data type.
With that, we've covered all of the content for Lesson 3. In the next section, you'll find a
project that will allow you to put what you've learned to use.
Links
[1] https://www.postgresql.org/download/
[2] https://winnie.postgis.net/download/windows/pg12/buildbot/postgis-bundle-pg12x64-setup-3.0.1-2.exe
[3] http://qgis.org/en/site/forusers/download.html
[4] https://www.e-education.psu.edu/spatialdb/sites/www.e-education.psu.edu.spatialdb/files/Lesson3Data.zip
[5] https://www.postgresql.org/docs/current/static/sql-copy.html
[6] http://stackoverflow.com/questions/14083311/permission-denied-when-trying-to-import-a-csv-file-from-pgadmin
[7] https://www.e-education.psu.edu/spatialdb/L3_practice_solutions.html
[8] http://en.wikipedia.org/wiki/Hexadecimal
[9] http://en.wikipedia.org/wiki/Linear_referencing
[10] http://en.wikipedia.org/wiki/SpatiaLite
[11] http://en.wikipedia.org/wiki/SQLite
[12] http://prj2epsg.org/
[13] https://www.e-education.psu.edu/spatialdb/sites/www.e-education.psu.edu.spatialdb/files/868_roster_su20.txt
[14] https://www.postgresql.org/docs/current/static/index.html