Why Python For GIS
When it comes to data science, and especially spatial data science and spatial data analysis,
Python is currently the most useful programming or scripting language to learn. For one, the
Python ecosystem, which allows people to share their libraries easily with others, is
very rich and has packages for almost any task imaginable. Researchers who find that a
particular tool does not exist often go ahead and create it themselves. Python’s packages cover
such a broad range of applications that its import statement has become the subject of many
jokes.
There are further advantages specific to geographers and spatial data: most desktop GIS
(e.g., QGIS, ESRI ArcGIS) have language bindings or programming interfaces for Python. That
means that you can write Python programs that interact with the data, libraries, and user interface
of these applications, and create custom plugins to extend their functionality. Beyond that, there
are powerful Python packages for virtually any task regarding spatial data management and
spatial analysis.
During this course, we will focus on the latter: carrying out GIS analysis and data management
tasks without a desktop GIS, let alone a proprietary third-party software package such as ESRI
ArcGIS. Why? There’s more than one reason:
Python itself, and most packages for it, are open source. This means they are free to use (as in
free beer) and free to modify for your own purposes (as in free speech). Open source and libre
software is inherently democratic: you don’t have to rely on access to a license, and anybody
across the globe can use it. Read more about free beer vs. free speech at the Free Software
Foundation’s web page.
Writing a script that solves a problem helps you learn and understand more deeply how
geoprocessing operations work, and how they work together. In this course, you will gain a deeper
understanding of GIS.
Python is efficient and scales very well: you can use a Python script to analyse very large
datasets, and for such workloads a script can often outperform a desktop GIS tool.
Python is highly flexible: it can read and write almost any data format, and its package
ecosystem provides libraries for almost any programming task imaginable.
Using Python, or any other open source software for that matter, supports the ideas of
open science and reproducible research. Anyone can take your code and repeat your
experiment or analysis to verify your claims and to learn from your example.
By combining the functionality of different Python packages, you can achieve surprising results
with comparatively little effort: e.g., building fancy web-GIS applications by chaining GeoDjango
with a PostGIS backend.
During the GeoPython course, we have already used a few Python modules for carrying out
different tasks, such as numpy for mathematical calculations, and matplotlib for data
visualisation. In ’AutoGIS’, we will get to know a range of other Python packages that are useful
for data analysis and the handling of geo-spatial data sets.
Below, we list the most common Python packages for GIS and data analysis (and
links to their documentation). This list helps you get started when you approach a data analysis
or GIS problem (but be sure to also use your favourite search engine to see if better alternatives
have become available).
Even if you learnt about a package from a blog post, or from this course, always check the
package’s own documentation page for its recommended use.
Plotly: Interactive visualizations (also maps) for the web (commercial, free for educational
purposes).
GIS:
GDAL: Fundamental package for processing vector and raster data formats (many modules
below depend on this). Used for raster processing.
Geopandas: Working with geospatial data in Python made easier, combines the capabilities of
pandas and shapely.
Shapely: Python package for manipulation and analysis of planar geometric objects (based on
the widely deployed GEOS library).
OSMnx: Python for street networks. Retrieve, construct, analyze, and visualize street networks
from OpenStreetMap
Networkx: Network analysis and routing in Python (e.g. Dijkstra’s and A* algorithms); see this
post.
Cartopy: Make drawing maps for data analysis and visualisation as easy as possible.
Rasterio: Clean, fast geospatial raster I/O for Python.
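
To give a flavour of what working with one of these packages looks like, here is a minimal sketch of the routing task mentioned for Networkx above. The tiny street network, its junction names and its edge lengths are invented for illustration; in practice, a graph like this would more likely come from real data (for example, retrieved with OSMnx). Check the Networkx documentation for the recommended use.

```python
import networkx as nx

# Hypothetical street network: nodes are junctions, and each edge
# carries a made-up length in metres as its "weight" attribute.
streets = nx.Graph()
streets.add_weighted_edges_from(
    [
        ("A", "B", 120.0),
        ("B", "C", 80.0),
        ("A", "C", 250.0),
        ("C", "D", 60.0),
    ]
)

# Shortest route from junction A to junction D using Dijkstra's algorithm,
# minimising the summed edge weights rather than the number of edges.
route = nx.dijkstra_path(streets, "A", "D", weight="weight")
length = nx.dijkstra_path_length(streets, "A", "D", weight="weight")

print(route)   # sequence of junctions along the shortest route
print(length)  # total length of that route in metres
```

Here the direct street from A to C is longer than the detour via B, so the weighted shortest route passes through B even though it uses more edges.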