
Spatial Data Analysis: Course Outline: Gilberto Câmara National Institute For Space Research, Brazil

The document outlines the course details for a spatial data analysis course. It discusses that the course will cover point pattern analysis, areal data analysis, surface data analysis, and trends in spatial data analysis. It provides the course schedule and overview for the two-week period, including topics to be covered each day, readings, and software to be used. It also provides context on the instructor's affiliation with the National Institute for Space Research in Brazil and background on spatial data and analysis.

Copyright
© Attribution Non-Commercial (BY-NC)

Spatial Data Analysis:

Course Outline

Gilberto Câmara
National Institute for Space Research,
Brazil

Fall School 2005


INPE - brief description
• National Institute for Space Research
• Main civilian organization for space activities in Brazil
• Staff of 1,800 (800 with M.Sc. or Ph.D. degrees)
• Areas: Space Science, Earth Observation, Meteorology and Space Engineering
CBERS-2
CBERS-2 Launch (21 October 2003)
CBERS-2 CCD, Minas Gerais, Brazil
• CBERS-2 image of Louisiana, USA
• Obtained from on-board data recorder
Amazon Deforestation 2003

Deforestation 2002/2003

Deforestation until 2002

Source: INPE PRODES Digital, 2004.


Amazônia in 2005

source: Greenpeace
Amazônia in 2015? Source: Aguiar et al., 2004
R&D in GIScience at INPE

• Graduate programs in Computer Science and Remote Sensing
• Research areas
  - Spatial statistics
  - Spatial dynamical modelling
  - Spatio-temporal databases
  - Image databases and image processing
• Technology
  - TerraLib: open source library for spatio-temporal (ST) DBMS
Course outline

• Motivation: why do we need spatial data analysis?
• Point pattern analysis
• Areal data analysis
• Surface data analysis (geostatistics)
• Trends in spatial data analysis


Course outline: 1st week

• Monday – Introduction
  - 10:30 – 12:00 (2)
• Tuesday – Basic concepts
  - 10:30 – 12:00 (2)
• Wednesday – Areal analysis I (LAB work)
  - 9:00 – 10:30 and 11:00 – 12:30 (4)
• Thursday – Areal analysis II
  - 10:30 – 12:00 (2)
• Friday – Areal analysis III (LAB work)
  - 9:00 – 10:30 and 11:00 – 12:30 (4)
• Saturday – QUIZ
  - 14:00 – 17:00 (LAB)
Course outline: 2nd week

• Monday – Surface analysis I
  - 10:30 – 12:00 (2)
• Tuesday – Surface analysis (LAB)
  - 10:30 – 12:00 (2)
• Wednesday – Surface analysis II (LAB)
  - 9:00 – 10:30 and 11:00 – 12:30 (4)
• Thursday – Point pattern analysis (LAB work)
  - 9:00 – 10:30 and 11:00 – 12:30 (4)
• Friday – Trends in spatial data analysis + quiz
  - 9:00 – 10:30 (2) and 11:00 – 12:30 (quiz)
Course material

• Course homepage
  - www.dpi.inpe.br/gilberto/tutorials.html
• Bailey and Gatrell, “Spatial Data Analysis by example”
• Software
  - R: statistical suite (open source), www.r-project.org
  - GeoDa: analysis of areal data (gratis)
  - TerraView: visualisation and analysis (open source), www.terralib.org
  - TerraLib: GIS library (open source)
Spatial Data Analysis: An Introduction

Gilberto Câmara
National Institute for Space Research, Brazil

Fall School 2005


Why does GIS matter?

• Why do we care about spatial data?
• Because...
• Space is an essential component of everyday life!
• How do you choose a house?
• How does a disease propagate?
• How do I start a new business?


And the answer is...

• Location, location, location!
• Space matters...
• My new house will be in a nice neighbourhood...
• He caught malaria when he was in Amazonia...
• His new business will be a shopping center in a recently developed area...
What is Spatial Data?

• Spatial == “catch-all” word
• General feature
  - Refers to a geographical location
  - Either “in situ” or indirectly (remote sensing, place names, addresses)
  - In many cases, in “someone else’s backyard”
• And... there’s LOTS of it, it’s everywhere and it’s about almost everything!!
LBA Flux Towers on Amazonia

Source: Carlos Nobre (INPE)


Source: Carlos Nobre (INPE)

Biodiversity...
CBERS Image
Fire Monitoring in Brazil

[Diagram components: TM image; cartographic base; Landsat/CBERS reception; NOAA reception; NOAA image; CPTEC weather forecast; products; Internet; decision making]
We know how much you spend...

Source: Stan Openshaw


…where you spend it...

Source: Stan Openshaw


…who you talk to...

Source: Stan Openshaw


…where you live...

LS2 9JT

What your neighbours are like…

Source: Stan Openshaw


...your neighbors...

Census tracts and houses for data collection
...your misbehaviour and...

• crime type
• crime location
• insurance data

Source: Stan Openshaw


...your health

• environmental data
• socio-economic data
• admissions data

Source: Stan Openshaw


GeoSensors: New technology for Earth observation

[Images: Smart Dust (UC Berkeley); “Spec” mote (UC Berkeley); Intel mote; MICA mote]
The Road Ahead: Geosensors
• Advances in remote sensing are giving computer networks the eyes and ears they need to observe their physical surroundings.
• Sensors detect physical changes in pressure, temperature, light, sound, or chemical concentrations and then send a signal to a computer that does something in response.
• Scientists expect that billions of these devices will someday form rich sensory networks linked to digital backbones that put the environment itself online. (Rand Corporation, “The Future of Remote Sensing”)
• GEO is a new international organization tasked with implementing a Global Earth Observation System of Systems (GEOSS).
• GEOSS shall coordinate a wide range of space-based, air-based, land-based, and ocean-based environmental monitoring platforms, resources and networks, which at present often operate independently.
• Membership in GEO currently includes 51 countries plus the European Commission, and 29 participating international organisations.
Coordinating Earth Observing Systems

[Diagram: vantage points and capabilities. Far-space (L1/HEO/GEO): TDRSS and commercial satellites. Near-space (LEO/MEO): commercial satellites and manned spacecraft. Airborne: aircraft/balloon event tracking and campaigns. Terrestrial. Side labels: permanent; deployable. Output: forecasts and predictions for the user community.]

Remote Sensing: Increased EO capability
What do we do with so much spatial data?

• First, we collect it...
  - GPS, remote sensing, field surveys
  - Data conversion
• Then, we organize it...
  - Spatial modelling
  - Spatial databases
  - Spatial visualization
• But more important is to analyse and understand it!


Vision: from data to knowledge
Source: NASA
Space
Objects (material world) and actions (events)

“Space is a system of entities and a system of actions” (Milton Santos)
Spatial Data

Natural domain and human domain
• Images: planes, satellites
• Environmental data: topography, soils, temperature, hydrography, geology
• Infrastructure: roads, utilities, dams, streets
• Cadastral data: parcels, land use
• Census data: demographics, economics
FROM DATA TO COMPUTER REPRESENTATION

[Figure: sample locations labelled with X,Y,Z coordinates]

• EVENTS / POINT SAMPLES
• SURFACES / REGULAR GRIDS
• AREA DATA / POLYGONS
• FLUX DATA / NETWORKS


Remote Sensing

LANDSAT 5 TM image of São Paulo, 1997


Aerial Photos

Favela da Maré, Rio de Janeiro - 2001


Choropleth Maps

São Paulo: per capita income in 96 districts
São Paulo: per capita income in 270 survey areas
Trend Surfaces

iex

Social Exclusion 1995 Social Exclusion 2002


FLUXES
The Five Orders of Ignorance
• 0th Order Ignorance (0OI): Lack of Ignorance
  - I (provably) know something
• 1st Order Ignorance (1OI): Lack of Knowledge
  - I do not know something
• 2nd Order Ignorance (2OI): Lack of Awareness
  - I do not know that I do not know something
• 3rd Order Ignorance (3OI): Lack of Process
  - I do not know a suitably effective way to find out that I don’t know that I don’t know something
• 4th Order Ignorance (4OI): Meta-Ignorance
  - I do not know about the Five Orders of Ignorance

The five orders of ignorance, Phillip G. Armour, CACM, 43(10), Oct 2000
The First Law of Geography

• Tobler’s Law
  - Everything is related to everything else, but near things are more related than distant things
  - We call this “spatial dependence”
• Can we see Tobler’s law in action?
  - Yes, there are lots of examples... Here are some...


Lung Cancer for White Males in USA
Log homicide rate for males aged 15-50 per 100,000 residents of the same sex and age, 1990-92

[Map: municipalities of Minas Gerais, Espírito Santo, São Paulo and Rio de Janeiro, with state capitals marked; scale bar 0 to 200 km. Legend classes (number of municipalities): 0.95 to 1.906 (28); 1.906 to 2.862 (209); 2.862 to 3.818 (460); 3.818 to 4.774 (223); 4.774 to 5.73 (64); 0 deaths (448)]

Source: Marilia Carvalho (FIOCRUZ/Brasil)


Homicides in Belo Horizonte (1998)

Source: Renato Assunção (UFMG/Brasil)


Crimes in Belo Horizonte

Aggression
Burglaries

Source: Renato Assunção (UFMG/Brasil)


80% of homicides in Belo Horizonte occurred in these regions

Source: Renato Assunção (UFMG/Brasil)


Spatial Inequalities in São Paulo
Panels: per capita income; jobs / population; illiterate / population

Source: Fred Ramos (CEDEST/Brasil)


Social Exclusion in São Paulo

 “Hot spots” of social exclusion/inclusion in São Paulo


We also try to describe nature

• Serra dos Carajás
• World’s largest iron ore deposit
• We also try to find Zn and Cu

[Map: Serra dos Carajás, Pará, near 6°S, 50°W]
[Figure panels: TIN; rectangular grid]

Where is the Zinc?

[Figure panels: sampled Zn points; samples; kriging; samples; rectangular grid]
Where is the Copper?

[Figure panels: ordinary kriging; indicator kriging, means (2); indicator kriging, median]
Byrsonima subterranea (Malpighiaceae)

Considered to be extinct in the state of São Paulo, Brazil


One specimen was found...
Where can we try to find others?
Source: Marinez Siqueira (CRIA/Brasil)
After building a species occurrence model
Source: Marinez Siqueira (CRIA/Brasil)

Found 4 additional specimens of B. subterranea


How can we deal with so much spatial data?

• We need a body of scientific theories that
  - Extracts knowledge from spatial data
  - Expresses what is “special about space”
• Remember...
  - Science is more than a body of knowledge; it is a way of thinking. [...] The method of science ... is far more important than the findings of science. (Carl Sagan)
  - Science is made of “conjectures and refutations”
• Therefore...
  - Spatial data analysis is what makes GIS more than a collection of data
What is Spatial Data Analysis?

• Formal quantitative study of phenomena that manifest themselves in space (Anselin)
• Collection of tools for investigating
  - Spatial patterns of values and places
  - Variations of spatial phenomena based on their association
• What we want is to...
  - discover not only where, but also what defines and structures the places we live in
• Spatial data analysis
  - Transforms “Tobler’s law” into quantifiable assessments
What is Spatial Data Analysis?
• Analytical capacity
  - Understand the spatial distribution of values (identify trends and clusters)
  - Develop possible explanations (models) for the observed patterns
  - Use the models to indicate what can happen on other occasions
• Remember
  - Our primary aim is not to describe the data accurately
  - It is more important to understand the spatial patterns and to explore relationships between variables
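
One way to make "identify trends and clusters" concrete is a global autocorrelation index. The sketch below computes Moran's I, a standard index that the slides do not name at this point. The six-area chain, the binary neighbour weights and the function names `morans_i` and `chain_weights` are illustrative assumptions, not course data.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a spatial weights matrix.

    weights[i][j] > 0 means areas i and j are neighbours
    (assumed symmetric, with zeros on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    total_sq = sum(d * d for d in dev)
    return (n / s0) * (cross / total_sq)


def chain_weights(n):
    """Binary weights for n areas in a row: each area touches the next one."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w


w = chain_weights(6)
smooth = [1, 2, 3, 4, 5, 6]        # neighbouring areas have similar values
alternating = [1, 6, 1, 6, 1, 6]   # neighbouring areas have opposite values

i_smooth = morans_i(smooth, w)            # positive: near things are alike
i_alternating = morans_i(alternating, w)  # negative: near things differ
print(i_smooth, i_alternating)
```

Values above zero indicate that neighbouring areas tend to be alike, which is Tobler's law expressed as a number; values below zero indicate that neighbours tend to differ.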
Infant Mortality in Minas Gerais State

City rates in 1994 (756 cities).
SMR varies from 0 to 600! Observe the funnel effect.

[Scatter plot: standardized mortality rate (0 to 600) against log newborn children (0 to 10)]

Source: Renato Assunção (UFMG/Brasil)


Can we believe everything we see?

• Infant mortality rates in Minas Gerais
  - 15 cities with 0 deaths and < 30 born alive.
  - If one death is recorded, rates jump from 0 to values between 116 and 1048!!!
  - The current extreme value is 608.9
• Dealing with incomplete data
  - Requires statistical hypotheses
  - What we see is a blurred picture of reality...
Can we believe everything we see?

Infant mortality rates in Rio de Janeiro – Average is 24 per 1000


But note extreme values of 130 per 1000
Are these values real?
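
The instability of rates in small populations can be checked with simple arithmetic. The sketch below uses crude rates per 1000 births rather than the standardized rates shown in the plots. The two town sizes (25 and 5000 births), the death counts and the function names are hypothetical, chosen only to illustrate the effect.

```python
def rate_per_1000(deaths, births):
    """Crude rate: deaths per 1000 live births."""
    return 1000.0 * deaths / births


def jump_from_one_extra_death(deaths, births):
    """How much the rate changes if a single additional death is recorded."""
    return rate_per_1000(deaths + 1, births) - rate_per_1000(deaths, births)


# Hypothetical small town: 25 births, no deaths recorded so far.
small_town_jump = jump_from_one_extra_death(0, 25)

# Hypothetical large town: 5000 births, 120 deaths (24 per 1000).
large_town_jump = jump_from_one_extra_death(120, 5000)

print(f"small town: one extra death adds {small_town_jump:.1f} per 1000")
print(f"large town: one extra death adds {large_town_jump:.1f} per 1000")
```

A single event moves the small town's rate far more than the large town's, which is why extreme values on a map of raw rates often come from the least populated areas.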
Statistics 101

• Statistics is about
  - Systematically studying phenomena in which we are interested
  - Quantifying variables in order to use mathematical techniques
  - Summarizing these quantities in order to describe and make inferences
  - Using these descriptions and inferences to make decisions or to understand
• Why do we need statistics to handle spatial data?
  - Helps us to make sense of incomplete and misleading data
  - Allows inclusion of “external” knowledge (things we know about the data)
  - Enables us to make informed assessments
Random Variables

• Things that change
  - Environmental events or conditions
  - Personal characteristics or attributes
  - Behaviors
• Anything that takes on different values in different situations (e.g., in different places)
Distribution of Random Variables

• Statistics deals with regularities and variability of events
• Statistics measures the consistency of variables and the variability around this consistency
• Expression of these regularities/variabilities
  - Distribution function
  - Mathematical expression of the likelihood that a random variable has a particular value
• Properties of spatial distributions
  - Mean value
  - Variance
Counts as a Random Variable

• Counts
  - Typical type of spatial survey data
  - Number of children born, AIDS patients, crimes in a district
• The statistical way
  - The number of observed counts O_i in area i is a random variable
  - This means that O_i has a probability distribution with a mean and a variance, etc.
  - Usual assumption: O_i ~ Poisson(λ_i)
  - λ_i = expected number of counts in area i
Poisson Probability Distribution

[Plots: Poisson distributions for λ = 0.3, 1, 5, 10, 25 and 60]

λ is the expected arrival/occurrence rate of a discrete random variable

Source: Renato Assunção (UFMG/Brasil)
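
As a rough illustration of the Poisson assumption for counts, the sketch below evaluates the probability mass function P(O = k) = e^(-λ) λ^k / k! for the six rates shown in the plots, and checks numerically that the mean and the variance both equal λ. The symbol λ for the expected count, the function names and the truncation of the sums at k = 400 are assumptions of this sketch.

```python
import math


def poisson_pmf(k, lam):
    """P(O = k) for a Poisson variable with expected count lam.

    Computed on the log scale so that lam**k and k! cannot overflow.
    """
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def mean_and_variance(lam, k_max=400):
    """Numerical mean and variance, summing the pmf over k = 0..k_max."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


rates = [0.3, 1, 5, 10, 25, 60]
summary = {lam: mean_and_variance(lam) for lam in rates}

for lam, (mean, var) in summary.items():
    p0 = poisson_pmf(0, lam)  # probability of observing no events at all
    print(f"lambda={lam}: P(0)={p0:.4f}, mean={mean:.3f}, variance={var:.3f}")
```

For small λ most of the probability sits on zero counts, which is the situation of the small cities with no recorded deaths; for large λ the distribution becomes nearly symmetric around λ.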


Some general advice

Look at your data!!


Ways to look at spatial data

Spatial data as a map


Ways to look at spatial data

Spatial data as a combined distribution


Ways to look at spatial data
Each area has a unique probability distribution (the same model is assumed)
Ways to look at spatial data
If each area has a unique probability distribution,
what are we seeing?
Types of spatial data analysis

• Lattice data
  - Discrete variation over space, with observations associated with regular or irregular areal units
• Surface data
  - Observations associated with a continuous variation over space, typically as a function of distance
• Point patterns
  - Occurrences of events at locations in space
Point patterns

Dr. Snow’s cholera cases in London
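
A first question about a point pattern such as Snow's map is whether the events cluster. One simple check, not named in these slides, is quadrat counting: overlay a grid, count the events in each cell, and compare the variance of the counts with their mean. A ratio near 1 is what a completely random pattern gives, a ratio well above 1 suggests clustering, and a ratio near 0 suggests regular spacing. The unit-square study area, the 4 x 4 grid and both synthetic patterns below are assumptions made for illustration.

```python
def quadrat_counts(points, k):
    """Count points of the unit square falling in each cell of a k x k grid."""
    counts = [0] * (k * k)
    for x, y in points:
        col = min(int(x * k), k - 1)  # clamp x == 1.0 into the last column
        row = min(int(y * k), k - 1)  # clamp y == 1.0 into the last row
        counts[row * k + col] += 1
    return counts


def variance_to_mean_ratio(counts):
    """Index of dispersion of the quadrat counts (sample variance / mean)."""
    m = len(counts)
    mean = sum(counts) / m
    var = sum((c - mean) ** 2 for c in counts) / (m - 1)
    return var / mean


k = 4
# Regular pattern: one event at the centre of every cell.
regular = [((i + 0.5) / k, (j + 0.5) / k) for i in range(k) for j in range(k)]
# Clustered pattern: 16 events packed into one corner cell.
clustered = [(0.10 + 0.001 * n, 0.10) for n in range(16)]

vmr_regular = variance_to_mean_ratio(quadrat_counts(regular, k))
vmr_clustered = variance_to_mean_ratio(quadrat_counts(clustered, k))
print(vmr_regular, vmr_clustered)
```

The result depends on the grid size chosen, so this is a quick screening device rather than a formal cluster test.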


Point Patterns: violence data

[Map of homicides, transport accidents and suicides; labelled municipalities: Cachoeirinha, Alvorada, Guaíba, Viamão; scale bar 0 to 10 km]

Source: Santos, S.M., 1999


Surface data – homicide risk in São Paulo – 1996
Areal data: social exclusion in São José dos Campos

[Map legend: index classes from -1.00 to 1.00 in steps of 0.25; qualitative classes: very high, high, medium high, medium low, low, very low]
From Areas to Surfaces
Perceptions of space

• Different representations
  - Images
  - Areas
  - Surfaces
Perceptions of space

Space as an areal data set

Space as a continuous
surface
Space as Clusters
From color maps... to spatial patterns

• “Clusters” of social exclusion/inclusion in São Paulo


Spatial Analysis Methodologies

• Data-driven approach
  - Information is derived from the data without a prior notion of what the theoretical framework should be.
  - “Data speak for themselves”
  - Information on spatial pattern, spatial structure and spatial interaction without the constraints of a pre-conceived theoretical notion.
• Model-driven approach
  - Starts from a theoretical specification, which is subsequently confronted with the data.
  - Generalized linear models
  - Linear regression
  - Problem: how does “space” affect these models?
Model-Driven Approaches

• Model of discrete spatial variation
  - Each subregion i is described by a statistical distribution Z_i
  - e.g., homicide counts are Poisson distributed
  - The main objective of the analysis is to estimate the joint distribution of the random variables Z = {Z_1, …, Z_n}
• Model of continuous spatial variation
  - All of the area is a continuous surface
  - The main objective is to estimate the distribution Z(x), x ∈ A
Models of Discrete Spatial Variation

Z_i: random variable in area i, for example:
• number of ill people
• number of newborn babies
• per capita income

[Map (area label Y_i): visceral leishmaniasis rates (1997/1998), cases per 100,000 inhabitants. Classes (number of areas): 200 to 250 (1); 150 to 200 (2); 100 to 150 (1); 50 to 100 (4); 10 to 50 (29); 5 to 10 (16); 1 to 5 (43); <1 (19)]

Source: Renato Assunção (UFMG/Brasil)


Models of Continuous Spatial Variation
Temperature, water pH, soil acidity...

[Map: marked sampling stations and the marked location where a value is to be predicted]

Source: Renato Assunção (UFMG/Brasil)
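
Predicting a value at an unsampled location from nearby stations can be sketched with inverse distance weighting (IDW). IDW is a deliberately simple stand-in here, not the geostatistical (kriging) approach that the course covers. The station coordinates, the measured values, the function name `idw_predict` and the `power` parameter are all hypothetical.

```python
import math


def idw_predict(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    stations: list of (x, y, value) tuples for the sampled locations.
    target:   (x, y) of the location where a value is wanted.
    Closer stations get larger weights (1 / distance**power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a station
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical sampling stations: (x, y, measured value).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]

at_station = idw_predict(stations, (0.0, 0.0))    # returns the station's value
equidistant = idw_predict(stations, (1.0, 1.0))   # all weights equal
near_first = idw_predict(stations, (0.2, 0.2))    # pulled toward the value 10
print(at_station, equidistant, near_first)
```

Unlike kriging, IDW does not model the spatial correlation structure and gives no estimate of prediction uncertainty, so it only conveys the basic idea that nearby samples should count more.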


Spatial Data Analysis

                                    Data Types               Example          Typical Problems
Point Pattern Analysis              Localized Events         Disease Mapping  Cluster tests
Geostatistics (surface modelling)   Field Samples            Mineral Deposit  Surface Interpolation
Areal Data                          Features and Attributes  Census Data      Spatial Autocorr. Indicators
Multiple Representations of Space

[Figure panels: thematic maps; digital terrain models; images; networks; features]
