
Accident Analysis and Prevention 135 (2020) 105323

Contents lists available at ScienceDirect

Accident Analysis and Prevention


journal homepage: www.elsevier.com/locate/aap

A review of spatial approaches in road safety


Apostolos Ziakopoulos, George Yannis

National Technical University of Athens, Department of Transportation Planning and Engineering, 5 Heroon Polytechniou Str., GR-15773, Athens, Greece

ARTICLE INFO

Keywords: Road safety; spatial analysis; crash analysis; study characteristics; areal units

ABSTRACT

Spatial analyses of crashes have been adopted in road safety for decades in order to determine how crashes are affected by neighboring locations, how the influence of parameters varies spatially and which locations warrant interventions more urgently. The aim of the present research is to critically review the existing literature on different spatial approaches through which researchers handle the dimension of space in its various aspects in their studies and analyses. Specifically, the use of different areal unit levels in spatial road safety studies is investigated, different modelling approaches are discussed, and the corresponding study design characteristics are summarized in respective tables including traffic, road environment and area parameters and spatial aggregation approaches. Developments in famous issues in spatial analysis such as the boundary problem, the modifiable areal unit problem and spatial proximity structures are also discussed. Studies focusing on spatially analyzing vulnerable road users are reviewed as well. Regarding spatial models, the application, advantages and disadvantages of various functional/econometric approaches, Bayesian models and machine learning methods are discussed. Based on the reviewed studies, present challenges and future research directions are determined.

1. Introduction

Road safety has been a major issue in contemporary societies, with road crashes incurring major human and material costs annually worldwide. Traffic and road safety practices have been implemented to save lives by halting the increase of road traffic fatalities against an ever-rising population (WHO, 2015), though it appears that the global target of halving road traffic deaths by 2020 will not be met (WHO, 2018).

The still occurring and plateauing crash casualties suggest a lot of untapped potential and margins for safety improvements that can be exploited if the occurrence of crashes can be predicted more accurately. Road safety scientists have invested considerable efforts in studying the impacts of several risk factors (e.g. Theofilatos and Yannis, 2014; Papadimitriou et al., 2019) and road safety measures (e.g. Elvik et al., 2009) and have developed or adopted a number of mathematical methodologies to approach crash prediction problems (e.g. Lord and Mannering, 2010) or road safety site prioritization problems (e.g. Lee and Abdel-Aty, 2018).

Since road transport involves distances by nature, it stands to reason that spatial analyses would be considered by researchers. Spatial analyses in road safety typically involve the examination of crashes while taking their absolute or relative locations into account. Crashes face the typical issues of all point data: spatial dependence and spatial heterogeneity.

In simple terms, spatial dependence essentially refers to events at a location being highly influenced by events at neighboring locations. It is usually measured via spatial autocorrelation metrics. In turn, autocorrelation refers to the influence of variable values of given points on variable values of adjacent points (spatially or temporally). Spatial heterogeneity occurs in the modelled relationships as the coefficients between random parameters and observed events are not fixed spatially.

Therefore, researchers have discovered several caveats and merits in conducting spatial analysis. Road crashes are subject to both spatial and temporal variations (Loo and Anderson, 2015), intuitively suggesting spatial analyses as informative. By accounting for spatial dependence and heterogeneity in the estimates, spatial analyses describe how regions affect and are affected by the road safety attributes of their neighbors, and how the influence of explanatory parameters varies across space as well.

As a more specific example, when considering spatial correlation in crash models, estimates are effectively "pooling strength" from neighboring locations, thus improving the produced estimations (Aguero-Valverde and Jovanis, 2008). Road crashes are a complex phenomenon, and their analysis requires assumptions and merging of the examined parameters for a feasible approach, which unavoidably leads to some degree of loss of information or even misrepresentation of the actual


Corresponding author.
E-mail address: apziak@central.ntua.gr (A. Ziakopoulos).

https://doi.org/10.1016/j.aap.2019.105323
Received 8 July 2019; Received in revised form 27 September 2019; Accepted 3 October 2019
Available online 22 October 2019
0001-4575/ © 2019 Elsevier Ltd. All rights reserved.

conditions (Xu and Huang, 2015). Spatial analyses can counterbalance this loss by providing predictions of counts of crashes (and of similar incidents, such as near-misses) that vary across different units of analysis, thus capturing all the unobserved trends and particularities of each area. Thus, not only is a better theoretical understanding of crash occurrence across space provided, but the identification of high-risk sites (known as hotspots) also becomes more accurate (El-Basyouny and Sayed, 2009; Aguero-Valverde, 2014).

After decades of research, the topic of spatial analysis of traffic crashes covers a wide range, including mapping and visualization of crash counts, identifying clustering patterns of traffic collisions, and use of spatial models to investigate the effects of contributory factors and recommend targeted countermeasures. The mathematical particulars of spatial analyses have been examined in several published studies, for instance in Bivand et al. (2009) for Global and Local Moran's I and in Ver Hoef et al. (2018) for models with conditional autoregressive (CAR) priors or simultaneous autoregressive (SAR) priors. The reader is also referred to Yao et al. (2016) for a review of major advancements of spatial crash analysis using applied GIS tools. The research examined there starts from a significantly older period (1976) and includes topics that fall outside the scope of the present research, such as visualizing and mapping of events.

The aim of the present paper is to provide a review of the scientific literature regarding spatial approaches and spatial analyses in road safety. The present study is an endeavor to investigate how road safety researchers handle the dimension of space in its various aspects in their studies, whether that regards modelling of spatial events, selecting the scale of areal units or proximity structures, tackling boundary problems or other specific issues (such as vulnerable road users – VRUs). In order to achieve the aim of the current research, published scientific studies (in English) are critically examined. The selected studies were intended to be representative of a wide array of countries and adopted methodologies, in order to provide a well-rounded summary of the state-of-the-art in road safety spatial analyses. Emphasis was given to more recent studies, with some seminal endeavors being included as well for completeness.

The main focus of the current study is on study characteristics, modelling approaches and methodological issues. It should be noted that this research only includes studies that conducted explicit and dedicated spatial or spatio-temporal analyses, as opposed to studies that examine different areas for purposes of cross-sectional or case-control studies (and as such do not examine the spatial aspect of road safety incidents). The second category of studies has its own merits and has been extensively implemented in road safety research, but falls outside the scope of this review.

This paper is organized as follows. Section 2 includes an examination of the different spatial units of analysis, together with well-known boundary and zonal problems, as well as the issues of proximity structures. Section 3 outlines various modelling approaches, while Section 4 discusses issues in spatial analyses of VRUs. Finally, a discussion of overall findings from the review process and future research directions are provided in Section 5.

2. Examination of spatial units

Spatial analyses in road safety fundamentally involve the examination of road safety indicators (crash counts or rates, injury severity rates etc.) across spatial units of analysis. The manner in which researchers select and define these spatial units directly influences the scope of the study, as well as the interpretability of results, while this can apply to data preparation as well (Imprialou et al., 2016). There is a structural difference, for instance, in examining spatial distribution of road safety indicators in consecutive road segments that feed traffic flow seamlessly into each other compared to examining junction clusters with several inflows and outflows for the distributions of the same indicators.

Different spatial units are discussed in the following section, and study characteristics for each spatial unit level are summarized in Tables 1–4. It was decided to include in the tables of this review the study characteristics initially considered by researchers, even if they were not found significant in the respective final models, to better showcase the scope of each study. The examined crash categories are denoted with the following acronyms with respect to the involved road users: Total Crashes (TC), Motorcycle crashes (MC), Single Vehicle crashes (V), Vehicle-vehicle crashes (V-V), Bicycle-vehicle crashes (B-V) and Pedestrian-vehicle crashes (P-V). When crash category details are not given about the examined crashes in a study, they are noted as TC. Additional details, such as the analysis of a specific crash type, are noted as well.

2.1. Road segment and intersection approaches

Initial approaches of spatial analyses involved the more intuitive examination of road safety indicators across singular or multiple road sections, such as straight road segments and intersections. Earlier approaches involve the depiction and analysis of spatial distribution of crashes on (state) highways, in an attempt to perceive visual patterns of heightened concentration and possible correlation with touristic areas (Page and Meyer, 1996), albeit with a small sample. Furthermore, the impact of segment length on crash counts and density has been examined: these were found to follow a Poisson distribution at smaller segment scales, moving through intermediate distributions to normal distributions as segment length increased, as shown by Thomas (1996), a study that also first touched on the modifiable areal unit problem in road safety (discussed in section 2.6).

It has been determined that local environment and road infrastructure are critical factors of crash occurrence (Flahaut, 2004; Wang et al., 2016a). A traditional division when examining straight road segments is road type; highways with divided traffic directions display different road safety mechanisms than undivided two-lane expressways and for decades have been analyzed separately, a practice that is continued in segment-based spatial analyses.

The environment of road segments has been traditionally examined separately in the literature, with researchers distinguishing between urban and rural segments and often producing comparative analyses between different types of segments. A spatial analysis by Flahaut (2004) determined 2-lane configurations as the most unsafe configuration for rural roads. For urban roads, it has been found that increases in the number of crosswalks and the densities of unsignalized intersections both increase crash occurrence (Barua et al., 2014). Furthermore, local and non-local drivers are found to cluster along road segments, and segments with adverse safety interactions between these two groups are estimated to transfer these effects spatially to neighboring segments (Wang et al., 2016a).

In spatial analyses, researchers examine intersections either in groups (Guo et al., 2010; El-Basyouny and Sayed, 2011) or in aggregation (Miaou and Lord, 2003; Wang and Abdel-Aty, 2006). Intersection geometry, location and traffic parameters are important within the context of spatial analyses. The size of intersection, the traffic conditions by turning movement, and the coordination of signal phase have significant impacts on the number of crashes at intersections (Guo et al., 2010). Xie et al. (2013) have shown that intersections on segments with lower mean speeds were associated with fewer crashes than those with higher speeds, and that intersections on two-way roads, under elevated roads, and in close proximity to each other, tended to have higher crash frequencies as well. A seminal result of a study by Abdel-Aty & Wang (2006) shows that overall, three-legged intersections tend to exhibit lower crash rates than four-legged intersections, and that they exhibit different road safety mechanisms. Furthermore, effectiveness of implemented road safety treatments can vary between locations when considering injury severity levels (El-Basyouny and Sayed, 2011).

When proximal segments are considered, with the layout of a simple

Table 1
Studies with road safety spatial analyses primarily on the individual road segment/intersection level

Study characteristics: Author(s) (Year); Country of study; Crash type analyzed
Dependent variables: Crash count/frequency; Crash rate; Injury severity; Casualty rate
Independent variables – Traffic: Speed; Traffic volume; Vehicle distance traveled; Number of trips - OD
Independent variables – Road user: Road user/Population age; Modal distinction

Abdel-Aty and Wang (2006) United States TC ● ● ●


Aguero-Valverde (2014) United States TC ● ●
Aguero-Valverde and Jovanis (2010) United States TC ● ●
Aguero-Valverde and Jovanis (2008) United States TC ● ●
Aguero-Valverde et al. (2016) United States TC (6 Crash types) ● ●
Alarifi et al. (2018) United States TC ● ● ●
Alarifi et al. (2017) United States TC ● ● ●
Barua et al. (2016) Canada TC ● ○ ●
Barua et al. (2014) Canada TC ● ○ ●
Chiou et al. (2014) Taiwan TC ● ● ● ●
Effati et al. (2015) Iran TC ● ●
El-Basyouny and Sayed (2011) Canada TC ● ○ ●
El-Basyouny & Sayed (2009) Canada TC ● ●
Guo et al. (2010) United States TC ● ○
Huang et al. (2017) China TC | V/V-V | P-V | B-V ● ● ● ○
Huang et al. (2016) United States TC ● ● ● ● ●
Flahaut (2004) Belgium TC ● ○ ●
Liu et al. (2017) United States TC ● ● ●
Ma et al. (2017) United States TC ● ○ ● ●
Miaou & Lord (2003) Canada TC ● ●
Miaou & Song (2005) Canada | United States TC ● ● ● ● ●
Mitra (2009) United States TC ● ● ●
Mountrakis & Gunson (2009) United States V-A ●
Page & Meyer (1996) New Zealand TC ● ○
Thomas (1996) Belgium TC ● ○ ○
Wang & Abdel-Aty (2006) United States V-V (rear-end only) ● ●
Wang & Huang (2016) United States TC ● ● ●
Wang et al. (2016a) United States TC ● ● ●
Wang et al. (2009) United Kingdom TC ● ○ ● ●
Wen et al. (2019) China TC ● ●
Xie et al. (2014) China TC ● ● ●
Xie et al. (2013) China TC ● ● ●
Zeng & Huang (2014) United States TC ● ●

Study characteristics: Author(s) (Year); Country of study
Independent variables – Road environment: Speed limit; Curvature; Gradient; Lane width; Lane number; Intersection nr./density; Roadway length
Spatial aggregation approach: Regional level; Zonal level; Link/segment/intersection level
Analysis - Modelling approach

Abdel-Aty and Wang (2006) United States ● ● ● ○ ●
  Spatial units: Intersections
  Approach: Negative Binomial Regression with and without Generalized estimating equations | Cluster analysis
Aguero-Valverde (2014) United States ● ● ○ ●
  Spatial units: Rural road segments
  Approach: Full Bayes hierarchical Poisson model (1) with normal priors for spatial random effects | (2) with CAR priors for spatial random effects | (3) with a joint distribution
Aguero-Valverde and Jovanis (2010) United States ● ● ● ● ○ ●
  Spatial units: Rural & Urban road segments
  Approach: Full Bayes hierarchical Poisson model with CAR priors for spatial random effects
Aguero-Valverde and Jovanis (2008) United States ● ● ○ ●
  Spatial units: Rural road segments
  Approach: Bayesian Multivariate Poisson Lognormal Regression | Bayesian random effects models
Aguero-Valverde et al. (2016) United States ●
  Spatial units: Rural road segments
  Approach: Full Bayes Poisson Regressions (Univariate, Univariate Spatial, Multivariate, Multivariate Spatial)

Alarifi et al. (2018) United States ● ● ● ●
  Spatial units: Intersections | Road segments
  Approach: 13 Bayesian hierarchical Poisson-lognormal joint spatial models with adjacency-based, adjacency-route, distance-order, and distance-based spatial weight features
Alarifi et al. (2017) United States ● ● ● ●
  Spatial units: Intersections | Road segments
  Approach: Multilevel Poisson-lognormal joint model (1,2) with corridor and sub-corridor random effects (3,4) with corridor and sub-corridor random parameters
Barua et al. (2016) Canada ● ● ●
  Spatial units: Urban road segments
  Approach: Full Bayesian Poisson lognormal multivariate random parameters models (1) with heterogeneous effects (2) with CAR priors for spatial heterogeneity (3) with both
Barua et al. (2014) Canada ● ● ●
  Spatial units: Urban road segments
  Approach: Full Bayesian Poisson lognormal univariate and multivariate random parameters models (1) with heterogeneous effects (2) with CAR priors for spatial heterogeneity (3) with both
Chiou et al. (2014) Taiwan ● ● ● ○ ●
  Spatial units: Highway segments
  Approach: Multinomial-generalized Poisson with error-components (spatial error and spatial exogenous)
Effati et al. (2015) Iran ● ● ● ●
  Spatial units: Highway segments
  Approach: Support Vector Machine Algorithms (SVMs) | Coactive neuro-fuzzy inference system
El-Basyouny and Sayed (2011) Canada ○
  Spatial units: Intersections
  Approach: Univariate and Multivariate Poisson Lognormal Regressions | Full Bayes estimations
El-Basyouny & Sayed (2009) Canada ● ● ●
  Spatial units: Urban road segments
  Approach: Full Bayesian Multivariate Poisson Lognormal with and without CAR Prior | Full Bayesian Multiple Membership model | Full Bayesian Extended Multiple Membership model
Guo et al. (2010) United States ● ○ ○
  Spatial units: Intersections
  Approach: Fixed effects Bayesian Poisson Regression | Fixed and Mixed effects Bayesian Negative Binomial Regression | Spatial CAR Prior extended Poisson/Negative Binomial models
Huang et al. (2017) China ● ○
  Spatial units: Intersections
  Approach: Poisson Regression (Univariate, Multivariate Lognormal & Spatial random effects models)
Huang et al. (2016) United States ● ● ● ●
  Spatial units: TAZ; Intersections | Road segments
  Approach: Bayesian spatial model with CAR prior (macroscopic) | Bayesian spatial joint models with CAR prior (microscopic)
Flahaut (2004) Belgium ● ○ ● ○ ○
  Spatial units: Rural & Highway segments
  Approach: Logistic regression with and without spatial autocorrelation
Liu et al. (2017) United States ● ●
  Spatial units: Highway segments
  Approach: Geographically Weighted Negative Binomial Regression | Negative Binomial Regression
Ma et al. (2017) United States ● ●
  Spatial units: Highway segments
  Approach: Hierarchical Bayesian random parameters models (structured and unstructured spatio-temporal effects)
Miaou & Lord (2003) Canada ○ ○
  Spatial units: Intersections
  Approach: Full Bayes | Empirical Bayes
Miaou & Song (2005) Canada | United States ○ ○ ●
  Spatial units: Intersections | Rural segments
  Approach: Multivariate spatial Bayesian generalized linear mixed models with and without CAR Prior
Mitra (2009) United States
  Spatial units: Intersections
  Approach: Hierarchical Full Bayes Jointly specified spatial model | Negative Binomial Regression | Local Moran's I
Mountrakis & Gunson (2009) United States ○
  Spatial units: Rural segments
  Approach: Spatial, Temporal & Spatiotemporal kernel estimation | Ripley’s K-function
Page & Meyer (1996) New Zealand ○
  Spatial units: National Parks; Highway segments
  Approach: Percentage descriptive statistics
Thomas (1996) Belgium ●
  Spatial units: Highway segments
  Approach: Univariate and bivariate descriptive statistics, chi^2 and W tests
Wang & Abdel-Aty (2006) United States ● ● ○
  Spatial units: Intersections
  Approach: Generalized Estimating Equations with Negative Binomial link function
Wang & Huang (2016) United States ● ● ● ●
  Spatial units: TAZ; Intersections | Urban segments
  Approach: Bayesian hierarchical joint Poisson Regression | Bayesian joint Poisson Regression | Negative Binomial Regression
Wang et al. (2016a) United States ● ● ● ● ●
  Spatial units: Highway segments
  Approach: Multivariate Poisson Lognormal regression with CAR Prior
Wang et al. (2009) United Kingdom ● ● ● ●
  Spatial units: Highway segments
  Approach: Bayesian Multivariate Poisson Lognormal | Negative Binomial Regression | Poisson Models with CAR priors (with first/second order neighbors)
Wen et al. (2019) China ● ●
  Spatial units: Highway segments
  Approach: (1) Poisson Lognormal regression with CAR Prior | (2) Poisson Lognormal regression with spillover effects | (3) Hybrid of (1) and (2)
Xie et al. (2014) China ● ○ ●
  Spatial units: Intersections | Urban segments
  Approach: Bayesian Negative Binomial regression (basic, random effect, random parameter, hierarchical, hierarchical CAR)
Xie et al. (2013) China ● ○ ●
  Spatial units: Intersections | Urban segments
  Approach: Bayesian Negative Binomial regression (basic, random parameter, hierarchical)
Zeng & Huang (2014) United States ● ● ● ●
  Spatial units: Intersections | Urban segments
  Approach: Poisson Regression | Negative Binomial Regression | Bayesian spatial model with CAR prior | Bayesian spatial joint models with CAR prior

● Considered in the study design, ○ considered in the study process as filter/defining characteristic
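
Several of the approaches listed in Table 1 rest on a spatial weight structure between neighboring segments (for example adjacency-based weights) and on spatial autocorrelation statistics such as Moran's I. As a minimal sketch only, the Python snippet below computes Global Moran's I for crash counts on consecutive road segments. The crash counts, the binary first-order chain adjacency and the function names are assumptions introduced here for illustration; they are not taken from any of the reviewed studies.

```python
import numpy as np

def chain_adjacency(n):
    """Binary first-order adjacency for n consecutive segments on one corridor
    (assumed layout: segment i neighbors only segments i-1 and i+1)."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w

def global_morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the mean-centered values
    and S0 the sum of all weights."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    s0 = w.sum()
    return (len(x) / s0) * (z @ w @ z) / (z @ z)

# Hypothetical crash counts on six consecutive segments (illustrative data only).
crash_counts = [12, 11, 9, 3, 2, 1]
w = chain_adjacency(len(crash_counts))
morans_i = global_morans_i(crash_counts, w)
print(f"Global Moran's I: {morans_i:.3f}")
```

Values above the expectation under no spatial autocorrelation, -1/(n-1), indicate that similar counts tend to be adjacent, while values below it indicate alternating high and low counts. Other weight definitions (distance-based, higher-order neighbors) can be substituted for the adjacency matrix without changing the statistic itself.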
Table 2
Studies with road safety spatial analyses primarily on the zonal level

Study characteristics: Author(s) (Year); Country of study; Crash type analyzed
Dependent variables: Crash count/frequency; Crash rate; Injury severity; Casualty rate
Independent variables – Traffic: Speed; Traffic volume; Vehicle distance traveled; Number of trips - OD
Independent variables – Road environment: Speed limit; Curvature; Lane width

Abdel-Aty et al. (2013) United States TC ● ○ ● ● ●


Abdel-Aty et al. (2011) United States TC ● ○ ○ ● ●
Amoh-Gyimah et al. (2017) Australia TC ● ○ ● ●
Anderson (2007) United Kingdom TC ● ○
Anderson (2009) United Kingdom TC | P-V | B-V ● ○
Bao et al. (2018) United States TC ● ○ ● ● ○
Bao et al. (2017) United States TC | V-V | P-V ● ● ●
Cai et al. (2019a) United States TC ● ●
Cai et al. (2018) United States TC ● ●
Cai et al. (2017b) United States TC | P-V | B-V ● ●
Cai et al. (2016) United States P-V | B-V ● ● ●
Cottrill & Thakuriah (2010) United States P-V ● ● ○ ●
Cui et al. (2015) Canada TC (on boundary) ●
Delmelle & Thill (2008) United States B-V ●
Dong et al. (2016) United States TC ● ●
Dong et al. (2015) United States TC ● ● ● ○

Dong et al. (2014) United States TC ● ● ● ○
Erdogan et al. (2008) Turkey TC ● ● ○ ○ ○
Gomes et al. (2017) Brazil TC ● ○
Guo et al. (2017) Hong Kong P-V ● ○ ● ● ●
Hadayeghi et al. (2010) Canada TC ● ● ● ●
Hadayeghi et al. (2003) Canada TC ● ○ ● ● ●
Jiang et al. (2016) United States TC | B-V | P-V ● ○ ●
Ladron de Guevara et al. (2004) United States TC ● ○ ○ ●
LaScala et al. (2004) United States P-V | B-V ● ○ ●
LaScala et al. (2000) United States P-V ● ●
Lee & Abdel-Aty (2018) United States B-V ● ● ●
Lee et al. (2018)b United States Crashes of 8 road user types ● ● ●
Lee et al. (2017)a United States TC | P-V | B-V ● ○ ●
Lee et al. (2015)a United States V/V-V | P-V | B-V ● ● ●
Lee et al. (2015b) United States P-V ● ● ●
Lee et al. (2014a) United States V/V-V (at-fault) ●
Lee et al. (2014b) United States TC ● ○ ● ○
Levine et al. (1995) United States TC ● ○
Loukaitou-Sideris et al. (2007) United States P-V ● ○ ●
Lovegrove & Sayed (2007) Canada TC ● ○ ●
Lovegrove & Sayed (2006) Canada TC ● ○ ● ● ●
Lovegrove et al. (2009) Canada TC ● ○ ● ○
MacNab (2004) Canada TC ●
Naderan & Shahi (2010) Iran TC ● ○ ●
Narayanamoorthy et al. (2013) United States P-V | B-V ● ●
Nashad et al. (2016) United States P-V | B-V ● ●
Ng et al. (2002) China TC | P-V ● ○


Noland & Quddus (2005) United Kingdom TC | P-V ● ○
Noland & Quddus (2004) United Kingdom TC ● ○ ○
Pirdavani et al. (2014)a Belgium TC ● ○ ● ● ● ●
Pirdavani et al. (2014)b Belgium V-V | P-V | B-V ● ○ ● ●
Pirdavani et al. (2013) Belgium V-V | P-V | B-V ● ○ ● ●
Quddus (2008) United Kingdom TC ● ○ ● ● ●
Rhee et al. (2016) South Korea TC ● ○ ● ● ○
Siddiqui & Abdel-Aty (2012) United States P-V (interior & boundary) ● ●
Siddiqui et al. (2012) United States P-V | B-V ● ○ ●
Soltani & Askari (2017) Iran V-V ● ●
Tasic et al. (2017) United States TC | V-V | P-V | B-V ● ○ ● ●
Ukkusuri et al. (2012) United States P-V ● ○ ●
Ukkusuri et al. (2011) United States P-V ●
Wang et al. (2016)b China P-V ● ○
Wang & Kockelman (2013) United States P-V ● ○ ●

Wei & Lovegrove (2013) Canada B-V ● ●
Wier et al. (2009) United States P-V ● ○ ●
Xu and Huang (2015) United States TC ● ● ○ ● ●
Xu et al. (2017)a United States TC (interior & boundary) ● ○ ● ● ●
Xu et al. (2017)b United States TC ● ● ●
Yasmin & Eluru (2016) Canada B-V ● ●
Zhai et al. (2019)a United States TC (interior & boundary) ● ● ● ●
Zhai et al. (2018) United States TC (interior & boundary) ● ● ● ●

Study characteristics: Author(s) (Year); Country of study
Independent variables – Road environment: Lane number; Intersection nr./density; Roadway length
Independent variables – Demographic: Population number/density; Road user/Population age; Modal distinction
Independent variables – Socio-economic: Household/Personal income; Employment percentage/density
Independent variables – Land use: Land use factor(s)
Spatial units: Regional level; Zonal level; Link/segment/intersection level

Abdel-Aty et al. (2013) United States ● ● ● ● ● ●
  Spatial units: TAZ | CT | BG; Intersections
  Approach: Bayesian Multivariate Poisson Lognormal Regression
Abdel-Aty et al. (2011) United States ● ● ○
  Spatial units: TAZ
  Approach: Negative Binomial Regression
Amoh-Gyimah et al. (2017) Australia ● ● ● ● ●
  Spatial units: SA1 | SA2 | TAZ | SED | ZIP
  Approach: Random parameter negative binomial model | Semi-parametric Poisson GWR (also on custom grid cells)
Anderson (2007) United Kingdom
  Spatial units: CT; Urban road segments
  Approach: Kernel density estimation | Network analysis | Census Output Area estimation
Anderson (2009) United Kingdom ○ ● ● ●
  Spatial units: Hotspot clusters
  Approach: Kernel density estimation | K-means clustering
Bao et al. (2018) United States ● ● ● ● ● ●
  Spatial units: ZIP
  Approach: Poisson GWR | Latent Dirichlet Allocation
Bao et al. (2017) United States ● ● ● ● ○ ● ●
  Spatial units: TAZ
  Approach: Geographically Weighted Regression (GWR)
Cai et al. (2019a) United States ● ● ● ● ● ● ●
  Spatial units: TAD
  Approach: Bayesian Poisson Lognormal Regression: (1) at macro-level; (2) at micro-level; (3) integrated at macro- and micro-levels
Cai et al. (2018) United States ● ● ● ● ● ● ○
  Spatial units: County; TAD
  Approach: Poisson-lognormal models: (1) Fixed param. univariate model; (2) Grouped random param. univ. spatial model; (3) Grouped random param. univ. spatial model with zonal factors; (4) Grouped random param. multiv. spatial model with zonal factors
Cai et al. (2017b) United States ● ● ● ● ● ● ●
  Spatial units: TAD
  Approach: Bayesian Negative Binomial regression | Bayesian Logit regression model | Bayesian Joint model [of the two] | Elasticity analysis
Cai et al. (2016) United States ● ● ● ● ● ●
  Spatial units: TAZ
  Approach: Negative Binomial spatial and aspatial models (basic, zero-inflated & hurdle)
Cottrill & Thakuriah (2010) United States ● ● ● ○ ● ●
  Spatial units: EJ (CT)
  Approach: Poisson Regression with heterogeneity | Poisson Regression with exogenous underreporting
Cui et al. (2015) Canada ● ●
  Spatial units: 2 city areas; Neighborhoods
  Approach: (1) Entropy-based histogram thresholding (2) Collision density probability distribution (3) Collision aggregation through density ratio
Delmelle & Thill (2008) United States ● ○ ● ● ○ ● ●
  Spatial units: CT
  Approach: OLS Regression | Kernel density
Dong et al. (2016) United States ● ● ●
  Spatial units: TAZ
  Approach: Bayesian Multivariate Poisson Lognormal Regression | Bayesian spatial-temporal interaction models
Dong et al. (2015) United States ● ● ● ●
  Spatial units: TAZ
  Approach: ν-Support Vector Machine with Correlation-based Feature Selector | Bayesian Multivariate Poisson Lognormal with CAR Prior
Dong et al. (2014) United States ● ● ● ● ●
  Spatial units: TAZ
  Approach: Bayesian Multivariate Poisson Lognormal with CAR Prior Regression for boundary and non-boundary area models
Erdogan et al. (2008) Turkey ●
  Spatial units: Hotspot clusters
  Approach: Poisson test | Chi^2 test | Kernel density analysis
Gomes et al. (2017) Brazil ● ● ● ● ●
  Spatial units: TAZ
  Approach: Negative binomial regression | Poisson GWR | Negative Binomial GWR
Guo et al. (2017) Hong Kong ● ● ● ○ ●
  Spatial units: TAZ
  Approach: Space Syntax | Poisson Lognormal Regression | Bayesian Poisson Lognormal with CAR Prior Regression with (1) contiguity (2) geometry-centroid distance and (3) road network connectivity
Hadayeghi et al. (2010) Canada ● ● ● ● ● ●
  Spatial units: TAZ
  Approach: Poisson GWR | Negative Binomial Regression | Poisson regression
Hadayeghi et al. (2003) Canada ● ● ● ● ● ●
  Spatial units: TAZ
  Approach: GWR | Negative Binomial Regression
Jiang et al. (2016) United States ● ● ● ● ○ ● ●
  Spatial units: TAZ
  Approach: Random Forest Models (CART trees) | Wilcoxon Tests
Ladron de Guevara et al. (2004) United States ● ● ● ● ● ●
  Spatial units: TAZ
  Approach: Negative Binomial Regression | Simultaneous equation estimation
LaScala et al. (2004) United States ● ● ● ○ ● ● ● Communi- Geograph- Linear regression models
ties ic units
LaScala et al. (2000) United States ○ ● ● ● ○ ● ● ● CT Spatial autocorrelation
regression log-linear
model
Lee & Abdel-Aty (2018) United States ● ● ● ● ● ● ● ZIP Bayesian Poisson
lognormal CAR models
Lee et al. (2018)b United States ● ● ● ● ● ● ● ● TAZ Fractional Split
Multinomial Model
Lee et al. (2017)a United States ○ ● ● ○ ● ● County | TAD | ZIP Intersecti- Mixed effects Negative
County | TAZ | CT ons Binomial models with: (1)
Division | BG | CB micro-level variables, (2)
micro- and macro-level
variables and (3) micro-
and macro-level variables

with random-effects
Lee et al. (2015)a United States ● ● ○ ● ● TAZ Univariate and
Multivariate Bayesian
Poisson Lognormal with
CAR Prior Regression
Lee et al. (2015b) United States ● ● ● ● ● ● ● ● ZIP Bayesian Poisson
lognormal simultaneous
equations spatial
error model
Lee et al. (2014a) United States ● ● ● ● ● ZIP Bayesian Poisson-
lognormal model
Lee et al. (2014b) United States ● ● ● ● TSAZ | Brown-Forsythe test |
TAZ Bayesian Multivariate
Poisson Lognormal
Regression
Levine et al. (1995) United States ○ ● ● ● ● BG Spatial lag regression
model
Loukaitou-Sideris et al. (2007) United States ○ ○ ● ● ○ ● ● ● CT OLS regression
Lovegrove & Sayed (2007) Canada ● ● ● ● ● Neighbor- Groups of Macrolevel
hood - Crash Prediction Models
TAZ using GLMs
Lovegrove & Sayed (2006) Canada ● ● ● ● ● ● Neighbor- Groups of Macrolevel
hood - Crash Prediction Models
TAZ using GLMs
Lovegrove et al. (2009) Canada ● ● ● ● ● TAZ Groups of Collision
Prediction GLMs |
Modified T-tests
MacNab (2004) Canada ● ● ● ● ● Local health area Bayesian spatial model with spatial autocorrelation
Naderan & Shahi (2010) Iran ● TAZ Negative Binomial
regression
Narayanamoorthy et al. (2013) United States ○ ● ● ● ● ● CT Customized generalized
ordered-response spatial
multivariate count model
Nashad et al. (2016) United States ● ● ● ● ● ● sTAZ Negative binomial
regression (copula-based)
Ng et al. (2002) China ● ○ ● TAZ Negative Binomial
Regression with Empirical
Bayes approach | Cluster
Analysis
Noland & Quddus (2005) United Kingdom ● ● ● ● ● ● ● Enumerat- Negative Binomial
ion Regression | ANOVA
District
Noland & Quddus (2004) United Kingdom ● ● ● ● ● ● ● Ward Negative Binomial

Regression
Pirdavani et al. (2014)a Belgium ● ● ● ● ● TAZ Geographically Weighted
GLM | Negative Binomial
Regression
Pirdavani et al. (2014)b Belgium ● ○ ● ● TAZ Geographically Weighted
Regression (GWR)
Pirdavani et al. (2013) Belgium ● ○ ● TAZ Negative Binomial
regression Zonal Crash
Prediction Models
Quddus (2008) United Kingdom ● ● ● ● ○ ● Ward Negative Binomial
Regression | Spatial
autoregressive model |
Spatial error model |
Bayesian hierarchical
models for spatial units
Rhee et al. (2016) South Korea ○ ● ● ● ● ● ● ● TAZ OLS regression | Spatial
lag regression | Spatial
error regression | Poisson
GWR
Siddiqui & Abdel-Aty (2012) United States ● ● ● ○ ● ● TAZ Multivariate Negative
Binomial regression |
Multivariate Bayesian
Negative Binomial
regression for boundary
and non-boundary area
models
Siddiqui et al. (2012) United States ● ● ● ○ ● ● ● TAZ Bayesian Multivariate Poisson Lognormal | Negative Binomial Regression
Soltani & Askari (2017) Iran ● ○ ● TAZ Moran's I | Getis-Ord Gi*
index
Tasic et al. (2017) United States ● ● ● ● ● ● ● CT Generalized Additive
Models
Ukkusuri et al. (2012) United States ● ● ● ● ● ● CT | ZIP Negative binomial
regression | Negative
binomial regression with
heterogeneity in
dispersion parameter |
Zero-inflated negative
binomial regression
Ukkusuri et al. (2011) United States ○ ● ● ● ● ○ ● CT Negative Binomial
Regression with random
parameters
Wang et al. (2016)b China ● ● ● ○ ● TAZ Bayesian Conditional
Autoregressive (CAR)

models with seven
different spatial weight
features
Wang & Kockelman (2013) United States ● ● ○ ● ● CT Multivariate Poisson
Lognormal Regression
with and without CAR
Priors
Wei & Lovegrove (2013) Canada ● ● ● ● ● ● ● ● TAZ Negative Binomial
Macrolevel Crash
Prediction Models
Wier et al. (2009) United States ● ● ● ● ○ ● ● CT Log-linear multivariate
OLS regression model
Xu and Huang (2015) United States ● ● ● ● TAZ Negative Binomial
regression | Bayesian
negative binomial model
with CAR prior | Random
parameter negative
binomial model | Semi-
parametric Poisson GWR
Xu et al. (2017)a United States ● ● ● ● ● ● TAZ Bayesian spatially varying
coefficients model
Xu et al. (2017)b United States ● ● ● ● ● ● ● ● TAZ Semi-parametric Poisson
GWR | One-way ANOVA
tests
Yasmin & Eluru (2016) Canada ● ● ● ● ● ● ● TAZ Poisson Regression |
Negative Binomial
regression (basic and
Latent Segmentation)
Zhai et al. (2019)a United States BG | TAZ | CT | ZIP Bayesian Poisson-lognormal models with Multivariate CAR priors
Zhai et al. (2018) United States TAZ Bayesian Poisson-lognormal model with CAR prior

● Considered in the study design, ○ considered in the study process as filter/defining characteristic

road network, it is important to note that there are spatial correlations between intersections and their adjacent segments, which have been found to be significant in the literature (Abdel-Aty and Wang, 2006; Quddus, 2008; Aguero-Valverde and Jovanis, 2010; Dong et al., 2014; Dong et al., 2015; Wang and Huang, 2016). Spatial correlation is also found in crashes of intersections along the same corridor, due to similar traffic flow patterns, presence of traffic signals and geographic characteristics (Guo et al., 2010), an issue which ought to be addressed with appropriate modelling tools (Xie et al., 2014). Additionally, several studies have integrated corridor-level characteristics into segment-level or intersection-level analysis in an effort to capture factors explaining heterogeneity (Abdel-Aty and Wang, 2006; Guo et al., 2010; Xie et al., 2014).

A different effort was made by Zeng and Huang (2014), who endeavored to model crash counts on road segments and intersections simultaneously. They used Bayesian spatial joint models to account for spatial correlations between adjacent road segments and intersections that were found to be more accurate than simple Poisson and negative binomial models. The joint model integrated junctions and segments into the basic link function. An indicator variable denoting whether a segment or an intersection was examined was utilized. The authors highlight that the spatial correlations between intersections and their connected segments were more significant than those found between intersections or between segments only, presumably due to common unobserved parameters such as speed. The approach of joint simultaneous modelling of intersections and segments was further advanced by Alarifi et al. (2017), who developed four multi-level Bayesian joint models for that purpose. Specifically, the reasoning was to complement the intersection/segment examination by including corridor-level characteristics in the models. Because corridor characteristics vary along their length, random forest models were used to divide corridors into sub-corridors of fixed-value characteristics. Ultimately there were statistically significant variables at the segment level, at the intersection level and at the corridor/sub-corridor level; the importance of median opening density for crash occurrence was underlined from the results. However, spatial autocorrelation of adjacent road entities was not examined in that study. Moreover, Alarifi et al. (2018) (discussed in Section 2.7) also conducted analyses including intersection-, road segment- and corridor-level parameters, in an attempt to explore that research question.

Reviewed studies that primarily focus on spatial analyses at the individual road segment/intersection level are shown in Table 1.

2.2. Zonal approaches

A number of zonal units have been adopted by researchers, from smaller to larger ones. Their boundaries can be census-based, administrative-based or traffic-based, and are dependent on the country or environment of study. Studies in the UK might utilize enumeration districts, namely areas averaging circa 200 households (Noland and Quddus, 2005) or census wards, which include about 2000 households (Noland and Quddus, 2004; Quddus, 2008). Similarly, studies from other countries have used locally available spatial units, such as the Australian ABS structure units (Statistical areas 1,2 (SA1,2), state electoral divisions (SED)) used by Amoh-Gyimah et al. (2017).

Many studies originate from the US and have utilized units that are used there: Census Blocks (CBs) are the smallest unit, averaging 85 people and are expanded to Census Block Groups (CBGs), averaging 39 blocks with about 1500 people (Lee et al., 2017a). CBGs have been utilized by road safety researchers to some extent (Levine et al., 1995; Abdel-Aty et al., 2013).

Traffic Analysis Zones (TAZs) are created primarily in the US with the explicit purpose of collecting trip and traffic statistics and data, though they have been implemented in other countries as well (Ng et al., 2002; Gomes et al., 2017). From traditional zonal approaches, TAZs are the only traffic-related zone system (Lee et al., 2017a), which
Table 3
Studies with road safety spatial analyses primarily on the regional level

Study Characteristics Dependent variables Independent variables – parameters


Traffic Road environment

Author(s) (Year) | Country of study | Crash type analyzed | Crash count/frequency | Crash rate | Injury Severity | Casualty rate | Speed | Traffic volume | Vehicle distance travelled | Number of Trips - OD | Speed Limit | Curvature | Gradient

Aguero-Valverde (2013) Costa Rica TC ● ● ●


Aguero-Valverde & Jovanis (2006) United States TC ● ○ ●
Atubi (2012) Nigeria TC ● ○
Bu et al. (2018) United States TC ● ● ● ●
Erdogan (2009) Turkey TC ● ● ●
Flask & Schneider (2013) United States MC ● ○ ● ●
Han et al. (2018) United States TC ● ●
Huang et al. (2010) United States TC ● ● ○ ●
LaScala et al. (2001) United States P-V ● ● ●
Lee et al. (2019)a United States P-V ○ ○ ● ●
Lee et al. (2019)b Italy, United States TC | P-V | B-V ●
Lee et al. (2018)a United States TC ● ○
Lee et al. (2018)c United States P-V | B-V ● ○ ●

Lee et al. (2017)b United States MC ● ○
Li et al. (2019) United States TC ● ○ ●
Li et al. (2013) United States TC ● ○ ● ●
Liu and Sharma (2018) United States TC ● ● ●
Moeinaddini et al. (2014) 20 Cities Worldwide TC ● ○
Noland & Oh (2004) United States TC ● ○ ● ●
Song et al. (2006) United States TC ● ○ ○ ●
Zhai et al. (2019)b Hong Kong P-V ●

Study Characteristics Independent variables – parameters Spatial aggregation approach Analysis - Modelling approach

Demographic Socio-economic Land Use

Author(s) (Year) | Country of study | Lane width | Lane number | Intersection nr./density | Roadway length | Population number/density | Road user/Population age | Modal distinction | Household/Personal income | Employment percentage/density | Land use factor(s) | Regional level

Aguero-Valverde (2013) Costa Rica ● ● ● ● Canton Full Bayes hierarchical approach Poisson multivariate CAR model for spatial random effects.
Aguero-Valverde & Jovanis (2006) United States ● ● ● County Negative Binomial Regression | Full Bayesian hierarchical models
Atubi (2012) Nigeria ● ● State Multivariate linear
regression
Bu et al. (2018) United States ● ● Metropoli- Simple Density
tan areas distribution analysis
Erdogan (2009) Turkey ● ● ● ● County Moran's I and Geary's c
values, Z and G statistics
Flask & Schneider (2013) United States ○ ● ● ● ● County | Bayesian Negative
Township Binomial Regression with
mixed effects
Han et al. (2018) United States ○ ● ● County Bayesian hierarchical
(spec. random parameter model
road type) | Bayesian hierarchical
random intercept model |

Bayesian Poisson
lognormal model
Huang et al. (2010) United States ● ● ● ● ○ ● ● ● County Bayesian Spatial CAR
Priors regression
LaScala et al. (2001) United States ● ○ ● ● ○ ● ● ● Communi- Spatial autocorrelation
ties regression log-linear
model
Lee et al. (2019)a United States ● ● ● ● ● Metropoli- Multiple linear regression
tan areas model integrated in a
Poisson Lognormal Model
Lee et al. (2019)b Italy, United States ● ● ● ● County | Negative Binomial
Provincia Regression | Calibration
factors | Transferability
Indexes
Lee et al. (2018)a United States State Crash Modification
Factors
Lee et al. (2018)c United States ● ● ● ● Metropoli- Bayesian integrated and
tan areas non-integrated Bivariate
Models
Lee et al. (2017)b United States ● ● ● ● County | Before-and-After Study
Parish (1) with Comparison
Group | (2) With
Empirical Bayes | Safety
Performance Functions |
Crash Modification
Factors
Li et al. (2019) United States ● ● ○ ● ● ● County Hierarchical Bayesian random parameters models (structured and unstructured spatio-temporal effects)
Li et al. (2013) United States ● ● ● ● ● County Negative Binomial
Regression | Poisson GWR
Liu and Sharma (2018) United States ● ● ● County Hierarchical Bayesian
random parameters
models (structured and
unstructured spatio-
temporal effects)
Moeinaddini et al. (2014) 20 Cities Worldwide ● ● City Gamma-distributed GLM
Noland & Oh (2004) United States ● ● ● ● ● County Negative Binomial Panel
Regression
Song et al. (2006) United States ○ County Bayesian Multivariate
Poisson Lognormal
Regression with and

without CAR Prior
Zhai et al. (2019)b Hong Kong ● ● ● ● City Binary & Mixed logit
models with and without
variable interaction terms

● Considered in the study design, ○ considered in the study process as filter/defining characteristic
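
Several studies in Table 3 rely on global spatial autocorrelation statistics such as Moran's I (e.g. Erdogan, 2009). As an illustration only, the standard Moran's I formula can be sketched in Python as follows. The function name `morans_i`, the four-unit contiguity matrix and the crash counts are hypothetical and are not taken from any reviewed study.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for one value per areal unit and a spatial weight matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    z = x - x.mean()                   # deviations from the overall mean
    s0 = w.sum()                       # sum of all spatial weights
    num = (w * np.outer(z, z)).sum()   # weighted cross-products of deviations
    den = (z ** 2).sum()               # total squared deviation
    return (n / s0) * (num / den)

# Hypothetical layout: four counties in a row, neighbours share a border.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
crashes = [10, 12, 30, 33]             # made-up crash counts per county
print(morans_i(crashes, w))            # positive: similar counts are adjacent
```

A positive value suggests that units with similar crash counts tend to be neighbours, while a negative value suggests alternation between high and low counts. The result depends on the chosen weight matrix, which is an analyst's assumption.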
Table 4
Studies with road safety spatial analyses primarily by conditional approaches

Study Characteristics Dependent variables Independent variables – parameters

Traffic Road environment

Author(s) (Year) | Country of study | Crash type analyzed | Crash count/frequency | Crash rate | Injury Severity | Speed | Traffic volume | Vehicle distance traveled | Number of Trips - OD | Speed Limit | Curvature | Gradient | Lane width

Bao et al. (2019) United States TC ● ● ● ●


Bíl et al. (2013) Czech Republic TC ●
Cai et al. (2019b) United States TC ● ● ●
Cai et al. (2017)a United States TC | P-V | B-V ● ○ ● ●
Chung et al. (2018) United States TC ● ○ ●
Imprialou et al. (2016) United Kingdom TC ● ○ ● ● ● ●
Kim et al. (2006) United States TC | V-V | P-V | B-V ●
Loo et al. (2011) China V-V | P-V ● ○
Mohaymany et al. (2013) Iran TC ● ○
Ossenbruggen et al. (2010) United States TC ● ○ ● ● ●
Xie et al. (2017) United States P-V ○ ● ● ●
Xie and Yan (2008) United States TC ● ○

Study Characteristics Independent variables – parameters

Road environment Demographic Socio-economic Land Use Spatial aggregation approach Analysis - Modelling approach

Author(s) (Year) | Country of study | Lane number | Intersection nr./density | Roadway length | Population number/density | Road user/Population age | Modal distinction | Household/Personal income | Employment percentage/density | Land use factor(s) | Zonal level | Link/segment/intersection level | Condition-based level

Bao et al. (2019) United States ● ● ● ○ ● Multiple Convolutional


grids Neural Network
(approx. augmented with a
to ZIP Long Short-term
areas) Memory Network
Bíl et al. (2013) Czech Republic ○ ● Rural Rural road Network Kernel
segments network Density Estimation
split into with significance
funda- verification
mental
segments
Cai et al. (2019b) United States ● ● ● ● ● 9-mi2 grid Convolutional
structure Neural Networks
divided to (GLM and
smaller Artificial Neural
cells Networks for
benchmarking
purposes)
Cai et al. (2017)a United States ● ● ● ● TAD | TAZ | CT Multiple grids from 1 to 100 mi² Multivariate Poisson Lognormal Regression with and without spatial autocorrelation
Chung et al. (2018) United States ○ Areas Categorical
within 20 analysis
mi of (sensitivity,
2271 positive predictive
weather value, Cohen's
stations Kappa) | Negative
Binomial
Regression
Imprialou et al. (2016) United Kingdom ● ● Rural & Pre-crash Bayesian
Highway conditions Multivariate
segments Poisson Lognormal
Regression
Kim et al. (2006) United States ● ○ ● ● 0.1-mi2 Negative Binomial
grid Regression | OLS

structure Regression
Loo et al. (2011) China ○ ○ Urban & Urban and Network Kernel
suburban suburban Density Estimation
segments network
split into
funda-
mental
segments
Mohaymany et al. (2013) Iran ○ ● Rural Rural road Network Kernel
segments split into Density Estimation
funda-
mental
segments
Ossenbruggen et al. (2010) United States 1-mi2 grid Homogeneous
structure Poisson process
spatial testing
Xie et al. (2017) United States ● ● ● ● ● ● 300 × 300 feet² grid structure Linear Regression Model | Tobit Model | Potential for Safety Improvement
Xie and Yan (2008) United States ○ Urban Network Kernel
network Density Estimation
split into
funda-
mental
lixels

● Considered in the study design, ○ considered in the study process as filter/defining characteristic
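
Several of the conditional approaches in Table 4 aggregate crashes on fixed-size square grids (e.g. the 0.1 mi², 1 mi² and 1 to 100 mi² structures). A minimal sketch of such an aggregation is given below, assuming planar crash coordinates expressed in miles. The function name `grid_counts`, the sample points and the 0.25 mi² cell size are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def grid_counts(points, cell_area):
    """Count crashes per square grid cell of the given area (coordinate units squared)."""
    side = math.sqrt(cell_area)        # side length of one square cell
    counts = Counter()
    for x, y in points:
        # floor assigns negative coordinates to the correct cell as well
        counts[(math.floor(x / side), math.floor(y / side))] += 1
    return counts

# Made-up crash locations in miles on a planar coordinate system.
crashes = [(0.2, 0.2), (0.8, 0.9), (1.5, 0.1), (2.4, 2.6)]
coarse = grid_counts(crashes, cell_area=1.0)    # 1 mi² cells
fine = grid_counts(crashes, cell_area=0.25)     # 0.25 mi² cells (hypothetical size)
print(dict(coarse))
print(dict(fine))
```

The same crashes fall into fewer, denser cells on the coarse grid and into more, sparser cells on the fine grid, which illustrates why a grid of a particular size might be improper for certain areas.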

might explain their popularity for utilization in spatial analyses (e.g. Ng et al., 2002; Hadayeghi et al., 2003; Ladron de Guevara et al., 2004; Lovegrove and Sayed, 2006; Lovegrove and Sayed, 2007; Hadayeghi et al., 2010; Naderan and Shahi, 2010; Abdel-Aty et al., 2011; Abdel-Aty et al., 2013; Dong et al., 2014; Lee et al., 2014b; Dong et al., 2015; Lee et al., 2015a; Xu and Huang, 2015; Dong et al., 2016; Nashad et al., 2016; Xu et al., 2017a, 2017b; Bao et al., 2017; Gomes et al., 2017). TAZs can also be expanded for road safety assessment purposes by aggregating groups of TAZs with similar crash rates, thus creating Traffic Safety Analysis Zones (TSAZs) (Lee et al., 2014b; Abdel-Aty et al., 2016).

Census Tracts (CTs, or census output areas) are larger units containing about 4000 people of comparable socio-economic statuses in the US (or about 2500 people in the UK). They too have been adequately explored in road safety spatial analyses in the literature (e.g. LaScala et al., 2000; Loukaitou-Sideris et al., 2007; Delmelle and Thill, 2008; Wier et al., 2009; Cottrill and Thakuriah, 2010; Ukkusuri et al., 2011; Narayanamoorthy et al., 2013).

Similar to TAZs, Traffic Analysis Districts (TADs) are newly created, larger geographic traffic-related units used for transport analyses. A few recent studies have utilized TADs as a basis for analysis (e.g. Abdel-Aty et al., 2016; Cai et al., 2017b; Lee et al., 2017a). Other zonal areas have been used as well by exploiting existing utility systems, such as postal ZIP codes (e.g. Lee et al., 2014a; Bao et al., 2018) and urban/rural areas defined by healthcare authorities (e.g. MacNab, 2004; Bu et al., 2018).

Reviewed studies that primarily focus on spatial analyses at zonal levels are shown in Table 2.

TAZ approaches can conceptually include elements of segment approaches nested in them. An example is the study of Yasmin and Eluru (2016) that employed latent segmentation count models where TAZs are allocated probabilistically to different segments. This was in order to limit external factor impact and to classify segments within a TAZ to high- and low-risk based on empirical expected crash means. Studies have also developed models on several zonal systems for comparison purposes between them. Abdel-Aty et al. (2013) claimed that while TAZs and CBGs are equally desirable for spatial analysis, TAZs allow the examination of more transport-related factors, and thus are easier to integrate in transport contexts. Furthermore, the aggregation of TAZs into TSAZs with a rate of about 1:2 was found to be preferable for macroscopic safety modeling (Lee et al., 2014b). Cai et al. (2017a) conducted comparative Poisson lognormal models for three crash types with and without considering spatial autocorrelation effects, and recommended that CTs are better used for socio-demographic data collection, TAZs are used for transportation demand forecasting and TADs are used for transportation safety planning. Different zonal levels have also been used in conjunction for simultaneous aggregate and disaggregate modelling; it has been shown that aggregate models using ZIP codes were more volatile in parameter values and significance levels, while disaggregate CT models provided more consistent results (Ukkusuri et al., 2012). Lastly, it has been determined that separate considerations for crashes near TAZ boundaries revealed unique predictor variables (Siddiqui and Abdel-Aty, 2012), a finding worthy of examination in all spatial units.

2.3. Regional approaches

Regional areas (counties, cities, metropolitan areas, states) that are larger than the zonal ones examined above have also been implemented in the literature. Regional areas are administrative units, with often different governance laws and frameworks than their neighboring areas, as is often the case in US states. In the US, entire Metropolitan Statistical Areas (MSAs) have been used for the National Household Travel Survey, which has provided data for pedestrian trips (Lee et al., 2019a). The benefit of using regional units can lie in the interpretation of model results and possible evaluation of risk factors or road safety interventions, such as legislation changes. For instance, a study by Song et al. (2006) applied Bayesian multivariate spatial models in county-level data in Texas, and results indicated that eastern Texas counties had higher crash risks than western Texas counties, with less safe sites being near large city conglomerations. Studies have examined road safety indicators at the level of geographic units formed from communities (LaScala et al., 2001, 2004), at the city level (Moeinaddini et al., 2014), at the metropolitan area level (Bu et al., 2018), at the county level (Noland and Oh, 2004; Song et al., 2006; Erdogan, 2009; Huang et al., 2010; Li et al., 2013) or similarly at the state level (Atubi, 2012).

Regional-wide crash modification factors (CMFs) have also been developed for a single change affecting the traffic environment uniformly, e.g. for legal changes in some U.S. States or across the entire country (Lee et al., 2017b, 2018a); however, this approach does not take spatial effects explicitly into account. As the area size increases, it is important to remember that unobserved heterogeneity is more difficult to capture, due to multiple unobserved parameters being introduced in the occurrence of events; as Wang et al. (2016b) state, it becomes more difficult to capture spatial trends and problems in a larger area. If differences in comparable units between remote areas such as different countries are taken into account, it is reasonable to assume that transferability of results for macroscopic spatial analysis is far from seamless. In a study seeking to examine transferability of results across regions of different countries (from US counties to Italian provincias), Lee et al. (2019b) employed negative binomial models using data from both countries and calculated the respective transferability indexes and calibration factors. Models for total crashes and bicycle crashes were transferable from Italy to the US; the opposite, however, was found to be untrue for most study areas. In addition, no model for pedestrian crashes was found to be transferable between the two countries. It is important to note that this statistical disagreement emerged even while several significant variables were common across the two countries, and without accounting for spatial effects in the models of the study.

Reviewed studies that primarily focus on spatial analyses at the regional level are shown in Table 3.

2.4. Conditional approaches

Apart from defined zones, conditional approaches have been adopted. Conditional is hereby defined as any approach that does not utilize any of the previous segment, zonal or regional approaches, but rather a more rigid ruleset defined by researchers. An example is fixed-distance grid structures, such as 0.1 square mile grids (Kim et al., 2006), 1 square mile grids (Ossenbruggen et al., 2009) and multiple grid sizes from 1 to 100 square miles (Cai et al., 2017a). While the impacts of grid-based characteristics on crash counts have been proven to be statistically significant, a grid of a particular size might be improper for certain areas, depending on spatial distributions of safety-related parameters (Kim et al., 2006).

An example of approaches that are conditional not by area, but by crash circumstance, are link-based approaches that utilize crash-mapping algorithms and assign crashes to each road segment, assuming that the crashes happening on the same link have the same underlying conditions, which might not always be the case. Link-based approaches can be problematic in providing interpretable results, however. Conversely, crashes can also be grouped by pre-crash conditions, regardless of their actual location, for the purposes of spatial analyses. Pre-crash conditional approaches have appeared to be more transferable overall (Imprialou et al., 2016).

Reviewed studies that primarily focus on conditional spatial analyses are shown in Table 4.

2.5. Integration of different areal units

The aforementioned integration of characteristics of the corridor level to road segment or intersection level analysis by several studies (Zeng and Huang, 2014; Alarifi et al., 2017, 2018) is a considerable


achievement in road safety. In these studies, the levels of analysis can be considered to be close in geographical characteristics (i.e. a segment is similar to a corridor). There have been other endeavors, however, to integrate factors from units of more different scales in spatial analyses, such as zonal-level characteristics to segment-level analysis.

As stated before, the zonal level has become a promising medium during the more recent years for the exploration of new approaches of spatial analyses. Zonal factors, such as Vehicle Miles Traveled (VMT), are considered to be shared by both segments and intersections of the same zone. It has been hypothesized that both observed and unobserved heterogeneity at the zonal level would influence crash frequency at both segments and intersections inside these zones. Cai et al. (2018) investigated crashes at the TAD level across three counties to determine the influence of any observed and unobserved zonal factors. Results indicate that including zonal factors improves model performance for both segment and intersection crash frequency prediction.

Another concept is incorporating macro-level variables into micro-level safety analysis. This has been attempted by Lee et al. (2017a) across seven areal units of varying sizes for intersection crashes. They determined that accounting for macro-level variables and introducing macro-level random-effects leads to models of better performance than the baseline, though performance varies when using data of different areal unit size. Additionally, there have been endeavors to link crash counts of micro- and macro-levels through their spatial interaction (Cai et al., 2019a). A spatial interaction matrix was created based on whether a road segment (micro-level) was inside a zone (macro-level), and an adjustment factor was introduced to bridge the different estimates of expected crashes that would occur for the two levels. Once again, following an integrated approach increased model performance; moreover, the determination of both macro- and micro-level risk factors that influenced crashes was possible, as well as crash hotspots on both levels.

Conversely, road-level factors have been shown to influence safety

(1999) claimed that neighboring zones influence crashes close to the borders of areal units. Since then, several studies have explored the problem, each proposing a solution. Delmelle and Thill (2008) mention simple solutions such as (1) assigning the locations as they were assigned by police records, (2) double-counting boundary crashes or (3) apportioning crashes, dividing the counts per neighboring zones.

Separate predictor sets have been prepared for boundary and interior pedestrian crashes per TAZ, introducing buffer zones around 2-D borders. This mutually exclusive separation and modelling within a hierarchical Bayesian framework has led to increased model fit. However, this approach was adopted due to the limited distance travelled by pedestrians, and accounting for additional road user types might differ due to higher amounts of areal units that are typically crossed (Siddiqui and Abdel-Aty, 2012). Instead of using a fixed buffer zone, Cui et al. (2015) introduced an entropy-based method applied on histogram thresholding, to obtain a variable buffer zone size. The crash density probability distribution was then calculated, and boundary crashes were aggregated into neighborhoods. The case study resulted in 6 m and 9 m buffer zones for central areas and south areas in Edmonton, Canada, respectively. The authors concluded that the entropy-based method was precise when compared to ground truth data, though more variables are required to verify this finding, especially traffic-related variables such as speed and traffic volume.

An alternative was proposed by Zhai et al. (2018), who adopted an iterative data aggregation approach to compensate for the boundary effect. The reasoning behind this method was the division of each zone into boundary and interior, the development of a crash prediction model for each zone based on interior crashes only, the aggregation of crashes based on crash model predictions, the assignment of boundary crashes to each zone based on the proportions of expected interior crashes, and, as a last step, re-running the prediction model until convergence. The crash assignment was based on the CAR Poisson Lognormal Bayesian Spatial Model. It is notable that the impact of several
by varying effects across regions, and can be considered to be correlated independent variables were found to be influenced by the boundary
with unobserved heterogeneity, to an extent. To demonstrate this, a effect in the case study in Florida, US. Both Cui et al. (2015) and Zhai
dedicated study examined specifically urban two-lane roadway seg- et al. (2018) demonstrated that certain analytical approaches outper-
ments in 34 counties in Florida, US. Regression coefficients of Poisson form conventional rules such as the various ratio methods that split
lognormal models and hierarchical models were found to fluctuate boundary crashes based on numerical rules or exposure parameters). It
considerably for crash counts across the examined counties (Han et al., is also worth noting that certain Bayesian statistical models can express
2018). However, neither factors at the regional level nor spatial cor- the interaction of neighboring zones on crashes close to zone bound-
relations at the microscopic level were taken into account in that par- aries via the utilization of corresponding spatial weights (e.g. Wang
ticular study. et al., 2016b).
Huang et al. (2016) investigated a possible bridging of the macro- The modifiable areal unit problem (MAUP) occurs when boundaries
and micro-level approaches for an integrated crash prediction and are changed inside the study areas, causing possible influences on the
hotspot identification approach. Crashes were analyzed both jointly at statistical models and resulting inferences (Openshaw, 1984). The issue
the micro-level (road segment/intersection level) and at the macro- is particularly present in road safety when area boundaries are arbitrary
level (TAZ level). The authors developed both a micro-level Bayesian or malleable, without any hard geographical borders, such as admin-
spatial joint model and a macro-level Bayesian spatial model; as ex- istrative areas or grids. Two studies did experiment with the dis-
pected, the models included different statistically significant variables. crepancies caused by MAUP on different aggregation levels (Ukkusuri
Results reaffirmed the known model merits: micro-level modelling et al., 2012; Abdel-Aty et al., 2013). While the areas which provided
provided more informative and precise insights for directly improving more accurate predictions were determined, no uniform solutions were
road safety, while macro-level modelling allows for incorporating proposed. When outlining MAUP, Xu et al. (2018) outlined four po-
safety improvements in long term transportation planning. The authors tential solutions. These were: (1) using disaggregate data as possible (2)
acknowledge that TAZs may have unobserved scale and zonal effects capturing the spatial non-stationarity, which refers to capturing local
and further, the boundary issue – explained in the following – needs to space variation for each explanatory variable, (3) designing optimal
be accounted for. zoning systems, an approach which presents its own limitations and (4)
conduct sensitivity analysis for MAUP effects specifically.
2.6. Boundary problem and Modifiable areal unit problem A recent study has empirically highlighted the important effects of
MAUP on four different zonal configurations using an identical dataset
Apart from conducting studies across many different areal levels (Zhai et al., 2019a). It was determined that the impact of MAUP was
and bridging aspects and attributes of different spatial levels, re- significant on parameter estimates, model assessment and hotspot
searchers have also shown interest on how to define areas and areal identification. Larger zones, such as CTs and ZIP codes led to models of
units and how to treat events on their boundaries. The boundary pro- higher predictive accuracy in that study. It has also been considered
blem, or boundary effect, refers to the manner in which crashes re- that the zonal systems may have inherent limitations by Lee et al.
corded on (or very close to) the borders of neighboring study areas are (2014b), who developed ten new zonal systems to tackle both the
allocated and treated in statistical analyses. Fotheringham and Wegener boundary and the MAUP problems. The Brown-Forsythe homogeneity

of variance test was implemented to obtain the optimal zonal scale, which was found to be at the custom TSAZ level, as zones cannot be scaled up indefinitely to reduce boundary crash percentages. However, the authors state that the boundary issue still needs to be accounted for in TSAZs, and that further research on additional crash types such as non-motorized (VRU) crashes is needed.

2.7. Examination of spatial proximity structures

A critical point that attracts researcher interest is the creation of different spatial proximity structures and the examination of the effects these structures have on model performance and fit. Various spatial proximity structures have been formulated both at the microscopic and macroscopic levels. Regarding the microscopic level, Aguero-Valverde and Jovanis (2010) concluded that by including route information in the neighboring structure, especially in a simple neighboring structure (direct adjacency), model performance is improved.

Regarding the macroscopic level, Dong et al. (2014) evaluated crash prediction models at the TAZ level using four different types of spatial proximity structures (0–1 first-order adjacency, common-boundary length, geometry-centroid distance, and crash-weighted centroid distance). The best model fit was provided when weighting by the common-boundary length of neighboring TAZs, though cross-zonal spatial correlation was identified as present in crash occurrence for all four configurations. The authors comment that the inclusion of all possible spatial correlations increases model complexity, thus resulting in decreased prediction performance.

Moreover, Alarifi et al. (2018) sought to investigate spatial weights configuration for a hierarchical spatial proximity structure, including intersection-, road segment- and corridor-level parameters. The authors examined four different types of conceptualization of spatial relationships and calibrated 13 Bayesian hierarchical Poisson-lognormal joint models with spatial effects. The adjacency-based first-order model (where directly adjacent road entities and feeding road entities are considered for each segment) was among the best performing models and, once again, significant variables were found in all configurations for all unit levels. The authors suggest that the sensitivity of AADT in the models is a matter for further investigation.

Another sophisticated approach was the utilization of the space syntax technique for modelling street patterns. Space syntax acknowledges that the configuration of the urban grid itself is responsible for the generation of movement patterns (Hillier et al., 1993), though its exact use for deriving certain route choices has been challenged in the past (Ratti, 2004). Guo et al. (2017) considered simple geographical proximity as inadequate to properly describe spatial relationships of crashes. Rather, they sought to integrate road network characteristics in a zonal-level examination. They used space syntax to quantify road network structures in Hong Kong through three main parameters on the TAZ level: (1) connectivity, (2) local integration and (3) global integration. After calculating global integration for three road network patterns (grid, deformed grid and irregular), it was determined that global integration was positively associated with pedestrian-vehicle crashes. Furthermore, the more structured patterns featured the highest global integration values, thus irregular patterns were found to be the safest, followed by deformed grids and lastly (regular) grids.

2.8. Further topics of areal unit analysis

In spatial analysis, study designs sometimes appear to be data-driven, conducted where information is available rather than guided by intuition or previous experience. Availability of data does not necessarily imply its fitness for use in studies. As an indication, weather data measured from stations may or may not describe the situation at crash sites accurately. A study was conducted to evaluate the effectiveness of coverage of weather stations for use in spatially analyzing traffic crashes (Chung et al., 2018). Hourly data observed at land-based stations were contrasted with data from fatal crash databases. Through categorical analysis, sensitivity, positive predictive value, and Cohen's Kappa were examined, and it was determined that the data agreed in rain and snow weather conditions but not in fog, which displayed a 91% false alarm rate. The authors suggest that fog may present higher spatio-temporal sensitivity as a parameter. While the weather station data were found adequate overall for use in crash analyses, the finding regarding the fog parameter ought to make researchers carefully consider possible data sources for their studies.

Furthermore, instead of analyzing crashes collectively in each areal unit, or treating them as separate variables, different crash categories can be examined while taking their interactions into account. A study by Lee et al. (2018b) analyzed the proportions of crashes of each vehicle type at the TAZ level, using a fractional split multinomial model. The fractional approach ensures that the crash proportions of all categories sum to 100%, thus forcing interactions between categories. Findings showed considerable differences as to which variables were statistically significant for each vehicle type. Moreover, the spatial distribution of hot zones varied considerably per vehicle type considered. On that matter, hotspots have also been found to vary temporally. Soltani and Askari (2017) conducted a spatial autocorrelation analysis of crashes and hotspots at the TAZ level in Iran. Moran's I and Getis-Ord Gi* methods were used, and were found to provide significant clustering. The authors examined crashes based on location, time of day and injury severity, which is a very rare combination of parameters. This time, hotspots were found to vary considerably across the various times of day. Another important finding is that zones located at intersections connecting other zones were identified as clusters with high crash rates. Despite the hotspot identification, however, no other explanatory characteristics were introduced in the analysis. It thus appears reasonable to assume that the identified hotspots may vary considerably if certain elements are introduced to a study or omitted from it.

3. Modelling approaches

This section provides a brief overview of the various modelling approaches implemented so far in the literature of spatial analysis in road safety. A multitude of tools have been developed that endeavor to predict road safety indicators (Lord and Mannering, 2010; Mannering and Bhat, 2014), to explain spatial correlation and unobserved heterogeneity and to incorporate the effects of various spatial characteristics that are difficult to represent individually. Several studies have tested various advanced models against simpler ones for performance assessment (e.g. Miaou and Song, 2005; Chiou et al., 2014; Dong et al., 2016; Aguero-Valverde et al., 2016; Cai et al., 2019b). Multivariate models are found to have better goodness-of-fit and precision due to correlation between dependent variables, such as crashes of different severity levels while accounting for spatial correlation (Barua et al., 2014) or simultaneous crash frequency and severity examination (Chiou et al., 2014). The benefits of multi-level data have been discussed in spatial analyses, for instance the multilevel structural hierarchy proposed by Huang and Abdel-Aty (2010) combining driver-level and site-level data with geographic region characteristics.

Spatial analyses often test for spatial autocorrelation or heterogeneity of events, and also consider size and structure for the various research areas and spatial units of analysis in the adopted approaches. For the precise examination of autocorrelation phenomena, various geo-spatial statistics have been adopted by scientists for decades, such as Moran's I, Local Moran's I, and Getis-Ord Gi* statistics.

Generalized Linear Models (GLMs) have been used extensively in the road safety literature for decades, since they assume crashes are independent, random and sporadic countable events (Hauer et al., 1988; El-Basyouny and Sayed, 2009). Their intricacies and limitations have been covered in past studies (e.g. Lord and Mannering, 2010).
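
As a minimal illustration of the global Moran's I statistic mentioned in this section, the sketch below computes it for crash counts in a handful of zones. The function names (`morans_i`, `chain_adjacency`), the toy counts and the 0-1 chain adjacency are assumptions made for illustration only and are not taken from any of the reviewed studies.

```python
def morans_i(values, weights):
    """Global Moran's I for zone values and an n x n spatial weight matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if s0 == 0 or denom == 0:
        raise ValueError("weights must not all be zero and values must vary")
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / s0) * (num / denom)


def chain_adjacency(n):
    """Hypothetical 0-1 first-order adjacency for n zones laid out in a row."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1
        w[i + 1][i] = 1
    return w


if __name__ == "__main__":
    w = chain_adjacency(4)
    # Counts rising smoothly along the row: similar neighbors, positive I.
    print(morans_i([1, 2, 3, 4], w))
    # Alternating low/high counts: dissimilar neighbors, negative I.
    print(morans_i([1, 4, 1, 4], w))
```

In this toy setting the smoothly increasing counts give a value of about 0.33 and the alternating counts give about -1.0. Applied studies would normally pair the statistic with a significance test, which is omitted here.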

While GLMs in their basic form are aspatial, they can be extended to incorporate spatial effects in their structure, eventually becoming quite advanced. An example is the EMGP model by Chiou and Fu (2013), further advanced by Chiou et al. (2014), which originated as an extension of the multinomial-Poisson regression model with added error components, to which spatial correlation effects were also added. Better predictions have been obtained from GLMs including random effects rather than fixed effects, and from GLMs including zonal factors as opposed to those not including them (Cai et al., 2018).

3.1. Geographically Weighted Regression

A method that accounts for spatial variation is the simultaneous development of several localized models using Geographically Weighted Regression (GWR). First proposed by Fotheringham et al. (2002), these models extend the traditional regression framework to allow for a continuous surface of parameter values, with measurements at points that indicate the spatial variability of such a surface. A number of road safety GWR analyses have been published (Hadayeghi et al., 2003, 2010; Pirdavani et al., 2014a, 2014b; Rhee et al., 2016; Gomes et al., 2017; Liu et al., 2017). As Pirdavani et al. (2014b) note, GWR models offer explanatory and descriptive power and provide intuitive results that enable researchers and stakeholders to investigate varying effects of explanatory variables on crash occurrence throughout the study areas.

Gomes et al. (2017) compared the performance of GWR extended in a GLM context and highlighted that Geographically Weighted Negative Binomial Regression (GWNBR) is appropriate for spatially analyzing crash data while accounting for their over-dispersion. Additionally, GWNBR models significantly reduced the spatial dependence of model residuals. GWNBR models were also utilized by Liu et al. (2017) to produce localized models at the roadway segment level, without restrictions by jurisdiction boundaries. The variation of three calculated parameters (intercept, AADT and segment length) was found to be substantial in highway segments across Virginia, US, though the effects of several factors remain to be examined. Additionally, the introduced parameter of segment length is present in spatial structures, which might introduce bias to GWNBR estimations. The authors comment that GWNBR models are highly localized, thus the transferability of their predictions is limited and the models need to be reapplied to each area.

Xu and Huang (2015) extended GWR to semiparametric GWR (S-GWR), which combines geographically varying parameters with geographically constant parameters. Although their composite approach outperformed a random parameter negative binomial (RPNB) model, the authors claimed that S-GWR models are not transferable spatially, and that each region would need to develop separate S-GWR models (a conclusion shared with the GWNBR method). S-GWR was compared again with RPNB by a study conducting crash analysis across six spatial units and three injury severity levels (Amoh-Gyimah et al., 2017). Again, results indicated that S-GWR performed better than the RPNB overall, based on mean absolute deviation (MAD) and Akaike information criterion (AIC) metrics, and had increased prediction accuracy. On the other hand, RPNB displayed increased sensitivity when examining the effect of variation of spatial units on unobserved heterogeneity compared to S-GWR. It should be noted that the latter study did not examine any geometrical characteristics such as segment length or intersection density.

S-GWR has also been employed to investigate possible correlations between jobs-housing balance and road safety, since disruptions in that balance have been found to lead to reduced road network efficiency (Xu et al., 2017b). The authors converted jobs-housing ratio to a categorical variable and then applied S-GWR models at the TAZ level. Considerable spatial variations were discovered for different jobs-housing ratio categories, through elasticity analysis of the model results for each jobs-housing ratio category. However, the study did not compare the S-GWR results with those of another baseline model.

3.2. Autoregressive prior models

A common problem in geographical studies with spatial datasets can be the selection of the appropriate size and scale units for analyses. This has a direct impact on results, as experience suggests that increasing granularity (i.e. spatial resolution) can weaken correlations between output areas and introduce spatial autocorrelation issues (Loo and Anderson, 2015). To counter this, studies have introduced spatial autocorrelation effects (e.g. Aguero-Valverde and Jovanis, 2006, 2008; Guo et al., 2010; Flask and Schneider, 2013; Chiou et al., 2014) or temporal autocorrelation effects in crash count models (e.g. Wang and Abdel-Aty, 2006). The respective models often use CAR or SAR, with the former being more frequently implemented in road safety spatial analyses. A seminal study by Besag et al. (1991) presented a normal distribution for spatial autocorrelation effects using a CAR prior, which has been implemented in many studies since (e.g. Huang et al., 2016; Cai et al., 2018; Zhai et al., 2018; Wen et al., 2019).

CAR models have been found to perform better than Poisson models and Multiple Membership models (where higher level units are formed by each unit and its adjacent neighbors), by explaining a high degree of spatial heterogeneity and by being more lenient in spatial variable omission (El-Basyouny and Sayed, 2009). However, Yasmin and Eluru (2016) note that considering spatial autocorrelation effects and latent segmentation simultaneously can be analytically challenging. Autoregressive models can also be developed within a Bayesian framework as shown in Aguero-Valverde et al. (2016); CAR models have been found to be convenient to compute while using a Gibbs sampler in the Bayesian inference (Huang et al., 2010). Bayesian CAR models have been shown to be capable of functioning with a variety of customizable spatial weights (Aguero-Valverde and Jovanis, 2010; Alarifi et al., 2018). These weights can be calculated on several different bases (e.g. by geometric distance of zone centroids or by land use type). Of these weight sets, it is natural that some will outperform others for a specific study configuration, though not always in the expected manner, as shown by Wang et al. (2016b), where a simple 0-1 configuration based on proximity outperformed land use type- and intensity-based weights for pedestrian crash prediction (population was used as exposure parameter for pedestrians only, without a corresponding parameter for vehicles).

3.3. Bayesian modelling

The process of Bayesian inference has led to the development of several interesting methodologies in recent years. Bayesian hierarchical joint models have been developed in various complexities using regression methods for parameter estimation, possibly with regression splines, as shown in an early Bayesian approach by MacNab (2004). Moreover, multivariate Bayesian models are capable of estimating excess crash frequencies at different severity levels in the same spatial analysis unit (Aguero-Valverde, 2013). Bayesian hierarchical joint models have been shown to highlight significant variables at both micro and macro levels while accounting for spatial correlations between entities (e.g. in Cai et al., 2019a). Such an application by Wang and Huang (2016) identified the following as significant risk factors, among others: higher AADT, more lanes and more accesses for segments on the micro level; signal control, more intersection legs and higher speed limit for intersections on the micro level; and higher road network and trip generation densities.

As studies often report, models with Bayesian approaches have been found to perform consistently better than their non-Bayesian counterparts (e.g. Miaou and Song, 2005; Siddiqui et al., 2012; Wang and Huang, 2016). Bayesian models with CAR effects have been shown to simultaneously account for the spatial correlation and uncorrelated heterogeneity present in aggregated crash count data, and to reveal more significant variables with the same signs as frequentist modelling (Quddus, 2008). However, Bayesian models are not without drawbacks,
as a main strength of their applications is reduced in cases without any solid basis of prior knowledge (uninformative priors). Furthermore, they require a considerable number of calibration iterations (usually referred to as burn-in), which leads to some loss of information and might require considerable computational time and power to obtain.

A noteworthy development is the recent investigation of spatio-temporal heterogeneity using multivariate hierarchical Bayesian models across injury severity categories. Relevant studies have endeavored to capture data heterogeneity with spatial and temporal effects, with the hierarchical framework serving to predict crash counts of different severities simultaneously. Spatial and temporal components are specified with several structured and unstructured components, and random effects can be inserted in the models to address the underlying data structure. Specifically, Ma et al. (2017) aggregated crash counts from 100 homogeneous US highway segments into injury/no injury crash categories using high temporal resolution (daily intervals). They identified vehicle-distance travelled and some geometric characteristics as significant crash predictors, as well as variables that are more sensitive temporally, such as wet pavement and average speed.

In a recent study by Liu and Sharma (2018) examining injury crashes, both spatial and temporal effects were found to be important in approximately the same magnitude across spatial, temporal and spatio-temporal structures. Crash frequencies showed significant spatial, but not temporal, autocorrelations. Similarly, Li et al. (2019) mentioned the issues of spatio-temporal instability in crash data, apart from the typical unobserved heterogeneity that is inherent to data collection. They calibrated Bayesian random parameters models (with both structured and unstructured spatio-temporal effects) which show that daily VMT, proportion of males, unemployment rate and education increase crash frequency and are normally distributed across crash severities for crashes related to substance consumption.

3.4. Empirical Bayes and Full Bayes analyses

For several decades, Empirical Bayes (EB) methods have been implemented in road safety by contrasting crash counts of a road segment with sites with comparable true crash risk, which are the reference population. EB estimates have displayed better predictive capabilities than naive before-after comparisons and eliminate regression-to-the-mean issues (Hauer, 1997; Geurts and Wets, 2003). EB methods have also been used in a before-after study as a complement to a before-after study with a comparison group in order to obtain more reliable CMFs (Lee et al., 2017b).

In the same direction, Full Bayes (FB) extended models can be used to account for heterogeneity due to unobserved road geometric characteristics, traffic characteristics, environmental factors and driver behavior (El-Basyouny and Sayed, 2011; Ma et al., 2017). The FB approach has also been shown to be more reliable empirically in hotspot identification compared to EB (Huang et al., 2009). The advantage of FB over EB is that it takes into account that model parameter estimates include an amount of uncertainty and can provide a quantitative measure of said uncertainty (Miaou and Lord, 2003). The FB approach is the basis of several recent developments discussed in the following.

3.5. Spatial spillover effects

An emerging aspect of spatial analyses is the examination of spatial spillover effects. Spatial spillover effects are the effects that exogenous observed variables have on the dependent variable at both the target and the neighboring locations. Spatial spillover effects differ from spatial autocorrelation (or error correlation) effects, which entail unobserved exogenous variables at one location affecting dependent variables at the targeted and neighboring locations (Narayanamoorthy et al., 2013; Cai et al., 2016; Lee et al., 2018b).

Past studies have utilized spatial lag regression models in an effort to capture spillover effects. LaScala et al. (2000) and Quddus (2008) converted count variables into continuous approximations for their analyses. They then used an explanatory variable in the expression of a spatially lagged dependent variable to form a spatial autoregressive (SAR or spatial lag) model.

Cai et al. (2016) included spatial spillover effects in the examination of pedestrian and bicyclist crashes. Via the application of dual-state GLMs, it was determined that taking observed spatial spillover effects into consideration consistently results in models with better performance. The zero-inflated negative binomial models were found to have the best fit for pedestrian and bicycle crashes, though unobserved spatial autocorrelation effects were not simultaneously examined in the study. To evaluate the impacts of significant factors, marginal effects were calculated as well.

In addition, Wen et al. (2019) aimed to capture both spatial autocorrelation and spillover effects using a hybrid model. The hybrid model featured the traditional Poisson-lognormal basis. The authors expressed spatial autocorrelation effects as the CAR prior and spillover effects as exogenous variables of neighboring road segments. Homogeneous highway segments were used for the analysis. Both spatial autocorrelation and spatial spillover effects were found to be significantly correlated with the respective crash data. This hybrid approach yielded better estimates than both of its individual components, with coefficients that showed lower standard deviations. The authors suggest that accounting for spatial heterogeneity may further refine the model, but a much more complex structure would be required.

3.6. Alternative Prior Distributions

Apart from the widely used CAR model, other approaches can be implemented to account for spatial effects in models through different prior distributions. Mitra (2009) adopted a hierarchical Full Bayes spatial model to investigate the presence of possible influences of spatially structured factors on injury crashes at intersections. The reasoning behind such an approach is an attempt to capture both heterogeneity from spatial effects (implying a common global structure) and excess heterogeneity (originating from spatially unstructured effects). The first level of the hierarchy is a Poisson-lognormal specification. The Poisson rate then included the typical intercept and covariates, and also two separate effect terms, spatially structured and unstructured, to capture spatial and excess heterogeneity respectively. The spatially structured effects used a multivariate normal joint prior. Results indicated considerable spatial autocorrelation effects at the intersection level, while a comparison with aspatial Negative Binomial regression revealed similar coefficient estimates but increased model precision.

A similar jointly-specified approach was adopted by Aguero-Valverde (2014), to determine the effective range after which no lingering correlation is found at the road segment level. The Poisson rate function featured one parameter for heterogeneity among segments, using a normal distribution, and one for spatially correlated random effects per segment, using a jointly specified prior. Additionally, a temporal indicator for the evolution of crashes over the years was included in covariate values and predicted crash counts. Ultimately, the joint prior model outperformed a random-effects model and a CAR prior model and the effective range was determined (at about 168 m). The author states that the manner in which distance is measured (e.g. Euclidean distance, ground route distance or any other way) also has an impact on model predictions.

A different form is the Full Bayes Multiple Membership (MM) spatial model proposed by El-Basyouny and Sayed (2009). The approach includes similar spatially structured and unstructured effects as the previous studies. In addition, MM models consider each site as a member of a higher-level unit that contains its nearest neighbors. They also include a parameter measuring the strength of association between structured and unstructured spatial effects. The authors further extended MM models by adding a component to allow for variance in the values of crash risks and characteristics between mutually exclusive
corridors. When tested, the extended MM model slightly outperformed a CAR model, which in turn outperformed a basic MM model, though the overall DIC metrics showed quite close values.
Xu et al. (2017a) introduced another methodological alternative in the form of a very detailed Bayesian spatially varying coefficients approach, based on the hierarchy proposed by Huang and Abdel-Aty (2010). The process again started with a Poisson function in a Full Bayesian framework, and the parameters were modelled using a CAR prior. The innovation of the study lay in the utilization of a single set of random effects ranging from purely unstructured to purely spatially structured effects; this simultaneous process is considered superior by the authors, although it features a quite complicated mathematical structure.

3.7. Machine learning & Deep learning approaches

Given their popularity as a powerful, data-driven family of prediction tools, machine learning (ML) methods have been implemented for spatial and spatio-temporal road safety analyses. Indicative methods used in road safety spatial analyses are outlined below. ML methods can operate with increased degrees of freedom without requiring traditional assumptions as regression models do, and are more resilient to data outliers. They are methods typically used in conjunction with big data in transport and road safety.
Random forest (RF) models are collections of numerous superimposed decision trees that emerge from a selection and validation process, as described in Chang and Wang (2006). RF models have been used in road safety studies by researchers. For instance, Jiang et al. (2016) investigated the feasibility of RF models for ranking hot-zones on a TAZ level and identifying critical parameters for crash occurrence when utilizing big data. Road network distribution (density) and socio-economic features such as school enrollment and car ownership percentages were found to be the most statistically significant variables for crash occurrence. The study concludes that RF models provide classification with about 80% accuracy in hotspot identification.
Support Vector Machines (SVMs) have been successfully implemented as alternatives to traditional statistical-regression modelling. In a relevant study, SVMs were employed together with a coactive neuro-fuzzy inference system (CANFIS) algorithm (Effati et al., 2015). SVMs were found to be considerably better performing when examining crash injury severity, especially when utilizing a radial basis kernel function (RBF). The researchers propose the enhancement of spatial analyses with machine learning algorithms as the key to unveiling significant factors affecting crash injury severity while accounting for spatial correlation and heterogeneity effects. The study of Dong et al. (2015) implemented SVMs as a tool for handling big and complex data structures. They examined zone-level crash prediction while taking spatial autocorrelation into account, and SVMs were found to perform better when including a spatial weight feature with an RBF kernel as opposed to SVM models. SVMs have also been used in conjunction with Bayesian methods, though, to the authors' knowledge, not yet in a spatial analysis framework; for instance, Wang et al. (2019) used Bayesian logistic regression to detect factors contributing to highway ramp crashes.
Recent technological progress makes neural network implementation much more feasible than in past years. Bao et al. (2019) utilized a deep learning approach for short-term crash risk prediction on an urban level. They augmented a convolutional neural network (CNN) with a long short-term memory network in order to examine variables that varied spatially, temporally or spatio-temporally, proposed by earlier research for traffic speed and congestion prediction (Ma et al., 2015a, 2015b). Weekly, daily and hourly prediction models with varying spatial grids were produced as a result. The authors mention that prediction performance of the proposed model decreases as the spatiotemporal prediction outcome resolution increases towards the hourly level. It is noteworthy that machine learning models exhibited better performance on the daily level, while benchmark econometric models generally performed better on the weekly level, suggesting that neither approach is clearly superior. Another interesting application is described in Zhu et al. (2018); the CNNs developed in the study take into account spatio-temporal network and traffic structure. However, they are used for traffic incident detection/identification, and not road safety prediction or causation analysis.
Cai et al. (2019b) explored that research direction by applying CNNs for road safety prediction by collecting and utilizing high-resolution data: 3 mile × 3 mile grids with crash counts and data, each grid containing 100 × 100 cells with width and height of 158.4 feet, examined in 17 layers of data matrices. By feeding data of a higher resolution into a CNN, the authors allowed variables to fluctuate across locations more freely, thus increasing the model accuracy. It was stated that the hierarchical structure enables better understanding of the circumstances of crash occurrence. While the authors demonstrated a viable approach for crash prediction, it is obvious that extra effort is required for the creation of this high-resolution grid and the complementing database. Some variables might be readily available for calculation in high resolution or inferred via the existing road geometry (such as segment lengths), while others may be harder to obtain in case of missing data (such as land uses). Approaches such as CNNs might require custom, tailor-made data collection frameworks in order to provide their full potential, as the authors suggest. Furthermore, no specific framework is established for assigning the values of required hyperparameters during the CNN training phase.

3.8. Kernel Density Estimation

Another crash and hotspot analysis method is kernel density estimation (KDE), which allows the generalization of incident locations to an entire area. It should be noted that this is not a direct analytical method, but rather an interpolation technique (Anderson, 2007) mainly used for the identification of clustering patterns of traffic collisions. KDE can be advantageous in predicting the spread of crash risks, though the kernel radius has been a matter of debate in several scientific fields (e.g. Raykar and Duraiswami, 2006; Hart and Zandbergen, 2014). It appears that bandwidth determination influences the outcome of the hotspots (Fotheringham et al., 2000; Anderson, 2009; Loo and Anderson, 2015). Furthermore, the fact that KDE treats discrete events as a continuous area effect can be presented as a limitation (Anderson, 2009). Erdogan et al. (2008) conducted an analysis of hotspot clusters in a province of Turkey and utilized KDE together with a repeatability analysis of hotspot crashes for a decade. The authors reported considerable overlap of the outcomes, though KDE determined fewer hotspot locations overall. An interesting approach by Mountrakis and Gunson (2009) investigated the development of KDE spatially (determining varying density peaks among roads) and temporally (determining an exponentially increasing trend with annual periodicity and a seasonal cyclic component) for animal-related crash hotspots in Vermont, US.
Kernels are projected over 2-D spaces, while road crashes usually occur in a 1-D linear area, which most road environments approach, as Xie and Yan (2008) note. In order to overcome this discrepancy, KDE has been expanded to network KDE approaches, in which the network is represented as fundamental units of equal network length (termed lixels). Xie and Yan (2008) investigated this method and how fundamental lengths and regular kernel bandwidth affect its performance for road crash prediction. They conclude that network KDE describes crash densities and network borders more precisely than regular KDE, and that lixel length appears more important than kernel function selection. However, Loo et al. (2011) implemented network KDE in areas of varying land use and found that kernel bandwidth critically affects the spatial distribution of resulting density estimates. Furthermore, wider bandwidths appeared to be more appropriate for non-urban areas where crash density is lower.
Similarly, Mohaymany et al. (2013) applied network KDE to a rural

road in order to determine hazardous segments; apart from static spatial autocorrelation of crashes they also investigated its temporal evolution through a three-year period. Bíl et al. (2013) also used KDE in a 1-D area by separating the network into sections. They explored an alternative avenue for better refining KDE results by providing a method to test their statistical significance. The proposed method utilized relative spatial positions of crashes and roadway length to calculate kernel strength, which allows detection and prioritization of the most hazardous locations, which included classifying clusters with values above the 95th percentile of the kernel density function as hazardous.

4. Vulnerable Road Users

In road safety, vulnerable road users (VRUs) include pedestrians, bicyclists and other road users who are often children, the elderly, or people with impairments and disabilities. Due to their vulnerability to injuries or fatalities compared to vehicle users, VRUs have increased safety needs. The use of spatial analyses, or approaches in a spatial context, to examine aspects of road safety concerning VRUs warrants specific examination. A notable example is the study of Tasic et al. (2017), which investigated crashes involving vehicles and VRUs by using models that accounted for spatial correlation effects. Data were aggregated on a CT level for a large array of about a hundred variables for vehicle-only, pedestrian and bicycle crashes. The data were analyzed using an extension of GLMs, Generalized Additive Models (GAMs), which included a two-dimensional smooth function to account for spatial correlation. A remarkable finding was that the expected pedestrian or bicyclist crashes increased less than proportionally with the exposure variables of vehicle, pedestrian or bicyclist trips, confirming the safety-in-numbers effect on a macroscopic level while accounting for spatial correlation effects.
Analyzing pedestrians' walking exposure and crashes in an integrated manner was proposed in a dedicated study on the MSA level (Lee et al., 2019a). For estimating exposure, multiple linear regression models were calibrated, followed by a Poisson-lognormal regression model for fatality estimation using the estimated exposure as input. Walking hours was determined as the best performing exposure variable. The proposed integrated model outperformed the non-integrated ones. Spatial correlation of trips was not investigated in the study, however, and pedestrian safety features were not examined either. VRU exposure, in the form of trips, has also been estimated at a macroscopic level in an integrated manner. These trip numbers were used to calibrate VRU crash prediction models in a study across 23 Metropolitan areas, and it was found that estimated exposure (VRU trips) led to models with calibrated performance compared to observed exposure for both pedestrians and cyclists (Lee et al., 2018c).
Pedestrian crash hotspots have been examined through spatial processing of their respective costs using big data from multiple sources such as taxi trips and social media (Xie et al., 2017) by employing a grid structure divided in higher resolution cells, similar to Cai et al. (2019b). Crash costs were assigned to cells using a kernel density estimation function, and sites were identified using tobit models with potential safety improvements (PSIs) and ranked as potential hotspots based on the potential of pedestrian crash cost reduction. The authors claim that their method can be transferred to less populated regions by adjusting kernel bandwidths.
Pedestrian crashes do not necessarily occur in the zone of residence of the pedestrians involved; Lee et al. (2015b) sought to identify zones where pedestrian crashes occur, and zones where pedestrian crashes originated from. Using different exposure variables, a variation of a Bayesian lognormal model with Poisson structure was applied. The occurrence of crashes with pedestrian involvement was revealed to be significantly affected by more location-related factors, while pedestrian origin was revealed to be significantly affected by more demographic-related factors. A similar concept of investigating both ZIP codes of crash locations for bicyclists and the number of crash-involved bicyclists in their ZIP of residence was explored in a study by Lee and Abdel-Aty (2018). Bayesian Poisson lognormal CAR models were used to examine bicycle crashes, and the contributing factors were not identical in each case. For instance, increases in the number of schools per mi² were only found to lead to increases in bicycle crashes in the crash location ZIP. Conversely, lower income areas were found to be a contributing factor overall through the significance of many related variables. Again, PSI was used to identify VRU crash hotspots in both studies.
A noteworthy finding is that of Siddiqui et al. (2012), who produced Bayesian models for pedestrian and bicyclist crashes at the TAZ level, noting the necessity of accounting for spatial correlation while examining VRU crashes at the macroscopic level, which is also corroborated by Guo et al. (2017). In addition, spatial spillover effects have also been examined in a VRU context, as mentioned before (Cai et al., 2016).
Apart from methodological and modelling approaches, the influence of parameters for pedestrian crashes has also been examined in high resolution. Specifically, the effects of weather conditions have been investigated using GIS within a spatial context (Zhai et al., 2019b). Binary and mixed logit models were used in the study, in a basic form and in a more advanced form including terms of interaction between weather conditions and risk factor variables. Both high temperatures and precipitation were found to be associated with pedestrian crashes of increased severity. Hotter weather and the presence of rain were also found to exacerbate the effect of risk factors, such as jaywalking or unsafe driver behavior.

5. Discussion

5.1. Findings from reviewed studies

The examination of the studies that was carried out in this research has led to some noteworthy conclusions for spatial analyses in road safety. It appears that a multitude of different approaches and modelling methodologies has been adopted in the literature, with a trend towards advanced Bayesian models and methods in the past decade. This has led to the development of powerful tools that provide accurate predictions for crash counts per area with increasingly complex model configurations. However, these approaches also lead to a lack of a common established methodology or framework to compare results of spatial analyses. Additionally, this finding does not imply that more traditional functional/econometric methods, such as GLMs or GWR, are no longer useful, at least for benchmarking purposes. Functional models appear to be more straightforward in their interpretation and assessment of results. In both cases, results of spatial studies have also been reported to have limited transferability.
Recently, machine learning approaches have come to challenge the dominance of Bayesian models by being implemented alongside or instead of them. It should be noted that these are mostly data-driven approaches, which have also been reported as containing inherently biased samples, especially when examining big data (e.g. Bao et al., 2017, 2019). While the aforementioned transferability issues are mostly solved with machine learning methods, there are often difficulties in the interpretation of results: a commonly cited example is the hidden layers of neural networks and the meaning of each contributing factor. Approaches such as SVM are subpar in determining the significance of revealed patterns in the data they examine or the utility each variable offers in prediction tasks.
Further on the results of spatial studies, another important finding is the revelation of sensitivity of hotspot locations. Researchers have shown that hotspots are radically different across users of different vehicles and ages, and that hotspots display significant variation throughout the time of day. It can be reasonably surmised that many elements that are introduced to an analysis radically change the hotspot map. Naturally, the employed methodologies also affect the final outcome of spatial studies. Researchers should be vigilant and try to

convert unobserved factors into observed ones, in order to receive more substantial and precise hotspot maps.
Though studies have been published internationally, spatial analyses have been more common in more modernized and developed countries (especially the USA), while developing countries are considerably less represented. The use of different sizes of spatial units as a basis for spatial analyses has been examined extensively, and it appears that apart from information and data availability, spatial areas of each size have different advantages and disadvantages. Several studies include exposure parameters in order to establish a common baseline for crash risk comparisons between models (Imprialou et al., 2016). When exposure parameters such as road length, AADT and vehicle distance travelled are examined, they are found to increase crash risk overall, as expected; however, there are particular cases where these results might not apply or even be reversed (e.g. Dong et al., 2014).
It has been demonstrated that the parametrization of the spatial correlation term, namely, its inclusion as a variable in models, can aid in situations where data is scarce or difficult to obtain. Its use can be further expanded, however, as a complementary feature to even variable-rich models, in order to explain parts of variation in the data.
That being said, data availability remains a critical issue, and lack of consistent data across a respectable duration of time can be a critical obstacle in conducting spatial and spatio-temporal analysis. Spatial analyses in road safety appear data-driven most of the time, stemming from the drive of researchers to prove or test a concept. There are variables that have not been extensively tested due to lack of data, for instance pavement condition. Similarly, there are study areas that merit more attention, such as extensive urban network environments formed by roads of lower categories.
Traffic speed does not appear to be as frequently used as in past decades, though speed limits are taken into account as network characteristics, rather than traffic characteristics. Moreover, it can be observed that certain geometrical features seem to be used less frequently, such as road gradient, curvature and lane width. As an indication, the 'gradient' column in Table 2 was blank at the end of the reviewing process and was thus removed. This decline in use can be attributed to missing data for many study areas, or to difficulty in data acquisition. Another reason may be the lower prioritization of geometrical features by researchers: studies often seek to include crash data, traffic data, socio-economic data, demographic data and land-use data. Therefore, traditional road geometry data is receiving less attention in comparison to past decades.

5.2. Future research directions

This section outlines research directions that do not appear to be adequately investigated in the present literature of road safety spatial analyses and can constitute meaningful future research endeavors.
An important aspect that does not appear to be adequately investigated is that of micro-level road safety and event analysis with spatial modelling considerations. A small number of studies has been found to explore concepts such as automated conflict extraction via trajectory analyses using automated data (Saunier and Sayed, 2007; St-Aubin et al., 2015). The inclusion of spatial effects in such design concepts would be very interesting for the determination of the influence of spatial effects at a small-unit level.
While crash counts have been examined extensively, their distributions over several categories have received less focus within a spatial context. The recent fractional approach by Lee et al. (2018b) that examines crash distribution across vehicle types is an example towards that direction, as is the examination per crash type proposed by Aguero-Valverde et al. (2016). Nonetheless, more research is needed on the manner in which various categories of crashes occur across study areas. The distribution of exact crash proportions and the factors that affect them needs to be researched within a spatial context. For instance, injury severity distributions have not been investigated as frequently as crash counts; rather, they have mostly been used as a categorization mechanism. By jointly examining crash severities and occurrence while taking spatial effects into account, more informative results can be reached for practitioners. Similar potential exists for studies aiming to examine casualty rates. In addition to the previous, it would be interesting to spatially analyze other road safety indicators, such as those related to driver behavior: conflicts, near-misses, harsh events and traffic law violations. These can aid in determining high crash concentrations and locations of poor road safety performance (hotspots).
Hotspot detection, or problematic region identification in greater scales, is a crucial advantage typically provided by spatial analyses for locating problems. Therefore, the determination of the spatial impacts of implemented road safety measures would also be very beneficial. Before-after studies within a spatial context (or even a spatiotemporal context, if a dedicated data collection scheme can be set) would allow observation of crash reductions due to targeted observations from the initial analyses. Such study designs would also allow the examination of the variation of spatial autocorrelation of events (and whether any exists) before and after interventions, and would offer interesting insights into any possible crash mitigation phenomena. Another promising research direction is the transfer and application of more focused spatial analysis methods for the examination of segments of a contiguous road network, similar to network KDE approaches, so that segments are assessed instead of areal units, but in the form of an extended and complex road network, as an expansion of the segment analysis approaches mentioned in section 2.1.
Some spatial issues, while proven to exist, need to be further analyzed to increase comprehensiveness. The specific effective range of spatial correlation among analysis units, as studied by Aguero-Valverde (2014) and Wang et al. (2016b), needs to be expanded upon. Again, there is a need for results for different road environments, road users, crash types and injury severities in order to obtain measures of the extent that spatial dependency needs to be accounted for. In addition, different countries are expected to produce varying results, possibly due to differences in driving culture or other unobserved factors.
Another direction that would increase the low transferability of results of spatial analysis is the creation of common frameworks for the two famous problems (boundary and MAUP), preferably on the international scale. The establishment of an acceptable boundary value in order to address boundary issues under different conditions, as suggested by Zhai et al. (2018b), is such an example. More effort needs to be devoted to understanding the impacts of both the boundary issue and MAUP across areal unit sizes as well, especially if different contributor variables are found in boundaries. Similarly, methods to obtain more homogeneous road segments or areal units need to be developed, in an effort to reduce heterogeneity. They would have to be comprehensible and straightforward in order to be more widely accepted and applied by practitioners worldwide.
Yet another finding from the reviewed studies is that built environment is not very strictly defined in the sense that every study selects some of its characteristics to examine. In a dedicated study, Ukkusuri et al. (2012) include in the term built environment factors such as land use patterns, population characteristics such as age profiles and professional driver percentages, road infrastructure and transit characteristics. This review has not exhausted all built environment parameters, and the investigation of more specific variables such as the presence of refuge islands or crosswalks or proximity to health or education buildings merits additional investigation, and can be a future direction of targeted road safety spatial analyses.
These endeavors can all be further augmented by new technological developments, such as transport applications of big data, cloud computing and connected & autonomous vehicle technologies that can be used to provide a more connected spatial environment (e.g. as in Bao et al., 2018). For instance, it has been found that smartphone technology sampling can provide a vast amount of driving data in real

conditions, including risk factors such as distraction and speeding (Papadimitriou et al., 2018), while achieving a seamless transition from data collection to data analysis (Yannis et al., 2017). This framework could enable not only a collection of a wealth of real-time information across several spatial unit levels, but also allow for easier calibration of spatial models without the doubt of transferability that is often present in spatial analyses.

Acknowledgements

This research is co-financed by Greece and the European Union (European Social Fund - ESF) through the Operational Programme «Human Resources Development, Education and Lifelong Learning» in the context of the project “Strengthening Human Resources Research Potential via Doctorate Research” (MIS-5000432), implemented by the State Scholarships Foundation (IKY).
The authors would also like to thank three anonymous reviewers for providing valuable suggestions to increase the completeness of this study.

References

Abdel-Aty, M.A., Lee, J., Eluru, N., Cai, Q., Al Amili, S., Alarifi, S., 2016. Enhancing and generalizing the two-level screening approach incorporating the highway safety manual (HSM) methods, Phase 2. University of Central Florida, Department of Civil, Environmental and Construction Engineering.
Abdel-Aty, M., Lee, J., Siddiqui, C., Choi, K., 2013. Geographical unit based analysis in the context of transportation safety planning. Transportation Research Part A: Policy and Practice 49, 62–75.
Abdel-Aty, M., Siddiqui, C., Huang, H., Wang, X., 2011. Integrating trip and roadway characteristics to manage safety in traffic analysis zones. Transportation Research Record 2213 (1), 20–28.
Abdel-Aty, M., Wang, X., 2006. Crash estimation at signalized intersections along corridors: analyzing spatial effect and identifying significant factors. Transportation Research Record: Journal of the Transportation Research Board 1953, 98–111.
Aguero-Valverde, J., Wu, K.F., Donnell, E.T., 2016. A multivariate spatial crash frequency model for identifying sites with promise based on crash types. Accident Analysis & Prevention 87, 8–16.
Aguero-Valverde, J., 2014. Direct spatial correlation in crash frequency models: estimation of the effective range. Journal of Transportation Safety & Security 6 (1), 21–33.
Aguero-Valverde, J., 2013. Multivariate spatial models of excess crash frequency at area level: Case of Costa Rica. Accident Analysis & Prevention 59, 365–373.
Aguero-Valverde, J., Jovanis, P.P., 2010. Spatial correlation in multilevel crash frequency models: Effects of different neighboring structures. Transportation Research Record 2165 (1), 21–32.
Aguero-Valverde, J., Jovanis, P.P., 2008. Analysis of road crash frequency with spatial models. Transportation Research Record 2061 (1), 55–63.
Aguero-Valverde, J., Jovanis, P.P., 2006. Spatial analysis of fatal and injury crashes in Pennsylvania. Accident Analysis & Prevention 38 (3), 618–625.
Alarifi, S.A., Abdel-Aty, M.A., Lee, J., Wang, X., 2018. Exploring the effect of different neighboring structures on spatial hierarchical joint crash frequency models. Transportation Research Record 2672 (38), 210–222.
Alarifi, S.A., Abdel-Aty, M.A., Lee, J., Park, J., 2017. Crash modeling for intersections and segments along corridors: a Bayesian multilevel joint model with random parameters. Analytic Methods in Accident Research 16, 48–59.
Amoh-Gyimah, R., Saberi, M., Sarvi, M., 2017. The effect of variations in spatial units on unobserved heterogeneity in macroscopic crash models. Analytic Methods in Accident Research 13, 28–51.
Anderson, T.K., 2009. Kernel density estimation and K-means clustering to profile road accident hotspots. Accident Analysis & Prevention 41 (3), 359–364.
Research 3, 28–43.
Besag, J., York, J., Mollié, A., 1991. Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics 43 (1), 1–20.
Bíl, M., Andrášik, R., Janoška, Z., 2013. Identification of hazardous road locations of traffic accidents by means of kernel density estimation and cluster significance evaluation. Accident Analysis & Prevention 55, 265–273.
Bivand, R., Müller, W.G., Reder, M., 2009. Power calculations for global and local Moran’s I. Computational Statistics & Data Analysis 53 (8), 2859–2872.
Bu, L., Wang, F., Gong, H., 2018. Spatial and factor analysis of vehicle crashes in Mississippi state. Natural Hazards 1–22.
Cai, Q., Abdel-Aty, M., Sun, Y., Lee, J., Yuan, J., 2019a. Applying a deep learning approach for transportation safety planning by using high-resolution transportation and land use data. Transportation Research Part A: Policy and Practice 127, 71–85.
Cai, Q., Abdel-Aty, M., Lee, J., Huang, H., 2019b. Integrating macro- and micro-level safety analyses: a Bayesian approach incorporating spatial interaction. Transportmetrica A: Transport Science 15 (2), 285–306. https://doi.org/10.1080/23249935.2018.1471752.
Cai, Q., Abdel-Aty, M., Lee, J., Wang, L., Wang, X., 2018. Developing a grouped random parameters multivariate spatial model to explore zonal effects for segment and intersection crash modeling. Analytic Methods in Accident Research 19, 1–15.
Cai, Q., Abdel-Aty, M., Lee, J., Eluru, N., 2017a. Comparative analysis of zonal systems for macro-level crash modeling. Journal of Safety Research 61, 157–166.
Cai, Q., Abdel-Aty, M., Lee, J., 2017b. Macro-level vulnerable road users crash analysis: a Bayesian joint modeling approach of frequency and proportion. Accident Analysis & Prevention 107, 11–19.
Cai, Q., Lee, J., Eluru, N., Abdel-Aty, M., 2016. Macro-level pedestrian and bicycle crash analysis: Incorporating spatial spillover effects in dual state count models. Accident Analysis & Prevention 93, 14–22.
Chang, L.Y., Wang, H.W., 2006. Analysis of traffic injury severity: An application of non-parametric classification tree techniques. Accident Analysis & Prevention 38 (5), 1019–1027.
Chiou, Y.C., Fu, C., Chih-Wei, H., 2014. Incorporating spatial dependence in simultaneously modeling crash frequency and severity. Analytic Methods in Accident Research 2, 1–11.
Chiou, Y.C., Fu, C., 2013. Modeling crash frequency and severity using multinomial-generalized Poisson model with error components. Accident Analysis & Prevention 50, 73–82.
Chung, W., Abdel-Aty, M., Lee, J., 2018. Spatial analysis of the effective coverage of land-based weather stations for traffic crashes. Applied Geography 90, 17–27.
Cottrill, C.D., Thakuriah, P.V., 2010. Evaluating pedestrian crashes in areas with high low-income or minority populations. Accident Analysis & Prevention 42 (6), 1718–1728.
Cui, G., Wang, X., Kwon, D.W., 2015. A framework of boundary collision data aggregation into neighbourhoods. Accident Analysis & Prevention 83, 1–17.
Delmelle, E., Thill, J.C., 2008. Urban bicyclists: spatial analysis of adult and youth traffic hazard intensity. Transportation Research Record: Journal of the Transportation Research Board 2074, 31–39.
Dong, N., Huang, H., Lee, J., Gao, M., Abdel-Aty, M., 2016. Macroscopic hotspots identification: a Bayesian spatio-temporal interaction approach. Accident Analysis & Prevention 92, 256–264.
Dong, N., Huang, H., Zheng, L., 2015. Support vector machine in crash prediction at the level of traffic analysis zones: assessing the spatial proximity effects. Accident Analysis & Prevention 82, 192–198.
Dong, N., Huang, H., Xu, P., Ding, Z., Wang, D., 2014. Evaluating spatial-proximity structures in crash prediction models at the level of traffic analysis zones. Transportation Research Record: Journal of the Transportation Research Board 2432, 46–52.
Effati, M., Thill, J.C., Shabani, S., 2015. Geospatial and machine learning techniques for wicked social science problems: analysis of crash severity on a regional highway corridor. Journal of Geographical Systems 17 (2), 107–135.
El-Basyouny, K., Sayed, T., 2011. A full Bayes multivariate intervention model with random parameters among matched pairs for before–after safety evaluation. Accident Analysis & Prevention 43 (1), 87–94.
El-Basyouny, K., Sayed, T., 2009. Urban arterial accident prediction models with spatial effects. Transportation Research Record: Journal of the Transportation Research Board 2102, 27–33.
Elvik, R., Vaa, T., Hoye, A., Sorensen, M. (Eds.), 2009. The handbook of road safety
Anderson, T., 2007. Comparison of spatial methods for measuring road accident ‘hot- measures. Emerald Group Publishing.
spots’: a case study of London. Journal of Maps 3 (1), 55–63. Erdogan, S., 2009. Explorative spatial analysis of traffic accident statistics and road
Atubi, A.O., 2012. Determinants of road traffic accident occurrences in Lagos State: Some mortality among the provinces of Turkey. Journal of safety research 40 (5), 341–351.
lessons for Nigeria. International Journal of Humanities and Social Science 2 (6), Erdogan, S., Yilmaz, I., Baybura, T., Gullu, M., 2008. Geographical information systems
252–259. aided traffic accident analysis system case study: city of Afyonkarahisar. Accident
Bao, J., Liu, P., Ukkusuri, S.V., 2019. A spatiotemporal deep learning approach for city- Analysis & Prevention 40 (1), 174–181.
wide short-term crash risk prediction with multi-source data. Accident Analysis & Flahaut, B., 2004. Impact of infrastructure and local environment on road unsafety:
Prevention 122, 239–254. Logistic modeling with spatial autocorrelation. Accident Analysis & Prevention 36
Bao, J., Liu, P., Qin, X., Zhou, H., 2018. Understanding the effects of trip patterns on (6), 1055–1066.
spatially aggregated crashes with large-scale taxi GPS data. Accident Analysis & Flask, T., Schneider IV, W., 2013. A Bayesian analysis of multi-level spatial correlation in
Prevention 120, 281–294. single vehicle motorcycle crashes in Ohio. Safety science 53, 1–10.
Bao, J., Liu, P., Yu, H., Xu, C., 2017. Incorporating twitter-based human activity in- Fotheringham, A.S., Brunsdon, C., Charlton, M., 2002. Geographically weighted regres-
formation in spatial analysis of crashes in urban areas. Accident Analysis & sion. John Wiley & Sons, Limited, West Atrium, pp. 159–183.
Prevention 106, 358–369. Fotheringham, S., Brunsdon, C., Charlton, M., 2000. Quantitative Geography:
Barua, S., El-Basyouny, K., Islam, M.T., 2016. Multivariate random parameters collision Perspectives on Spatial Data Analysis. Sage, London.
count data models with spatial heterogeneity. Analytic methods in accident research Fotheringham, S., Wegener, M., 1999. Spatial models and GIS: New and potential models
9, 1–15. Vol. 7 CRC press.
Barua, S., El-Basyouny, K., Islam, M.T., 2014. A full Bayesian multivariate count data Gomes, M.J.T.L., Cunto, F., da Silva, A.R., 2017. Geographically weighted negative bi-
model of collision severity with spatial correlation. Analytic Methods in Accident nomial regression applied to zonal level safety performance models. Accident


Analysis & Prevention 106, 254–261.
Geurts, K., Wets, G., 2003. Black spot analysis methods: Literature review. Flemish Research Center for Traffic Safety, Diepenbeek, Belgium.
Guo, Q., Xu, P., Pei, X., Wong, S.C., Yao, D., 2017. The effect of road network patterns on pedestrian safety: A zone-based Bayesian spatial modeling approach. Accident Analysis & Prevention 99, 114–124.
Guo, F., Wang, X., Abdel-Aty, M.A., 2010. Modeling signalized intersection safety with corridor-level spatial correlations. Accident Analysis & Prevention 42 (1), 84–92.
Hadayeghi, A., Shalaby, A.S., Persaud, B.N., 2010. Development of planning level transportation safety tools using Geographically Weighted Poisson Regression. Accident Analysis & Prevention 42 (2), 676–688.
Hadayeghi, A., Shalaby, A., Persaud, B., 2003. Macrolevel accident prediction models for evaluating safety of urban transportation systems. Transportation Research Record: Journal of the Transportation Research Board 1840, 87–95.
Han, C., Huang, H., Lee, J., Wang, J., 2018. Investigating varying effect of road-level factors on crash frequency across regions: a Bayesian hierarchical random parameter modeling approach. Analytic methods in accident research 20, 81–91.
Hart, T., Zandbergen, P., 2014. Kernel density estimation and hotspot mapping: Examining the influence of interpolation method, grid cell size, and bandwidth on crime forecasting. Policing: An International Journal of Police Strategies & Management 37 (2), 305–323.
Hauer, E., 1997. Observational before/after studies in road safety. Estimating the effect of highway and traffic engineering measures on road safety.
Hauer, E., Ng, J.C., Lovell, J., 1988. Estimation of safety at signalized intersections (with discussion and closure) (No. 1185).
Huang, H., Zhou, H., Wang, J., Chang, F., Ma, M., 2017. A multivariate spatial model of crash frequency by transportation modes for urban intersections. Analytic methods in accident research 14, 10–21.
Huang, H., Song, B., Xu, P., Zeng, Q., Lee, J., Abdel-Aty, M., 2016. Macro and micro models for zonal crash prediction with application in hot zones identification. Journal of Transport Geography 54, 248–256.
Huang, H., Abdel-Aty, M., 2010. Multilevel data and Bayesian analysis in traffic safety. Accident Analysis & Prevention 42 (6), 1556–1565.
Huang, H., Abdel-Aty, M., Darwiche, A., 2010. County-level crash risk analysis in Florida: Bayesian spatial modeling. Transportation Research Record: Journal of the Transportation Research Board 2148, 27–37.
Huang, H., Chin, H., Haque, M., 2009. Empirical evaluation of alternative approaches in identifying crash hot spots: naive ranking, empirical Bayes, and full Bayes methods. Transportation Research Record: Journal of the Transportation Research Board 2103, 32–41.
Imprialou, M.I.M., Quddus, M., Pitfield, D.E., Lord, D., 2016. Re-visiting crash–speed relationships: A new perspective in crash modelling. Accident Analysis & Prevention 86, 173–185.
Jiang, X., Abdel-Aty, M., Hu, J., Lee, J., 2016. Investigating macro-level hotzone identification and variable importance using big data: A random forest models approach. Neurocomputing 181, 53–63.
Kim, K., Brunner, I.M., Yamashita, E.Y., 2006. Influence of land use, population, employment, and economic activity on accidents. Transportation research record 1953 (1), 56–64.
Ladron de Guevara, F., Washington, S., Oh, J., 2004. Forecasting crashes at the planning level: simultaneous negative binomial crash model applied in Tucson, Arizona. Transportation Research Record: Journal of the Transportation Research Board 1897, 191–199.
LaScala, E.A., Gruenewald, P.J., Johnson, F.W., 2004. An ecological study of the locations of schools and child pedestrian injury collisions. Accident Analysis & Prevention 36 (4), 569–576.
LaScala, E.A., Johnson, F.W., Gruenewald, P.J., 2001. Neighborhood characteristics of alcohol-related pedestrian injury collisions: a geostatistical analysis. Prevention Science 2 (2), 123–134.
LaScala, E.A., Gerber, D., Gruenewald, P.J., 2000. Demographic and environmental correlates of pedestrian injury collisions: a spatial analysis. Accident Analysis & Prevention 32 (5), 651–658.
Lee, J., Abdel-Aty, M., 2018. Macro-level analysis of bicycle safety: Focusing on the characteristics of both crash location and residence. International journal of sustainable transportation 12 (8), 553–560.
Lee, J., Abdel-Aty, M., Huang, H., Cai, Q., 2019a. Transportation Safety Planning Approach for Pedestrians: An Integrated Framework of Modeling Walking Duration and Pedestrian Fatalities. Transportation Research Record 2673 (4), 898–906.
Lee, J., Abdel-Aty, M., De Blasiis, M.R., Wang, X., Mattei, I., 2019b. International transferability of macro-level safety performance functions: a case study of the United States and Italy. Transportation Safety and Environment.
Lee, J., Abdel-Aty, A., Park, J., 2018a. Investigation of associations between marijuana law changes and marijuana-involved fatal traffic crashes: A state-level analysis. Journal of Transport & Health 10, 194–202.
Lee, J., Yasmin, S., Eluru, N., Abdel-Aty, M., Cai, Q., 2018b. Analysis of crash proportion by vehicle type at traffic analysis zone level: A mixed fractional split multinomial logit modeling approach with spatial effects. Accident Analysis & Prevention 111, 12–22.
Lee, J., Abdel-Aty, M., Cai, Q., Wang, L., Huang, H., 2018c. Integrated modeling approach for non-motorized mode trips and fatal crashes in the framework of transportation safety planning. Transportation research record 2672 (32), 49–60.
Lee, J., Abdel-Aty, M., Cai, Q., 2017a. Intersection crash prediction modeling with macro-level data from various geographic units. Accident Analysis & Prevention 102, 213–226.
Lee, J., Abdel-Aty, M., Wang, J.H., Lee, C., 2017b. Long-term effect of universal helmet law changes on motorcyclist fatal crashes: comparison group and empirical Bayes approaches. Transportation Research Record 2637 (1), 27–37.
Lee, J., Abdel-Aty, M., Jiang, X., 2015a. Multivariate crash modeling for motor vehicle and non-motorized modes at the macroscopic level. Accident Analysis & Prevention 78, 146–154.
Lee, J., Abdel-Aty, M., Choi, K., Huang, H., 2015b. Multi-level hot zone identification for pedestrian safety. Accident Analysis & Prevention 76, 64–73.
Lee, J., Abdel-Aty, M., Choi, K., 2014a. Analysis of residence characteristics of at-fault drivers in traffic crashes. Safety science 68, 6–13.
Lee, J., Abdel-Aty, M., Jiang, X., 2014b. Development of zone system for macro-level traffic safety analysis. Journal of transport geography 38, 13–21.
Levine, N., Kim, K.E., Nitz, L.H., 1995. Spatial analysis of Honolulu motor vehicle crashes: II. Zonal generators. Accident Analysis & Prevention 27 (5), 675–685.
Li, Z., Chen, X., Ci, Y., Chen, C., Zhang, G., 2019. A hierarchical Bayesian spatiotemporal random parameters approach for alcohol/drug impaired-driving crash frequency analysis. Analytic Methods in Accident Research.
Li, Z., Wang, W., Liu, P., Bigham, J.M., Ragland, D.R., 2013. Using geographically weighted Poisson regression for county-level crash modeling in California. Safety science 58, 89–97.
Liu, C., Sharma, A., 2018. Using the multivariate spatio-temporal Bayesian model to analyze traffic crashes by severity. Analytic methods in accident research 17, 14–31.
Liu, J., Khattak, A.J., Wali, B., 2017. Do safety performance functions used for predicting crash frequency vary across space? Applying geographically weighted regressions to account for spatial heterogeneity. Accident Analysis & Prevention 109, 132–142.
Loo, B.P., Anderson, T.K., 2015. Spatial Analysis Methods of Road Traffic Collisions. CRC Press.
Loo, B.P., Yao, S., Wu, J., 2011. Spatial point analysis of road crashes in Shanghai: A GIS-based network kernel density method. In: 2011 19th international conference on geoinformatics, June. IEEE. pp. 1–6.
Lord, D., Mannering, F., 2010. The statistical analysis of crash-frequency data: a review and assessment of methodological alternatives. Transportation research part A: policy and practice 44 (5), 291–305.
Loukaitou-Sideris, A., Liggett, R., Sung, H.G., 2007. Death on the crosswalk: A study of pedestrian-automobile collisions in Los Angeles. Journal of Planning Education and Research 26 (3), 338–351.
Lovegrove, G., Lim, C., Sayed, T., 2009. Community-based, macrolevel collision prediction model use with a regional transportation plan. Journal of transportation engineering 136 (2), 120–128.
Lovegrove, G., Sayed, T., 2007. Macrolevel collision prediction models to enhance traditional reactive road safety improvement programs. Transportation Research Record: Journal of the Transportation Research Board 2019, 65–73.
Lovegrove, G.R., Sayed, T., 2006. Macro-level collision prediction models for evaluating neighbourhood traffic safety. Canadian Journal of Civil Engineering 33 (5), 609–621.
Ma, X., Chen, S., Chen, F., 2017. Multivariate space-time modeling of crash frequencies by injury severity levels. Analytic Methods in Accident Research 15, 29–40.
Ma, X., Tao, Z., Wang, Y., Yu, H., Wang, Y., 2015a. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transportation Research Part C: Emerging Technologies 54, 187–197.
Ma, X., Yu, H., Wang, Y., Wang, Y., 2015b. Large-scale transportation network congestion evolution prediction using deep learning theory. PloS one 10 (3), e0119044.
MacNab, Y.C., 2004. Bayesian spatial and ecological models for small-area accident and injury analysis. Accident Analysis & Prevention 36 (6), 1019–1028.
Mannering, F.L., Bhat, C.R., 2014. Analytic methods in accident research: Methodological frontier and future directions. Analytic methods in accident research 1, 1–22.
Mitra, S., 2009. Spatial autocorrelation and Bayesian spatial statistical method for analyzing intersections prone to injury crashes. Transportation research record 2136 (1), 92–100.
Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transportation Research Record: Journal of the Transportation Research Board 1840, 31–40.
Miaou, S.P., Song, J.J., 2005. Bayesian ranking of sites for engineering safety improvements: decision parameter, treatability concept, statistical criterion, and spatial dependence. Accident Analysis & Prevention 37 (4), 699–720.
Moeinaddini, M., Asadi-Shekari, Z., Shah, M.Z., 2014. The relationship between urban street networks and the number of transport fatalities at the city level. Safety science 62, 114–120.
Mohaymany, A.S., Shahri, M., Mirbagheri, B., 2013. GIS-based method for detecting high-crash-risk road segments using network kernel density estimation. Geo-spatial Information Science 16 (2), 113–119.
Mountrakis, G., Gunson, K., 2009. Multi-scale spatiotemporal analyses of moose–vehicle collisions: a case study in northern Vermont. International Journal of Geographical Information Science 23 (11), 1389–1412.
Naderan, A., Shahi, J., 2010. Aggregate crash prediction models: Introducing crash generation concept. Accident Analysis & Prevention 42 (1), 339–346.
Narayanamoorthy, S., Paleti, R., Bhat, C.R., 2013. On accommodating spatial dependence in bicycle and pedestrian injury counts by severity level. Transportation research part B: methodological 55, 245–264.
Nashad, T., Yasmin, S., Eluru, N., Lee, J., Abdel-Aty, M.A., 2016. Joint modeling of pedestrian and bicycle crashes: copula-based approach. Transportation Research Record: Journal of the Transportation Research Board 2601, 119–127.
Ng, K.S., Hung, W.T., Wong, W.G., 2002. An algorithm for assessing the risk of traffic accident. Journal of safety research 33 (3), 387–410.
Noland, R.B., Quddus, M.A., 2005. Congestion and safety: A spatial analysis of London. Transportation Research Part A: Policy and Practice 39 (7-9), 737–754.
Noland, R.B., Oh, L., 2004. The effect of infrastructure and demographic change on traffic-related fatalities and crashes: a case study of Illinois county-level data.


Accident Analysis & Prevention 36 (4), 525–532.
Noland, R.B., Quddus, M.A., 2004. A spatially disaggregate analysis of road casualties in England. Accident Analysis & Prevention 36 (6), 973–984.
Openshaw, S., 1984. The modifiable areal unit problem. Concepts and techniques in modern geography. Geo Books, Norwich.
Ossenbruggen, P.J., Linder, E., Nguyen, B., 2009. Detecting unsafe roadways with spatial statistics: point patterns and geostatistical models. Journal of Transportation Engineering 136 (5), 457–464.
Page, S.J., Meyer, D., 1996. Tourist accidents: an exploratory analysis. Annals of Tourism Research 23 (3), 666–690.
Papadimitriou, E., Filtness, A., Theofilatos, A., Ziakopoulos, A., Quigley, C., Yannis, G., 2019. Review and ranking of crash risk factors related to the road infrastructure. Accident Analysis & Prevention 125, 85–97.
Papadimitriou, E., Tselentis, D.I., Yannis, G., 2018. Analysis of Driving Behaviour Characteristics Based on Smartphone Data. In: Proceedings of 7th Transport Research Arena TRA 2018. Vienna, Austria, 16-19 April 2018.
Pirdavani, A., Bellemans, T., Brijs, T., Wets, G., 2014a. Application of geographically weighted regression technique in spatial analysis of fatal and injury crashes. Journal of Transportation Engineering 140 (8), 04014032.
Pirdavani, A., Bellemans, T., Brijs, T., Kochan, B., Wets, G., 2014b. Assessing the road safety impacts of a teleworking policy by means of geographically weighted regression method. Journal of transport geography 39, 96–110.
Pirdavani, A., Brijs, T., Bellemans, T., Kochan, B., Wets, G., 2013. Evaluating the road safety effects of a fuel cost increase measure by means of zonal crash prediction modeling. Accident Analysis & Prevention 50, 186–195.
Quddus, M.A., 2008. Modelling area-wide count outcomes with spatial correlation and heterogeneity: an analysis of London crash data. Accident Analysis & Prevention 40 (4), 1486–1497.
Ratti, C., 2004. Space syntax: some inconsistencies. Environment and Planning B: Planning and Design 31 (4), 487–499.
Raykar, V.C., Duraiswami, R., 2006. Fast optimal bandwidth selection for kernel density estimation. In: Proceedings of the 2006 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics. pp. 524–528.
Rhee, K.A., Kim, J.K., Lee, Y.I., Ulfarsson, G.F., 2016. Spatial regression analysis of traffic crashes in Seoul. Accident Analysis & Prevention 91, 190–199.
Saunier, N., Sayed, T., 2007. Automated analysis of road safety with video data. Transportation Research Record: Journal of the Transportation Research Board 2019, 57–64.
Siddiqui, C., Abdel-Aty, M., Choi, K., 2012. Macroscopic spatial analysis of pedestrian and bicycle crashes. Accident Analysis & Prevention 45, 382–391.
Siddiqui, C., Abdel-Aty, M., 2012. Nature of modeling boundary pedestrian crashes at zones. Transportation Research Record 2299 (1), 31–40.
Soltani, A., Askari, S., 2017. Exploring spatial autocorrelation of traffic crashes based on severity. Injury 48 (3), 637–647.
Song, J.J., Ghosh, M., Miaou, S., Mallick, B., 2006. Bayesian multivariate spatial models for roadway traffic crash mapping. Journal of multivariate analysis 97 (1), 246–273.
St-Aubin, P., Saunier, N., Miranda-Moreno, L., 2015. Large-scale automated proactive road safety analysis using video data. Transportation Research Part C: Emerging Technologies 58, 363–379.
Tasic, I., Elvik, R., Brewer, S., 2017. Exploring the safety in numbers effect for vulnerable road users on a macroscopic scale. Accident Analysis & Prevention 109, 36–46.
Theofilatos, A., Yannis, G., 2014. A review of the effect of traffic and weather characteristics on road safety. Accident Analysis & Prevention 72, 244–256.
Thomas, I., 1996. Spatial data aggregation: exploratory analysis of road accidents. Accident Analysis & Prevention 28 (2), 251–264.
Ukkusuri, S., Miranda-Moreno, L.F., Ramadurai, G., Isa-Tavarez, J., 2012. The role of built environment on pedestrian crash frequency. Safety science 50 (4), 1141–1151.
Ukkusuri, S., Hasan, S., Aziz, H., 2011. Random parameter model used to explain effects of built-environment characteristics on pedestrian crash frequency. Transportation Research Record: Journal of the Transportation Research Board 2237, 98–106.
Ver Hoef, J.M., Peterson, E.E., Hooten, M.B., Hanks, E.M., Fortin, M.J., 2018. Spatial autoregressive models for statistical inference from ecological data. Ecological Monographs 88 (1), 36–59.
Wang, L., Abdel-Aty, M., Lee, J., Shi, Q., 2019. Analysis of real-time crash risk for expressway ramps using traffic, geometric, trip generation, and socio-demographic predictors. Accident Analysis & Prevention 122, 378–384.
Wang, Y., Veneziano, D., Russell, S., Al-Kaisy, A., 2016a. Traffic Safety Along Tourist Routes in Rural Areas. Transportation Research Record: Journal of the Transportation Research Board 2568, 55–63.
Wang, X., Yang, J., Lee, C., Ji, Z., You, S., 2016b. Macro-level safety analysis of pedestrian crashes in Shanghai, China. Accident Analysis & Prevention 96, 12–21.
Wang, J., Huang, H., 2016. Road network safety evaluation using Bayesian hierarchical joint model. Accident Analysis & Prevention 90, 152–158.
Wang, Y., Kockelman, K.M., 2013. A Poisson-lognormal conditional-autoregressive model for multivariate spatial analysis of pedestrian crash counts across neighborhoods. Accident Analysis & Prevention 60, 71–84.
Wang, C., Quddus, M.A., Ison, S.G., 2009. Impact of traffic congestion on road accidents: A spatial analysis of the M25 motorway in England. Accident Analysis & Prevention 41 (4), 798–808.
Wang, X., Abdel-Aty, M., 2006. Temporal and spatial analyses of rear-end crashes at signalized intersections. Accident Analysis & Prevention 38 (6), 1137–1150.
Wei, F., Lovegrove, G., 2013. An empirical tool to evaluate the safety of cyclists: Community based, macro-level collision prediction models using negative binomial regression. Accident Analysis & Prevention 61, 129–137.
Wen, H., Zhang, X., Zeng, Q., Lee, J., Yuan, Q., 2019. Investigating spatial autocorrelation and spillover effects in freeway crash-frequency data. International journal of environmental research and public health 16 (2), 219.
Wier, M., Weintraub, J., Humphreys, E.H., Seto, E., Bhatia, R., 2009. An area-level model of vehicle-pedestrian injury collisions with implications for land use and transportation planning. Accident Analysis & Prevention 41 (1), 137–145.
World Health Organization – WHO, 2018. Global status report on road safety 2018. Available from: https://www.who.int/violence_injury_prevention/road_safety_status/2018/en/.
World Health Organization – WHO, 2015. Global status report on road safety 2015. Available from: http://www.who.int/violence_injury_prevention/road_safety_status/2015/en/.
Xie, K., Ozbay, K., Kurkcu, A., Yang, H., 2017. Analysis of traffic crashes involving pedestrians using big data: Investigation of contributing factors and identification of hotspots. Risk analysis 37 (8), 1459–1476.
Xie, K., Wang, X., Ozbay, K., Yang, H., 2014. Crash frequency modeling for signalized intersections in a high-density urban road network. Analytic methods in accident research 2, 39–51.
Xie, K., Wang, X., Huang, H., Chen, X., 2013. Corridor-level signalized intersection safety analysis in Shanghai, China using Bayesian hierarchical models. Accident Analysis & Prevention 50, 25–33.
Xie, Z., Yan, J., 2008. Kernel density estimation of traffic accidents in a network space. Computers, environment and urban systems 32 (5), 396–406.
Xu, P., Huang, H., Dong, N., 2018. The modifiable areal unit problem in traffic safety: basic issue, potential solutions and future research. Journal of traffic and transportation engineering (English edition) 5 (1), 73–82.
Xu, P., Huang, H., Dong, N., Wong, S.C., 2017a. Revisiting crash spatial heterogeneity: a Bayesian spatially varying coefficients approach. Accident Analysis & Prevention 98, 330–337.
Xu, C., Li, H., Zhao, J., Chen, J., Wang, W., 2017b. Investigating the relationship between jobs-housing balance and traffic safety. Accident Analysis & Prevention 107, 126–136.
Xu, P., Huang, H., 2015. Modeling crash spatial heterogeneity: random parameter versus geographically weighting. Accident Analysis & Prevention 75, 16–25.
Yannis, G., Tselentis, D.I., Vlahogianni, E.I., Argyropoulou, A., 2017. Monitoring distraction through smartphone naturalistic driving experiment. In: 6th International Naturalistic Driving Research Symposium. The Hague, Netherlands. 7-9 June 2017.
Yasmin, S., Eluru, N., 2016. Latent segmentation based count models: analysis of bicycle safety in Montreal and Toronto. Accident Analysis & Prevention 95, 157–171.
Zeng, Q., Huang, H., 2014. Bayesian spatial joint modeling of traffic crashes on an urban road network. Accident Analysis & Prevention 67, 105–112.
Zhai, X., Huang, H., Xu, P., Sze, N.N., 2019a. The influence of zonal configurations on macro-level crash modeling. Transportmetrica A: transport science 15 (2), 417–434.
Zhai, X., Huang, H., Sze, N.N., Song, Z., Hon, K.K., 2019b. Diagnostic analysis of the effects of weather condition on pedestrian crash severity. Accident Analysis & Prevention 122, 318–324.
Zhai, X., Huang, H., Gao, M., Dong, N., Sze, N.N., 2018. Boundary crash data assignment in zonal safety analysis: an iterative approach based on data augmentation and Bayesian spatial model. Accident Analysis & Prevention 121, 231–237.
Zhu, L., Guo, F., Krishnan, R., Polak, J.W., 2018. The Use of Convolutional Neural Networks for Traffic Incident Detection at a Network Level (No. 18-00321).
