2 Data Sources
As noted in the previous section, one of the major factors fueling the development of computational methods for the analysis and forecasting of political violence has been the availability of machine-readable data sets. Some data collections have now been underway for close to five decades (the Correlates of War Project, for example, began in 1963); these are the product of thousands of hours of careful research, refinement and coding, and have been used in hundreds of studies. In other cases, notably the automated coding of atomic event data and the analysis of the new social media, the collections are relatively new and unexplored, but potentially provide very large amounts of data in near-real-time. These data are generally available on the web, either through academic archives such as the Inter-University Consortium for Political and Social Research (http://www.icpsr.umich.edu/icpsrweb/ICPSR/) and Harvard's Dataverse Network (http://thedata.org/), through government and IGO sources such as USA.gov (http://www.usa.gov/Topics/Reference-Shelf/Data.shtml) and the United Nations Statistics Division (http://unstats.un.org/unsd/databases.htm) or, increasingly, through individual web sites established by the projects collecting the data.
In this section, we will briefly describe some general types of data that have been
used in models of political violence, with comments on a few of the strengths and
weaknesses of each type. As with any typology, not all of the data sets fit clearly
into a single category, but most will.
subcategories. While the original event data collections focused almost exclusively on nation-states, contemporary systems provide higher levels of sub-state disaggregation, sometimes down to the level of individuals [68, 147]. While most studies in the past aggregated the events into a single cooperation-conflict dimension using a scale [73], more recent approaches have created composite events out of patterns of the atomic events [88, 140] or looked at sequences of events [50].
Skeptics, however, have pointed out that social media also have some disadvantages. While access to the Web is increasing rapidly, it is still far from universal, and social media in particular tend to be disproportionately used by individuals who are young, economically secure, and well-educated. In areas with strong authoritarian regimes, notably China, there are substantial (though not uniformly successful) efforts to control the use of these media, and a government agency with even modest resources can easily create a flood of false posts, sites and messages. While there is some political content in these media, the vast bulk of the postings are devoid of political content ("OMG, Bieber fever!!!"), and what relevant content does exist may be deliberately or inadvertently encoded in a rapidly-mutating morass of abbreviations and slang almost indecipherable to conventional NLP software. (This contrasts with the news stories used to encode atomic event data, which generally are in syntactically correct English.) Finally, a number of analysts have argued that the critical communications development for political mobilization is the cell phone, both for voice and texting, rather than the Web-based media.
2.6.1 Actors
Most major event data datasets, including WEIS, CAMEO, ICEWS, and VRA, code the source and target actors for each event in the data set. However, many of these actors may be irrelevant to the specific outcome-of-interest. For example, a study focusing on Israeli-Palestinian conflicts would not want to include events between Aceh rebels and the Indonesian army, as these are not relevant to the conflict of interest. Although excluding Indonesian rebel activity is obvious in this case, more difficult decisions exist, such as whether or not to include events between members of the Lebanese and Syrian armies, or between the governments of the United States and Iran, in a study of Israeli-Palestinian conflict. Yonamine [170] provides a more detailed description of event data aggregation.
2.6.2 Actions
Event datasets use a numerical code to reflect the specific type of event that is occurring between the two actors. Since the numerical codes do not carry intrinsic value, researchers manipulate the codes to reflect meaningful information. The majority of the extant event data literature either scales all events, assigning them a score on a conflict-cooperation continuum, or generates event counts reflecting the number of events that occur within conceptually unique categories. The Goldstein Scale [73], which is the most commonly used scaling technique within the event data literature [73, 77, 125, 136, 138], assigns each event a value on a -10 to +10 conflict/cooperation scale, with -10 reflecting the most conflictual events and +10 indicating the most cooperative.
Despite the preponderance of the Goldstein scale, a number of other studies [140, 141, 144] utilize count measures. Duval and Thompson [51] put forth the first event data count model by placing all events into one of four conceptually unique, mutually exclusive categories: verbal cooperation, verbal conflict, material cooperation, and material conflict. Although this count approach is simpler than scaling methods, [135, 144] find strong empirical results using this count method of action aggregation.
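To make the two aggregation strategies concrete, the sketch below collapses a toy dyadic event stream either into Goldstein-scaled scores or into quad-category counts. The event codes, the small Goldstein-style lookup table, and the quad-category mapping are illustrative placeholders rather than the actual CAMEO/WEIS ontologies or the published scale values.

```python
import pandas as pd

# Toy event stream: one row per atomic event (hypothetical codes only).
events = pd.DataFrame({
    "date":   pd.to_datetime(["2011-01-03", "2011-01-03", "2011-01-04", "2011-01-05"]),
    "source": ["ISR", "PSE", "ISR", "PSE"],
    "target": ["PSE", "ISR", "PSE", "ISR"],
    "code":   ["042", "112", "190", "051"],
})

# Hypothetical excerpt of a Goldstein-style scale (-10 most conflictual .. +10 most cooperative).
goldstein = {"042": 1.9, "051": 3.4, "112": -2.0, "190": -10.0}

# Hypothetical quad-category mapping (verbal/material x cooperation/conflict).
quad = {"042": "verbal_coop", "051": "verbal_coop",
        "112": "verbal_conflict", "190": "material_conflict"}

# Strategy 1: scale each event and sum the scores per dyad-day.
events["score"] = events["code"].map(goldstein)
scaled = events.groupby(["date", "source", "target"])["score"].sum()

# Strategy 2: count events per dyad-day within each quad category.
events["quad"] = events["code"].map(quad)
counts = (events.groupby(["date", "source", "target", "quad"])
                .size().unstack(fill_value=0))

print(scaled)
print(counts)
```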
2.6.3 Temporal
Finally, scholars must temporally aggregate data in order to perform empirical
analyses at levels appropriate for their theory or empirical models of choice. All
of the previously mentioned event data sets code the exact day on which events
occur. As the specific time-of-day that events occurred is not reported, events must
at the very minimum be aggregated to the daily level [125, 135, 144], though weekly [26, 148], monthly [136, 165], quarterly [93], and annual level aggregations are common within the literature. By "aggregated," we mean that the events occurring within the selected temporal window must be jointly interpreted. Common approaches
are to calculate the sum or the mean of events that occur within the chosen temporal
domain.
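As a minimal illustration of temporal aggregation, the short sketch below sums and averages a daily series (for example, the hypothetical scored events above) into weekly and monthly values with pandas; the series is simulated and the frequencies are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Hypothetical daily dyadic scores indexed by date.
rng = pd.date_range("2011-01-01", periods=120, freq="D")
daily = pd.DataFrame({"score": np.random.default_rng(0).normal(size=120)}, index=rng)

weekly_sum   = daily["score"].resample("W").sum()    # weekly aggregation by summing
monthly_mean = daily["score"].resample("MS").mean()  # monthly aggregation by averaging

print(weekly_sum.head())
print(monthly_mean.head())
```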
3 Statistical Approaches
Most of the work on forecasting political conflict has used statistical modeling, since this has a much longer history in political science than algorithmic and machine learning approaches. While the bulk of these studies have focused on simply interpreting coefficient estimates within the frequentist mode of significance testing, a method which has proven to have little utility for predictive modeling [165], more recent work has taken prediction seriously, both using classical time series models and, more recently, a substantial amount of work using vector autoregression (VAR) models. In addition, recent work has focused on one of the most challenging aspects of forecasting, particularly when applied to counterterrorism: the fact that these events occur very rarely. While this presents a serious challenge to the method, a number of sophisticated methods have been developed to deal with it.
While these approaches have been used extensively in the academic literature,
their practical utility is, unfortunately, quite limited due to a reliance on the
frequentist significance testing approach. Extended discussions of this issue can
be found in [3, 70, 96, 137] but briefly, significance testing was originally developed
in the early twentieth century to study problems where the null hypothesis of
a variable having no effect was meaningful (the archetypical example is whether
a new medicine has a different effect than a placebo). In some political science
applications, this is still valid (for example, determining whether a forthcoming election is a statistical tie based on opinion polling), but in most models a variable is generally not included unless there are theoretical (or common sense) reasons to assume that it will have at least some effect. Because this is all that the significance test is assessing (a non-zero effect, not an effect that has some meaningful impact), the information it provides is limited and all but negligible for predictive problems
[165].
Although the frequentist approach can be useful in weeding out variables that
might seem to be important but in fact are not, in contemporary models that
tend to be complex, even this should be interpreted with caution. Linear models
are particularly susceptible to problems of collinearity when the independent variables x_i are correlated, as is often the case, particularly in models where
The notation AR(p) refers to an autoregressive model of order p:

X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t    (4)

where \varphi_1, ..., \varphi_p are the parameters of the model, c is a constant and \varepsilon_t is white noise. In most applications, the error terms \varepsilon_t are assumed to be independent, identically-distributed, and sampled from a normal distribution.
The notation MA(q) refers to the moving average model of order q:

X_t = \mu + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}    (5)

and the combined ARMA(p, q) model contains both autoregressive and moving average terms:

X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}    (6)
In practical terms, classical time series models are useful for phenomena where the current value of the variable is usually highly dependent on past values. This incremental behavior is characteristic of a great deal of political behavior, notably public opinion, budgeting, and most conflict behavior: once a civil war or insurgency gets going, it is likely to continue for a while, and once peace is firmly established, that is also likely to be maintained. In such situations, the best predictor of the variable at time t is its value at time t-1. Consequently, these models are very widely used.
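As a brief illustration of how such a model can be estimated and used for forecasting, the sketch below fits an autoregressive model of the form in Eq. (4) to a univariate series with statsmodels; the series itself is simulated here and the lag order is chosen only for illustration.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulated stand-in for a weekly conflict-intensity series (e.g. summed event scores).
rng = np.random.default_rng(1)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.8 * y[t - 1] + rng.normal()   # strongly autoregressive, as in Eq. (4)

# Fit an AR(2) model and forecast the next four periods.
model = AutoReg(y, lags=2).fit()
print(model.params)              # estimated constant and phi coefficients
print(model.forecast(steps=4))   # out-of-sample forecasts
```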
Incremental models, however, have a critical weakness: by definition, they
cannot explain sudden change, whether isolated incidents or the onset and cessation
of protracted behaviors. An autoregressive model can have a very good overall
predictive record but still miss the beginning and end of a conflict, and these may be
of greatest interest to decision-makers.
The classical time series literature places a great emphasis on the characteristics
of the error terms. While these are treated in the literature as random variables, in
practical terms, much of the error is simply the effects of variables that were not
included in the model. Because most social and demographic indicators and many economic indicators are also very strongly autoregressive (for example, indicators such as infant mortality rate and literacy rarely change more than a percentage point year to year), these errors will strongly correlate with their lagged values, hence the interest in the MA and ARMA models. Unfortunately, disentangling the effects of autoregressive variables and autoregressive errors is extraordinarily difficult in many circumstances, which in turn has led to the development of vector-autoregressive models, discussed in Sect. 3.3.
variables. The downside of VAR is that it is very data-intensive and requires data
measured consistently across a large number of time points.
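To illustrate both the mechanics and the data demands of a VAR, the following sketch fits a two-variable VAR with statsmodels and produces short-horizon forecasts; the two series are simulated placeholders for, say, weekly conflict and cooperation scores, and the lag length is arbitrary.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Simulated placeholders for two interrelated weekly series.
rng = np.random.default_rng(2)
n = 300
conflict = np.zeros(n)
cooperation = np.zeros(n)
for t in range(1, n):
    conflict[t] = 0.6 * conflict[t - 1] - 0.2 * cooperation[t - 1] + rng.normal()
    cooperation[t] = 0.5 * cooperation[t - 1] - 0.1 * conflict[t - 1] + rng.normal()

data = pd.DataFrame({"conflict": conflict, "cooperation": cooperation})

# Fit a VAR(2) and forecast eight periods ahead from the last two observed values.
results = VAR(data).fit(2)
print(results.forecast(data.values[-2:], steps=8))
```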
A more recent variation on VAR models has been the addition of switching
components, which allow for the impact of distinct regimes on the behavior of the
system [155]. While originally developed for the study of foreign exchange markets,
this is clearly applicable to political conflict situations where the system is likely
to behave differently in times of peace than during times of war. Bayesian versions
have also been developed [26] and applied to cases such as the conflict in the Middle
East [25, 27].
h(t) = \frac{f(t)}{S(t)} = -\frac{S'(t)}{S(t)}    (8)

which is the event rate at time t conditional on survival until time T, T \ge t. From these functions, information such as the average number of events in a given period or average lifetime can be derived.
Of particular interest in survival analysis is the overall shape of the hazard
function. For example, a survival function which levels off indicates that once
an interval has occurred without an event occurring, it is less likely to occur
again, whereas a survival function that declines linearly means that the occurrence
or nonoccurrence provides no information. Other forms are also possible; for example, many biological organisms exhibit a U-shaped hazard function, meaning that mortality is high in the early stages of life, then becomes low for a period of time, then rises again as the organism reaches its natural age limit.
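To make the link between survival functions and hazard shapes concrete, the short sketch below computes Weibull hazards h(t) = f(t)/S(t), as in Eq. (8), for different shape parameters: k < 1 gives a declining hazard, k = 1 a constant one, and k > 1 a rising one. This is purely an illustrative calculation, not a model drawn from the literature discussed here.

```python
import numpy as np
from scipy.stats import weibull_min

t = np.linspace(0.01, 5, 200)

for k in (0.5, 1.0, 2.0):                      # Weibull shape parameters
    f = weibull_min.pdf(t, k)                  # density f(t)
    S = weibull_min.sf(t, k)                   # survival function S(t)
    h = f / S                                  # hazard h(t) = f(t)/S(t), as in Eq. (8)
    print(f"k={k}: hazard starts at {h[0]:.2f} and ends at {h[-1]:.2f}")
```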
A key downside of survival models is the difficulty that humans have in working
with probabilities [94, 161]. In addition, the models are relatively new to the study
of political conflict, due to earlier constraints of computational feasibility and the
availability of appropriate estimation software. However, with these constraints
removed, there are likely to be more such applications in the future.
specific region or country over a short period of time, or (iii) require additional
variable coding or data collection that, due to time or money constraints, cannot be
completed for an entire sample of interest.
A second challenge to the analysis of rare-event data arises from the corresponding preponderance of zero (non-event) observations within such data sets. As recent
studies note [126], many zero-observations in terrorism-event data sets correspond
to country or region cases that have no probability of experiencing a terrorist-event
in any period of interest. In these instances, empirical analyses of conflict-event data risk conflating two distinct types of zero-observations: one for which the probability of terrorism is non-zero but terrorism-events nevertheless did not occur, and one wherein the probability of terrorist-events is consistently zero. Given that the latter set of zero-cases often arise as the result of covariates that correlate or overlap with one's primary independent variables of interest, ignoring zero-inflation processes of these sorts not only leads to an underestimation of terrorist events generally, but can also bias specific coefficient estimates in indeterminate directions. To correct for
these biases, one must conditionally model both the zero-inflation process and the
main outcome of interest.
Zero-inflated mixture-models specifically address these very problems, most
notably within the contexts of event-count models such as Poisson or negative
binomial estimators. In essence, these models employ a system of two equations
to estimate the combined probability of an observation (i) being inflated and (ii) experiencing an event-outcome of interest, usually by including separate but overlapping covariates as predictors for each respective equation. For example, one could expand the logit model presented in Eq. 13 to the zero-inflated logit approach by incorporating a second, inflation-stage logit equation as so,

f(z, w) = \left[ 1 - \frac{1}{1+e^{-w}} + \frac{1}{1+e^{-w}}\left(1 - \frac{1}{1+e^{-z}}\right) \right]^{1-Y_i} \left[ \frac{1}{1+e^{-w}} \cdot \frac{1}{1+e^{-z}} \right]^{Y_i}    (9)

where the variable w represents the additional set of inflation-stage covariates;

w = \gamma_0 + \gamma_1 v_1 + \gamma_2 v_2 + \gamma_3 v_3 + \cdots + \gamma_k v_k    (10)
which may or may not overlap with z [13]. Zero-inflated models thereby add
an additional layer of nuance to the empirical modeling of conflict-events by
estimating both the propensity of ever experiencing an event of interest and the
likelihood of experiencing an event of interest conditional on being able to do so.
This allows one to use ex-ante observable and theoretically informed covariates
to account for the probability that a given zero observation is inflated, and to
then probabilistically discount these zeroes' leverage within one's primary analysis
without dropping these observations entirely. While such zero-inflated modeling
approaches have been most extensively applied to political violence count-data
[14,40,126], zero-inflated models have also recently been developed and applied by
conflict-researchers to a variety of other limited dependent variables [83, 159, 168].
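A minimal sketch of how a likelihood of the form in Eq. 9 can be estimated directly is given below: it codes the zero-inflated logit log-likelihood in NumPy and maximizes it with scipy.optimize on simulated data. The variable names, covariates, and coefficient values are all illustrative assumptions, not taken from the cited studies.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic function 1/(1+exp(-x))

rng = np.random.default_rng(3)
n = 2000
v = rng.normal(size=n)                 # hypothetical inflation-stage covariate
x = rng.normal(size=n)                 # hypothetical outcome-stage covariate

# Simulate: a unit is "at risk" with prob expit(w); if at risk, the event occurs with prob expit(z).
at_risk = rng.random(n) < expit(0.5 + 1.0 * v)
y = (at_risk & (rng.random(n) < expit(-1.0 + 2.0 * x))).astype(float)

def neg_loglik(params):
    g0, g1, b0, b1 = params
    pw = expit(g0 + g1 * v)            # P(at risk), inflation stage
    pz = expit(b0 + b1 * x)            # P(event | at risk), outcome stage
    p1 = pw * pz                       # P(Y=1), as in Eq. 9
    p0 = 1.0 - p1                      # P(Y=0): not at risk, or at risk but no event
    return -np.sum(y * np.log(p1) + (1 - y) * np.log(p0))

fit = minimize(neg_loglik, x0=np.zeros(4), method="BFGS")
print(fit.x)   # recovered inflation- and outcome-stage coefficients
```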
4 Algorithmic Approaches
Although more traditional statistical models still dominate quantitative studies
of political conflict, algorithmic approaches have proven effective, thus gaining
momentum not just within political science [21, 139] but also in other disciplines.
For example, computer scientists have developed the CONVEX [157], CAPE [106],
and SOMA [107] tools to forecast terrorist group behavior. While we will follow
common practice in using the statistical vs. algorithmic distinction to differentiate between methodologies, there is overlap between the two definitions. For
example, linear regression is considered a canonical statistical approach, but as we
describe in Sect. 4.1.1, it is also a straightforward example of a supervised linear
algorithm.
In general, by algorithmic approaches, we refer to specific models (such as neural
networks or random forests) or techniques (like bagging and boosting) that attempt
to leverage computational power to specify and train models through iterative
resampling techniques and to build and assess out-of-sample predictions, rather than
obtaining a single estimate of the model coefficients. Algorithmic approaches can
provide a number of benefits over statistical models and are particularly relevant to
forecasting terrorism for at least the following four reasons.
First, machine learning algorithms are often better suited than many traditional statistical models to handling "big data": data sets with large numbers of independent variables that potentially exceed the number of observations. Second, these
algorithms are also less dependent on rigid assumptions about the data generating
process and underlying distributions. Third, as opposed to some statistical models,
many machine learning algorithms were specifically designed to generate accurate
predictions, and do this exceedingly well. Finally, a number of the algorithmic
approaches approximate the widely used qualitative method of case-based reasoning [95, 108, 116], which matches patterns of events from past cases to the events observed in a current situation, and then uses the best historical fit to predict the likely outcome of the current situation; [134, Chap. 6] gives a much more extended discussion of this approach. This similarity to the methods of human analysts accounted for these methods originally being labeled "artificial intelligence" in some of the early studies.
Indeed, major trends in the empirical study of political violence, such as the
big data revolution and an increasing interest in predictive models, mean that
algorithmic approaches will likely become increasingly popular in the coming
years. In the following sections, we address some of the most relevant machine
learning algorithms for forecasting political violence. Following standard practices,
we divide algorithmic approaches into two general, though not mutually exclusive, categories: supervised and unsupervised algorithms, with an additional discussion of sequence analysis techniques.
and process input functions and then transfer information to other nodes, such as
neurons in the case of the human brain.
Although most work with neural networks seems far removed from terrorism,
[12] articulately explain how supervised neural networks are not only an appropriate algorithmic approach to predicting violence but can also be applied as a
straightforward extension to logistic regression. Like a logistic regression, neural
networks can be applied to a traditional TSCS dataset with a binary dependent variable to generate predicted probabilities that are interpreted identically to the π parameter of a logistic regression. However, the primary advantage of a neural network approach is that it is able to account for the potential of massive nonlinear interaction effects that may causally link the independent variables to the outcome of interest, without having to directly specify interactive or non-linear terms in the model as required in a logistic regression. [12] demonstrate that the neural network
approach consistently outperforms logistic regression in out-of-sample accuracy
measures. Though we are unaware of neural networks being applied in studies of
terrorism, it is likely that doing so could yield similar improvements in predictive
accuracy.
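The sketch below illustrates this kind of comparison in scikit-learn: a logistic regression and a small feed-forward neural network are fit to the same binary outcome and compared on held-out accuracy. The data are simulated with a deliberately nonlinear interaction, and all settings are illustrative rather than a replication of [12].

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Simulated TSCS-style predictors with a nonlinear interaction driving the binary outcome.
rng = np.random.default_rng(4)
X = rng.normal(size=(3000, 4))
logit = 2.0 * X[:, 0] * X[:, 1] - X[:, 2] ** 2 + X[:, 3]
y = (rng.random(3000) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

logreg = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
nnet = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0).fit(X_tr, y_tr)

print("logit out-of-sample accuracy:", logreg.score(X_te, y_te))
print("neural net out-of-sample accuracy:", nnet.score(X_te, y_te))
```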
4.1.3 Tree-Based Algorithms
Though yet to gain traction in the political violence literature, tree-based approaches
[80, pp. 305–317, 587–604] are commonly used to generate accurate predictions
in a host of other disciplines, including finance, medicine, computer science, and
sociology. In brief, tree-based algorithms operate by iteratively partitioning the data
into smaller sub-sets based on a break-point in a particular independent variable.
This process is repeated with the goal of creating bins of observations with similar
Yi values (for continuous data) or class (for categorical data).
We highlight three important factors that contribute to the strong predictive
accuracy of tree-based approaches. First, trees can be used to forecast both continuous (i.e. regression trees) and binomial (i.e. classification trees) dependent
variables. Second, tree-based approaches, such as random forests, are not sensitive
to degrees of freedom and can handle more independent variables than observations. Third, leading tree-based approaches incorporate iterative re-sampling,
weighting, and model averaging strategies like bagging and boosting techniques
[156], which tend to enhance accuracy and stability vis-à-vis other supervised learning classification and regression algorithms. In social science applications of tree-based algorithms, [16] and [17] demonstrate that random forests can help
generate accurate forecasts of violent crime rates in the United States. Despite the
scarcity of tree-based approaches in political science, scholars in other disciplines
ranging from ecology (see [43]) to transportation studies (see [158]) have generated
accurate predictions using this approach. As the quantitative political violence
literature continues to progress, it will behoove scholars to continue experimenting
with supervised forecasting algorithms that have demonstrated their value in other disciplines.
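As a brief, hedged illustration of the tree-based approach described above, the sketch below grows a random forest classifier on simulated data with more predictors than observations and reports out-of-bag accuracy; nothing here reproduces the cited studies.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Simulated "wide" data: more predictors (200) than observations (150).
rng = np.random.default_rng(5)
X = rng.normal(size=(150, 200))
y = (X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.5, size=150) > 0).astype(int)

# Bagged ensemble of trees; oob_score gives a built-in out-of-sample accuracy estimate.
forest = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0).fit(X, y)
print("out-of-bag accuracy:", forest.oob_score_)
print("most important predictors:", np.argsort(forest.feature_importances_)[-5:])
```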
4.2.2 Clustering
Clustering approaches such as k-means and Latent Dirichlet Allocation are similar
to dimension reduction in that they attempt to identify latent classes amongst a
set of observations, but differ in that they identify discrete, rather than continuous
solutions like PCA and FA [69]. Within the machine learning literature, k-means
approaches are among the most commonly used clustering algorithms, though their
application to the study of political violence has been scarce [80, pp. 509520].
However, [139] demonstrate that k-means can successfully identify latent clusters
within event data that identify phases of violence in the Middle East.
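To give a sense of the mechanics, the sketch below applies k-means to a matrix of hypothetical monthly event-category counts and inspects which months fall into which latent phase; the number of clusters and the features are arbitrary choices for illustration, not a replication of [139].

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical monthly counts of four quad-category event types for one dyad.
rng = np.random.default_rng(6)
calm = rng.poisson([8, 1, 3, 0], size=(24, 4))     # months dominated by verbal cooperation
crisis = rng.poisson([2, 9, 1, 7], size=(12, 4))   # months dominated by conflict events
X = np.vstack([calm, crisis])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # latent "phase" assigned to each month
print(kmeans.cluster_centers_)  # average event profile of each phase
```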
Latent Dirichlet Allocation [19] is another clustering algorithm that is primarily
applied to raw text. In the typical Latent Dirichlet Allocation application to
document classification, each document is assumed to be a mixture of multiple,
overlapping latent topics, each with a characteristic set of words. Classification is
done by associating words in a document with a pre-defined number of topics most
likely to have generated the observed distribution of words in the documents.
The purpose of LDA is to determine those latent topics from patterns in the data,
which are useful for two purposes. First, to the extent that the words associated with
a topic suggest a plausible category, they are intrinsically interesting in determining
the issues found in the set of documents. Second, the topics can be used with other
classification algorithms such as logistic regression, support vector machines or
discriminant analysis to classify new documents.
Despite the surface differences between the domains, the application of Latent Dirichlet Allocation to the problem of political forecasting is straightforward: it is reasonable to assume that the stream of events observed between a set of actors is a mixture of a variety of political strategies and standard operating procedures (for example, escalation of repressive measures against a minority group while simultaneously making efforts to co-opt the elites of that group). This is essentially identical to the process by which a collection of words in a document is a composite of the various themes and topics, the problem Latent Dirichlet Allocation is designed to solve. As before, the objective of Latent Dirichlet Allocation will be to find those latent strategies that are mixed to produce the observed event stream. These latent factors can then be used to convert the full event stream to a much simpler set of measures.
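A minimal sketch of this idea, treating each month of dyadic interaction as a "document" whose "words" are event codes, is shown below using scikit-learn's LatentDirichletAllocation; the event-count matrix is simulated and the two recovered "strategies" are purely illustrative.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Rows: month-long "documents"; columns: counts of six hypothetical event codes.
rng = np.random.default_rng(7)
repression = rng.poisson([0, 1, 6, 8, 1, 0], size=(30, 6))   # a repressive "strategy"
cooptation = rng.poisson([7, 5, 1, 0, 4, 2], size=(30, 6))   # a co-optation "strategy"
X = np.vstack([repression, cooptation, repression + cooptation])

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
print(lda.components_)        # event-code profile of each latent strategy
print(lda.transform(X)[:5])   # mixture weights: simpler measures for each month
```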
The Latent Dirichlet Allocation approach is similar in many ways to the hidden Markov approach (Sect. 4.3). In both models, the observed event stream is produced by a set of events randomly drawn from a mixture of distributions. In an HMM, however, these distributions are determined by the state of a Markov chain, whose transition probabilities must be estimated but which consequently also explicitly provides a formal sequence. A Latent Dirichlet Allocation model, in contrast, allows any combination of mixtures, without explicit sequencing except to the extent that sequencing information is provided by the events in the model.
An HMM is a variation on the well-known Markov chain model, one of the most
widely studied stochastic models of discrete events. Like a conventional Markov
chain, a HMM consists of a set of discrete states and a matrix A = [a_ij] of transition probabilities for going between those states. In addition, however, every state has a vector of observed symbol probabilities, B = [b_j(k)], that corresponds to the probability that the system will produce a symbol of type k when it is in state j. The states of the HMM cannot be directly observed and can only be inferred from the observed symbols, hence the adjective "hidden."
In empirical applications, the transition matrix and symbol probabilities of an
HMM are estimated using an iterative maximum likelihood technique called the
Baum-Welch algorithm which finds values for the matrices A and B that locally
maximize the probability of observing a set of training sequences. Once a set of
models has been estimated, they can be used to classify an unknown sequence by
computing the probability that each of the models generated the observed sequence.
The model with the highest such probability is chosen as the one which best
represents the sequence.
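To illustrate the classification step, the sketch below implements the standard forward algorithm in NumPy to compute the likelihood of an observed symbol sequence under a given HMM; the two-state transition and emission matrices are made-up examples, and estimation via Baum-Welch is omitted.

```python
import numpy as np

def sequence_likelihood(obs, pi, A, B):
    """Forward algorithm: P(observed symbol sequence | HMM defined by pi, A, B)."""
    alpha = pi * B[:, obs[0]]                 # initialize with the first symbol
    for symbol in obs[1:]:
        alpha = (alpha @ A) * B[:, symbol]    # propagate through A, re-weight by emission probs
    return alpha.sum()

# Hypothetical 2-state model: state 0 ~ "peace phase", state 1 ~ "crisis phase".
pi = np.array([0.9, 0.1])                     # initial state distribution
A = np.array([[0.95, 0.05],                   # transition probabilities a_ij
              [0.10, 0.90]])
B = np.array([[0.7, 0.2, 0.1],                # emission probabilities b_j(k) over three symbol types
              [0.1, 0.3, 0.6]])               # (e.g. cooperative, verbal conflict, violent)

obs = [0, 0, 1, 2, 2]                         # an observed symbol sequence
print(sequence_likelihood(obs, pi, A, B))

# Classification: compute this likelihood under several estimated models and pick the largest.
```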
The application of the HMM to the problem of classifying international event
sequences is straightforward. The symbol set consists of the event codes taken
from an event data set such as IDEA [123] or CAMEO [143]. The states of the
model are unobserved, but have a close theoretical analog in the concept of crisis
phase [101]. Different political phases are distinguished by different distributions
of observed events from the event ontology. Using CAMEO coding as an example, a
stable peace would have a preponderance of events in the 01–10 range, which codes cooperative events; a crisis escalation phase would be characterized by events in the 11–15 range (accusations, protests, denials, and threats); and a phase of active hostilities would show events in the 18–22 range, which codes violent events.
An important advantage of the HMM is that it can be trained by example rather than by the deductive specification of rules. Furthermore, HMMs require no temporal aggregation. This is particularly important for early warning problems, where critical periods in the development of a crisis may occur over a week or even a day. Finally, indeterminate time means that the HMM is relatively insensitive to the delineation of the start of a sequence: it is simple to prefix an HMM with a background state that gives the distribution of events generated by a particular source (e.g. Reuters/IDEA) when no crisis is occurring, and simply cycle in this state until something important happens.
[Fig. 1: Example four-week training sequences of weekly civilian-casualty and ambush-attempt counts (weeks t-4 through t-1), with the week-t outcome (attack or no attack) to be predicted.]
Although most studies using sequence analysis primarily focus on matching rather than forecasting, Martinez et al. [106], Sliva et al. [157], and D'Orazio et al. [50] do utilize sequence analysis-based algorithms for the explicit goal of prediction.
set of training sequences of equal length (i.e., 4). Euclidean distance is one of the most common and robust measures, though many other distance measures exist. Figure 1 demonstrates how Euclidean distance is applied to calculate the distances (which serve as the covariates in <Step 3>) between training sequences and the archetypical sequences.
The two distances, Distance_civilian-casualty and Distance_ambush-attempts, are calculated for every observation (i.e., 4-week sequence) in the training set. To complete <Step 3>, we choose a logistic regression, which is suitable to a parsimoniously specified model with a binary dependent variable. To train the logistic model, we estimate the parameter values that maximize the likelihood function below.
L(\pi \mid y) = \prod_{i=1}^{N} \binom{n_i}{y_i} \pi_i^{y_i} (1 - \pi_i)^{n_i - y_i}    (11)
To complete <Step 4> and build actual forecasts based on the archetype-driven sequence analysis approach, we apply the two estimates that result from (Eq. 11) to the logistic regression formula in order to calculate f(z), which reflects the likelihood that the week following the 4-week period used to generate the Distance_civilian-casualty and Distance_ambush-attempts distances from the archetypes will experience an attack on the base.

z = \beta_0 + \beta_1 Distance_civilian-casualty + \beta_2 Distance_ambush-attempts    (12)

f(z) = \frac{1}{1 + e^{-z}}    (13)
To the extent that archetypical sequences that tend to precede an outcome-ofinterest exist, sequence analysis may be an effective tool of prediction that can
provide leverage over other approaches.
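A compact sketch of this archetype-distance workflow is given below: Euclidean distances from each 4-week training sequence to two archetypal sequences become covariates in a logistic regression, in the spirit of Eqs. 11-13. The archetypes, simulated data, and variable names are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)

# Hypothetical archetypal 4-week sequences thought to precede an attack.
archetype_casualty = np.array([1, 2, 3, 4])    # rising weekly civilian casualties
archetype_ambush   = np.array([0, 1, 1, 3])    # rising weekly ambush attempts

# Simulated training data: weekly counts for 200 four-week windows, plus the week-t outcome.
casualty_seqs = rng.poisson(2, size=(200, 4))
ambush_seqs   = rng.poisson(1, size=(200, 4))
d_casualty = np.linalg.norm(casualty_seqs - archetype_casualty, axis=1)   # Euclidean distances
d_ambush   = np.linalg.norm(ambush_seqs - archetype_ambush, axis=1)
attack = (rng.random(200) < 1 / (1 + np.exp(-(2 - 0.8 * d_casualty - 0.5 * d_ambush)))).astype(int)

# Steps 3-4: fit the logistic model on the distances, then forecast for a new window.
X = np.column_stack([d_casualty, d_ambush])
model = LogisticRegression().fit(X, attack)

new_window = np.array([[np.linalg.norm([1, 1, 4, 4] - archetype_casualty),
                        np.linalg.norm([0, 2, 1, 2] - archetype_ambush)]])
print(model.predict_proba(new_window)[0, 1])   # estimated probability of an attack in week t
```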
sequence, the Convex algorithm calculates the various distance measures between each out-of-sample sequence and all in-sample sequences. In order to form out-of-sample predictions, the Convex algorithm determines the K most similar sequences based on the various distance measures. If all of the k-nearest neighbor sequences have the same dependent variable value, then the predicted value for that out-of-sample sequence is assigned as the dependent variable value of the single nearest neighbor. When this occurs, the Convex_kNN and Convex_Merge approaches generate identical predictions. However, when the dependent variable values of the nearest neighbor sequences are not identical, Convex_kNN rounds to the nearest integer while Convex_Merge assigns greater weight to more similar sequences. In addition to Martinez et al. [106], Sliva et al. [157] apply both variants of the Convex algorithm to terrorist group behavior.
5 Network Models
We label our final category network models: these are approaches that specifically
consider the relationships between entities, either according to their interactions
(SNA models) or location (geospatial models). As with many of the algorithmic
applications, network models have only become feasible in the last decade of
so, as they require very substantial amounts of data and computational capacity.
Consequently the number of applications at present is relatively small, though these
are very active research areas.
any other modes of communication, open source or otherwise, has greatly increased the amount of information that can be used to populate a social network. Third, SNA has been almost simultaneously embraced by the intelligence, policy, and academic communities, a rare feat for a new methodology. These factors, as well as its ability to deliver unique insights, are likely to continue driving its growth among quantitative studies of terrorism.
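For readers unfamiliar with the mechanics of SNA, the sketch below builds a small, entirely fictional communication network with networkx and computes two standard centrality measures often used to flag key actors and brokers; all names and ties are made up for illustration.

```python
import networkx as nx

# Fictional communication ties among members of a hypothetical cell.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("D", "F")]
G = nx.Graph(edges)

degree = nx.degree_centrality(G)            # how many direct ties each actor has
betweenness = nx.betweenness_centrality(G)  # how often an actor sits on shortest paths (broker role)

print(sorted(degree.items(), key=lambda kv: -kv[1]))
print(sorted(betweenness.items(), key=lambda kv: -kv[1]))
```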
6 Conclusion
In 1954, the psychologist Paul Meehl [111] published a path-breaking analysis of
the relative accuracy of human clinical assessments versus simple statistical models
for a variety of prediction problems, such as future school performance and criminal
recidivism. Despite using substantially less information, the statistical models either
outperformed, or performed as well as, the human assessments in most situations.
Meehl's work has been replicated and extended to numerous other domains in the subsequent six decades and the results are always the same: the statistical models win. Meehl, quoted in Kahneman [94, Chap. 21], reflecting on 30 years of studies, said, "There is no controversy in the social sciences which shows such a large body of qualitatively diverse studies coming out in the same direction as this one."
This has not, of course, removed the human analysts from trying (and most certainly, claiming) to provide superior performance. These claims are justified, in the face of overwhelming empirical evidence that they are inaccurate, using a wide variety of excuses (dare we call them pathologies?) that Kahneman [94, Part 3] discusses in great detail in the section of his book titled, appropriately, "Overconfidence."
It is not only that simple statistical models are superior to human punditry, but, as Tetlock [161] established, the most confident and well-known human analysts actually tend to make the worst predictions. Furthermore, in most fields, the past performance of analysts (notoriously, the well-compensated stock-pickers of managed mutual funds) provides essentially no guide to their future performance.
At present, there is very little likelihood that human punditry, particularly the
opinionated self-assurance so valued in the popular media, will be completely
replaced by the unblinking assessments of computer programs, whether on 24-hour
news channels or in brainstorming sessions in windowless conference rooms.
Humans are social animals with exquisite skills at persuasion and manipulation;
computer programs simply are far more likely to provide the correct answer in an
inexpensive, consistent and transparent manner.
Yet with the vast increase in the availability of data, computational power, and
the resulting refinement of methodological techniques, there is some change in the
works. While the covers of investment magazines are adorned with the latest well-coiffed guru who by blind luck has managed to have an unusually good year, in
fact algorithmic trading now accounts for all but a small fraction of the activity on
financial exchanges. Weather forecasts are presented on television by jocular and
attractive individuals with the apparent intelligence of tadpoles, but the forecasts
themselves are the result of numerical models processing massive quantities of data.
In the political realm, sophisticated public opinion polling replaced the intuitive
hunches of experts several decades ago, and polls are so rarely incorrect that it is a
major news event when they fail. Even that last bastion of intuitive manliness, the
assessment of athletes, can be trumped by statistical models, as documented in the
surprisingly popular book and movie Moneyball [100].
Similar progress in the adoption of models forecasting political violence, particularly terrorism, is likely to be much slower. As we have stressed repeatedly, one of the major challenges of this research is that political violence is a rare event, and acts of terrorism in otherwise peaceful situations (the bolt out of the blue of Oklahoma City, 9/11/2001, Madrid 2004 and Utøya Island, Norway in 2011) are among the rarest. Consequently, even if the statistical models are more accurate (and there is every reason to believe that they will be), establishing this will take far longer than is required in a field where new predictions can be assessed by the day, or even by the minute. In addition, a long series of psychological studies have shown that human risk assessment is particularly inaccurate (and hence its validity overestimated) in low probability, high risk situations, precisely the domain of
counter-terrorism. Getting the technical assessments in the door, to say nothing of
getting them used properly, will not be an easy task, and the initial applications
will almost certainly be in domains where the events of interest are more frequent,
as we are already seeing with the success of the Political Instability Task Force.
But as this chapter has illustrated, the challenges are well understood, a plethora of
References
1. Abbott A (1995) Sequence analysis: new methods for old ideas. Ann Rev Sociol 21:93113
2. Abdi H, Williams LJ (2010) Principal components analysis. Wiley Interdiscip Rev 2(4):
433459
3. Achen C (2002) Toward a new political methodology: microfoundations and ART. Ann Rev
Pol Sci 5:423450
4. Andriole SJ, Hopple GW (1984) The rise and fall of events data: from basic research to
applied use in the U.S. Department of Defense. Int Interact 10(3/4):293309
5. Anselin L (1988) Spatial econometrics: methods and models. Kluwer, Dordrecht
6. Asal V, Pate A, Wilkenfeld J (2008) Minorities at risk organizational behavior data and
codebook version 92008. http://www.cidcm.umd.edu/mar/data.asp
7. Azad S, Gupta A (2011) A quantitative assessment on 26/11 mumbai attack using social
network analysis. J Terror Res 2(2):414
8. Azar EE (1980) The conflict and peace data bank (COPDAB) project. J Confl Resolut 24:
143152
9. Azar EE, Sloan T (1975) Dimensions of interaction. University Center for International
Studies, University of Pittsburgh, Pittsburgh
10. Beck N, Katz JN (1995) What to do (and not to do) with time-series cross-section data. Am Pol Sci Rev 89:634–647
11. Beck N, Katz JN, Tucker R (1998) Taking time seriously: time-series-cross-section analysis
with a binary dependent variable. Am J Pol Sci 42(4):12601288
12. Beck N, King G, Zeng L (2000) Improving quantitative studies of international conflict: a
conjecture. Am Pol Sci Rev 94(1):2135
13. Beger A, DeMeritt JH, Hwang W, Moore WH (2011) The split population logit (spoplogit):
modeling measurement bias in binary data. http://ssrn.com/abstract=1773594. Working paper
14. Benini AA, Moulton LH (2004) Civilian victims in an asymmetrical conflict: operation
enduring freedom, afghanistan. J Peace Res 41(4):403422
15. Bennett DS, Stam AC (2004) The behavioral origins of war. University of Michigan Press,
Ann Arbor
16. Berk R (2009) The role of race in forecasts of violent crime. Race Soc Probl 1(4):231242
17. Berk R, Sherman L, Barnes G, Kurtz E, Ahlman L (2007) Forecasting murder within a population of probationers and parolees: a high stakes application of statistical learning. J R Stat Soc 172(1):191–211
18. Besley T, Persson T (2009) Repression or civil war? Am Econ Rev 99(2):292297
19. Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 2:9931022
20. Bond D, Jenkins JC, Taylor CLT, Schock K (1997) Mapping mass political conflict and civil
society: Issues and prospects for the automated development of event data. J Confl Resolut
41(4):553579
21. Bond J, Petroff V, O'Brien S, Bond D (2004) Forecasting turmoil in Indonesia: an application
of hidden Markov models. Presented at the International Studies Association Meetings,
Montreal
22. Box GE, Jenkins GM, Reinsel GC (1994) Time series analysis: forecasting and control, 3rd
edn. Prentice Hall, Englewood Cliffs
23. Box-Steffensmeier J, Reiter D, Zorn C (2003) Nonproportional hazards and event history
analysis in international relations. J Confl Resolut 47(1):3353
24. Box-Steffensmeier JM, Jones BS (2004) Event history modeling: a guide for social scientists.
Analytical methods for social research. Cambridge University Press, Cambridge
25. Brandt PT, Colaresi MP, Freeman JR (2008) The dynamics of reciprocity, accountability and
credibility. J Confl Resolut 52(3):343374
26. Brandt PT, Freeman JR (2006) Advances in Bayesian time series modeling and the study of
politics: Theory testing, forecasting, and policy analysis. Pol Anal 14(1):136
27. Brandt PT, Freeman JR, Schrodt PA (2011) Real time, time series forecasting of inter- and
intra-state political conflict. Confl Manage Peace Sci 28(1):4164
28. Brandt PT, Sandler T (2009) Hostage taking: understanding terrorism event dynamics.
J Policy Model 31(5):758778
29. Brandt PT, Sandler T (2010) What do transnational terrorists target? Has it changed? Are we
safer? J Confl Resolut 54(2):214236
30. Brandt PT, Williams JT (2007) Multiple time series models. Sage, Thousand Oaks
31. Brochet J, Marie-Paule-Lefrance, Guidicelli V (2008) Imgt/v-quest: the highly customized
and integrated system for ig and tr standardized v-j and v-d-j sequence analysis. Nucl Acids
Res 36:387395
32. Brown D, Dalton J, Hoyle H (2004) Spatial forecast methods for terrorist events in urban
environments. Intell Secur Inform 3073:426435
33. Buhaug H (2006) Relative capability and rebel objective in civil war. J Peace Res 43(6):
691708
34. Buhaug H, Lujala P (2005) Accounting for scale: measuring geography in quantitative studies
of civil war. Pol Geogr 24(4):399418
35. Buhaug H, Rod JK (2006) Local determinants of African civil wars, 19702001. Pol Geogr
25(6):315335
36. Burgess PM, Lawton RW (1972) Indicators of international behavior: an assessment of events
data research. Sage, Beverly Hills
37. Cederman LE, Gleditsch KS (2009) Introduction to special issue of disaggregating civil
war. J Confl Res 24(4):590617
38. Choi K, Asal V, Wilkenfeld J, Pattipati KR (2012) Forecasting the use of violence by
ethno-political organizations: middle eastern minorities and the choice of violence. In:
Subrahmanian V (ed) Handbook on computational approaches to counterterrorism. Springer,
New York
39. Choucri N, Robinson TW (eds) (1979) Forecasting in international relations: theory, methods,
problems, prospects. Freeman, San Francisco
40. Clark DH, Regan PM (2003) Opportunities to fight: a statistical technique for modeling
unobservable phenomena. J Confl Resolut 47(1):94115
41. Cunningham DE (2011) Barriers to Peace in Civil War. Cambridge University Press,
Cambridge
42. Cunningham DE, Gleditsch KS, Salehyan I (2009) It takes two: a dyadic analysis of civil war
duration and outcome. J Confl Resolut 53(4):570597
43. Cutler DR, Edwards TC Jr, Beard KH, Cutler A, Hess KT, Gibson J, Lawler J (2007) Random
forests for classification in ecology. Ecology 88(11):27832792
44. Daly JA, Andriole SJ (1980) The use of events/interaction research by the intelligence
community. Policy Sci 12:215236
45. Davies JL, Gurr TR (eds) (1998) Preventive measures: building risk assessment and crisis
early warning. Rowman and Littlefield, Lanham
46. Dereux J, Haeberli P, Smithies O (1984) A comprehensive set of sequence analysis programs
for the vax. Nucl Acids Res 12(1):387395
47. Desmarais BA, Cranmar SJ (2011) Forecasting the locational dynamics of transnational terrorism: a network analysis approach. In: Paper presented at the 2011 European
Intelligence and Security Conference, Athens. http://ieeexplore.ieee.org/xpls/abs all.jsp?
arnumber=6061174&tag=1
48. Dickerson JP, Simari GI, Subrahmanian VS (2012) Using temporal probabilistic rules to
learn group behavior. In: Subrahmanian V (ed) Handbook on computational approaches to
counterterrorism. Springer, New York
49. D'Orazio V, Landis ST, Palmer G, Schrodt PA (2011) Separating the wheat from the chaff: application of support vector machines to MID4 text classification. Paper presented at the annual meeting of the Midwest Political Science Association, Chicago
50. D'Orazio V, Yonamine J, Schrodt PA (2011) Predicting intra-state conflict onset: an events data approach using Euclidean and Levenshtein distance measures. Presented at the Midwest Political Science Association, Chicago. Available at http://eventdata.psu.edu
51. Duval RD, Thompson WR (1980) Reconsidering the aggregate relationship between size,
economic development, and some types of foreign policy behavior. Am J Pol Sci 24(3):
511525
52. Enders W, Parise GF, Sandler T (1992) A time-series analysis of transnational terrorism. Def Peace Econ 3(4):305–320
53. Enders W, Sandler T (1993) The effectiveness of anti-terrorism policies: a vector-autoregression-intervention analysis. Am Pol Sci Rev 87(4):829–844
54. Enders W, Sandler T (2000) Is transnational terrorism becoming more threatening? A time-series investigation. J Confl Resolut 44(3):307–332
55. Enders W, Sandler T (2005) After 9/11: is it all different now? J Confl Resolut 49(2):259277
56. Enders W, Sandler T (2006) Distribution of transnational terrorism among countries by
income classes and geography after 9/11. Int Stud Q 50(2):367393
57. Enders W, Sandler T (2006) The political economy of terrorism. Cambridge University Press,
Cambridge
58. Enders W, Sandler T, Cauley J (1990) Assessing the impact of terrorist-thwarting policies: an
intervention time series approach. Def Peace Econ 2(1):118
59. Enders W, Sandler T, Gaibulloev K (2011) Domestic versus transnational terrorism: Data,
decomposition, and dynamics. J Peace Res 48(3):319337
60. Esty DC, Goldstone JA, Gurr TR, Harff B, Levy M, Dabelko GD, Surko P, Unger AN
(1998) State failure task force report: phase II findings. Science Applications International
Corporation, McLean
61. Esty DC, Goldstone JA, Gurr TR, Surko P, Unger AN (1995) State failure task force report.
Science Applications International Corporation, McLean
62. Freedom House (2009) Freedom in the world. http://www.freedomhouse.org/reports
63. Freeman JR (1989) Systematic sampling, temporal aggregation, and the study of political
relationships. Pol Anal 1:6198
64. Gaddis JL (1992) International relations theory and the end of the cold war. Int Secur 17(3):5–58
65. Gaddis JL (1992) The United States and the end of the cold war. Oxford University Press,
New York
66. Gassebner M, Luechinger S (2011) Lock, stock, and barrel: a comprehensive assessment of
the determinants of terror. Public Choice 149:235261
67. Gerner DJ, Schrodt PA, Francisco RA, Weddle JL (1994) The machine coding of events from
regional and international sources. Int Stud Q 38:91119
68. Gerner DJ, Schrodt PA, Yilmaz Ö (2009) Conflict and mediation event observations (CAMEO) codebook. http://eventdata.psu.edu/data.dir/cameo.html
69. Ghahramani Z (2004) Unsupervised learning. In: Advanced lectures on machine learning.
Springer, Berlin/New York, pp 72112
70. Gill J (1999) The insignificance of null hypothesis significance testing. Pol Res Q 52(3):
647674
71. Gill J (2003) Bayesian methods: a social and behavioral sciences approach. Chapman and
Hall, Boca Raton
72. Gleditsch NP (2012) Special issue: event data. Int Interact 38(4)
73. Goldstein JS (1992) A conflict-cooperation scale for WEIS events data. J Confl Resolut
36:369385
74. Goldstone JA, Bates R, Epstein DL, Gurr TR, Lustik M, Marshall, MG, Ulfelder J, Woodward
M (2010) A global model for forecasting political instability. Am J Pol Sci 54(1):190208
75. Gurr TR, Harff B (1994) Conceptual, research and policy issues in early warning research: an
overview. J Ethno-Dev 4(1):315
76. Gurr TR, Lichbach MI (1986) Forecasting internal conflict: A competitive evaluation of
empirical theories. Comp Pol Stud 19:338
77. Hamerli A, Gattiker R, Weyermann R (2006) Conflict and cooperation in an actors network
of Chechnya based on event data. J Confl Resolut 50(159):159175
78. Hamilton J (1994) Time series analysis. Princeton University Press, Princeton
79. Harff B, Gurr TR (2001) Systematic early warning of humanitarian emergencies. J Peace Res
35(5):359371
80. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Springer,
New York
81. Hegre H, Ellingson T, Gates S, Gleditsch NP (2001) Toward a democratic civil peace?
democracy, political change, and civil war, 18161992. J Policy Model 95(1):3448
82. Hegre H, Ostby G, Raleigh C (2009) Property and civil war: a disaggregated study of liberia.
J Confl Resolut 53(4):598623
83. Hill DW, Bagozzi BE, Moore WH, Mukherjee B (2011) Strategic incentives and modeling
bias in ordinal data:the zero-inflated ordered probit (ziop) model in political science. In: Paper
presented at the new faces in political methodology meeting, Penn State, 30 April 2011. http://
qssi.psu.edu/files/NF4Hill.pdf
84. Holmes JS, Pineres SAGD, Curtina KM (2007) A subnational study of insurgency: farc
violence in the 1990s. Stud Confl Terror 30(3):249265
85. Hopple GW, Andriole SJ, Freedy A (eds) (1984) National security crisis forecasting and
management. Westview, Boulder
86. Hosmer DW, Lemeshow S, May S (2008) Applied survival analysis: regression modeling of
time to event data. Series in probability and statistics. Wiley, New York
87. Hudson V (ed) (1991) Artificial intelligence and international politics. Westview, Boulder
88. Hudson VM, Schrodt PA, Whitmer RD (2008) Discrete sequence rule models as a social
science methodology: an exploratory analysis of foreign policy rule enactment within
Palestinian-Israeli event data. Foreign Policy Anal 4(2):105126
89. International Monetary Fund. Statistics Dept. (2009) Direction of trade statistics yearbook,
2008. http://www.imf.org/external/pubs/cat/longres.cfm?sk=22053.0
90. Inyaem U, Meesad P, Haruechaiyasak C (2009) Named-entity techniques for terrorism event
extraction and classification. In: Paper presented at the eighth international symposium on
natural language processing. http://ieeexplore.ieee.org/xpls/abs all.jsp?arnumber=5340924&
tag=1
91. Jackman RW, Miller RA (1996) A renaissance of political culture? Am J Pol Sci 40(3):
632659
92. Jackman S (2009) Bayesian analysis for the social sciences. Wiley, Chichester
93. Jenkins CJ, Bond D (2001) Conflict carrying capacity, political crisis, and reconstruction.
J Confl Resolut 45(1):331
94. Kahneman D (2011) Thinking fast and slow. Farrar, Straus and Giroux, New York
95. Khong YF (1992) Analogies at war. Princeton University Press, Princeton
96. King G (1986) How not to lie with statistics: avoiding common mistakes in quantitative
political science. Am J Pol Sci 30(3):666687
97. King G, Zeng L (2001) Logistic regression in rare events data. Pol Anal 9(2):1254
98. Lee K, Booth D, Alam P (2005) A comparison of supervised and unsupervised neural
networks in predicting bankruptcy of korean firms. Expert Syst Appl 29:116
99. Leng RJ (1987) Behavioral correlates of war, 18161975. (ICPSR 8606). Inter-University
Consortium for Political and Social Research, Ann Arbor
119. Öcal N, Yildirim J (2010) Regional effects of terrorism on economic growth in Turkey: a geographically weighted regression. J Peace Res 47(4):477–489
120. Oskar Engene J (2007) Five decades of terrorism in europe: the tweed dataset. J Peace Res
44(1):109121
121. Pearl J (2009) Understanding propensity scores. In: Causality: models, reasoning, and
inference, 2nd edn. Cambridge University Press, Cambridge/New York
122. Perliger A, Pedahzur A (2011) Social network analysis in the study of terrorism and political
violence. Pol Sci Pol 44(1):4550
123. Petroff V, Bond J, Bond D (2012) Using hidden markov models to predict terror before it hits
(again). In: Subrahmanian V (ed) Handbook on computational approaches to counterterrorism. Springer, New York
124. Pevehouse JC, Brozek J (2010) Time series analysis in political science. In: BoxSteffensmeier J, Brady H, Collier D (eds) Oxford handbook of political Methodology. Oxford
University Press, New York
125. Pevehouse JC, Goldstein JS (1999) Serbian compliance or defiance in Kosovo? Statistical
analysis and real-time predictions. J Confl Resolut 43(4):538546
126. Piazza J (2011) Poverty, minority economic discrimination and domestic terrorism. J Peace
Res 48(3):339353
127. Piazza J, Walsh JI (2009) Transnational terrorism and human rights. Int Stud Q 53:125148
128. Porter MD, Brown DE (2007) Detecting local regions of change in high-dimensional criminal
or terrorist point processes. Comput Stat Data Anal 51(5):27532768
129. Raleigh C, Linke A, Hegre H, Karlsen J (2010) Introducing ACLED: an armed conflict
location and event dataset. J Peace Res 47(5):651660
130. Regan P, Clark D (2010) The institutions and elections project data collection. http://www2.
binghamton.edu/political-science/institutions-and-elections%-project.html
131. Richardson LF (1960) Statistics of deadly quarrels. Quadrangle, Chicago
132. Rupesinghe K, Kuroda M (eds) (1992) Early warning and conflict resolution. St. Martin's,
New York
133. Schrodt PA (2000) Pattern recognition of international crises using hidden Markov models.
In: Richards D (ed) Political complexity: nonlinear models of politics. University of Michigan
Press, Ann Arbor, pp 296328
134. Schrodt PA (2004) Detecting united states mediation styles in the middle east, 19791998.
In: Maoz Z, Mintz A, Morgan TC, Palmer G, Stoll RJ (eds) Multiple paths to knowledge in
international relations. Lexington Books, Lexington, pp 99124
135. Schrodt PA (2006) Forecasting conflict in the Balkans using hidden Markov models. In:
Trappl R (ed) Programming for peace: computer-aided methods for international conflict
resolution and prevention. Kluwer, Dordrecht, pp 161184
136. Schrodt PA (2007) Inductive event data scaling using item response theory. Presented at the
summer meeting of the society of political methodology. Available at http://eventdata.psu.edu
137. Schrodt PA (2010) Seven deadly sins of contemporary quantitative analysis. Presented at the
american political science association Meetings, Washington, DC. http://eventdata.psu.edu/
papers.dir/Schrodt.7Sins.APSA10.pdf
138. Schrodt PA, Gerner DJ (1994) Validity assessment of a machine-coded event data set for the
Middle East, 19821992. Am J Pol Sci 38:825854
139. Schrodt PA, Gerner DJ (1997) Empirical indicators of crisis phase in the Middle East,
19791995. J Confl Resolut 25(4):803817
140. Schrodt PA, Gerner DJ (2004) An event data analysis of third-party mediation. J Confl Resolut
48(3):310330
141. Schrodt PA, Gerner DJ, Abu-Jabr R, Yilmaz Ö, Simpson EM (2001) Analyzing the dynamics of international mediation processes in the Middle East and Balkans. Presented at the American Political Science Association meetings, San Francisco
142. Schrodt PA, Palmer G, Hatipoglu ME (2008) Automated detection of reports of militarized
interstate disputes using the svm document classification algorithm. Paper presented at
American Political Science Association, San Francisco
143. Schrodt PA, Van Brackle D (2012) Automated coding of political event data. In: Subrahmanian V (ed) Handbook on computational approaches to counterterrorism. Springer, New York
144. Shearer R (2006) Forecasting Israeli-Palestinian conflict with hidden Markov models.
Available at http://eventdata.psu.edu/papers.dir/Shearer.IP.pdf
145. Shellman S (2000) Process matters: conflict and cooperation in sequential government-dissident interactions. J Confl Resolut 15(4):563–599
146. Shellman S (2004) Time series intervals and statistical inference: the effects of temporal
aggregation on event data analysis. Secur Stud 12(1):97104
147. Shellman S, Hatfield C, Mills M (2010) Disaggregating actors in intrastate conflict. J Peace
Res 47(1):8390
148. Shellman S, Stewart B (2007) Predicting risk factors associated with forced migration: An
early warning model of Haitian flight. Civil Wars 9(2):174199
149. Silke A (ed) (2004) Research on terrorism: trends, achievements and failures. Frank Cass,
London
150. Silke A (2009) Contemporary terrorism studies: issues in research. In: Jackson R, Smyth MB,
Gunning J (eds) Critical terrorism studies: a new research agenda. Routledge, London
151. Silva A, Simari G, Martinez V, Subrahmanian VS (2012) SOMA: Stochastic opponent
modeling agents for forecasting violent behavior. In: Subrahmanian V (ed) Handbook on
computational approaches to counterterrorism. Springer, New York
152. Simari GI, Earp D, Martinez MV, Silva A, Subrahmanian VS (2012) Forecasting grouplevel actions using similarity measures. In: Subrahmanian V (ed) Handbook on computational
approaches to counterterrorism. Springer, New York
153. Simi P (2010) Operation and structure of right-wing extremist groups in the united states,
19802007. http://www.icpsr.umich.edu/icpsrweb/NACJD/studies/25722/detail
154. Sims CA (1980) Macroeconomics and reality. Econometrica 48(1):1–48
155. Sims CA, Waggoner DF, Zha TA (2008) Methods for inference in large multiple-equation
Markov-switching models. J Econom 146(2):255274
156. Siroky DS (2009) Navigating random forests and related advances in algorithmic modeling.
Stat Surv 3:147163
157. Sliva A, Subrahmanian V, Martinez V, Simari G (2009) CAPE: automatically predicting changes in group behavior. In: Memon N, Farley JD, Hicks DL, Rosenorn T (eds) Mathematical methods in counterterrorism. Springer/Wien, Norderstedt, pp 253–269
158. Sun S, Zhang C (2007) The selective random subspace predictor for traffic flow forecasting.
IEEE Trans Intell Transp Syst 8(2):367373
159. Svolik MW (2008) Authoritarian reversals and democratic consolidation. Am Pol Sci Rev
102(2):153168
160. Sylvan DA, Chan S (1984) Foreign policy decision making: perception, cognition and
artificial intelligence. Praeger, New York
161. Tetlock PE (2005) Expert political judgment: how good is it? How can we know? Princeton
University Press, Princeton
162. Urdal H (2008) Population, resources and violent conflict: a sub-national study of India 1956–2002. J Confl Resolut 52(4):590–617
163. U.S. National Counterterrorism Center (2009) Worldwide incidents tracking system (wits).
http://wits.nctc.gov/
164. Ward MD, Gleditsch KS (2002) Location, location, location: an mcmc approach to modeling
the spatial context of war and peace. Pol Anal 10(3):244260
165. Ward MD, Greenhill BD, Bakke KM (2010) The perils of policy by p-value: predicting civil
conflicts. J Peace Res 47(5):363375
166. Weidman NB, Ward MD (2010) Predicting conflict in space and time. J Confl Resolut
54(6):883901
167. Weidmann NB, Toft MD (2010) Promises and pitfalls in the spatial prediction of ethnic
violence: a comment. Confl Manage Peace Sci 27(2):159176
168. Xiang J (2010) Relevance as a latent variable in dyadic analysis of conflict. J Pol 72(2):
484498
169. Young JK, Findley MG (2011) Promise and pitfalls of terrorism research. Int Stud Rev
13:411431
170. Yonamine JE (2012) Working with event data: a guide to aggregation. Available at http://jayyonamine.com/wp-content/uploads/2012/06/Working-with-Event-Data-AGuide-to-Aggregation-Choices.pdf