Abstract
Political event data have long been used in the quantitative study of international politics, dating back to the early efforts of Edward Azar’s COPDAB [1] and Charles McClelland’s WEIS [18] as well as a variety of more specialized efforts such as Leng’s BCOW [16]. By the late 1980s, the NSF-funded Data Development in International Relations project [20] had identified event data as the second most common form of data (behind the various Correlates of War data sets) used in quantitative studies. The 1990s saw the development of two practical automated event data coding systems, the NSF-funded KEDS (http://eventdata.psu.edu; [9, 31, 33]) and the proprietary VRA-Reader (http://vranet.com; [15, 27]), and the 2000s saw the development of two new political event coding ontologies, CAMEO [34] and IDEA [4, 27], designed for implementation in automated coding systems. A summary of the current status of political event projects, as well as detailed discussions of some of these, can be found in [10, 32].
Notes
- 1.
Individual coders, particularly working for short periods of time, can of course reliably code much faster than this. But for the overall labor requirements—that is, the total time invested in the enterprise divided by the resulting useable events—the six events per hour is a pretty good rule of thumb and—like the labor requirements of a string quartet—has changed little over time.
- 2.
The count of “stories” has varied continually as we’ve updated the downloads, modified the filters and so forth, and so an exact count is both unavailable and irrelevant. But it starts at around eight to nine million.
- 3.
We’ve actually identified about 75 distinct sources in the stories, presumably the result of quirks in the LN search engine. However, these additional sources generate only a small number of stories, and by far the bulk of the stories come from the sources we had deliberately identified.
- 4.
This will not, however, catch spelling corrections in the first 48 characters. In the Reuters-based filtering for the KEDS project, we counted the frequency of each letter in the lead sentence and identified a duplicate when the absolute distance between those vectors for two stories was below a threshold, $\sum_i |x_i - y_i| < \eta$, where $\eta$ was usually around 10. This caught spelling and date corrections, the most common source of duplicates in Reuters, but failed on AFP, which tends to expand the details in a sentence as more information becomes available.
- 5.
Notably to traders—carbon-based and silicon-based—in the financial sector, which drives much if not most of the international reporting. The likelihood of an event being reported is very much proportional to the possibility that someone can make or lose money on it.
- 6.
The phrase “cue category” refers to the broad two-digit codes, as opposed to the more specific three- and four-digit subcategories.
- 7.
To date, all of the successful automated event data coding systems are dictionary- and rule-based rather than statistical: see [36]. While statistical methods would certainly be attractive, and seem to work on highly simplified “toy problems” such as those in [6], all of the successfully deployed systems to date are dictionary-based, and numerous efforts to scale initially promising statistical methods have failed.
- 8.
Including, at the request of the sponsor, some bugs in TABARI, though after the equivalence of the two systems was demonstrated, these were corrected in both systems.
- 9.
In principle these enhancements could also be applied to Jabari-NLP, though it is running in secure military systems rather than open environments and to date has made less use of cluster processing.
- 10.
Though we’ve not been able to locate this on the web, which is itself interesting.
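The letter-frequency duplicate check described in note 4 can be sketched roughly as follows. This is a minimal illustration, not the KEDS filter itself: the function names, the lower-casing, the restriction to the 26 ASCII letters, the example sentences, and the reading that two leads count as duplicates when their distance falls below η are all assumptions made for the sketch.

```python
from collections import Counter
import string


def letter_counts(sentence: str) -> Counter:
    """Count each ASCII letter in a lead sentence, ignoring case and non-letters."""
    return Counter(ch for ch in sentence.lower() if ch in string.ascii_lowercase)


def letter_distance(a: str, b: str) -> int:
    """Absolute (L1) distance between the letter-count vectors of two sentences."""
    x, y = letter_counts(a), letter_counts(b)
    # A Counter returns 0 for letters that never occur, so missing keys are safe.
    return sum(abs(x[ch] - y[ch]) for ch in string.ascii_lowercase)


def is_duplicate(a: str, b: str, eta: int = 10) -> bool:
    """Assumed rule: treat two leads as duplicates when the distance is below eta."""
    return letter_distance(a, b) < eta


# Invented example: a lead sentence and a copy with one spelling error.
lead = "The foreign minister arrived in Geneva on Tuesday for talks."
misspelled = "The foreign minister arived in Geneva on Tuesday for talks."
print(letter_distance(lead, misspelled))  # 1
print(is_duplicate(lead, misspelled))     # True
```

A lead that has been expanded with new details, as the note reports for AFP, adds many letters at once, so its distance from the earlier version can easily exceed the threshold and the pair is not flagged.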
References
Azar EE (1980) The conflict and peace data bank (COPDAB) project. J Confl Resolut 24: 143–152
Azar EE (1982) The codebook of the conflict and peace data bank (COPDAB). Center for International Development, University of Maryland, College Park
Azar EE, Sloan T (1975) Dimensions of interaction. University Center for International Studies, University of Pittsburgh, Pittsburgh
Bond D, Bond J, Oh C, Jenkins JC, Taylor CL (2003) Integrated data for events analysis (IDEA): An event typology for automated events data development. J Peace Res 40(6): 733–745
Bond D, Jenkins JC, Taylor CLT, Schock K (1997) Mapping mass political conflict and civil society: Issues and prospects for the automated development of event data. J Confl Resolut 41(4):553–579
Boschee E, Natarajan P, Weischedel R (2012) Automatic extraction of events from open source text for predictive forecasting. In: Subrahmanian V (ed) Handbook on computational approaches to counterterrorism. Springer, New York
Chenoweth E, Dugan L (2012) Rethinking counterterrorism: evidence from Israel. Working Paper, Wesleyan University, Middletown, CT
Dugan L, Chenoweth E (2012) Moving beyond deterrence: the effectiveness of raising the expected utility of abstaining from terrorism in Israel. Working Paper, University of Maryland, College Park, MD
Gerner DJ, Schrodt PA, Francisco RA, Weddle JL (1994) The machine coding of events from regional and international sources. Int Stud Q 38:91–119
Gleditsch NP (2012) Special issue: event data in the study of conflict. Int Interact 38(4): 375–569
Goldstein JS (1992) A conflict-cooperation scale for WEIS events data. J Confl Resolut 36:369–385
Howell LD (1983) A comparative study of the WEIS and COPDAB data sets. Int Stud Q 27:149–159
Jenkins JC, Bond D (2001) Conflict carrying capacity, political crisis, and reconstruction. J Confl Resolut 45(1):3–31
Kahneman D (2011) Thinking fast and slow. Farrar, Straus and Giroux, New York
King G, Lowe W (2004) An automated information extraction tool for international conflict data with performance as good as human coders: A rare events evaluation design. Int Organ 57(3):617–642
Leng RJ (1987) Behavioral correlates of war, 1816–1975. (ICPSR 8606). Inter-University Consortium for Political and Social Research, Ann Arbor
McClelland CA (1967) World-event-interaction-survey: a research project on the theory and measurement of international interaction and transaction. University of Southern California, Los Angeles, CA
McClelland CA (1976) World event/interaction survey codebook (ICPSR 5211). Inter-University Consortium for Political and Social Research, Ann Arbor
McClelland CA (1983) Let the user beware. Int Stud Q 27(2):169–177
Merritt RL, Muncaster RG, Zinnes DA (eds) (1993) International event data developments: DDIR phase II. University of Michigan Press, Ann Arbor
Mikhaylov S, Laver M, Benoit K (2012) Coder reliability and misclassification in the human coding of party manifestos. Political Anal 20(1):78–91
Mooney B, Simpson B (2003) Breaking news: how the wheels came off at Reuters. Capstone, Mankato
Nardulli P (2011) The social, political and economic event database project (SPEED). http://www.clinecenter.illinois.edu/research/speed.html
Nardulli PF, Leetaru KH, Hayes M (2011) Event data, civil unrest and the SPEED project. Presented at the International Studies Association Meetings, Montréal
O’Brien S (2012) A multi-method approach for near real time conflict and crisis early warning. In: Subrahmanian V (ed) Handbook on computational approaches to counterterrorism. Springer, New York
O’Brien SP (2010) Crisis early warning and decision support: contemporary approaches and thoughts on future research. Int Stud Rev 12(1):87–104
Petroff V, Bond J, Bond D (2012) Using hidden Markov models to predict terror before it hits (again). In: Subrahmanian V (ed) Handbook on computational approaches to counterterrorism. Springer, New York
Ruggeri A, Gizelis TI, Dorussen H (2011) Events data as Bismarck’s sausages? Intercoder reliability, coders’ selection, and data quality. Int Interact 37(1):340–361
Russett BM, Singer JD, Small M (1968) National political units in the twentieth century: a standardized list. Am Political Sci Rev 62(3):932–951
Schrodt PA (1994) Statistical characteristics of events data. Int Interact 20(1–2):35–53
Schrodt PA (2006) Twenty years of the Kansas event data system project. Political Methodol 14(1):2–8
Schrodt PA (2012) Precedents, progress and prospects in political event data. Int Interact 38(5):546–569
Schrodt PA, Gerner DJ (1994) Validity assessment of a machine-coded event data set for the Middle East, 1982–1992. Am J Political Sci 38:825–854
Schrodt PA, Gerner DJ, Yilmaz Ö (2009) Conflict and mediation event observations (CAMEO): an event data framework for a post Cold War world. In: Bercovitch J, Gartner S (eds) International conflict mediation: new approaches and findings. Routledge, New York
Schrodt PA, Palmer G, Hatipoglu ME (2008) Automated detection of reports of militarized interstate disputes using the SVM document classification algorithm. Paper presented at American Political Science Association, Chicago, IL
Shilliday A, Lautenschlager J (2012) Data for a global ICEWS and ongoing research. In: 2nd international conference on cross-cultural decision making: focus 2012, San Francisco, CA
Tetlock PE (2005) Expert political judgment: how good is it? How can we know? Princeton University Press, Princeton
Van Brackle D, Wedgwood J (2011) Event coding for HSCB modeling: challenges and approaches. In: Human social culture behavior modeling focus 2011, Chantilly, VA
Acknowledgements
This research was supported in part by contracts from the Defense Advanced Research Projects Agency under the Integrated Crisis Early Warning System (ICEWS) program (Prime Contract #FA8650-07-C-7749: Lockheed-Martin Advanced Technology Laboratories) as well as grants from the National Science Foundation (SES-0096086, SES-0455158, SES-0527564, SES-1004414) and by a Fulbright-Hays Research Fellowship for work by Schrodt at the Peace Research Institute, Oslo (http://www.prio.no). The results and findings in no way represent the views of Lockheed-Martin, the Department of Defense, DARPA, or NSF. The research has benefitted from extended discussions and experimentation within the ICEWS team and the KEDS research group at the University of Kansas; we would note in particular contributions from Steve Shellman, Hans Leonard, Brandon Stewart, Jennifer Lautenschlager, Andrew Shilliday, Will Lowe, Steve Purpura, Vladimir Petroff, Baris Kesgin and Matthias Heilke.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Schrodt, P.A., Van Brackle, D. (2013). Automated Coding of Political Event Data. In: Subrahmanian, V. (eds) Handbook of Computational Approaches to Counterterrorism. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5311-6_2
DOI: https://doi.org/10.1007/978-1-4614-5311-6_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-5310-9
Online ISBN: 978-1-4614-5311-6
eBook Packages: Computer Science, Computer Science (R0)