Abstract
Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the “filter bubble” effect, and therefore would be a useful feature in a search engine or browser extension. In order to implement such a feature, however, the binary classification task of determining which topics or webpages are controversial must be solved. Earlier work described a proof of concept using a supervised nearest neighbor classifier with access to an oracle of manually annotated Wikipedia articles. This paper generalizes and extends that concept by taking the human out of the loop, leveraging the rich metadata available in Wikipedia articles in a weakly-supervised classification approach. The new technique we present allows the nearest neighbor approach to be extended on a much larger scale and to other datasets. The results improve substantially over naive baselines and are nearly identical to the oracle-reliant approach by standard measures of F1, F0.5, and accuracy. Finally, we discuss implications of solving this problem as part of a broader subject of interest to the IR community, and suggest several avenues for further exploration in this exciting new space.
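For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how F1, F0.5, and accuracy are conventionally computed from the confusion counts of a binary classifier. It uses the standard textbook definitions; the function names and the example counts are hypothetical and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-beta score from confusion counts.

    beta < 1 weights precision more heavily than recall (e.g. F0.5);
    beta = 1 gives the usual harmonic mean of precision and recall (F1).
    Returns 0.0 when the score is undefined (no positives at all).
    """
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all items that were classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Hypothetical counts for illustration only (not results from the paper):
# 30 controversial pages found, 10 false alarms, 20 missed, 40 correctly cleared.
tp, fp, fn, tn = 30, 10, 20, 40
f1 = f_beta(tp, fp, fn, beta=1.0)
f05 = f_beta(tp, fp, fn, beta=0.5)
acc = accuracy(tp, fp, fn, tn)
print(f"F1={f1:.3f}  F0.5={f05:.3f}  accuracy={acc:.3f}")
```

Because F0.5 favors precision, it penalizes false alarms more than misses, which is a plausible reason to report it for a feature that warns users about controversial pages.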
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dori-Hacohen, S., Allan, J. (2015). Automated Controversy Detection on the Web. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds) Advances in Information Retrieval. ECIR 2015. Lecture Notes in Computer Science, vol 9022. Springer, Cham. https://doi.org/10.1007/978-3-319-16354-3_46
DOI: https://doi.org/10.1007/978-3-319-16354-3_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16353-6
Online ISBN: 978-3-319-16354-3
eBook Packages: Computer Science, Computer Science (R0)