Abstract
We present a unified framework based on supervised sequence labelling methods to identify and extract uncertainty cues, holders, and scopes in one fell swoop, with an application to Arabic tweets. The underlying technology employs Support Vector Machines with a rich set of morphological, syntactic, lexical, semantic, pragmatic, dialectal, and genre-specific features, and yields an average F1 score of 0.759.
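To make the sequence labelling formulation concrete, the following is a minimal sketch of how cue, holder, and scope spans could be cast as one tag per token, so that a single token-level classifier predicts all three at once. The BIO tag scheme, the label names (HOLDER, CUE, SCOPE), the span format, the non-overlap restriction, and the English example sentence are all illustrative assumptions; the abstract does not specify the encoding actually used.

```python
from typing import List, Tuple

# Hypothetical span format (an assumption, not from the paper):
# (start_token, end_token_exclusive, label)
Span = Tuple[int, int, str]


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping labelled spans as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(t != "O" for t in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into (start, end, label) spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Illustrative English sentence; the paper itself works on Arabic tweets.
tokens = ["I", "think", "the", "match", "is", "cancelled"]
gold = [(0, 1, "HOLDER"), (1, 2, "CUE"), (2, 6, "SCOPE")]
tags = spans_to_bio(len(tokens), gold)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

With this kind of encoding, each token's tag becomes the target of a per-token classifier (an SVM in the paper's case), and the predicted tags are decoded back into spans with `bio_to_spans`. Real uncertainty scopes can overlap or nest, which this simplified sketch deliberately rejects.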
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Al-Sabbagh, R., Girju, R., Diesner, J. (2015). A Unified Framework to Identify and Extract Uncertainty Cues, Holders, and Scopes in One Fell-Swoop. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_24
DOI: https://doi.org/10.1007/978-3-319-18111-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18110-3
Online ISBN: 978-3-319-18111-0
eBook Packages: Computer Science (R0)