Uncertainty detection in natural language: a probabilistic model

Authors:

Pierre-Antoine Jean,

Sébastien Harispe,

Patrice Bellot,

Jacky Montmain

WIMS '16: Proceedings of the 6th International Conference on Web Intelligence, Mining and Semantics

Article No.: 10, Pages 1 - 10

https://doi.org/10.1145/2912845.2912873

Published: 13 June 2016

Abstract

Approaches that automatically detect uncertain expressions in natural language are central to designing efficient models based on text analysis, particularly in domains such as question answering, approximate reasoning, and knowledge base population. This article gives an overview of several contributions and classifications that define the concept of uncertainty expressions in natural language, and of the related detection methods proposed so far. A new supervised and generic approach is then introduced for this task; it is based on the statistical analysis of multiple lexical and syntactic features, which characterize sentences through vector-based representations that can be analyzed by proven classification methods. The overall performance of the approach is demonstrated and discussed with regard to various dimensions of uncertainty and text specificities.

This method is available for download at https://github.com/pajean/uncertaintyDetection.

Published In

WIMS '16: Proceedings of the 6th International Conference on Web Intelligence, Mining and Semantics

June 2016

309 pages

ISBN: 9781450340564

DOI: 10.1145/2912845

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WIMS '16

WIMS '16: International Conference on Web Intelligence, Mining and Semantics

June 13 - 15, 2016

Nîmes, France

Acceptance Rates

WIMS '16 Paper Acceptance Rate 36 of 53 submissions, 68%;

Overall Acceptance Rate 140 of 278 submissions, 50%
