
A temporal-probabilistic database model for information extraction

Published: 01 September 2013

Abstract

Temporal annotations of facts are a key component both for building a high-accuracy knowledge base and for answering queries over the resulting temporal knowledge base with high precision and recall. In this paper, we present a temporal-probabilistic database model for cleaning uncertain temporal facts obtained from information extraction methods. Specifically, we combine temporal deduction rules, temporal consistency constraints, and probabilistic inference based on the common possible-worlds semantics with data lineage, and we study the theoretical properties of this data model. We further develop a query engine that scales to very large temporal knowledge bases, with nearly interactive query response times over millions of uncertain facts and hundreds of thousands of grounded rules. Our experiments over two real-world datasets demonstrate the increased robustness of our approach compared to related techniques based on constraint solving via Integer Linear Programming (ILP) and probabilistic inference via Markov Logic Networks (MLNs). We also show that our runtime performance is more than competitive with current ILP solvers and with the fastest available probabilistic (but non-temporal) database engines.
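To make the ingredients named in the abstract more concrete, the following is a minimal sketch of possible-worlds semantics over a handful of uncertain temporal facts. It is not the paper's algorithm or engine: the facts, their probabilities, the lineage formula, and the single consistency constraint are all invented for illustration, the facts are assumed to be independent, and conditioning on the constraint is only one common way such constraints can be interpreted. Brute-force enumeration of worlds is used purely for clarity and would not scale.

```python
from itertools import product

# Hypothetical uncertain temporal facts (all values are illustrative):
# fact id -> (statement, half-open validity interval [begin, end), probability)
FACTS = {
    "f1": ("bornIn(Ann, Paris)", (1970, 1971), 0.9),
    "f2": ("bornIn(Ann, Rome)", (1970, 1971), 0.4),
    "f3": ("livesIn(Ann, Paris)", (1990, 2000), 0.7),
}


def overlaps(a, b):
    """Return True if half-open intervals a and b share at least one time point."""
    return a[0] < b[1] and b[0] < a[1]


def consistent(world):
    """Assumed constraint: f1 and f2 must not both hold over overlapping intervals."""
    if world["f1"] and world["f2"]:
        return not overlaps(FACTS["f1"][1], FACTS["f2"][1])
    return True


def lineage(world):
    """Lineage of a hypothetical derived fact: it holds iff f1 AND f3 hold."""
    return world["f1"] and world["f3"]


def world_probability(world):
    """Probability of one world, assuming the base facts are independent."""
    p = 1.0
    for fact_id, present in world.items():
        prob = FACTS[fact_id][2]
        p *= prob if present else 1.0 - prob
    return p


def conditional_probability(query, constraint):
    """P(query | constraint), computed by enumerating all 2^n possible worlds."""
    ids = sorted(FACTS)
    numerator = 0.0
    denominator = 0.0
    for bits in product([False, True], repeat=len(ids)):
        world = dict(zip(ids, bits))
        if not constraint(world):
            continue  # worlds violating the constraint are discarded
        p = world_probability(world)
        denominator += p
        if query(world):
            numerator += p
    return numerator / denominator


PROB = conditional_probability(lineage, consistent)
print(f"P(derived fact | constraint) = {PROB:.6f}")
```

In this toy setting the two conflicting birth facts cannot coexist, so the mass of the worlds containing both is removed and the remaining worlds are renormalized before the lineage formula is evaluated.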

Published In

Proceedings of the VLDB Endowment, Volume 6, Issue 14
September 2013
384 pages

Publisher

VLDB Endowment
