research-article

Querying RDF data with text annotated graphs

Authors:

Doreen Cheng

SSDBM '15: Proceedings of the 27th International Conference on Scientific and Statistical Database Management

Article No.: 27, Pages 1 - 12

https://doi.org/10.1145/2791347.2791381

Published: 29 June 2015

Abstract

Scientists and casual users need better ways to query RDF databases and Linked Open Data. Using the SPARQL query language requires not only mastering its syntax and semantics but also understanding the RDF data model, the ontology used, and the URIs for entities of interest. Natural language query systems are a powerful alternative, but current techniques are brittle in handling the ambiguity and complexity of natural language, and they require expensive labor to supply the extensive domain knowledge they need. We introduce a compromise in which users provide a graphical "skeleton" for a query and annotate it with freely chosen words, phrases, and entity names. We describe a framework for interpreting these "schema-agnostic queries" over open-domain RDF data that automatically translates them into SPARQL queries. The framework uses semantic textual similarity to find mapping candidates and statistical approaches to learn domain knowledge for disambiguation, thus avoiding the expensive human effort required by natural language interface systems. We demonstrate the feasibility of the approach with an implementation that performs well in an evaluation on DBpedia data.
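To make the idea of translating an annotated skeleton into SPARQL concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written vocabulary of DBpedia-style terms (`CLASSES`, `PROPERTIES`) and substitutes plain string similarity from Python's `difflib` for the semantic textual similarity the abstract describes; the statistical disambiguation step is omitted entirely. The function names `similarity`, `best_match`, and `to_sparql` are hypothetical.

```python
from __future__ import annotations

from difflib import SequenceMatcher

# Hypothetical vocabulary: a few DBpedia-style class and property names.
CLASSES = ["dbo:Book", "dbo:Writer", "dbo:City"]
PROPERTIES = ["dbo:author", "dbo:birthPlace", "dbo:publisher"]


def similarity(phrase: str, term: str) -> float:
    """Stand-in for semantic similarity: string overlap with the term's local name."""
    local_name = term.split(":", 1)[1].lower()
    return SequenceMatcher(None, phrase.lower(), local_name).ratio()


def best_match(phrase: str, candidates: list[str]) -> str:
    """Return the candidate term most similar to the user's phrase."""
    return max(candidates, key=lambda term: similarity(phrase, term))


def to_sparql(nodes: dict[str, str], edges: list[tuple[str, str, str]]) -> str:
    """Translate an annotated skeleton into a SPARQL SELECT query.

    nodes maps a variable name to the user's free-text type annotation;
    edges are (subject variable, relation phrase, object variable) triples.
    """
    # One type pattern per annotated node, one property pattern per annotated edge.
    patterns = [f"?{var} a {best_match(label, CLASSES)} ." for var, label in nodes.items()]
    patterns += [f"?{s} {best_match(rel, PROPERTIES)} ?{o} ." for s, rel, o in edges]
    variables = " ".join(f"?{var}" for var in nodes)
    body = "\n  ".join(patterns)
    return f"SELECT {variables} WHERE {{\n  {body}\n}}"


# A two-node skeleton: "books" connected to "writers" by an edge labelled "authors".
query = to_sparql(
    nodes={"b": "books", "w": "writers"},
    edges=[("b", "authors", "w")],
)
print(query)
```

The user's words ("books", "writers", "authors") need not match the ontology terms exactly; the mapping step picks the closest candidate. String overlap only handles near-spellings, which is why a real system would need a semantic measure and a disambiguation step as the abstract describes.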

Cited By

A. Padia, K. Kalpakis, F. Ferraro, and T. Finin. Knowledge graph fact prediction via knowledge-enriched tensor factorization. Journal of Web Semantics, 2019. https://doi.org/10.1016/j.websem.2019.01.004

A. Kashyap, L. Han, R. Yus, J. Sleeman, T. Satyapanich, S. Gandhi, and T. Finin. Robust semantic text similarity using LSA, machine learning, and linguistic resources. Language Resources and Evaluation, 50(1):125--161, 2016. https://dl.acm.org/doi/10.1007/s10579-015-9319-2

A. Padia, K. Kalpakis, F. Ferraro, and T. Finin. Knowledge Graph Fact Prediction Via Knowledge-Enriched Tensor Factorization. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3331039

Published In

SSDBM '15: Proceedings of the 27th International Conference on Scientific and Statistical Database Management

June 2015

390 pages

ISBN:9781450337090

DOI:10.1145/2791347

Editors:
Amarnath Gupta (University of California San Diego),
Susan Rathbun (University of California San Diego)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

SSDBM 2015

SSDBM 2015: International Conference on Scientific and Statistical Database Management

June 29 - July 1, 2015

La Jolla, California

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%
