DOI: 10.1145/2791347.2791381

Querying RDF data with text annotated graphs

Published: 29 June 2015

Abstract

Scientists and casual users need better ways to query RDF databases or Linked Open Data. Using the SPARQL query language requires not only mastering its syntax and semantics but also understanding the RDF data model, the ontology used, and the URIs for entities of interest. Natural language query systems are a powerful approach, but current techniques are brittle in addressing the ambiguity and complexity of natural language and require expensive labor to supply the extensive domain knowledge they need. We introduce a compromise in which users give a graphical "skeleton" for a query and annotate it with freely chosen words, phrases and entity names. We describe a framework for interpreting these "schema-agnostic queries" over open-domain RDF data that automatically translates them to SPARQL queries. The framework uses semantic textual similarity to find mapping candidates and statistical approaches to learn domain knowledge for disambiguation, thus avoiding the expensive human effort required by natural language interface systems. We demonstrate the feasibility of the approach with an implementation that performs well in an evaluation on DBpedia data.
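
To make the idea of a text-annotated query skeleton concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written vocabulary (the `CLASSES` and `PROPERTIES` tables are hypothetical stand-ins for an ontology) and swaps in character-level matching from Python's `difflib` where the paper uses semantic textual similarity; the statistical disambiguation step is omitted entirely. The function names `similarity`, `best_match` and `to_sparql` are illustrative.

```python
from difflib import SequenceMatcher

# Hypothetical toy vocabulary (label -> prefixed term); an assumption for
# illustration, not the mapping tables used in the paper.
CLASSES = {"Book": "dbo:Book", "Writer": "dbo:Writer", "City": "dbo:City"}
PROPERTIES = {
    "author": "dbo:author",
    "birth place": "dbo:birthPlace",
    "publisher": "dbo:publisher",
}


def similarity(a, b):
    """Character-level similarity in [0, 1]; a crude stand-in for the
    semantic textual similarity described in the abstract."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(phrase, vocabulary):
    """Return the term whose label is most similar to the user's phrase."""
    label = max(vocabulary, key=lambda candidate: similarity(phrase, candidate))
    return vocabulary[label]


def to_sparql(nodes, edges):
    """Translate an annotated skeleton into a SPARQL SELECT query.

    nodes: dict mapping variable name -> free-text annotation of the node
    edges: list of (subject variable, free-text annotation, object variable)
    """
    patterns = []
    for var, phrase in nodes.items():
        patterns.append(f"  ?{var} a {best_match(phrase, CLASSES)} .")
    for subj, phrase, obj in edges:
        patterns.append(f"  ?{subj} {best_match(phrase, PROPERTIES)} ?{obj} .")
    variables = " ".join(f"?{var}" for var in nodes)
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        f"SELECT {variables} WHERE {{\n" + "\n".join(patterns) + "\n}"
    )


# A two-node skeleton: the user draws "books" --authors--> "writers"
# without knowing the ontology's exact class or property names.
nodes = {"b": "books", "w": "writers"}
edges = [("b", "authors", "w")]
query = to_sparql(nodes, edges)
print(query)
```

In this sketch the user never writes a URI: each annotation is resolved to the closest vocabulary label, and the graph shape supplies the triple patterns. A realistic system would need to rank several candidates per annotation and choose among them jointly, which is the part the abstract attributes to learned domain knowledge.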

Cited By

  • (2019) Knowledge graph fact prediction via knowledge-enriched tensor factorization. Journal of Web Semantics. DOI: 10.1016/j.websem.2019.01.004. Online publication date: Feb-2019
  • (2016) Robust semantic text similarity using LSA, machine learning, and linguistic resources. Language Resources and Evaluation 50:1 (125-161). DOI: 10.1007/s10579-015-9319-2. Online publication date: 1-Mar-2016
  • Knowledge Graph Fact Prediction Via Knowledge-Enriched Tensor Factorization. SSRN Electronic Journal. DOI: 10.2139/ssrn.3331039

Published In

SSDBM '15: Proceedings of the 27th International Conference on Scientific and Statistical Database Management
June 2015
390 pages
ISBN:9781450337090
DOI:10.1145/2791347
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SSDBM 2015

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%
