Research article
DOI: 10.1145/3041021.3054135

Multimodal Question Answering over Structured Data with Ambiguous Entities

Published: 03 April 2017 Publication History

Abstract

In recent years, we have witnessed profound changes in the way people satisfy their information needs. For instance, with the ubiquitous 24/7 availability of mobile devices, the number of search engine queries on mobile devices has reportedly overtaken that of queries on regular personal computers. In this paper, we consider the task of multimodal question answering over structured data, in which a user supplies not just a natural language query but also an image. Our system addresses this by optimizing a non-convex objective function capturing multimodal constraints. Our experiments show that this enables it to answer even very challenging ambiguous entity queries with high accuracy.

      Published In

      WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
      April 2017
      1738 pages
      ISBN: 9781450349147

      Sponsors

      • IW3C2: International World Wide Web Conference Committee

      Publisher

      International World Wide Web Conferences Steering Committee

      Republic and Canton of Geneva, Switzerland

      Publication History

      Published: 03 April 2017

      Author Tags

      1. multimodal
      2. multimodal knowledge bases
      3. question answering

      Qualifiers

      • Research-article

      Funding Sources

      • Joint NSFC-ISF Research Program 61561146397
      • National Natural Science Foundation of China
      • National 973 Program
      • Shandong Provincial Science and Technology Development Program

      Conference

      WWW '17
      Sponsor:
      • IW3C2

      Acceptance Rates

      WWW '17 Companion Paper Acceptance Rate 164 of 966 submissions, 17%;
      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 14
      • Downloads (last 6 weeks): 1
      Reflects downloads up to 24 Jan 2025.

      Cited By

      • (2024) Are my answers medically accurate? Exploiting medical knowledge graphs for medical question answering. Applied Intelligence 54:2 (2172-2187). DOI: 10.1007/s10489-024-05282-8. Online publication date: 31-Jan-2024
      • (2021) A knowledge graph based question answering method for medical domain. PeerJ Computer Science 7 (e667). DOI: 10.7717/peerj-cs.667. Online publication date: 1-Sep-2021
      • (2021) Intelligent SPARQL Query Generation for Natural Language Processing Systems. IEEE Access 9 (158638-158650). DOI: 10.1109/ACCESS.2021.3130667. Online publication date: 2021
      • (2020) Incorporating pragmatic reasoning communication into emergent language. Proceedings of the 34th International Conference on Neural Information Processing Systems (10348-10359). DOI: 10.5555/3495724.3496592. Online publication date: 6-Dec-2020
      • (2019) Can Image Captioning Help Passage Retrieval in Multimodal Question Answering? Advances in Information Retrieval (94-101). DOI: 10.1007/978-3-030-15719-7_12. Online publication date: 7-Apr-2019
      • (2018) Natural Language Data Management and Interfaces. Synthesis Lectures on Data Management 10:2 (1-156). DOI: 10.2200/S00866ED1V01Y201807DTM049. Online publication date: 13-Aug-2018
      • (2017) FrameBase. Semantic Web 8:6 (817-850). DOI: 10.3233/SW-170279. Online publication date: 1-Jan-2017
      • (2017) Knowledge Graphs: Venturing Out into the Wild. Knowledge Graphs and Language Technology (1-9). DOI: 10.1007/978-3-319-68723-0_1. Online publication date: 29-Oct-2017
