Multimodal Question Answering over Structured Data with Ambiguous Entities

Authors:

Gerard de Melo,

Baoquan Chen

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion

Pages 79 - 88

https://doi.org/10.1145/3041021.3054135

Published: 03 April 2017

Abstract

In recent years, we have witnessed profound changes in the way people satisfy their information needs. For instance, with the ubiquitous 24/7 availability of mobile devices, the number of search engine queries on mobile devices has reportedly overtaken that of queries on regular personal computers. In this paper, we consider the task of multimodal question answering over structured data, in which a user supplies not just a natural language query but also an image. Our system addresses this by optimizing a non-convex objective function capturing multimodal constraints. Our experiments show that this enables it to answer even very challenging ambiguous entity queries with high accuracy.

Index Terms

Multimodal Question Answering over Structured Data with Ambiguous Entities
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Question answering
  2. Information systems applications

Published In

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion

April 2017

1738 pages

ISBN:9781450349147

General Chairs: Rick Barrett (W3Events), Rick Cummings (Murdoch University)

Program Chairs: Eugene Agichtein (Emory University), Evgeniy Gabrilovich (Google Research)

Sponsors

IW3C2: International World Wide Web Conference Committee

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Qualifiers

Research-article

Funding Sources

Joint NSFC-ISF Research Program 61561146397
National Natural Science Foundation of China
National 973 Program
Shandong Provincial Science and Technology Development Program

Conference

WWW '17

Sponsor:

IW3C2

WWW '17: 26th International World Wide Web Conference

April 3 - 7, 2017

Perth, Australia

Acceptance Rates

WWW '17 Companion Paper Acceptance Rate 164 of 966 submissions, 17%;

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

Total Citations: 8
Total Downloads: 292
Downloads (Last 12 months): 14
Downloads (Last 6 weeks): 1

Reflects downloads up to 24 Jan 2025

Citations

Cited By

Zafar A, Varshney D, Kumar Sahoo S, Das A, Ekbal A (2024). Are my answers medically accurate? Exploiting medical knowledge graphs for medical question answering. Applied Intelligence 54:2 (2172-2187). Online publication date: 31-Jan-2024.
https://doi.org/10.1007/s10489-024-05282-8
Huang X, Zhang J, Xu Z, Ou L, Tong J (2021). A knowledge graph based question answering method for medical domain. PeerJ Computer Science 7 (e667). Online publication date: 1-Sep-2021.
https://doi.org/10.7717/peerj-cs.667
Chen Y, Lu E, Ou T (2021). Intelligent SPARQL Query Generation for Natural Language Processing Systems. IEEE Access 9 (158638-158650). Online publication date: 2021.
https://doi.org/10.1109/ACCESS.2021.3130667
Kang Y, Wang T, de Melo G, Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (2020). Incorporating pragmatic reasoning communication into emergent language. Proceedings of the 34th International Conference on Neural Information Processing Systems (10348-10359). Online publication date: 6-Dec-2020.
https://dl.acm.org/doi/10.5555/3495724.3496592
Sheng S, Laenen K, Moens M (2019). Can Image Captioning Help Passage Retrieval in Multimodal Question Answering? Advances in Information Retrieval (94-101). Online publication date: 7-Apr-2019.
https://doi.org/10.1007/978-3-030-15719-7_12
Li Y, Rafiei D (2018). Natural Language Data Management and Interfaces. Synthesis Lectures on Data Management 10:2 (1-156). Online publication date: 13-Aug-2018.
https://doi.org/10.2200/S00866ED1V01Y201807DTM049
Rouces J, de Melo G, Hose K, Gandon F, Sabou M, Sack H (2017). FrameBase. Semantic Web 8:6 (817-850). Online publication date: 1-Jan-2017.
https://dl.acm.org/doi/10.3233/SW-170279
de Melo G (2017). Knowledge Graphs: Venturing Out into the Wild. Knowledge Graphs and Language Technology (1-9). Online publication date: 29-Oct-2017.
https://doi.org/10.1007/978-3-319-68723-0_1
