DOI: 10.1145/3488560.3498442

Research Article

Fast Semantic Matching via Flexible Contextualized Interaction

Published: 15 February 2022

Abstract

Deep pre-trained language models (e.g., BERT) have led to remarkable progress in many Natural Language Processing tasks. Their superior capacity for understanding textual data has also been demonstrated in semantic matching tasks (e.g., question answering, web search). In particular, for matching a query against a candidate text, current state-of-the-art methods usually rely on the semantic representations produced by BERT and compute relevance scores with various interaction (i.e., matching) methods. However, they may 1) miss fine-grained phrase-level interaction between the input query and the candidate context, or 2) lack a careful treatment of both effectiveness and efficiency. Motivated by this, we propose Interactor, a BERT-based semantic matching model with a flexible contextualized interaction paradigm. It captures fine-grained phrase-level information in the interaction and is thus more effective when applied to semantic matching tasks. Moreover, we equip Interactor with a novel partial attention scheme, which significantly reduces the computational cost while maintaining high effectiveness. We conduct comprehensive experimental evaluations on three datasets. The results show that Interactor achieves superior effectiveness and efficiency for semantic matching.
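
The abstract describes the model only at a high level, so the exact interaction and partial attention formulations are not given here. As a rough, hypothetical sketch of the general idea, the following PyTorch snippet scores a query/candidate pair from token-level contextual embeddings (e.g., BERT outputs), letting each query token interact with only its top-k candidate tokens; the function name, the keep_ratio parameter, and the top-k masking rule are illustrative assumptions, not the paper's actual method.

    import torch
    import torch.nn.functional as F

    def interaction_score(query_emb: torch.Tensor, cand_emb: torch.Tensor,
                          keep_ratio: float = 0.5) -> torch.Tensor:
        # query_emb: (Lq, d) contextual token embeddings of the query
        # cand_emb:  (Lc, d) contextual token embeddings of the candidate
        # Cosine similarity between every (query token, candidate token) pair.
        sim = F.normalize(query_emb, dim=-1) @ F.normalize(cand_emb, dim=-1).T  # (Lq, Lc)

        # Assumed "partial attention": each query token keeps only its top-k
        # candidate tokens; the rest are masked out to reduce computation.
        k = max(1, int(keep_ratio * cand_emb.size(0)))
        topk = sim.topk(k, dim=-1)
        masked = torch.full_like(sim, float("-inf")).scatter(-1, topk.indices, topk.values)

        # Softmax over the surviving positions (masked entries contribute 0),
        # then aggregate the weighted similarities into one relevance score.
        attn = torch.softmax(masked, dim=-1)
        return (attn * sim).sum()

    # Example: 8 query tokens and 120 candidate tokens with 768-dim embeddings.
    q = torch.randn(8, 768)
    c = torch.randn(120, 768)
    score = interaction_score(q, c)

Restricting each query token to its k strongest candidate tokens shrinks the interaction from Lq x Lc to Lq x k positions, which is the kind of effectiveness/efficiency trade-off the abstract alludes to.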

Supplementary Material

MP4 File (WSDM22-fp358.mp4)
Presentation video for the paper "Fast Semantic Matching via Flexible Contextualized Interaction"


Cited By

  • (2024) Robust Interaction-Based Relevance Modeling for Online e-Commerce Search. Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, 55-71. https://doi.org/10.1007/978-3-031-70378-2_4. Online publication date: 22-Aug-2024.
  • (2023) I3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 441-451. https://doi.org/10.1145/3583780.3614923. Online publication date: 21-Oct-2023.
  • (2023) Model-based Unbiased Learning to Rank. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 895-903. https://doi.org/10.1145/3539597.3570395. Online publication date: 27-Feb-2023.
  • (2022) Chinese sentence semantic matching based on multi-level relevance extraction and aggregation for intelligent human–robot interaction. Applied Soft Computing, 131(C). https://doi.org/10.1016/j.asoc.2022.109795. Online publication date: 1-Dec-2022.



    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    ISBN:9781450391320
    DOI:10.1145/3488560
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 February 2022


    Author Tags

    1. bert
    2. efficient retrieval
    3. neural network
    4. text retrieval

    Qualifiers

    • Research-article

    Conference

    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Article Metrics

    • Downloads (Last 12 months): 63
    • Downloads (Last 6 weeks): 9
    Reflects downloads up to 17 Oct 2024
