DOI: 10.1145/3488560.3498442

Research Article

Fast Semantic Matching via Flexible Contextualized Interaction

Published: 15 February 2022

Abstract

Deep pre-trained language models (e.g., BERT) have led to remarkable progress in many Natural Language Processing tasks. Their superior capacity for understanding textual data has also been demonstrated in semantic matching tasks (e.g., question answering, web search). In particular, for matching a query against a candidate text, current state-of-the-art methods usually rely on the semantic representations produced by BERT and compute relevance scores with various interaction (i.e., matching) methods. However, they may 1) miss fine-grained phrase-level interaction between the input query and the candidate context, or 2) lack a careful treatment of both effectiveness and efficiency. Motivated by this, we propose Interactor, a BERT-based semantic matching model with a flexible contextualized interaction paradigm. It captures fine-grained phrase-level information in the interaction and is thus more effective when applied to semantic matching tasks. Moreover, we equip Interactor with a novel partial attention scheme, which significantly reduces the computational cost while maintaining high effectiveness. We conduct comprehensive experimental evaluations on three datasets. The results show that Interactor achieves superior effectiveness and efficiency for semantic matching.
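
The abstract describes the model only at a high level, so the exact interaction and partial attention formulations are not given here. As a rough, hypothetical sketch of the general idea, the following PyTorch snippet scores a query/candidate pair from token-level contextual embeddings (e.g., BERT outputs), letting each query token interact with only its top-k candidate tokens; the function name, the keep_ratio parameter, and the top-k masking rule are illustrative assumptions, not the paper's actual method.

    import torch
    import torch.nn.functional as F

    def interaction_score(query_emb: torch.Tensor, cand_emb: torch.Tensor,
                          keep_ratio: float = 0.5) -> torch.Tensor:
        # query_emb: (Lq, d) contextual token embeddings of the query
        # cand_emb:  (Lc, d) contextual token embeddings of the candidate
        # Cosine similarity between every (query token, candidate token) pair.
        sim = F.normalize(query_emb, dim=-1) @ F.normalize(cand_emb, dim=-1).T  # (Lq, Lc)

        # Assumed "partial attention": each query token keeps only its top-k
        # candidate tokens; the rest are masked out to reduce computation.
        k = max(1, int(keep_ratio * cand_emb.size(0)))
        topk = sim.topk(k, dim=-1)
        masked = torch.full_like(sim, float("-inf")).scatter(-1, topk.indices, topk.values)

        # Softmax over the surviving positions (masked entries contribute 0),
        # then aggregate the weighted similarities into one relevance score.
        attn = torch.softmax(masked, dim=-1)
        return (attn * sim).sum()

    # Example: 8 query tokens and 120 candidate tokens with 768-dim embeddings.
    q = torch.randn(8, 768)
    c = torch.randn(120, 768)
    score = interaction_score(q, c)

Restricting each query token to its k strongest candidate tokens shrinks the interaction from Lq x Lc to Lq x k positions, which is the kind of effectiveness/efficiency trade-off the abstract alludes to.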

Supplementary Material

MP4 File (WSDM22-fp358.mp4)
Presentation video for the paper "Fast Semantic Matching via Flexible Contextualized Interaction"


Cited By

  • (2024) Robust Interaction-Based Relevance Modeling for Online e-Commerce Search. Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, 55-71. https://doi.org/10.1007/978-3-031-70378-2_4. Online publication date: 22-Aug-2024.
  • (2023) I3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 441-451. https://doi.org/10.1145/3583780.3614923. Online publication date: 21-Oct-2023.
  • (2023) Model-based Unbiased Learning to Rank. Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 895-903. https://doi.org/10.1145/3539597.3570395. Online publication date: 27-Feb-2023.
  • (2022) Chinese sentence semantic matching based on multi-level relevance extraction and aggregation for intelligent human–robot interaction. Applied Soft Computing, 131(C). https://doi.org/10.1016/j.asoc.2022.109795. Online publication date: 1-Dec-2022.



    Published In

    WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
    February 2022
    1690 pages
    ISBN:9781450391320
    DOI:10.1145/3488560
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 February 2022


    Author Tags

    1. bert
    2. efficient retrieval
    3. neural network
    4. text retrieval

    Qualifiers

    • Research-article

    Conference

    WSDM '22

    Acceptance Rates

    Overall Acceptance Rate 498 of 2,863 submissions, 17%

    Article Metrics

    • Downloads (Last 12 months): 63
    • Downloads (Last 6 weeks): 9
    Reflects downloads up to 17 Oct 2024
