Text matching as image recognition

Published: 12 February 2016

Abstract

Matching two texts is a fundamental problem in many natural language processing tasks. An effective approach is to extract meaningful matching patterns from words, phrases, and sentences to produce the matching score. Inspired by the success of convolutional neural networks in image recognition, where neurons can capture many complicated patterns based on extracted elementary visual patterns such as oriented edges and corners, we propose to model text matching as an image recognition problem. First, a matching matrix whose entries represent the similarities between words is constructed and viewed as an image. Then a convolutional neural network is used to capture rich matching patterns layer by layer. We show that by mirroring the compositional hierarchies of patterns in image recognition, our model can successfully identify salient signals such as n-gram and n-term matchings. Experimental results demonstrate its superiority over the baselines.
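The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine similarity measure, the random toy embeddings, the helper names `matching_matrix` and `conv2d_valid`, and the hand-set diagonal kernel are all assumptions made for illustration (in a trained network the kernels would be learned).

```python
import numpy as np

def matching_matrix(words_a, words_b, embed):
    """Entry (i, j) is the cosine similarity between word i of A and word j of B.
    Cosine similarity is an assumed choice of word-level similarity."""
    A = np.stack([embed[w] for w in words_a])
    B = np.stack([embed[w] for w in words_b])
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation, the operation a CNN layer applies."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical toy vocabulary with random embeddings (illustration only).
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "a", "dog"]
embed = {w: rng.normal(size=8) for w in vocab}

text_a = ["the", "cat", "sat", "on", "the", "mat"]
text_b = ["a", "cat", "sat", "on", "a", "mat"]

# Step 1: build the matching matrix and treat it as a one-channel image.
M = matching_matrix(text_a, text_b, embed)

# Step 2: convolve it. A 3x3 identity kernel (hand-set here, not learned)
# sums similarities along a diagonal, so it responds to a matched 3-gram.
diag_kernel = np.eye(3)
feature_map = np.maximum(conv2d_valid(M, diag_kernel), 0.0)  # ReLU

# Location of the strongest 3-gram match in the feature map.
best = np.unravel_index(np.argmax(feature_map), feature_map.shape)
print(M.shape, feature_map.shape, best)
```

In this toy input the shared trigram "cat sat on" lies on a diagonal of the matching matrix, so the diagonal kernel responds most strongly at the window covering it. A full model would stack several learned convolution and pooling layers on top of this matrix before producing the matching score.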

    Published In

    AAAI'16: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence
    February 2016
    4406 pages

    Sponsors

    • Association for the Advancement of Artificial Intelligence

    Publisher

    AAAI Press

    Cited By

    • (2024) Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5398-5408. DOI: 10.1145/3637528.3671559. Online publication date: 25-Aug-2024
    • (2023) LADER: Log-Augmented DEnse Retrieval for Biomedical Literature Search. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2092-2097. DOI: 10.1145/3539618.3592005. Online publication date: 19-Jul-2023
    • (2022) GazBy: Gaze-Based BERT Model to Incorporate Human Attention in Neural Information Retrieval. Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval, 182-192. DOI: 10.1145/3539813.3545129. Online publication date: 23-Aug-2022
    • (2022) ReprBERT: Distilling BERT to an Efficient Representation-Based Relevance Model for E-Commerce. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4363-4371. DOI: 10.1145/3534678.3539090. Online publication date: 14-Aug-2022
    • (2022) ANTHEM. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, 161-171. DOI: 10.1145/3488560.3498456. Online publication date: 11-Feb-2022
    • (2021) Personalized, Sequential, Attentive, Metric-Aware Product Search. ACM Transactions on Information Systems 40:2, 1-29. DOI: 10.1145/3473337. Online publication date: 24-Nov-2021
    • (2021) Match-Ignition. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 1396-1405. DOI: 10.1145/3459637.3482450. Online publication date: 26-Oct-2021
    • (2021) Locate Who You Are. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 3413-3417. DOI: 10.1145/3459637.3482134. Online publication date: 26-Oct-2021
    • (2021) Adversarial Domain Adaptation for Cross-lingual Information Retrieval with Multilingual BERT. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 3498-3502. DOI: 10.1145/3459637.3482050. Online publication date: 26-Oct-2021
    • (2021) Single-Pass On-Line Event Detection in Twitter Streams. Proceedings of the 2021 13th International Conference on Machine Learning and Computing, 522-529. DOI: 10.1145/3457682.3457762. Online publication date: 26-Feb-2021
