DOI: 10.1145/3580305.3599263
Research Article · Open Access

BatchSampler: Sampling Mini-Batches for Contrastive Learning in Vision, Language, and Graphs

Published: 04 August 2023

Abstract

In-batch contrastive learning is a state-of-the-art self-supervised method that pulls semantically similar instances together while pushing dissimilar instances apart within a mini-batch. Key to its success is the negative-sharing strategy, in which every instance serves as a negative for the others within the mini-batch. Recent studies aim to improve performance by sampling hard negatives within the current mini-batch, but the quality of those negatives is bounded by the mini-batch itself. In this work, we instead propose to improve contrastive learning by sampling the mini-batches themselves from the input data. We present BatchSampler (the code is publicly available) to sample mini-batches of hard-to-distinguish instances, i.e., instances that are hard but true negatives of each other. To reduce false negatives in each mini-batch, we build a proximity graph over randomly selected instances. To form a mini-batch, we then run random walk with restart on the proximity graph, which concentrates the sample on hard-to-distinguish instances. BatchSampler is a simple and general technique that can be directly plugged into existing contrastive learning models in vision, language, and graphs. Extensive experiments on datasets of three modalities show that BatchSampler consistently improves strong contrastive models, with significant gains for SimCLR on ImageNet-100 (vision), SimCSE on STS (language), and GraphCL and MVGRL on graph datasets.
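To make the two-stage procedure concrete, the sketch below re-implements the idea from the abstract in NumPy: build a k-nearest-neighbor proximity graph over the embeddings of randomly selected instances, then run a random walk with restart (RWR) on that graph so that the visited nodes, which are mutually similar and therefore hard to distinguish, form the mini-batch. This is a minimal sketch under our own assumptions; the function names and hyperparameters (build_proximity_graph, sample_batch_rwr, k, restart_prob, max_steps) are illustrative, not the paper's API, and details such as graph construction and walk termination may differ from the released code.

```python
import numpy as np

def build_proximity_graph(embeddings, k=10):
    """k-NN proximity graph over randomly selected instances.

    Each node is linked to its k most cosine-similar neighbors, so
    adjacent instances are hard negatives for each other.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)            # exclude self-loops
    return np.argsort(-sim, axis=1)[:, :k]    # (n, k) neighbor indices

def sample_batch_rwr(neighbors, batch_size, restart_prob=0.15,
                     max_steps=10_000, rng=None):
    """One mini-batch via random walk with restart on the proximity graph.

    The walk stays inside a neighborhood of mutually similar instances;
    restarting at the seed keeps it from drifting away, so the sampled
    batch is dense in hard-to-distinguish pairs.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(neighbors)
    seed = current = int(rng.integers(n))
    batch = {seed}
    for _ in range(max_steps):
        if len(batch) >= batch_size:
            break
        if rng.random() < restart_prob:
            current = seed                    # restart at the seed
        else:
            current = int(rng.choice(neighbors[current]))  # step to a neighbor
        batch.add(current)
    # Fallback: pad with random instances if the walk stalls in a
    # component smaller than the batch size.
    while len(batch) < batch_size:
        batch.add(int(rng.integers(n)))
    return list(batch)

# Usage: sample index batches for any in-batch contrastive objective
# (e.g., InfoNCE as used by SimCLR, SimCSE, or GraphCL).
emb = np.random.default_rng(0).normal(size=(1_000, 128)).astype(np.float32)
graph = build_proximity_graph(emb, k=10)
batch_idx = sample_batch_rwr(graph, batch_size=64)
```

In training, a natural choice is to rebuild the proximity graph periodically from the current encoder's embeddings so that the hardness of sampled batches tracks the model as it improves.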

Supplementary Material

MP4 File (BatchSampler-video.mp4)
Pre-recorded Presentation Video for KDD 2023

References

[1]
Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Inigo Lopez-Gazpio, Montse Maritxalar, Rada Mihalcea, et al. 2015. SemEval-2015 Task 2: Semantic textual similarity, English, Spanish and pilot on interpretability. In SemEval'15. 252--263.
[2]
Eneko Agirre, Carmen Banea, Claire Cardie, Daniel M Cer, Mona T Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Rada Mihalcea, German Rigau, and Janyce Wiebe. 2014. SemEval-2014 Task 10: Multilingual Semantic Textual Similarity. In SemEval'14. 81--91.
[3]
Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada Mihalcea, German Rigau, and Janyce Wiebe. 2016. SemEval-2016 Task 1: Semantic textual similarity, monolingual and cross-lingual evaluation. In SemEval'16.
[4]
Eneko Agirre, Daniel Cer, Mona Diab, and Aitor Gonzalez-Agirre. 2012. SemEval-2012 Task 6: A pilot on semantic textual similarity. In SemEval'12. 385--393.
[5]
Eneko Agirre, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, and Weiwei Guo. 2013. *SEM 2013 shared task: Semantic textual similarity. In SemEval'13. 32--43.
[6]
Reid Andersen, Fan Chung, and Kevin Lang. 2006. Local graph partitioning using PageRank vectors. In FOCS'06.
[7]
Konstantin Avrachenkov, Remco van der Hofstad, and Marina Sokol. 2014. Personalized pagerank with node-dependent restart. In WAW'14. 23--33.
[8]
Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. 2020. Unsupervised learning of visual features by contrasting cluster assignments. In NIPS'20. 9912--9924.
[9]
Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: A library for support vector machines. TIST'11 (2011), 1--27.
[10]
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In ICML'20. PMLR, 1597--1607.
[11]
Xinlei Chen, Saining Xie, and Kaiming He. 2021. An empirical study of training self-supervised vision transformers. In ICCV'21. 9640--9649.
[12]
Ching-Yao Chuang, Joshua Robinson, Lin Yen-Chen, Antonio Torralba, and Stefanie Jegelka. 2020. Debiased Contrastive Learning. In NIPS'20.
[13]
Fan Chung and Alexander Tsiatas. 2010. Finding and visualizing graph clusters using pagerank optimization. In WAW'10. 86--97.
[14]
Fan Chung and Wenbo Zhao. 2010. PageRank and random walks on graphs. In Fete of combinatorics and computer science. 43--62.
[15]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL'19. 4171--4186.
[16]
Tianyu Gao, Xingcheng Yao, and Danqi Chen. 2021. SimCSE: Simple contrastive learning of sentence embeddings. In EMNLP'21.
[17]
Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable feature learning for networks. In KDD'16. 855--864.
[18]
Michael Gutmann and Aapo Hyvärinen. 2010. Noise-Contrastive Estimation: A new estimation principle for unnormalized statistical models. In AISTATS'10. 297--304.
[19]
Kaveh Hassani and Amir Hosein Khasahmadi. 2020. Contrastive multi-view representation learning on graphs. In ICML'20. PMLR, 4116--4126.
[20]
Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. 2020. Momentum contrast for unsupervised visual representation learning. In CVPR'20.
[21]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR'16. 770--778.
[22]
Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, and Linjun Yang. 2020. Embedding-based retrieval in facebook search. In KDD'20. 2553--2561.
[23]
Tinglin Huang, Yuxiao Dong, Ming Ding, Zhen Yang, Wenzheng Feng, Xinyu Wang, and Jie Tang. 2021. MixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems. In KDD'21. 665--674.
[24]
Tri Huynh, Simon Kornblith, Matthew R Walter, Michael Maire, and Maryam Khademi. 2022. Boosting contrastive self-supervised learning with false negative cancellation. In WACV'22. 2785--2795.
[25]
Ashish Jaiswal, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Debapriya Banerjee, and Fillia Makedon. 2020. A survey on contrastive self-supervised learning. Technologies, Vol. 9, 1 (2020), 2.
[26]
Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data, Vol. 7, 3 (2019), 535--547.
[27]
Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, and Diane Larlus. 2020. Hard Negative Mixing for Contrastive Learning. In NIPS'20.
[28]
Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. In EMNLP'20.
[29]
Donald Ervin Knuth. 1997. The art of computer programming: Fundamental Algorithms. Vol. 1. Pearson Education.
[30]
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019).
[31]
Marco Marelli, Stefano Menini, Marco Baroni, Luisa Bentivogli, Raffaella Bernardi, and Roberto Zamparelli. 2014. A SICK cure for the evaluation of compositional distributional semantic models. In LREC'14. 216--223.
[32]
Yu Meng, Chenyan Xiong, Payal Bajaj, Paul Bennett, Jiawei Han, and Xia Song. 2021. Coco-lm: Correcting and contrasting text sequences for language model pretraining. In NIPS'21.
[33]
Annamalai Narayanan, Mahinthan Chandramohan, Rajasekar Venkatesan, Lihui Chen, Yang Liu, and Shantanu Jaiswal. 2017. graph2vec: Learning distributed representations of graphs. In MLG'17.
[34]
Aaron van den Oord, Yazhe Li, and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748 (2018).
[35]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: Bringing order to the web. Stanford University Technical Report, (1999).
[36]
Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, and R. Devon Hjelm. 2019. Deep graph infomax. In ICLR'19.
[37]
Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, and Jie Tang. 2020. GCC: Graph contrastive coding for graph neural network pre-training. In KDD'20. 1150--1160.
[38]
Joshua Robinson, Ching-Yao Chuang, Suvrit Sra, and Stefanie Jegelka. 2021. Contrastive Learning with Hard Negative Samples. In ICLR'21.
[39]
Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M Borgwardt. 2011. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research, Vol. 12, 9 (2011).
[40]
Daniel A Spielman and Shang-Hua Teng. 2013. A local clustering algorithm for massive graphs and its application to nearly linear time graph partitioning. SIAM Journal on Computing (2013).
[41]
Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, and Jian Tang. 2020. InfoGraph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization. In ICLR'20.
[42]
Afrina Tabassum, Muntasir Wahed, Hoda Eldardiry, and Ismini Lourentzou. 2022. Hard negative sampling strategies for contrastive representation learning. arXiv preprint arXiv:2206.01197 (2022).
[43]
Junfeng Tian, Zhiheng Zhou, Man Lan, and Yuanbin Wu. 2017. ECNU at SemEval-2017 Task 1: Leverage kernel-based traditional NLP features and neural networks to build a universal model for multilingual and cross-lingual semantic textual similarity. In SemEval'17. 191--197.
[44]
Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. 2006. Fast random walk with restart and its applications. In ICDM'06.
[45]
Guangrun Wang, Keze Wang, Guangcong Wang, Philip HS Torr, and Liang Lin. 2021. Solving inefficiency of self-supervised representation learning. In ICCV'21. 9505--9515.
[46]
Mike Wu, Milan Mosse, Chengxu Zhuang, Daniel Yamins, and Noah Goodman. 2021. Conditional negative sampling for contrastive learning of visual representations. In ICLR'21.
[47]
Yawen Wu, Zhepeng Wang, Dewen Zeng, Yiyu Shi, and Jingtong Hu. 2021. Enabling on-device self-supervised contrastive learning with selective data contrast. In DAC'21. IEEE, 655--660.
[48]
Zhuofeng Wu, Sinong Wang, Jiatao Gu, Madian Khabsa, Fei Sun, and Hao Ma. 2020. CLEAR: Contrastive learning for sentence representation. arXiv preprint arXiv:2012.15466 (2020).
[49]
Zhirong Wu, Yuanjun Xiong, Stella X. Yu, and Dahua Lin. 2018. Unsupervised feature learning via non-parametric instance discrimination. In CVPR'18. 3733--3742.
[50]
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul Bennett, Junaid Ahmed, and Arnold Overwijk. 2021. Approximate nearest neighbor negative contrastive learning for dense text retrieval. In ICLR'21.
[51]
Dongkuan Xu, Wei Cheng, Dongsheng Luo, Haifeng Chen, and Xiang Zhang. 2021. InfoGCL: Information-aware graph contrastive learning. In NIPS'21, Vol. 34. 30414--30425.
[52]
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks? In ICLR'19.
[53]
Pinar Yanardag and SVN Vishwanathan. 2015. Deep graph kernels. In KDD'15. 1365--1374.
[54]
Zhen Yang, Ming Ding, Chang Zhou, Hongxia Yang, Jingren Zhou, and Jie Tang. 2020. Understanding negative sampling in graph representation learning. In KDD'20. 1666--1676.
[55]
Zhitao Ying, Jiaxuan You, Christopher Morris, Xiang Ren, Will Hamilton, and Jure Leskovec. 2018. Hierarchical graph representation learning with differentiable pooling. In NIPS'18, Vol. 31.
[56]
Yuning You, Tianlong Chen, Yang Shen, and Zhangyang Wang. 2021. Graph contrastive learning automated. In ICML'21. PMLR, 12121--12132.
[57]
Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph Contrastive Learning with Augmentations. In NIPS'20. 5812--5823.
[58]
Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, and Ran He. 2021. Graph information bottleneck for subgraph recognition. In ICLR'21.
[59]
Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, and Ran He. 2021. Recognizing predictive substructures with subgraph information bottleneck. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
[60]
Shaofeng Zhang, Meng Liu, Junchi Yan, Hengrui Zhang, Lingxiao Huang, Xiaokang Yang, and Pinyan Lu. 2022. M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning. In KDD'22. 2461--2470.
[61]
Weinan Zhang, Tianqi Chen, Jun Wang, and Yong Yu. 2013. Optimizing top-n collaborative filtering via dynamic negative item sampling. In SIGIR'13. 785--788.
[62]
Mingkai Zheng, Fei Wang, Shan You, Chen Qian, Changshui Zhang, Xiaogang Wang, and Chang Xu. 2021. Weakly supervised contrastive learning. In ICCV'21. 10042--10051.
[63]
Kun Zhou, Beichen Zhang, Wayne Xin Zhao, and Ji-Rong Wen. 2022. Debiased Contrastive Learning of Unsupervised Sentence Representations. In ACL'22.
[64]
Yanqiao Zhu, Yichen Xu, Qiang Liu, and Shu Wu. 2021a. An empirical study of graph contrastive learning. arXiv preprint arXiv:2109.01116 (2021).
[65]
Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, and Liang Wang. 2021b. Graph contrastive learning with adaptive augmentation. In WWW'21. 2069--2080.
[66]
Jiří Šíma and Satu Elisa Schaeffer. 2006. On the NP-completeness of some graph cluster measures. In SOFSEM'06.

Cited By

  • Does Negative Sampling Matter? A Review With Insights Into Its Theory and Applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:8 (Aug 2024), 5692--5711. DOI: 10.1109/TPAMI.2024.3371473
  • StockCL: Selective Contrastive Learning for Stock Trend Forecasting via Learnable Concepts. In Database Systems for Advanced Applications (Sep 2024), 275--284. DOI: 10.1007/978-981-97-5575-2_20


Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023, 5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305
This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. contrastive learning
    2. global hard negatives
    3. mini-batch sampling


Funding Sources

    • Technology and Innovation Major Project of the Ministry of Science and Technology of China
    • NSF of China for Distinguished Young Scholars

    Conference

    KDD '23

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

