DOI: 10.1145/3485447.3512093
research-article
Open access

MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

Published: 25 April 2022

    Abstract

    In many personalized recommendation scenarios, the generalization ability of a target task can be improved by learning additional auxiliary tasks alongside the target task on a multi-task network. However, this method often suffers from a serious optimization imbalance problem. On the one hand, one or more auxiliary tasks might have a larger influence than the target task and even dominate the network weights, resulting in worse recommendation accuracy for the target task. On the other hand, the influence of one or more auxiliary tasks might be too weak to assist the target task. More challenging still, this imbalance changes dynamically throughout the training process and varies across the parts of the same network. We propose a new method, MetaBalance, to balance auxiliary losses by directly manipulating their gradients w.r.t. the shared parameters in the multi-task network. Specifically, in each training iteration and adaptively for each part of the network, the gradient of an auxiliary loss is carefully reduced or enlarged to have a magnitude closer to that of the gradient of the target loss, preventing auxiliary tasks from being so strong that they dominate the target task or so weak that they fail to help it. Moreover, the proximity between the gradient magnitudes can be flexibly adjusted to adapt MetaBalance to different scenarios. The experiments show that our proposed method achieves a significant improvement of 8.34% in terms of NDCG@10 over the strongest baseline on two real-world datasets. The code of our approach is available online.
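
    The gradient-magnitude idea described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not give the update rule, so the linear blending rule, the relaxation factor `r`, and all function names below are assumptions made for illustration. Gradients are represented as flat lists of floats, and each "part of the network" is a dictionary key.

```python
import math


def l2_norm(grad):
    """Euclidean norm of a gradient stored as a flat list of floats."""
    return math.sqrt(sum(g * g for g in grad))


def balance_aux_gradient(target_grad, aux_grad, r=0.7, eps=1e-12):
    """Rescale aux_grad so its magnitude moves toward that of target_grad.

    r is a hypothetical relaxation factor in [0, 1] (an assumption):
      r = 1 gives the auxiliary gradient the same norm as the target gradient,
      r = 0 leaves the auxiliary gradient unchanged.
    The direction of aux_grad is never changed, only its length.
    """
    target_norm = l2_norm(target_grad)
    aux_norm = l2_norm(aux_grad)
    if aux_norm < eps:
        return list(aux_grad)  # a zero gradient cannot be rescaled
    scale = r * (target_norm / aux_norm) + (1.0 - r)
    return [scale * g for g in aux_grad]


def combined_gradients(target_grads, aux_grads_per_task, r=0.7):
    """Balance every auxiliary gradient separately for each network part,
    then add the balanced gradients to the target gradient.

    target_grads: dict mapping a part name (e.g. a layer) to its gradient.
    aux_grads_per_task: list of dicts with the same keys, one per auxiliary task.
    """
    combined = {}
    for name, tgt in target_grads.items():
        total = list(tgt)
        for aux_grads in aux_grads_per_task:
            balanced = balance_aux_gradient(tgt, aux_grads[name], r)
            total = [t + b for t, b in zip(total, balanced)]
        combined[name] = total
    return combined
```

    In this sketch, a dominating auxiliary gradient is shrunk and a weak one is enlarged by the same rule, and choosing `r` between 0 and 1 controls how close the two magnitudes end up. A practical version would likely smooth the norms over training iterations rather than use the raw per-step values; that choice is left out here to keep the example short.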

          Published In

          WWW '22: Proceedings of the ACM Web Conference 2022
          April 2022
          3764 pages
          ISBN:9781450390965
          DOI:10.1145/3485447
          This work is licensed under a Creative Commons Attribution International 4.0 License.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. Auxiliary Learning
          2. Gradient-based Optimization
          3. Multi-Task Learning
          4. Personalized Recommendation

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Conference

          WWW '22
          WWW '22: The ACM Web Conference 2022
          April 25 - 29, 2022
          Virtual Event, Lyon, France

          Acceptance Rates

          Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

          Article Metrics

          • Downloads (last 12 months): 1,007
          • Downloads (last 6 weeks): 94
          Reflects downloads up to 27 Jul 2024.

          Cited By

          • (2024) M³oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 893-902. DOI: 10.1145/3626772.3657686. Online publication date: 11-Jul-2024
          • (2024) Sample-level weighting for multi-task learning with auxiliary tasks. Applied Intelligence 54:4, 3482-3501. DOI: 10.1007/s10489-024-05300-9. Online publication date: 5-Mar-2024
          • (2023) STAN: Stage-Adaptive Network for Multi-Task Recommendation by Learning User Lifecycle-Based Representation. Proceedings of the 17th ACM Conference on Recommender Systems, 602-612. DOI: 10.1145/3604915.3608796. Online publication date: 14-Sep-2023
          • (2023) Gradient Coordination for Quantifying and Maximizing Knowledge Transference in Multi-Task Learning. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2032-2036. DOI: 10.1145/3539618.3591993. Online publication date: 19-Jul-2023
          • (2023) Multi-behavior Self-supervised Learning for Recommendation. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 496-505. DOI: 10.1145/3539618.3591734. Online publication date: 19-Jul-2023
          • (2023) Meta Auxiliary Learning for Top-K Recommendation. IEEE Transactions on Knowledge and Data Engineering 35:10, 10857-10870. DOI: 10.1109/TKDE.2022.3223155. Online publication date: 1-Oct-2023
          • (2023) Cross-Domain Disentangled Learning for E-Commerce Live Streaming Recommendation. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 2955-2968. DOI: 10.1109/ICDE55515.2023.00226. Online publication date: Apr-2023
          • (2023) Multi-Objective Surrogate Modeling Through Transfer Learning for Telescopic Boom Forklift. IEEE Access 11, 11629-11641. DOI: 10.1109/ACCESS.2023.3240602. Online publication date: 2023
          • (2023) Object localization and edge refinement network for salient object detection. Expert Systems with Applications: An International Journal 213:PB. DOI: 10.1016/j.eswa.2022.118973. Online publication date: 1-Mar-2023
          • (2023) MT-BICN: Multi-task Balanced Information Cascade Network for Recommendation. Knowledge Science, Engineering and Management, 423-435. DOI: 10.1007/978-3-031-40289-0_34. Online publication date: 16-Aug-2023
