MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks

Authors:

James Caverlee

WWW '22: Proceedings of the ACM Web Conference 2022

Pages 2205 - 2215

https://doi.org/10.1145/3485447.3512093

Published: 25 April 2022

Abstract

In many personalized recommendation scenarios, the generalization ability of a target task can be improved by learning additional auxiliary tasks alongside it on a multi-task network. However, this approach often suffers from a serious optimization imbalance problem. On the one hand, one or more auxiliary tasks might have a larger influence than the target task and even dominate the network weights, resulting in worse recommendation accuracy for the target task. On the other hand, the influence of one or more auxiliary tasks might be too weak to assist the target task. More challenging still, this imbalance changes dynamically throughout training and varies across different parts of the same network. We propose a new method, MetaBalance, to balance auxiliary losses by directly manipulating their gradients w.r.t. the shared parameters of the multi-task network. Specifically, in each training iteration and adaptively for each part of the network, the gradient of an auxiliary loss is reduced or enlarged so that its magnitude is closer to that of the gradient of the target loss, preventing auxiliary tasks from being so strong that they dominate the target task or so weak that they fail to help it. Moreover, the proximity between the gradient magnitudes can be flexibly adjusted to adapt MetaBalance to different scenarios. Experiments show that the proposed method achieves a significant improvement of 8.34% in NDCG@10 over the strongest baseline on two real-world datasets. The code for our approach is publicly available.
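The gradient rescaling described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the `relax` parameter, and the linear interpolation between the rescaled and the original gradient are assumptions made here for illustration, since the abstract only states that auxiliary gradient magnitudes are moved closer to the target gradient magnitude and that the proximity is adjustable. Gradients are plain Python lists, and each "part of the network" is a hypothetical dictionary key.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a gradient given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def rescale_auxiliary(target_grad, aux_grad, relax=1.0, eps=1e-12):
    """Move the magnitude of aux_grad toward that of target_grad.

    relax is an assumed knob in [0, 1]: 1.0 matches the target magnitude
    exactly, 0.0 leaves the auxiliary gradient unchanged. The direction
    of aux_grad is never altered, only its length.
    """
    aux_norm = l2_norm(aux_grad)
    if aux_norm < eps:
        # A (near-)zero gradient has no direction to rescale.
        return list(aux_grad)
    ratio = l2_norm(target_grad) / aux_norm
    scale = relax * ratio + (1.0 - relax)
    return [scale * g for g in aux_grad]


def balance_step(target_grads, aux_grads_per_task, relax=1.0):
    """Combine target and rescaled auxiliary gradients, part by part.

    target_grads: dict mapping a part name to the target-loss gradient.
    aux_grads_per_task: list of dicts with the same keys, one per
    auxiliary task. Returns a dict with the summed gradient per part.
    """
    combined = {}
    for name, t_grad in target_grads.items():
        total = list(t_grad)
        for aux in aux_grads_per_task:
            scaled = rescale_auxiliary(t_grad, aux[name], relax)
            total = [a + b for a, b in zip(total, scaled)]
        combined[name] = total
    return combined
```

Calling `balance_step` once per training iteration, with one dictionary entry per shared parameter group, mirrors the "each iteration, each part of the network" behaviour described above. An auxiliary gradient ten times larger than the target gradient would be shrunk to the same length when `relax=1.0`, and a much smaller one would be enlarged.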

Index Terms

1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems
  2. Information systems applications
    1. Data mining

Index terms have been assigned to the content through auto-classification.

Published In

WWW '22: Proceedings of the ACM Web Conference 2022

April 2022

3764 pages

ISBN:9781450390965

DOI:10.1145/3485447

Editors:
Frédérique Laforest, INSA Lyon, France
Raphaël Troncy, EURECOM, France
Elena Simperl, King’s College London, UK
Deepak Agarwal, Pinterest, USA
Aristides Gionis, KTH Royal Institute of Technology, Sweden
Ivan Herman, W3C / retired
Lionel Médini, Université Lyon 1, France

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '22: The ACM Web Conference 2022

April 25 - 29, 2022

Virtual Event, Lyon, France

Sponsor: SIGWEB

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics

Total Citations: 11
Total Downloads: 2,190
Downloads (Last 12 months): 1,007
Downloads (Last 6 weeks): 94

Reflects downloads up to 27 Jul 2024

Cited By

Zhang Z, Liu S, Yu J, Cai Q, Zhao X, Zhang C, Liu Z, Liu Q, Zhao H, Hu L, Jiang P, Gai K, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y (2024). M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 893-902. Online publication date: 11-Jul-2024.
https://doi.org/10.1145/3626772.3657686
Grégoire E, Chaudhary M, Verboven S (2024). Sample-level weighting for multi-task learning with auxiliary tasks. Applied Intelligence 54:4, 3482-3501. Online publication date: 5-Mar-2024.
https://dl.acm.org/doi/10.1007/s10489-024-05300-9
Li W, Zheng W, Xiao X, Wang S (2023). STAN: Stage-Adaptive Network for Multi-Task Recommendation by Learning User Lifecycle-Based Representation. Proceedings of the 17th ACM Conference on Recommender Systems, 602-612. Online publication date: 14-Sep-2023.
https://dl.acm.org/doi/10.1145/3604915.3608796
Yang X, Zhao J, Liu S, Wang L, Zheng B, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (2023). Gradient Coordination for Quantifying and Maximizing Knowledge Transference in Multi-Task Learning. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2032-2036. Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3539618.3591993
Xu J, Wang C, Wu C, Song Y, Zheng K, Wang X, Wang C, Zhou G, Gai K, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (2023). Multi-behavior Self-supervised Learning for Recommendation. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 496-505. Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3539618.3591734
Li X, Ma C, Li G, Xu P, Liu C, Yuan Y, Wang G (2023). Meta Auxiliary Learning for Top-K Recommendation. IEEE Transactions on Knowledge and Data Engineering 35:10, 10857-10870. Online publication date: 1-Oct-2023.
https://dl.acm.org/doi/10.1109/TKDE.2022.3223155
Zhang Y, Liu Y, Xiong H, Liu Y, Yu F, He W, Xu Y, Cui L, Miao C (2023). Cross-Domain Disentangled Learning for E-Commerce Live Streaming Recommendation. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 2955-2968. Online publication date: Apr-2023.
https://doi.org/10.1109/ICDE55515.2023.00226
Lin J, Li H, Huang Y, Liang J, Zhou S, Huang Z, Liang G (2023). Multi-Objective Surrogate Modeling Through Transfer Learning for Telescopic Boom Forklift. IEEE Access 11, 11629-11641. Online publication date: 2023.
https://doi.org/10.1109/ACCESS.2023.3240602
Yao Z, Wang L (2023). Object localization and edge refinement network for salient object detection. Expert Systems with Applications: An International Journal 213:PB. Online publication date: 1-Mar-2023.
https://dl.acm.org/doi/10.1016/j.eswa.2022.118973
Wu H, Gao Y (2023). MT-BICN: Multi-task Balanced Information Cascade Network for Recommendation. Knowledge Science, Engineering and Management, 423-435. Online publication date: 16-Aug-2023.
https://dl.acm.org/doi/10.1007/978-3-031-40289-0_34
