Towards Domain-Aware Stable Meta Learning for Out-of-Distribution Generalization

Published: 16 August 2024

Abstract

Deep learning models are often trained on datasets that are limited in size and distribution and may not represent the full range of data encountered in practice. Making deep learning models generalize to out-of-distribution data has therefore received significant attention in recent studies, because this ability is critical in real-world applications. Meta learning is an effective knowledge transfer paradigm: it learns a base model with high generalization ability that can adapt to new data distributions by minimizing domain shifts across tasks during meta-training. However, most existing meta learning methods assume that the base model can access the domain labels of the training data, an assumption that is hard to satisfy in many real application scenarios. In addition, these methods focus on narrowing data-level domain shifts while ignoring task-level domain shifts, which may lead to inadequate or even negative transfer. Inspired by human learners, who use induction to learn and master new tasks, we propose a novel domain-aware meta learning framework for out-of-distribution generalization, termed SMLG. This framework enables the base model to generalize effectively to unseen domains without relying on domain-specific labels. Specifically, we develop a domain-aware transformation module that produces a meta representation and pseudo domain labels, so that the base model can be trained robustly without domain labels as direct input. Furthermore, to investigate the impact of domain shifts at different levels, we introduce a joint loss function that combines cross-entropy with a domain alignment constraint. Extensive experiments on benchmark datasets demonstrate the efficacy of our framework.
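
To make the idea of a joint loss concrete, the following is a minimal illustrative sketch, not the formulation used in the article. It assumes that the alignment constraint can be approximated by the mean squared distance between each pseudo-domain feature centroid and the global centroid, and that the two terms are balanced by a weight `lam`. The function names, the choice of penalty, and `lam` are all hypothetical and do not come from the abstract.

```python
import math
from collections import defaultdict


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of logit rows."""
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[y]
    return total / len(labels)


def alignment_penalty(features, pseudo_domains):
    """Assumed alignment term: mean squared distance between each
    pseudo-domain centroid and the global feature centroid."""
    dim = len(features[0])
    groups = defaultdict(list)
    for f, d in zip(features, pseudo_domains):
        groups[d].append(f)
    global_mean = [sum(f[k] for f in features) / len(features) for k in range(dim)]
    penalty = 0.0
    for members in groups.values():
        centroid = [sum(f[k] for f in members) / len(members) for k in range(dim)]
        penalty += sum((c - g) ** 2 for c, g in zip(centroid, global_mean))
    return penalty / len(groups)


def joint_loss(logits, labels, features, pseudo_domains, lam=0.1):
    """Cross-entropy plus a weighted alignment penalty (lam is hypothetical)."""
    return cross_entropy(logits, labels) + lam * alignment_penalty(features, pseudo_domains)
```

Under these assumptions, the penalty is zero when all pseudo domains share the same feature centroid and grows as the centroids drift apart, so minimizing the joint objective trades classification accuracy against domain alignment.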

Cited By

  • (2024) An Aggregation Procedure Enhanced Mechanism for GCN-Based Knowledge Graph Completion Model by Leveraging Condensed Sampling and Attention Optimization. Web and Big Data. 10.1007/978-981-97-7235-3_23 (341-356). Online publication date: 31-Aug-2024
  • (2024) Generating Graph-Based Rules for Enhancing Logical Reasoning. Advanced Intelligent Computing Technology and Applications. 10.1007/978-981-97-5615-5_12 (143-156). Online publication date: 5-Aug-2024

      Published In

      ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 8
      September 2024
      700 pages
      EISSN: 1556-472X
      DOI: 10.1145/3613713

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 August 2024
      Online AM: 03 July 2024
      Accepted: 27 June 2024
      Revised: 29 February 2024
      Received: 25 November 2022
      Published in TKDD Volume 18, Issue 8

      Author Tags

      1. Cross domain
      2. domain generalization
      3. task-level domain shift
      4. meta learning

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • International Science and Technology Cooperation Program of Jilin Province
      • Science and Technology Development Program of Jilin Province
      • Fifth Electronics Research Institute of the Ministry of Industry and Information Technology
