research-article

Towards Domain-Aware Stable Meta Learning for Out-of-Distribution Generalization

Authors:

Xin Wang

ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 8

Article No.: 203, Pages 1 - 24

https://doi.org/10.1145/3676558

Published: 16 August 2024

Abstract

Deep learning models are often trained on datasets that are limited in size and distribution and therefore may not represent the full range of data encountered in practice. As a result, making deep learning models generalize to out-of-distribution data has received significant attention in recent studies, because this ability is critical in real-world applications. Meta learning is an effective knowledge transfer paradigm that learns a base model with high generalization ability, able to adapt to new data distributions, by minimizing domain shifts across tasks during meta-training. However, most existing meta learning methods assume that the base model can access the labels of different domains, an assumption that is hard to satisfy in many real application scenarios. In addition, these methods focus on narrowing data-level domain shifts while ignoring task-level domain shifts, which may lead to inadequate or even negative transfer. Inspired by human learners, who use induction to learn and master new tasks, we propose a novel domain-aware meta learning framework for out-of-distribution generalization, termed SMLG. This framework enables the base model to generalize effectively to unseen domains without relying on domain-specific labels. Specifically, we develop a domain-aware transformation module to obtain meta representations and pseudo domain labels, so that the base model can be trained robustly without direct domain label input. Furthermore, to investigate the impact of domain shifts at different levels, we introduce a joint loss function that combines cross-entropy with a domain alignment constraint. Extensive experiments on benchmark datasets demonstrate the efficacy of our framework.
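The abstract does not give the form of the joint loss, so the following is only a minimal sketch of the general idea of adding a domain alignment term to cross-entropy. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the weight `lam`, and the choice of alignment term (the squared distance between each pseudo-domain's mean feature vector and the overall mean).

```python
import math
from collections import defaultdict


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of logit rows."""
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[y]
    return total / len(labels)


def alignment_penalty(features, pseudo_domains):
    """Illustrative alignment term (an assumption, not the paper's constraint).

    Averages, over pseudo domains, the squared distance between the
    domain's mean feature vector and the mean over the whole batch.
    """
    dim = len(features[0])
    groups = defaultdict(list)
    for f, d in zip(features, pseudo_domains):
        groups[d].append(f)
    overall = [sum(f[k] for f in features) / len(features) for k in range(dim)]
    penalty = 0.0
    for members in groups.values():
        mean = [sum(f[k] for f in members) / len(members) for k in range(dim)]
        penalty += sum((mean[k] - overall[k]) ** 2 for k in range(dim))
    return penalty / len(groups)


def joint_loss(logits, labels, features, pseudo_domains, lam=0.1):
    """Cross-entropy plus a weighted alignment term; `lam` is a hypothetical weight."""
    return cross_entropy(logits, labels) + lam * alignment_penalty(features, pseudo_domains)
```

In this sketch the pseudo domain labels are plain integers supplied by the caller; how such labels would actually be produced (the domain-aware transformation module) is not described in the abstract and is left out.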

Cited By

Wang Y, Ouyang X, Zhu X, Guo D, Zhang Y. (2024). An Aggregation Procedure Enhanced Mechanism for GCN-Based Knowledge Graph Completion Model by Leveraging Condensed Sampling and Attention Optimization. Web and Big Data, 341-356. DOI: 10.1007/978-981-97-7235-3_23. Online publication date: 31-Aug-2024
https://dl.acm.org/doi/10.1007/978-981-97-7235-3_23

Sun K, Jiang H, Hu Y, Yin B. (2024). Generating Graph-Based Rules for Enhancing Logical Reasoning. Advanced Intelligent Computing Technology and Applications, 143-156. DOI: 10.1007/978-981-97-5615-5_12. Online publication date: 5-Aug-2024
https://dl.acm.org/doi/10.1007/978-981-97-5615-5_12

Index Terms

Towards Domain-Aware Stable Meta Learning for Out-of-Distribution Generalization
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Multi-task learning
        1. Transfer learning
      2. Supervised learning

Information & Contributors

Information

Published In

ACM Transactions on Knowledge Discovery from Data Volume 18, Issue 8

September 2024

700 pages

EISSN: 1556-472X

DOI: 10.1145/3613713

Editor:
Jian Pei
Duke University, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 August 2024

Online AM: 03 July 2024

Accepted: 27 June 2024

Revised: 29 February 2024

Received: 25 November 2022

Published in TKDD Volume 18, Issue 8

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
International Science and Technology Cooperation Program of Jilin Province
Science and Technology Development Program of Jilin Province
Fifth Electronics Research Institute of the Ministry of Industry and Information Technology

Bibliometrics

Total Citations: 2
Total Downloads: 248
Downloads (Last 12 months): 248
Downloads (Last 6 weeks): 13

Reflects downloads up to 23 Feb 2025
