Not All Tasks Are Equal: A Parameter-Efficient Task Reweighting Method for Few-Shot Learning

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14170)

Abstract

Meta-learning has emerged as an effective and popular approach for few-shot learning (FSL) due to its fast adaptation to novel tasks. However, such methods assume that the meta-training and testing tasks come from the same task distribution and assign equal weights to all tasks during meta-training. This assumption limits their ability to perform well in real-world scenarios where some meta-training tasks contribute more to the testing tasks than others. To address this issue, we propose a parameter-efficient task reweighting (PETR) method, which assigns proper weights to meta-training tasks according to their contribution to the testing tasks while using few parameters. Specifically, we formulate a bi-level optimization problem to jointly learn the few-shot learning model and the task weights. In the inner loop, the meta-parameters of the few-shot learning model are updated based on a weighted training loss. In the outer loop, the task weight parameters are updated with the implicit gradient. Additionally, to address the challenge of a large number of task weight parameters, we introduce a hypothesis that significantly reduces the required parameters by considering the factors that influence the importance of each meta-training task. Empirical evaluation results on both traditional FSL and FSL with out-of-distribution (OOD) tasks show that our PETR method outperforms state-of-the-art meta-learning-based FSL methods by assigning proper weights to different meta-training tasks.
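The bi-level scheme described in the abstract (a weighted meta-training loss in the inner loop, task-weight updates driven by held-out tasks in the outer loop) can be illustrated with a small toy example. The following is a minimal PyTorch sketch, not the authors' implementation: the linear regression model, the synthetic tasks, the single unrolled inner gradient step, and backpropagation through that step (standing in for the implicit gradient used in the paper) are all simplifying assumptions, and it keeps one free weight per training task rather than PETR's reduced parameterization.

```python
# Toy illustration of bi-level task reweighting (not the authors' code).
# Assumptions: linear regression tasks, one unrolled inner gradient step in
# place of full inner-loop training, backprop through that step in place of
# the implicit gradient, and one weight parameter per meta-training task.
import torch

torch.manual_seed(0)
n_train_tasks, n_val_tasks, n_feat = 8, 2, 5

def make_task(shift=0.0):
    """Synthetic regression task: inputs x, targets y from a shifted weight vector."""
    w_true = torch.randn(n_feat) + shift
    x = torch.randn(20, n_feat)
    y = x @ w_true + 0.1 * torch.randn(20)
    return x, y

# The first few meta-training tasks are deliberately off-distribution (shift=2),
# so a useful reweighting should down-weight them.
train_tasks = [make_task(2.0 if i < 3 else 0.0) for i in range(n_train_tasks)]
val_tasks = [make_task(0.0) for _ in range(n_val_tasks)]  # stand-ins for testing tasks

theta = torch.zeros(n_feat, requires_grad=True)                 # meta-parameters
weight_logits = torch.zeros(n_train_tasks, requires_grad=True)  # task-weight parameters
inner_lr, outer_lr, meta_lr = 0.1, 0.5, 0.05

def task_loss(params, task):
    x, y = task
    return ((x @ params - y) ** 2).mean()

for step in range(200):
    # Inner loop: one gradient step on the weighted meta-training loss.
    w = torch.softmax(weight_logits, dim=0)
    weighted_loss = sum(w[i] * task_loss(theta, t) for i, t in enumerate(train_tasks))
    (grad_theta,) = torch.autograd.grad(weighted_loss, theta, create_graph=True)
    theta_adapted = theta - inner_lr * grad_theta

    # Outer loop: update the task weights so that the adapted parameters do well
    # on the held-out tasks; the gradient flows back through the inner step.
    val_loss = sum(task_loss(theta_adapted, t) for t in val_tasks) / n_val_tasks
    grad_logits, grad_meta = torch.autograd.grad(val_loss, (weight_logits, theta))
    with torch.no_grad():
        weight_logits -= outer_lr * grad_logits
        theta -= meta_lr * grad_meta

print("learned task weights:", torch.softmax(weight_logits, dim=0))
```

If the reweighting behaves as intended, the softmax weights of the shifted (off-distribution) training tasks should shrink relative to those of the in-distribution tasks, which is the qualitative behaviour the method aims for.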

Acknowledgments

This work was partly supported by the Fundamental Research Funds for the Central Universities (2019JBZ110); the Beijing Natural Science Foundation under Grant L211016; the National Natural Science Foundation of China under Grant 62176020; the National Key Research and Development Program (2020AAA0106800); and Chinese Academy of Sciences (OEIP-O-202004).

Author information

Corresponding author

Correspondence to Liping Jing.

Ethics declarations

Ethical Statement

The authors declare that they have no conflict of interest. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and national research committees. This article does not contain any studies with animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, X., Lyu, Y., Jing, L., Zeng, T., Yu, J. (2023). Not All Tasks Are Equal: A Parameter-Efficient Task Reweighting Method for Few-Shot Learning. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol. 14170. Springer, Cham. https://doi.org/10.1007/978-3-031-43415-0_25

  • DOI: https://doi.org/10.1007/978-3-031-43415-0_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43414-3

  • Online ISBN: 978-3-031-43415-0

  • eBook Packages: Computer Science, Computer Science (R0)
