Article

Canonical Shape Projection Is All You Need for 3D Few-Shot Class Incremental Learning

Authors: Ali Cheraghian, Zeeshan Hayder, Sameera Ramasinghe, Javad Jafaryahya, Lars Petersson, Mehrtash Harandi

Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XLI

Pages 36–53

https://doi.org/10.1007/978-3-031-72940-9_3

Published: 17 November 2024

Abstract

In recent years, robust pre-trained foundation models have been successfully used in many downstream tasks. Here, we use such models to address few-shot class incremental learning (FSCIL) on 3D point cloud objects. Our approach is to reprogram the well-known CLIP-based foundation model (trained on 2D image and text pairs) for this purpose. The CLIP model ingests 2D images, so to leverage it in our context, we project the 3D object point cloud onto 2D image space to create depth maps. For this, prior works consider a fixed and non-trainable set of camera poses. In contrast, we propose to train the network to find a projection that best describes the object and is appropriate for extracting 2D image features from the CLIP vision encoder. The generated depth map is not directly suitable for the CLIP model, so we apply the model reprogramming paradigm to the depth map, augmenting its foreground and background to adapt it. This removes the need to modify or fine-tune the foundation model. In the setting we investigate, access to data from novel classes is limited, which leads to overfitting. We address this problem with a prompt engineering approach that uses multiple GPT-generated text descriptions. Our method, C3PR, outperforms existing FSCIL methods on the ModelNet, ShapeNet, ScanObjectNN, and CO3D datasets. The code is available at https://github.com/alichr/C3PR.
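
To make the projection step concrete, the following is a minimal NumPy sketch of rendering a point cloud into a depth map under a given rotation. It is an illustration only and not the C3PR implementation: the function names, the orthographic camera, the 32-pixel resolution, and the intensity range are all assumptions, and the rotation is passed in as a fixed value, whereas the abstract describes a network that learns the projection.

```python
import numpy as np


def rotation_from_angles(azimuth: float, elevation: float) -> np.ndarray:
    """Build a 3x3 rotation from two angles in radians.

    Hypothetical stand-in for a pose that a network would predict.
    """
    ca, sa = np.cos(azimuth), np.sin(azimuth)
    ce, se = np.cos(elevation), np.sin(elevation)
    rot_y = np.array([[ca, 0.0, sa], [0.0, 1.0, 0.0], [-sa, 0.0, ca]])
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, ce, -se], [0.0, se, ce]])
    return rot_x @ rot_y


def project_to_depth_map(points: np.ndarray, rotation: np.ndarray,
                         size: int = 32) -> np.ndarray:
    """Orthographic projection of an (N, 3) cloud onto a size x size image.

    Assumption: the camera looks down the -z axis, so larger z is nearer
    and is drawn brighter. Background pixels stay at 0.
    """
    pts = points @ rotation.T              # rotate into the camera frame
    pts = pts - pts.mean(axis=0)           # center the object
    scale = np.abs(pts).max()
    if scale > 0:
        pts = pts / scale                  # all coordinates now in [-1, 1]

    # Map x to columns and y to rows (image rows grow downward).
    cols = np.rint((pts[:, 0] + 1.0) * 0.5 * (size - 1)).astype(int)
    rows = np.rint((1.0 - (pts[:, 1] + 1.0) * 0.5) * (size - 1)).astype(int)
    cols = np.clip(cols, 0, size - 1)
    rows = np.clip(rows, 0, size - 1)

    # Foreground intensity in [0.2, 1.0] so it never collides with background.
    intensity = (0.2 + 0.8 * (pts[:, 2] + 1.0) * 0.5).astype(np.float32)

    depth_map = np.zeros((size, size), dtype=np.float32)
    # Keep the nearest (brightest) point when several land on one pixel.
    np.maximum.at(depth_map, (rows, cols), intensity)
    return depth_map


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(1024, 3))     # synthetic cloud, not real data
    image = project_to_depth_map(cloud, rotation_from_angles(0.3, -0.2))
    print(image.shape, int(np.count_nonzero(image)))
```

In this sketch the two angles play the role of the camera pose. A fixed set of such angles corresponds to the prior-work setting mentioned in the abstract, while a trainable variant would require a differentiable rendering step, which the hard pixel assignment above does not provide.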


Published In

Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XLI

Sep 2024

585 pages

ISBN: 978-3-031-72939-3

DOI: 10.1007/978-3-031-72940-9

Editors:
Aleš Leonardis, University of Birmingham, Birmingham, UK
Elisa Ricci, University of Trento, Trento, Italy
Stefan Roth, Technical University of Darmstadt, Darmstadt, Germany
Olga Russakovsky, Princeton University, Princeton, NJ, USA
Torsten Sattler, Czech Technical University in Prague, Prague, Czech Republic
Gül Varol, École des Ponts ParisTech, Marne-la-Vallée, France

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Publisher

Springer-Verlag

Berlin, Heidelberg
