Pathology-Knowledge Enhanced Multi-instance Prompt Learning for Few-Shot Whole Slide Image Classification

Qu, Linhao; Yang, Dingkang; Huang, Dan; Guo, Qinhao; Luo, Rongkui; Zhang, Shaoting; Wang, Xiaosong

doi:10.1007/978-3-031-73247-8_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15069)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Current multi-instance learning algorithms for pathology image analysis often require a substantial number of Whole Slide Images (WSIs) for effective training and exhibit suboptimal performance when training data are limited. In clinical settings, restricted access to pathology slides is inevitable due to patient privacy concerns and the prevalence of rare or emerging diseases. The Few-shot Weakly Supervised WSI Classification setting addresses the challenge of limited slide data and sparse slide-level labels for diagnosis. Prompt learning based on pre-trained models (e.g., CLIP) appears to be a promising scheme for this setting; however, current research in this area is limited, and existing algorithms often focus solely on patch-level prompts or confine themselves to language prompts. This paper proposes a multi-instance prompt learning framework enhanced with pathology knowledge, i.e., one that integrates visual and textual prior knowledge into prompts at both the patch and slide levels. The training process employs a combination of static and learnable prompts, guiding the activation of pre-trained models and facilitating the diagnosis of key pathology patterns. Lightweight Messenger (self-attention) and Summary (attention-pooling) layers are introduced to model relationships between patches and slides within the same patient data. Additionally, alignment-wise contrastive losses ensure feature-level alignment between visual and textual learnable prompts for both patches and slides. Our method demonstrates superior performance in three challenging clinical tasks, significantly outperforming comparative few-shot methods.
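
The abstract names two generic building blocks, attention pooling and a contrastive alignment loss, without giving their exact form. The sketch below illustrates the textbook versions of both in plain Python: softmax attention pooling of patch features into one slide vector, and a symmetric InfoNCE-style loss between paired visual and textual vectors. It is not the authors' implementation. The function names, the single query vector, the scaling, and the temperature value are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(patch_features, query):
    """Pool patch vectors into one slide vector (hypothetical Summary-style layer).

    Each patch gets a softmax weight from its scaled dot product with `query`;
    the slide vector is the weighted sum of the patch vectors.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(f, query) / scale for f in patch_features])
    dim = len(patch_features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, patch_features))
        for d in range(dim)
    ]
    return pooled, weights


def l2_normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0  # guard against the zero vector
    return [x / norm for x in v]


def contrastive_alignment_loss(visual, textual, temperature=0.07):
    """Symmetric InfoNCE-style loss: visual[i] should match textual[i].

    The temperature is an assumed value, not one taken from the paper.
    """
    v = [l2_normalize(x) for x in visual]
    t = [l2_normalize(x) for x in textual]
    n = len(v)
    logits = [[dot(v[i], t[j]) / temperature for j in range(n)] for i in range(n)]
    # Visual-to-text direction: softmax over each row.
    loss_v2t = -sum(math.log(softmax(logits[i])[i]) for i in range(n)) / n
    # Text-to-visual direction: softmax over each column.
    loss_t2v = -sum(
        math.log(softmax([logits[i][j] for i in range(n)])[j]) for j in range(n)
    ) / n
    return 0.5 * (loss_v2t + loss_t2v)
```

With a zero query vector every patch receives the same attention weight, so the pooled vector reduces to the mean of the patch features. The loss is small when each visual vector is closest to its own textual partner and grows when the pairs are swapped.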

Acknowledgements

This work is funded by the National Key R&D Program of China (2022ZD0160700) and Shanghai AI Laboratory.

Author information

Authors and Affiliations

Shanghai Artificial Intelligence Laboratory, Shanghai, China
Linhao Qu, Shaoting Zhang & Xiaosong Wang
Academy for Engineering and Technology, Fudan University, Shanghai, China
Dingkang Yang
Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai, China
Dan Huang
Department of Gynecologic Oncology, Fudan University Shanghai Cancer Center, Shanghai, China
Qinhao Guo
Department of Pathology, Zhongshan Hospital, Fudan University, Shanghai, China
Rongkui Luo

Corresponding author

Correspondence to Xiaosong Wang.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 328 KB)

Cite this paper

Qu, L. et al. (2025). Pathology-Knowledge Enhanced Multi-instance Prompt Learning for Few-Shot Whole Slide Image Classification. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15069. Springer, Cham. https://doi.org/10.1007/978-3-031-73247-8_12

DOI: https://doi.org/10.1007/978-3-031-73247-8_12
Published: 01 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73246-1
Online ISBN: 978-3-031-73247-8
eBook Packages: Computer Science, Computer Science (R0)
