Snuffy: Efficient Whole Slide Image Classifier

Jafarinia, Hossein; Alipanah, Alireza; Razavi, Saeed; Mirzaie, Nahal; Rohban, Mohammad Hossein

doi:10.1007/978-3-031-73024-5_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15147)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Whole Slide Image (WSI) classification with multiple instance learning (MIL) in digital pathology faces significant computational challenges. Current methods mostly rely on extensive self-supervised learning (SSL) to reach satisfactory performance, which requires long training periods and considerable computational resources. At the same time, forgoing pre-training degrades performance because of the domain shift from natural images to WSIs. We introduce the Snuffy architecture, a novel MIL-pooling method based on sparse transformers that mitigates the performance loss under limited pre-training and makes continual few-shot pre-training a competitive option. Our sparsity pattern is tailored for pathology and is theoretically proven to be a universal approximator, with the tightest probabilistic sharp bound to date on the number of layers for sparse transformers. We demonstrate Snuffy's effectiveness on the CAMELYON16 and TCGA Lung cancer datasets, achieving superior WSI-level and patch-level accuracies. The code is available at https://github.com/jafarinia/snuffy.
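
To make the idea of MIL pooling with sparse attention concrete, the following is a minimal, illustrative sketch and not the Snuffy implementation. The sparsity pattern used here (a few "global" patches plus random links), the absence of learned weights, the final mean-pooling step, and all function names (`sparse_mask`, `masked_attention`, `pool_bag`) are assumptions made for this example; the pattern actually used by Snuffy is defined in the paper and in the linked repository.

```python
import math
import random


def sparse_mask(n, global_idx, n_random, rng):
    """Boolean n x n mask: mask[i][j] is True when patch i may attend to patch j.

    Hypothetical pattern for illustration: self links, a few global patches,
    and a few random links per patch.
    """
    mask = [[i == j for j in range(n)] for i in range(n)]
    for g in global_idx:
        for j in range(n):
            mask[g][j] = True  # a global patch attends to every patch
            mask[j][g] = True  # every patch attends to each global patch
    for i in range(n):
        for j in rng.sample(range(n), min(n_random, n)):
            mask[i][j] = True  # a few random links per patch
    return mask


def masked_attention(x, mask):
    """Single-head dot-product self-attention restricted to the mask.

    No learned projections are used; queries, keys and values are all x.
    Every row of the mask must allow at least one position.
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        allowed = [j for j in range(n) if mask[i][j]]
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d) for j in allowed
        ]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(
            [sum(w / z * x[j][k] for w, j in zip(weights, allowed)) for k in range(d)]
        )
    return out


def pool_bag(x, mask):
    """Mean-pool the attended patch features into one slide-level vector."""
    h = masked_attention(x, mask)
    return [sum(row[k] for row in h) / len(h) for k in range(len(h[0]))]


rng = random.Random(0)
# A toy "bag": 6 patches with 4-dimensional features standing in for patch embeddings.
bag = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
mask = sparse_mask(len(bag), global_idx=[0], n_random=1, rng=rng)
slide_vec = pool_bag(bag, mask)
```

In this sketch each patch attends to only a handful of others, so the number of attention links grows roughly linearly with the number of patches rather than quadratically, which is the general motivation for sparse attention on slides containing many thousands of patches.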

Acknowledgements

We extend our deepest and most special thanks to Danial Hamdi for their efforts. We also thank Mohammad Mosayyebi, Mehrab Moradzadeh, Mohammad Hosein Movasaghinia, Mohammad Azizmalayeri, Hossein Mirzaei, Mohammad Mozafari, Soroush Vafaei Tabar, Mohammad Hassan Alikhani, and Hosein Hasani.

Author information

Authors and Affiliations

Sharif University of Technology, Tehran, Iran
Hossein Jafarinia, Alireza Alipanah, Saeed Razavi, Nahal Mirzaie & Mohammad Hossein Rohban

Corresponding author

Correspondence to Mohammad Hossein Rohban.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 8840 KB)

About this paper

Cite this paper

Jafarinia, H., Alipanah, A., Razavi, S., Mirzaie, N., Rohban, M.H. (2025). Snuffy: Efficient Whole Slide Image Classifier. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15147. Springer, Cham. https://doi.org/10.1007/978-3-031-73024-5_15

DOI: https://doi.org/10.1007/978-3-031-73024-5_15
Published: 24 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73023-8
Online ISBN: 978-3-031-73024-5
eBook Packages: Computer Science, Computer Science (R0)
