
Snuffy: Efficient Whole Slide Image Classifier

Conference paper, in: Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Whole Slide Image (WSI) classification with multiple instance learning (MIL) in digital pathology faces significant computational challenges. Current methods mostly rely on extensive self-supervised learning (SSL) to reach satisfactory performance, requiring long training periods and considerable computational resources. Conversely, skipping pre-training altogether degrades performance because of the domain shift from natural images to WSIs. We introduce the Snuffy architecture, a novel MIL-pooling method based on sparse transformers that mitigates the performance loss incurred under limited pre-training and makes continual few-shot pre-training a competitive option. Our sparsity pattern is tailored to pathology and is theoretically proven to be a universal approximator with, to date, the tightest probabilistic sharp bound on the number of layers for sparse transformers. We demonstrate Snuffy's effectiveness on the CAMELYON16 and TCGA Lung cancer datasets, achieving superior WSI- and patch-level accuracies. The code is available at https://github.com/jafarinia/snuffy.
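
As a rough illustration of the idea the abstract names (attention-based MIL pooling over a bag of patch embeddings, with a sparse mask restricting which patches attend to which), the following PyTorch sketch may help. It is not the Snuffy implementation (see the repository above for that): the random-plus-global sparsity pattern, the layer sizes, and all class and function names here are illustrative assumptions, not the paper's pathology-tailored pattern.

    # Minimal sketch of sparse-attention MIL pooling over WSI patch embeddings.
    # NOT the Snuffy implementation; the random+global sparsity pattern and all
    # sizes/names below are illustrative assumptions.
    import torch
    import torch.nn as nn

    def sparse_mask(n: int, n_global: int = 1, n_random: int = 8) -> torch.Tensor:
        """Boolean (n, n) mask, True where attention is allowed: a few global
        tokens attend everywhere (and are attended to by all), and every token
        additionally attends to itself and a handful of random tokens."""
        mask = torch.zeros(n, n, dtype=torch.bool)
        mask[:n_global, :] = True                       # global tokens see all
        mask[:, :n_global] = True                       # all see global tokens
        mask.scatter_(1, torch.randint(0, n, (n, n_random)), True)  # random links
        mask |= torch.eye(n, dtype=torch.bool)          # always attend to self
        return mask

    class SparseMILPooling(nn.Module):
        """One masked self-attention layer, mean pooling, and a linear bag
        classifier: patch embeddings in, slide-level logit out."""

        def __init__(self, dim: int = 384, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)
            self.classifier = nn.Linear(dim, 1)

        def forward(self, patches: torch.Tensor) -> torch.Tensor:
            # patches: (batch, n_patches, dim) frozen SSL patch embeddings
            allowed = sparse_mask(patches.shape[1])
            # nn.MultiheadAttention treats True as "not allowed", so invert
            out, _ = self.attn(patches, patches, patches, attn_mask=~allowed)
            pooled = self.norm(out).mean(dim=1)         # aggregate the bag
            return self.classifier(pooled).squeeze(-1)  # slide-level logit

    if __name__ == "__main__":
        bag = torch.randn(1, 500, 384)  # one slide, 500 patches, 384-d features
        print(SparseMILPooling()(bag).shape)  # torch.Size([1])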




Acknowledgements

We extend our deepest thanks to Danial Hamdi for their efforts. We also thank Mohammad Mosayyebi, Mehrab Moradzadeh, Mohammad Hosein Movasaghinia, Mohammad Azizmalayeri, Hossein Mirzaei, Mohammad Mozafari, Soroush Vafaei Tabar, Mohammad Hassan Alikhani, and Hosein Hasani.

Author information

Corresponding author

Correspondence to Mohammad Hossein Rohban.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 8840 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Jafarinia, H., Alipanah, A., Razavi, S., Mirzaie, N., Rohban, M.H. (2025). Snuffy: Efficient Whole Slide Image Classifier. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15147. Springer, Cham. https://doi.org/10.1007/978-3-031-73024-5_15


  • DOI: https://doi.org/10.1007/978-3-031-73024-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73023-8

  • Online ISBN: 978-3-031-73024-5

  • eBook Packages: Computer Science; Computer Science (R0)
