Finding Significantly Enriched Cells in Single-Cell RNA Sequencing by Single-Sample Approaches

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2022)

Abstract

Gene set analysis is a leading bioinformatics technique for comparing phenotypes at the gene set level, and it is applied across transcriptome-wide gene expression platforms and omics levels. The aim of this study was to measure the performance of three single-sample gene set enrichment algorithms, based on their ability to assess the statistical significance of enrichment in each cell separately using scRNA-Seq data. A peripheral blood mononuclear cell dataset was used for the evaluation, and individual enrichment within the B cell subtype was investigated using a reference gene set collection. Sensitivity, specificity, prioritization, and balanced accuracy served as evaluation metrics, accompanied by a correlation analysis between gene sets. AUCell, originally designed for scRNA-Seq, showed the best sensitivity and balanced accuracy, good prioritization, and acceptable specificity. However, a strong correlation between gene set size and specificity was observed, so we recommend using it on large gene sets (size > 80). Moreover, its computational time is much longer than that of the other tested methods. Among the other algorithms, CERNO gave very high specificity and prioritization, but its sensitivity needs to be improved by refining the algorithm. Finally, the problem of a “gold standard” dataset and gene set collection for evaluating the performance of gene set analysis algorithms on scRNA-Seq data was stated, and an initial solution was presented.
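
As a rough illustration of the cell-level metrics named in the abstract, the sketch below computes sensitivity, specificity, and balanced accuracy from per-cell significance calls. It assumes the standard textbook definitions and treats B cells as the positive class for a B-cell-specific gene set; the abstract does not state the exact definitions used in the paper, and the function name, variable names, and toy labels are all hypothetical.

```python
from typing import Dict, Sequence


def evaluate_calls(is_target: Sequence[bool],
                   is_significant: Sequence[bool]) -> Dict[str, float]:
    """Score per-cell enrichment calls against known cell labels.

    is_target[i]      -- True if cell i belongs to the cell type the gene set
                         is expected to mark (assumption: B cells).
    is_significant[i] -- True if the method called cell i significantly
                         enriched (e.g. adjusted p-value below a threshold).
    """
    if len(is_target) != len(is_significant):
        raise ValueError("label and call vectors must have the same length")

    pairs = list(zip(is_target, is_significant))
    tp = sum(1 for t, s in pairs if t and s)          # target cell, called enriched
    fn = sum(1 for t, s in pairs if t and not s)      # target cell, missed
    tn = sum(1 for t, s in pairs if not t and not s)  # other cell, not called
    fp = sum(1 for t, s in pairs if not t and s)      # other cell, called enriched

    # Guard against empty classes so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    balanced_accuracy = (sensitivity + specificity) / 2

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Toy data (made up): 4 B cells followed by 6 cells of other types.
is_b_cell = [True, True, True, True, False, False, False, False, False, False]
called = [True, True, True, False, False, False, False, False, True, False]

metrics = evaluate_calls(is_b_cell, called)
print(metrics)
```

Balanced accuracy averages the two rates, so it is not dominated by the majority class when the target cell type is a small fraction of all cells.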

Acknowledgements

This work was financed by a Silesian University of Technology grant for maintaining and developing research potential (MM, JZ). Anna Mrukwa takes part in the mentoring program “Spread your wings” at the Silesian University of Technology and was financed from the reserve of the Vice-Rector for Student Affairs and Education (60/001).

Author information

Corresponding author

Correspondence to Joanna Zyla.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Mrukwa, A., Marczyk, M., Zyla, J. (2022). Finding Significantly Enriched Cells in Single-Cell RNA Sequencing by Single-Sample Approaches. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13347. Springer, Cham. https://doi.org/10.1007/978-3-031-07802-6_3

  • DOI: https://doi.org/10.1007/978-3-031-07802-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07801-9

  • Online ISBN: 978-3-031-07802-6

  • eBook Packages: Computer Science, Computer Science (R0)
