DOI: 10.1145/3388440.3412409
research-article
Open access

Submodular sketches of single-cell RNA-seq measurements

Published: 10 November 2020

Abstract

Single-cell RNA-seq (scRNA-seq) datasets now routinely profile tens of thousands to millions of cells. These data are invaluable for finding important subpopulations of cells and for closely studying the mechanics of gene expression. However, as these datasets grow larger, they become more difficult to analyze. Analyzing and sharing massive scRNA-seq datasets can be facilitated by creating a "sketch" of the data: a selected subset of cells that accurately represents the full dataset. In this work, we use an existing benchmark to demonstrate the utility of submodular optimization in efficiently creating high-quality sketches of scRNA-seq data.
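
To make the idea of a sketch concrete, the following is a minimal illustration of selecting a representative subset by greedy submodular maximization. It is a sketch under stated assumptions, not the method evaluated in the paper: the abstract does not name an objective, so a facility-location function is assumed here, and the function names, the similarity kernel, and the toy data are all hypothetical. For a monotone submodular objective under a cardinality constraint, the greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimal value.

```python
import math


def similarity(a, b):
    """Assumed kernel: 1 for identical profiles, decaying with squared Euclidean distance."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist)


def greedy_facility_location(cells, k):
    """Greedily pick k cell indices that maximize a facility-location objective.

    f(S) = sum over all cells i of max over selected j in S of similarity(i, j)
    """
    n = len(cells)
    # Dense pairwise similarities; fine for a toy example, too costly at scale.
    sim = [[similarity(cells[i], cells[j]) for j in range(n)] for i in range(n)]
    best = [0.0] * n  # best[i]: similarity of cell i to its most similar selected cell
    selected = []
    for _ in range(min(k, n)):
        top_gain, top_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding cell j to the current selection.
            gain = sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            if gain > top_gain:
                top_gain, top_j = gain, j
        selected.append(top_j)
        best = [max(best[i], sim[i][top_j]) for i in range(n)]
    return selected


# Hypothetical toy data: four similar cells plus one rare, distant cell.
toy_cells = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
sketch = greedy_facility_location(toy_cells, k=2)
```

On this toy input the first pick comes from the dense group and the second is the rare cell, because a second cell from the dense group adds almost nothing to the objective. This diminishing-returns behavior is what makes submodular objectives a natural fit for sketching.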


Cited By

  • (2023) KMD clustering: robust general-purpose clustering of biological data. Communications Biology 6:1. DOI: 10.1038/s42003-023-05480-z. Online publication date: 2-Nov-2023
  • (2021) Fast and memory-efficient scRNA-seq k-means clustering with various distances. Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 1-8. DOI: 10.1145/3459930.3469523. Online publication date: 1-Aug-2021

    Published In

    BCB '20: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
    September 2020
    193 pages
    ISBN:9781450379649
    DOI:10.1145/3388440
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. single-cell expression
    2. sketching
    3. submodular maximization

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate 254 of 885 submissions, 29%
