DOI: 10.1145/3583780.3615061
Research article (public access), CIKM conference proceedings

Single-Cell Multimodal Prediction via Transformers

Published: 21 October 2023

Abstract

Recent advances in multimodal single-cell technology have made it possible to acquire multiple omics data from individual cells, enabling a deeper understanding of cellular states and dynamics. Nevertheless, the proliferation of multimodal single-cell data also introduces tremendous challenges in modeling the complex interactions among different modalities. Recent methods focus on constructing static interaction graphs and applying graph neural networks (GNNs) to learn from multimodal data. However, such static graphs can be suboptimal because they do not take advantage of downstream task information; meanwhile, GNNs have inherent limitations when many layers are stacked. To tackle these issues, we investigate how to leverage transformers for multimodal single-cell data in an end-to-end manner while exploiting downstream task information. In particular, we propose scMoFormer, a framework that can readily incorporate external domain knowledge and model the interactions both within each modality and across modalities. Extensive experiments demonstrate that scMoFormer achieves superior performance on various benchmark datasets. Remarkably, scMoFormer won a Kaggle silver medal with a rank of 24/1221 (top 2%) without ensembling in a NeurIPS 2022 competition. Our implementation is publicly available on GitHub.
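As a rough illustration of what "modeling interactions across modalities" can mean in a transformer, the sketch below implements plain scaled dot-product cross-attention, in which tokens from one modality attend over tokens from another. It is not the scMoFormer implementation: the function names, the toy "protein" and "gene" embeddings, and the omission of learned query/key/value projections are all assumptions made for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    Each query token (from one modality) attends over the key/value
    tokens of another modality. Learned projections are omitted here.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy embeddings: 2 "protein" tokens query 3 "gene" tokens.
protein_tokens = [[1.0, 0.0], [0.0, 1.0]]
gene_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended, attn = cross_attention(protein_tokens, gene_tokens, gene_tokens)
```

Each row of `attn` is a probability distribution over the gene tokens, so every protein token ends up as a weighted mixture of gene embeddings. Unlike a fixed interaction graph, these weights are computed from the embeddings themselves and can therefore be shaped by the training objective.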


    Published In

    CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
    October 2023
    5508 pages
    ISBN:9798400701245
    DOI:10.1145/3583780
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. graph neural networks
    2. single-cell analysis
    3. transformer

    Qualifiers

    • Research-article

    Conference

    CIKM '23
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Upcoming Conference

    CIKM '25

    Article Metrics

    • Downloads (last 12 months): 420
    • Downloads (last 6 weeks): 100
    Reflects downloads up to 11 Jan 2025.

    Cited By

    • (2024) Transformer-Based Single-Cell Language Model: A Survey. Big Data Mining and Analytics, 7:4 (1169-1186). DOI: 10.26599/BDMA.2024.9020034. Online publication date: Dec-2024
    • (2024) Clustering and visualization of single-cell RNA-seq data using path metrics. PLOS Computational Biology, 20:5 (e1012014). DOI: 10.1371/journal.pcbi.1012014. Online publication date: 29-May-2024
    • (2024) Strategic Multi-Omics Data Integration via Multi-Level Feature Contrasting and Matching. IEEE Transactions on NanoBioscience, 23:4 (579-590). DOI: 10.1109/TNB.2024.3456797. Online publication date: Oct-2024
    • (2024) Delineating the effective use of self-supervised learning in single-cell genomics. Nature Machine Intelligence. DOI: 10.1038/s42256-024-00934-3. Online publication date: 27-Dec-2024
    • (2024) Transformers in single-cell omics: a review and new perspectives. Nature Methods, 21:8 (1430-1443). DOI: 10.1038/s41592-024-02353-z. Online publication date: 9-Aug-2024
    • (2024) A review of transformers in drug discovery and beyond. Journal of Pharmaceutical Analysis (101081). DOI: 10.1016/j.jpha.2024.101081. Online publication date: Aug-2024
    • (2024) Advances and applications in single-cell and spatial genomics. Science China Life Sciences. DOI: 10.1007/s11427-024-2770-x. Online publication date: 20-Dec-2024
