DOI: 10.1145/3592686.3592748

scAEQN: A Batch Correction Joint Dimension Reduction Method on scRNA-seq Data

Published: 31 May 2023

Abstract

As single-cell sequencing techniques continue to advance, scRNA-seq datasets keep growing in size, which introduces batch effects that distort downstream analyses such as clustering and differentially expressed gene (DEG) analysis. In this context, we present scAEQN, a novel method that performs batch integration jointly with dimensionality reduction. It uses QuantNorm to calculate a coefficient matrix and constructs an autoencoder to estimate that matrix, ultimately obtaining both a low-dimensional representation and a reconstruction of the data. scAEQN is compared with other batch correction methods on one simulated dataset and six real single-cell RNA datasets. The results suggest that scAEQN outperforms the compared batch correction methods in downstream analysis. scAEQN effectively removes batch effects while strongly preserving the clustering pattern of cells, providing solid support for downstream analyses: it improves clustering performance and selects more representative and stable DEGs in differential expression analysis. The source code and supplementary information for scAEQN are available at https://github.com/SiningSong/scAEQN.
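For readers unfamiliar with the normalization step named in the abstract, the sketch below shows generic column-wise quantile normalization in NumPy. It illustrates the general technique only and is not the QuantNorm or scAEQN implementation (see the authors' repository for that). The function name `quantile_normalize` and the toy matrix `demo` are assumptions introduced here.

```python
import numpy as np


def quantile_normalize(x: np.ndarray) -> np.ndarray:
    """Give every column of x the same empirical distribution.

    Hypothetical helper for illustration; ties within a column are
    broken arbitrarily by argsort rather than averaged.
    """
    order = np.argsort(x, axis=0)                 # row indices that sort each column
    ranks = np.argsort(order, axis=0)             # rank of each entry within its column
    reference = np.sort(x, axis=0).mean(axis=1)   # mean of the k-th smallest values
    return reference[ranks]                       # map each rank to the shared reference


# Toy data: two "samples" (columns) measured on very different scales.
demo = np.array([[1.0, 10.0],
                 [3.0, 30.0],
                 [2.0, 20.0]])
normalized = quantile_normalize(demo)
```

After normalization both columns contain the same set of values, so scale differences between columns no longer dominate a comparison, while the ordering of entries within each column is unchanged.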


Cited By

  • (2024) Batch effects correction in scRNA-seq based on biological-noise decoupling autoencoder and central-cross loss. Computational Biology and Chemistry 113 (108261). DOI: 10.1016/j.compbiolchem.2024.108261. Online publication date: Dec-2024

Published In

BIC '23: Proceedings of the 2023 3rd International Conference on Bioinformatics and Intelligent Computing
February 2023
398 pages
ISBN:9798400700200
DOI:10.1145/3592686

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

BIC 2023
