Abstract
As a powerful tool for genome-wide gene expression analysis, DNA microarray technology is widely used in biomedical research. One important application of microarrays is to identify differentially expressed genes (DEGs) between two distinct biological conditions, e.g. disease versus normal or treatment versus control, so that the underlying molecular mechanism differentiating the two conditions may be revealed. Mechanistic interpretation of microarray results requires the identification of reproducible and reliable lists of DEGs, because irreproducible lists of DEGs may lead to different biological conclusions. Many vendors provide microarray platforms with different characteristics for gene expression analysis, and the widely publicized apparent lack of intra- and cross-platform concordance in DEGs from microarray analysis of the same sets of study samples has been of great concern to the scientific community and to regulatory agencies such as the US Food and Drug Administration (FDA). In this chapter, we describe the study design of, and the main findings from, the FDA-led MicroArray Quality Control (MAQC) project, which aims to objectively assess the performance of different microarray platforms and the advantages and limitations of various competing statistical methods in identifying DEGs from microarray data. Using large data sets generated on two human reference RNA samples established by the MAQC project, we show that the levels of concordance in inter-laboratory and cross-platform comparisons are generally high. Furthermore, the levels of concordance largely depend on the statistical criteria used for ranking and selecting DEGs, irrespective of the chosen platforms or test sites. Importantly, a straightforward method combining fold-change ranking with a non-stringent P-value cutoff produces more reproducible lists of DEGs than t-test P-value ranking does. Similar conclusions are reached when microarray data sets from a rat toxicogenomics study are analyzed.
The availability of the MAQC reference RNA samples and the large reference data sets provides a unique resource for the gene expression community to reach consensus on the “best practices” for the generation, analysis, and applications of microarray data in drug development and personalized medicine.
The views presented in this article do not necessarily reflect those of the US Food and Drug Administration.
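The two gene-selection strategies contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The abstract does not give the P-value cutoff, the scale of the fold change, or the list size, so the values used here (P < 0.05, log2 fold change, top 2 genes) are assumptions, as are the function names and the toy data. The `percent_overlap` helper is a simple illustrative concordance measure and is not necessarily the metric used in the chapter.

```python
from typing import Dict, List, Tuple

# gene -> (log2 fold change, t-test P-value); both are assumed to be precomputed
Stats = Dict[str, Tuple[float, float]]


def rank_by_fold_change(stats: Stats, p_cutoff: float = 0.05, top_n: int = 100) -> List[str]:
    """Keep genes with P < p_cutoff, then rank them by |log2 fold change|."""
    passing = [gene for gene, (_, p) in stats.items() if p < p_cutoff]
    passing.sort(key=lambda gene: abs(stats[gene][0]), reverse=True)
    return passing[:top_n]


def rank_by_p_value(stats: Stats, top_n: int = 100) -> List[str]:
    """Rank genes by P-value alone, smallest first."""
    return sorted(stats, key=lambda gene: stats[gene][1])[:top_n]


def percent_overlap(list_a: List[str], list_b: List[str]) -> float:
    """Percentage of genes shared by two lists, relative to the shorter list."""
    if not list_a or not list_b:
        return 0.0
    shared = len(set(list_a) & set(list_b))
    return 100.0 * shared / min(len(list_a), len(list_b))


# Hypothetical statistics for five genes measured at one test site.
site1: Stats = {
    "G1": (3.0, 0.01),
    "G2": (2.5, 0.04),
    "G3": (0.2, 0.001),   # tiny fold change, but a very small P-value
    "G4": (-2.8, 0.03),
    "G5": (1.0, 0.2),     # fails the P-value filter
}

fc_list = rank_by_fold_change(site1, p_cutoff=0.05, top_n=2)
p_list = rank_by_p_value(site1, top_n=2)
overlap = percent_overlap(fc_list, p_list)
```

In this toy example, P-value ranking promotes G3 despite its small fold change, while the fold-change ranking with a P-value filter selects the genes with the largest changes. Applying the same functions to statistics from a second site or platform and comparing the resulting lists with `percent_overlap` gives a rough sense of how list reproducibility depends on the selection rule.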
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wen, Z. et al. (2011). The MicroArray Quality Control (MAQC) Project and Cross-Platform Analysis of Microarray Data. In: Lu, HS., Schölkopf, B., Zhao, H. (eds) Handbook of Statistical Bioinformatics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16345-6_9
DOI: https://doi.org/10.1007/978-3-642-16345-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16344-9
Online ISBN: 978-3-642-16345-6
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)