Abstract
The success of mass spectrometry-based proteomics in emerging applications such as biomarker discovery and clinical diagnostics depends substantially on its ability to meet growing demands for throughput. Support for high throughput implies sophisticated tracking of experiments and experimental steps, larger amounts of data to be organized and summarized, more complex algorithms for inferring and tracking protein expression across multiple experiments, statistical methods to assess data quality, and a streamlined proteomics-centric bioinformatics environment to establish the biological context and relevance of the experimental measurements. This paper presents a bioinformatics platform that was built for an industrial mass spectrometry-based proteomics laboratory focusing on biomarker discovery. The basis of the platform is a robust and scalable information management environment, supported by database and workflow management technology, that integrates heterogeneous data, applications, and processes across the entire laboratory workflow. This paper focuses on selected features of the platform, which include: (a) a method for improving the accuracy of protein assignment, (b) novel software tools for protein expression analysis that combine differential MS quantitation with tandem MS for peptide identification, and (c) integration of methods to help assess the biological relevance and statistical significance of differentially expressed proteins.
The work reported in this paper was carried out by the authors at MDS Proteomics / Protana.
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Topaloglou, T., Dharsee, M., Ewing, R.M., Bukhman, Y. (2007). A High-Throughput Bioinformatics Platform for Mass Spectrometry-Based Proteomics. In: Cohen-Boulakia, S., Tannen, V. (eds) Data Integration in the Life Sciences. DILS 2007. Lecture Notes in Computer Science, vol 4544. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73255-6_9
DOI: https://doi.org/10.1007/978-3-540-73255-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73254-9
Online ISBN: 978-3-540-73255-6
eBook Packages: Computer Science, Computer Science (R0)