Covering: 2008 to 2014 Mass spectrometry is a key technology for the identification and structura... more Covering: 2008 to 2014 Mass spectrometry is a key technology for the identification and structural elucidation of natural products. Manual interpretation of the resulting data is tedious and time-consuming, so methods for automated analysis are highly sought after. In this review, we focus on four recently developed methods for the detection and investigation of small molecules, namely MetFrag/MetFusion, ISIS, FingerID, and FT-BLAST. These methods have the potential to significantly advance the field of computational mass spectrometry for the research of natural products. For example, they may help with the dereplication of compounds at an early stage of the drug discovery process; that is, the detection of molecules that are identical or highly similar to known drugs or drug leads. Furthermore, when a potential drug lead has been determined, these tools may help to identify it and elucidate its structure.
Since metabolites cannot be predicted from the genome sequence, high-throughput de novo identific... more Since metabolites cannot be predicted from the genome sequence, high-throughput de novo identification of small molecules is highly sought. Mass spectrometry (MS) in combination with a fragmentation technique is commonly used for this task. Unfortunately, automated analysis of such data is in its infancy. Recently, fragmentation trees have been proposed as an analysis tool for such data. Additional fragmentation steps (MS(n)) reveal more information about the molecule. We propose to use MS(n) data for the computation of fragmentation trees, and present the Colorful Subtree Closure problem to formalize this task: There, we search for a colorful subtree inside a vertex-colored graph, such that the weight of the transitive closure of the subtree is maximal. We give several negative results regarding the tractability and approximability of this and related problems. We then present an exact dynamic programming algorithm, which is parameterized by the number of colors in the graph and is swift in practice. Evaluation of our method on a dataset of 45 reference compounds showed that the quality of constructed fragmentation trees is improved by using MS(n) instead of MS² measurements.
Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules suc... more Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules such as metabolites. One major bottleneck in metabolomics is the identification of 'unknown' small molecules not in any database. Recently, fragmentation tree alignments have been introduced for the automated comparison of the fragmentation patterns of small molecules. Fragmentation pattern similarities are strongly correlated with the chemical similarity of the molecules, and allow us to cluster compounds based solely on their fragmentation patterns. Aligning fragmentation trees is computationally hard. Nevertheless, we present three exact algorithms for the problem: a dynamic programming (DP) algorithm, a sparse variant of the DP, and an Integer Linear Program (ILP). Evaluation of our methods on three different datasets showed that thousands of alignments can be computed in a matter of minutes using DP, even for 'challenging' instances. Running times of the sparse DP were an order of magnitude better than for the classical DP. The ILP was clearly outperformed by both DP approaches. We also found that for both DP algorithms, computing the 1% slowest alignments required as much time as computing the 99% fastest.
The automated fragmentation analysis of high resolution EI mass spectra based on a fragmentation ... more The automated fragmentation analysis of high resolution EI mass spectra based on a fragmentation tree algorithm is introduced. Fragmentation trees are constructed from EI spectra by automated signal extraction and evaluation. These trees explain relevant fragmentation reactions and assign molecular formulas to fragments. The method enables the identification of the molecular ion and the molecular formula of a metabolite if the molecular ion is present in the spectrum. These identifications are independent of existing library knowledge and, thus, support assignment and structural elucidation of unknown compounds. The method works even if the molecular ion is of very low abundance or hidden under contaminants with higher masses. We apply the algorithm to a selection of 50 derivatized and underivatized metabolites and demonstrate that in 78% of cases the molecular ion can be correctly assigned. The automatically constructed fragmentation trees correspond very well to published mechanisms and allow the assignment of specific relevant fragments and fragmentation pathways even in the most complex EI-spectra in our dataset. This method will be very helpful in the automated analysis of metabolites that are not included in common libraries and it thus has the potential to support the explorative character of metabolomics studies.
Covering: 2008 to 2014 Mass spectrometry is a key technology for the identification and structura... more Covering: 2008 to 2014 Mass spectrometry is a key technology for the identification and structural elucidation of natural products. Manual interpretation of the resulting data is tedious and time-consuming, so methods for automated analysis are highly sought after. In this review, we focus on four recently developed methods for the detection and investigation of small molecules, namely MetFrag/MetFusion, ISIS, FingerID, and FT-BLAST. These methods have the potential to significantly advance the field of computational mass spectrometry for the research of natural products. For example, they may help with the dereplication of compounds at an early stage of the drug discovery process; that is, the detection of molecules that are identical or highly similar to known drugs or drug leads. Furthermore, when a potential drug lead has been determined, these tools may help to identify it and elucidate its structure.
Since metabolites cannot be predicted from the genome sequence, high-throughput de novo identific... more Since metabolites cannot be predicted from the genome sequence, high-throughput de novo identification of small molecules is highly sought. Mass spectrometry (MS) in combination with a fragmentation technique is commonly used for this task. Unfortunately, automated analysis of such data is in its infancy. Recently, fragmentation trees have been proposed as an analysis tool for such data. Additional fragmentation steps (MS(n)) reveal more information about the molecule. We propose to use MS(n) data for the computation of fragmentation trees, and present the Colorful Subtree Closure problem to formalize this task: There, we search for a colorful subtree inside a vertex-colored graph, such that the weight of the transitive closure of the subtree is maximal. We give several negative results regarding the tractability and approximability of this and related problems. We then present an exact dynamic programming algorithm, which is parameterized by the number of colors in the graph and is swift in practice. Evaluation of our method on a dataset of 45 reference compounds showed that the quality of constructed fragmentation trees is improved by using MS(n) instead of MS² measurements.
Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules suc... more Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules such as metabolites. One major bottleneck in metabolomics is the identification of 'unknown' small molecules not in any database. Recently, fragmentation tree alignments have been introduced for the automated comparison of the fragmentation patterns of small molecules. Fragmentation pattern similarities are strongly correlated with the chemical similarity of the molecules, and allow us to cluster compounds based solely on their fragmentation patterns. Aligning fragmentation trees is computationally hard. Nevertheless, we present three exact algorithms for the problem: a dynamic programming (DP) algorithm, a sparse variant of the DP, and an Integer Linear Program (ILP). Evaluation of our methods on three different datasets showed that thousands of alignments can be computed in a matter of minutes using DP, even for 'challenging' instances. Running times of the sparse DP were an order of magnitude better than for the classical DP. The ILP was clearly outperformed by both DP approaches. We also found that for both DP algorithms, computing the 1% slowest alignments required as much time as computing the 99% fastest.
The automated fragmentation analysis of high resolution EI mass spectra based on a fragmentation ... more The automated fragmentation analysis of high resolution EI mass spectra based on a fragmentation tree algorithm is introduced. Fragmentation trees are constructed from EI spectra by automated signal extraction and evaluation. These trees explain relevant fragmentation reactions and assign molecular formulas to fragments. The method enables the identification of the molecular ion and the molecular formula of a metabolite if the molecular ion is present in the spectrum. These identifications are independent of existing library knowledge and, thus, support assignment and structural elucidation of unknown compounds. The method works even if the molecular ion is of very low abundance or hidden under contaminants with higher masses. We apply the algorithm to a selection of 50 derivatized and underivatized metabolites and demonstrate that in 78% of cases the molecular ion can be correctly assigned. The automatically constructed fragmentation trees correspond very well to published mechanisms and allow the assignment of specific relevant fragments and fragmentation pathways even in the most complex EI-spectra in our dataset. This method will be very helpful in the automated analysis of metabolites that are not included in common libraries and it thus has the potential to support the explorative character of metabolomics studies.
Uploads
Papers by Franziska Hufsky