Visual characteristics are among the most important features for characterizing the phenotype of biological organisms. Color and geometric properties define population phenotype and allow assessing diversity and adaptation to environmental conditions. To analyze geometric properties, classical morphometrics relies on biologically relevant landmarks which are manually assigned to digital images. Assigning landmarks is tedious and error prone. Predefined landmarks may in addition miss information which is not obvious to the human eye. The machine learning (ML) community has recently proposed new data analysis methods which, by uncovering subtle features in images, obtain excellent predictive accuracy. Scientific credibility demands, however, that results are interpretable, and hence that the black-box nature of ML methods is mitigated. To overcome the black-box nature of ML we apply complementary methods and investigate internal representations with saliency maps to reliably identify locat...
Recent progress in machine learning and deep learning has enabled plant and crop detection through systematic inspection of leaf shapes and other morphological characters, for use in identification systems for precision farming. However, the models used for this approach tend to become black-box models, in the sense that it is difficult to trace the characters on which the classification is based. The interpretability is therefore limited and the explanatory factors may not be based on reasonable visible characters. We investigate the explanatory factors of recent machine learning and deep learning models for plant classification tasks. Based on a Daucus carota and a Beta vulgaris image data set, we implement plant classification models and compare those models by their predictive performance as well as explainability. For comparison we implemented a feed forward convolutional neural network as a default model. To evaluate the performance, we trained an unsupervise...
Drug resistance poses a major challenge for targeted cancer therapy. To be able to functionally screen large randomly mutated target gene libraries for drug resistance mutations, we developed a biochemically defined high-throughput assay termed PhosphoFlowSeq. Instead of selecting for proliferation or resistance to apoptosis, PhosphoFlowSeq directly analyzes the enzymatic activities of randomly mutated kinases, thereby reducing the dependency on the signaling network in the host cell. Moreover, simultaneous analysis of expression levels enables compensation for expression-based biases on a single cell level. Using EGFR and its kinase inhibitor erlotinib as a model system, we demonstrate that the clinically most relevant resistance mutation T790M is reproducibly detected at high frequencies after four independent PhosphoFlowSeq selection experiments. Moreover, upon decreasing the selection pressure, mutations which confer only weak resistance were also identified, including T854A and L792H. We expect that PhosphoFlowSeq will be a valuable tool for the prediction and functional screening of drug resistance mutations in kinases.
Simplicity renders shake flasks ideal for strain selection and substrate optimization in biotechnology. Uncertainty during initial experiments may, however, cause adverse growth conditions and lead to misleading conclusions. Using growth models for online predictions of future biomass (BM) and of the arrival of critical events, such as low dissolved oxygen (DO) levels or the time to harvest, is hence important to optimize protocols. Established knowledge that unfavorable metabolites of growing microorganisms interfere with the substrate suggests that growth dynamics and, as a consequence, the growth model parameters may vary in the course of an experiment. Predictive monitoring of shake flask cultures will therefore benefit from estimating growth model parameters in an online and adaptive manner. This paper evaluates a newly developed particle filter (PF) which is specifically tailored to the requirements of biotechnological shake flask experiments. By combining stationary accuracy with fast adaptation to...
Protein complexes, macro-molecular assemblies of two or more proteins, play vital roles in numerous cellular activities and collectively determine the cellular state. Despite the availability of a range of methods for analysing protein complexes, systematic analysis of complexes under multiple conditions has remained challenging. Approaches based on biochemical fractionation of intact, native complexes and correlation of protein profiles have shown promise, for instance in the combination of size exclusion chromatography (SEC) with accurate protein quantification by SWATH/DIA-MS. However, most approaches for interpreting co-fractionation datasets to yield complex composition, abundance and rearrangements between samples depend heavily on prior evidence. We introduce PCprophet, a computational framework to identify novel protein complexes from SEC-SWATH-MS data and to characterize their changes across different experimental conditions. We demonstrate accurate prediction of protein co...
The SIESTA project aims at defining a new description of human sleep. The basic idea of the algorithm described below is that extreme states of the sleeping subject are classified with higher reliability. Hence we use labels of the extreme states wake, deep sleep, and rapid eye movement sleep, together with all other data without labels. In order to benefit from unlabeled data, the classifier has to model class conditional densities and class priors. This allows us to use Bayes' theorem and predict the a-posteriori probability of each class. ...
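The a-posteriori class probability mentioned in this abstract follows from Bayes' theorem. As a sketch in generic notation (the symbols $x$ for a feature vector, $c$ for a class and $K$ for the number of classes are assumptions for illustration, not taken from the paper):

```latex
P(c \mid x) = \frac{p(x \mid c)\, P(c)}{\sum_{k=1}^{K} p(x \mid k)\, P(k)}
```

The denominator is the marginal density $p(x)$, which can be evaluated for unlabeled samples as well. This is one way to see why modeling class conditional densities and priors lets unlabeled data contribute.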
The SIESTA project aims at defining a new description of human sleep. The sleep analyzer is inferred by semi-supervised techniques. Using good features is very important as we cannot improve on them in subsequent processing stages. Hence we looked at various preprocessing techniques and applied feature subset selection techniques to select the most promising ones. This paper presents results obtained with two entirely different techniques: Classical feature subset selection based on feature evaluation criteria and ...
This thesis investigates whether Bayesian inference can improve the reliability of biomedical diagnosis. In particular we discuss time series classification, as is needed, for example, for an analysis of all-night sleep EEG recordings. Such an attempt needs 4 steps that are further analyzed.
We report about an automatic continuous sleep stager which is based on probabilistic principles employing Hidden Markov Models (HMM). Our sleep stager offers the advantage of being objective by not relying on human scorers, having much finer temporal resolution (one second instead of 30 seconds), and being based on solid probabilistic principles rather than a predefined set of rules (Rechtschaffen & Kales). Results obtained for nine
In this paper we report about an investigation of Bayesian inference applied to neural networks, multilayer perceptrons (MLP) in particular, in the task of automatic sleep staging based on electroencephalogram (EEG) and electrooculogram (EOG) signals. The main focus was on evaluating the use of so-called "doubt-levels" and "confidence intervals" ("error bars") in improving the results by rejecting uncertain cases and patterns not well represented by the training set. Bayesian inference is used to arrive at distributions of network weights based on training data. We compare the results of the full-blown Bayesian method with results obtained from a k-nearest-neighbor classifier. The results show that the Bayesian technique significantly outperforms the k-nearest-neighbor classifier. At the same time, we show that Bayesian inference, for which we have developed an extension for the calculation of error bars in the latent space of hidden units, can indeed be used for improving results by rejecting cases below a doubt-level threshold of probability, as well as for the rejection of artifacts. The performance of the Bayesian solution, however, is not significantly better than alternative techniques such as doubt levels applied to a maximum posterior approach, or the use of density estimation for outlier rejection. We conclude that Bayesian inference is a valid and valuable technique for model estimation but in the given application does not lead to improved results over simpler techniques.
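The rejection of cases below a doubt-level threshold described in this abstract can be illustrated with a minimal sketch. It assumes that class posteriors are already available as lists of probabilities. The function name `classify_with_doubt`, the threshold value 0.8 and the example numbers are hypothetical and not taken from the paper.

```python
def classify_with_doubt(posteriors, doubt_level=0.8):
    """Return the most probable class index for each case, or None
    where the largest posterior is below the doubt level (rejected)."""
    labels = []
    for p in posteriors:
        # Index of the class with the largest posterior probability
        best = max(range(len(p)), key=lambda k: p[k])
        # Accept the decision only if it is confident enough
        labels.append(best if p[best] >= doubt_level else None)
    return labels


# Illustrative posteriors for three cases over three hypothetical classes
example = [
    [0.90, 0.05, 0.05],  # confident: accepted
    [0.40, 0.35, 0.25],  # uncertain: rejected
    [0.05, 0.10, 0.85],  # confident: accepted
]
labels = classify_with_doubt(example, doubt_level=0.8)
```

Raising the doubt level rejects more cases and typically lowers the error rate on the accepted ones. Where to set it is an application-specific trade-off.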
In the paper we stated [1]: “... a full approach without “explicit”-node merging is only possible for “lag-one” models. To be precise, a smoother (belief propagation both temporally causal and anti-causal) cannot be derived without node clustering, while filtering (only temporally causal belief propagation) can be derived for higher lags also.”
The SIESTA project aims at defining a new description of human sleep. As such, the SIESTA sleep analyzer is a diagnostic tool applied to biosignals of humans. Although the risk that wrong decisions harm people is low, it is still there. Hence introducing a reliability measure that flags decisions that are probably wrong is a major aim of the project. This paper introduces a method for preprocessing within the Bayesian framework. We show that Bayesian beliefs can be used to flag segments where reliable decisions about ...
In this paper we report about an investigation in which we studied the properties of Bayes-inferred neural network classifiers in the context of outlier detection. Misclassification due to outliers in the test data is seen as a serious problem in safety-critical environments. We compare the usual way to deal with uncertainty in the Bayesian
This paper suggests a probabilistic treatment of the signal processing part of a brain-computer interface (BCI). We suggest two improvements for BCIs that cannot be obtained easily with other data-driven approaches. Simply by using one large joint distribution as a model of the entire signal processing part of the BCI, we can obtain predictions that implicitly weight information according
In this paper we report about an extensive investigation on neural networks -- multilayer perceptrons (MLP), in particular -- in the task of automatic sleep staging based on electroencephalogram (EEG) and electrooculogram (EOG) signals. After the important first step of preprocessing and feature selection (for which a search-based selection technique could reduce the large number of features to a feature vector of size ten), the main focus
High-throughput RNA sequencing (RNA-seq) enables comprehensive scans of entire transcriptomes, but best practices for analyzing RNA-seq data have not been fully defined, particularly for data collected with multiple sequencing platforms or at multiple sites. Here we used standardized RNA samples with built-in controls to examine sources of error in large-scale RNA-seq studies and their impact on the detection of differentially expressed genes (DEGs). Analysis of variations in guanine-cytosine content, gene coverage, sequencing error rate and insert size allowed identification of decreased reproducibility across sites. Moreover, commonly used methods for normalization (cqn, EDASeq, RUV2, sva, PEER) varied in their ability to remove these systematic biases, depending on sample complexity and initial data quality. Normalization methods that combine data from genes across sites are strongly recommended to identify and remove site-specific effects and can substantially improve RNA-seq studies.
This tutorial provides a brief introduction to using FSPMA for microarray data analysis. The code of the library is in several places closely linked with code from YASMA and is thus provided under the GPL 2 license. The most important implication of the GPL 2 license is that FSPMA is free software and comes with NO WARRANTY.
Faba bean (Vicia faba L.) is an important source of protein but breeding for increased yield stability and stress tolerance is hampered by the scarcity of phenotyping information. Because comparisons of cultivars adapted to different agro-climatic zones improve our understanding of stress tolerance mechanisms, the root architecture and morphology of 16 pan-European faba bean cultivars were studied at maturity. Different machine learning (ML) approaches were tested for their usefulness in analysing trait variations between cultivars. A supervised, i.e. hypothesis-driven, ML approach revealed that cultivars from Portugal feature greater and coarser but less frequent lateral roots at the top of the taproot, potentially enhancing water uptake from deeper soil horizons. Unsupervised clustering revealed that trait differences between Northern and Southern cultivars are not predominant but that two cultivar groups, independently of major and minor types, differ largely in overall root system size. Methodological guidelines on how to use powerful machine learning methods such as random forest models for enhancing the phenotypical exploration of plants are given.