HAL (Le Centre pour la Communication Scientifique Directe), May 29, 2017
HOGMep is a novel Bayesian method for joint restoration and clustering on generic multi-component graph data. First, it uses a finite mixture of Multivariate Exponential Power (MEP) distributions as a prior model for graph signals. The general MEP form can model broad types of signals, including Gaussian, Laplacian or sparser ones. Second, a general Higher-Order Graphical Model (HOGM) on labels, encompassing the widely used Potts model, incorporates spatial relationships between neighboring graph signals. The generality of our model lets it tackle a large variety of data structures. Third, in contrast with the regularized minimization approaches often adopted in the literature, our algorithm reliably estimates regularization parameters from the observations. Such modeling leads to a complex posterior distribution of the unknown parameters. This problem is tackled by Variational Bayesian Approximation (VBA), which provides an accurate approximation of the posterior distribution and allows us to efficiently compute posterior mean estimates. We demonstrate the effectiveness of HOGMep on the joint deconvolution and segmentation of color images interpreted as graph signals. Experiments show that the proposed approach outperforms state-of-the-art methods in both restoration performance and segmentation accuracy.
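As a rough illustration of the prior family, the unnormalized MEP log-density can be sketched in a few lines; the shape parameter `beta` below is an assumed name (the paper may use another symbol), and interpolates between Gaussian and heavier-tailed, sparser behavior:

```python
import numpy as np

def mep_log_density(x, mu, Sigma, beta):
    """Unnormalized log-density of a Multivariate Exponential Power law:
    log p(x) = -0.5 * ((x - mu)^T Sigma^{-1} (x - mu))**beta + const.
    beta = 1 recovers the Gaussian case; beta < 1 yields heavier tails,
    matching Laplacian-like or sparser signals."""
    d = x - mu
    q = d @ np.linalg.solve(Sigma, d)  # Mahalanobis quadratic form
    return -0.5 * q ** beta

# The same distant point is far less penalized by the heavy-tailed member:
mu, Sigma = np.zeros(2), np.eye(2)
x = np.array([3.0, 0.0])
gauss_like = mep_log_density(x, mu, Sigma, 1.0)   # -4.5
heavy_tail = mep_log_density(x, mu, Sigma, 0.5)   # -1.5
```

Mixing several such components (with different means, scale matrices and shapes) is what gives the prior its flexibility across signal classes.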
Property analysis and quality assessment are fundamental needs in the study of complex mixtures, for instance petroleum fractions or biomass products. For the sake of experimental efficiency in process development, finding cheaper alternatives is a rising trend with the development of high-throughput experiments (HTE). These units produce smaller sample volumes that are not compatible with the standard analytical process applied to determine petroleum cut properties (Y) such as density, viscosity, etc., a process that is moreover time-consuming (up to two days for some properties). An alternative for Y prediction is to combine analytical techniques requiring a small sample volume with data mining. Given a subset of representative data (X), one strives to develop a predictive model P such that P(X) ~ Y, with sufficient precision compared to the standardized reference. Principal component regression (PCR) and Projection onto Latent Structures (PLS) are classical prediction tools for this task. Such chemometric models do not res...
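As a minimal sketch of the chemometric tooling mentioned above, principal component regression can be written with plain NumPy (the function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def pcr_fit_predict(X, Y, X_new, k):
    """Principal Component Regression sketch: project the centered data X
    onto its first k principal directions, regress Y on the scores, then
    predict for new samples. Useful when X columns are highly collinear,
    as with spectroscopic measurements."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc = X - x_mean
    # Principal directions come from the SVD of the centered data matrix.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                        # loadings: p x k
    T = Xc @ V                          # scores:   n x k
    B, *_ = np.linalg.lstsq(T, Y - y_mean, rcond=None)
    return y_mean + (X_new - x_mean) @ V @ B
```

With k equal to the full rank, PCR coincides with ordinary least squares; choosing a smaller k trades variance for bias, which is the point on collinear analytical data.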
Adaptive multiple subtraction with wavelet-based complex unary Wiener filters, S. Ventosa, S. Le Roy, Irène Huard, A. Pica, H. Rabeson, P. Ricarte, L. Duval, Geophysics, vol. 77, p. V183-V192, Nov.-Dec. 2012
http://arxiv.org/abs/1108.4674
Multiple attenuation is one of the greatest challenges in seismic processing. Due to the high cross-correlation between primaries and multiples, attenuating the latter without distorting the former is a complicated problem. We propose here a joint multiple model-based adaptive subtraction, using single-sample unary filter estimation in a complex wavelet transformed domain. The method offers more robustness to incoherent noise through redundant decomposition. It is first tested on synthetic data, then applied to real-field data, with a single-model adaptation and a combination of several multiple models.
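The following toy conveys the spirit of single-coefficient complex Wiener adaptation. It is an assumption-laden sketch rather than the published method: a plain FFT stands in for the paper's redundant complex wavelet transform, and the cross- and auto-spectra are averaged over a few bins so the gains stay narrow-band:

```python
import numpy as np

def unary_subtract(data, multiple_model, width=9, eps=1e-8):
    """Toy adaptive subtraction with single-coefficient ("unary") complex
    Wiener gains. Each coefficient of the multiple model is rescaled in
    amplitude and phase to best match the data over a small spectral
    neighborhood, then subtracted; the local averaging keeps the gains
    from annihilating the primaries along with the multiples."""
    D = np.fft.fft(data)
    M = np.fft.fft(multiple_model)
    k = np.ones(width) / width
    num = np.convolve(D * np.conj(M), k, mode="same")  # local cross-spectrum
    den = np.convolve(np.abs(M) ** 2, k, mode="same") + eps
    A = num / den                       # one complex gain per coefficient
    return np.real(np.fft.ifft(D - A * M))
```

In this crude form the gain absorbs an amplitude- or phase-mismatched multiple model while leaving a primary event, located elsewhere in time, largely untouched.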
The growing complexity of Cyber-Physical Systems (CPS), together with the increasing parallelism provided by multi-core chips, fosters the parallelization of simulation. Simulation speed-ups are expected from co-simulation and parallelization based on model splitting into weakly coupled sub-models, for instance in the framework of the Functional Mockup Interface (FMI). However, slackened synchronization between sub-models and their associated solvers running in parallel introduces integration errors, which must be kept within acceptable bounds. CHOPtrey denotes a forecasting framework enhancing the performance of complex system co-simulation, with a trivalent articulation. First, we consider the framework of a Computationally Hasty Online Prediction system (CHOPred). It improves the trade-off between integration speed-ups, needing large communication steps, and simulation precision, needing frequent updates of model inputs. Second, smoothed adaptive forward prediction improves co-simulation accuracy. It is obtained by past-weighted extrapolation based on Causal Hopping Oblivious Polynomials (CHOPoly). Third, signal behavior is segmented to handle the discontinuities of the exchanged signals: the segmentation is performed in a Contextual & Hierarchical Ontology of Patterns (CHOPatt). Implementation strategies and simulation results demonstrate the framework's ability to adaptively relax data communication constraints beyond synchronization points, which sensibly accelerates simulation. The CHOPtrey framework extends the range of applications of standard Lagrange-type methods, often deemed unstable. The embedding of predictions in lag-dependent smoothing and discontinuity handling demonstrates its practical efficiency.
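The past-weighted extrapolation ingredient can be sketched as a weighted least-squares polynomial fit over past samples, evaluated at the next communication instant; the exponential decay weighting and the parameter names below are illustrative assumptions, not the paper's exact CHOPoly construction:

```python
import numpy as np

def forecast(t_past, y_past, t_next, degree=2, decay=0.8):
    """Causal polynomial extrapolation sketch: fit a low-degree polynomial
    to past samples with exponentially decaying weights, so recent samples
    dominate, then evaluate at the next (future) communication instant."""
    # Newest sample gets weight 1, older ones decay geometrically.
    w = decay ** np.arange(len(t_past) - 1, -1, -1)
    coeffs = np.polyfit(t_past, y_past, degree, w=w)
    return np.polyval(coeffs, t_next)
```

Between two synchronization points, each sub-model can consume such forecasts of its inputs instead of waiting for fresh values, which is how larger communication steps become affordable.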
Reconstruction and clustering with graph optimization and priors on gene networks and images
The discovery of novel gene regulatory processes improves the understanding of cell phenotypic responses to external stimuli for many biological applications, such as medicine, the environment or biotechnologies. To this purpose, transcriptomic data are generated and analyzed from DNA microarrays or, more recently, RNA-seq experiments. They consist of gene expression levels obtained for all genes of a studied organism placed in different living conditions. From these data, gene regulation mechanisms can be recovered by revealing topological links encoded in graphs. In regulatory graphs, nodes correspond to genes, and a link between two nodes is drawn when a regulation relationship exists between the two corresponding genes. Such networks are called Gene Regulatory Networks (GRNs). Their construction as well as their analysis remain challenging despite the large number of available inference methods.
In this thesis, we propose to address this network inference problem with recently developed techniques pertaining to graph optimization. Given all the available pairwise gene regulation information, we determine the presence of edges in the final GRN by adopting an energy minimization formulation integrating additional constraints. Either biological priors (information about gene interactions) or structural priors (information about node connectivity) are considered to restrict the space of possible solutions. Different priors lead to different properties of the global cost function, for which various optimization strategies, either discrete or continuous, can be applied. The post-processing network refinements we designed led to computational approaches named BRANE, for "Biologically-Related A priori for Network Enhancement". For each of the proposed methods (BRANE Cut, BRANE Relax and BRANE Clust) our contributions are threefold: an a priori-based formulation, the design of the optimization strategy, and validation (numerical and/or biological) on benchmark datasets from the DREAM4 and DREAM5 challenges, showing numerical improvements reaching 20%.
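As a toy illustration of prior-constrained, energy-based edge selection (emphatically not the actual BRANE Cut algorithm), one can keep an edge whenever doing so lowers a separable energy combining its pairwise score with a structural bonus for edges touching transcription-factor nodes; every name and value here is illustrative:

```python
import numpy as np

def select_edges(W, lam, tf_nodes, mu=0.1):
    """Toy energy-based GRN edge selection. Edge (i, j) is kept when keeping
    it decreases the separable energy  lam - W[i, j] - mu * tf_bonus(i, j),
    i.e. when its score plus the transcription-factor bonus exceeds the
    threshold lam. Real formulations couple edges and need combinatorial
    or continuous solvers; this separable case decouples per edge."""
    n = W.shape[0]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            bonus = mu if (i in tf_nodes or j in tf_nodes) else 0.0
            if W[i, j] + bonus > lam:   # keeping the edge lowers the energy
                edges.append((i, j))
    return edges
```

The point of the example: a biological prior (here, favoring TF-incident edges) lets a borderline edge survive a threshold that would otherwise discard it.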
In a ramification of this thesis, we move from graph inference to more generic data processing tasks such as inverse problems. We notably investigate HOGMep, a Bayesian approach using a Variational Bayesian Approximation framework for its resolution. This approach jointly performs reconstruction and clustering/segmentation tasks on multi-component data (for instance signals or images). Its performance in a color image deconvolution context demonstrates the quality of both reconstruction and segmentation. A preliminary study in a medical data classification context linking genotype and phenotype yields promising results for forthcoming bioinformatics adaptations.
Papers by Laurent Duval
We present BEADS, an optimization method for jointly extracting the measurement of interest, denoised and corrected for its baseline. In the absence of natural parametric models for the baseline, we adopt a light global model relying on the "positive" aspect of the peaks and on the sparsity of their derivatives.
The baseline filtering and denoising problem is then reformulated as the minimization of a function comprising a data-fidelity term and penalties promoting positivity and sparsity. Performance is illustrated notably on 1D and 2D chromatograms, along with other recent uses (mass spectrometry, Raman, XAS/XRD). Perspectives on deconvolution are finally discussed.
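To show the penalized-minimization idea in code, here is an asymmetric least-squares baseline estimator in the style of Eilers, a simpler classical relative of BEADS (this is not the BEADS algorithm; the penalty, weights and parameter values are illustrative):

```python
import numpy as np

def als_baseline(y, lam=1e4, p=0.01, n_iter=10):
    """Asymmetric least-squares baseline: minimize
        sum_i w_i * (y_i - b_i)**2  +  lam * ||D2 b||**2,
    where D2 is the second-difference operator (smoothness penalty) and
    samples lying above the current baseline, i.e. the positive peaks,
    receive the small weight p, so the smooth baseline slides under them."""
    n = len(y)
    D = np.diff(np.eye(n), 2, axis=0)        # second-difference operator
    P = lam * D.T @ D
    w = np.ones(n)
    for _ in range(n_iter):
        b = np.linalg.solve(np.diag(w) + P, w * y)
        w = np.where(y > b, p, 1 - p)        # asymmetric reweighting
    return b
```

The corrected signal `y - als_baseline(y)` then carries the positive peaks on a near-zero baseline; BEADS itself couples this with sparsity penalties on the peak signal and its derivatives within one objective.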
Each unary filter compensates for amplitude deviations and for both small and larger shifts, via phase and integer-delay corrections, within a very narrow frequency band. This approach greatly simplifies the estimation of the matched filter and provides, even in 1D, results comparable to classical 2D approaches, at a very competitive computational cost.
Additional information:
http://www.sciencedirect.com/science/article/pii/S0165168411001356
The richness of natural images makes the quest for optimal representations in image processing and computer vision challenging. This observation has not prevented the design of image representations, which trade off between efficiency and complexity, while achieving accurate rendering of smooth regions as well as reproducing faithful contours and textures. The most recent ones, proposed in the past decade, share a hybrid heritage highlighting the multiscale and oriented nature of edges and patterns in images. This paper presents a panorama of the aforementioned literature on decompositions in multiscale, multi-orientation bases or dictionaries. They typically exhibit redundancy to improve sparsity in the transformed domain and sometimes its invariance with respect to simple geometric deformations (translation, rotation). Oriented multiscale dictionaries extend traditional wavelet processing and may offer rotation invariance.
Highly redundant dictionaries require specific algorithms to simplify the search for an efficient (sparse) representation. We also discuss the extension of multiscale geometric decompositions to non-Euclidean domains such as the sphere or arbitrary meshed surfaces. The etymology of panorama suggests an overview, based on a choice of partially overlapping "pictures". We hope that this paper will contribute to the appreciation and apprehension of a stream of current research directions in image understanding.