ABSTRACT We consider likelihood-based inference for the parameter of a skew-normal distribution. One of the problems shown by this model is the singularity of the Fisher information matrix when skewness is absent. We derive the rate of convergence to the asymptotic distribution of the maximum likelihood estimator and study an alternative parameterization which overcomes problems related to the singularity of the information matrix.
In this article, we introduce an automatic identification procedure for transfer function models. These models are commonplace in time-series analysis, but their identification can be complex. To tackle this problem, we propose to couple a nonlinear conditional least-squares algorithm with a genetic search over the model space. We illustrate the performance of our proposal by examples on simulated and real
Background / Purpose: The biological processes within a cell are dynamic, and the right experimental design to investigate these trends is the time course. Unfortunately, only a few pathway analysis methods account for time-dependent designs, and none of them exploits the topology of pathways to identify the portion of the pathway most involved in the biological problem studied. Main conclusion: Here we present timeClip, a new method to identify the time-dependent activation of different portions of a biological pathway.
Current demand for understanding the behavior of groups of related genes, combined with the greater availability of data, has led to an increased focus on statistical methods in gene set analysis. In this paper, we aim to perform a critical appraisal of the methodology based on graphical models developed in Massa et al. () that uses pathway signaling networks as a starting point to develop statistically sound procedures for gene set analysis. We pay attention to the potential of the methodology with respect to the organizational aspects of dealing with such complex but highly informative starting structures, that is, pathways. We focus on three themes: the translation of a biological pathway into a graph suitable for modeling, the role of shrinkage when there are more genes than samples, and the evaluation of how well the statistical models correspond to biological expectations. To study the impact of shrinkage, two simulation studies will be run. To evaluate the biological expectations, we will use data from a network with known behavior, which offer the possibility of carrying out a realistic check of how the model responds to changes in the experimental conditions.
The International Journal of Biostatistics, Jan 13, 2015
For a continuous-scale diagnostic test, the receiver operating characteristic (ROC) curve is a popular tool for displaying the ability of the test to discriminate between healthy and diseased subjects. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the test result and other characteristics of the subjects. Estimators of the ROC curve based only on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias, in particular under the assumption that the true disease status, if missing, is missing at random (MAR). The MAR assumption means that the probability of missingness depends on the true disease status only through the test result and observed covariate information. However, the existing methods require parametric models for the (conditional) probability of disease and/or the (conditional) probability of verification, and hence are s...
Recently, a great effort in microarray data analysis has been directed towards the study of the so-called gene sets. A gene set is defined by genes that are, somehow, functionally related. For example, genes appearing in a known biological pathway naturally define a gene set. Gene sets are usually identified from a priori biological knowledge. Nowadays, many bioinformatics resources store this kind of knowledge (see, for example, the Kyoto Encyclopedia of Genes and Genomes, among others). Although pathway maps carry important information about the structure of correlation among genes that should not be neglected, the currently available multivariate methods for gene set analysis do not fully exploit it. We propose a novel gene set analysis specifically designed for gene sets defined by pathways. This analysis, based on graphical models, explicitly incorporates the dependence structure among genes highlighted by the topology of pathways. The analysis is designed to be used for overall...
Studies in Classification, Data Analysis, and Knowledge Organization, 2004
ABSTRACT In this paper, we study the relationship between daily non-accidental deaths and air pollution in the city of Philadelphia in the years 1974-1988. To model the data, we propose the use of dynamic generalized linear models. These models allow one to deal with serial dependence and time-varying effects of the covariates. Inference is performed using the extended Kalman filter and smoother.
Recently, a great effort in microarray data analysis has been directed towards the study of the so-called gene sets. A gene set is defined by genes that are, somehow, functionally related. For example, genes appearing in a known biological pathway naturally define a gene set. Gene sets are usually identified from a priori biological knowledge. Nowadays, many bioinformatics resources store
Proceedings of Computers in Cardiology Conference, 1993
... Catherine Bull, Monica Chiogna, Rodney Franklin and David Spiegelhalter, Hospital for Sick Children, London and MRC. Abstract: Classification trees provide an attractively transparent discrimination technique and may be derived either from expert opinion or from data analysis. ...
Papers by Monica Chiogna