We investigate the use of the Riemannian optimization method over the flag manifold in subspace ICA problems such as independent subspace analysis (ISA) and complex ICA. In the ISA experiment, we use the Riemannian approach over the flag manifold together ...
This document presents the initial version 1.0, still under development, of an open system written in TOL for Bayesian Markov Chain Monte Carlo (MCMC) simulation and inference via the Gibbs algorithm, on sparse linear regression models with arbitrary structure (hierarchical models, Bayesian networks, ...), with linear inequality constraints, with handling of missing values and non-linear filters, both on the input and the output side, as well as with ARIMA structures.
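Since the abstract centers on Gibbs sampling for Bayesian linear regression, a minimal sketch may help fix ideas. The following Python snippet (not TOL, and not the system described above) implements a conjugate Gibbs sampler for a plain normal linear regression; the priors and hyperparameters are illustrative assumptions.

```python
import numpy as np

def gibbs_linear_regression(X, y, n_iter=2000, tau2=100.0, a0=2.0, b0=1.0, seed=0):
    """Minimal Gibbs sampler for y = X beta + eps, eps ~ N(0, sigma2 I).

    Priors (illustrative, not those of the TOL system):
      beta   ~ N(0, tau2 * I)
      sigma2 ~ InvGamma(a0, b0)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, sigma2 = np.zeros(p), 1.0
    XtX, Xty = X.T @ X, X.T @ y
    draws = []
    for _ in range(n_iter):
        # beta | sigma2, y : multivariate normal full conditional
        prec = XtX / sigma2 + np.eye(p) / tau2
        cov = np.linalg.inv(prec)
        mean = cov @ (Xty / sigma2)
        beta = rng.multivariate_normal(mean, cov)
        # sigma2 | beta, y : inverse-gamma full conditional
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + n / 2, 1.0 / (b0 + resid @ resid / 2))
        draws.append((beta.copy(), sigma2))
    return draws

# Usage: simulate data and recover the coefficients
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=200)
draws = gibbs_linear_regression(X, y)
beta_mean = np.mean([b for b, _ in draws[500:]], axis=0)  # discard burn-in
print(beta_mean)
```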
The stochastic or random nature of commodity prices plays a central role in models for valuing financial contingent claims on commodities. In this paper, by enhancing a multifactor framework which is consistent not only with the market-observable forward price curve but also with the volatilities and correlations of forward prices, we propose a two-factor stochastic volatility model for the evolution of the gas forward curve. The volatility is stochastic due to a hidden Markov chain that causes it to switch between "on peak" and "off peak" states. Based on structured functional forms for the volatility, we propose and implement a Markov Chain Monte Carlo (MCMC) method to estimate the parameters of the forward curve model. Applications to simulated data indicate that the proposed algorithm is able to accommodate more general features, such as regime switching and seasonality. Applications to market gas forward data show that the MCMC approach provides stable ...
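To make the regime-switching mechanism concrete, here is a minimal Python simulation of returns whose volatility is driven by a two-state hidden Markov chain switching between "on peak" and "off peak" states. The transition probabilities and volatility levels are illustrative assumptions, not the paper's estimates.

```python
import numpy as np

def simulate_regime_switching_vol(T=500, p_stay=(0.95, 0.90),
                                  sigma=(0.02, 0.08), seed=0):
    """Simulate returns whose volatility follows a 2-state hidden Markov chain.

    State 0 = "off peak" (low vol), state 1 = "on peak" (high vol).
    p_stay[k] is the probability of remaining in state k; all values
    are illustrative, not estimates from the paper.
    """
    rng = np.random.default_rng(seed)
    states = np.empty(T, dtype=int)
    returns = np.empty(T)
    s = 0
    for t in range(T):
        # Markov transition: stay with prob p_stay[s], else switch
        if rng.random() > p_stay[s]:
            s = 1 - s
        states[t] = s
        returns[t] = rng.normal(scale=sigma[s])
    return states, returns

states, returns = simulate_regime_switching_vol()
print("fraction of time in on-peak state:", states.mean())
```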
A novel semiparametric regression model for censored data is proposed as an alternative to the widely used proportional hazards survival model. The proposed regression model for censored data turns out to be flexible and practically meaningful. Features include physical interpretation of the regression coefficients through the mean response time instead of the hazard functions, and a rigorous proof of consistency of the posterior distribution. It is shown that the regression model, obtained as a mixture of parametric families, has a proportional mean structure (as in accelerated failure time models). The statistical inference is based on a nonparametric Bayesian approach that uses a Dirichlet process prior for the mixing distribution. Consistency of the posterior distribution of the regression parameters in the Euclidean metric is established. Finite-sample parameter estimates, along with associated measures of uncertainty, can be computed by an MCMC method. Simulation studies are p...
Multilevel structural equation modeling (MSEM) is gaining popularity in the social sciences as a framework for estimating latent variable models in the presence of hierarchical data, and we believe that MSEMs are quite helpful to psychological and educational inquiries. Many research papers have been published on technical developments in MSEM; however, we are not aware of any tutorials on how to properly implement these complex models. Moreover, applied researchers may be unfamiliar with how to implement a Bayesian estimation approach to MSEM, despite the fact that it has distinct advantages in this modeling context (Depaoli and Clifton, 2015; Hox, van de Schoot, and Matthijsse, 2012). To that end, this paper serves as a nontechnical tutorial on the implementation and application of MSEM using both frequentist and Bayesian estimation methods. We demonstrate the implementation of MSEM with three motivating examples using data from the Program for International Student Assessment (PISA). We also present a small Monte Carlo study showing how the estimation of MSEM is affected by different types of priors for data structures similar to those in the PISA examples. We conclude with recommendations for applied researchers based on findings from the empirical examples and the simulation study, and we outline future methodological considerations in the context of Bayesian MSEM.
This paper introduces the Global Multi-country (GM) model, an estimated multi-country Dynamic Stochastic General Equilibrium (DSGE) model of the world economy. We present the model in 3-region configurations for Euro area (EA) countries that include an individual EA Member State, the rest of the EA (REA), and the rest of the world (RoW). We provide and compare estimates of this model structure for the four largest EA countries (Germany, France, Italy, and Spain). The novelty of the paper is the estimation of ex-ante identical country models on the basis of a unified information set, which allows for clean cross-country comparison of parameter estimates and drivers of economic dynamics. The paper also provides an overview of applications of the GM model, such as the structural interpretation of business-cycle dynamics, contributions to the European Commission's economic forecasts, scenario analysis, and policy counterfactuals.
This work presents the current state of the art in techniques for tracking a number of objects moving in a coordinated and interacting fashion. Groups are structured objects characterized by particular motion patterns. A group can comprise a small number of interacting objects (e.g. pedestrians, sport players, convoys of cars) or hundreds or thousands of components, such as crowds of people. Group object tracking is closely linked with extended object tracking, but at the same time has particular features which differentiate it from extended objects. Extended objects, such as those in maritime surveillance, are characterized by their kinematic states and their size or volume. Both group and extended objects give rise to a varying number of measurements and require trajectory maintenance. Emphasis is given here to sequential Monte Carlo (SMC) methods and their variants. Methods for small groups and for large groups are presented, including Markov Chain Monte Carlo (MCMC) methods, the random matrices approach, and Random Finite Set Statistics methods. Efficient real-time implementations are discussed which are able to deal with the high dimensionality and provide high accuracy. Future trends and avenues are traced.
This paper outlines a new methodological framework for combining indicators of corruption. The state-space framework extends the methodology of the Worldwide Governance Indicators (WGI) to make full use of the time structure present in corruption data. It is estimated using a Bayesian Gibbs sampling algorithm.
The state-space framework offers many advantages from practical, estimation, and theoretical points of view. Most importantly, it significantly expands the period for which the index can be computed while at the same time addressing the selection-bias issues that trouble the Corruption Perceptions Index (CPI). In addition, its estimates are more stable and have smaller confidence intervals than both the CPI and the WGI. Because the estimation is transparent and the data are entered without manipulation, the procedure is more objective.
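As a rough illustration of the state-space idea, the sketch below runs a single-site Gibbs sampler for a local-level model in Python: a latent random-walk "corruption level" observed through a noisy indicator. It is a simplification assuming known variances, and is not the WGI-extension estimator itself.

```python
import numpy as np

def gibbs_local_level(y, q=0.05, r=0.2, n_iter=1000, seed=0):
    """Single-site Gibbs sampler for a local-level state-space model.

    Latent level:  x[t] = x[t-1] + N(0, q)   (random-walk "corruption level")
    Observation:   y[t] = x[t]   + N(0, r)   (noisy indicator)
    q and r are treated as known here for simplicity; the values are
    illustrative, not those of the paper.
    """
    rng = np.random.default_rng(seed)
    T = len(y)
    x = y.copy()                      # initialize states at the observations
    draws = np.empty((n_iter, T))
    for it in range(n_iter):
        for t in range(T):
            prec, mean_num = 1.0 / r, y[t] / r
            if t > 0:                 # link to the previous state
                prec += 1.0 / q
                mean_num += x[t - 1] / q
            if t < T - 1:             # link to the next state
                prec += 1.0 / q
                mean_num += x[t + 1] / q
            x[t] = rng.normal(mean_num / prec, np.sqrt(1.0 / prec))
        draws[it] = x
    return draws

# Usage: smooth a noisy indicator series
rng = np.random.default_rng(1)
true = np.cumsum(rng.normal(scale=0.2, size=50))
y = true + rng.normal(scale=0.5, size=50)
draws = gibbs_local_level(y, q=0.04, r=0.25)
print(draws[500:].mean(axis=0)[:5])   # posterior mean of the first few states
```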
In this thesis, we address several problems related to modelling complex systems. The difficulty of modelling complex systems lies partly in their topology and in how they form rather complex networks. From this perspective, our interest in networks (graphs) is part of a broader current of research on complex systems. Graphical models provide powerful tools to model and draw statistical inferences about complex relationships among variables. In this context, Gaussian graphical models are commonly used, since inference in such models is often tractable. In Chapter 2, we introduce a novel Bayesian framework for Gaussian graphical model determination. We carry out the posterior inference using an efficient sampling scheme, a trans-dimensional MCMC approach based on a birth-death process. In particular, we construct an efficient search algorithm which explores the graph space to detect the underlying graph with high accuracy. We cover the theory and computational details of the proposed method. We then apply the method to large-scale real applications from mammary gene expression studies to show its empirical usefulness.
Previous work (Bell and Jones 2013a, c; Luo and Hodges 2013) has shown that, when there are trends in either the period or cohort residuals of Yang and Land's Hierarchical Age-Period-Cohort (APC) model (Yang and Land 2006; Yang and Land 2013), the model can incorrectly estimate those trends because of the well-known APC identification problem. Here we consider modelling possibilities when the age effect is known, allowing any period or cohort trends to be estimated. In particular, we suggest applying informative priors, in a Bayesian framework, to the age trend, and we use a variety of simulated but realistic datasets to explicate this. Similarly, an informative prior could be applied to an estimated period or cohort trend, allowing the other two APC trends to be estimated. We show that a very strong informative prior is required for this purpose. As such, models of this kind can be fitted but are only useful when very strong evidence of the age trend exists (for example, physiological evidence regarding health). Alternatively, a variety of strong priors can be tested and the most plausible solution argued for on the basis of theory.
In this paper we propose a sequential Monte Carlo algorithm to estimate a stochastic volatility model with leverage effect, non-constant conditional mean, and jumps. Our idea relies on the auxiliary particle filter algorithm together with the Markov Chain Monte Carlo (MCMC) methodology. Our method allows us to sequentially evaluate the parameters and the latent processes involved in the dynamics of interest. An empirical application on simulated data and on the Standard & Poor's 500 index is presented to study the performance of the implemented algorithm.
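For intuition, here is a stripped-down bootstrap particle filter for a basic stochastic volatility model in Python. It omits the auxiliary-particle-filter look-ahead and the MCMC parameter moves that the paper combines, and treats the parameters as known illustrative values.

```python
import numpy as np

def bootstrap_filter_sv(y, mu=-1.0, phi=0.95, sig_eta=0.2, n_part=1000, seed=0):
    """Bootstrap particle filter for a basic stochastic volatility model.

    State:       h[t] = mu + phi*(h[t-1] - mu) + N(0, sig_eta^2)
    Observation: y[t] = exp(h[t]/2) * N(0, 1)
    Parameters are fixed and illustrative; the paper's auxiliary particle
    filter with MCMC parameter moves is more elaborate than this sketch.
    """
    rng = np.random.default_rng(seed)
    h = rng.normal(mu, sig_eta / np.sqrt(1 - phi**2), size=n_part)  # stationary init
    h_mean = np.empty(len(y))
    for t, yt in enumerate(y):
        # Propagate particles through the state equation
        h = mu + phi * (h - mu) + rng.normal(scale=sig_eta, size=n_part)
        # Weight by the observation density N(y_t; 0, exp(h))
        logw = -0.5 * (h + yt**2 * np.exp(-h))
        w = np.exp(logw - logw.max())
        w /= w.sum()
        h_mean[t] = w @ h
        # Multinomial resampling
        h = h[rng.choice(n_part, size=n_part, p=w)]
    return h_mean

# Usage: filter simulated returns
rng = np.random.default_rng(1)
T, mu, phi, sig = 300, -1.0, 0.95, 0.2
h = np.empty(T); h[0] = mu
for t in range(1, T):
    h[t] = mu + phi * (h[t-1] - mu) + rng.normal(scale=sig)
y = np.exp(h / 2) * rng.normal(size=T)
print(bootstrap_filter_sv(y)[:5])  # filtered log-volatility estimates
```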
Quantitative individual human diet reconstruction using isotopic data and a Bayesian approach typically requires the inclusion of several model parameters, such as individual isotopic data, isotopic and macronutrient composition of food groups, diet-to-tissue isotopic offsets, and dietary routing. In an archaeological context, sparse data may hamper a widespread application of such models. However, simpler models may be proposed to address specific archaeological questions. As a consequence of the intake of marine foods, individuals from the first century AD Roman site of Herculaneum showed well-defined bone collagen radiocarbon age offsets from the expected terrestrial value. Taking these radiocarbon offsets as reference and using stable isotope data (δ13C and δ15N) as model input, the performance of two Bayesian mixing model instances (a routed, concentration-dependent model versus a non-routed, concentration-independent one) was compared in predicting the carbon contribution of marine foods to bone collagen. Predictions generated by both models were in good agreement with observed values. The model with higher complexity showed only a slightly better performance in terms of accuracy and precision. This demonstrates that under similar circumstances, a simple Bayesian approach can be applied to quantify the carbon contribution of marine foods to human bone collagen.
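A minimal sketch of the simpler (non-routed, concentration-independent) variant can be written in a few lines of Python: a Metropolis sampler for the marine carbon fraction of a single individual, with illustrative endmember values standing in for the paper's calibrated inputs.

```python
import numpy as np

def mh_marine_fraction(d13c_obs, src_marine=-12.0, src_terr=-21.0,
                       sigma=0.5, n_iter=20000, step=0.05, seed=0):
    """Metropolis sampler for a two-source, non-routed mixing model.

    Model: d13c_obs ~ N(f*src_marine + (1-f)*src_terr, sigma^2),
    with f = marine carbon fraction, prior f ~ Uniform(0, 1).
    Endmember values and sigma are illustrative placeholders, not the
    paper's calibrated inputs.
    """
    rng = np.random.default_rng(seed)

    def loglik(f):
        mu = f * src_marine + (1 - f) * src_terr
        return -0.5 * np.sum((d13c_obs - mu) ** 2) / sigma**2

    f = 0.5
    ll = loglik(f)
    draws = []
    for _ in range(n_iter):
        f_new = f + rng.normal(scale=step)      # random-walk proposal
        if 0.0 <= f_new <= 1.0:                 # uniform prior support
            ll_new = loglik(f_new)
            if np.log(rng.random()) < ll_new - ll:
                f, ll = f_new, ll_new
        draws.append(f)
    return np.array(draws)

# Usage: one individual with three collagen measurements
obs = np.array([-16.8, -17.1, -16.5])
draws = mh_marine_fraction(obs)
print("posterior marine fraction:", draws[5000:].mean())
```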
A Bayesian hierarchical model is presented to estimate route choice preferences between OD pairs. The methodology utilizes both Origin-Destination (OD) information and traffic counts observed on some of the links in the network to estimate route choice probabilities. Route choice preferences are represented by multinomial distributions and estimated via a Markov Chain Monte Carlo (MCMC) algorithm. The proposed model takes into account measurement errors in the link counts, the uncertainties present in OD data, and alternative route choices both inside and outside the network of study. The proposed method is validated on both a synthetic example and the traffic network of Malta.
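A toy version of the idea, under strong simplifying assumptions (a single OD pair with known demand, Poisson link-count noise instead of the paper's full measurement-error model, and a flat prior on unconstrained logits), might look like this in Python:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mh_route_choice(A, counts, demand, n_iter=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for route choice probabilities.

    A[l, r] = 1 if route r uses link l. Expected link counts are
    demand * A @ p with p = softmax(z), and observed counts are modelled
    as Poisson (a stand-in for the paper's measurement-error model).
    A flat prior is placed on the unconstrained logits z (illustrative).
    """
    rng = np.random.default_rng(seed)
    n_routes = A.shape[1]

    def loglik(z):
        lam = demand * (A @ softmax(z))
        return np.sum(counts * np.log(lam) - lam)   # Poisson log-likelihood

    z = np.zeros(n_routes)
    ll = loglik(z)
    draws = []
    for _ in range(n_iter):
        z_new = z + rng.normal(scale=step, size=n_routes)
        ll_new = loglik(z_new)
        if np.log(rng.random()) < ll_new - ll:
            z, ll = z_new, ll_new
        draws.append(softmax(z))
    return np.array(draws)

# Usage: 3 routes over 4 links, one OD pair with known demand
A = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])
counts = np.array([60, 95, 70, 45])      # observed link counts
draws = mh_route_choice(A, counts, demand=100)
print("posterior route probabilities:", draws[5000:].mean(axis=0))
```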
This book is about eventology, a science of events and its applications to problems of managing sets of events, a new direction in philosophy, mathematics, and event management (see the English version of the book: ...
In some clinical trials and epidemiologic studies, investigators are interested in knowing whether the variability of a biomarker is independently predictive of clinical outcomes. This question is often addressed via a naïve approach in which a sample-based estimate (e.g., the standard deviation) is calculated as a surrogate for the "true" variability and then used in regression models as a covariate assumed to be free of measurement error. However, it is well known that measurement error in covariates causes underestimation of the true association. The underestimation can be substantial when the precision is low because of the limited number of measures per subject. The joint analysis of survival data and longitudinal data enables one to account for the measurement error in longitudinal data and has received substantial attention in recent years. In this paper we propose a joint model to assess the predictive effect of biomarker variability. The joint model consists of two linked sub-models, a linear mixed model with patient-specific variance for the longitudinal data and a fully parametric Weibull model for the survival data, and the association between the two sub-models is induced by a latent Gaussian process. Parameters in the joint model are estimated in a Bayesian framework and implemented using Markov chain Monte Carlo (MCMC) methods with the WinBUGS software. The method is illustrated in the Ocular Hypertension Treatment Study to assess whether the variability of intraocular pressure is an independent risk factor for primary open-angle glaucoma. The performance of the method is also assessed by simulation studies.
The aim of this work is to evaluate the causal hypothesis that students' socioeconomic status and students' attitude toward mathematics are factors that largely determine the academic results of Costa Rican students, as measured by performance on the PISA 2012 mathematical literacy test. To this end, a measurement model for the latent constructs is defined and the structural model is estimated, from both the classical and the Bayesian approach, in order to compare the two types of estimates. The results show that differences in students' socioeconomic background and in their personal attitude toward mathematics are a good starting point for formulating a more extensive model that better captures the complexity of the social, institutional, and contextual factors that affect young people's academic achievement.
We apply a reversible-jump Markov chain Monte Carlo method to sample the Bayesian posterior model probability density function of 2-D seafloor resistivity as constrained by marine controlled-source electromagnetic data. This density function of earth models conveys information on which parts of the model space are illuminated by the data. Whereas conventional gradient-based inversion approaches require subjective regularization choices to stabilize this highly non-linear and non-unique inverse problem, and provide only a single solution with no model uncertainty information, the method we use entirely avoids model regularization. The result of our approach is an ensemble of models that can be visualized and queried to provide meaningful information about the sensitivity of the data to the subsurface, and the level of resolution of model parameters. We represent models in 2-D using a Voronoi cell parametrization. To make the 2-D problem practical, we use a source–receiver common-midpoint approximation with 1-D forward modelling. Our algorithm is transdimensional and self-parametrizing: the number of resistivity cells within a 2-D depth section is variable, as are their positions and geometries. Two synthetic studies demonstrate the algorithm's use in the appraisal of a thin, segmented, resistive reservoir, which makes for a challenging exploration target. As a demonstration example, we apply our method to survey data collected over the Scarborough gas field on the Northwest Australian shelf.
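The Voronoi cell parametrization is easy to sketch: a model is just a variable-length list of nuclei with per-cell resistivities, and the 2-D section follows by nearest-neighbor lookup. The Python snippet below shows only this mapping (the grid, positions, and values are made up), not the reversible-jump sampler built on top of it.

```python
import numpy as np

def voronoi_resistivity(nuclei_xz, log10_rho, grid_x, grid_z):
    """Map a Voronoi-cell model to a 2-D resistivity section.

    nuclei_xz : (k, 2) array of cell nuclei positions (x, z)
    log10_rho : (k,) array of log10 resistivity per cell
    Each grid point takes the value of its nearest nucleus, so the
    number, positions, and values of cells fully define the model --
    exactly the quantities a transdimensional sampler would perturb.
    """
    X, Z = np.meshgrid(grid_x, grid_z)
    pts = np.stack([X.ravel(), Z.ravel()], axis=1)
    # Squared distances from every grid point to every nucleus
    d2 = ((pts[:, None, :] - nuclei_xz[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return log10_rho[nearest].reshape(X.shape)

# Usage: a 3-cell model with one resistive body (values illustrative)
nuclei = np.array([[0.5, 0.2], [0.5, 0.6], [0.5, 1.2]])  # x, depth (km)
rho = np.array([0.5, 2.0, 1.0])                          # log10 ohm-m
section = voronoi_resistivity(nuclei, rho,
                              grid_x=np.linspace(0, 1, 50),
                              grid_z=np.linspace(0, 1.5, 60))
print(section.shape)  # (60, 50) resistivity image
```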
In MCMC methods, such as the Metropolis-Hastings (MH) algorithm, the Gibbs sampler, or recent adaptive methods, many different strategies can be proposed, often associated in practice with unknown rates of convergence. In this paper we propose a simulation-based methodology to compare these rates of convergence, grounded on an entropy criterion computed from parallel (i.i.d.) simulated Markov chains coming from each candidate strategy. Our criterion determines the most efficient strategy among the candidates. Theoretically, we give for the MH algorithm general conditions under which its successive densities satisfy adequate smoothness and tail properties, so that this entropy criterion can be estimated consistently using kernel density estimates and Monte Carlo integration. Simulated and actual examples in moderate dimensions are provided to illustrate this approach.
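The flavor of the criterion can be sketched in Python: run many i.i.d. parallel chains per candidate strategy and track a kernel-density-based entropy estimate of the chain marginal over iterations. The target, the two proposal scales being compared, and the plain Gaussian KDE are all illustrative simplifications of the paper's estimator.

```python
import numpy as np
from scipy.stats import gaussian_kde

def mh_chains(logpdf, proposal_scale, n_chains=200, n_iter=300, seed=0):
    """Run n_chains independent random-walk MH chains on a 1-D target."""
    rng = np.random.default_rng(seed)
    x = rng.normal(scale=3.0, size=n_chains)   # overdispersed start
    path = np.empty((n_iter, n_chains))
    lp = logpdf(x)
    for t in range(n_iter):
        prop = x + rng.normal(scale=proposal_scale, size=n_chains)
        lp_prop = logpdf(prop)
        accept = np.log(rng.random(n_chains)) < lp_prop - lp
        x, lp = np.where(accept, prop, x), np.where(accept, lp_prop, lp)
        path[t] = x
    return path

def entropy_trace(path):
    """KDE-based estimate of the entropy of the chain marginal at each t."""
    ents = []
    for t in range(path.shape[0]):
        kde = gaussian_kde(path[t])
        ents.append(-np.mean(np.log(kde(path[t]))))  # Monte Carlo -E[log p_t]
    return np.array(ents)

# Compare two proposal scales against a standard normal target
logpdf = lambda x: -0.5 * x**2
for scale in (0.1, 2.4):
    ent = entropy_trace(mh_chains(logpdf, scale))
    print(f"scale={scale}: entropy at t=50 {ent[50]:.3f}, at t=250 {ent[250]:.3f}")
```

The well-scaled strategy should approach the target entropy (about 1.42 for a standard normal) quickly, while the poorly scaled one lingers near the entropy of the overdispersed starting distribution.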
We backtest 59 instruments and investigate the predictability of daily returns using Bayesian variable selection methods. Through these models we show the importance of variable selection and the reduction of overfitting. We also visualize how the driving factors of daily returns from different classes vary over time. Predicting the magnitude of daily returns is again confirmed to be a hard task, but we show that for some instruments it is possible to achieve above-average hit rates that could lead to profitable strategies. Simulation results show that predictability levels of daily returns also vary over time.
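As an illustration of Bayesian variable selection (not the authors' specific models), the sketch below samples over inclusion indicators using the closed-form marginal likelihood of Zellner's g-prior, assuming a centered response, roughly standardized predictors, and a flat prior over models.

```python
import numpy as np

def log_marginal(X, y, gamma, g):
    """Log marginal likelihood of model 'gamma' under Zellner's g-prior.

    Assumes y is centered and the columns of X are standardized; constant
    terms common to all models are dropped.
    """
    n = len(y)
    yty = y @ y
    k = gamma.sum()
    if k == 0:
        return -0.5 * n * np.log(yty)
    Xg = X[:, gamma]
    # Sum of squares explained by projecting y onto the columns of Xg
    coef, *_ = np.linalg.lstsq(Xg, y, rcond=None)
    ssr = y @ (Xg @ coef)
    return -0.5 * k * np.log(1 + g) - 0.5 * n * np.log(yty - g / (1 + g) * ssr)

def mc3(X, y, n_iter=20000, g=None, seed=0):
    """MCMC over inclusion indicators: flip one variable per step (MC3-style)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    g = g or float(n)                      # unit-information prior by default
    gamma = np.zeros(p, dtype=bool)
    lml = log_marginal(X, y, gamma, g)
    incl = np.zeros(p)
    for _ in range(n_iter):
        j = rng.integers(p)
        cand = gamma.copy(); cand[j] = ~cand[j]
        lml_c = log_marginal(X, y, cand, g)
        if np.log(rng.random()) < lml_c - lml:   # flat prior over models
            gamma, lml = cand, lml_c
        incl += gamma
    return incl / n_iter                   # posterior inclusion probabilities

# Usage: 10 candidate predictors, only two matter
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = 0.8 * X[:, 2] - 0.5 * X[:, 7] + rng.normal(size=300)
y -= y.mean()
print(np.round(mc3(X, y), 2))   # high inclusion for columns 2 and 7
```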
In this paper, we discuss the analysis of complex health survey data using multivariate modeling techniques. Our main interest is in design-based and model-based methods that aim at accounting for clustering, stratification, and weighting effects, with a particular focus on clustering effects. The methods considered include generalized linear modeling based on pseudo-likelihood and generalized estimating equations, linear mixed models estimated by restricted maximum likelihood, and hierarchical Bayes techniques using Markov Chain Monte Carlo (MCMC) methods. The methods are compared empirically, using data from a health interview and examination survey conducted in Finland in 2000 (Health-2000 Study). The data of the Health-2000 Study were collected using personal interviews, questionnaires, and clinical examinations. A stratified two-stage cluster sampling design was used. The sampling design involved positive intra-cluster correlation for several study variables. We selected the s...
A Bayesian semi-parametric dynamic model combination is proposed in order to deal with a large set of predictive densities. It extends the mixture of experts and the smoothly mixing regression models by allowing ...
Homo naledi is a recently discovered species of fossil hominin from South Africa. A considerable amount is already known about H. naledi, but some important questions remain unanswered. Here we report a study that addressed two of them: "Where does H. naledi fit in the hominin evolutionary tree?" and "How old is it?" We used a large supermatrix of craniodental characters for both early and late hominin species and Bayesian phylogenetic techniques to carry out three analyses. First, we performed a dated Bayesian analysis to generate estimates of the evolutionary relationships of fossil hominins including H. naledi. Then we employed Bayes factor tests to compare the strength of support for hypotheses about the relationships of H. naledi suggested by the best-estimate trees. Lastly, we carried out a resampling analysis to assess the accuracy of the age estimate for H. naledi yielded by the dated Bayesian analysis. The analyses strongly supported the hypothesis th...
This work deals with the development of new theoretical and experimental techniques for the efficient estimation of thermophysical properties and source terms at the micro- and macro-scale. Two kinds of source term were studied: a constant one and a time-varying one, the latter with sinusoidal and pulse forms. Two devices were used for heating the sample: an electrical resistance heater and a laser diode. For the data acquisition, an infrared camera was used, providing a full cartography of the properties of the medium as well as non-contact temperature measurements. The direct problem was solved by the finite-difference method, and two approaches were used for the solution of the inverse problem, depending on the time-varying behavior of the source term. Both approaches estimate the parameters within the Bayesian framework, using the Markov Chain Monte Carlo (MCMC) method via the Metropolis-Hastings (MH) algorithm for the constant source term, and the Kalman filter for the time-varying source term. The nodal strategy is presented as a method to deal with problems involving large amounts of experimental data. Experiments were carried out on a sample with well-known thermophysical properties, determined by classical methods.
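A compact sketch of the constant-source case: a 1-D explicit finite-difference forward model and a random-walk Metropolis-Hastings sampler over the thermal diffusivity and source magnitude. The dimensionless geometry, priors, and noise level are illustrative assumptions, not the thesis's experimental setup.

```python
import numpy as np

def forward_temperature(alpha, s, n_nodes=21, dx=0.05, dt=0.005, n_steps=400):
    """Explicit finite-difference solution of T_t = alpha*T_xx + s on a rod.

    Dimensionless setup with T=0 at both ends and T(x,0)=0; returns the
    centre-node temperature history. dt satisfies the stability limit
    dt <= dx^2/(2*alpha) for the alpha values explored below.
    """
    T = np.zeros(n_nodes)
    history = np.empty(n_steps)
    for k in range(n_steps):
        lap = np.zeros(n_nodes)
        lap[1:-1] = (T[2:] - 2 * T[1:-1] + T[:-2]) / dx**2
        T = T + dt * (alpha * lap + s)
        T[0] = T[-1] = 0.0                     # fixed-temperature boundaries
        history[k] = T[n_nodes // 2]
    return history

def metropolis(y_obs, noise_sd, n_iter=5000, seed=0):
    """Random-walk MH over (alpha, s); flat positive priors (illustrative)."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.05, 0.5])              # initial guess (alpha, s)
    ll = -0.5 * np.sum((y_obs - forward_temperature(*theta))**2) / noise_sd**2
    draws = []
    for _ in range(n_iter):
        prop = theta + rng.normal(scale=[0.005, 0.05])
        if (prop > 0).all() and prop[0] < 0.2:  # keep the explicit scheme stable
            ll_p = -0.5 * np.sum((y_obs - forward_temperature(*prop))**2) / noise_sd**2
            if np.log(rng.random()) < ll_p - ll:
                theta, ll = prop, ll_p
        draws.append(theta.copy())
    return np.array(draws)

# Usage: synthetic "infrared camera" data at the centre node
true_alpha, true_s, noise = 0.1, 1.0, 0.01
y_obs = forward_temperature(true_alpha, true_s)
y_obs += np.random.default_rng(1).normal(scale=noise, size=len(y_obs))
draws = metropolis(y_obs, noise)
print(draws[2500:].mean(axis=0))               # posterior means near (0.1, 1.0)
```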
The phenomenon of sponsored search advertising, where advertisers pay a fee to Internet search engines to be displayed alongside organic (non-sponsored) web search results, is gaining ground as the largest source of revenue for search engines. Using a unique six-month panel dataset of several hundred keywords collected from a large nationwide retailer that advertises on Google, we ...
Practitioners and construction management researchers lack believable and practical methods to assess the value proposition of emerging methods such as Virtual Design and Construction (VDC), including an understanding of how different levels of implementation affect its benefits. Furthermore, current methods of understanding VDC implementation and benefits cannot be updated easily to incorporate new data. This paper presents a Bayesian framework to predict the benefits of VDC given data about its implementation. We analyzed data from 40 projects that performed some formal modeling of the project scope and/or the construction process. The analysis suggests that more extensive or higher levels of VDC implementation lead to higher project benefits. We explain the use of a Bayesian framework as an alternative to classical probability theory in construction management research and how we used it to interpret data about VDC practice and outcomes. We find that benefits have a strong positive contingent correlation with the level of VDC implemented on projects, and we suggest using the method to update conclusions about benefits as new data about implementation and outcomes become available.
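As a toy illustration of the updating idea (not the paper's actual model or data), a conjugate Beta-binomial analysis of "high benefit" rates per implementation level shows how conclusions can be revised as new project data arrive:

```python
import numpy as np
from scipy.stats import beta

# Hypothetical counts: projects at each VDC implementation level and how many
# reported high benefits (numbers are illustrative, not the paper's 40
# projects). A Beta(1, 1) prior is updated to a Beta posterior per level.
levels = {"low": (12, 4), "medium": (15, 9), "high": (13, 11)}  # (n, successes)

for name, (n, k) in levels.items():
    post = beta(1 + k, 1 + n - k)            # conjugate Beta-binomial update
    lo, hi = post.ppf([0.05, 0.95])
    print(f"{name:>6}: P(high benefit) mean={post.mean():.2f}, "
          f"90% interval [{lo:.2f}, {hi:.2f}]")

# Updating with new data is just a second conjugate step, e.g. 3 new
# high-level projects, 2 of them with high benefits:
post_new = beta(1 + 11 + 2, 1 + 13 - 11 + 1)
print("high, updated:", round(post_new.mean(), 2))
```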