Javier Trejos Zelaya was born in San José, Costa Rica, in 1961. He graduated in Mathematics from the University of Costa Rica in 1988 and obtained a Doctorate in Applied Mathematics at Paul Sabatier University in Toulouse, France (1994). He has been a full professor at the University of Costa Rica (UCR) since 1996. He founded the Research Centre for Pure and Applied Mathematics (CIMPA) in 1997 and the journal Revista de Matemática: Teoría y Aplicaciones in 1994. His research at CIMPA deals with combinatorial optimization methods in multivariate analysis, particularly in clustering, regression, and multidimensional scaling, with around 50 articles on these topics. He has also led several projects, in Costa Rica and Latin America, on instructional innovation at the university level, with 5 books published in this field. He has organized several scientific conferences in Costa Rica and has served on several scientific committees around the world. He has been Dean of the Faculty of Science (2012-2020), Director of CIMPA (2006-2010), Director of graduate studies in mathematics (1996-2000), Coordinator of the Institute of Advanced Studies (2016-2019) and chief editor of the journal (since 1994). At present, he is the Director of the School of Mathematics. Javier Trejos has also directed over 20 Licentiate and Master's theses at UCR and has been a jury member for several Master's and Doctoral theses abroad. He has been an invited professor at the Metropolitan University of Mexico (2001) and the University of Limoges (1999 and 2010). Address: San José, Provincia de San José, Costa Rica
Communications in Statistics - Simulation and Computation, 1981
Variable Selection in Multiple Linear Regression Using the Minimum Sum of Weighted Absolute Errors Criterion. JF Wellington, SC Narula COMMUN. STAT.- SIMUL. COMPUT. 6, 641-648, 1981. C CM 6 PROBABILITY AND STATISTICS(CI).
ABSTRACT The aim of this paper is to present the results of the evaluation of combinatorial optimization heuristics for partitioning, specifically simulated annealing, tabu search and genetic algorithms, in comparison with traditional methods such as k-means and Ward's hierarchical clustering. Data tables generated at random according to certain established parameters were used. Sixteen data tables with normally distributed variables were generated, the experiment was repeated 100 times for each table and each method, and the within-class inertia (W) was used as the criterion for comparing results. The best results were obtained with simulated annealing and the genetic algorithm.
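As an illustration of the comparison criterion named in this abstract, the following is a minimal sketch of the within-class inertia W of a partition. It assumes uniform weights 1/n and squared Euclidean distances; the function name `within_class_inertia` and the toy data are hypothetical and are not taken from the paper.

```python
def within_class_inertia(points, labels):
    """Within-class inertia W of a partition.

    Assumption (not from the paper): every point has weight 1/n and
    distances are squared Euclidean distances to the class centroid.
    """
    n = len(points)
    # Group the points by class label.
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        # Centroid (mean vector) of the class.
        centroid = [sum(m[j] for m in members) / len(members) for j in range(dim)]
        # Sum of squared distances from each member to the centroid.
        total += sum((m[j] - centroid[j]) ** 2 for m in members for j in range(dim))
    return total / n


# Toy data: two well-separated pairs of points on a line.
pts = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
good = within_class_inertia(pts, [0, 0, 1, 1])  # compact classes
bad = within_class_inertia(pts, [0, 1, 0, 1])   # mixed classes
```

A lower W means more compact classes, so a heuristic such as simulated annealing would search over label assignments for the one that minimizes this value.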
ABSTRACT In this paper, several approaches are studied for defining moving averages and volatilities from the exponential point of view, that is, giving larger weight to the most recent data. Several of the approaches used in the literature are shown to suffer from definition problems. A way of defining a complete system of weights is proposed, which corrects some of the problems of those approaches. In addition, a recurrence formula is proposed that converges to the true exponential moving average and volatility. Finally, bounds are studied for the approximation errors incurred when the formulas studied are used.
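To make the idea of exponential weighting concrete, here is a minimal sketch that contrasts a moving average whose weights are normalized to sum to one with the common textbook recursion. Both definitions are generic assumptions chosen for illustration; they are not claimed to be the weight system or the recurrence proposed in the paper, and the names `ewma_normalized`, `ewma_recursive` and `lam` are hypothetical.

```python
def ewma_normalized(xs, lam):
    """Exponentially weighted mean with weights that sum exactly to 1.

    The newest observation has weight proportional to lam**0, the one
    before it lam**1, and so on (0 < lam < 1).
    """
    n = len(xs)
    weights = [lam ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, xs)) / total


def ewma_recursive(xs, lam):
    """Textbook recursion m_t = lam * m_{t-1} + (1 - lam) * x_t,
    started at the first observation."""
    m = xs[0]
    for x in xs[1:]:
        m = lam * m + (1 - lam) * x
    return m


# On a long series the two definitions nearly coincide, because the
# influence of the starting value decays like lam**n.
series = [float(i % 7) for i in range(200)]
gap = abs(ewma_normalized(series, 0.9) - ewma_recursive(series, 0.9))
```

On short series the two values can differ noticeably, which is the kind of approximation error the abstract refers to.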
We have implemented the usual agglomerative hierarchical clustering algorithm in the MATHEMATICA system. The user chooses the dissimilarity according to the type of data to be analyzed (quantitative, qualitative or binary), as well as the aggregation index to be used. Several options are available for each choice. In addition, a large number of manipulations of the binary classification tree have been implemented, such as cutting the tree, rotations, dimensionality, labeling, colors, etc.
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters. Five new and original methods are introduced, using neighborhood-based and population-based combinatorial optimization metaheuristics: the former are simulated annealing, threshold accepting and tabu search, and the latter are a genetic algorithm and ant colony optimization. The methods are implemented, with proper calibration of the parameters of the heuristics to ensure good results. From a set of 16 data tables generated by a quasi-Monte Carlo experiment, a comparison is performed for one of the aggregations using L1 dissimilarity, with hierarchical clustering, and a version of k-means: partitioning around medoids or PAM. Simulated annealing performs very well, especially compared to classical methods.
The term structure of interest rates or yield curve is a function relating the interest rate with its own term. Nonlinear regression models of Nelson-Siegel and Svensson were used to estimate the yield curve using a sample of historical data supplied by the National Stock Exchange of Costa Rica. The optimization problem involved in the estimation process of model parameters is addressed by the use of four well known combinatorial optimization metaheuristics: Ant colony optimization, Genetic algorithm, Particle swarm optimization and Simulated annealing. The aim of the study is to improve the local minima obtained by a classical quasi-Newton optimization method using a descent direction. Good results with at least two metaheuristics are achieved, Particle swarm optimization and Simulated annealing. Keywords: Yield curve, nonlinear regression, Nelson-
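For readers unfamiliar with the model named in this abstract, the following is a minimal sketch of the standard Nelson-Siegel curve and of a least-squares objective that a metaheuristic could minimize. The parameter names (`beta0`, `beta1`, `beta2`, `lam`), the function names and the sample values are illustrative assumptions; the paper's exact parameterization, data and objective are not given in this listing.

```python
import math


def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Standard Nelson-Siegel yield for maturity tau > 0, decay lam > 0:

    y(tau) = beta0 + beta1 * (1 - e^{-x}) / x
                   + beta2 * ((1 - e^{-x}) / x - e^{-x}),  x = tau / lam
    """
    x = tau / lam
    slope_loading = -math.expm1(-x) / x  # (1 - e^{-x}) / x, stable for small x
    curvature_loading = slope_loading - math.exp(-x)
    return beta0 + beta1 * slope_loading + beta2 * curvature_loading


def sum_squared_errors(params, maturities, observed):
    """Least-squares objective (an assumed choice) over observed yields."""
    return sum(
        (nelson_siegel(t, *params) - y) ** 2
        for t, y in zip(maturities, observed)
    )


# Hypothetical parameters and maturities, used only to exercise the code.
params = (0.05, -0.02, 0.01, 2.0)
maturities = [0.5, 1.0, 2.0, 5.0, 10.0]
observed = [nelson_siegel(t, *params) for t in maturities]
```

The objective is nonlinear in `lam`, which is why descent methods can stop at local minima and why global search heuristics are of interest here.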
We study some criteria that can be applied for the partitioning of a set of objects when non-Euclidean distances are used; in particular, these criteria can be used when the data are described by binary variables. These criteria are based on aggregations that measure the homogeneity of a class, and some are generalizations of variance or inertia. Properties of the criteria are studied and partitioning methods are proposed, based on metaheuristics of global optimization, such as simulated annealing and tabu search. Finally, comparative results on binary data are shown.
Revista de Matemática: Teoría y Aplicaciones, Aug 1, 2004
Abstract. We develop the theory necessary for Data Analysis with inner semiproducts, extending the classical concepts of inner products usually employed. For this, we use the basic algebraic definitions of nondegenerate bilinear forms and develop all the algebraic tools needed. We study the most important operators on the individual space, such as the VM and the MV operators. We also study the case of the inner semiproduct of weights in the variable space, which corresponds to the introduction of supplementary individuals in the case of null weights. Finally, we arrive at the usual concepts of Principal Component Analysis.
An index of soil quality and health for banana plantations in four countries of Latin America and the Caribbean. The productivity of banana plantations in several Latin American and Caribbean countries has shown significant declines over the past decade, which has been associated with a deterioration of physical, chemical and biological soil factors. This study aimed to construct a mathematical index to describe the quality and health of these soils, indicating the most critical factors in production. Sixty-six indicators were measured in 38 selected banana farms with different levels of production in Costa Rica, Dominican Republic, Panama and Venezuela. Discriminant analysis for all of the countries indicated that the explanatory variables of discrimination are quite different among the countries, so we proceeded to define the quality and health indices individually. Seventeen indicators were identified for Costa Rica, 13 for Panama, 11 for Venezuela and 22 for the Dominican Republic; th...
The term structure of interest rates or yield curve is a function relating the interest rate with its own term. Nonlinear regression models of Nelson-Siegel and Svensson were used to estimate the yield curve using a sample of historical data supplied by the National Stock Exchange of Costa Rica. The optimization problem involved in the estimation process of model parameters is addressed by the use of metaheuristics: Ant colony, Genetic algorithm, Particle swarm and Simulated annealing. The aim of the study is to improve the local minimum obtained by an optimization method using descent direction. Good results with at least two metaheuristics are achieved, Particle swarm and Simulated annealing.
ABSTRACT A study of the main two-mode clustering methods, both hierarchical and partitioning, is presented. A recurrence formula of the Lance & Williams type is proposed for two-mode hierarchical clustering, using the criterion of ...
Report -- Universidad de Costa Rica. Centro de Investigaciones en Matemáticas Puras y Aplicadas, 2014. San José, 25-28 February