Summary. The aim of the article is to identify the factors that influence dropout among student teachers, considering their individual and academic characteristics. The study was conducted with 531 students from the 2009 cohort. The research is quantitative, with an explanatory, longitudinal, and non-experimental design. The information was collected from secondary data, which were analyzed using survival analysis, modeled through Cox proportional hazards regression. The results showed that the individual variables that explain student dropout are sex and coming from the Bio Bio region. The academic variables that explain university dropout are the secondary-school grade point average, the position on the list of admitted applicants, coming from a scientific-humanistic secondary school, the total number of courses enrolled, the latest curricular average, and the suspension of studies. It is concluded that the capacities associated with the level of academic achievement and the management of social support for students are significant for maintaining the commitment to remain in the academic program. To the extent that capacities and support management are positive, students will have favorable interactions that support their participation at the institutional level, which will favor their intellectual and academic development.
Finally, it is concluded that, at the level of institutional policy, it is relevant to manage support for students' capacities and adaptation, since this will contribute to a positive balance between academic and social integration, by configuring elements that support a motivating context in which students maintain their commitment to the goal of graduation. Keywords: student dropout; higher education; survival analysis; Cox regression model. [en] Explanatory factors the student teachers drop out rates Abstract. The aim of this study is to identify the factors that influence student-teachers drop-out rates, taking their individual and academic characteristics into account. The study was conducted on 531 student-teachers from the 2009 cohort. This is a quantitative and non experimental study of a
Distance-based regression allows for a neat implementation of the partial least squares recurrence. In this paper, we address practical issues arising when dealing with moderately large datasets (n ~ 10^4) such as those typical of automobile insurance premium calculations.
The aim of this paper is to study model selection criteria in credit scoring. Such criteria are usually derived from an error cost function which takes into account misclassification probabilities in good and bad credit risk subpopulations plus other parameters encoding context information relevant to the objective portfolio. We present a distance-based classification approach to credit scoring, as an
This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models first to the generalized linear model framework. Then, a nonparametric version of these models is proposed by means of local fitting. Distances between individuals are the only predictor information needed to fit these models. Therefore, they are applicable, among others, to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. An implementation is provided by the R package dbstats, which also implements other distance-based prediction methods. Supplementary material for this article is available online, which reproduces all the results of this article.
This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models firstly with the generalized linear model concept, then by localizing. Distances between individuals are the only predictor information needed to fit these models. Therefore, they are applicable to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. Models can be
Methodology and Computing in Applied Probability, 2014
Predictions with distance-based linear and generalized linear models rely upon latent variables derived from the distance function. This key feature has the drawback of adding a non-linearity layer between observed predictors and response which shields one from the other and, in particular, prevents us from interpreting linear predictor coefficients as influence measures. In actuarial applications such as credit scoring or a priori rate-making we cannot forgo this capability, crucial to assess the relative leverage of risk factors. Towards the goal of recovering this functionality, we define and study influence coefficients, measuring the relative importance of observed predictors. Unavoidably, due to inherent model non-linearities, these quantities will be local, that is, valid in a neighborhood of a given point in predictor space.
Journal of Statistical Planning and Inference, 2009
New points can be superimposed on a Euclidean configuration obtained as a result of a metric multidimensional scaling at coordinates given by Gower's interpolation formula. The procedure amounts to discarding a possibly non-null coordinate along an additional dimension. We derive an analytical formula for this projection error term and, for real data problems, we describe a statistical method for testing its significance, as a cautionary device prior to further distance-based predictions.
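As an illustration of the interpolation step described in this abstract, here is a minimal sketch in plain Python. It assumes the configuration X holds principal coordinates (columns centered and mutually orthogonal, as metric multidimensional scaling produces) and uses the standard form of Gower's formula, x_new = (1/2) Λ^{-1} X'(b − d), where b holds the squared norms of the rows of X and d the squared distances from the new point to the original ones. The function names are hypothetical, and the second function is only a simple check, valid when the new point and the original points are jointly Euclidean; it is not the analytical formula derived in the paper.

```python
def gower_interpolate(X, d2_new):
    """Coordinates of a new point via Gower's interpolation formula.

    X      : list of n rows of principal coordinates
             (assumed: columns centered and mutually orthogonal).
    d2_new : list of n squared distances from the new point to the rows of X.
    """
    n, p = len(X), len(X[0])
    b = [sum(v * v for v in row) for row in X]  # squared norms of the rows
    coords = []
    for k in range(p):
        lam = sum(X[i][k] ** 2 for i in range(n))  # k-th eigenvalue
        num = sum(X[i][k] * (b[i] - d2_new[i]) for i in range(n))
        coords.append(0.5 * num / lam)
    return coords


def discarded_sq_coordinate(X, d2_new):
    """Mean gap between given and reproduced squared distances.

    In the jointly Euclidean case this equals the square of the
    coordinate discarded along the additional dimension.
    """
    y = gower_interpolate(X, d2_new)
    gaps = [
        d2 - sum((xi - yi) ** 2 for xi, yi in zip(row, y))
        for row, d2 in zip(X, d2_new)
    ]
    return sum(gaps) / len(gaps)


# Toy configuration: four centered points with orthogonal columns.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
# Squared distances from a point at (0.5, 1.0) lifted by 2 in a third dimension.
d2 = [5.25, 7.25, 5.25, 13.25]
y = gower_interpolate(X, d2)
h2 = discarded_sq_coordinate(X, d2)
```

In this toy case the interpolated point lands at (0.5, 1.0) in the plane, and the discarded squared coordinate is 4, the squared offset that the projection ignores.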
The problem of nonparametrically predicting a scalar response variable from a functional predictor is considered. A sample of pairs (functional predictor and response) is observed. When predicting the response for a new functional predictor value, a semi-metric is used to compute the distances between the new and the previously observed functional predictors. Then each pair in the original sample is weighted according to a decreasing function of these distances. A Weighted (Linear) Distance-Based Regression is fitted, where the weights are as above and the distances are given by a possibly different semi-metric. This approach can be extended to nonparametric predictions from other kinds of explanatory variables (e.g., data of mixed type) in a natural way.
Distance-based regression allows for a neat implementation of the Partial Least Squares recurrence. In this paper we address practical issues arising when dealing with moderately large datasets (n ~ 10^4) such as those typical of automobile insurance premium calculations.
Communications in Statistics - Theory and Methods, 2009
We propose a method of including polynomial and interaction terms in Distance-Based Regression (Cuadras and Arenas, 1990), relying on properties of a semi-Hadamard or Khatri-Rao product of matrices. We demonstrate its application to real data examples.
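To make the matrix product mentioned in this abstract concrete, the following sketch assumes the row-wise variant of the Khatri-Rao product, in which row i of the result is the Kronecker product of row i of A and row i of B. That reading is an assumption, and the function names are hypothetical. The property checked is a general fact about this product: its Gram matrix equals the element-wise (Hadamard) product of the Gram matrices of A and B, which is what lets interaction terms be expressed through inner products alone.

```python
def rowwise_khatri_rao(A, B):
    """Row-wise Khatri-Rao product: row i is kron(A[i], B[i])."""
    if len(A) != len(B):
        raise ValueError("A and B must have the same number of rows")
    return [[a * b for a in ra for b in rb] for ra, rb in zip(A, B)]


def gram(M):
    """Gram matrix M M' of inner products between rows."""
    return [[sum(x * y for x, y in zip(r, s)) for s in M] for r in M]


def hadamard(P, Q):
    """Element-wise product of two matrices of equal shape."""
    return [[p * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


A = [[1, 2], [3, 4]]
B = [[1, 0, 2], [0, 1, 1]]
C = rowwise_khatri_rao(A, B)  # 2 rows, 2 * 3 = 6 columns

# Gram matrix of the product equals the Hadamard product of the Gram matrices.
same = gram(C) == hadamard(gram(A), gram(B))
```

Each row of C holds all pairwise products of the corresponding rows of A and B, so the columns of C play the role of interaction terms between the two sets of coordinates.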
New points can be superimposed on a Euclidean configuration obtained as a result of a metric Multidimensional Scaling at coordinates given by Gower’s interpolation formula. The procedure amounts to discarding a possibly non-null coordinate along an additional dimension. We compute this error term, assessing its influence on distance-based predictions.
Papers by Eva Boj