
Showing 1–21 of 21 results for author: Charpentier, A

  1. arXiv:2403.15790  [pdf, other]

    cs.LG stat.ML

    Boarding for ISS: Imbalanced Self-Supervised: Discovery of a Scaled Autoencoder for Mixed Tabular Datasets

    Authors: Samuel Stocksieker, Denys Pommeret, Arthur Charpentier

    Abstract: The field of imbalanced self-supervised learning, especially in the context of tabular data, has not been extensively studied. Existing research has predominantly focused on image datasets. This paper aims to fill this gap by examining the specific challenges posed by data imbalance in self-supervised learning in the domain of tabular data, with a primary focus on autoencoders. Autoencoders are wi…

    Submitted 23 March, 2024; originally announced March 2024.

  2. arXiv:2311.11900  [pdf, other]

    stat.ML cs.CY cs.LG

    Measuring and Mitigating Biases in Motor Insurance Pricing

    Authors: Mulah Moriah, Franck Vermet, Arthur Charpentier

    Abstract: The non-life insurance sector operates within a highly competitive and tightly regulated framework, confronting a pivotal juncture in the formulation of pricing strategies. Insurers are compelled to harness a range of statistical methodologies and available data to construct optimal pricing structures that align with the overarching corporate strategy while accommodating the dynamics of market com…

    Submitted 20 June, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

  3. arXiv:2310.20508  [pdf, other]

    stat.ML cs.CY cs.LG

    Parametric Fairness with Statistical Guarantees

    Authors: François Hu, Philipp Ratz, Arthur Charpentier

    Abstract: Algorithmic fairness has gained prominence due to societal and regulatory concerns about biases in Machine Learning models. Common group fairness metrics like Equalized Odds for classification or Demographic Parity for both classification and regression are widely used and a host of computationally advantageous post-processing methods have been developed around them. However, these metrics often l…

    Submitted 31 October, 2023; originally announced October 2023.

  4. arXiv:2309.06627  [pdf, other]

    stat.ML cs.CY cs.LG

    A Sequentially Fair Mechanism for Multiple Sensitive Attributes

    Authors: François Hu, Philipp Ratz, Arthur Charpentier

    Abstract: In the standard use case of Algorithmic Fairness, the goal is to eliminate the relationship between a sensitive variable and a corresponding score. Throughout recent years, the scientific community has developed a host of definitions and tools to solve this task, which work well in many practical applications. However, the applicability and effectiveness of these tools and definitions become less s…

    Submitted 14 January, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  5. arXiv:2308.11090  [pdf, other]

    cs.CV cs.LG stat.AP

    Fairness Explainability using Optimal Transport with Applications in Image Classification

    Authors: Philipp Ratz, François Hu, Arthur Charpentier

    Abstract: Ensuring trust and accountability in Artificial Intelligence systems demands explainability of their outcomes. Despite significant progress in Explainable AI, human biases still taint a substantial portion of its training data, raising concerns about unfairness or discriminatory tendencies. Current approaches in the field of Algorithmic Fairness focus on mitigating such biases in the outcomes of a m…

    Submitted 31 October, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

  6. arXiv:2308.02966  [pdf, other]

    stat.ML cs.LG

    Generalized Oversampling for Learning from Imbalanced datasets and Associated Theory

    Authors: Samuel Stocksieker, Denys Pommeret, Arthur Charpentier

    Abstract: In supervised learning, it is quite frequent to be confronted with real imbalanced datasets. This situation leads to a learning difficulty for standard algorithms. Research and solutions in imbalanced learning have mainly focused on classification tasks. Despite its importance, very few solutions exist for imbalanced regression. In this paper, we propose a data augmentation procedure, the GOLIATH…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: This paper focuses specifically on the Imbalanced Regression issues but could be used for Imbalanced classification tasks

  7. arXiv:2306.12912  [pdf, other]

    stat.ML cs.CY cs.LG

    Mitigating Discrimination in Insurance with Wasserstein Barycenters

    Authors: Arthur Charpentier, François Hu, Philipp Ratz

    Abstract: The insurance industry is heavily reliant on predictions of risks based on characteristics of potential customers. Although the use of said models is common, researchers have long pointed out that such practices perpetuate discrimination based on sensitive features such as gender or race. Given that such discrimination can often be attributed to historical data biases, an elimination or at least m…

    Submitted 22 June, 2023; originally announced June 2023.

  8. Fairness in Multi-Task Learning via Wasserstein Barycenters

    Authors: François Hu, Philipp Ratz, Arthur Charpentier

    Abstract: Algorithmic Fairness is an established field in machine learning that aims to reduce biases in data. Recent advances have proposed various methods to ensure fairness in a univariate environment, where the goal is to de-bias a single task. However, extending fairness to a multi-task setting, where more than one objective is optimised using a shared representation, remains underexplored. To bridge t…

    Submitted 6 July, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  9. arXiv:2302.09288  [pdf, other]

    stat.ML cs.LG stat.ME

    Data Augmentation for Imbalanced Regression

    Authors: Samuel Stocksieker, Denys Pommeret, Arthur Charpentier

    Abstract: In this work, we consider the problem of imbalanced data in a regression framework when the imbalanced phenomenon concerns continuous or discrete covariates. Such a situation can lead to biases in the estimates. In this case, we propose a data augmentation algorithm that combines a weighted resampling (WR) and a data augmentation (DA) procedure. In a first step, the DA procedure permits exploring…

    Submitted 18 February, 2023; originally announced February 2023.

    Comments: paper accepted at the AISTATS 2023 conference, to be published in PMLR (Proceedings of Machine Learning Research)

  10. arXiv:2202.12008  [pdf, other]

    stat.ML cs.AI cs.CY cs.LG stat.AP

    A Fair Pricing Model via Adversarial Learning

    Authors: Vincent Grari, Arthur Charpentier, Marcin Detyniecki

    Abstract: At the core of insurance business lies classification between risky and non-risky insureds, actuarial fairness meaning that risky insureds should contribute more and pay a higher premium than non-risky or less-risky ones. Actuaries, therefore, use econometric or machine learning techniques to classify, but the distinction between a fair actuarial classification and "discrimination" is subtle. For…

    Submitted 26 December, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

    Comments: 20 pages, 12 figures

  11. Predicting Drought and Subsidence Risks in France

    Authors: Arthur Charpentier, Molly James, Hani Ali

    Abstract: The economic consequences of drought episodes are increasingly important, although they are often difficult to apprehend in part because of the complexity of the underlying mechanisms. In this article, we will study one of the consequences of drought, namely the risk of subsidence (or more specifically clay shrinkage induced subsidence), for which insurance has been mandatory in France for several…

    Submitted 15 July, 2021; originally announced July 2021.

  12. arXiv:2103.03635  [pdf, other]

    stat.ML cs.LG econ.EM

    Autocalibration and Tweedie-dominance for Insurance Pricing with Machine Learning

    Authors: Michel Denuit, Arthur Charpentier, Julien Trufin

    Abstract: Boosting techniques and neural networks are particularly effective machine learning methods for insurance pricing. Often in practice, there are nevertheless endless debates about the choice of the right loss function to be used to train the machine learning model, as well as about the appropriate metric to assess the performances of competing models. Also, the sum of fitted values can depart from…

    Submitted 9 July, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

  13. arXiv:2006.08446  [pdf, other]

    stat.AP econ.GN

    Modeling Joint Lives within Families

    Authors: Olivier Cabrignac, Arthur Charpentier, Ewen Gallic

    Abstract: Family history is usually seen as a significant factor that insurance companies consider when someone applies for a life insurance policy. Where it is used, family history of cardiovascular diseases, death by cancer, or family history of high blood pressure and diabetes could result in higher premiums or no coverage at all. In this article, we use massive (historical) data to study dependencies between life le…

    Submitted 15 June, 2020; originally announced June 2020.

  14. arXiv:1912.11736  [pdf, other]

    econ.EM stat.ME

    Pareto models for risk management

    Authors: Arthur Charpentier, Emmanuel Flachaire

    Abstract: The Pareto model is very popular in risk management, since simple analytical formulas can be derived for financial downside risk measures (Value-at-Risk, Expected Shortfall) or reinsurance premiums and related quantities (Large Claim Index, Return Period). Nevertheless, in practice, distributions are (strictly) Pareto only in the tails, above a (possibly very) large threshold. Therefore, it could be…

    Submitted 25 December, 2019; originally announced December 2019.

  15. arXiv:1905.10267  [pdf, other]

    cs.SI physics.soc-ph stat.ME

    Extended Scale-Free Networks

    Authors: Arthur Charpentier, Emmanuel Flachaire

    Abstract: Recently, Broido & Clauset (2019) mentioned that (strict) Scale-Free networks were rare in real life. This might be related to the statement of Stumpf, Wiuf & May (2005) that sub-networks of scale-free networks are not scale-free. In the latter, those sub-networks are asymptotically scale-free, but one should not forget about second-order deviation (possibly also third order actually). In this ar…

    Submitted 28 May, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

  16. arXiv:1810.09214  [pdf, other]

    stat.ME

    A new GEE method to account for heteroscedasticity, using asymmetric least-square regressions

    Authors: Amadou Barry, Karim Oualkacha, Arthur Charpentier

    Abstract: Generalized estimating equations (GEE) are widely used to analyze longitudinal data; however, they are not appropriate for heteroscedastic data, because they only estimate regressor effects on the mean response and therefore do not account for data heterogeneity. Here, we combine the GEE with the asymmetric least squares (expectile) regression to derive a new class of estimators, which…

    Submitted 24 December, 2020; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: 40 pages, 14 figures, and all sections modified

  17. arXiv:1708.06992  [pdf, other]

    stat.OT econ.EM

    Econométrie et Machine Learning

    Authors: Arthur Charpentier, Emmanuel Flachaire, Antoine Ly

    Abstract: Econometrics and machine learning seem to have one common goal: to construct a predictive model, for a variable of interest, using explanatory variables (or features). However, these two fields developed in parallel, thus creating two different cultures, to paraphrase Breiman (2001). The first was to build probabilistic models to describe economic phenomena. The second uses algorithms that will le…

    Submitted 19 March, 2018; v1 submitted 26 July, 2017; originally announced August 2017.

    Comments: in French

  18. arXiv:1707.07607  [pdf, other]

    stat.OT

    We are not alone ! (at least, most of us). Homonymy in large scale social groups

    Authors: Arthur Charpentier, Baptiste Coulmont

    Abstract: This article brings forward an estimation of the proportion of homonyms in large scale groups based on the distribution of first names and last names in a subset of these groups. The estimation is based on the generalization of the "birthday paradox problem". The main result is that, in societies such as France or the United States, identity collisions (based on first + last names) are frequent.…

    Submitted 24 July, 2017; originally announced July 2017.

  19. arXiv:1602.08773  [pdf, other]

    stat.AP

    Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

    Authors: Arthur Charpentier, Mathieu Pigeon

    Abstract: Traditionally, actuaries have used run-off triangles to estimate reserves ("macro" models, on aggregated data). But it is possible to model payments related to individual claims. If those models provide similar estimations, we investigate uncertainty related to reserves, with "macro" and "micro" models. We study theoretical properties of econometric models (Gaussian, Poisson and quasi-Poisson) on in…

    Submitted 28 February, 2016; originally announced February 2016.

  20. arXiv:1404.4414  [pdf, other]

    stat.ME math.ST

    Probit transformation for nonparametric kernel estimation of the copula density

    Authors: Gery Geenens, Arthur Charpentier, Davy Paindaveine

    Abstract: Copula modelling has become ubiquitous in modern statistics. Here, the problem of nonparametrically estimating a copula density is addressed. Arguably the most popular nonparametric density estimator, the kernel estimator is not suitable for the unit-square-supported copula densities, mainly because it is heavily affected by boundary bias issues. In addition, most common copulas admit unbounded de…

    Submitted 16 April, 2014; originally announced April 2014.

  21. arXiv:1112.0929  [pdf, other]

    stat.AP stat.ME

    Multivariate integer-valued autoregressive models applied to earthquake counts

    Authors: Mathieu Boudreault, Arthur Charpentier

    Abstract: In various situations in the insurance industry, in finance, in epidemiology, etc., one needs to represent the joint evolution of the number of occurrences of an event. In this paper, we present a multivariate integer-valued autoregressive (MINAR) model, derive its properties and apply the model to earthquake occurrences across various pairs of tectonic plates. The model is an extension of Pedelis…

    Submitted 5 December, 2011; originally announced December 2011.