-
Fast Screening Rules for Optimal Design via Quadratic Lasso Reformulation
Authors:
Guillaume Sagnol,
Luc Pronzato
Abstract:
The problems of Lasso regression and optimal design of experiments share a critical property: their optimal solutions are typically \emph{sparse}, i.e., only a small fraction of the optimal variables are non-zero. Therefore, the identification of the support of an optimal solution reduces the dimensionality of the problem and can yield a substantial simplification of the calculations. It has recently been shown that linear regression with a \emph{squared} $\ell_1$-norm sparsity-inducing penalty is equivalent to an optimal experimental design problem. In this work, we use this equivalence to derive safe screening rules that can be used to discard inessential samples. Compared to previously existing rules, the new tests are much faster to compute, especially for problems involving a parameter space of high dimension, and can be used dynamically within any iterative solver, with negligible computational overhead. Moreover, we show how an existing homotopy algorithm to compute the regularization path of the Lasso method can be reparametrized with respect to the squared $\ell_1$-penalty. This allows the computation of a Bayes $c$-optimal design in a finite number of steps and can be several orders of magnitude faster than standard first-order algorithms. The efficiency of the new screening rules and of the homotopy algorithm is demonstrated on different examples based on real data.
Submitted 13 October, 2023;
originally announced October 2023.
-
Active Discrimination Learning for Gaussian Process Models
Authors:
Elham Yousefi,
Luc Pronzato,
Markus Hainy,
Werner G. Müller,
Henry P. Wynn
Abstract:
The paper covers the design and analysis of experiments to discriminate between two Gaussian process models, such as those widely used in computer experiments, kriging, sensor location and machine learning. Two frameworks are considered. First, we study sequential constructions, where successive design (observation) points are selected, either as additional points to an existing design or from the beginning of observation. The selection relies on the maximisation of the difference between the symmetric Kullback-Leibler divergences for the two models, which depends on the observations, or on the mean squared error of both models, which does not. Then, we consider static criteria, such as the familiar log-likelihood ratios and the Fréchet distance between the covariance functions of the two models. Other distance-based criteria, simpler to compute than previous ones, are also introduced, for which, considering the framework of approximate design, a necessary condition for the optimality of a design measure is provided. The paper includes a study of the mathematical links between different criteria, and numerical illustrations are provided.
Submitted 21 November, 2022;
originally announced November 2022.
-
Quasi-uniform designs with optimal and near-optimal uniformity constant
Authors:
Luc Pronzato,
Anatoly Zhigljavsky
Abstract:
A design is a collection of distinct points in a given set $X$, which is assumed to be a compact subset of $R^d$, and the mesh-ratio of a design is the ratio of its fill distance to its separation radius. The uniformity constant of a sequence of nested designs is the smallest upper bound for the mesh-ratios of the designs. We derive a lower bound on this uniformity constant and show that a simple greedy construction achieves this lower bound. We then extend this scheme to allow more flexibility in the design construction.
Submitted 20 December, 2021;
originally announced December 2021.
-
Validation design I: construction of validation designs via kernel herding
Authors:
Luc Pronzato,
Maria-João Rendas
Abstract:
We construct validation designs $Z_m$ aimed at estimating the integrated squared prediction error of a given design $X_n$. Our approach is based on the minimization of a maximum mean discrepancy for a particular kernel, conditional on $X_n$, so that sequences of nested validation designs can be constructed incrementally by kernel herding. Numerical experiments show that key features for a good validation design are its space-filling properties, in order to fill the holes left by $X_n$ and properly explore the whole design space, and the suitable weighting of its points, since evaluations far from $X_n$ tend to overestimate the global error. A dedicated weighting method, based on a particular kernel, is proposed. Numerical simulations with random functions show the superiority of the method over more traditional validation based on random designs, low-discrepancy sequences, or leave-one-out cross validation.
Submitted 10 December, 2021;
originally announced December 2021.
-
Incremental space-filling design based on coverings and spacings: improving upon low discrepancy sequences
Authors:
Amaya Nogales Gómez,
Luc Pronzato,
Maria-João Rendas
Abstract:
The paper addresses the problem of defining families of ordered sequences $\{x_i\}_{i\in N}$ of elements of a compact subset $X$ of $R^d$ whose prefixes $X_n=\{x_i\}_{i=1}^{n}$, for all orders $n$, have good space-filling properties as measured by the dispersion (covering radius) criterion. Our ultimate aim is the definition of incremental algorithms that generate sequences $X_n$ with small optimality gap, i.e., with a small increase in the maximum distance between points of $X$ and the elements of $X_n$ with respect to the optimal solution $X_n^\star$. The paper is a first step in this direction, presenting incremental design algorithms with proven optimality bound for one-parameter families of criteria based on coverings and spacings that both converge to dispersion for large values of their parameter. The examples presented show that the covering-based method outperforms state-of-the-art competitors, including coffee-house, suggesting that it inherits from its guaranteed 50\% optimality gap.
Submitted 9 June, 2021;
originally announced June 2021.
-
Performance analysis of greedy algorithms for minimising a Maximum Mean Discrepancy
Authors:
Luc Pronzato
Abstract:
We analyse the performance of several iterative algorithms for the quantisation of a probability measure $μ$, based on the minimisation of a Maximum Mean Discrepancy (MMD). Our analysis includes kernel herding, greedy MMD minimisation and Sequential Bayesian Quadrature (SBQ). We show that the finite-sample-size approximation error, measured by the MMD, decreases as $1/n$ for SBQ and also for kernel herding and greedy MMD minimisation when using a suitable step-size sequence. The upper bound on the approximation error is slightly better for SBQ, but the other methods are significantly faster, with a computational cost that increases only linearly with the number of points selected. This is illustrated by two numerical examples, with the target measure $μ$ being uniform (a space-filling design application) and with $μ$ a Gaussian mixture. They suggest that the bounds derived in the paper are overly pessimistic, in particular for SBQ. The sources of this pessimism are identified but seem difficult to counter.
Submitted 28 April, 2022; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Sequential online subsampling for thinning experimental designs
Authors:
Luc Pronzato,
HaiYing Wang
Abstract:
We consider a design problem where experimental conditions (design points $X_i$) are presented in the form of a sequence of i.i.d.\ random variables, generated with an unknown probability measure $μ$, and only a given proportion $α\in(0,1)$ can be selected. The objective is to select good candidates $X_i$ on the fly and maximize a concave function $Φ$ of the corresponding information matrix. The optimal solution corresponds to the construction of an optimal bounded design measure $ξ_α^*\leq μ/α$, with the difficulty that $μ$ is unknown and $ξ_α^*$ must be constructed online. The construction proposed relies on the definition of a threshold $τ$ on the directional derivative of $Φ$ at the current information matrix, the value of $τ$ being fixed by a certain quantile of the distribution of this directional derivative. Combination with recursive quantile estimation yields a nonlinear two-time-scale stochastic approximation method. It can be applied to very long design sequences since only the current information matrix and estimated quantile need to be stored. Convergence to an optimum design is proved. Various illustrative examples are presented.
Submitted 4 August, 2020; v1 submitted 1 April, 2020;
originally announced April 2020.
-
Efficient Prediction Designs for Random Fields
Authors:
Werner G. Müller,
Luc Pronzato,
Joao Rendas,
Helmut Waldl
Abstract:
For estimation and predictions of random fields, it is increasingly acknowledged that the kriging variance may be a poor representative of true uncertainty. Experimental designs based on more elaborate criteria that are appropriate for empirical kriging are then often non-space-filling and very costly to determine. In this paper, we investigate the possibility of using a compound criterion inspired by an equivalence theorem type relation to build designs quasi-optimal for the empirical kriging variance, when space-filling designs become unsuitable. Two algorithms are proposed, one relying on stochastic optimization to explicitly identify the Pareto front, while the second uses the surrogate criteria as a local heuristic to choose the points at which the (costly) true empirical kriging variance is effectively computed. We illustrate the performance of the algorithms presented on both a simple simulated example and a real oceanographic dataset.
Submitted 14 May, 2013;
originally announced May 2013.