Showing 1–27 of 27 results for author: Hodgkinson, L

  1. arXiv:2311.07013  [pdf, ps, other]

    stat.ML cs.LG

    A PAC-Bayesian Perspective on the Interpolating Information Criterion

    Authors: Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

    Abstract: Deep learning is renowned for its theory-practice gap, whereby principled theory typically fails to provide much beneficial guidance for implementation in practice. This has been highlighted recently by the benign overfitting phenomenon: when neural networks become sufficiently large to interpolate the dataset perfectly, model performance appears to improve with increasing model size, in apparent…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: 9 pages

  2. arXiv:2307.07785  [pdf, other]

    stat.ML cs.LG

    The Interpolating Information Criterion for Overparameterized Models

    Authors: Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

    Abstract: The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit, penalizing model size. However, these criteria are not appropriate in modern settings where overparameterized models tend to perform well. For any overparameterized mod…

    Submitted 15 July, 2023; originally announced July 2023.

    Comments: 23 pages, 2 figures

  3. arXiv:2307.02501  [pdf, ps, other]

    stat.ML cs.LG

    Generalization Guarantees via Algorithm-dependent Rademacher Complexity

    Authors: Sarah Sachs, Tim van Erven, Liam Hodgkinson, Rajiv Khanna, Umut Simsekli

    Abstract: Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms. In this context, there exists information theoretic generalization bounds that involve (various forms of) mutual information, as well as bounds based on hypothesis set stability. We propose a conceptually related, but technically distinct complexity measure…

    Submitted 4 July, 2023; originally announced July 2023.

  4. arXiv:2306.09262  [pdf, other]

    stat.ML cs.LG cs.PL

    A Heavy-Tailed Algebra for Probabilistic Programming

    Authors: Feynman Liang, Liam Hodgkinson, Michael W. Mahoney

    Abstract: Despite the successes of probabilistic models based on passing noise through neural networks, recent work has identified that such methods often fail to capture tail behavior accurately, unless the tails of the base distribution are appropriately calibrated. To overcome this deficiency, we propose a systematic approach for analyzing the tails of random variables, and we illustrate how this approac…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 21 pages, 6 figures

  5. arXiv:2305.12313  [pdf, other]

    stat.ML cs.LG

    When are ensembles really effective?

    Authors: Ryan Theisen, Hyunsuk Kim, Yaoqing Yang, Liam Hodgkinson, Michael W. Mahoney

    Abstract: Ensembling has a long history in statistical data analysis, with many impactful applications. However, in many modern machine learning settings, the benefits of ensembling are less ubiquitous and less obvious. We study, both theoretically and empirically, the fundamental question of when ensembling yields significant performance improvements in classification tasks. Theoretically, we prove new res…

    Submitted 20 May, 2023; originally announced May 2023.

  6. arXiv:2210.07612  [pdf, other]

    stat.ML cs.LG

    Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes

    Authors: Liam Hodgkinson, Chris van der Heide, Fred Roosta, Michael W. Mahoney

    Abstract: Despite their importance for assessing reliability of predictions, uncertainty quantification (UQ) measures for machine learning models have only recently begun to be rigorously characterized. One prominent issue is the curse of dimensionality: it is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input d…

    Submitted 25 July, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: 33 pages, 21 figures

  7. arXiv:2205.07918  [pdf, other]

    stat.ML cs.LG

    Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows

    Authors: Feynman Liang, Liam Hodgkinson, Michael W. Mahoney

    Abstract: While fat-tailed densities commonly arise as posterior and marginal distributions in robust models and scale mixtures, they present challenges when Gaussian-based variational inference fails to capture tail decay accurately. We first improve previous theory on tails of Lipschitz flows by quantifying how the tails affect the rate of tail decay and by expanding the theory to non-Lipschitz polynomial…

    Submitted 16 May, 2022; originally announced May 2022.

  8. arXiv:2202.02842  [pdf, other]

    cs.CL cs.LG

    Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data

    Authors: Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney

    Abstract: Selecting suitable architecture parameters and training hyperparameters is essential for enhancing machine learning (ML) model performance. Several recent empirical studies conduct large-scale correlational analysis on neural networks (NNs) to search for effective generalization metrics that can guide this type of model selection. Effective metrics are typically expected to correlate strong…

    Submitted 4 June, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the 29th ACM SIGKDD international conference on knowledge discovery and data mining (2023)

  9. arXiv:2108.00781  [pdf, other]

    stat.ML cs.LG

    Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers

    Authors: Liam Hodgkinson, Umut Şimşekli, Rajiv Khanna, Michael W. Mahoney

    Abstract: Despite the ubiquitous use of stochastic optimization algorithms in machine learning, the precise impact of these algorithms and their dynamics on generalization performance in realistic non-convex settings is still poorly understood. While recent work has revealed connections between generalization and heavy-tailed behavior in stochastic optimization, this work mainly relied on continuous-time ap…

    Submitted 11 July, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: 22 pages, 6 figures

  10. arXiv:2107.11228  [pdf, other]

    cs.LG

    Taxonomizing local versus global structure in neural network loss landscapes

    Authors: Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

    Abstract: Viewing neural network models in terms of their loss landscapes has a long history in the statistical mechanics approach to learning, and in recent years it has received attention within machine learning proper. Among other things, local metrics (such as the smoothness of the loss landscape) have been shown to correlate with global properties of the model (such as good generalization performance).…

    Submitted 12 December, 2021; v1 submitted 23 July, 2021; originally announced July 2021.

    Journal ref: Thirty-fifth Annual Conference on Neural Information Processing Systems, 2021

  11. arXiv:2106.10820  [pdf, other]

    cs.LG stat.ML

    Stateful ODE-Nets using Basis Function Expansions

    Authors: Alejandro Queiruga, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney

    Abstract: The recently-introduced class of ordinary differential equation networks (ODE-Nets) establishes a fruitful connection between deep learning and dynamical systems. In this work, we reconsider formulations of the weights as continuous-in-depth functions using linear combinations of basis functions which enables us to leverage parameter transformations such as function projections. In turn, this view…

    Submitted 6 November, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

    Comments: Accepted at 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  12. arXiv:2102.04877  [pdf, other]

    stat.ML cs.LG math.DS math.PR

    Noisy Recurrent Neural Networks

    Authors: Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney

    Abstract: We provide a general framework for studying recurrent neural networks (RNNs) trained by injecting noise into hidden states. Specifically, we consider RNNs that can be viewed as discretizations of stochastic differential equations driven by input data. This framework allows us to study the implicit regularization effect of general noise injection schemes by deriving an approximate explicit regulari…

    Submitted 1 December, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: 38 pages

    Journal ref: NeurIPS 2021 (https://proceedings.neurips.cc/paper/2021/hash/29301521774ff3cbd26652b2d5c95996-Abstract.html)

  13. arXiv:2006.12070  [pdf, other]

    cs.LG math.DS stat.ML

    Lipschitz Recurrent Neural Networks

    Authors: N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, Michael W. Mahoney

    Abstract: Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory. In turn, this…

    Submitted 23 April, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: Published as a conference paper at ICLR 2021

  14. arXiv:2006.06293  [pdf, other]

    stat.ML cs.LG math.OC math.ST

    Multiplicative noise and heavy tails in stochastic optimization

    Authors: Liam Hodgkinson, Michael W. Mahoney

    Abstract: Although stochastic optimization is central to modern machine learning, the precise mechanisms underlying its success, and in particular, the precise role of the stochasticity, still remain unclear. Modelling stochastic optimization algorithms as discrete random recurrence relations, we show that multiplicative noise, as it commonly arises due to variance in local rates of convergence, results in…

    Submitted 11 June, 2020; originally announced June 2020.

    Comments: 30 pages, 7 figures

  15. arXiv:2002.09547  [pdf, other]

    stat.ML cs.LG

    Stochastic Normalizing Flows

    Authors: Liam Hodgkinson, Chris van der Heide, Fred Roosta, Michael W. Mahoney

    Abstract: We introduce stochastic normalizing flows, an extension of continuous normalizing flows for maximum likelihood estimation and variational inference (VI) using stochastic differential equations (SDEs). Using the theory of rough paths, the underlying Brownian motion is treated as a latent variable and approximated, enabling efficient training of neural SDEs as random neural ordinary differential equ…

    Submitted 25 February, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: 17 pages, 4 figures

  16. arXiv:2001.09266  [pdf, other]

    math.ST stat.ML

    The reproducing Stein kernel approach for post-hoc corrected sampling

    Authors: Liam Hodgkinson, Robert Salomone, Fred Roosta

    Abstract: Stein importance sampling is a widely applicable technique based on kernelized Stein discrepancy, which corrects the output of approximate sampling algorithms by reweighting the empirical distribution of the samples. A general analysis of this technique is conducted for the previously unconsidered setting where samples are obtained via the simulation of a Markov chain, and applies to an arbitrary…

    Submitted 13 September, 2021; v1 submitted 25 January, 2020; originally announced January 2020.

    Comments: 26 pages, 2 figures

    MSC Class: 65C05 (Primary) 60J22; 60B10 (Secondary)

  17. arXiv:1910.03725  [pdf, other]

    math.PR

    Fast approximate simulation of finite long-range spin systems

    Authors: Ross McVinish, Liam Hodgkinson

    Abstract: Tau leaping is a popular method for performing fast approximate simulation of certain continuous time Markov chain models typically found in chemistry and biochemistry. This method is known to perform well when the transition rates satisfy some form of scaling behaviour. In a similar spirit to tau leaping, we propose a new method for approximate simulation of spin systems which approximates the ev…

    Submitted 8 October, 2019; originally announced October 2019.

    MSC Class: 60H35

  18. arXiv:1907.08410  [pdf, other]

    stat.ML cs.LG

    Geometric Rates of Convergence for Kernel-based Sampling Algorithms

    Authors: Rajiv Khanna, Liam Hodgkinson, Michael W. Mahoney

    Abstract: The rate of convergence of weighted kernel herding (WKH) and sequential Bayesian quadrature (SBQ), two kernel-based sampling algorithms for estimating integrals with respect to some target probability measure, is investigated. Under verifiable conditions on the chosen kernel and target measure, we establish a near-geometric rate of convergence for target measures that are nearly atomic. Furthermor…

    Submitted 31 October, 2021; v1 submitted 19 July, 2019; originally announced July 2019.

    Comments: Accepted to UAI 2021 (Oral)

  19. arXiv:1903.12322  [pdf, other]

    stat.ML cs.LG stat.CO

    Implicit Langevin Algorithms for Sampling From Log-concave Densities

    Authors: Liam Hodgkinson, Robert Salomone, Fred Roosta

    Abstract: For sampling from a log-concave density, we study implicit integrators resulting from $\theta$-method discretization of the overdamped Langevin diffusion stochastic differential equation. Theoretical and algorithmic properties of the resulting sampling methods for $\theta \in [0,1]$ and a range of step sizes are established. Our results generalize and extend prior works in several directions. In particular…

    Submitted 10 July, 2021; v1 submitted 28 March, 2019; originally announced March 2019.

  20. arXiv:1801.00542  [pdf, ps, other]

    math.PR

    Normal approximations for discrete-time occupancy processes

    Authors: Liam Hodgkinson, Ross McVinish, Philip K. Pollett

    Abstract: We study normal approximations for a class of discrete-time occupancy processes, namely, Markov chains with transition kernels of product Bernoulli form. This class encompasses numerous models which appear in the complex networks literature, including stochastic patch occupancy models in ecology, network models in epidemiology, and a variety of dynamic random graph models. Bounds on the rate of co…

    Submitted 10 November, 2018; v1 submitted 1 January, 2018; originally announced January 2018.

    Comments: 35 pages. Changed title, revised abstract and introduction, background material moved to appendix

    MSC Class: 60J10 (Primary) 60F05; 60F25; 92D30; 92D40 (Secondary)

  21. arXiv:1406.2688  [pdf, other]

    quant-ph gr-qc hep-th

    Unruh-DeWitt detector response along static and circular geodesic trajectories for Schwarzschild-AdS black holes

    Authors: Keith K. Ng, Lee Hodgkinson, Jorma Louko, Robert B. Mann, Eduardo Martin-Martinez

    Abstract: We present novel methods to numerically address the problem of characterizing the response of particle detectors in curved spacetimes. These methods allow for the integration of the Wightman function, at least in principle, in rather general backgrounds. In particular we will use this tool to further understand the nature of conformal massless scalar Hawking radiation from a Schwarzschild black ho…

    Submitted 15 September, 2014; v1 submitted 10 June, 2014; originally announced June 2014.

    Comments: 13 pages, 12 figures. RevTex 4.1. v2 Updated to published version

    Journal ref: Phys. Rev. D 90, 064003 (2014)

  22. Static detectors and circular-geodesic detectors on the Schwarzschild black hole

    Authors: Lee Hodgkinson, Jorma Louko, Adrian C. Ottewill

    Abstract: We examine the response of an Unruh-DeWitt particle detector coupled to a massless scalar field on the (3+1)-dimensional Schwarzschild spacetime, in the Boulware, Hartle-Hawking and Unruh states, for static detectors and detectors on circular geodesics, by primarily numerical methods. For the static detector, the response in the Hartle-Hawking state exhibits the known thermality at the local Hawki…

    Submitted 5 May, 2014; v1 submitted 12 January, 2014; originally announced January 2014.

    Comments: 53 pages, several figures. v2: correspondence with [28] clarified. v3: improved figures, minor clarifications. Published version

    Journal ref: Phys. Rev. D 89, 104002 (2014)

  23. arXiv:1309.7281  [pdf, ps, other]

    gr-qc

    Particle detectors in curved spacetime quantum field theory

    Authors: Lee Hodgkinson

    Abstract: Unruh-DeWitt particle detector models are studied in a variety of time-dependent and time-independent settings. We work within the framework of first-order perturbation theory and couple the detector to a massless scalar field. The necessity of switching on (off) the detector smoothly is emphasised throughout, and the transition rate is found by taking the sharp-switching limit of the regulator-fr…

    Submitted 15 October, 2013; v1 submitted 27 September, 2013; originally announced September 2013.

    Comments: v2. Ph.D. thesis, University of Nottingham, 232 pages (2013), Advisor: Jorma Louko

  24. arXiv:1208.3165  [pdf, ps, other]

    gr-qc hep-th

    Unruh-DeWitt detector on the BTZ black hole

    Authors: Lee Hodgkinson, Jorma Louko

    Abstract: We examine an Unruh-DeWitt particle detector coupled to a scalar field in three-dimensional curved spacetime, within first-order perturbation theory. We first obtain a causal and manifestly regular expression for the instantaneous transition rate in an arbitrary Hadamard state. We then specialise to the Bañados-Teitelboim-Zanelli black hole and to a massless conformally coupled field in the Hartle…

    Submitted 15 August, 2012; originally announced August 2012.

    Comments: 8 pages, 4 figures. Talk given by L.H. at "Relativity and Gravitation: 100 Years after Einstein in Prague", Prague, 25 June -- 29 June 2012

    Report number: NSF-KITP-12-149

  25. Static, stationary and inertial Unruh-DeWitt detectors on the BTZ black hole

    Authors: Lee Hodgkinson, Jorma Louko

    Abstract: We examine an Unruh-DeWitt particle detector coupled to a scalar field in three-dimensional curved spacetime. We first obtain a regulator-free expression for the transition probability in an arbitrary Hadamard state, working within first-order perturbation theory and assuming smooth switching, and we show that both the transition probability and the instantaneous transition rate remain well define…

    Submitted 4 October, 2012; v1 submitted 10 June, 2012; originally announced June 2012.

    Comments: 31 pages, 28 figures. v3: minor corrections and clarifications

    Report number: NSF-KITP-12-121

    Journal ref: Phys. Rev. D 86, 064031 (2012)

  26. How often does the Unruh-DeWitt detector click beyond four dimensions?

    Authors: Lee Hodgkinson, Jorma Louko

    Abstract: We analyse the response of an arbitrarily-accelerated Unruh-DeWitt detector coupled to a massless scalar field in Minkowski spacetimes of dimensions up to six, working within first-order perturbation theory and assuming a smooth switch-on and switch-off. We express the total transition probability as a manifestly finite and regulator-free integral formula. In the sharp switching limit, the transit…

    Submitted 6 August, 2012; v1 submitted 20 September, 2011; originally announced September 2011.

    Comments: 30 pages. v3: presentational improvement. Published version

    Journal ref: J. Math. Phys. 53, 082301 (2012)

  27. Reinstating the 'no-lose' theorem for NMSSM Higgs discovery at the LHC

    Authors: J. R. Forshaw, J. F. Gunion, L. Hodgkinson, A. Papaefstathiou, A. D. Pilkington

    Abstract: The simplest supersymmetric model that solves the mu problem and in which the GUT-scale parameters need not be finely tuned in order to predict the correct value of the Z boson mass at low scales is the Next-to-Minimal Supersymmetric Standard Model (NMSSM). However, in order that fine tuning be absent, the lightest CP-even Higgs boson h should have mass ~100 GeV and SM couplings to gauge bosons…

    Submitted 27 March, 2008; v1 submitted 20 December, 2007; originally announced December 2007.

    Comments: 23 pages

    Report number: MAN/HEP/2007/44, UCD-07-03

    Journal ref: JHEP 0804:090,2008