Nearest Neighbor Dirichlet Mixtures

Published: 06 March 2024
Abstract

    There is a rich literature on Bayesian methods for density estimation, which characterize the unknown density as a mixture of kernels. Such methods provide uncertainty quantification in estimation while adapting to a rich variety of densities. However, relative to frequentist locally adaptive kernel methods, Bayesian approaches can be slow and unstable to implement, as they rely on Markov chain Monte Carlo algorithms. To maintain most of the strengths of Bayesian approaches without the computational disadvantages, we propose a class of nearest neighbor-Dirichlet mixtures. The approach starts by grouping the data into neighborhoods based on standard algorithms. Within each neighborhood, the density is characterized via a Bayesian parametric model, such as a Gaussian with unknown parameters. Assigning a Dirichlet prior to the weights on these local kernels, we obtain a pseudo-posterior for the weights and kernel parameters. A simple and embarrassingly parallel Monte Carlo algorithm is proposed to sample from the resulting pseudo-posterior for the unknown density. Desirable asymptotic properties are shown, and the methods are evaluated in simulation studies and applied to a motivating data set in the context of classification.
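
    The sketch below illustrates the flavor of the approach in one dimension. It is a minimal, hypothetical implementation, not the authors' exact pseudo-posterior: the neighborhood size k, the Dirichlet concentration alpha, and the conjugate normal-inverse-gamma update within each neighborhood are assumptions made here for concreteness.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import invgamma

def nn_dirichlet_mixture_draw(x, k=10, alpha=1.0, rng=None):
    """One Monte Carlo draw of the random density for 1-D data x.

    Hypothetical sketch: each point anchors a neighborhood of its k
    nearest neighbors, each neighborhood receives a conjugate
    normal-inverse-gamma update, and the n local kernel weights get a
    symmetric Dirichlet pseudo-posterior. These choices are illustrative
    assumptions, not the paper's exact specification.
    """
    rng = np.random.default_rng(rng)
    n = len(x)

    # Group the data into neighborhoods with a standard k-NN query.
    _, idx = cKDTree(x[:, None]).query(x[:, None], k=k)
    nbhd = x[idx]                                 # (n, k) local data sets

    # Conjugate Gaussian updates within each neighborhood.
    m = nbhd.mean(axis=1)
    ss = ((nbhd - m[:, None]) ** 2).sum(axis=1)
    sigma2 = invgamma.rvs(a=k / 2 + 1.0, scale=ss / 2 + 1e-6,
                          random_state=rng)       # local variances
    mu = rng.normal(m, np.sqrt(sigma2 / k))       # local means

    # Dirichlet pseudo-posterior on the weights of the n local kernels.
    w = rng.dirichlet(np.full(n, alpha / n + k))

    def density(t):
        # Weighted mixture of the n sampled Gaussian kernels.
        t = np.atleast_1d(t)[:, None]
        kern = (np.exp(-0.5 * (t - mu) ** 2 / sigma2)
                / np.sqrt(2 * np.pi * sigma2))
        return kern @ w

    return density
```

    Repeated calls give independent draws of the random density; a pointwise average across draws serves as a point estimate, and pointwise quantiles give uncertainty bands. Because draws share no state, the loop over draws is embarrassingly parallel, matching the abstract's description.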



    Published In

The Journal of Machine Learning Research, Volume 24, Issue 1
    January 2023
    18881 pages
ISSN: 1532-4435
EISSN: 1533-7928
    CC-BY 4.0

    Publisher

    JMLR.org

    Publication History

    Published: 06 March 2024
    Accepted: 01 August 2023
    Received: 01 February 2021
    Published in JMLR Volume 24, Issue 1

    Author Tags

    1. Bayesian
    2. density estimation
    3. distributed computing
    4. embarrassingly parallel
    5. kernel density estimation
    6. mixture model
    7. quasi-posterior
    8. scalable

    Qualifiers

    • Research-article
