
A Fast and Efficient Change-Point Detection Framework Based on Approximate $k$-Nearest Neighbor Graphs

Published: 01 January 2022

Abstract

Change-point analysis is thriving in this Big Data era, addressing problems in the many fields where massive data sequences are collected to study complicated phenomena over time. It plays an important role in processing these data by segmenting a long sequence into homogeneous parts for follow-up studies. The task requires a method that can process large datasets quickly and handle various types of changes in high-dimensional data. We propose a new approach that uses approximate $k$-nearest neighbor information from the observations, and we derive an analytic formula to control the type I error. The time complexity of the proposed method is $O(dn(\log n + k \log d) + nk^{2})$ for a length-$n$ sequence of $d$-dimensional data. The test statistic incorporates a useful pattern for moderate- to high-dimensional data, so the proposed method can detect various types of changes in the sequence. The new approach is also asymptotically distribution-free, which facilitates its use by a broader community. We apply the method to fMRI and Neuropixels datasets to illustrate its effectiveness.
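The general idea of scanning a nearest-neighbor graph for a change-point can be illustrated with a minimal sketch. It is not the statistic proposed in the article: it assumes an exact brute-force $k$-NN search in place of the approximate search, and it scores each candidate split by an unstandardized deficit of cross-split edges relative to the count expected under a random permutation of the observations. The function names (`knn_edges`, `cross_edge_deficit`, `estimate_change_point`) and the simulated mean-shift data are hypothetical, chosen only for illustration.

```python
import numpy as np


def knn_edges(x, k):
    """Return directed k-NN edges (src, dst) using exact brute-force search.

    Assumption: exact search costing O(n^2 d), not the approximate search
    that the article relies on for its stated complexity.
    """
    sq = np.sum(x * x, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(dist2, np.inf)  # a point is not its own neighbor
    nbrs = np.argpartition(dist2, k - 1, axis=1)[:, :k]
    src = np.repeat(np.arange(x.shape[0]), k)
    return src, nbrs.ravel()


def cross_edge_deficit(src, dst, n):
    """For each split t = 1..n-1, return expected minus observed cross edges.

    An edge crosses split t when one endpoint is among the first t
    observations and the other is among the remaining n - t.
    """
    lo = np.minimum(src, dst)
    hi = np.maximum(src, dst)
    # Edge (lo, hi) crosses split t iff lo < t <= hi, so add +1 on the
    # interval [lo + 1, hi] with a difference array and a cumulative sum.
    diff = np.zeros(n + 1)
    np.add.at(diff, lo + 1, 1.0)
    np.add.at(diff, hi + 1, -1.0)
    observed = np.cumsum(diff)[1:n]
    t = np.arange(1, n)
    # Under a random permutation, a fixed edge crosses split t with
    # probability 2 t (n - t) / (n (n - 1)).
    expected = len(src) * 2.0 * t * (n - t) / (n * (n - 1))
    return t, expected - observed


def estimate_change_point(x, k=5):
    """Return the split with the largest deficit of cross-split edges."""
    src, dst = knn_edges(x, k)
    t, deficit = cross_edge_deficit(src, dst, x.shape[0])
    return int(t[np.argmax(deficit)])


# Simulated example (hypothetical data): mean shift after observation 100.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 1.0, (100, 10)),
               rng.normal(3.0, 1.0, (100, 10))])
print(estimate_change_point(x))
```

Because the deficit is not divided by its standard deviation, this sketch favors splits near the middle of the sequence and provides no type I error control; a standardized statistic with a calibrated threshold, as the abstract describes, would be needed in practice.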


Published In

IEEE Transactions on Signal Processing, Volume 70, 2022, 2441 pages

Publisher

IEEE Press


Qualifiers

  • Research-article
