research-article

Practical and Powerful Kernel-Based Change-Point Detection

Published: 01 January 2024

Abstract

Change-point analysis plays a significant role in various fields by revealing changes in distribution within a sequence of observations. While a number of algorithms have been proposed for high-dimensional data, kernel-based methods have not been well explored because of difficulties in controlling false discoveries and their mediocre performance. In this paper, we propose a new kernel-based framework that makes use of an important pattern of data in high dimensions to boost power. Analytic approximations to the significance of the new statistics are derived, and fast tests based on the asymptotic results are proposed, offering easy off-the-shelf tools for large datasets. The new tests show superior performance for a wide range of alternatives when compared with other state-of-the-art methods. We illustrate these new approaches through an analysis of a phone-call network dataset. All proposed methods are implemented in the R package kerSeg.
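To make the general idea of a kernel-based change-point scan concrete, here is a minimal sketch in Python. It is not the new statistic proposed in the paper and not the kerSeg API. It assumes a generic baseline instead: the biased squared maximum mean discrepancy (MMD) with a Gaussian kernel, evaluated at every candidate split of the sequence. The function names `gaussian_kernel` and `mmd_scan`, the bandwidth, and the `min_seg` parameter are all hypothetical choices for illustration, and no significance calibration is attempted.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_scan(seq, bandwidth=1.0, min_seg=5):
    """Scan all splits t and return (best_t, stats).

    stats[t] is the biased squared MMD between seq[:t] and seq[t:].
    Illustrative baseline only (assumption): no p-value is computed.
    """
    n = len(seq)
    # Compute the kernel (Gram) matrix once and reuse it for every split.
    gram = [[gaussian_kernel(seq[i], seq[j], bandwidth) for j in range(n)]
            for i in range(n)]
    stats = {}
    for t in range(min_seg, n - min_seg + 1):
        left, right = range(t), range(t, n)
        m = n - t
        # Average within-segment and between-segment kernel values.
        k_ll = sum(gram[i][j] for i in left for j in left) / (t * t)
        k_rr = sum(gram[i][j] for i in right for j in right) / (m * m)
        k_lr = sum(gram[i][j] for i in left for j in right) / (t * m)
        stats[t] = k_ll + k_rr - 2.0 * k_lr
    best_t = max(stats, key=stats.get)
    return best_t, stats


# Toy sequence (made up for illustration): 20 points near 0, then 20 near 5.
toy = ([(0.1 * (i % 3),) for i in range(20)]
       + [(5.0 + 0.1 * (i % 3),) for i in range(20)])
estimated_t, scan_stats = mmd_scan(toy, bandwidth=1.0, min_seg=5)
print("estimated change-point:", estimated_t)
```

This brute-force scan recomputes the segment sums at every split, so it costs on the order of n cubed kernel lookups and is only suitable for short sequences. It also says nothing about whether the largest statistic is significant, which is exactly the false-discovery control problem the abstract refers to.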

Published In

IEEE Transactions on Signal Processing, Volume 72, 2024, 5568 pages

Publisher

IEEE Press
