Continuous monitoring for changepoints in data streams using adaptive estimation

Statistics and Computing

Abstract

Data streams are characterised by a potentially unending sequence of high-frequency observations which are subject to unknown temporal variation. Many modern streaming applications demand the capability to sequentially detect changes as soon as possible after they occur, while continuing to monitor the stream as it evolves. We refer to this problem as continuous monitoring. Sequential algorithms such as CUSUM, EWMA and their more sophisticated variants usually require a pair of parameters to be selected for practical application. However, the choice of parameter values is often based on the anticipated size of the changes and a given choice is unlikely to be optimal for the multiple change sizes which are likely to occur in a streaming data context. To address this critical issue, we introduce a changepoint detection framework based on adaptive forgetting factors that, instead of multiple control parameters, only requires a single parameter to be selected. Simulated results demonstrate that this framework has utility in a continuous monitoring setting. In particular, it reduces the burden of selecting parameters in advance. Moreover, the methodology is demonstrated on real data arising from Foreign Exchange markets.
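To make the parameter burden described above concrete, the following Python sketch shows textbook two-sided CUSUM and EWMA detectors. It is an illustration only and is not the adaptive forgetting factor framework of the article. It assumes a known pre-change mean `mu0` and standard deviation `sigma0`; the function names and the parameter names `k`, `h`, `r` and `L` are conventions chosen for this sketch rather than notation taken from the article.

```python
import math


def cusum_detect(xs, mu0, sigma0, k, h):
    """Two-sided CUSUM. Returns the index of the first alarm, or None.

    k is the reference value (allowance) and h is the decision threshold,
    both in units of sigma0. Together they form the parameter pair.
    """
    s_hi = 0.0
    s_lo = 0.0
    for t, x in enumerate(xs):
        z = (x - mu0) / sigma0
        s_hi = max(0.0, s_hi + z - k)  # accumulates evidence of an upward shift
        s_lo = max(0.0, s_lo - z - k)  # accumulates evidence of a downward shift
        if s_hi > h or s_lo > h:
            return t
    return None


def ewma_detect(xs, mu0, sigma0, r, L):
    """EWMA chart. Returns the index of the first alarm, or None.

    r is the smoothing weight in (0, 1] and L is the control limit width
    in standard deviations of the EWMA statistic.
    """
    z = mu0
    for t, x in enumerate(xs, start=1):
        z = (1.0 - r) * z + r * x
        # Standard deviation of the EWMA statistic after t observations.
        sd = sigma0 * math.sqrt(r / (2.0 - r) * (1.0 - (1.0 - r) ** (2 * t)))
        if abs(z - mu0) > L * sd:
            return t - 1
    return None


if __name__ == "__main__":
    # Toy stream: mean 0 for 50 observations, then an abrupt shift to 3.
    stream = [0.0] * 50 + [3.0] * 10
    print("CUSUM alarm index:", cusum_detect(stream, 0.0, 1.0, k=0.5, h=5.0))
    print("EWMA alarm index:", ewma_detect(stream, 0.0, 1.0, r=0.2, L=3.0))
```

Each detector needs two values fixed in advance, and a pair tuned for one shift size will generally react more slowly to shifts of a different size, which is the difficulty the abstract raises.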

Notes

  1. Note that the empty product has value: \(\prod _{p=N}^{N-1}\lambda _p = 1\).

  2. Note that since it is an offline method it does not make sense to compute performance measures such as ARL0 and ARL1 for comparison with AFF, CUSUM and EWMA.
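The empty-product convention in Note 1 can be checked with a short Python sketch. The weighting scheme used here, in which observation i receives the weight \(\prod _{p=i}^{N-1}\lambda _p\), is an assumption about how the product is used in the full text, which is not reproduced on this page. The function name `ff_weights` is hypothetical.

```python
import math


def ff_weights(lambdas):
    """Weights for observations 1..N given lambdas = [lambda_1, ..., lambda_{N-1}].

    Assumed scheme: the weight on observation i is the product of
    lambda_i, ..., lambda_{N-1}. For i = N the product is empty.
    """
    n = len(lambdas) + 1
    # lambdas[i - 1:] holds lambda_i, ..., lambda_{N-1}; it is empty when i = N,
    # and math.prod of an empty sequence returns 1.
    return [math.prod(lambdas[i - 1:]) for i in range(1, n + 1)]


if __name__ == "__main__":
    print(ff_weights([0.9, 0.8]))
```

Under this assumed scheme, the most recent observation always has weight 1 because its product runs over an empty index range, matching the convention stated in Note 1.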

References

  • Adams, N.M., Tasoulis, D.K., Anagnostopoulos, C., Hand, D.J.: Temporally-adaptive linear classification for handling population drift in credit scoring. In: Lechevallier, Y., Saporta, G. (eds.) COMPSTAT2010, Proceedings of the 19th International Conference on Computational Statistics, pp. 167–176. Springer, Berlin (2010)

  • Aggarwal, C.C. (ed.): Data Streams: Models and Algorithms. Springer, Berlin (2006)

  • Anagnostopoulos, C.: A statistical framework for streaming data analysis. PhD thesis, Imperial College London (2010)

  • Anagnostopoulos, C., Tasoulis, D.K., Adams, N.M., Pavlidis, N.G., Hand, D.J.: Online linear and quadratic discriminant analysis with adaptive forgetting for streaming classification. Stat. Anal. Data Mining 5(2), 139–166 (2012)

  • Apley, D.W., Chin, C.H.: An optimal filter design approach to statistical process control. J. Qual. Technol. 39(2), 93–117 (2007)

  • Appel, U., Brandt, A.V.: Adaptive sequential segmentation of piecewise stationary time series. Inf. Sci. 29(1), 27–56 (1983)

  • Åström, K., Borisson, U., Ljung, L., Wittenmark, B.: Theory and applications of self-tuning regulators. Automatica 13(5), 457–476 (1977)

  • Åström, K.J., Wittenmark, B.: On self tuning regulators. Automatica 9(2), 185–199 (1973)

  • Basseville, M., Nikiforov, I.V.: Detection of Abrupt Changes: Theory and Application. Prentice Hall, Englewood Cliffs (1993)

  • Bodenham, D.A., Adams, N.M.: Continuous monitoring of a computer network using multivariate adaptive estimation. In: IEEE 13th International Conference on Data Mining Workshops (ICDMW), pp. 311–388 (2013)

  • Bodenham, D.A., Adams, N.M.: Adaptive change detection for relay-like behaviour. In: IEEE Joint Intelligence and Security Informatics Conference (2014)

  • Borkar, V.S.: Stochastic Approximation: A Dynamical Systems Viewpoint. Cambridge University Press, Cambridge (2008)

  • Capizzi, G., Masarotto, G.: An adaptive exponentially weighted moving average control chart. Technometrics 45(3), 199–207 (2003)

  • Capizzi, G., Masarotto, G.: Self-starting CUSCORE control charts for individual multivariate observations. J. Qual. Technol. 42(2), 136–152 (2010)

  • Capizzi, G., Masarotto, G.: Adaptive generalized likelihood ratio control charts for detecting unknown patterned mean shifts. J. Qual. Technol. 44(4), 281–303 (2012)

  • Choi, S.W., Martin, E.B., Morris, A.J., Lee, I.B.: Adaptive multivariate statistical process control for monitoring time-varying processes. Ind. Eng. Chem. Res. 45(9), 3108–3118 (2006)

  • Fortescue, T., Kershenbaum, L., Ydstie, B.: Implementation of self-tuning regulators with variable forgetting factors. Automatica 17(6), 831–835 (1981)

  • Fraker, S.E., Woodall, W.H., Mousavi, S.: Performance metrics for surveillance schemes. Qual. Eng. 20(4), 451–464 (2008)

  • Frisén, M.: Statistical surveillance. Optimality and methods. Int. Stat. Rev. 71(2), 403–434 (2003)

  • Gama, J.: Knowledge Discovery from Data Streams. Chapman & Hall, Boca Raton (2010)

  • German, R.R., Lee, L.M., Horan, J.M., Milstein, R.L., Pertowski, C.A., Waller, M.N.: Updated guidelines for evaluating public health surveillance systems. Morb. Mortal. Wkly. Rep. 50, 1–35 (2001)

  • Gustafsson, F.: Adaptive Filtering and Change Detection. Wiley, New York (2000)

  • Hawkins, D.M.: Self-starting Cusum charts for location and scale. J. R. Stat. Soc. Ser. D 36(4), 299–316 (1987)

  • Hawkins, D.M.: Cumulative sum control charting: an underutilized SPC tool. Qual. Eng. 5(3), 463–477 (1993)

  • Hawkins, D.M., Qiu, P., Chang, W.K.: The changepoint model for statistical process control. J. Qual. Technol. 35(4), 355–366 (2003)

  • Haykin, S.: Adaptive Filter Theory. Prentice-Hall, Upper Saddle River (2002)

  • Jensen, W.A., Jones-Farmer, L.A., Champ, C.W., Woodall, W.H., et al.: Effects of parameter estimation on control chart properties: a literature review. J. Qual. Technol. 38(4), 349–364 (2006)

  • Jiang, W., Shu, W., Apley, D.W.: Adaptive cusum procedures with EWMA-based shift estimators. IIE Trans. 40(10), 992–1003 (2008)

  • Jones, L.A.: The statistical design of EWMA control charts with estimated parameters. J. Qual. Technol. 34(3), 277–288 (2002)

  • Jones, L.A., Champ, C.W., Rigdon, S.E.: The performance of exponentially weighted moving average charts with estimated parameters. Technometrics 43(2), 156–167 (2001)

  • Jones, L.A., Champ, C.W., Rigdon, S.E.: The run length distribution of the CUSUM with estimated parameters. J. Qual. Technol. 36(1), 95–108 (2004)

  • Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic Eng. 82(1), 35–45 (1960)

  • Kifer, D., Ben-David, S., Gehrke, J.: Detecting change in data streams. In: Proceedings of the 13th international conference on Very large data bases-Volume 30, VLDB Endowment, pp. 180–191 (2004)

  • Killick, R., Eckley, I.A.: Changepoint: An R Package for Changepoint Analysis. Lancaster University, Lancaster (2011)

  • Killick, R., Fearnhead, P., Eckley, I.A.: Optimal detection of changepoints with a linear computational cost. J. Am. Stat. Assoc. 107(500), 1590–1598 (2012)

  • Lorden, G.: Procedures for reacting to a change in distribution. Ann. Math. Stat. 42(6), 1897–1908 (1971)

  • Lucas, J.M.: The design and use of V-mask control schemes. J. Qual. Technol. 8, 1–11 (1976)

  • Lucas, J.M., Saccucci, M.S.: Exponentially weighted moving average control schemes: properties and enhancements. Technometrics 32(1), 1–12 (1990)

  • Maboudou-Tchao, E.M., Hawkins, D.M.: Detection of multiple change-points in multivariate data. J. Appl. Stat. 40(9), 1979–1995 (2013)

  • Moustakides, G.V.: Optimal stopping times for detecting changes in distributions. Ann. Stat. 14(4), 1379–1387 (1986)

  • Page, E.: Continuous inspection schemes. Biometrika 41(1/2), 100–115 (1954)

  • Pavlidis, N.G., Tasoulis, D.K., Adams, N.M., Hand, D.J.: lambda-perceptron: an adaptive classifier for data streams. Pattern Recogn. 44(1), 78–96 (2011)

  • Roberts, S.W.: Control chart tests based on geometric moving averages. Technometrics 1(3), 239–250 (1959)

  • Ross, G.J., Adams, N.M., Tasoulis, D.K.: Nonparametric monitoring of data streams for changes in location and scale. Technometrics 53(4), 379–389 (2011)

  • Sullivan, J.H.: Detection of multiple change points from clustering individual observations. J. Qual. Technol. 34(4), 371–383 (2002)

  • Tsung, F., Wang, T.: Adaptive charting techniques: literature review and extensions. In: Lenz, H. (ed.) Frontiers in Statistical Quality Control, vol. 9, pp. 19–35. Springer, Berlin (2010)

  • Wickham, H.: ggplot2: Elegant Graphics for Data Analysis. Springer, New York (2009)

  • Xie, Y., Siegmund, D.: Sequential multi-sensor change-point detection. Ann. Stat. 41(2), 670–692 (2013)

Acknowledgments

The work of Dean Bodenham was fully supported by a Roth Studentship provided by the Department of Mathematics, Imperial College, London. The authors would like to thank C. Anagnostopoulos, D. J. Hand, N. A. Heard, G. J. Ross, W. H. Woodall and the three anonymous referees for their helpful comments which improved the manuscript. All figures were created in R using the ggplot2 package (Wickham 2009). Finally, we note that an R package ffstream implementing the AFF algorithm is in preparation.

Author information

Corresponding author

Correspondence to Dean A. Bodenham.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1006 KB)

About this article

Cite this article

Bodenham, D.A., Adams, N.M. Continuous monitoring for changepoints in data streams using adaptive estimation. Stat Comput 27, 1257–1270 (2017). https://doi.org/10.1007/s11222-016-9684-8

  • DOI: https://doi.org/10.1007/s11222-016-9684-8