Infinite Mixtures of Markov Chains

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10785)

Abstract

Facilitating a satisfying user experience requires a detailed understanding of user behavior and intentions. The key is to leverage observations of user activities, usually the clicks performed on Web pages. A common approach is to transform user sessions into Markov chains and analyze them using mixture models. However, model selection and the interpretability of the results are often limiting factors. As a remedy, we present a Bayesian nonparametric approach to group user sessions and derive behavioral patterns. Empirical results on a social network and an electronic textbook show that our approach reliably identifies underlying behavioral patterns and proves more robust than baseline competitors.
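As a rough illustration of the "common approach" mentioned in the abstract, the following minimal sketch turns click sessions into first-order transition counts, estimates a smoothed Markov chain, and scores a session under a finite mixture of such chains. It is not the authors' method: the function names, the page labels, the additive smoothing parameter `alpha`, and the omission of an initial-state distribution are all assumptions made for this example, and the nonparametric part of the paper is not covered.

```python
from collections import Counter
import math


def transition_counts(session):
    """Count first-order transitions (previous page, next page) in one session."""
    return Counter(zip(session, session[1:]))


def estimate_chain(sessions, states, alpha=1.0):
    """Estimate transition probabilities with additive smoothing.

    alpha is a hypothetical smoothing constant (an assumption of this sketch).
    """
    counts = Counter()
    for session in sessions:
        counts.update(transition_counts(session))
    chain = {}
    for a in states:
        total = sum(counts[(a, b)] for b in states) + alpha * len(states)
        chain[a] = {b: (counts[(a, b)] + alpha) / total for b in states}
    return chain


def session_log_likelihood(session, chain):
    """Log-probability of the transitions in a session under one chain.

    The initial-state probability is ignored here for brevity.
    """
    return sum(math.log(chain[a][b]) for a, b in zip(session, session[1:]))


def mixture_log_likelihood(session, weights, chains):
    """Log-likelihood under a finite mixture of chains, via log-sum-exp."""
    terms = [
        math.log(w) + session_log_likelihood(session, c)
        for w, c in zip(weights, chains)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Hypothetical page labels and sessions, for illustration only.
states = ["home", "profile", "feed"]
sessions = [["home", "feed", "feed"], ["home", "profile"]]
chain = estimate_chain(sessions, states)
```

In a finite mixture, the number of chains has to be fixed in advance, which is the model selection issue the abstract raises; a Bayesian nonparametric prior avoids committing to that number up front.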

Notes

  1. We focus on first-order dependencies, but the approach generalizes easily to higher-order models, although the notation quickly becomes unwieldy.

Acknowledgements

This research has been funded in part by the German Science Foundation (DFG) under grant GRK/1907 and by the German Federal Ministry of Education and Science (BMBF) under grant QQM/01LSA1503C.

Author information

Corresponding author

Correspondence to Jan Reubold.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Reubold, J., Boubekki, A., Strufe, T., Brefeld, U. (2018). Infinite Mixtures of Markov Chains. In: Appice, A., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2017. Lecture Notes in Computer Science, vol 10785. Springer, Cham. https://doi.org/10.1007/978-3-319-78680-3_12

  • DOI: https://doi.org/10.1007/978-3-319-78680-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78679-7

  • Online ISBN: 978-3-319-78680-3

  • eBook Packages: Computer Science, Computer Science (R0)
