An interdisciplinary comparison of sequence modeling methods for next-element prediction

  • Special Section Paper
Software and Systems Modeling

Abstract

Data of a sequential nature arise in many application domains, for example as textual data, DNA sequences, and software execution traces. Different research disciplines have developed methods to learn sequence models from such datasets: (i) in the machine learning field, methods such as (hidden) Markov models and recurrent neural networks have been developed and successfully applied to a wide range of tasks; (ii) in process mining, process discovery methods aim to generate human-interpretable descriptive models; and (iii) in the grammar inference field, the focus is on finding descriptive models in the form of formal grammars. Despite their different focuses, these fields share a common goal: learning a model that accurately captures the sequential behavior in the underlying data. These sequence models are generative, i.e., they are able to predict which elements are likely to occur after a given incomplete sequence. So far, these fields have developed mainly in isolation from each other, and no comparison exists. This paper presents an interdisciplinary experimental evaluation that compares sequence modeling methods on the task of next-element prediction on four real-life sequence datasets. The results indicate that machine learning methods, which generally do not aim at model interpretability, tend to outperform methods from the process mining and grammar inference fields in terms of accuracy.
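To make the next-element prediction task concrete, the following is a minimal sketch of a first-order Markov model, one of the simplest members of the machine learning family named in the abstract. It is not the implementation evaluated in the paper; the function names and the toy training sequences are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def fit_first_order_markov(sequences):
    """Count how often each element directly follows each other element."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions, prefix):
    """Return a probability distribution over the next element.

    Only the last element of the prefix is used (first-order assumption).
    An empty dict is returned for an empty prefix or an unseen last element.
    """
    if not prefix:
        return {}
    counts = transitions.get(prefix[-1])
    if not counts:
        return {}
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}


# Toy training data (hypothetical, not one of the paper's datasets).
training = ["abcd", "abd", "acd"]
model = fit_first_order_markov(training)

# Distribution over the element that follows the incomplete sequence "ab".
distribution = predict_next(model, "ab")
```

Given the incomplete sequence "ab", the sketch looks only at the final "b" and returns the relative frequencies of the elements that followed "b" in the training data. Higher-order Markov models, hidden Markov models, and recurrent neural networks condition on more of the prefix.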


Notes

  1. Note that, in the general sense, we are always able to map complex alpha-numerical string data into single characters.

  2. Note that we postpone methods for actually making predictions with a Petri net to Sect. 4. This allows us to separate the discussion of existing algorithms (this section) from the part where we introduce novel methods (Sect. 4).

  3. Alternatively, we are able to store a distribution of labels directly in correspondence with marking m. However, in such a case, the predictor allows us to predict labels which are in fact not described by the process model in the corresponding marking.

  4. https://svn.win.tue.nl/repos/prom/Packages/SequencePredictionWithPetriNets/.

  5. As proven in [83].

  6. The predictors based on process discovery methods are especially computationally expensive, as each prediction requires computing a prefix-alignment on a Petri net.

  7. https://doi.org/10.4121/uuid:a07386a5-7be3-4367-9535-70bc9e77dbe6.

  8. https://doi.org/10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f.

  9. https://doi.org/10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460.

  10. https://doi.org/10.4121/uuid:60383406-ffcd-441f-aa5e-4ec763426b76.

  11. http://babelfish.arc.nasa.gov/hg/jpf/jpf-statechart.

  12. https://keras.io.

  13. https://github.com/TaXxER/rnnalpha.

  14. https://CRAN.R-project.org/package=hmm.discnp.

  15. http://pm4py.org/.

  16. https://svn.win.tue.nl/repos/prom/Packages/SequencePredictionWithPetriNets/.
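
Note 1 states that complex alphanumerical labels can always be mapped to single characters. A minimal sketch of one such encoding follows. The function names, the example labels, and the choice of starting code point are assumptions made for illustration and are not taken from the paper.

```python
def build_symbol_map(sequences, first_code_point=0x4E00):
    """Assign each distinct label a unique single character.

    The starting code point is an arbitrary choice (assumption): it lies in
    a large contiguous Unicode block, so many labels fit without collisions.
    """
    symbol_of = {}
    for seq in sequences:
        for label in seq:
            if label not in symbol_of:
                symbol_of[label] = chr(first_code_point + len(symbol_of))
    return symbol_of


def encode(seq, symbol_of):
    """Turn a sequence of labels into a string with one character per label."""
    return "".join(symbol_of[label] for label in seq)


def decode(text, symbol_of):
    """Invert encode(): map each character back to its original label."""
    label_of = {symbol: label for label, symbol in symbol_of.items()}
    return [label_of[ch] for ch in text]


# Hypothetical event labels, for illustration only.
traces = [
    ["register request", "check ticket", "decide"],
    ["register request", "decide"],
]
symbol_of = build_symbol_map(traces)
encoded = [encode(trace, symbol_of) for trace in traces]
```

Because the mapping is one-to-one, `decode` recovers the original labels, so a model trained on the encoded strings can have its predictions translated back.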

References

  1. Adriansyah, A.: Aligning observed and modeled behavior. Ph.D. thesis, Eindhoven University of Technology (2014)

  2. Ajmone Marsan, M., Conte, G., Balbo, G.: A class of generalized stochastic Petri nets for the performance evaluation of multiprocessor systems. ACM Trans. Comput. Syst. 2(2), 93–122 (1984)

  3. Alizadeh, M., de Leoni, M., Zannone, N.: History-based construction of log-process alignments for conformance checking: discovering what really went wrong. In: Data-Driven Process Discovery and Analysis, pp. 1–15. Springer, Berlin (2014)

  4. Alizadeh, M., de Leoni, M., Zannone, N.: Constructing probable explanations of nonconformity: a data-aware and history-based approach. In: Proceedings of the IEEE Symposium Series on Computational Intelligence, pp. 1358–1365. IEEE (2015)

  5. Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987)

  6. Arrivault, D., Benielli, D., Denis, F., Eyraud, R.: Scikit-SpLearn: a toolbox for the spectral learning of weighted automata compatible with scikit-learn. In: Conférence francophone sur l’Apprentissage Automatique (2017)

  7. Augusto, A., Conforti, R., Dumas, M., La Rosa, M.: Split miner: discovering accurate and simple business process models from event logs. In: IEEE International Conference on Data Mining (ICDM), pp. 1–10. IEEE (2017)

  8. Augusto, A., Conforti, R., Dumas, M., La Rosa, M., Maggi, F.M., Marrella, A., Mecella, M., Soo, A.: Automated discovery of process models from event logs: review and benchmark. IEEE Trans. Knowl. Data Eng. 31, 686–705 (2018)

  9. Balle, B., Carreras, X., Luque, F.M., Quattoni, A.: Spectral learning of weighted automata. Mach. Learn. 96(1–2), 33–63 (2014)

  10. Balle, B., Eyraud, R., Luque, F.M., Quattoni, A., Verwer, S.: Results of the sequence prediction challenge (SPiCe): a competition on learning the next symbol in a sequence. In: International Conference on Grammatical Inference (ICGI), pp. 132–136. Springer, Berlin (2017)

  11. Bengio, Y., Grandvalet, Y.: No unbiased estimator of the variance of k-fold cross-validation. J. Mach. Learn. Res. 5(Sep), 1089–1105 (2004)

  12. Bergstra, J.S., Bardenet, R., Bengio, Y., Kégl, B.: Algorithms for hyper-parameter optimization. In: Shawe-Taylor, J., Zemel, R.S., Bartlett, P.L., Pereira, F., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems (NIPS), pp. 2546–2554 (2011)

  13. Berti, A., van Zelst, S.J., van der Aalst, W.M.P.: Process mining for Python (PM4Py): bridging the gap between process- and data science. In: Proceedings of the ICPM Demo Track 2019, Co-located with 1st International Conference on Process Mining (ICPM 2019), Aachen, Germany, June 24–26, 2019, pp. 13–16 (2019)

  14. Bojar, O., Buck, C., Federmann, C., Haddow, B., Koehn, P., Leveling, J., Monz, C., Pecina, P., Post, M., Saint-Amand, H., et al.: Findings of the workshop on statistical machine translation. In: Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 12–58 (2014)

  15. Boulanger-Lewandowski, N., Bengio, Y., Vincent, P.: Modeling temporal dependencies in high-dimensional sequences: application to polyphonic music generation and transcription. In: Proceedings of the International Conference on Machine Learning (ICML)

  16. Breuker, D., Matzner, M., Delfmann, P., Becker, J.: Comprehensible predictive models for business processes. MIS Q. 40(4), 1009–1034 (2016)

  17. Brier, G.W.: Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78(1), 1–3 (1950)

  18. Buijs, J.C.A.M.: Receipt phase of an environmental permit application process (‘WABO’). CoSeLoG project. (2014). https://doi.org/10.4121/UUID:A07386A5-7BE3-4367-9535-70BC9E77DBE6

  19. Buijs, J.C.A.M., Van Dongen, B.F., van der Aalst, W.M.P.: On the role of fitness, precision, generalization and simplicity in process discovery. In: OTM Confederated International Conferences On the Move to Meaningful Internet Systems, pp. 305–322. Springer, Berlin (2012)

  20. Buijs, J.C.A.M., van Dongen, B.F., van der Aalst, W.M.P.: Quality dimensions in process discovery: the importance of fitness, precision, generalization and simplicity. Int. J. Cooper. Inf. Syst. 23(01), 1440001 (2014)

  21. Carmona, J., de Leoni, M., Depaire, B., Jouck, T.: Summary of the process discovery contest 2016. In: Proceedings of the Business Process Management Workshops, Springer, Berlin (2016)

  22. Ceci, M., Lanotte, P.F., Fumarola, F., Cavallo, D.P., Malerba, D.: Completion time and next activity prediction of processes using sequential pattern mining. In: International Conference on Discovery Science, pp. 49–61. Springer, Berlin (2014)

  23. Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Conference on Empirical Methods in Natural Language Processing (EMNLP), ACL (2014)

  24. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. In: NIPS Deep Learning and Representation Learning Workshop (2014)

  25. Clark, A.: Learning deterministic context free grammars: the Omphalos competition. Mach. Learn. 66(1), 93–110 (2007)

  26. De La Higuera, C.: A bibliographical study of grammatical inference. Pattern Recognit. 38(9), 1332–1348 (2005)

  27. De la Higuera, C.: Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, Cambridge (2010)

  28. Di Francescomarino, C., Dumas, M., Maggi, F.M., Teinemaa, I.: Clustering-based predictive process monitoring. IEEE Trans. Serv. Comput. 12(6), 896–909 (2016)

  29. Dunning, T.: Statistical identification of language. Computing Research Laboratory, New Mexico State University, Las Cruces (1994)

  30. Evermann, J., Rehse, J.R., Fettke, P.: Predicting process behaviour using deep learning. Decis. Support Syst. 100, 129–140 (2017)

  31. Gagniuc, P.A.: Markov Chains: From Theory to Implementation and Experimentation. Wiley, New York (2017)

  32. Gold, E.M.: Complexity of automaton identification from given data. Inf. Control 37(3), 302–320 (1978)

  33. Gopalratnam, K., Cook, D.J.: Online sequential prediction via incremental parsing: the active LeZi algorithm. IEEE Intell. Syst. 22(1), 52–58 (2007)

  34. Gueniche, T., Fournier-Viger, P., Tseng, V.S.: Compact prediction tree: a lossless model for accurate sequence prediction. In: Proceedings of the International Conference on Advanced Data Mining and Applications (ADMA), pp. 177–188. Springer, Berlin (2013)

  35. Gueniche, T., Fournier-Viger, P., Raman, R., Tseng, V.S.: CPT+: decreasing the time/space complexity of the compact prediction tree. In: Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pp. 625–636. Springer, Berlin (2015)

  36. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  37. Hopfield, J.J.: Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. 79(8), 2554–2558 (1982)

  38. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference for Learning Representations (ICLR) (2015)

  39. Koorneef, M., Solti, A., Leopold, H., Reijers, H.A.: Automatic root cause identification using most probable alignments. In: Proceedings of the Business Process Management Workshops, pp. 204–215. Springer, Berlin (2017)

  40. Lakshmanan, G.T., Shamsi, D., Doganata, Y.N., Unuvar, M., Khalaf, R.: A Markov prediction model for data-driven semi-structured business processes. Knowl. Inf. Syst. 42(1), 97–126 (2015)

  41. Lang, K.J., Pearlmutter, B.A., Price, R.A.: Results of the Abbadingo One DFA learning competition and a new evidence-driven state merging algorithm. In: International Colloquium on Grammatical Inference (ICGI), pp. 1–12. Springer, Berlin (1998)

  42. Leemans, M., van der Aalst, W.M.P., van den Brand, M.G.J.: Recursion aware modeling and discovery for hierarchical software event log analysis. In: 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 185–196. IEEE (2018a)

  43. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs - a constructive approach. In: International Conference on Applications and Theory of Petri Nets and Concurrency (PETRI NETS), pp. 311–329. Springer, Berlin (2013a)

  44. Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs containing infrequent behaviour. In: International Conference on Business Process Management (BPM), pp. 66–78. Springer, Berlin (2013b)

  45. Leemans, S.J.J., Tax, N., ter Hofstede, A.H.M.: Indulpet miner: combining discovery algorithms. In: OTM Confederated International Conferences on the Move to Meaningful Internet Systems, pp. 97–115. Springer, Berlin (2018b)

  46. Logan, B., Chu, S.: Music summarization using key phrases. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. II749–II752. IEEE (2000)

  47. Mannhardt, F., Blinde, D.: Analyzing the trajectories of patients with sepsis using process mining. In: RADAR+ EMISA, CEUR-ws.org, vol 1859, pp. 72–80 (2017)

  48. Márquez-Chamorro, A.E., Resinas, M., Ruiz-Cortés, A., Toro, M.: Run-time prediction of business process indicators using evolutionary decision rules. Expert Syst. Appl. 87, 1–14 (2017)

  49. Mehdiyev, N., Evermann, J., Fettke, P.: A multi-stage deep learning approach for business process event prediction. In: IEEE Conference on Business Informatics (CBI), vol. 1, pp. 119–128. IEEE (2017)

  50. Object Management Group: Business process model and notation. Technical Report formal/2011-01-03, Object Management Group (2011). https://www.omg.org/spec/BPMN/2.0/

  51. Pika, A., van der Aalst, W.M.P., Fidge, C.J., ter Hofstede, A.H.M., Wynn, M.T.: Predicting deadline transgressions using event logs. In: International Conference on Business Process Management (BPM), pp. 211–216. Springer, Berlin (2012)

  52. Pitkow, J., Pirolli, P.: Mining longest repeating subsequences to predict worldwide web surfing. In: USENIX Symposium on Internet Technologies and Systems, USENIX, pp. 13–26 (1999)

  53. Pravilovic, S., Appice, A., Malerba, D.: Process mining to forecast the future of running cases. In: International Workshop on New Frontiers in Mining Complex Patterns, pp. 67–81. Springer, Berlin (2013)

  54. Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 257–286 (1989)

  55. Rogge-Solti, A., Weske, M.: Prediction of remaining service execution time using stochastic Petri nets with arbitrary firing delays. In: Proceedings of the International Conference on Service-Oriented Computing (ICSOC), pp. 389–403. Springer, Berlin (2013)

  56. Rogge-Solti, A., Weske, M.: Prediction of business process durations using non-Markovian stochastic Petri nets. Inf. Syst. 54, 1–14 (2015)

  57. Rogge-Solti, A., van der Aalst, W.M., Weske, M.: Discovering stochastic Petri nets with arbitrary delay distributions from event logs. In: Proceedings of the International Conference on Business Process Management (BPM), pp. 15–27. Springer, Berlin (2013)

  58. Schonenberg, H., Weber, B., van Dongen, B.F., van der Aalst, W.M.P.: Supporting flexible processes through recommendations based on history. In: Proceedings of the International Conference on Business Process Management (BPM), pp. 51–66 (2008)

  59. Shao, J.: Linear model selection by cross-validation. J. Am. Stat. Assoc. 88(422), 486–494 (1993)

  60. Stanke, M., Waack, S.: Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19(suppl-2), ii215–ii225 (2003)

  61. Tax, N., Sidorova, N., Haakma, R., van der Aalst, W.M.P.: Mining local process models. J. Innov. Digit. Ecosyst. 3(2), 183–196 (2016)

  62. Tax, N., Verenich, I., La Rosa, M., Dumas, M.: Predictive business process monitoring with LSTM neural networks. In: International Conference on Advanced Information Systems Engineering (CAiSE), pp. 477–492. Springer, Berlin (2017)

  63. Tax, N., Sidorova, N., van der Aalst, W.M.P., Haakma, R.: LocalProcessModelDiscovery: Bringing Petri nets to the pattern mining world. In: Proceedings of the International Conference on Applications and Theory of Petri Nets and Concurrency (PETRI NETS), pp. 374–384. Springer, Berlin (2018a)

  64. Tax, N., van Zelst, S.J., Teinemaa, I.: An experimental evaluation of the generalizing capabilities of process discovery techniques and black-box sequence models. In: Proceedings of the International Conference on Enterprise, Business-Process and Information Systems Modeling (BPMDS), pp. 165–180 (2018b)

  65. Teinemaa, I., Dumas, M., Maggi, F.M., Di Francescomarino, C.: Predictive business process monitoring with structured and unstructured data. In: International Conference on Business Process Management (BPM), pp. 401–417. Springer, Berlin (2016)

  66. Unuvar, M., Lakshmanan, G.T., Doganata, Y.N.: Leveraging path information to generate predictions for parallel business processes. Knowl. Inf. Syst. 47(2), 433–461 (2016)

  67. van der Aalst, W.M.P.: Process Mining: Data Science in Action. Springer, Berlin (2016)

  68. van der Aalst, W.M.P., Rubin, V.A., Verbeek, H.M.W., van Dongen, B.F., Kindler, E., Günther, C.W.: Process mining: a two-step approach to balance between underfitting and overfitting. Softw. Syst. Model. 9(1), 87–111 (2010)

  69. van der Aalst, W.M.P., Adriansyah, A., van Dongen, B.F.: Replaying history on process models for conformance checking and performance analysis. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2(2), 182–192 (2012)

  70. van der Werf, J.M.E.M., van Dongen, B.F., Hurkens, C.A.J., Serebrenik, A.: Process discovery using integer linear programming. Fundam. Inform. 94(3–4), 387–412 (2009)

  71. van Dongen, B.F.: BPI challenge 2012 (2012). https://doi.org/10.4121/UUID:3926DB30-F712-4394-AEBC-75976070E91F, https://data.4tu.nl/repository/uuid:3926db30-f712-4394-aebc-75976070e91f

  72. van Dongen, B.F., de Medeiros, A.K.A., Verbeek, H.M.W., Weijters, A.J.M.M., van der Aalst, W.M.P.: The ProM framework: a new era in process mining tool support. In: International Conference on Application and Theory of Petri Nets (PETRI NETS), pp. 444–454. Springer, Berlin (2005)

  73. van Dongen, B.F., Crooy, R.A., van der Aalst, W.M.P.: Cycle time prediction: when will this case finally be finished? In: OTM Confederated International Conferences On the Move to Meaningful Internet Systems, pp. 319–336. Springer, Berlin (2008)

  74. van Dongen, B.F., Carmona, J., Chatain, T.: A unified approach for measuring precision and generalization based on anti-alignments. In: International Conference on Business Process Management (BPM), pp. 39–56. Springer, Berlin (2016)

  75. van den Broucke, S.K.L.M., De Weerdt, J., Vanthienen, J., Baesens, B.: Determining process model precision and generalization with weighted artificial negative events. IEEE Trans. Knowl. Data Eng. 26(8), 1877–1889 (2014)

  76. van der Spoel, S., van Keulen, M., Amrit, C.: Process prediction in noisy data sets: a case study in a Dutch hospital. In: International Symposium on Data-Driven Process Discovery and Analysis (SIMPDA), pp. 60–83. Springer, Berlin (2012)

  77. van Zelst, S.J., Bolt, A., van Dongen, B.F.: Tuning alignment computation: an experimental evaluation. In: Proceedings of the International Workshop on Algorithms and Theories for the Analysis of Event Data (ATAED), pp. 6–20 (2017a)

  78. van Zelst, S.J., van Dongen, B.F., van der Aalst, W.M.P., Verbeek, H.M.W.: Discovering workflow nets using integer linear programming. Computing (2017b)

  79. Verwer, S., Eyraud, R., De La Higuera, C.: PAutomaC: a probabilistic automata and hidden Markov models learning competition. Mach. Learn. 96(1–2), 129–154 (2014)

  80. Weijters, A.J.M.M., Ribeiro, J.T.S.: Flexible heuristics miner (FHM). In: IEEE Symposium on Computational Intelligence and Data Mining (CIDM), pp. 310–317. IEEE (2011)

  81. Weinberger, M.J., Seroussi, G.: Sequential prediction and ranking in universal context modeling and data compression. IEEE Trans. Inf. Theory 43(5), 1697–1706 (1997)

  82. Wirth, N.: What can we do about the unnecessary diversity of notation for syntactic definitions? Commun. ACM 20(11), 822–823 (1977)

  83. Wong, T.T.: Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognit. 48(9), 2839–2846 (2015)

  84. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory 24(5), 530–536 (1978)

Author information

Corresponding author

Correspondence to Niek Tax.

Additional information

Communicated by Rainer Schmidt and Jens Gulden.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tax, N., Teinemaa, I. & van Zelst, S.J. An interdisciplinary comparison of sequence modeling methods for next-element prediction. Softw Syst Model 19, 1345–1365 (2020). https://doi.org/10.1007/s10270-020-00789-3

  • DOI: https://doi.org/10.1007/s10270-020-00789-3

Keywords