DOI: 10.1007/978-3-030-58666-9_18
Article

Enhancing Event Log Quality: Detecting and Quantifying Timestamp Imperfections

Published: 13 September 2020

Abstract

Timestamp information recorded in event logs plays a crucial role in uncovering meaningful insights into business process performance and behaviour via Process Mining techniques. Inaccurate or incomplete timestamps may cause activities in a business process to be ordered incorrectly, leading to unrepresentative process models and incorrect process performance analysis results. Thus, the quality of timestamps in an event log should be evaluated thoroughly before the log is used as input for any Process Mining activity. To the best of our knowledge, research on the (automated) quality assessment of event logs remains scarce. Our work presents an automated approach for detecting and quantifying timestamp-related issues (timestamp imperfections) in an event log. We define 15 metrics related to timestamp quality across two axes: four levels of abstraction (event, activity, trace, log) and four quality dimensions (accuracy, completeness, consistency, uniqueness). We adopted the design science research paradigm and drew from knowledge related to data quality as well as event log quality. The approach has been implemented as a prototype within the open-source Process Mining framework ProM and evaluated using three real-life event logs and involving experts from practice. This approach paves the way for a systematic and interactive enhancement of timestamp imperfections during the data pre-processing phase of Process Mining projects.
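To make the idea of a timestamp quality metric concrete, the following is a minimal sketch and not the authors' ProM implementation. The abstract does not give the definitions of the 15 metrics, so the two functions below (`completeness` at log level, `uniqueness` at trace level), the toy log, and the scoring formulas are all hypothetical illustrations of what a check along one quality dimension and one abstraction level might look like.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical toy representation: case id -> list of (activity, timestamp or None).
Event = Tuple[str, Optional[datetime]]
Log = Dict[str, List[Event]]


def completeness(log: Log) -> float:
    """Assumed log-level score: share of events that carry a timestamp."""
    events = [event for trace in log.values() for event in trace]
    if not events:
        return 1.0
    return sum(ts is not None for _, ts in events) / len(events)


def uniqueness(trace: List[Event]) -> float:
    """Assumed trace-level score: share of distinct values among recorded timestamps."""
    stamps = [ts for _, ts in trace if ts is not None]
    if not stamps:
        return 1.0
    return len(set(stamps)) / len(stamps)


toy_log: Log = {
    "case-1": [
        ("register", datetime(2020, 1, 1, 9, 0)),
        ("check", datetime(2020, 1, 1, 9, 0)),  # same timestamp as "register"
        ("decide", None),                       # missing timestamp
    ],
    "case-2": [
        ("register", datetime(2020, 1, 2, 10, 0)),
        ("check", datetime(2020, 1, 2, 10, 30)),
    ],
}

print(completeness(toy_log))          # 4 of 5 events are timestamped
print(uniqueness(toy_log["case-1"]))  # 1 distinct value among 2 timestamps
```

Scores closer to 1 indicate fewer imperfections under these assumed definitions. Two events sharing a timestamp in one trace leave their order ambiguous, which is the kind of issue the abstract says can lead to incorrectly ordered activities.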

Published In

Business Process Management: 18th International Conference, BPM 2020, Seville, Spain, September 13–18, 2020, Proceedings
Sep 2020
556 pages
ISBN: 978-3-030-58665-2
DOI: 10.1007/978-3-030-58666-9

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Process Mining
  2. Event log
  3. Data quality
  4. Timestamps
  5. Data quality assessment

Qualifiers

  • Article

Cited By

  • (2024) Text2EL+: Expert Guided Event Log Enrichment Using Unstructured Text. Journal of Data and Information Quality 16:1 (1-28). DOI: 10.1145/3640018. Online publication date: 6-Mar-2024
  • (2022) Quality-Informed Process Mining: A Case for Standardised Data Quality Annotations. ACM Transactions on Knowledge Discovery from Data 16:5 (1-47). DOI: 10.1145/3511707. Online publication date: 5-Apr-2022
  • (2022) Supporting capacity management decisions in healthcare using data-driven process simulation. Journal of Biomedical Informatics 129:C. DOI: 10.1016/j.jbi.2022.104060. Online publication date: 1-May-2022
  • (2022) Towards interactive event log forensics. Information Systems 109:C. DOI: 10.1016/j.is.2022.102039. Online publication date: 1-Nov-2022
