research-article

Size and accuracy in model inference

Authors:

Yehonatan Yulazari

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering

Pages 887 - 898

https://doi.org/10.1109/ASE.2019.00087

Published: 07 February 2020

Abstract

Many works infer finite-state models from execution logs. Large models are more accurate but also more difficult to present and understand. Small models are easier to present and understand but are less accurate.

In this work we investigate the tradeoff between model size and accuracy in the context of the classic k-Tails model inference algorithm. First, we define mk-Tails, a generalization of k-Tails from one to many parameters, which enables fine-grained control over the tradeoff. Second, we extend mk-Tails with a reduction based on past-equivalence, which effectively reduces the size of the model without decreasing its accuracy.
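For readers unfamiliar with the baseline, the following is a minimal Python sketch of classic single-parameter k-Tails in one common formulation: build a prefix tree from the traces, then merge states that share the same set of futures of length at most k. It is not the mk-Tails algorithm or the past-equivalence reduction described above. The function names (`build_prefix_tree`, `k_future`, `k_tails`) and the toy log are illustrative assumptions, not taken from the paper.

```python
def build_prefix_tree(traces):
    """Build a prefix tree acceptor: transitions[state][event] -> state, root is 0."""
    transitions = {0: {}}
    accepting = set()
    for trace in traces:
        state = 0
        for event in trace:
            if event not in transitions[state]:
                new_state = len(transitions)
                transitions[new_state] = {}
                transitions[state][event] = new_state
            state = transitions[state][event]
        accepting.add(state)  # a trace ends here
    return transitions, accepting


def k_future(state, transitions, accepting, k):
    """Event sequences of length k from `state`, plus shorter ones that end a trace."""
    futures = set()
    stack = [(state, ())]
    while stack:
        current, path = stack.pop()
        if len(path) == k or current in accepting:
            futures.add(path)
        if len(path) < k:
            for event, nxt in transitions[current].items():
                stack.append((nxt, path + (event,)))
    return frozenset(futures)


def k_tails(traces, k):
    """Merge prefix-tree states with identical k-futures.

    Returns (states, transitions, initial, accepting) of the merged model,
    which may be nondeterministic.
    """
    transitions, accepting = build_prefix_tree(traces)
    block_ids = {}  # k-future signature -> merged state id
    block_of = {}   # prefix-tree state -> merged state id
    for state in transitions:
        signature = k_future(state, transitions, accepting, k)
        block_of[state] = block_ids.setdefault(signature, len(block_ids))
    merged = {
        (block_of[src], event, block_of[dst])
        for src, out in transitions.items()
        for event, dst in out.items()
    }
    states = set(block_of.values())
    return states, merged, block_of[0], {block_of[s] for s in accepting}


if __name__ == "__main__":
    # Hypothetical toy log with two traces that share the middle event "b".
    toy_log = [["a", "b", "c"], ["d", "b", "e"]]
    for k in (0, 1, 2):
        states, merged, _, _ = k_tails(toy_log, k)
        print(f"k={k}: {len(states)} states, {len(merged)} transitions")
```

On this toy log, k = 2 yields 6 states and keeps the two traces apart, k = 1 yields 5 states but also accepts the unseen trace a, b, e, and k = 0 collapses everything into a single state. This illustrates, on a very small scale, the size versus accuracy tradeoff that a single parameter controls only coarsely.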

We implemented our work and evaluated its performance and effectiveness on real-world logs as well as on models and generated logs from the literature.




Published In

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering

November 2019

1333 pages

ISBN:9781728125084

General Chair: Thomas Zimmermann, Microsoft Research

Program Chairs: Julia Lawall, Inria/LIP6, France; Darko Marinov, University of Illinois at Urbana-Champaign

In-Cooperation

IEEE CS

Publisher

IEEE Press

Conference

ASE '19: 34th IEEE/ACM International Conference on Automated Software Engineering

November 10 - 15, 2019

San Diego, California

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
