DOI: 10.1145/2884781.2884835

On the techniques we create, the tools we build, and their misalignments: a study of KLEE

Published: 14 May 2016

Abstract

Our community constantly pushes the state-of-the-art by introducing "new" techniques. These techniques often build on top of, and are compared against, existing systems that realize previously published techniques. The underlying assumption is that existing systems correctly represent the techniques they implement. This paper examines that assumption through a study of KLEE, a popular and well-cited tool in our community. We briefly describe six improvements we made to KLEE, none of which can be considered "new" techniques, that provide order-of-magnitude performance gains. Given these improvements, we then investigate how the results and conclusions of a sample of papers that cite KLEE are affected. Our findings indicate that the strong emphasis on introducing "new" techniques may lead to wasted effort, missed opportunities for progress, an accretion of artifact complexity, and questionable research conclusions (in our study, 27% of the papers that depend on KLEE can be questioned). We conclude by revisiting initiatives that may help to realign the incentives to better support the foundations on which we build.
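
For context, KLEE symbolically executes LLVM bitcode and generates one concrete test case per explored program path. The sketch below is not drawn from the paper or from the improvements it describes; it is a minimal, illustrative example of how KLEE is typically driven, in the style of the tool's standard tutorial. File names, include paths, and command-line flags are assumptions and vary across KLEE and LLVM versions.

    /* get_sign.c -- minimal illustrative KLEE target (not from the paper). */
    #include <klee/klee.h>

    int get_sign(int x) {
        if (x == 0)
            return 0;
        if (x < 0)
            return -1;
        return 1;
    }

    int main(void) {
        int a;
        /* Mark 'a' as symbolic so KLEE treats it as an unconstrained input
         * and forks execution at each branch that depends on it. */
        klee_make_symbolic(&a, sizeof(a), "a");
        return get_sign(a);
    }

A typical (version-dependent) run compiles the program to LLVM bitcode and then executes it under KLEE, for example clang -I <klee-include-dir> -emit-llvm -c -g -O0 get_sign.c followed by klee get_sign.bc; for this example KLEE explores three paths (x == 0, x < 0, x > 0) and emits one test case for each.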




Reviews

Richard John Botting

Software maintenance research seems to get no respect, and it is a minor miracle that the academic publish-or-perish system produced the data in this paper. This data indicates that ignoring maintenance distorts some academic research. The authors took an open-source software tool (the KLEE test generator) and applied half a dozen fairly obvious bug fixes and tweaks. The resulting version performed ten times faster! They then looked at the corpus of 100 research papers that mention the tool. None mentioned the researchers' fixes, which does not surprise me, because novelty is rewarded more than repair. Next, they tried to see whether the published results would have changed if the maintenance had been done first. Seventy-four papers referred to KLEE without modifying it. Twelve papers were robust enough not to need replication. In two papers it was possible to replicate the research using the properly maintained code, and in six more they could approximately duplicate it. In seven of these eight cases, the conclusions would have changed. In other words, doing maintenance before pursuing a novel change would have been a good idea. The paper makes some recommendations: reward the publication and review of artifacts as well as papers, add special conference tracks for maintenance, provide institutional support for maintaining software, and so on. The authors studied only KLEE; I suspect similar results hold elsewhere. It would be good if other groups could replicate the authors' methodology on other published code.

Online Computing Reviews Service


Published In

ICSE '16: Proceedings of the 38th International Conference on Software Engineering
May 2016
1235 pages
ISBN: 9781450339001
DOI: 10.1145/2884781

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. replication
  2. research incentives
  3. research tools and infrastructure

Qualifiers

  • Research-article

Conference

ICSE '16

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%



Cited By

  • (2024) A Transferability Study of Interpolation-Based Hardware Model Checking for Software Verification. Proceedings of the ACM on Software Engineering, 1(FSE): 2028-2050. DOI: 10.1145/3660797. Online publication date: 12-Jul-2024.
  • (2024) DarthShader: Fuzzing WebGPU Shader Translators & Compilers. Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security, 690-704. DOI: 10.1145/3658644.3690209. Online publication date: 2-Dec-2024.
  • (2024) Concrete Constraint Guided Symbolic Execution. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-12. DOI: 10.1145/3597503.3639078. Online publication date: 20-May-2024.
  • (2024) Netfuzzlib: Adding First-Class Fuzzing Support to Network Protocol Implementations. Computer Security – ESORICS 2024, 65-84. DOI: 10.1007/978-3-031-70890-9_4. Online publication date: 16-Sep-2024.
  • (2023) Continuously Accelerating Research. Proceedings of the 45th International Conference on Software Engineering: New Ideas and Emerging Results, 123-128. DOI: 10.1109/ICSE-NIER58687.2023.00028. Online publication date: 17-May-2023.
  • (2023) UnitTestBot: Automated Unit Test Generation for C Code in Integrated Development Environments. Proceedings of the 45th International Conference on Software Engineering: Companion Proceedings, 380-384. DOI: 10.1109/ICSE-Companion58688.2023.00107. Online publication date: 14-May-2023.
  • (2023) Building an open-source system test generation tool: lessons learned and empirical analyses with EvoMaster. Software Quality Journal, 31(3): 947-990. DOI: 10.1007/s11219-023-09620-w. Online publication date: 6-Mar-2023.
  • (2022) Conditional Quantitative Program Analysis. IEEE Transactions on Software Engineering, 48(4): 1212-1227. DOI: 10.1109/TSE.2020.3016778. Online publication date: 1-Apr-2022.
  • (2021) Input Test Suites for Program Repair: A Novel Construction Method Based on Metamorphic Relations. IEEE Transactions on Reliability, 70(1): 285-303. DOI: 10.1109/TR.2020.3003313. Online publication date: Mar-2021.
  • (2021) Fuzzing: Challenges and Reflections. IEEE Software, 38(3): 79-86. DOI: 10.1109/MS.2020.3016773. Online publication date: May-2021.
