DOI: 10.1145/3196321.3196328
research-article

LogTracker: learning log revision behaviors proactively from software evolution history

Published: 28 May 2018

Abstract

Log statements are widely used for postmortem debugging. Despite the importance of log messages, developers find it difficult to establish good logging practices, for two main reasons. First, there are no rigorous specifications or systematic processes to guide software logging. Second, logging code co-evolves with bug fixes or feature updates. Previous work on log enhancement has focused successfully on the first problem but does little to address the second. As a first step toward solving the second problem, this paper draws inspiration from code clones and assumes that logging code with similar context is pervasive in software and deserves similar modifications. To verify this assumption, we conduct an empirical study on eight open-source projects. Based on our observations, we design and implement LogTracker, an automatic tool that predicts log revisions by mining the correlation between logging context and modifications. With an enhanced model of logging context, LogTracker can guide more intricate log revisions that existing tools cannot cover. We evaluate the effectiveness of LogTracker by applying it to the latest versions of the subject projects. Our experiments show that LogTracker detects 199 instances of log revisions. So far, we have reported 25 of them, and 6 have been accepted.
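
The core assumption of the abstract (log statements in similar contexts deserve similar revisions) can be illustrated with a small sketch. This is not LogTracker's implementation: the `LogSite` record, the bag-of-tokens context, the Jaccard similarity measure, and the 0.6 threshold are all assumptions chosen for illustration, and the abstract does not say how the tool models context or measures similarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSite:
    """A log statement with a bag of context tokens (hypothetical representation)."""
    name: str
    context: frozenset
    level: str


def jaccard(a, b):
    """Jaccard similarity of two token sets; defined as 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def suggest_revisions(history, current, threshold=0.6):
    """Suggest level changes for current log sites.

    history: list of (old_site, new_level) pairs, assumed to be mined from
    past commits. A current site gets a suggestion when its context is
    similar to a previously revised site and it still has the old level.
    """
    suggestions = []
    for site in current:
        for old, new_level in history:
            similar = jaccard(site.context, old.context) >= threshold
            if similar and site.level == old.level and site.level != new_level:
                suggestions.append((site.name, new_level))
                break  # one suggestion per site is enough for this sketch
    return suggestions


# Illustrative data only: one past revision (debug -> error) and two current sites.
history = [
    (LogSite("a.c:10", frozenset({"if", "fopen", "NULL", "return"}), "debug"), "error"),
]
current = [
    LogSite("b.c:42", frozenset({"if", "fopen", "NULL", "return", "path"}), "debug"),
    LogSite("c.c:7", frozenset({"for", "count"}), "debug"),
]
print(suggest_revisions(history, current))  # [('b.c:42', 'error')]
```

Here only `b.c:42` shares enough context with the previously revised site (similarity 0.8) to receive a suggestion; `c.c:7` shares no tokens and is left alone. A token set is the simplest possible context model and would likely miss the "more intricate" revisions the abstract mentions.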

Published In

ICPC '18: Proceedings of the 26th Conference on Program Comprehension
May 2018
423 pages
ISBN:9781450357142
DOI:10.1145/3196321
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. failure diagnose
  2. log revision
  3. software evolution

Qualifiers

  • Research-article

Conference

ICSE '18
Cited By

  • (2024) A literature review and existing challenges on software logging practices. Empirical Software Engineering 29:4. DOI: 10.1007/s10664-024-10452-w. Online publication date: 18-Jun-2024
  • (2023) EvLog: Identifying Anomalous Logs over Software Evolution. 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE), 391-402. DOI: 10.1109/ISSRE59848.2023.00018. Online publication date: 9-Oct-2023
  • (2023) iASTMapper: An Iterative Similarity-Based Abstract Syntax Tree Mapping Algorithm. Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 863-874. DOI: 10.1109/ASE56229.2023.00178. Online publication date: 11-Nov-2023
  • (2022) FIRA. Proceedings of the 44th International Conference on Software Engineering, 970-981. DOI: 10.1145/3510003.3510069. Online publication date: 21-May-2022
  • (2022) Automated evolution of feature logging statement levels using Git histories and degree of interest. Science of Computer Programming 214:C. DOI: 10.1016/j.scico.2021.102724. Online publication date: 1-Feb-2022
  • (2021) A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms. Proceedings of the 43rd International Conference on Software Engineering, 1174-1185. DOI: 10.1109/ICSE43902.2021.00108. Online publication date: 22-May-2021
  • (2020) Logging Inter-Thread Data Dependencies in Linux Kernel. IEICE Transactions on Information and Systems E103.D:7, 1633-1646. DOI: 10.1587/transinf.2019EDP7255. Online publication date: 1-Jul-2020
  • (2020) Guiding log revisions by learning from software evolution history. Empirical Software Engineering 25:3, 2302-2340. DOI: 10.1007/s10664-019-09757-y. Online publication date: 1-May-2020
  • (2019) An Exploratory Study of Logging Configuration Practice in Java. 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), 459-469. DOI: 10.1109/ICSME.2019.00079. Online publication date: Sep-2019
  • (2018) Runtime Monitoring in Continuous Deployment by Differencing Execution Behavior Model. Service-Oriented Computing, 812-827. DOI: 10.1007/978-3-030-03596-9_58. Online publication date: 12-Nov-2018