DOI: 10.1145/2365324.2365335
research-article

An adaptive approach with active learning in software fault prediction

Published: 21 September 2012

Abstract

Background: Software quality prediction plays an important role in improving the quality of software systems. By mining software metrics, predictive models can be induced that provide software managers with insights into quality problems they need to tackle as effectively as possible.
Objective: Traditional supervised learning approaches dominate software quality prediction, and the resulting models tend to be project specific. Moreover, when there are no previous releases, supervised learning approaches are not very useful because large training data sets are needed to develop accurate predictive models.
Method: This paper eases the limitations of supervised learning approaches while offering good prediction performance. We propose an adaptive approach in which supervised learning and active learning are coupled. A Naive Bayes classifier is used as the base learner.
Results: We track the performance at each iteration of the adaptive learning algorithm and compare it with the performance of supervised learning. Our results show that the proposed scheme provides good fault prediction performance over time, i.e., it eventually outperforms the corresponding supervised learning approach. In addition, the adaptive learning approach reduces the variance in prediction performance in comparison with the corresponding supervised learning algorithm.
Conclusion: The adaptive approach outperforms the corresponding supervised learning approach when both use Naive Bayes as the base learner. Additional research is needed to investigate whether this observation remains valid with other base classifiers.
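
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a Naive Bayes base learner might be coupled with an active learning loop. It rests on several assumptions that are not taken from the paper: the query strategy is uncertainty sampling (pick the module whose predicted fault probability is closest to 0.5), the metrics are treated as Gaussian, and the names `GaussianNB`, `adaptive_learning`, and `oracle` are hypothetical. The `oracle` callable stands in for whatever process supplies the true fault label of a queried module.

```python
import math
import random


class GaussianNB:
    """Tiny Gaussian Naive Bayes for binary labels (0 = clean, 1 = faulty)."""

    def fit(self, X, y):
        self.stats = {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            prior = (len(rows) + 1) / (len(X) + 2)  # smoothed class prior
            params = []
            for j in range(len(X[0])):
                col = [r[j] for r in rows] or [0.0]  # fallback if class is empty
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-3
                params.append((mean, var))
            self.stats[c] = (prior, params)
        return self

    def prob_faulty(self, x):
        """Posterior probability that module x is faulty."""
        log_post = {}
        for c, (prior, params) in self.stats.items():
            lp = math.log(prior)
            for v, (mean, var) in zip(x, params):
                lp += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            log_post[c] = lp
        m = max(log_post.values())  # subtract the max for numerical stability
        e0 = math.exp(log_post[0] - m)
        e1 = math.exp(log_post[1] - m)
        return e1 / (e0 + e1)


def adaptive_learning(X_lab, y_lab, X_pool, oracle, rounds=10):
    """Retrain after each queried label; inputs are copied, not mutated.

    Assumption: uncertainty sampling is used to choose the next module.
    """
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_pool)
    model = GaussianNB().fit(X_lab, y_lab)
    for _ in range(rounds):
        if not pool:
            break
        # Query the module the current model is least certain about.
        idx = min(range(len(pool)),
                  key=lambda i: abs(model.prob_faulty(pool[i]) - 0.5))
        x = pool.pop(idx)
        X_lab.append(x)
        y_lab.append(oracle(x))  # hypothetical source of the true label
        model = GaussianNB().fit(X_lab, y_lab)
    return model, X_lab, y_lab


def demo():
    """Synthetic two-metric data; the labeling rule is purely illustrative."""
    random.seed(0)
    seed_X = [[0.0, 0.1], [0.2, -0.1], [4.0, 4.1], [3.9, 4.2]]
    seed_y = [0, 0, 1, 1]
    pool = [[random.gauss(2, 2), random.gauss(2, 2)] for _ in range(30)]
    model, X, y = adaptive_learning(
        seed_X, seed_y, pool, oracle=lambda x: int(x[0] + x[1] > 4), rounds=10)
    return model, X, y
```

Calling `demo()` runs ten query rounds on synthetic data. To track performance per iteration, as the Results describe, one could evaluate `model` on a held-out set inside the loop; that evaluation is left out here because the abstract does not name the performance measure.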

Published In

PROMISE '12: Proceedings of the 8th International Conference on Predictive Models in Software Engineering
September 2012
126 pages
ISBN: 9781450312417
DOI: 10.1145/2365324

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. adaptive learning
  3. software fault prediction

Qualifiers

  • Research-article

Conference

PROMISE '12

Acceptance Rates

PROMISE '12 paper acceptance rate: 12 of 24 submissions (50%)
Overall acceptance rate: 98 of 213 submissions (46%)
