Predicting firm failure in the software industry

Roumani, Yazan F.; Nwankpa, Joseph K.; Tanniru, Mohan

doi:10.1007/s10462-019-09789-2


Published: 20 November 2019

Volume 53, pages 4161–4182, (2020)

Artificial Intelligence Review


Abstract

The firm failure rate in the software industry is significantly higher than in other industries. Because software products and services are so widely used, failure in the software industry has implications for the industry itself as well as for the economy at the local, national and global levels. This study compares the classification performance of thirteen approaches in predicting firm failure in the US software industry. Seven measures are used to evaluate the classifiers’ performance. We use the synthetic minority oversampling technique (SMOTE), SMOTEBoost and SMOTEBagging to account for data imbalance. To give managers enough time to develop strategies and take the necessary actions to reduce the likelihood of failure, we use 20 financial indicators collected 4 years before the last available date for each firm. Our findings show that embedding SMOTE into boosting and bagging algorithms performs better than preprocessing the data with SMOTE before training the classifier. According to the sensitivity analysis, research and development expense is the most significant predictor of firm failure, followed by net sales and total revenue. Managers can use our results as a decision support tool to identify high-risk firms at an early stage and take the necessary actions to prevent a firm from failing. Early prediction of firm failure will allow software firms to modularize their products or services into specific “features” and offer them as “digital services” using new business models, or to combine these services with partner firms’ services to create new products and address evolving customer expectations. Moreover, early prediction of firm failure in the software industry calls on firms, both new and those in the growth stage, to componentize their design for adaptability and to build agility into the way they use their resource mix to address both market gaps and operational gaps.
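The difference between preprocessing with SMOTE and embedding SMOTE in an ensemble can be illustrated with a small sketch. The code below is not the authors' implementation: the function names, the 1-nearest-neighbour base learner, the toy data and the choice to fully balance every bag are all assumptions made for illustration, and published SMOTEBagging variants may differ, for example in how the resampling rate is chosen per bag. `smote` interpolates new minority points between a minority sample and one of its nearest minority neighbours; `smote_bagging_predict` applies that step inside each bootstrap replicate rather than once before training.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def nearest_neighbour_predict(train, x):
    """Hypothetical base learner: label of the closest training point."""
    return min(train, key=lambda row: math.dist(row[0], x))[1]


def smote_bagging_predict(data, x, n_bags=11, k=5, seed=0):
    """Majority vote over bags; SMOTE runs inside each bootstrap replicate."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_bags):
        bag = [rng.choice(data) for _ in data]  # bootstrap replicate
        pos = [f for f, y in bag if y == 1]     # minority class (failed firms)
        neg = [f for f, y in bag if y == 0]     # majority class
        if len(pos) >= 2 and len(neg) > len(pos):
            # Simplifying assumption: balance the bag completely.
            bag += [(s, 1) for s in smote(pos, len(neg) - len(pos), k, rng)]
        votes += nearest_neighbour_predict(bag, x)
    return int(votes * 2 > n_bags)


# Toy, made-up data: 12 majority points near the origin, 4 minority points near (5, 5).
majority = [((0.5 * a, 0.5 * b), 0) for a in range(4) for b in range(3)]
minority_pts = [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (5.5, 5.5)]
toy_data = majority + [(p, 1) for p in minority_pts]
```

Calling `smote(minority_pts, 8)` before training corresponds to the preprocessing approach, while `smote_bagging_predict(toy_data, x)` resamples within every bag, so each ensemble member sees a different set of synthetic minority points.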

Acknowledgements

This research was partially supported by a 2017 Oakland University School of Business Administration Spring/Summer Research Fellowship.

Author information

Authors and Affiliations

Department of Decision and Information Sciences, Oakland University, 342 Elliot Hall, Rochester, MI, 48309, USA
Yazan F. Roumani
Department of Information Systems and Analytics, Miami University, 83 N. Patterson Ave, Oxford, OH, 45056, USA
Joseph K. Nwankpa
Mel & Enid Zuckerman College of Public Health, The University of Arizona, 550 E. Van Buren Street, Phoenix, AZ, 85006, USA
Mohan Tanniru

Corresponding author

Correspondence to Yazan F. Roumani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Roumani, Y.F., Nwankpa, J.K. & Tanniru, M. Predicting firm failure in the software industry. Artif Intell Rev 53, 4161–4182 (2020). https://doi.org/10.1007/s10462-019-09789-2

Published: 20 November 2019
Issue Date: August 2020
DOI: https://doi.org/10.1007/s10462-019-09789-2
