Predicting firm failure in the software industry

Artificial Intelligence Review

Abstract

The firm failure rate in the software industry is significantly higher than in other industries. Because software products and services are so widely used, failure in the software industry has implications for the industry itself as well as for the economy at the local, national and global levels. This study compares the classification performance of thirteen approaches for predicting firm failure in the US software industry. Seven measures are used to evaluate the classifiers' performance. We use the synthetic minority oversampling technique (SMOTE), SMOTEBoost and SMOTEBagging to account for data imbalance. To give managers enough time to develop strategies and take the actions needed to reduce the likelihood of failure, we use 20 financial indicators collected 4 years before the last available date for each firm. Our findings show that embedding SMOTE into boosting and bagging algorithms performs better than preprocessing the data with SMOTE before training the classifier. According to the sensitivity analysis, research and development expense is the most significant predictor of firm failure, followed by net sales and total revenue. Managers can use our results as a decision support tool to identify high-risk firms at an early stage and take the actions needed to prevent failure. Early prediction of firm failure will allow software firms to modularize their products or services into specific "features" and offer them as "digital services" using new business models, or to combine these services with partner firms' services to create new products and address evolving customer expectations. Moreover, early prediction of firm failure in the software industry calls on firms, both new ones and those in the growth stage, to componentize their design for adaptability and to build agility in the way they use their resource mix to address both market gaps and operational gaps.
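
The two strategies contrasted in the abstract, applying SMOTE once as a preprocessing step versus embedding it inside an ensemble, can be illustrated with a small Python sketch. This is not the study's implementation: the function names `smote` and `smote_bagging_sets`, the parameter defaults and the toy data are assumptions made for illustration. The bagging variant is also simplified, since it fully rebalances every bootstrap sample, whereas published descriptions of SMOTEBagging vary the resampling rate across bags.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (the core idea of SMOTE)."""
    if rng is None:
        rng = random.Random(0)
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def smote_bagging_sets(majority, minority, n_bags, k=5, seed=0):
    """Embedded variant (simplified): bootstrap each class per bag, then
    run SMOTE inside the bag so every base learner sees balanced data."""
    rng = random.Random(seed)
    bags = []
    for _ in range(n_bags):
        maj = [rng.choice(majority) for _ in majority]
        mino = [rng.choice(minority) for _ in minority]
        n_new = max(0, len(maj) - len(mino))
        synth = smote(mino, n_new, k=k, rng=rng)
        features = maj + mino + synth
        labels = [0] * len(maj) + [1] * (len(mino) + len(synth))
        bags.append((features, labels))
    return bags


if __name__ == "__main__":
    # Toy data: 12 "surviving" firms and 4 "failed" firms in two dimensions.
    majority = [(float(i), 5.0) for i in range(12)]
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

    # Preprocessing variant: one SMOTE pass before any learning.
    balanced_minority = minority + smote(minority, 8, k=2)
    print(len(majority), len(balanced_minority))

    # Embedded variant: each bag gets its own bootstrap and its own SMOTE pass.
    for features, labels in smote_bagging_sets(majority, minority, n_bags=3):
        print(len(features), labels.count(0), labels.count(1))
```

In the preprocessing variant every base learner would be trained on the same synthetic points, while in the embedded variant each bag draws a different bootstrap sample and generates its own synthetic points, which adds diversity across the ensemble members.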

Acknowledgements

This research was partially supported by a 2017 Oakland University School of Business Administration Spring/Summer Research Fellowship.

Author information

Corresponding author

Correspondence to Yazan F. Roumani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Roumani, Y.F., Nwankpa, J.K. & Tanniru, M. Predicting firm failure in the software industry. Artif Intell Rev 53, 4161–4182 (2020). https://doi.org/10.1007/s10462-019-09789-2

  • DOI: https://doi.org/10.1007/s10462-019-09789-2