Integrated artificial intelligence-based resizing strategy and multiple criteria decision making technique to form a management decision in an imbalanced environment

Lin, Sin-Jin

doi:10.1007/s13042-016-0574-3

Original Article
Published: 02 August 2016

International Journal of Machine Learning and Cybernetics
Volume 8, pages 1981–1992 (2017)

Sin-Jin Lin¹

Abstract

Classification on imbalanced datasets is a current challenge in the machine learning community, because class imbalance degrades the performance of many classifiers. This study introduces a two-stage intelligent data preprocessing approach to tackle the class imbalance problem. When the penalty parameter of the support vector machine (SVM) is modified, the discriminating boundary moves toward the majority class, so that some majority class examples are misclassified as minority class examples. In other words, more misclassified majority class examples are equivalent to a greater number of minority class examples, and running the SVM as a preprocessor can therefore be used to overcome the class imbalance problem. Next, a random forest is applied to the modified dataset to counter the curse of dimensionality. Finally, the preprocessed data are fed into a rule-based classifier to generate comprehensive decision rules. According to the empirical results, the presented architecture is a promising alternative for the class imbalance problem.

References

1. Ashfaq RAR, Wang XZ, Huang JZ, Abbas H, He YL (2016) Fuzziness based semi-supervised learning approach for intrusion detection system. Inf Sci. doi:10.1016/j.ins.2016.04.019
2. Bang S, Kang J, Jhun M, Kim E (2016) Hierarchically penalized support vector machine with grouped variables. Int J Mach Learn Cyber. doi:10.1007/s13042-016-0494-2
3. Barakat N, Diederich J (2005) Eclectic rule-extraction from support vector machines. Int J Comput Intel 2:59–62
4. Bazzazi AA, Osanloo M, Karimi B (2011) Deriving preference order of open pit mines equipment through MADM methods application of modified VIKOR method. Expert Syst Appl 38:2550–2556
5. Borkar P, Sarode MV, Malik LG (2016) Modality of adaptive neuro-fuzzy classifier for acoustic signal-based traffic density state estimation employing linguistic hedges for feature selection. Int J Fuzzy Syst 18:379–394
6. Breiman L (2001) Random forests. Mach Learn 45:5–32
7. Chang CW, Wu CR, Lin CT, Chen HC (2007) An application of AHP and sensitivity analysis for selecting the best slicing machine. Comput Ind Eng 52:296–307
8. Chen X, Fan K, Liu W, Zhang X, Xue M (2015) Discriminative structure discovery via dimensionality reduction for facial image manifold. Neural Comput Appl 26:373–381
9. Das SP, Padhy S (2015) A novel hybrid model using teaching–learning-based optimization and a support vector machine for commodity futures index forecasting. Int J Mach Learn Cyber. doi:10.1007/s13042-015-0359-0
10. Eichberger J, Guerdjikova A (2010) Case-based belief formation under ambiguity. Math Soc Sci 60:161–177
11. Farquad MAH, Bose I (2012) Preprocessing unbalanced data using support vector machine. Decis Support Syst 53:226–233
12. Feng L, Li T, Ruan D, Gou S (2011) A vague-rough set approach for uncertain knowledge acquisition. Knowl-Based Syst 24:837–843
13. Feng HM, Wang XZ (2015) Performance improvement of classifier fusion for batch samples based on upper integral. Neural Netw 63:87–93
14. Friedman M (1974) Explanation and scientific understanding. J Philos 71:5–19
15. Gaganis C (2009) Classification techniques for the identification of falsified financial statements a comparative analysis. Intel Syst Account Financ Manag 16:207–229
16. García S, Fernández A, Herrera F (2009) Enhancing the effectiveness and interpretability of decision tree and rule induction classifiers with evolutionary training set selection over imbalanced problems. Appl Soft Comput 9:1304–1314
17. Gao M, Hong X, Chen S, Harris CJ (2011) A combined SMOTE and PSO based RBF classifier for two-class imbalanced problems. Neurocomputing 74:3456–3466
18. Gao X, Fan L, Xu H (2015) Multiple rank multi-linear kernel support vector machine for matrix data classification. Int J Mach Learn Cyber. doi:10.1007/s13042-015-0383-0
19. Gallant SI (1988) Connectionist expert systems. Commun ACM 31:152–169
20. Genuer R, Poggi JM, Tuleau-Malot C (2010) Variable selection using random forests. Pattern Recogn Lett 31:2225–2236
21. Gonzalez-Abril L, Cuberos FJ, Velasco F, Ortega JA (2009) Ameva an autonomous discretization algorithm. Expert Syst Appl 36:5327–5332
22. Goode S, Lacey D (2011) Detecting complex account fraud in the enterprise the role of technical and non-technical controls. Decis Support Syst 50:702–714
23. Grzymala-Busse JW, Stefanowski J, Wilk S (2005) A comparison of two approaches to data mining from imbalanced data. J Intell Manuf 16:565–573
24. He Y, Liu NK, Hu Y, Wang X (2015) OWA operator based link prediction ensemble for social network. Expert Syst Appl 42:21–50
25. He YL, Wang XZ, Huang JZ (2016) Fuzzy nonlinear regression analysis using a random weight network. Inf Sci 364–365:222–240
26. Kang X, Miao D (2016) A variable precision rough set model based on the granularity of tolerance relation. Knowl Based Syst 102:103–115
27. Kim HS, Sohn SY (2010) Support vector machines for default prediction of SMEs based on technology credit. Eur J Oper Res 201:838–846
28. Kwak N, Choi CH (2002) Input feature selection for classification problems. IEEE Trans Neural Netw 13:143–159
29. Liu Y, Yu X, Huang JX, An A (2011) Combining integrated sampling with SVM ensembles for learning from imbalanced datasets. Inf Process Manag 47:617–631
30. Ling CX, Sheng VS, Yang Q (2006) Test strategies for cost-sensitive decision tree. IEEE Trans Knowl Data Eng 18:1055–1067
31. Lin SJ (2016) Hybrid kernelized fuzzy clustering and multiple attributes decision analysis for corporate risk management. Int J Fuzzy Syst. doi:10.1007/s40815-016-0196-7
32. Lin SJ, Hsu MF (2016) Incorporated risk metrics and hybrid AI techniques for risk management. Neural Comput Appl. doi:10.1007/s00521-016-2253-4
33. Lin SJ, Chen TF (2016) Multi-agent architecture for corporate operating performance assessment. Neural Process Lett 43:115–132
34. Liu NK, He YL, Lim HY, Wang XZ (2014) Domain ontology graph model and its application in Chinese text classification. Neural Comput Appl 24:779–798
35. Mirza B, Lin Z, Toh KA (2013) Weighted online sequential extreme learning machine for class imbalance learning. Neural Process Lett 38:465–486
36. Nebot V, Berlanga R (2012) Finding association rules in semantic web data. Knowl Based Syst 25:51–62
37. Opricovic S (1998) Multicriteria optimization of civil engineering systems. Faculty of Civil Engineering, Belgra
38. Opricovic S, Tzeng GH (2002) Multicriteria planning of post-earthquake sustainable reconstruction. Comput Aided Civil Inf 17:211–220
39. Opricovic S, Tzeng GH (2004) Compromise solution by MCDM methods a comparative analysis of VIKOR and TOPSIS. Eur J Oper Res 156:445–455
40. Orriols-Puig A, Bernadó-Mansilla E (2009) Evolutionary rule-based systems for imbalanced data sets. Soft Comput 13:213–225
41. Paelinck JHP (1976) Qualitative multiple criteria analysis, environment protection and multiregional development. Region Sci Assoc 36:56–59
42. Pawlak Z (1982) Rough sets. Int J Comput Inf Sci 11:341–356
43. Peng Y, Wang G, Kou G, Shi Y (2011) An empirical study of classification algorithm evaluation for financial risk prediction. Appl Soft Comput 11:2906–2915
44. Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33:1–39
45. Sestito S, Dillon T (1992) Automated knowledge acquisition of rules with continuously valued attributes. In: Proceedings of 12th international conference on expert systems and their applications (AVIGNON’92), Avignon, France, pp 645–656
46. Sun A, Lim EP, Liu Y (2009) On strategies for imbalanced text classification using SVM a comparative study. Decis Support Syst 48:191–201
47. Tan A, Wu W, Li J, Lin G (2016) Evidence-theory-based numerical characterization of multigranulation rough sets in incomplete information systems. Fuzzy Set Syst 294:18–35
48. Tavana M, Mavi RK, Santos-Arteaga FJ, Doust ER (2016) An extended VIKOR method using stochastic data and subjective judgments. Comput Ind Eng 97:240–247
49. Wang XZ, Ashfaq RAR, Fu AM (2015) Fuzziness based sample categorization for classifier performance improvement. J Intell Fuzzy Syst 29:1185–1196
50. Wang XZ (2015) Learning from big data with uncertainty. J Intell Fuzzy Syst 28:2329–2330
51. Wang L, Chen J, Fan M, Zhao X, Cui H, Cui H (2011) Feature selection and prediction of sub-health state using random forest. Energy Proc 13:5223–5228
52. Wang G, Ma J, Huang L, Xu K (2012) Two credit scoring models based on dual strategy ensemble trees. Knowl Based Syst 26:61–68
53. Wang Y (2013) Smooth nonparametric copula estimation with least squares support vector regression. Neural Process Lett 38:81–96
54. Wu S, Sun M, Yang J (2011) Stochastic neighbor projection on manifold for feature extraction. Neurocomputing 74:2780–2789
55. Zhao HX, Xing HJ, Wang XZ (2011) Two-stage dimensionality reduction approach based on 2DLDA and fuzzy rough sets technique. Neurocomputing 74:3722–3727

Acknowledgments

The author would like to thank the Ministry of Science and Technology of the Republic of China, Taiwan, for financially supporting this work under Contract No. 104-2410-H-034-023-MY2.

Author information

Authors and Affiliations

Department of Accounting, Chinese Culture University, 55, Hwa-Kang Rd., Yang-Ming-Shan, Taipei, 11114, Taiwan, ROC
Sin-Jin Lin

Corresponding author

Correspondence to Sin-Jin Lin.

Appendix A VlseKriterijumska Optimizacija I Kompromisno Resenje (VIKOR)

VIKOR was proposed by Opricovic [37] and Opricovic and Tzeng [38] for multi-criteria optimization of complex problems. Opricovic [37] indicated that VIKOR ranks alternatives in the presence of conflicting criteria by generating a multi-criteria ranking index, which is based on a specific measure of closeness to the ideal alternative. The VIKOR procedure is as follows [39].

Step 1 Calculate the best $g_i^*$ and the worst $g_i^-$ values of all criterion functions, i = 1, …, n.

$$\begin{aligned} g_{i}^{*} &= \begin{cases} \max\limits_{j} g_{ij} & \text{for benefit criteria} \\ \min\limits_{j} g_{ij} & \text{for cost criteria} \end{cases}, \quad j = 1, \ldots, J \\ g_{i}^{-} &= \begin{cases} \min\limits_{j} g_{ij} & \text{for benefit criteria} \\ \max\limits_{j} g_{ij} & \text{for cost criteria} \end{cases}, \quad j = 1, \ldots, J \end{aligned}$$

(A1)

where J is the number of alternatives, n is the number of criteria, and $g_{ij}$ is the rating of the i-th criterion function for alternative $b_j$.
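As an illustration of Step 1, the following minimal Python sketch computes the best and worst value of each criterion. The function name `best_worst`, the `ratings[i][j]` layout (criterion i, alternative j), and the sample numbers are assumptions made for this example and do not come from the article.

```python
from typing import List, Sequence, Tuple


def best_worst(ratings: Sequence[Sequence[float]],
               is_benefit: Sequence[bool]) -> Tuple[List[float], List[float]]:
    """Return (g_star, g_minus), the best and worst rating per criterion.

    ratings[i][j] plays the role of g_ij (criterion i, alternative j).
    is_benefit[i] is True for a benefit criterion, False for a cost criterion.
    """
    g_star, g_minus = [], []
    for row, benefit in zip(ratings, is_benefit):
        hi, lo = max(row), min(row)
        # Benefit: larger is better. Cost: smaller is better.
        g_star.append(hi if benefit else lo)
        g_minus.append(lo if benefit else hi)
    return g_star, g_minus


# Hypothetical data: 2 criteria (rows) x 3 alternatives (columns).
ratings = [[7.0, 9.0, 5.0],   # criterion 1, treated as a benefit criterion
           [3.0, 6.0, 2.0]]   # criterion 2, treated as a cost criterion
g_star, g_minus = best_worst(ratings, [True, False])
print(g_star, g_minus)  # [9.0, 2.0] [5.0, 6.0]
```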

Step 2 Calculate the values of $X_j$ and $Y_j$, j = 1, …, J.

$$\begin{aligned} X_{j} &= \sum\limits_{i = 1}^{n} \left[ w_{i} (g_{i}^{*} - g_{ij}) / (g_{i}^{*} - g_{i}^{-}) \right] \\ Y_{j} &= \max\limits_{i} \left[ w_{i} (g_{i}^{*} - g_{ij}) / (g_{i}^{*} - g_{i}^{-}) \right] \end{aligned}$$

(A2)

where $w_i$ is the weight of the i-th criterion, and $X_j$ and $Y_j$ are the measures used for ranking.
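Step 2 can be sketched in Python as below. The function name `utility_and_regret`, the argument layout, and the sample data are illustrative assumptions. The guard for a criterion whose best and worst values coincide is also an assumption, since the article does not say how a zero denominator is handled.

```python
from typing import List, Sequence, Tuple


def utility_and_regret(ratings: Sequence[Sequence[float]],
                       weights: Sequence[float],
                       g_star: Sequence[float],
                       g_minus: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (X, Y): the weighted sum and the weighted maximum of the
    normalized distances to the best value, one entry per alternative."""
    n_alternatives = len(ratings[0])
    X, Y = [], []
    for j in range(n_alternatives):
        terms = []
        for i, row in enumerate(ratings):
            span = g_star[i] - g_minus[i]
            # Assumption: a criterion with identical ratings contributes 0.
            term = 0.0 if span == 0 else weights[i] * (g_star[i] - row[j]) / span
            terms.append(term)
        X.append(sum(terms))
        Y.append(max(terms))
    return X, Y


# Hypothetical data: criterion 1 is a benefit criterion, criterion 2 a cost one.
ratings = [[7.0, 9.0, 5.0],
           [3.0, 6.0, 2.0]]
X, Y = utility_and_regret(ratings, [0.5, 0.5],
                          g_star=[9.0, 2.0], g_minus=[5.0, 6.0])
print(X)  # [0.375, 0.5, 0.5]
print(Y)  # [0.25, 0.5, 0.5]
```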

Step 3 Calculate the value $Z_j$, j = 1, …, J.

$$\begin{aligned} Z_{j} &= \left[ v (X_{j} - X^{*}) / (X^{-} - X^{*}) \right] + \left[ (1 - v)(Y_{j} - Y^{*}) / (Y^{-} - Y^{*}) \right] \\ X^{*} &= \min\limits_{j} X_{j}, \quad X^{-} = \max\limits_{j} X_{j}, \\ Y^{*} &= \min\limits_{j} Y_{j}, \quad Y^{-} = \max\limits_{j} Y_{j}, \end{aligned}$$

(A3)

where $X^*$ is the solution with the maximum group utility, $Y^*$ is the solution with the minimum individual regret of the opponent, and v is the weight of the strategy of the majority of criteria. The compromise solution is stable within a decision-making process, which can be "voting by majority rule" (when v > 0.5 is needed), "by consensus" (v ≈ 0.5), or "with veto" (v < 0.5) [39]. Following prior research [39, 43], the value of v is set to 0.5.
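A minimal Python sketch of Step 3 follows. The function name `compromise_index` and the sample values of X and Y are hypothetical, and returning 0 when all X (or all Y) values are equal is an assumption added to avoid division by zero.

```python
from typing import List, Sequence


def compromise_index(X: Sequence[float], Y: Sequence[float],
                     v: float = 0.5) -> List[float]:
    """Return Z_j = v * scaled X_j + (1 - v) * scaled Y_j for each alternative."""
    def scaled(value: float, best: float, worst: float) -> float:
        # Assumption: if all values coincide, the scaled distance is 0.
        return 0.0 if worst == best else (value - best) / (worst - best)

    x_star, x_minus = min(X), max(X)
    y_star, y_minus = min(Y), max(Y)
    return [v * scaled(x, x_star, x_minus) + (1 - v) * scaled(y, y_star, y_minus)
            for x, y in zip(X, Y)]


# Hypothetical X and Y values for three alternatives, with v = 0.5.
Z = compromise_index([0.25, 0.5, 0.75], [0.125, 0.5, 0.25], v=0.5)
print([round(z, 3) for z in Z])  # [0.0, 0.75, 0.667]
```

With v = 0.5 the group utility and the individual regret contribute equally; raising v shifts weight toward the group utility term.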

Step 4 Rank the alternatives by sorting the values of X, Y and Z in decreasing order. This yields three ranking lists.

Step 5 Propose the alternative b′, which is ranked the best by Z, as a compromise solution if the following two conditions are satisfied [43]:

(a) $Z(b'') - Z(b') \ge 1/(J - 1)$
(b) Alternative b′ is also ranked the best by X and/or Y.

Here b″ is the alternative ranked second by Z. If only condition (b) is violated, the alternatives b′ and b″ are both taken as compromise solutions. If condition (a) is violated, the alternatives b′, …, $b^M$ are taken as compromise solutions, where $b^M$ is the alternative ranked M-th by Z and M is the largest rank for which $Z(b^M) - Z(b') < 1/(J - 1)$.
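The acceptance test in Step 5 can be sketched in Python as follows. This is an illustrative reading rather than the author's implementation: the function name `compromise_set` and the sample values are hypothetical, and the sketch assumes that a smaller value of X, Y or Z means a better rank, as in the standard VIKOR formulation.

```python
from typing import List, Sequence


def compromise_set(X: Sequence[float], Y: Sequence[float],
                   Z: Sequence[float]) -> List[int]:
    """Return the indices of the alternatives proposed as compromise solutions."""
    J = len(Z)
    order = sorted(range(J), key=lambda j: Z[j])  # assumption: smallest Z is best
    if J < 2:
        return order
    threshold = 1.0 / (J - 1)
    best, second = order[0], order[1]
    cond_a = Z[second] - Z[best] >= threshold           # condition (a)
    cond_b = X[best] == min(X) or Y[best] == min(Y)     # condition (b)
    if not cond_a:
        # b', ..., b^M: every alternative whose Z is within the threshold of the best.
        return [j for j in order if Z[j] - Z[best] < threshold]
    if not cond_b:
        return [best, second]
    return [best]


# Hypothetical values for three alternatives (threshold = 1 / (3 - 1) = 0.5).
print(compromise_set([0.25, 0.5, 0.75], [0.125, 0.5, 0.25], [0.0, 0.75, 0.6]))  # [0]
print(compromise_set([0.1, 0.2, 0.9], [0.1, 0.2, 0.9], [0.0, 0.3, 0.9]))        # [0, 1]
```

In the first call both conditions hold, so only the top-ranked alternative is returned. In the second call the gap between the two best Z values is below the threshold, so both are returned.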

About this article

Cite this article

Lin, SJ. Integrated artificial intelligence-based resizing strategy and multiple criteria decision making technique to form a management decision in an imbalanced environment. Int. J. Mach. Learn. & Cyber. 8, 1981–1992 (2017). https://doi.org/10.1007/s13042-016-0574-3

Received: 11 February 2016
Accepted: 26 July 2016
Published: 02 August 2016
Issue Date: December 2017
DOI: https://doi.org/10.1007/s13042-016-0574-3
