Abstract
Using a statistical model in a diagnosis task generally requires a large amount of labeled data. When ground truth information is unavailable, or too expensive or difficult to collect, one has to rely on expert knowledge. This paper proposes using partial information from domain experts, expressed as belief functions. Expert opinions are combined in this framework and used together with measurement data to estimate the parameters of a statistical model by a variant of the EM algorithm. The application investigated here is the diagnosis of railway track circuits. A noiseless Independent Factor Analysis model is postulated: the observed variables, extracted from railway track inspection signals, are assumed to be generated by a linear mixture of independent latent variables linked to the states of the system components. Learning with this model is usually unsupervised, using unlabeled examples only. Here, learning is instead handled in a soft-supervised way, using imperfect information on the component states. Fusing partially reliable information about cluster membership is shown to significantly improve classification results.
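To make the idea of fusing expert opinions and using them as soft labels more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method of the paper: the abstract does not say which combination rule is used, so the normalized Dempster's rule is assumed here; the two-state frame ("ok", "faulty"), the expert masses, the priors and the likelihood values are all hypothetical. The last function weights each class by its plausibility before normalizing, in the spirit of evidential variants of the EM E-step.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with
    Dempster's rule. The choice of this rule is an assumption."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass falling on the empty set measures disagreement.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: opinions cannot be combined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def plausibility(m, states):
    """Plausibility of each single state: total mass of focal sets containing it."""
    return {s: sum(w for focal, w in m.items() if s in focal) for s in states}


def soft_posterior(pl, priors, likelihoods):
    """Class weights proportional to plausibility * prior * likelihood.
    A sketch of a soft-label E-step, not the paper's exact update."""
    weights = {k: pl[k] * priors[k] * likelihoods[k] for k in pl}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical frame of discernment for one component.
STATES = ("ok", "faulty")
OMEGA = frozenset(STATES)

# Hypothetical opinions: each expert commits part of the mass to one
# state and leaves the rest on OMEGA (ignorance).
expert_a = {frozenset({"faulty"}): 0.6, OMEGA: 0.4}
expert_b = {frozenset({"ok"}): 0.3, OMEGA: 0.7}

fused = dempster_combine(expert_a, expert_b)
soft_label = plausibility(fused, STATES)

# Hypothetical mixture proportions and density values at one observation.
posterior = soft_posterior(
    soft_label,
    priors={"ok": 0.5, "faulty": 0.5},
    likelihoods={"ok": 0.2, "faulty": 0.1},
)
print(fused)
print(soft_label)
print(posterior)
```

A fully ignorant expert (all mass on OMEGA) gives plausibility 1 to every state, so the weights reduce to the usual unsupervised posterior. A categorical opinion gives plausibility 0 to all but one state, which recovers the supervised case.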
Acknowledgments
This work was supported by the French National Research Agency (ANR) under project DIAGHIST. The authors thank the French National Railway Company (SNCF) and its experts for their collaboration.
Cite this article
Cherfi, Z.L., Oukhellou, L., Côme, E. et al. Partially supervised Independent Factor Analysis using soft labels elicited from multiple experts: application to railway track circuit diagnosis. Soft Comput 16, 741–754 (2012). https://doi.org/10.1007/s00500-011-0766-4
DOI: https://doi.org/10.1007/s00500-011-0766-4