
Multi-label classification via label correlation and first order feature dependance in a data stream

Published: 01 June 2019
    Highlights

    A Bayesian-based online learning method for multi-label data stream classification is proposed.
    Our method not only learns the label correlation from each arriving sample but also dynamically determines the number of predicted labels based on the Hoeffding inequality and the label cardinality.
    Our method also handles missing values and concept drift in the data stream effectively.
    Extensive comparative experiments with state-of-the-art algorithms validate the superior performance of our method.
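    The second highlight can be made concrete with a small sketch. The Hoeffding inequality bounds how far an empirical mean of n bounded observations can stray from its expectation, which suggests one way to decide how many of the top-scored labels to emit: start from the running label cardinality and widen the cut-off to labels whose scores are statistically indistinguishable from the last admitted one. The Python below is a minimal illustration under that reading; `select_labels` and its widening rule are hypothetical names and choices, not the paper's exact procedure.

    ```python
    import math

    def hoeffding_epsilon(n: int, delta: float = 0.05) -> float:
        """Hoeffding bound: with probability >= 1 - delta, the empirical
        mean of n observations in [0, 1] lies within epsilon of its true mean."""
        return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

    def select_labels(scores, cardinality, n_seen, delta=0.05):
        """Rank labels by score; admit the top-k (k = running label
        cardinality), then widen the cut to any further label whose score
        is within the Hoeffding epsilon of the last admitted one.
        (Illustrative rule, not the paper's exact procedure.)"""
        eps = hoeffding_epsilon(n_seen, delta)
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        k = min(max(1, round(cardinality)), len(ranked))
        selected = [lbl for lbl, _ in ranked[:k]]
        cut = ranked[k - 1][1]                 # score of last admitted label
        for lbl, score in ranked[k:]:
            if cut - score <= eps:             # statistically indistinguishable
                selected.append(lbl)
            else:
                break
        return selected
    ```

    With few samples seen, epsilon is large and extra labels are admitted more readily; as n grows, the bound tightens and the prediction converges toward exactly the label-cardinality cut.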

    Abstract

    Many batch learning algorithms have been introduced for offline multi-label classification (MLC) over the years. However, the increasing data volume in applications such as social networks, sensor networks, and traffic monitoring poses many challenges to batch MLC learning. For example, it is often expensive to re-train the model on newly arrived samples, or impractical to learn from a large volume of data at once. Incremental learning is therefore better suited to large data volumes, and especially to data streams. In this study, we develop a Bayesian-based method for learning from multi-label data streams that takes into account both the correlation between pairs of labels and the relationship between labels and features. In our model, not only is the label correlation learned from each arriving sample with ground-truth labels, but the number of predicted labels is also adjusted based on the Hoeffding inequality and the label cardinality. We also extend the model to handle missing values, a problem common in many real-world datasets. To handle concept drift, we propose a decay mechanism based on the age of the arrived samples so that the model incrementally adapts to changes in the data. The experimental results show that our method is highly competitive with several well-known benchmark algorithms under both stationary and concept-drift settings.
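    The decay mechanism in the abstract can be sketched as time-decayed sufficient statistics: each incoming sample multiplies the stored counts by a fading factor before adding its own contribution, so older samples fade and the estimates track drift. The Python below is a minimal sketch under that assumption; the class name, update rule, and Laplace smoothing are illustrative, not the paper's actual implementation.

    ```python
    class DecayedLabelStats:
        """Time-decayed counts for streaming label-prior estimation.
        Older samples are geometrically down-weighted by `decay`, so the
        estimates adapt to concept drift. (Illustrative sketch only.)"""

        def __init__(self, labels, decay=0.99):
            self.decay = decay
            self.weight = 0.0                        # decayed total sample weight
            self.counts = {lbl: 0.0 for lbl in labels}

        def update(self, present):
            """Observe one sample; `present` is the set of its true labels."""
            self.weight = self.decay * self.weight + 1.0
            for lbl in self.counts:
                self.counts[lbl] *= self.decay
                if lbl in present:
                    self.counts[lbl] += 1.0

        def prior(self, lbl, alpha=1.0):
            # Laplace-smoothed decayed estimate of P(lbl)
            return (self.counts[lbl] + alpha) / (self.weight + 2.0 * alpha)
    ```

    A decay of 1.0 recovers plain cumulative counting; values below 1.0 trade statistical efficiency on stationary streams for faster adaptation after a drift.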




    Published In

    Pattern Recognition  Volume 90, Issue C
    Jun 2019
    484 pages

    Publisher

    Elsevier Science Inc.

    United States


    Author Tags

    1. Multi-label classification
    2. Multi-label learning
    3. Online learning
    4. Data stream
    5. Concept drift
    6. Label correlation
    7. Feature dependence

    Qualifiers

    • Research-article


    Cited By

    • (2023) BGNN-XML: Bilateral Graph Neural Networks for Extreme Multi-Label Text Classification, IEEE Trans. Knowl. Data Eng. 35 (7) (2023) 6698–6709, doi: 10.1109/TKDE.2022.3193657
    • (2023) Spatial Context-Aware Object-Attentional Network for Multi-Label Image Classification, IEEE Trans. Image Process. 32 (2023) 3000–3012, doi: 10.1109/TIP.2023.3266161
    • (2023) A Multi-label Imbalanced Data Classification Method Based on Label Partition Integration, in: Web Information Systems and Applications, 2023, pp. 14–25, doi: 10.1007/978-981-99-6222-8_2
    • (2022) Global and local attention-based multi-label learning with missing labels, Inf. Sci. 594 (C) (2022) 20–42, doi: 10.1016/j.ins.2022.02.022
    • (2021) A new self-organizing map based algorithm for multi-label stream classification, in: Proceedings of the 36th Annual ACM Symposium on Applied Computing, 2021, pp. 418–426, doi: 10.1145/3412841.3441922
