
Multi-label classification via label correlation and first order feature dependance in a data stream

Published: 01 June 2019
    Highlights

    A Bayesian-based online learning method for multi-label data stream classification is proposed.
    Our method not only learns the label correlation from each arriving sample but also dynamically determines the number of predicted labels based on the Hoeffding inequality and the label cardinality.
    Our method also handles missing values and concept drift in the data stream effectively.
    Extensive comparative experiments with state-of-the-art algorithms validate the superior performance of our method.
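    The second highlight can be made concrete with a small sketch. The Hoeffding inequality bounds how far an empirical mean of n bounded observations can stray from its expectation, which suggests one way to decide how many of the top-scored labels to emit: start from the running label cardinality and widen the cut-off to labels whose scores are statistically indistinguishable from the last admitted one. The Python below is a minimal illustration under that reading; `select_labels` and its widening rule are hypothetical names and choices, not the paper's exact procedure.

    ```python
    import math

    def hoeffding_epsilon(n: int, delta: float = 0.05) -> float:
        """Hoeffding bound: with probability >= 1 - delta, the empirical
        mean of n observations in [0, 1] lies within epsilon of its true mean."""
        return math.sqrt(math.log(1.0 / delta) / (2.0 * n))

    def select_labels(scores, cardinality, n_seen, delta=0.05):
        """Rank labels by score; admit the top-k (k = running label
        cardinality), then widen the cut to any further label whose score
        is within the Hoeffding epsilon of the last admitted one.
        (Illustrative rule, not the paper's exact procedure.)"""
        eps = hoeffding_epsilon(n_seen, delta)
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        k = min(max(1, round(cardinality)), len(ranked))
        selected = [lbl for lbl, _ in ranked[:k]]
        cut = ranked[k - 1][1]                 # score of last admitted label
        for lbl, score in ranked[k:]:
            if cut - score <= eps:             # statistically indistinguishable
                selected.append(lbl)
            else:
                break
        return selected
    ```

    With few samples seen, epsilon is large and extra labels are admitted more readily; as n grows, the bound tightens and the prediction converges toward exactly the label-cardinality cut.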

    Abstract

    Many batch learning algorithms have been introduced for offline multi-label classification (MLC) over the years. However, the increasing data volume in applications such as social networks, sensor networks, and traffic monitoring poses many challenges to batch MLC learning. For example, it is often expensive to re-train the model on newly arrived samples, or impractical to learn from a large volume of data at once. Incremental learning is therefore better suited to large data volumes, and especially to data streams. In this study, we develop a Bayesian-based method for learning from multi-label data streams that takes into account both the correlation between pairs of labels and the relationship between labels and features. In our model, not only is the label correlation learned from each arriving sample with ground-truth labels, but the number of predicted labels is also adjusted based on the Hoeffding inequality and the label cardinality. We also extend the model to handle missing values, a problem common in many real-world datasets. To handle concept drift, we propose a decay mechanism based on the age of the arrived samples so that the model incrementally adapts to changes in the data. The experimental results show that our method is highly competitive with several well-known benchmark algorithms under both stationary and concept-drift settings.
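    The decay mechanism in the abstract can be sketched as time-decayed sufficient statistics: each incoming sample multiplies the stored counts by a fading factor before adding its own contribution, so older samples fade and the estimates track drift. The Python below is a minimal sketch under that assumption; the class name, update rule, and Laplace smoothing are illustrative, not the paper's actual implementation.

    ```python
    class DecayedLabelStats:
        """Time-decayed counts for streaming label-prior estimation.
        Older samples are geometrically down-weighted by `decay`, so the
        estimates adapt to concept drift. (Illustrative sketch only.)"""

        def __init__(self, labels, decay=0.99):
            self.decay = decay
            self.weight = 0.0                        # decayed total sample weight
            self.counts = {lbl: 0.0 for lbl in labels}

        def update(self, present):
            """Observe one sample; `present` is the set of its true labels."""
            self.weight = self.decay * self.weight + 1.0
            for lbl in self.counts:
                self.counts[lbl] *= self.decay
                if lbl in present:
                    self.counts[lbl] += 1.0

        def prior(self, lbl, alpha=1.0):
            # Laplace-smoothed decayed estimate of P(lbl)
            return (self.counts[lbl] + alpha) / (self.weight + 2.0 * alpha)
    ```

    A decay of 1.0 recovers plain cumulative counting; values below 1.0 trade statistical efficiency on stationary streams for faster adaptation after a drift.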




    Published In

    Pattern Recognition  Volume 90, Issue C
    Jun 2019
    484 pages

    Publisher

    Elsevier Science Inc.

    United States


    Author Tags

    1. Multi-label classification
    2. Multi-label learning
    3. Online learning
    4. Data stream
    5. Concept drift
    6. Label correlation
    7. Feature dependence

    Qualifiers

    • Research-article


    Cited By

    • (2023) BGNN-XML: Bilateral Graph Neural Networks for Extreme Multi-Label Text Classification, IEEE Trans. Knowl. Data Eng. 35 (7) (2023) 6698–6709, doi: 10.1109/TKDE.2022.3193657
    • (2023) Spatial Context-Aware Object-Attentional Network for Multi-Label Image Classification, IEEE Trans. Image Process. 32 (2023) 3000–3012, doi: 10.1109/TIP.2023.3266161
    • (2023) A Multi-label Imbalanced Data Classification Method Based on Label Partition Integration, in: Web Information Systems and Applications, 2023, pp. 14–25, doi: 10.1007/978-981-99-6222-8_2
    • (2022) Global and local attention-based multi-label learning with missing labels, Inf. Sci. 594 (C) (2022) 20–42, doi: 10.1016/j.ins.2022.02.022
    • (2021) A new self-organizing map based algorithm for multi-label stream classification, in: Proceedings of the 36th Annual ACM Symposium on Applied Computing, 2021, pp. 418–426, doi: 10.1145/3412841.3441922
