Multi-label Learning with Highly Incomplete Data via Collaborative Embedding

Authors:

Xiangliang Zhang

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 1494 - 1503

https://doi.org/10.1145/3219819.3220038

Published: 19 July 2018

Abstract

Tremendous effort has been dedicated to improving the effectiveness of multi-label learning with incomplete label assignments. Most current techniques assume that the input features of data instances are complete. Nevertheless, the co-occurrence of highly incomplete features and weak label assignments is a challenging and widely observed issue in real-world multi-label learning applications, for practical reasons that include incomplete data collection and moderate labels from annotators. Existing multi-label learning algorithms are not directly applicable when the observed features are highly incomplete. In this work, we attack this problem by proposing a weakly supervised multi-label learning approach based on the idea of collaborative embedding. This approach provides a flexible framework for efficient multi-label classification in both transductive and inductive modes by coupling the reconstruction of missing features and of weak label assignments in a joint optimisation framework. It is designed to collaboratively recover feature and label information, and to extract the predictive association between the feature profile and the multi-label tag of the same data instance. Extensive experiments on public benchmark datasets and real security event data validate that our proposed method provides distinctly more accurate transductive and inductive classification than other state-of-the-art algorithms.
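The abstract does not state the objective function, so the following is only a minimal sketch of the general idea of coupling feature and label reconstruction through a shared embedding; it is not the paper's algorithm. It assumes a shared low-rank instance embedding `U`, a masked squared loss on both the feature matrix and the label matrix, L2 regularisation, and plain gradient descent. The names `collaborative_embedding`, `embed_new_instance`, `rank`, `lam`, and `lr` are all hypothetical.

```python
import numpy as np


def collaborative_embedding(X, X_mask, Y, Y_mask, rank=5, lam=0.1,
                            lr=0.01, n_iter=300, seed=0):
    """Jointly factorise partially observed features X and labels Y.

    Assumption (not from the paper): X ~ U @ V.T and Y ~ U @ W.T share the
    instance embedding U; masks hold 1 for observed entries and 0 otherwise.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = Y.shape[1]
    U = 0.1 * rng.standard_normal((n, rank))  # instance embeddings
    V = 0.1 * rng.standard_normal((d, rank))  # feature embeddings
    W = 0.1 * rng.standard_normal((m, rank))  # label embeddings
    losses = []
    for _ in range(n_iter):
        # Residuals are computed on observed entries only.
        Rx = X_mask * (U @ V.T - X)
        Ry = Y_mask * (U @ W.T - Y)
        reg = (U ** 2).sum() + (V ** 2).sum() + (W ** 2).sum()
        losses.append(0.5 * (Rx ** 2).sum() + 0.5 * (Ry ** 2).sum()
                      + 0.5 * lam * reg)
        # Gradients of the joint objective; U receives signal from both terms.
        gU = Rx @ V + Ry @ W + lam * U
        gV = Rx.T @ U + lam * V
        gW = Ry.T @ U + lam * W
        U -= lr * gU
        V -= lr * gV
        W -= lr * gW
    return U, V, W, losses


def embed_new_instance(x, x_mask, V, lam=0.1):
    """Inductive step: ridge-regress an unseen instance onto V using only
    its observed features, returning its embedding u."""
    obs = x_mask.astype(bool)
    Vo = V[obs]
    A = Vo.T @ Vo + lam * np.eye(V.shape[1])
    return np.linalg.solve(A, Vo.T @ x[obs])
```

In this sketch, transductive predictions for the training instances would be read from `U @ W.T` and completed features from `U @ V.T`, while label scores for an unseen instance would be `embed_new_instance(x, x_mask, V) @ W.T`. Treating every unobserved label as simply missing is a simplification; a weak-label setting in which only some positive labels are observed would call for a different loss.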

Index Terms

Multi-label Learning with Highly Incomplete Data via Collaborative Embedding
1. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Inductive inference
      2. Semi-supervised learning

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

July 2018

2925 pages

ISBN: 9781450355520

DOI: 10.1145/3219819

General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)


Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 19 - 23, 2018

London, United Kingdom

Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
