Research article, DOI: 10.1145/3219819.3220038

Multi-label Learning with Highly Incomplete Data via Collaborative Embedding

Published: 19 July 2018

Abstract

Tremendous effort has been dedicated to improving the effectiveness of multi-label learning with incomplete label assignments. Most current techniques assume that the input features of data instances are complete. Nevertheless, the co-occurrence of highly incomplete features and weak label assignments is a challenging and widely encountered issue in real-world multi-label learning applications, for practical reasons that include incomplete data collection and moderate labels from annotators. Existing multi-label learning algorithms are not directly applicable when the observed features are highly incomplete. In this work, we attack this problem by proposing a weakly supervised multi-label learning approach based on the idea of collaborative embedding. This approach provides a flexible framework for efficient multi-label classification in both transductive and inductive modes by coupling the reconstruction of missing features and of weak label assignments in a joint optimisation framework. It is designed to collaboratively recover feature and label information and to extract the predictive association between the feature profile and the multi-label tag of the same data instance. Extensive experiments on public benchmark datasets and real security event data validate that our proposed method can provide distinctly more accurate transductive and inductive classification than other state-of-the-art algorithms.
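
The abstract does not give the objective function, so the following is only a generic sketch of the idea of coupling feature and label reconstruction through a shared embedding, not the authors' algorithm. It assumes a masked squared loss, plain gradient descent, and hypothetical names (`joint_factorize`, `embed_new`, `mask_X`, `mask_Y`) chosen for illustration. Each instance gets one latent vector that must explain both its observed features and its observed labels; a new instance is embedded from its observed features alone, which loosely corresponds to the inductive mode.

```python
import numpy as np


def joint_factorize(X, mask_X, Y, mask_Y, rank=3, lam=0.05, lr=0.01,
                    steps=2000, seed=0):
    """Fit a shared instance embedding U with feature factors V and label factors W.

    Illustrative objective (an assumption, not taken from the paper):
        0.5 * ||mask_X * (U V^T - X)||^2 + 0.5 * ||mask_Y * (U W^T - Y)||^2
        + 0.5 * lam * (||U||^2 + ||V||^2 + ||W||^2)
    Entries whose mask value is 0 are treated as unobserved and ignored.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = Y.shape[1]
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((d, rank))
    W = 0.1 * rng.standard_normal((k, rank))
    for _ in range(steps):
        # Residuals restricted to the observed entries.
        Rx = mask_X * (U @ V.T - X)
        Ry = mask_Y * (U @ W.T - Y)
        # U receives gradient from both reconstructions: this is the coupling.
        gU = Rx @ V + Ry @ W + lam * U
        gV = Rx.T @ U + lam * V
        gW = Ry.T @ U + lam * W
        U -= lr * gU
        V -= lr * gV
        W -= lr * gW
    return U, V, W


def embed_new(x, m, V, lam=0.05):
    """Embed an unseen instance from its observed features only (ridge solve)."""
    obs = m.astype(bool)
    Vo = V[obs]
    A = Vo.T @ Vo + lam * np.eye(V.shape[1])
    return np.linalg.solve(A, Vo.T @ x[obs])
```

Under these assumptions, `U @ W.T` gives label scores for the training instances (the transductive case), and `W @ embed_new(x, m, V)` gives scores for an unseen instance with feature vector `x` and observation mask `m`.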

Supplementary Material

suppl.mov (r0791o.mp4)
Supplemental video
MP4 File (han_multi-label_embedding.mp4)

      Published In

      KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
      July 2018
      2925 pages
      ISBN: 9781450355520
      DOI: 10.1145/3219819
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. highly incomplete feature
      2. multi-label learning
      3. weak labels

      Qualifiers

      • Research-article

      Conference

      KDD '18
      Acceptance Rates

      KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

      Article Metrics

      • Downloads (last 12 months): 39
      • Downloads (last 6 weeks): 5
      Reflects downloads up to 09 Nov 2024

      Cited By

      • (2023) Bi-directional matrix completion for highly incomplete multi-label learning via co-embedding predictive side information. Applied Intelligence 53:23 (28074-28098). DOI: 10.1007/s10489-023-05004-6. Online publication date: 22-Sep-2023
      • (2022) The Emerging Trends of Multi-Label Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 44:11 (7955-7974). DOI: 10.1109/TPAMI.2021.3119334. Online publication date: 1-Nov-2022
      • (2022) Two-stage-neighborhood-based multilabel classification for incomplete data with missing labels. International Journal of Intelligent Systems 37:10 (6773-6810). DOI: 10.1002/int.22861. Online publication date: Mar-2022
      • (2021) Multi-Label Learning from Single Positive Labels. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (933-942). DOI: 10.1109/CVPR46437.2021.00099. Online publication date: Jun-2021
      • (2020) Beyond missing: weakly-supervised multi-label learning with incomplete and noisy labels. Applied Intelligence. DOI: 10.1007/s10489-020-01878-y. Online publication date: 29-Sep-2020
      • (2020) Partial multi-label learning with noisy side information. Knowledge and Information Systems. DOI: 10.1007/s10115-020-01527-3. Online publication date: 23-Nov-2020
      • (2020) Partial Multi-label Learning with Label and Feature Collaboration. Database Systems for Advanced Applications (621-637). DOI: 10.1007/978-3-030-59410-7_41. Online publication date: 18-Sep-2020
      • (2020) How Hard Is Completeness Reasoning for Conjunctive Queries? Computing and Combinatorics (149-161). DOI: 10.1007/978-3-030-58150-3_12. Online publication date: 27-Aug-2020
      • (2019) Robust Multi-Label Learning with Corrupted Features and Incomplete Labels. 2019 Chinese Automation Congress (CAC) (4411-4416). DOI: 10.1109/CAC48633.2019.8996261. Online publication date: Nov-2019
