DOI: 10.1145/2623330.2623651
research-article

FastXML: a fast, accurate and stable tree-classifier for extreme multi-label learning

Published: 24 August 2014

Abstract

The objective in extreme multi-label classification is to learn a classifier that can automatically tag a data point with the most relevant subset of labels from a large label set. Extreme multi-label classification is an important research problem since not only does it enable the tackling of applications with many labels but it also allows the reformulation of ranking problems with certain advantages over existing formulations. Our objective, in this paper, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [2] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [35]. MLRF and LPSR learn a hierarchy to deal with the large number of labels but optimize task-independent measures, such as the Gini index or clustering error, in order to learn the hierarchy. Our proposed FastXML algorithm achieves significantly higher accuracies by directly optimizing an nDCG-based ranking loss function. We also develop an alternating minimization algorithm for efficiently optimizing the proposed formulation. Experiments reveal that FastXML can be trained on problems with more than a million labels on a standard desktop in eight hours using a single core and in an hour using multiple cores.
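For readers unfamiliar with the ranking measure named in the abstract, the following is a minimal sketch of the standard nDCG@k definition for binary label relevance with a log2 rank discount. It is not taken from the paper: the abstract does not give FastXML's exact loss, and the function names `dcg_at_k` and `ndcg_at_k` and the toy labels are hypothetical.

```python
import math


def dcg_at_k(ranked_labels, relevant, k):
    """Discounted cumulative gain of the top-k predictions (binary relevance).

    Assumption: gain is 1 for a relevant label and 0 otherwise, and the
    discount at rank r (1-based) is 1 / log2(r + 1).
    """
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, label in enumerate(ranked_labels[:k], start=1)
        if label in relevant
    )


def ndcg_at_k(ranked_labels, relevant, k):
    """DCG@k divided by the DCG@k of an ideal ranking, so the score lies in [0, 1]."""
    ideal_hits = min(k, len(relevant))
    if ideal_hits == 0:
        return 0.0  # no relevant labels: define the score as 0
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg_at_k(ranked_labels, relevant, k) / ideal


# Toy example: two relevant labels, ranked first versus ranked after an irrelevant one.
relevant = {"a", "b"}
perfect = ndcg_at_k(["a", "b", "c"], relevant, k=3)
late = ndcg_at_k(["c", "a", "b"], relevant, k=3)
print(perfect, late)
```

Because the discount shrinks with rank, a ranking that places relevant labels near the top scores higher than one that places them lower. A rank-insensitive measure such as the Gini index does not capture this, which is the contrast the abstract draws.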

Supplementary Material

MP4 File (p263-sidebyside.mp4)

References

[1] Wikipedia dataset for the 4th large scale hierarchical text classification challenge. http://lshtc.iit.demokritos.gr/.
[2] R. Agrawal, A. Gupta, Y. Prabhu, and M. Varma. Multi-label learning with millions of labels: Recommending advertiser bid phrases for web pages. In WWW, pages 13--24, 2013.
[3] G. Andrew and J. Gao. Scalable training of L1-regularized log-linear models. In ICML, pages 33--40, 2007.
[4] K. Balasubramanian and G. Lebanon. The landmark selection method for multiple output prediction. In ICML, 2012.
[5] S. Bengio, J. Weston, and D. Grangier. Label embedding trees for large multi-class tasks. In NIPS, 2010.
[6] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.
[7] W. Bi and J. T.-Y. Kwok. Multilabel classification on tree- and DAG-structured hierarchies. In ICML, 2011.
[8] W. Bi and J. T.-Y. Kwok. Efficient multi-label classification with many labels. In ICML, pages 405--413, 2013.
[9] N. Cesa-Bianchi, C. Gentile, and L. Zaniboni. Incremental algorithms for hierarchical classification. JMLR, 7, 2006.
[10] Y.-N. Chen and H.-T. Lin. Feature-aware label space dimension reduction for multi-label classification. In NIPS, pages 1538--1546, 2012.
[11] A. Choromanska and J. Langford. Logarithmic time online multiclass prediction. http://arxiv.org/abs/1406.1822, 2014.
[12] M. Cisse, N. Usunier, T. Artieres, and P. Gallinari. Robust Bloom filters for large multilabel classification tasks. In NIPS, pages 1851--1859, 2013.
[13] J. Deng, S. Satheesh, A. C. Berg, and F. Li. Fast and balanced: Efficient label tree learning for large scale object recognition. In NIPS, 2011.
[14] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. JMLR, 9:1871--1874, 2008.
[15] C.-S. Feng and H.-T. Lin. Multi-label classification with error-correcting codes. JMLR, pages 289--295, 2011.
[16] T. Gao and D. Koller. Discriminative learning of relaxed hierarchy for large-scale visual recognition. In ICCV, pages 2072--2079, 2011.
[17] P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. ML, pages 3--42, 2006.
[18] B. Hariharan, S. V. N. Vishwanathan, and M. Varma. Efficient max-margin multi-label classification with applications to zero-shot learning. ML, 2012.
[19] D. Hsu, S. Kakade, J. Langford, and T. Zhang. Multi-label prediction via compressed sensing. In NIPS, 2009.
[20] S. Ji, L. Tang, S. Yu, and J. Ye. Extracting shared subspace for multi-label classification. In KDD, pages 381--389, 2008.
[21] C. Jose, P. Goyal, P. Aggrwal, and M. Varma. Local deep kernel learning for efficient non-linear SVM prediction. In ICML, June 2013.
[22] A. Kapoor, R. Viswanathan, and P. Jain. Multilabel classification using Bayesian compressed sensing. In NIPS, 2012.
[23] I. Katakis, G. Tsoumakas, and I. Vlahavas. Multilabel text classification for automated tag suggestion. In ECML/PKDD Discovery Challenge, 2008.
[24] K. Koh, S.-J. Kim, and S. Boyd. An interior-point method for large-scale L1-regularized logistic regression. JMLR, 8:1519--1555, 2007.
[25] A. Kustarev, Y. Ustinovsky, Y. Logachev, E. Grechnikov, I. Segalovich, and P. Serdyukov. Smoothing nDCG metrics using tied scores. In CIKM, pages 2053--2056, 2011.
[26] P. D. Ravikumar, A. Tewari, and E. Yang. On nDCG consistency of listwise ranking methods. In AISTATS, pages 618--626, 2011.
[27] J. Rousu, C. Saunders, S. Szedmak, and J. Shawe-Taylor. Kernel-based learning of hierarchical multilabel classification models. JMLR, 7, 2006.
[28] C. Snoek, M. Worring, J. van Gemert, J.-M. Geusebroek, and A. Smeulders. The challenge problem for automated detection of 101 semantic concepts in multimedia. In ACM Multimedia, pages 421--430, 2006.
[29] F. Tai and H.-T. Lin. Multi-label classification with principal label space transformation. In Workshop proceedings of learning from multi-label data, 2010.
[30] G. Tsoumakas, I. Katakis, and I. Vlahavas. Effective and efficient multilabel classification in domains with large number of labels. In ECML/PKDD 2008 Workshop on Mining Multidimensional Data, 2008.
[31] H. Valizadegan, R. Jin, R. Zhang, and J. Mao. Learning to rank by optimizing nDCG measure. In SIGIR, pages 41--48, 2000.
[32] M. N. Volkovs and R. S. Zemel. BoltzRank: Learning to maximize expected ranking gain. In ICML, pages 1089--1096, 2009.
[33] Y. Wang, L. Wang, Y. Li, D. He, and T.-Y. Liu. A theoretical analysis of nDCG type ranking measures. In COLT, pages 25--54, 2013.
[34] J. Weston, S. Bengio, and N. Usunier. Wsabie: Scaling up to large vocabulary image annotation. In IJCAI, 2011.
[35] J. Weston, A. Makadia, and H. Yee. Label partitioning for sublinear ranking. In ICML, volume 28, pages 181--189, 2013.
[36] H.-F. Yu, P. Jain, P. Kar, and I. S. Dhillon. Large-scale multi-label learning with missing labels. In ICML, 2014.
[37] G.-X. Yuan, C.-H. Ho, and C.-J. Lin. An improved GLMNET for L1-regularized logistic regression. JMLR, 13:1999--2030, 2012.
[38] Y. Zhang and J. G. Schneider. Multi-label output codes using canonical correlation analysis. In AISTATS, pages 873--882, 2011.

    Published In

    KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2014
    2028 pages
    ISBN:9781450329569
    DOI:10.1145/2623330
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. extreme classification
    2. multi-label learning
    3. ranking

    Qualifiers

    • Research-article

    Funding Sources

    • TCS PhD fellowship

    Conference

    KDD '14
    Acceptance Rates

    KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Article Metrics

    • Downloads (last 12 months): 64
    • Downloads (last 6 weeks): 5
    Reflects downloads up to 15 Oct 2024

    Cited By

    • (2024) Information Retrieval and Machine Learning Methods for Academic Expert Finding. Algorithms 17:2 (51). DOI: 10.3390/a17020051. Online publication date: 23-Jan-2024
    • (2024) Follow the Path: Hierarchy-Aware Extreme Multi-Label Completion for Semantic Text Tagging. Proceedings of the ACM Web Conference 2024 (2094-2105). DOI: 10.1145/3589334.3645558. Online publication date: 13-May-2024
    • (2024) MatchXML: An Efficient Text-Label Matching Framework for Extreme Multi-Label Text Classification. IEEE Transactions on Knowledge and Data Engineering 36:9 (4781-4793). DOI: 10.1109/TKDE.2024.3374750. Online publication date: Sep-2024
    • (2024) Learning Label-Adaptive Representation for Large-Scale Multi-Label Text Classification. IEEE/ACM Transactions on Audio, Speech and Language Processing 32 (2630-2640). DOI: 10.1109/TASLP.2024.3393722. Online publication date: 1-Jan-2024
    • (2024) Leveraging Pre-Trained Extreme Multi-Label Classifiers for Zero-Shot Learning. 2024 11th IEEE Swiss Conference on Data Science (SDS) (233-236). DOI: 10.1109/SDS60720.2024.00041. Online publication date: 30-May-2024
    • (2024) Exploring local interpretability in dimensionality reduction. Expert Systems with Applications: An International Journal 252:PA. DOI: 10.1016/j.eswa.2024.124074. Online publication date: 24-Jul-2024
    • (2024) Learning cluster-wise label distribution for label enhancement. International Journal of Machine Learning and Cybernetics. DOI: 10.1007/s13042-024-02343-9. Online publication date: 27-Aug-2024
    • (2024) Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-label Classification. International Journal of Computer Vision. DOI: 10.1007/s11263-024-02157-w. Online publication date: 26-Jul-2024
    • (2024) TLC-XML: Transformer with Label Correlation for Extreme Multi-label Text Classification. Neural Processing Letters 56:1. DOI: 10.1007/s11063-024-11460-z. Online publication date: 10-Feb-2024
    • (2024) Collaborative learning of supervision and correlation for generalized zero-shot extreme multi-label learning. Applied Intelligence 54:8 (6285-6298). DOI: 10.1007/s10489-024-05498-8. Online publication date: 9-May-2024
