DOI: 10.1145/3511808.3557261

Consistent, Balanced, and Overlapping Label Trees for Extreme Multi-label Learning

Published: 17 October 2022

Abstract

    The emerging eXtreme Multi-label Learning (XML) aims to induce multi-label predictive models from big datasets with extremely large numbers of instances, features, and, especially, labels. To meet the severe efficiency challenge of XML, one flexible solution is the label tree methodology: as its name suggests, a label tree is a tree hierarchy of label subsets that partitions the original large-scale XML problem into a number of small-scale sub-problems (the leaf nodes), reducing prediction complexity to logarithmic time. Ideally, a label tree should accurately route future instances to the right leaf nodes (effectiveness) and generate balanced leaf nodes (efficiency). To achieve this, we propose a novel generic label tree method, namely the Consistent, Balanced, and Overlapping Label Tree (CBOLT). To enhance precision, we employ weighted clustering to partition non-leaf nodes and allow overlapping label subsets, which alleviates the inconsistent-path and disjoint-label-subset issues. To improve efficiency, we introduce the new concept of a balanced problem scale and implement it as a balance regularizer for non-leaf node partitioning. We conduct extensive experiments on several benchmark XML datasets. Empirical results demonstrate that CBOLT is superior to existing label tree methods; moreover, it can be applied to existing XML methods and achieves competitive performance against strong baselines.
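    The label-tree construction described above can be sketched as follows. This is a minimal illustration of the generic idea only (recursive, balance-enforced 2-means over label representations, with a small fraction of boundary labels overlapping into both children), not the authors' CBOLT implementation; all function and parameter names are hypothetical.

    ```python
    import numpy as np

    def build_label_tree(label_vecs, labels, max_leaf_size=8, overlap=0.1, rng=None):
        """Recursively split labels into a binary label tree (illustrative sketch).

        Each non-leaf node runs a few steps of balance-enforced 2-means over the
        labels' feature vectors, then copies a small fraction of boundary labels
        into both children (the "overlapping" part).
        """
        if rng is None:
            rng = np.random.default_rng(0)
        if len(labels) <= max_leaf_size:
            return {"leaf": True, "labels": list(labels)}

        X = np.asarray(label_vecs, dtype=float)[labels]
        n = len(labels)
        c = X[rng.choice(n, size=2, replace=False)].copy()
        for _ in range(10):
            # Signed margin: positive means "closer to centroid 0".
            margin = (np.linalg.norm(X - c[1], axis=1)
                      - np.linalg.norm(X - c[0], axis=1))
            order = np.argsort(-margin)
            left_mask = np.zeros(n, dtype=bool)
            left_mask[order[: n // 2]] = True  # rank split keeps the partition balanced
            c[0] = X[left_mask].mean(axis=0)
            c[1] = X[~left_mask].mean(axis=0)

        # Overlap: labels nearest the split boundary go to both children.
        k = int(overlap * n)
        boundary = np.argsort(np.abs(margin))[:k]
        left = sorted(set(np.flatnonzero(left_mask)) | set(boundary))
        right = sorted(set(np.flatnonzero(~left_mask)) | set(boundary))
        return {
            "leaf": False,
            "children": [
                build_label_tree(label_vecs, [labels[i] for i in left],
                                 max_leaf_size, overlap, rng),
                build_label_tree(label_vecs, [labels[i] for i in right],
                                 max_leaf_size, overlap, rng),
            ],
        }

    def leaf_label_sets(node):
        """Collect the label subset held by every leaf."""
        if node["leaf"]:
            return [set(node["labels"])]
        return sum((leaf_label_sets(ch) for ch in node["children"]), [])
    ```

    The rank-based split enforces balance (each child gets roughly half the labels, so tree depth stays logarithmic in the number of labels), while the overlap step addresses the disjoint-subset issue: a label sitting near the cluster boundary is reachable through either child, so a routing mistake at that node need not be fatal.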



    Published In

    CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
    October 2022
    5274 pages
    ISBN: 9781450392365
    DOI: 10.1145/3511808
    General Chairs: Mohammad Al Hasan, Li Xiong

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. balanced regularization
    2. consistent path
    3. extreme multi-label learning
    4. label tree

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China

    Conference

    CIKM '22

    Acceptance Rates

    CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
