DOI: 10.1145/3511808.3557261

Consistent, Balanced, and Overlapping Label Trees for Extreme Multi-label Learning

Published: 17 October 2022

Abstract

    The emerging eXtreme Multi-label Learning (XML) aims to induce multi-label predictive models from big datasets with extremely large numbers of instances, features, and, especially, labels. To meet the severe efficiency challenge of XML, one flexible solution is the label tree methodology: as its name suggests, a label tree is a tree hierarchy of label subsets that partitions the original large-scale XML problem into a number of small-scale sub-problems (the leaf nodes), reducing prediction complexity to logarithmic time. Ideally, a label tree should accurately route future instances to the right leaf nodes (effectiveness) and generate balanced leaf nodes (efficiency). To achieve this, we propose a novel generic label tree method, namely the Consistent, Balanced, and Overlapping Label Tree (CBOLT). To enhance precision, we employ weighted clustering to partition non-leaf nodes and allow overlapping label subsets, which alleviates the inconsistent-path and disjoint-label-subset issues. To improve efficiency, we introduce the new concept of a balanced problem scale and implement it as a balance regularizer for non-leaf node partitioning. We conduct extensive experiments on several benchmark XML datasets. Empirical results demonstrate that CBOLT is superior to existing label tree methods; moreover, it can be applied to existing XML methods and achieves competitive performance against strong baselines.
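    The label-tree construction described above can be sketched as follows. This is a minimal illustration of the generic idea only (recursive, balance-enforced 2-means over label representations, with a small fraction of boundary labels overlapping into both children), not the authors' CBOLT implementation; all function and parameter names are hypothetical.

    ```python
    import numpy as np

    def build_label_tree(label_vecs, labels, max_leaf_size=8, overlap=0.1, rng=None):
        """Recursively split labels into a binary label tree (illustrative sketch).

        Each non-leaf node runs a few steps of balance-enforced 2-means over the
        labels' feature vectors, then copies a small fraction of boundary labels
        into both children (the "overlapping" part).
        """
        if rng is None:
            rng = np.random.default_rng(0)
        if len(labels) <= max_leaf_size:
            return {"leaf": True, "labels": list(labels)}

        X = np.asarray(label_vecs, dtype=float)[labels]
        n = len(labels)
        c = X[rng.choice(n, size=2, replace=False)].copy()
        for _ in range(10):
            # Signed margin: positive means "closer to centroid 0".
            margin = (np.linalg.norm(X - c[1], axis=1)
                      - np.linalg.norm(X - c[0], axis=1))
            order = np.argsort(-margin)
            left_mask = np.zeros(n, dtype=bool)
            left_mask[order[: n // 2]] = True  # rank split keeps the partition balanced
            c[0] = X[left_mask].mean(axis=0)
            c[1] = X[~left_mask].mean(axis=0)

        # Overlap: labels nearest the split boundary go to both children.
        k = int(overlap * n)
        boundary = np.argsort(np.abs(margin))[:k]
        left = sorted(set(np.flatnonzero(left_mask)) | set(boundary))
        right = sorted(set(np.flatnonzero(~left_mask)) | set(boundary))
        return {
            "leaf": False,
            "children": [
                build_label_tree(label_vecs, [labels[i] for i in left],
                                 max_leaf_size, overlap, rng),
                build_label_tree(label_vecs, [labels[i] for i in right],
                                 max_leaf_size, overlap, rng),
            ],
        }

    def leaf_label_sets(node):
        """Collect the label subset held by every leaf."""
        if node["leaf"]:
            return [set(node["labels"])]
        return sum((leaf_label_sets(ch) for ch in node["children"]), [])
    ```

    The rank-based split enforces balance (each child gets roughly half the labels, so tree depth stays logarithmic in the number of labels), while the overlap step addresses the disjoint-subset issue: a label sitting near the cluster boundary is reachable through either child, so a routing mistake at that node need not be fatal.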



    Published In

    CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
    October 2022
    5274 pages
    ISBN: 9781450392365
    DOI: 10.1145/3511808
    General Chairs: Mohammad Al Hasan, Li Xiong

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. balanced regularization
    2. consistent path
    3. extreme multi-label learning
    4. label tree

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China

    Conference

    CIKM '22

    Acceptance Rates

    CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
