
SCOAL: A framework for simultaneous co-clustering and learning from complex data

Published: 22 October 2010

Abstract

For difficult classification or regression problems, practitioners often segment the data into relatively homogeneous groups and then build a predictive model for each group. This two-step procedure usually results in simpler, more interpretable and actionable models without any loss in accuracy. In this work, we consider problems such as predicting customer behavior across products, where the independent variables can be naturally partitioned into two sets, that is, the data is dyadic in nature. A pivoting operation now results in the dependent variable showing up as entries in a “customer by product” data matrix. We present the Simultaneous CO-clustering And Learning (SCOAL) framework, based on the key idea of interleaving co-clustering and construction of prediction models to iteratively improve both cluster assignment and fit of the models. This algorithm provably converges to a local minimum of a suitable cost function. The framework not only generalizes co-clustering and collaborative filtering to model-based co-clustering, but can also be viewed as simultaneous co-segmentation and classification or regression, which is typically better than independently clustering the data first and then building models. Moreover, it applies to a wide range of bi-modal or multimodal data, and can be easily specialized to address classification and regression problems. We demonstrate the effectiveness of our approach on both these problems through experimentation on a variety of datasets.
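The alternating scheme the abstract describes — refit one predictive model per co-cluster, then move rows and columns to the clusters whose models fit them best, repeating until the assignments stop changing — can be sketched as follows. This is only a toy illustration under assumptions we are adding (squared-error loss, one linear regression per co-cluster, a fully observed matrix); `scoal_sketch` and all parameter names are ours, not the paper's notation or its actual algorithm.

```python
import numpy as np

def scoal_sketch(Z, Xr, Xc, k, l, n_iters=20, seed=0):
    """Toy SCOAL-style alternating minimization (illustrative sketch).

    Z  : (m, n) dyadic response matrix (e.g. customer x product), fully observed
    Xr : (m, p) row covariates; Xc : (n, q) column covariates
    k, l : number of row and column clusters
    """
    rng = np.random.default_rng(seed)
    m, n = Z.shape
    r = rng.integers(0, k, size=m)        # row-cluster labels
    c = rng.integers(0, l, size=n)        # column-cluster labels
    d = Xr.shape[1] + Xc.shape[1] + 1     # row feats + col feats + bias
    W = np.zeros((k, l, d))               # one linear model per co-cluster

    def design(ii, jj):
        # Feature vector for each cell (i, j): [Xr[i], Xc[j], 1].
        return np.hstack([Xr[ii], Xc[jj], np.ones((len(ii), 1))])

    for _ in range(n_iters):
        # 1) Refit each co-cluster's model by least squares on its cells.
        for g in range(k):
            for h in range(l):
                ii, jj = np.where(np.outer(r == g, c == h))
                if len(ii):
                    A = design(ii, jj)
                    W[g, h] = np.linalg.lstsq(A, Z[ii, jj], rcond=None)[0]

        # 2) Move each row to the row cluster with the lowest squared error,
        #    holding column labels and models fixed; then do the same for columns.
        A_rows = [design(np.full(n, i), np.arange(n)) for i in range(m)]
        r_new = np.array([
            np.argmin([np.sum((Z[i] - np.einsum('jd,jd->j', A_rows[i], W[g, c])) ** 2)
                       for g in range(k)])
            for i in range(m)])
        A_cols = [design(np.arange(m), np.full(m, j)) for j in range(n)]
        c_new = np.array([
            np.argmin([np.sum((Z[:, j] - np.einsum('id,id->i', A_cols[j], W[r_new, h])) ** 2)
                       for h in range(l)])
            for j in range(n)])
        if np.array_equal(r_new, r) and np.array_equal(c_new, c):
            break                          # assignments stable: a local minimum
        r, c = r_new, c_new
    return r, c, W
```

Each of the two steps can only lower (or keep) the total squared error, which is why this kind of alternation converges to a local minimum of the cost, as the abstract states for the real framework.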




Published In

ACM Transactions on Knowledge Discovery from Data, Volume 4, Issue 3 (October 2010), 191 pages.
ISSN: 1556-4681; EISSN: 1556-472X
DOI: 10.1145/1839490

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 October 2010
    Accepted: 01 November 2009
    Revised: 01 September 2009
    Received: 01 January 2009
    Published in TKDD Volume 4, Issue 3


    Author Tags

    1. Predictive modeling
    2. classification
    3. co-clustering
    4. dyadic data
    5. multimodal data
    6. regression

    Qualifiers

    • Research-article
    • Research
    • Refereed


Cited By

• (2024) A Survey of Co-Clustering. ACM Transactions on Knowledge Discovery from Data 18, 9, 1-28. DOI: 10.1145/3681793
• (2024) An artificial intelligence algorithm for co-clustering to help in pharmacovigilance before and during the COVID-19 pandemic. British Journal of Clinical Pharmacology 90, 5, 1258-1267. DOI: 10.1111/bcp.16012
• (2023) Hierarchical co-clustering with augmented matrices from external domains. Pattern Recognition 142, 109657. DOI: 10.1016/j.patcog.2023.109657
• (2022) Phishing Website Detection With Semantic Features Based on Machine Learning Classifiers. International Journal on Semantic Web & Information Systems 18, 1, 1-24. DOI: 10.4018/IJSWIS.297032
• (2022) Semantic Trajectory Frequent Pattern Mining Model. International Journal on Semantic Web & Information Systems 18, 1, 1-20. DOI: 10.4018/IJSWIS.297031
• (2022) A Context-Independent Ontological Linked Data Alignment Approach to Instance Matching. International Journal on Semantic Web & Information Systems 18, 1, 1-29. DOI: 10.4018/IJSWIS.295977
• (2022) Mc-DNN. International Journal on Semantic Web & Information Systems 18, 1, 1-20. DOI: 10.4018/IJSWIS.295553
• (2022) Doc2KG. International Journal on Semantic Web & Information Systems 18, 1, 1-20. DOI: 10.4018/IJSWIS.295552
• (2022) A Short Survey on the User Cold Start Problem in Recommender Systems: Metadata and Meta-Learning Methods. In 2022 IEEE International Conference on Big Data (Big Data), 3928-3934. DOI: 10.1109/BigData55660.2022.10020294
• (2022) A storage computing architecture with multiple NDP devices for accelerating compaction performance in LSM-tree based KV stores. Journal of Systems Architecture 130, C. DOI: 10.1016/j.sysarc.2022.102681
