
SCOAL: A framework for simultaneous co-clustering and learning from complex data

Published: 22 October 2010

Abstract

For difficult classification or regression problems, practitioners often segment the data into relatively homogeneous groups and then build a predictive model for each group. This two-step procedure usually results in simpler, more interpretable and actionable models without any loss in accuracy. In this work, we consider problems such as predicting customer behavior across products, where the independent variables can be naturally partitioned into two sets, that is, the data is dyadic in nature. A pivoting operation now results in the dependent variable showing up as entries in a “customer by product” data matrix. We present the Simultaneous CO-clustering And Learning (SCOAL) framework, based on the key idea of interleaving co-clustering and construction of prediction models to iteratively improve both cluster assignment and fit of the models. This algorithm provably converges to a local minimum of a suitable cost function. The framework not only generalizes co-clustering and collaborative filtering to model-based co-clustering, but can also be viewed as simultaneous co-segmentation and classification or regression, which is typically better than independently clustering the data first and then building models. Moreover, it applies to a wide range of bi-modal or multimodal data, and can be easily specialized to address classification and regression problems. We demonstrate the effectiveness of our approach on both these problems through experimentation on a variety of datasets.
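The alternating scheme the abstract describes — refit one predictive model per co-cluster, then move rows and columns to the clusters whose models fit them best, repeating until the assignments stop changing — can be sketched as follows. This is only a toy illustration under assumptions we are adding (squared-error loss, one linear regression per co-cluster, a fully observed matrix); `scoal_sketch` and all parameter names are ours, not the paper's notation or its actual algorithm.

```python
import numpy as np

def scoal_sketch(Z, Xr, Xc, k, l, n_iters=20, seed=0):
    """Toy SCOAL-style alternating minimization (illustrative sketch).

    Z  : (m, n) dyadic response matrix (e.g. customer x product), fully observed
    Xr : (m, p) row covariates; Xc : (n, q) column covariates
    k, l : number of row and column clusters
    """
    rng = np.random.default_rng(seed)
    m, n = Z.shape
    r = rng.integers(0, k, size=m)        # row-cluster labels
    c = rng.integers(0, l, size=n)        # column-cluster labels
    d = Xr.shape[1] + Xc.shape[1] + 1     # row feats + col feats + bias
    W = np.zeros((k, l, d))               # one linear model per co-cluster

    def design(ii, jj):
        # Feature vector for each cell (i, j): [Xr[i], Xc[j], 1].
        return np.hstack([Xr[ii], Xc[jj], np.ones((len(ii), 1))])

    for _ in range(n_iters):
        # 1) Refit each co-cluster's model by least squares on its cells.
        for g in range(k):
            for h in range(l):
                ii, jj = np.where(np.outer(r == g, c == h))
                if len(ii):
                    A = design(ii, jj)
                    W[g, h] = np.linalg.lstsq(A, Z[ii, jj], rcond=None)[0]

        # 2) Move each row to the row cluster with the lowest squared error,
        #    holding column labels and models fixed; then do the same for columns.
        A_rows = [design(np.full(n, i), np.arange(n)) for i in range(m)]
        r_new = np.array([
            np.argmin([np.sum((Z[i] - np.einsum('jd,jd->j', A_rows[i], W[g, c])) ** 2)
                       for g in range(k)])
            for i in range(m)])
        A_cols = [design(np.arange(m), np.full(m, j)) for j in range(n)]
        c_new = np.array([
            np.argmin([np.sum((Z[:, j] - np.einsum('id,id->i', A_cols[j], W[r_new, h])) ** 2)
                       for h in range(l)])
            for j in range(n)])
        if np.array_equal(r_new, r) and np.array_equal(c_new, c):
            break                          # assignments stable: a local minimum
        r, c = r_new, c_new
    return r, c, W
```

Each of the two steps can only lower (or keep) the total squared error, which is why this kind of alternation converges to a local minimum of the cost, as the abstract states for the real framework.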




Published In

ACM Transactions on Knowledge Discovery from Data, Volume 4, Issue 3 (October 2010), 191 pages.
ISSN: 1556-4681; EISSN: 1556-472X
DOI: 10.1145/1839490

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 October 2010
    Accepted: 01 November 2009
    Revised: 01 September 2009
    Received: 01 January 2009
    Published in TKDD Volume 4, Issue 3


    Author Tags

    1. Predictive modeling
    2. classification
    3. co-clustering
    4. dyadic data
    5. multimodal data
    6. regression

    Qualifiers

    • Research-article
    • Research
    • Refereed


Cited By

• (2024) A Survey of Co-Clustering. ACM Transactions on Knowledge Discovery from Data 18, 9, 1-28. DOI: 10.1145/3681793
• (2024) An artificial intelligence algorithm for co-clustering to help in pharmacovigilance before and during the COVID-19 pandemic. British Journal of Clinical Pharmacology 90, 5, 1258-1267. DOI: 10.1111/bcp.16012
• (2023) Hierarchical co-clustering with augmented matrices from external domains. Pattern Recognition 142, 109657. DOI: 10.1016/j.patcog.2023.109657
• (2022) Phishing Website Detection With Semantic Features Based on Machine Learning Classifiers. International Journal on Semantic Web & Information Systems 18, 1, 1-24. DOI: 10.4018/IJSWIS.297032
• (2022) Semantic Trajectory Frequent Pattern Mining Model. International Journal on Semantic Web & Information Systems 18, 1, 1-20. DOI: 10.4018/IJSWIS.297031
• (2022) A Context-Independent Ontological Linked Data Alignment Approach to Instance Matching. International Journal on Semantic Web & Information Systems 18, 1, 1-29. DOI: 10.4018/IJSWIS.295977
• (2022) Mc-DNN. International Journal on Semantic Web & Information Systems 18, 1, 1-20. DOI: 10.4018/IJSWIS.295553
• (2022) Doc2KG. International Journal on Semantic Web & Information Systems 18, 1, 1-20. DOI: 10.4018/IJSWIS.295552
• (2022) A Short Survey on the User Cold Start Problem in Recommender Systems: Metadata and Meta-Learning Methods. In 2022 IEEE International Conference on Big Data (Big Data), 3928-3934. DOI: 10.1109/BigData55660.2022.10020294
• (2022) A storage computing architecture with multiple NDP devices for accelerating compaction performance in LSM-tree based KV stores. Journal of Systems Architecture 130, C. DOI: 10.1016/j.sysarc.2022.102681
