DOI: 10.1145/2623330.2623636

Online Chinese restaurant process

Published: 24 August 2014

Abstract

Processing large volumes of streaming data in near real time is becoming increasingly important as the Internet, sensor networks and network traffic grow. Online machine learning is a typical means of dealing with streaming data, since it allows the classification model to learn one instance at a time. Although many online learning methods have been developed since the introduction of the Perceptron algorithm, existing online methods assume that the number of classes is known in advance of the classification process. However, this assumption is unrealistic for large-scale or streaming data sets. This work proposes an online Chinese restaurant process (CRP) algorithm, an online and nonparametric method, to tackle this problem. It introduces a relaxing function as part of the prior and updates the parameters with a likelihood function based on the consistency between the true label and the predicted result. Two Gibbs sampling algorithms are presented to perform posterior inference. In the experiments, the online CRP is applied to three massive data sets and compared with several online and batch learning algorithms. One of the data sets is obtained from Wikipedia and comprises approximately two million documents. The experimental results reveal that the proposed online CRP performs well and efficiently on massive data sets. Finally, this work proposes two methods to update the hyperparameter $\alpha$ of the online CRP: the first is based on the posterior distribution of $\alpha$, and the second exploits a property of online learning, namely adapting to change, to adjust $\alpha$ dynamically.
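To make the nonparametric mechanism concrete, the minimal Python sketch below implements the standard CRP seating prior that the online CRP builds on: each arriving instance joins an existing table (class) with probability proportional to that table's occupancy, or opens a new table with probability proportional to the concentration parameter $\alpha$. This is an illustrative sketch only, not the authors' algorithm; the function name crp_assign and the value alpha=1.0 are assumptions, and the paper's relaxing-function prior and label-consistency likelihood are not reproduced here.

import random

def crp_assign(table_counts, alpha):
    """Sample a table for the next customer under the plain CRP prior.

    Existing table k is chosen with probability n_k / (n + alpha);
    a new table is opened with probability alpha / (n + alpha),
    where n_k is the occupancy of table k and n is the total occupancy.
    Returns an index in range(len(table_counts) + 1); the last index
    means "open a new table", i.e. a previously unseen class.
    """
    weights = list(table_counts) + [alpha]
    return random.choices(range(len(weights)), weights=weights)[0]

# Example: seat 20 customers one at a time, as a data stream would arrive online.
counts, alpha = [], 1.0
for _ in range(20):
    k = crp_assign(counts, alpha)
    if k == len(counts):
        counts.append(1)   # a new table corresponds to a new class discovered on the fly
    else:
        counts[k] += 1
print(counts)              # partition sizes; the number of tables is unbounded

Because the number of tables is never fixed in advance, the same sampling step can keep admitting new classes as the stream evolves, which is the property the online CRP exploits for streaming classification.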

Supplementary Material

MP4 File (p591-sidebyside.mp4)




Published In

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN:9781450329569
DOI:10.1145/2623330


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. adaptive learning
  2. Chinese restaurant process
  3. nonparametric
  4. online learning

Qualifiers

  • Research-article

Conference

KDD '14

Acceptance Rates

KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%



Cited By

  • (2021) Using Dirichlet Marked Hawkes Processes for Insider Threat Detection. Digital Threats: Research and Practice, 3(1):1-19. DOI: 10.1145/3457908. Online publication date: 22-Oct-2021.
  • (2018) Bayesian exploratory clustering with entropy Chinese restaurant process. Intelligent Data Analysis, 22(3):551-568. DOI: 10.3233/IDA-163332. Online publication date: 7-May-2018.
  • (2017) Nonparametric multi-assignment clustering. Intelligent Data Analysis, 21(4):893-911. DOI: 10.3233/IDA-160105. Online publication date: 19-Aug-2017.
  • (2016) The infinite mixtures of food products. XRDS: Crossroads, The ACM Magazine for Students, 23(1):66-67. DOI: 10.1145/2983445. Online publication date: 20-Sep-2016.
  • (2015) Universal knowledge discovery from big data using combined dual-cycle. International Journal of Machine Learning and Cybernetics, 9(1):133-144. DOI: 10.1007/s13042-015-0376-z. Online publication date: 23-May-2015.
