Using continuous features in the maximum entropy model

Published: 01 October 2009

Abstract

We investigate the problem of using continuous features in the maximum entropy (MaxEnt) model. We explain why the MaxEnt model with the moment constraint (MaxEnt-MC) works well with binary features but not with continuous features. We describe how to enhance the constraints on continuous features and show that the weights associated with continuous features should be continuous functions rather than single values. We propose a spline-based solution to the MaxEnt model with non-linear continuous weighting functions and show that the optimization problem can be converted into a standard log-linear model in a higher-dimensional space. We report empirical results on two classification tasks that contain continuous features. The results confirm our insight and show that the proposed solution consistently outperforms both the MaxEnt-MC model and the bucketing approach by significant margins.
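The conversion to a higher-dimensional log-linear model can be illustrated with a minimal sketch. It assumes the weighting function is approximated by interpolating its values at a fixed set of knots, and it uses piecewise-linear interpolation as the simplest stand-in, since the abstract does not specify the spline type. The names `hat_basis`, `expand_feature`, `knots`, and `lam` are hypothetical and do not come from the article. Under that assumption, the term λ(x)·x becomes a dot product between an expanded feature vector and the knot weights, so the model is linear in its parameters again.

```python
import numpy as np


def hat_basis(x, knots):
    """Piecewise-linear interpolation weights a_k(x) of each x over the knots.

    Assumption: knots are sorted, distinct, and cover the feature range;
    values outside the range are clipped to the nearest end knot.
    """
    x = np.clip(np.atleast_1d(np.asarray(x, dtype=float)), knots[0], knots[-1])
    # Index of the left knot of the interval that contains each x.
    idx = np.clip(np.searchsorted(knots, x, side="right") - 1, 0, len(knots) - 2)
    left, right = knots[idx], knots[idx + 1]
    t = (x - left) / (right - left)
    basis = np.zeros((x.size, len(knots)))
    rows = np.arange(x.size)
    basis[rows, idx] = 1.0 - t
    basis[rows, idx + 1] = t
    return basis


def expand_feature(x, knots):
    """Map one continuous feature x to len(knots) features a_k(x) * x."""
    x = np.clip(np.atleast_1d(np.asarray(x, dtype=float)), knots[0], knots[-1])
    return hat_basis(x, knots) * x[:, None]


# Illustrative values only: three knots on [0, 1] and arbitrary knot weights.
knots = np.array([0.0, 0.5, 1.0])
lam = np.array([1.0, -2.0, 3.0])
x = np.array([0.0, 0.25, 0.8, 1.0])

# Score from the continuous weighting function: lambda(x) * x.
direct = np.interp(x, knots, lam) * x
# The same score as a plain linear model over the expanded features.
linear = expand_feature(x, knots) @ lam
print(np.allclose(direct, linear))
```

With the expanded features in place, any standard log-linear trainer could fit `lam` as ordinary weights. A smoother spline basis would change only `hat_basis`, not the overall structure.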


Cited By

  • (2016) Multi-label maximum entropy model for social emotion classification over short text. Neurocomputing 210:C (247-256). DOI: 10.1016/j.neucom.2016.03.088. Online publication date: 19-Oct-2016
  • (2016) Social emotion classification of short text via topic-level maximum entropy model. Information and Management 53:8 (978-986). DOI: 10.1016/j.im.2016.04.005. Online publication date: 1-Dec-2016
  • (2015) Accelerated Continuous Conditional Random Fields For Load Forecasting. IEEE Transactions on Knowledge and Data Engineering 27:8 (2023-2033). DOI: 10.1109/TKDE.2015.2399311. Online publication date: 1-Aug-2015

Published In

Pattern Recognition Letters  Volume 30, Issue 14
October, 2009
95 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Continuous feature
  2. Distribution constraint
  3. Maximum entropy model
  4. Maximum entropy principle
  5. Moment constraint
  6. Spline interpolation
