
The sequence memoizer

Published: 01 February 2011

Abstract

Probabilistic models of sequences play a central role in machine translation, automated speech recognition, lossless compression, spell-checking, and gene identification, to name but a few applications. Unfortunately, real-world sequence data often exhibit long-range dependencies that can only be captured by computationally challenging, complex models. Sequence data arising from natural processes also often exhibit power-law properties, yet common sequence models do not capture them. The sequence memoizer is a new hierarchical Bayesian model for discrete sequence data that captures long-range dependencies and power-law characteristics while remaining computationally attractive. Its utility as a language model and general-purpose lossless compressor is demonstrated.
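The power-law behavior the abstract refers to is the kind produced by the Pitman–Yor process on which the sequence memoizer is built. As a minimal, hypothetical sketch (not code from the paper; the function name and parameter defaults are illustrative), the following simulates the two-parameter Chinese restaurant process and shows how the ranked cluster sizes decay in a Zipf-like way:

```python
import random

def pitman_yor_crp(n_customers, d=0.5, theta=1.0, seed=0):
    """Seat customers by the two-parameter (Pitman-Yor) Chinese
    restaurant process; return the resulting table sizes."""
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers at table k
    for n in range(n_customers):  # n customers already seated
        # Open a new table with probability (theta + d*K) / (theta + n),
        # where K is the current number of tables.
        if rng.random() < (theta + d * len(tables)) / (theta + n):
            tables.append(1)
        else:
            # Join occupied table k with probability proportional to (size_k - d).
            r = rng.random() * (n - d * len(tables))
            for k, size in enumerate(tables):
                r -= size - d
                if r <= 0:
                    tables[k] += 1
                    break
            else:
                tables[-1] += 1  # guard against floating-point slop
    return tables

if __name__ == "__main__":
    sizes = sorted(pitman_yor_crp(100_000), reverse=True)
    # With discount d > 0, the size of the r-th largest table falls off
    # roughly as a power of r, mirroring Zipf's law in natural text.
    print(sizes[:10])
```

In the model itself such Pitman–Yor priors are arranged hierarchically over the predictive distributions of contexts of unbounded length; the sketch above only illustrates where the power-law behavior comes from.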




Published In

Communications of the ACM, Volume 54, Issue 2 (February 2011), 115 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/1897816
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 February 2011
Published in CACM Volume 54, Issue 2


Qualifiers

  • Research-article
  • Popular
  • Refereed


Cited By

  • (2022) Single-Block Recursive Poisson–Dirichlet Fragmentations of Normalized Generalized Gamma Processes. Mathematics 10(4), 561. DOI: 10.3390/math10040561
  • (2020) Mining International Political Norms from the GDELT Database. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, 1943-1945. DOI: 10.5555/3398761.3399035
  • (2020) Nonparametric estimation of probabilistic sensitivity measures. Statistics and Computing 30(2), 447-467. DOI: 10.1007/s11222-019-09887-9
  • (2020) Bayesian Nonparametric Prediction with Multi-sample Data. Nonparametric Statistics, 113-121. DOI: 10.1007/978-3-030-57306-5_11
  • (2019) A simple proof of Pitman–Yor's Chinese restaurant process from its stick-breaking representation. Dependence Modeling 7(1), 45-52. DOI: 10.1515/demo-2019-0003
  • (2019) Distribution theory for hierarchical processes. The Annals of Statistics 47(1). DOI: 10.1214/17-AOS1678
  • (2018) Bayesian nonparametric inference beyond the Gibbs-type framework. Scandinavian Journal of Statistics 45(4), 1062-1091. DOI: 10.1111/sjos.12334
  • (2018) Non-parametric Bayesian inference of strategies in repeated games. The Econometrics Journal 21(3), 298-315. DOI: 10.1111/ectj.12112
  • (2018) Accurate parameter estimation for Bayesian network classifiers using hierarchical Dirichlet processes. Machine Learning 107(8-10), 1303-1331. DOI: 10.1007/s10994-018-5718-0
  • (2017) Dynamic-depth context tree weighting. Proceedings of the 31st International Conference on Neural Information Processing Systems, 3330-3339. DOI: 10.5555/3294996.3295092
