A hierarchical Bayesian language model based on Pitman-Yor processes

YW Teh - Proceedings of the 21st International Conference on Computational …, 2006 - aclanthology.org
Abstract
We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of Pitman-Yor processes, a generalization of the commonly used Dirichlet distributions, which produce power-law distributions more closely resembling those in natural languages. We show that an approximation to the hierarchical Pitman-Yor language model recovers the exact formulation of interpolated Kneser-Ney, one of the best smoothing methods for n-gram language models. Experiments verify that our model gives cross-entropy results superior to interpolated Kneser-Ney and comparable to modified Kneser-Ney.
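The power-law behaviour mentioned in the abstract can be illustrated with the Chinese restaurant representation of a single Pitman-Yor process. The following is a minimal sketch, not the paper's hierarchical model or its inference procedure; the function name `pitman_yor_crp` and the parameter names `discount` and `strength` are illustrative assumptions. Each customer opens a new table with probability (strength + discount × tables) / (strength + customers seated), and otherwise joins an existing table with weight (table size − discount).

```python
import random


def pitman_yor_crp(num_customers, discount, strength, seed=0):
    """Seat customers one at a time and return the list of table sizes.

    Illustrative sketch: assumes 0 <= discount < 1 and strength > -discount.
    With discount == 0 this reduces to the Dirichlet-process seating scheme.
    """
    rng = random.Random(seed)
    table_sizes = []
    for n in range(num_customers):
        t = len(table_sizes)
        if n == 0:
            # The first customer always opens the first table.
            p_new = 1.0
        else:
            p_new = (strength + discount * t) / (strength + n)
        if rng.random() < p_new:
            table_sizes.append(1)
        else:
            # Existing table k is chosen with weight (size_k - discount).
            weights = [size - discount for size in table_sizes]
            k = rng.choices(range(t), weights=weights)[0]
            table_sizes[k] += 1
    return table_sizes


if __name__ == "__main__":
    for d in (0.0, 0.5, 0.8):
        sizes = pitman_yor_crp(5000, discount=d, strength=1.0)
        print(f"discount={d}: {len(sizes)} tables, largest={max(sizes)}")
```

Running the script with a larger discount should yield many more tables, most of them small, which is the heavy-tailed pattern the abstract contrasts with the Dirichlet case (discount 0), where the number of tables grows only slowly with the number of customers.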