Abstract: From January to September 2003, the Institute of Computational Linguistics of Peking University accomplished the compilation of The Grammatical Knowledge-base of Chinese high-frequency words, which is a National Key Fundamental Research Program ( ...
Proceedings of the 14th Youth Conference on …, Jan 1, 2009
Abstract: In this article, we assign Chinese n-gram sequences to different types according to their statistical properties, such as frequency, mutual information, and left/right border entropy. We call these sequence types “Radixes” and define some combination rules between them. ...
In recent years, mono- or multilingual corpora have come to be viewed as key resources in language information processing and language engineering projects. A large number of new approaches to language-related applications and research based on large-scale corpora ...
Papers by HuaRui Zhang