DOI: 10.3115/1220175.1220202

Semi-supervised conditional random fields for improved sequence segmentation and labeling

Published: 17 July 2006

Abstract

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
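The shape of the objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear-chain model whose per-position emission scores (`emit`) and transition scores (`trans`) are supplied directly as arrays, it enumerates every label sequence by brute force instead of using dynamic programming, it omits any parameter regularizer, and the tradeoff weight `gamma` is a hypothetical name introduced here.

```python
import itertools
import numpy as np

def sequence_scores(emit, trans):
    """Unnormalized log-scores of every label sequence of a toy linear-chain model.

    emit: (T, K) per-position label scores (assumed given; hypothetical input).
    trans: (K, K) label-to-label transition scores (hypothetical input).
    Brute-force enumeration, so only usable for very small T and K.
    """
    T, K = emit.shape
    seqs = list(itertools.product(range(K), repeat=T))
    scores = np.array([
        sum(emit[t, y[t]] for t in range(T))
        + sum(trans[y[t - 1], y[t]] for t in range(1, T))
        for y in seqs
    ])
    return seqs, scores

def log_softmax(scores):
    """Normalize log-scores into log-probabilities in a numerically stable way."""
    m = scores.max()
    return scores - (m + np.log(np.exp(scores - m).sum()))

def conditional_log_likelihood(emit, trans, y):
    """log p(y | x) for one labeled sequence."""
    seqs, scores = sequence_scores(emit, trans)
    return log_softmax(scores)[seqs.index(tuple(y))]

def conditional_entropy(emit, trans):
    """H(Y | x) over all label sequences for one unlabeled input."""
    _, scores = sequence_scores(emit, trans)
    logp = log_softmax(scores)
    return -np.sum(np.exp(logp) * logp)

def semi_supervised_objective(labeled, unlabeled, trans, gamma=0.1):
    """Labeled conditional likelihood minus gamma times unlabeled conditional entropy.

    labeled: list of (emit, y) pairs; unlabeled: list of emit arrays.
    gamma is an assumed tradeoff weight; maximizing this objective rewards
    confident (low-entropy) predictions on the unlabeled inputs.
    """
    ll = sum(conditional_log_likelihood(e, trans, y) for e, y in labeled)
    ent = sum(conditional_entropy(e, trans) for e in unlabeled)
    return ll - gamma * ent
```

Because the entropy term is not concave in the parameters, ascent on such an objective can only be expected to improve a starting point locally, which is consistent with the abstract's suggestion to begin from a supervised model.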


Published In

ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
July 2006
1214 pages

Publisher

Association for Computational Linguistics, United States

Acceptance Rates

Overall Acceptance Rate: 85 of 443 submissions, 19%
