Kernel slicing: scalable online training with conjunctive features

Published: 23 August 2010

Abstract

This paper proposes an efficient online method that trains a classifier with many conjunctive features. We employ a kernel computation called kernel slicing, which explicitly considers conjunctions among frequent features when computing the polynomial kernel, to combine the merits of linear and kernel-based training. To improve the scalability of this training, we reuse the temporal margins of partial feature vectors and terminate unnecessary margin computations. Experiments on dependency parsing and hyponymy-relation extraction demonstrated that our method could train a classifier orders of magnitude faster than kernel-based online learning, while retaining its space efficiency.
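
The idea of splitting a polynomial-kernel margin into an explicit part over frequent features and a kernel part over the rest can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes binary feature vectors represented as sets of integer feature ids, a degree-2 kernel (x·z + 1)^2, and integer coefficients, and every name in it (`poly2_kernel`, `build_weights`, `sliced_margin`, `frequent`) is hypothetical. It relies on the identity (s + 1)^2 = 1 + 3s + 2·C(s, 2), where s is the number of features two binary vectors share, so the kernel equals a weighted count of shared conjunctions of size 0, 1, and 2.

```python
from itertools import combinations

# Weight of a shared conjunction of size 0, 1, 2 in (s + 1)^2 = 1 + 3s + 2*C(s, 2).
COEF = {0: 1, 1: 3, 2: 2}


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel for binary vectors given as sets of feature ids."""
    s = len(x & z)
    return (s + 1) ** 2


def conjunctions(x):
    """Yield the empty conjunction, every single feature, and every feature pair of x."""
    feats = sorted(x)
    yield ()
    for f in feats:
        yield (f,)
    yield from combinations(feats, 2)


def kernel_margin(x, support):
    """Plain kernel margin: sum of alpha_i * k(x, z_i) over all support vectors."""
    return sum(alpha * poly2_kernel(x, z) for alpha, z in support)


def build_weights(support, frequent):
    """Fold the support vectors into explicit weights for conjunctions of frequent features."""
    weights = {}
    for alpha, z in support:
        for c in conjunctions(z & frequent):
            weights[c] = weights.get(c, 0) + alpha * COEF[len(c)]
    return weights


def sliced_margin(x, support, frequent, weights):
    """Margin = explicit part over frequent features + kernel correction for rare ones."""
    x_common = x & frequent
    x_rare = x - frequent
    # Linear-style lookup: equals the sum of alpha_i * k(x_common, z_i).
    margin = sum(weights.get(c, 0) for c in conjunctions(x_common))
    # Only support vectors sharing a rare feature with x need a kernel evaluation.
    for alpha, z in support:
        if x_rare & z:
            margin += alpha * (poly2_kernel(x, z) - poly2_kernel(x_common, z))
    return margin


# Toy data (made up for illustration): (alpha_i, z_i) pairs and a set of frequent ids.
support = [(2, {1, 2, 5}), (-1, {2, 3, 9}), (3, {1, 3})]
frequent = {1, 2, 3}
weights = build_weights(support, frequent)
x = {1, 2, 9}
print(sliced_margin(x, support, frequent, weights) == kernel_margin(x, support))
```

In this sketch the two margins agree by construction; the potential saving is that support vectors sharing no rare feature with the input are never touched by the kernel loop. A real implementation would index support vectors by rare feature rather than scanning all of them.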

Cited By

  • (2016) Ordering concepts based on common attribute intensity. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 3747-3753. DOI: 10.5555/3061053.3061143. Online publication date: 9-Jul-2016.
  • (2014) DISCOVERING ROBUST EMBEDDINGS IN DISSIMILARITY SPACE FOR HIGH-DIMENSIONAL LINGUISTIC FEATURES. Computational Intelligence, 30:2, pages 285-315. DOI: 10.1111/j.1467-8640.2012.00452.x. Online publication date: 1-May-2014.
  • (2012) Identifying constant and unique relations by using time-series text. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 883-892. DOI: 10.5555/2390948.2391044. Online publication date: 12-Jul-2012.
      Published In

      COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics
      August 2010
      1408 pages

      Publisher

      Association for Computational Linguistics

      United States

      Qualifiers

      • Research-article

      Acceptance Rates

      Overall Acceptance Rate 1,537 of 1,537 submissions, 100%
