DOI: 10.5555/1654758.1654769

Seeing stars when there aren't many stars: graph-based semi-supervised learning for sentiment categorization

Published: 09 June 2006

    Abstract

    We present a graph-based semi-supervised learning algorithm to address the sentiment analysis task of rating inference. Given a set of documents (e.g., movie reviews) and accompanying ratings (e.g., "4 stars"), the task calls for inferring numerical ratings for unlabeled documents based on the perceived sentiment expressed by their text. In particular, we are interested in the situation where labeled data is scarce. We place this task in the semi-supervised setting and demonstrate that considering unlabeled reviews in the learning process can improve rating-inference performance. We do so by creating a graph on both labeled and unlabeled data to encode certain assumptions for this task. We then solve an optimization problem to obtain a smooth rating function over the whole graph. When only limited labeled data is available, this method achieves significantly better predictive accuracy than other methods that ignore the unlabeled examples during training.
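
    The idea of a smooth rating function over a graph of labeled and unlabeled reviews can be illustrated with a small sketch. It is not the optimization problem from the paper, which the abstract does not spell out. It swaps in a generic harmonic-style propagation: labeled nodes keep their ratings, and each unlabeled node is repeatedly set to the weighted average of its neighbors. The function name `propagate_ratings`, the toy chain graph, the unit edge weights, and the iteration count are all assumptions made for illustration.

```python
def propagate_ratings(weights, labels, iterations=200):
    """Smooth ratings over a graph by repeated neighbor averaging.

    weights: symmetric matrix (list of lists) of non-negative edge weights.
    labels:  dict mapping node index -> known rating; these stay fixed.
    Returns a list with one rating per node.
    """
    n = len(weights)
    # Hypothetical initialization: unlabeled nodes start at the mean label.
    mean_label = sum(labels.values()) / len(labels)
    ratings = [labels.get(i, mean_label) for i in range(n)]

    for _ in range(iterations):
        updated = list(ratings)
        for i in range(n):
            if i in labels:
                continue  # labeled reviews keep their given rating
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total == 0:
                continue  # isolated node: nothing to average over
            weighted = sum(weights[i][j] * ratings[j] for j in range(n) if j != i)
            updated[i] = weighted / total
        ratings = updated
    return ratings


# Toy chain of four reviews: 0 - 1 - 2 - 3, with only the ends labeled.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
ratings = propagate_ratings(weights, {0: 1.0, 3: 4.0})
print([round(r, 2) for r in ratings])  # prints [1.0, 2.0, 3.0, 4.0]
```

    In this toy case the two unlabeled reviews settle at ratings between those of their labeled neighbors, which is what "smooth" means here: strongly connected documents receive similar ratings. In practice the edge weights would come from some document similarity measure, which this sketch leaves unspecified.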


      Published In

      TextGraphs-1: Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
      June 2006
      115 pages

      Publisher

      Association for Computational Linguistics

      United States

      Qualifiers

      • Research-article

      Cited By

      • (2018) CLASENTI. ACM Transactions on Asian and Low-Resource Language Information Processing 17:4 (1-28). DOI: 10.1145/3209885. Online publication date: 21-Jul-2018
      • (2018) Sentiment Analysis by Capsules. Proceedings of the 2018 World Wide Web Conference (1165-1174). DOI: 10.1145/3178876.3186015. Online publication date: 10-Apr-2018
      • (2017) Exploring performance of clustering methods on document sentiment analysis. Journal of Information Science 43:1 (54-74). DOI: 10.1177/0165551515617374. Online publication date: 1-Feb-2017
      • (2017) Graph-based Semi-supervised Learning for Text Classification. Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval (59-66). DOI: 10.1145/3121050.3121055. Online publication date: 1-Oct-2017
      • (2017) A method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (SVM) algorithm. Information Sciences: an International Journal 394:C (38-52). DOI: 10.1016/j.ins.2017.02.016. Online publication date: 1-Jul-2017
      • (2017) Semi-supervised learning through adaptive Laplacian graph trimming. Image and Vision Computing 60:C (38-47). DOI: 10.1016/j.imavis.2016.11.013. Online publication date: 1-Apr-2017
      • (2017) Random Multi-Graphs. Image and Vision Computing 60:C (30-37). DOI: 10.1016/j.imavis.2016.08.006. Online publication date: 1-Apr-2017
      • (2017) Multi-class sentiment classification. Expert Systems with Applications: An International Journal 80:C (323-339). DOI: 10.1016/j.eswa.2017.03.042. Online publication date: 1-Sep-2017
      • (2016) A Survey and Comparative Study of Tweet Sentiment Analysis via Semi-Supervised Learning. ACM Computing Surveys 49:1 (1-26). DOI: 10.1145/2932708. Online publication date: 29-Jun-2016
      • (2016) Graph-Based Semisupervised Learning for Acoustic Modeling in Automatic Speech Recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing 24:11 (1946-1956). DOI: 10.1109/TASLP.2016.2593800. Online publication date: 1-Nov-2016
