DOI: 10.1145/2487788.2487997
Research article

Revised mutual information approach for German text sentiment classification

Published: 13 May 2013
Abstract

    The significant growth of online social media content, such as product reviews, blogs, and forums, has led to increasing attention to sentiment analysis tools and approaches that mine this content. The aim of this paper is to develop a robust approach to classifying customer reviews, based on a self-annotated, domain-specific corpus, by applying a statistical measure, namely mutual information. First, subjective words in each test sentence are identified. Second, ambiguous adjectives such as "high", "low", "large", and "many" are disambiguated based on their accompanying noun using a conditional mutual information approach. Third, a mutual information approach is applied to find the sentiment orientation (polarity) of the identified subjective words by analyzing their statistical relationship with the manually annotated sentiment labels in a sizeable sentiment training data set. Fourth, since negation plays a significant role in flipping the polarity of an identified sentiment word, we estimate the effect of negation on classification accuracy. Finally, the polarity identified for each test sentence is evaluated against experts' annotations.
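
    The third and fourth steps of the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the exact formula, so the code assumes the common pointwise mutual information difference PMI(w, pos) - PMI(w, neg), which reduces to log(P(w | pos) / P(w | neg)). The function names, the add-alpha smoothing, the toy training sentences, and the small negator list are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy list of German negation words (an assumption, not from the paper).
NEGATORS = {"nicht", "kein", "keine"}


def train_orientation(labeled, alpha=1.0):
    """Score each word by PMI(w, pos) - PMI(w, neg) = log(P(w|pos) / P(w|neg)).

    `labeled` is a list of (tokens, label) pairs with label in {"pos", "neg"}.
    Add-alpha smoothing (an assumption) keeps the ratio finite for unseen pairs.
    """
    word_label = Counter()   # number of sentences with this label that contain the word
    label_count = Counter()  # number of sentences per label
    vocab = set()
    for tokens, label in labeled:
        label_count[label] += 1
        for w in set(tokens):  # count each word once per sentence
            word_label[(w, label)] += 1
            vocab.add(w)
    scores = {}
    for w in vocab:
        p_pos = (word_label[(w, "pos")] + alpha) / (label_count["pos"] + 2 * alpha)
        p_neg = (word_label[(w, "neg")] + alpha) / (label_count["neg"] + 2 * alpha)
        scores[w] = math.log(p_pos / p_neg)
    return scores


def classify(tokens, scores):
    """Sum word scores; a negator flips the sign of the next scored token."""
    total = 0.0
    negate = False
    for w in tokens:
        if w in NEGATORS:
            negate = True
            continue
        s = scores.get(w, 0.0)  # unknown words contribute nothing
        total += -s if negate else s
        negate = False
    if total > 0:
        return "pos"
    if total < 0:
        return "neg"
    return "neutral"


# Toy, hand-made training data (illustrative only).
train = [
    (["akku", "gut"], "pos"),
    (["bild", "gut"], "pos"),
    (["akku", "schlecht"], "neg"),
    (["bild", "schlecht"], "neg"),
]
scores = train_orientation(train)
print(classify(["akku", "gut"], scores))
print(classify(["akku", "nicht", "gut"], scores))
```

    In this toy setup "akku" occurs equally often under both labels, so its score is zero and only "gut" and "schlecht" carry polarity. The one-token negation window is the simplest possible choice; the conditional mutual information step for ambiguous adjectives is not covered here.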




          Published In

          WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web
          May 2013
          1636 pages
          ISBN: 9781450320382
          DOI: 10.1145/2487788

          Sponsors

          • NICBR: Núcleo de Informação e Coordenação do Ponto BR
          • CGIBR: Comitê Gestor da Internet no Brasil

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Author Tags

          1. disambiguation
          2. mutual information
          3. negation
          4. sentiment analysis

          Qualifiers

          • Research-article

          Conference

          WWW '13: 22nd International World Wide Web Conference
          May 13 - 17, 2013
          Rio de Janeiro, Brazil

          Acceptance Rates

          WWW '13 Companion Paper Acceptance Rate 831 of 1,250 submissions, 66%;
          Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

          Cited By

          • (2023) Identifying dynamic interaction patterns in mandatory and discretionary lane changes using graph structure. Computer-Aided Civil and Infrastructure Engineering. DOI: 10.1111/mice.13099. Online publication date: 23-Sep-2023
          • (2022) An Intelligent Unsupervised Approach for Handling Context-Dependent Words in Urdu Sentiment Analysis. ACM Transactions on Asian and Low-Resource Language Information Processing 21:5 (1-15). DOI: 10.1145/3510830. Online publication date: 29-Apr-2022
          • (2018) Review on Recent Advances in Information Mining From Big Consumer Opinion Data for Product Design. Journal of Computing and Information Science in Engineering 19:1. DOI: 10.1115/1.4041087. Online publication date: 17-Sep-2018
          • (2017) Identification and classification of multilingual document using maximized mutual information. 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS) (1679-1682). DOI: 10.1109/ICECDS.2017.8389734. Online publication date: Aug-2017
          • (2016) SentiMI. Applied Soft Computing 39:C (140-153). DOI: 10.1016/j.asoc.2015.11.016. Online publication date: 1-Feb-2016
          • (2015) Feature based clustering considering context dependent words. 2015 1st International Conference on Next Generation Computing Technologies (NGCT) (713-718). DOI: 10.1109/NGCT.2015.7375214. Online publication date: Sep-2015
          • (2014) Baseline evaluation. Proceedings of the 14th International Conference on Knowledge Technologies and Data-driven Business (1-8). DOI: 10.1145/2637748.2638420. Online publication date: 16-Sep-2014
          • (2014) Aspect based Summarization of Context Dependent Opinion Words. Procedia Computer Science 35 (166-175). DOI: 10.1016/j.procs.2014.08.096. Online publication date: 2014
