Poster
DOI: 10.1145/1935826.1935884

Clustering product features for opinion mining

Published: 09 February 2011

Abstract

In sentiment analysis of product reviews, one important problem is producing a summary of opinions organized by product features or attributes (also called aspects). However, people often refer to the same feature with many different words or phrases. To produce a useful summary, these words and phrases, which are domain synonyms, need to be grouped under the same feature group. Although several methods have been proposed to extract product features from reviews, little work has addressed clustering or grouping synonymous features. This paper focuses on that task. Classic methods for this problem rely on unsupervised learning with some form of distributional similarity. However, we found that these methods perform poorly. We therefore model the task as a semi-supervised learning problem and exploit lexical characteristics of the problem to identify some labeled examples automatically. Empirical evaluation shows that the proposed method outperforms existing state-of-the-art methods by a large margin.
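
The abstract does not say which lexical characteristics are used, so the following is only a minimal sketch under one assumed heuristic: feature expressions that share a content word (for example "battery life" and "battery power") are tentatively placed in the same seed group, and expressions that share nothing stay unlabeled. The function names, the stopword list, and the sample expressions are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"of", "the", "a", "an", "for", "and"}


def content_words(expression):
    """Return the lowercase non-stopword tokens of a feature expression."""
    return {w for w in expression.lower().split() if w not in STOPWORDS}


def seed_groups(expressions):
    """Split expressions into tentative labeled groups and unlabeled leftovers.

    Assumed heuristic (not confirmed by the abstract): two expressions that
    share a content word are linked, and links are merged transitively with
    a small union-find structure.
    """
    parent = {e: e for e in expressions}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index expressions by each content word they contain.
    by_word = defaultdict(list)
    for e in expressions:
        for w in content_words(e):
            by_word[w].append(e)

    # Link every expression that shares a word with the first one seen.
    for members in by_word.values():
        for other in members[1:]:
            union(members[0], other)

    groups = defaultdict(list)
    for e in expressions:
        groups[find(e)].append(e)

    labeled = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    unlabeled = sorted(g[0] for g in groups.values() if len(g) == 1)
    return labeled, unlabeled
```

Calling `seed_groups(["battery life", "battery power", "picture quality", "picture", "screen", "price"])` groups the two battery expressions and the two picture expressions and leaves "screen" and "price" unlabeled. Seeds of this kind are noisy (a generic word such as "quality" would link unrelated features), which is one reason to treat them as initial evidence for a semi-supervised learner rather than as final clusters.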

Information

Published In

WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining
February 2011
870 pages
ISBN:9781450304931
DOI:10.1145/1935826

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. opinion mining
  2. product feature grouping

Qualifiers

  • Poster

Acceptance Rates

WSDM '11 paper acceptance rate: 83 of 372 submissions (22%)
Overall acceptance rate: 498 of 2,863 submissions (17%)

Article Metrics

  • Downloads (last 12 months): 42
  • Downloads (last 6 weeks): 2
Reflects downloads up to 26 Sep 2024

Cited By

  • (2024) Automating Customer Needs to Engineering Characteristics Mapping in Quality Function Deployment: A Deep Learning Approach. 2024 7th International Conference on Artificial Intelligence and Big Data (ICAIBD), pp. 1-5. DOI: 10.1109/ICAIBD62003.2024.10604487. Online publication date: 24-May-2024
  • (2024) Opinion Mining with Manifold Forests. Context-Aware Systems and Applications, pp. 3-18. DOI: 10.1007/978-3-031-58878-5_1. Online publication date: 19-Aug-2024
  • (2023) An Approach to Summarizing Product Reviews. NSU Vestnik. Series: Linguistics and Intercultural Communication, 20:4, pp. 90-106. DOI: 10.25205/1818-7935-2022-20-4-90-106. Online publication date: 5-Feb-2023
  • (2023) A two-stage unsupervised sentiment analysis method. Multimedia Tools and Applications, 82:17, pp. 26527-26544. DOI: 10.1007/s11042-023-14864-6. Online publication date: 8-Mar-2023
  • (2023) Optimized Clustering Model for Healthcare Sentiments on Twitter: A Big Data Analysis Approach. Big Data Analytics for Smart Transport and Healthcare Systems, pp. 157-173. DOI: 10.1007/978-981-99-6620-2_9. Online publication date: 4-Dec-2023
  • (2023) Unimodal Sentiment Analysis. Multi-Modal Sentiment Analysis, pp. 135-177. DOI: 10.1007/978-981-99-5776-7_4. Online publication date: 27-Nov-2023
  • (2022) A Generic Graph-Based Method for Flexible Aspect-Opinion Analysis of Complex Product Customer Feedback. Information, 13:3, article 118. DOI: 10.3390/info13030118. Online publication date: 28-Feb-2022
  • (2022) Explore and Interpret the Correlations Among VR Applications. 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), pp. 22-26. DOI: 10.1109/ISMAR-Adjunct57072.2022.00015. Online publication date: Oct-2022
  • (2022) Mining Online Reviews to Support Customers' Decision-Making Process in E-Commerce Platforms: A Narrative Literature Review. Journal of Organizational Computing and Electronic Commerce, 32:1, pp. 69-97. DOI: 10.1080/10919392.2022.2053454. Online publication date: 28-Apr-2022
  • (2022) Review on sentiment analysis for text classification techniques from 2010 to 2021. Multimedia Tools and Applications, 82:6, pp. 8137-8193. DOI: 10.1007/s11042-022-14112-3. Online publication date: 1-Dec-2022
