DOI: 10.1145/1014052.1014073

Mining and summarizing customer reviews

Published: 22 August 2004

Abstract

Merchants selling products on the Web often ask their customers to review the products they have purchased and the associated services. As e-commerce becomes more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in the hundreds or even thousands. This makes it difficult for a potential customer to read them all and make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and summarize all the customer reviews of a product. This summarization task differs from traditional text summarization because we mine only the features of the product on which customers have expressed opinions, and whether those opinions are positive or negative. We do not summarize the reviews by selecting a subset of the original sentences, or by rewriting some of them, to capture the main points as in classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.
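
To make the three steps concrete, the following is a minimal sketch of what a feature-based summary could look like. It is not the paper's method: the abstract does not describe the actual techniques, so this sketch substitutes a simple frequency threshold for feature mining and a word-count lexicon for orientation. The word lists, the `min_support` parameter, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy lexicons, invented for this illustration only.
POSITIVE = {"great", "good", "sharp", "excellent"}
NEGATIVE = {"poor", "bad", "blurry", "short"}
# Stand-in for a real noun or noun-phrase extractor (an assumption).
CANDIDATE_FEATURES = {"battery", "screen", "lens", "price"}


def tokenize(sentence):
    """Lowercase each whitespace-separated word and strip basic punctuation."""
    return [w.strip(".,!?").lower() for w in sentence.split()]


def mine_features(sentences, min_support=2):
    """Step 1: keep candidate words that appear in at least min_support sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(set(tokenize(s)) & CANDIDATE_FEATURES)
    return {f for f, c in counts.items() if c >= min_support}


def orientation(sentence):
    """Step 2: label a sentence by lexicon hits; None if hits are absent or balanced."""
    words = tokenize(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def summarize(sentences, min_support=2):
    """Step 3: count positive and negative opinion sentences per mined feature."""
    features = mine_features(sentences, min_support)
    summary = {f: {"positive": 0, "negative": 0} for f in features}
    for s in sentences:
        label = orientation(s)
        if label is None:
            continue  # not treated as an opinion sentence in this sketch
        for f in set(tokenize(s)) & features:
            summary[f][label] += 1
    return summary


if __name__ == "__main__":
    reviews = [
        "The battery is great.",
        "Battery life is too short.",
        "The screen is sharp and the battery is good.",
        "Blurry screen!",
        "I bought it last week.",
    ]
    for feature, counts in sorted(summarize(reviews).items()):
        print(feature, counts)
```

The output is a per-feature tally of positive and negative opinion sentences rather than a condensed version of the review text, which is the distinction the abstract draws from classic text summarization.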

Published In

KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2004
874 pages
ISBN:1581138881
DOI:10.1145/1014052
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. reviews
  2. sentiment classification
  3. summarization
  4. text mining

Qualifiers

  • Article

Conference

KDD04

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
