DOI: 10.1145/3366030.3366133
short-paper

Fatten Features and Drop Wastes: Finding Repeaters' Reviews by Feature Generation and Feature Selection

Published: 22 February 2020

Abstract

In this paper, we propose a method for determining whether a given restaurant review comment was written by a repeater (a repeat customer). We often use restaurant review sites to decide which restaurant to go to. When we read a review comment, we can often tell whether the reviewer is a repeater of the restaurant. If a restaurant has many repeaters, it is presumably a good one. However, restaurant review sites usually do not provide a "revisit rate". We therefore tackle the problem of determining whether a review is a repeater's review. The difficulty is that many sentences in a review comment are of no use for this decision, such as what was ordered, what was delicious, or how the prices were. To address this difficulty, we take the following approach. First, a wide variety of features is extracted from review comments so that no feature characterizing repeaters' reviews is missed. Next, from this large feature set, a feature selection method selects only the features that actually contribute to the classification. Finally, a classifier performs the classification. We implemented the proposed method using super-CWC [12], a state-of-the-art feature selection method, and an SVM. The experimental results show that the proposed method performs better than the other methods compared.
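
The three steps described in the abstract (generate many features, keep only the useful ones, then classify) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the toy English reviews are invented, tokenization is a plain whitespace split, mutual information stands in for super-CWC, and a hinge-loss linear classifier without regularization stands in for the SVM. All function names are hypothetical.

```python
import math
from collections import Counter

def generate_features(text):
    """Step 1 (feature generation): word unigrams and bigrams as binary features."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)

def mutual_information(docs, labels, feature):
    """Mutual information (bits) between feature presence and the class label."""
    n = len(docs)
    joint = Counter((feature in d, y) for d, y in zip(docs, labels))
    f_count = Counter(feature in d for d in docs)
    y_count = Counter(labels)
    return sum(
        (c / n) * math.log2((c / n) / ((f_count[f] / n) * (y_count[y] / n)))
        for (f, y), c in joint.items()
    )

def select_features(docs, labels, k):
    """Step 2 (feature selection): keep the k highest-scoring features.
    Stand-in for super-CWC; ties are broken alphabetically."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary,
                    key=lambda f: (-mutual_information(docs, labels, f), f))
    return ranked[:k]

def train_hinge_classifier(vectors, labels, lr=0.1, epochs=300):
    """Step 3 (classification): linear model trained on the hinge loss.
    Simplified stand-in for an SVM (no regularization term)."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            if margin < 1.0:  # example violates the margin: take a gradient step
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias

def predict(weights, bias, x):
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1

# Invented toy data: label 1 = repeater's review, -1 = not a repeater's review.
reviews = [
    "i visited the shop again and the ramen was great",
    "came here again with the family",
    "once again the curry was perfect",
    "the ramen was great and cheap",
    "first time at the shop",
    "the curry was a bit salty",
]
labels = [1, 1, 1, -1, -1, -1]

docs = [generate_features(r) for r in reviews]
selected = select_features(docs, labels, k=3)
vectors = [[1.0 if f in d else 0.0 for f in selected] for d in docs]
weights, bias = train_hinge_classifier(vectors, labels)
predictions = [predict(weights, bias, x) for x in vectors]
print(selected, predictions)
```

In this toy data the word "again" occurs in every repeater's review and in no other review, so it receives the highest score, while "the" occurs in every review, carries no information about the label, and is dropped. Food and price words such as "ramen" or "cheap" score low for the same reason the abstract gives: they say nothing about whether the reviewer has visited before.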

References

[1]
Li-Chen Cheng, Judy C. R. Tseng, and Tsai-Yu Chung. 2017. Case Study of Fake Web Reviews. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2017). 706--709.
[2]
Kushal Dave, Steve Lawrence, and David M. Pennock. 2003. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. In Proceedings of the 12th International Conference on World Wide Web (WWW 2003). 519--528.
[3]
Xiaowen Ding and Bing Liu. 2007. The Utility of Linguistic Rules in Opinion Mining. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007). 811--812.
[4]
Amir Fayazi, Kyumin Lee, James Caverlee, and Anna Squicciarini. 2015. Uncovering Crowdsourced Manipulation of Online Reviews. In Proceedings of the 38th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2015). 233--242.
[5]
Eric Gilbert and Karrie Karahalios. 2010. Understanding Deja Reviewers. In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW 2010). 225--228.
[6]
Taku Kudo, Kaoru Yamamoto, and Yuji Matsumoto. 2004. Applying Conditional Random Fields to Japanese Morphological Analysis. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP 2004). 230--237.
[7]
Theodoros Lappas, Mark Crovella, and Evimaria Terzi. 2012. Selecting a Characteristic Set of Reviews. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2012). 832--840.
[8]
Ankan Mullick, Surjodoy Ghosh D, Shivam Maheswari, Srotaswini Sahoo, Suman Kalyan Maity, Soumya C, and Pawan Goyal. 2018. Identifying Opinion and Fact Subcategories from the Social Web. In Proceedings of the 2018 ACM Conference on Supporting Groupwork (GROUP 2018). 145--149.
[9]
Michael P. O'Mahony and Barry Smyth. 2009. Learning to Recommend Helpful Hotel Reviews. In Proceedings of the Third ACM Conference on Recommender Systems (RecSys 2009). 305--308.
[10]
Deanna Osman, John Yearwood, and Peter Vamplew. 2007. Using Corpus Analysis to Inform Research into Opinion Detection in Blogs. In Proceedings of the Sixth Australasian Conference on Data Mining and Analytics (AusDM 2007). 65--75.
[11]
Kilho Shin, Danny Fernandes, and Seiya Miyazaki. 2011. Consistency measures for feature selection: a formal definition, relative sensitivity comparison and a fast algorithm. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI 2011). 1491--1497.
[12]
Kilho Shin, Tetsuji Kuboyama, Takako Hashimoto, and Dave Shepard. 2015. Super-CWC and super-LCC: Super Fast Feature Selection Algorithms. In Proceedings of the 2015 IEEE International Conference on Big Data (BigData 2015). 1--7.
[13]
Phong Minh Vu, Hung Viet Pham, Tam The Nguyen, and Tung Thanh Nguyen. 2016. Phrase-based Extraction of User Opinions in Mobile App Reviews. In Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE 2016). 726--731.
[14]
Derry Tanti Wijaya and Stéphane Bressan. 2008. A Random Walk on the Red Carpet: Rating Movies with User Reviews and Pagerank. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM 2008). 951--960.
[15]
José P. Zagal, Amanda Ladd, and Terris Johnson. 2009. Characterizing and Understanding Game Reviews. In Proceedings of the 4th International Conference on Foundations of Digital Games (FDG 2009). 215--222.

    Published In

    iiWAS2019: Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services
    December 2019
    709 pages
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    In-Cooperation

    • JKU: Johannes Kepler Universität Linz
    • @WAS: International Organization of Information Integration and Web-based Applications and Services

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classification
    2. feature selection
    3. repeaters' reviews

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    iiWAS2019

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 53
    • Downloads (Last 12 months): 2
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 25 Dec 2024
