Main
(INFORMATION TECHNOLOGY)
By
A. SANTHOSH (20KB1A1201) K. VENKATESWARLU (20KB1A1230)
V. NITHIN (20KB1A1259) Y. RAVI (20KB1A1261)
N. JAGADEESH (21KB5A1205)
Dept. of IT and AI&DS,NBKRIST
Website: www.nbkrist.org Email: ist@nbkrist.org
Ph: 08624-228 247 Fax: 08624-228 257
BONAFIDE CERTIFICATE
This is to certify that the Project work entitled “FAKE REVIEWS DETECTION BASED
ON INDEX OPTIMIZATION” is a bonafide work done by A. Santhosh (20KB1A1201),
K. Venkateswarlu (20KB1A1230), V. Nithin (20KB1A1259), Y. Ravi (20KB1A1261), and N.
Jagadeesh (21KB5A1205) in the department of IT and AI&DS, N.B.K.R. Institute of
Science & Technology, Vidyanagar and is submitted to JNTUA, Ananthapuramu in the
partial fulfillment for the award of B.Tech degree in Information Technology. This work
has been carried out under my supervision.
TABLE OF CONTENTS
BONAFIDE CERTIFICATE
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENT
ABSTRACT
1 INTRODUCTION
1.1 INTRODUCTION
1.2 MOTIVATION
1.6 SUMMARY
2 LITERATURE SURVEY
2.2 SURVEY
3.1 INTRODUCTION
4 IMPLEMENTATION
4.1 INTRODUCTION
4.3 CREATE AN UI
4.4 REQUIREMENTS
6.1 CONCLUSION
6.2 FUTURE ENHANCEMENTS
7 REFERENCES
LIST OF FIGURES
1 COMPARISON OF DIFFERENT MODELS
2 NAÏVE BAYES ACCURACY MODEL
3 SVM CLASSIFICATION REPORT
4 SYSTEM ARCHITECTURE
5 DATASET
6 CONFUSION MATRIX
7 PIE CHART
8 HISTOGRAMS
9 API
10 RESULT
ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of a project would be
incomplete without mentioning the people who made it possible, whose constant guidance
and encouragement crowned our efforts with success.
We would like to express our profound sense of gratitude to our project guide Mr. M.
SIVA PRATHAP REDDY, Associate Professor, Department of IT and AI&DS,
N.B.K.R.I.S.T (affiliated to JNTUA, Ananthapuramu), Vidyanagar, for his masterful
guidance and constant encouragement throughout the project. Our sincere thanks for
his suggestions and unmatched services, without which this work would have been an
unfulfilled dream.
We convey our special thanks to Dr. Y. Venkata Rami Reddy, respected Chairman of
N.B.K.R. Institute of Science and Technology, for providing excellent infrastructure on
our campus for the completion of the project.
We convey our special thanks to Sri N. Ram Kumar Reddy, respected Correspondent
of N.B.K.R. Institute of Science and Technology, for providing excellent
infrastructure on our campus for the completion of the project.
We would like to convey our heartfelt thanks to the staff members, lab technicians, and
our friends, who extended their cooperation in making this project a successful
one.
We would like to thank one and all who have helped us directly and indirectly to
complete this project successfully.
ABSTRACT
E-commerce is the buying and selling of goods and services over the internet. Because
e-commerce is convenient, the number of users has increased, and so has the number of
product reviews. It is now common for users to write reviews of the products they have
purchased, and fake reviews are often a major problem on e-commerce websites.
Users can write reviews in many ways, and spammers can use this opportunity to leave fake
reviews. Many users judge the quality of a product by its reviews, so fake reviews create
problems for perceived product quality, sales, and economic growth. To tackle this problem,
we use a Naïve Bayes classifier and an SVM classifier, which are simple techniques for
classifying product reviews. Feature extraction is used to extract features from a dataset of
product reviews. Our aim is to find fake reviews. By detecting fake reviews, the reliability
of e-commerce systems can be improved.
1. INTRODUCTION
1.1 INTRODUCTION
The way people express their opinions and communicate with others on the web has
changed radically. Nowadays, customers rely heavily on written reviews before
purchasing a product. There are basically two types of spam reviews. The first type is
“positive/negative reviews” and the second type is “no reviews”. Positive/negative reviews
give an opinion that influences product selection: a positive review creates a good opinion
of the product and raises its perceived quality, while a negative review creates a
bad impression and damages the reputation of the product. The second type, no
reviews (e.g., ads), carries no opinion on the product, but it still acts as
encouragement or discouragement for the customer to buy the product. The growth of e-commerce
sites has also increased the number of spam reviews; an estimated 20% of reviews on
e-commerce websites are fake. To detect fake reviews, we use feature
selection, feature extraction, and classification methods. Well-known classifiers for
separating spam from non-spam data include the support vector machine (SVM), decision tree,
Naïve Bayes, and neural network. To process and analyze the dataset, we first
apply a feature extraction technique to the selected attributes. Having compared various
classifiers, we find that accuracy can be improved using the Naïve Bayes and SVM
classification techniques, which were found to perform better than the other classifiers.
1.2 MOTIVATION
The motivation for implementing this spam review detection project, “Fake reviews
detection based on index optimization”, is to maintain trust, fairness, and authenticity
in online reviews, which ultimately leads to better user experiences and brand reputation.
Reviews play a crucial role in shaping consumer decisions. Spam or fake reviews can
undermine trust in a product or service, leading to potential loss of customers and reputation
damage for businesses. Authenticity is key in online platforms. Spam reviews distort the
overall rating and mislead potential customers. Detecting and removing spam reviews ensures
a fair representation of products or services. Spam reviews clutter the review space and make
it difficult for users to find relevant information. Removing spam helps in reducing noise and
improving the overall quality of reviews. For businesses, their online reputation is invaluable.
Spam reviews can tarnish a brand's image. Detecting and removing spam helps protect the
brand's reputation and integrity.
Given a dataset of user-generated reviews for products or services, the objective is to develop
a machine learning model that can accurately classify reviews as either spam or genuine.
Spam reviews are defined as those that are fake, deceptive, or manipulative in nature, aiming
to influence the perception of the product or service unfairly. The model should be able to
effectively distinguish between legitimate reviews and spam reviews, thus helping maintain
the integrity and trustworthiness of the review platform.
4. Deployment and Integration: Integrate the trained model into the existing review platform
infrastructure, either as a standalone system or as part of a larger spam detection pipeline.
Ensure seamless integration with minimal disruption to user experience.
Iteratively improve the model's effectiveness and adapt to evolving spamming techniques.
SCOPE:
1. Review types: Focus on detecting spam reviews in text-based reviews for products,
services, or businesses across various domains (e.g., e-commerce, hospitality, restaurants).
2. Platforms: Determine the online platforms where spam reviews will be targeted for
detection. This could include e-commerce websites, review aggregation sites, social media
platforms, and any other platforms hosting user-generated reviews.
3. Spam Categories: Define the categories or characteristics of spam reviews that the
detection system will focus on. This may include fake reviews, biased reviews, reviews
posted by bots, reviews containing promotional content, or any other forms of deceptive or
manipulative content.
4. Detection Techniques: Specify the machine learning methods and algorithms to be
employed for spam detection.
5. Feature Selection: Identify the features and attributes of reviews that will be used for spam
detection. This may include textual features, metadata (e.g., reviewer text), analysis.
6. Data Sources: Utilize publicly available datasets, proprietary data from the review
platform (if available), and synthetic data generation techniques to augment the training
dataset.
7. Model Maintenance: Develop procedures for model maintenance, including version
control, retraining schedules, and performance monitoring. Establish protocols for handling
model drift and concept drift over time.
Organizing a project report effectively is crucial for presenting your findings, insights, and
recommendations clearly and logically. Here's a suggested structure for organizing a project
report:
1.Title Page:
Project title
2.Abstract:
A brief summary of the project objectives, methods, Key findings, and Conclusions.
3.Table of Contents:
List of sections and subsections with corresponding page numbers.
4.Introduction:
Background and context of the problem (spam reviews detection).
Motivation for the project.
Objectives of the project.
Overview of the report structure.
5.Literature Review:
Reviews of existing research, methods, and technologies related to spam reviews
detection.
Discussion of relevant theories, models, and approaches in the field.
6.Methodology:
Description of the dataset used for training and testing.
Explanation of data collection and preprocessing techniques.
Overview of feature engineering methods.
Explanation of the machine learning or detection algorithms employed.
Details of model evaluation metrics and techniques.
7. Results:
Presentation of experimental results, including model performance metrics.
Visualization of key findings (e.g., confusion matrix, ROC curves).
Discussion of any challenges encountered during experimentation.
8.Conclusion:
Summary of the key findings and insights.
Recapitulation of the project objectives and whether they were achieved.
Implications of the findings for the field of spam reviews detection.
9.References:
1.6 SUMMARY
Customers now rely heavily on written reviews before purchasing a product, and the growth
of e-commerce sites has increased the number of spam reviews. Spam reviews are of two types:
“positive/negative reviews”, which shape opinion about a product and affect its reputation,
and “no reviews” (e.g., ads), which carry no opinion on the product but still encourage or
discourage customers from buying it. To detect fake reviews, we use feature selection,
feature extraction, and classification methods, applying feature extraction to the selected
attributes first. Among well-known classifiers such as the support vector machine (SVM),
decision tree, Naïve Bayes, and neural network, the Naïve Bayes and SVM classifiers were
found to perform better than the others.
2. LITERATURE SURVEY
The existing system, “Product spam reviews detection based on index
optimization”, uses machine learning algorithms. Its dataset
consists of 11 attributes, including Product name, Commodity property, Positive evaluation word,
Negative evaluation word, Positive effect word, Negative effect word, Review on the length
of the text, Number of votes for review content, and Review on the user's credit experience.
Steps followed to implement:
1. Load the dataset.
2. Select the indexes using index optimization techniques.
3. Preprocess the selected data fields.
4. Split the dataset into train and test data.
5. Train the Naïve Bayes and SVM models on the train data and check them on the test data.
6. Evaluate the models using factors such as F1 score, recall, and precision.
7. Test the model by giving input reviews.
Accuracy of model:
Fig:-1 Comparison of Different Models
DEMERITS:
Low accuracy
Similar data indexes
Duplicated data
2.2 SURVEY
While working on this “Fake reviews detection based on index optimization” project, we went
through earlier conference papers and related research.
Year | Author(s) | Title | Description | Merits | Demerits
2018 | Han Yutan | False comment recognition | False comment recognition based on CNN. | Improves online content evaluation. | Low accuracy.
2017 | S. Mukarjee, N. Glance | What Yelp fake review filter might be doing | Analyzes the Yelp fake review filter and proposes a framework for understanding review spam. | Offers insights into the challenges and effectiveness of existing spam filters. | Limited to the Yelp platform; findings may not apply universally.
Based on the above research, we propose a new system called “Fake reviews detection based on
index optimization”, which uses machine learning algorithms and index optimization
techniques. The proposed system uses a different dataset from the existing model.
The dataset contains 6 attributes: category, review, rating, likes, username, and label.
Steps followed to implement:
1. Load the dataset.
2. Select the indexes using index optimization techniques.
3. Preprocess the selected data fields.
4. Split the dataset into train and test data.
5. Train the Naïve Bayes and SVM models on the train data and check them on the test data.
6. Evaluate the models using factors such as F1 score, recall, and precision.
7. Create a user interface using Python Streamlit.
8. Test the model by giving input reviews.
Compared with the existing model, the proposed model varies the dataset, which contains 8
fields, by using index optimization techniques, which gave optimized, accurate
results. The proposed model also increases the efficiency of accurate results and reduces
noise.
Accuracy of model:
Fig:-2 Naïve Bayes Model Classification Report
Fig:-3 SVM Model Classification Report
MERITS:
High Accuracy
Accurate Predictions
3. METHODOLOGY
3.1 INTRODUCTION
The methodology of spam reviews detection based on index optimization involves utilizing
various techniques to identify and filter out fraudulent or deceptive reviews from genuine
ones. Here's a breakdown of what this methodology typically involves:
1.Data collection: Data collection in spam review detection involves gathering a diverse set
of reviews from various sources, such as e-commerce websites, forums, or social media
platforms. These reviews can encompass different products, services, and domains to ensure
the model's robustness across various contexts.
Overall, effective data collection is foundational for training accurate and reliable spam
review detection models. It ensures that the model learns from a comprehensive and
representative sample of reviews, leading to better performance in identifying and filtering
spam reviews.
2. Data preprocessing: Data preprocessing in spam review detection involves transforming
raw review data into a format that is suitable for analysis and model training. Here's a typical
workflow for preprocessing data in this context:
Remove special characters (anything non-alphanumeric, such as punctuation and symbols).
Convert all text to lowercase.
Split each sentence into tokens for further analysis.
Eliminate common stop words (e.g., the, is, and).
Stem words to their root form.
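The preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the stop-word list is deliberately tiny, and `simple_stem` is a hypothetical stand-in for a real stemmer such as a Porter stemmer.

```python
import re

# Tiny illustrative stop-word list (assumption; real projects use a fuller list).
STOP_WORDS = {"the", "is", "and", "a", "an", "of", "to", "in", "it", "this"}

# Naive suffix list used as a stand-in for a real stemmer (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def simple_stem(word):
    """Strip one common suffix; a crude placeholder for proper stemming."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(review):
    """Apply the five preprocessing steps to one review and return its tokens."""
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", review)         # 1. drop punctuation and symbols
    text = text.lower()                                    # 2. convert to lowercase
    tokens = text.split()                                  # 3. tokenize on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]    # 4. remove stop words
    return [simple_stem(t) for t in tokens]                # 5. stem each token


tokens = preprocess("The mobile is working GREAT, and the battery lasts!!")
print(tokens)  # ['mobile', 'work', 'great', 'battery', 'last']
```

The resulting token lists are what later steps (feature extraction and model training) would consume.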
3. Feature selection & Extraction: Feature selection and extraction in spam review
detection involve identifying and extracting relevant information from the review data.
The features are selected using index optimization techniques such as the SelectKBest and
chi-square methods.
4. Model selection & Training: Model selection and training in spam review detection
involve choosing appropriate machine learning algorithms and optimizing their parameters to
build an effective spam detection system.
Model selection means choosing machine learning models that are suitable for classification,
such as Naïve Bayes, SVM, logistic regression, decision trees, and random forest algorithms.
Model training involves several steps:
Split the dataset into train and test sets.
Preprocess the datasets.
Train the model.
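As an illustration of the first training step, the sketch below splits a labeled dataset into train and test parts using only the standard library. The function name `split_dataset`, the 80/20 ratio, the fixed seed, and the toy reviews are assumptions made for this example; scikit-learn's `train_test_split` is the usual library equivalent.

```python
import random


def split_dataset(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle with a fixed seed, then split samples and labels into train/test."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    n_test = max(1, int(round(len(indices) * test_ratio)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [samples[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [samples[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical toy data: ten placeholder reviews with alternating labels.
reviews = [f"review {i}" for i in range(10)]
labels = ["spam" if i % 2 == 0 else "genuine" for i in range(10)]

x_train, x_test, y_train, y_test = split_dataset(reviews, labels, test_ratio=0.2)
print(len(x_train), len(x_test))  # 8 2
```

Keeping the test part untouched during training is what makes the later evaluation an honest estimate of performance on unseen reviews.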
5.Model evaluation: Model evaluation in spam reviews detection involves assessing the
performance of the trained model to ensure its effectiveness in distinguishing between spam
and genuine reviews.
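Here TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives,
and false negatives, with spam as the positive class.
Accuracy: Accuracy measures the overall correctness of the model and is calculated as (TP +
TN) / (TP + TN + FP + FN).
Precision: Precision measures the proportion of correctly classified spam reviews among all
reviews classified as spam and is calculated as TP / (TP + FP).
Recall: Recall measures the proportion of actual spam reviews that the model correctly
classifies as spam and is calculated as TP / (TP + FN).
F1-score: F1-score is the harmonic mean of precision and recall and is calculated as 2 *
(Precision * Recall) / (Precision + Recall).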
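The evaluation measures can be computed directly from the confusion-matrix counts. The sketch below is a minimal illustration; the label names and the example predictions are made up for the demonstration and are not results from this project.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count TP, TN, FP, FN, treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def evaluate(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1-score as a dictionary."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels and predictions, for illustration only.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "spam"]
y_pred = ["spam", "spam", "ham", "ham", "ham", "spam", "ham", "spam"]

print(confusion_counts(y_true, y_pred))  # (3, 3, 1, 1)
print(evaluate(y_true, y_pred))
```

With 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, all four measures come out to 0.75 in this toy case.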
Model integration
Testing model
Deployment
SYSTEM ARCHITECTURE
Fig:-4 Architecture (stages shown include Data Collection, Model Evaluation, Testing Model, and Deployment)
4. IMPLEMENTATION
4.1 INTRODUCTION
Implementing a spam review detection system involves several steps, including data
collection, preprocessing, feature extraction, model training, and evaluation. Here's a
high-level overview of each step:
Data Collection: Collect a reviews dataset from e-commerce websites or any other free
sources.
For this project, we collected a product reviews dataset from the website link below and then
added some additional attributes to it.
Fig:-5 Dataset
Preprocess: Preprocessing cleans the data for analysis by removing
duplicates, removing stop words, converting text to lowercase, tokenizing, removing
punctuation, and stemming.
Feature selection & extraction: Feature selection and extraction is the process of selecting
the indexes that give accurate results, based on weights or scores.
SelectKBest: SelectKBest is a feature selection method that selects the top k features with the
highest scores based on a specified criterion. The criterion could be a statistical measure such
as mutual information, F-score, or chi-square.
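To make the idea of "top k features by score" concrete, here is a small from-scratch sketch of chi-square scoring on word counts followed by top-k selection. It is an illustration under assumptions: the vocabulary, the count matrix, and the function names are invented for the example. In scikit-learn the corresponding tools are `SelectKBest` with the `chi2` score function.

```python
def chi_square_scores(X, y):
    """Return one chi-square score per feature column of the count matrix X.

    For each class, the observed feature total is compared with the total
    expected if the feature were independent of the class.
    """
    n_samples = len(X)
    n_features = len(X[0])
    feature_totals = [sum(row[j] for row in X) for j in range(n_features)]
    scores = [0.0] * n_features
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        class_prob = len(rows) / n_samples
        for j in range(n_features):
            observed = sum(row[j] for row in rows)
            expected = class_prob * feature_totals[j]
            if expected > 0:
                scores[j] += (observed - expected) ** 2 / expected
    return scores


def select_k_best(X, y, feature_names, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = chi_square_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return [feature_names[j] for j in ranked[:k]]


# Hypothetical word counts for four reviews (rows) over four words (columns).
feature_names = ["free", "offer", "good", "phone"]
X = [
    [2, 1, 0, 1],  # spam
    [1, 2, 0, 1],  # spam
    [0, 0, 2, 1],  # genuine
    [0, 0, 1, 1],  # genuine
]
y = ["spam", "spam", "genuine", "genuine"]

print(chi_square_scores(X, y))                  # [3.0, 3.0, 3.0, 0.0]
print(select_k_best(X, y, feature_names, k=3))  # ['free', 'offer', 'good']
```

The word "phone" appears equally often in both classes, so its score is 0 and it is dropped, while words concentrated in one class score high and are kept.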
TF-IDF: Term frequency (TF) and inverse document frequency (IDF) are calculated as follows:
TF(t, d) = (no. of times term t appears in document d) / (total no. of terms in document d)
IDF(t) = log(no. of documents in the corpus / no. of documents that contain term t)
Numerical Example
Imagine the term t appears 20 times in a document that contains a total of 100 words.
The term frequency (TF) of t can be calculated as follows:
TF = 20/100 = 0.2
Assume the corpus contains 10,000 documents and the term t appears in 100 of them. Using a
base-10 logarithm, the inverse document frequency (IDF) is:
IDF = log(10000/100) = 2
Using these two quantities, we can calculate the TF-IDF score of the term t for the
document:
TF-IDF = 0.2 * 2 = 0.4
TF-IDF assigns a score to every term in a sentence, and the scores are represented in vector
(matrix) form.
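The numerical example can be reproduced with a few lines of Python. This is a minimal sketch that follows the formulas as written here, with a base-10 logarithm; the function names are chosen for illustration. Library implementations often use the natural logarithm and smoothing, so their values generally differ.

```python
import math


def term_frequency(term_count, total_terms):
    """TF: share of the document's terms that are the given term."""
    return term_count / total_terms


def inverse_document_frequency(n_documents, n_containing):
    """IDF with a base-10 logarithm, matching the worked example."""
    return math.log10(n_documents / n_containing)


tf = term_frequency(20, 100)                    # term appears 20 times in 100 words
idf = inverse_document_frequency(10000, 100)    # 100 of 10,000 documents contain it
tf_idf = tf * idf

print(tf, idf, tf_idf)  # 0.2 2.0 0.4
```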
4.2 MODEL TRAINING:
Implemented Algorithms:
1. Naïve Bayes
Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment
analysis, and article classification. The name Naïve Bayes is composed of two
words, Naïve and Bayes, which can be described as follows:
Naïve: It is called naïve because it assumes that the occurrence of a certain feature is
independent of the occurrence of other features. For example, if a fruit is identified on the basis
of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple.
Each feature individually contributes to identifying it as an apple, without depending
on the others.
Bayes' Theorem:
Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to determine the
probability of a hypothesis given prior knowledge. It depends on conditional probability:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Where,
P(A|B) is the posterior probability: the probability of hypothesis A given the observed evidence B.
P(B|A) is the likelihood: the probability of the evidence given that the
hypothesis is true.
P(A) is the prior probability: the probability of the hypothesis before observing the evidence.
P(B) is the marginal probability: the probability of the evidence.
Given a dataset with features X = (x1, x2, x3, …, xn) and a target variable y, where xi
represents the value of the i-th feature and y represents the class label, the Naïve Bayes
algorithm calculates the probability of each class given the features using Bayes' theorem:
$$P(y \mid x_1, x_2, \dots, x_n) = \frac{P(y)\,P(x_1 \mid y)\,P(x_2 \mid y)\cdots P(x_n \mid y)}{P(x_1)\,P(x_2)\cdots P(x_n)}$$
During training, the algorithm calculates the prior probability of each class 𝑃(𝑦) and the
conditional probability of each feature given the class 𝑃(𝑥𝑖∣𝑦) based on training data.
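1. Prior probability: Calculate the prior probability of each class (positive and negative).
P(Positive) = no. of positive reviews / total no. of reviews
P(Negative) = no. of negative reviews / total no. of reviews
2. Conditional probability: Calculate the probability of each word given the class.
P(word | Positive) = no. of times the word appears in positive reviews / total no. of words in positive reviews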
There are three types of Naive Bayes Model, which are given below:
Gaussian: The Gaussian model assumes that features follow a normal distribution. This
means if predictors take continuous values instead of discrete, then the model assumes that
these values are sampled from the Gaussian distribution.
Multinomial: The Multinomial Naïve Bayes classifier is used when the data is multinomially
distributed. It is primarily used for document classification problems, that is, deciding which
category a particular document belongs to, such as sports, politics, or education. The classifier
uses the frequency of words as the predictors.
Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the
predictor variables are independent Boolean variables, such as whether a particular word is
present in a document or not. This model is also popular for document classification tasks.
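The practical difference between the Multinomial and Bernoulli variants is the feature representation they expect. The short sketch below, using a made-up vocabulary and review, shows word counts (Multinomial) next to word presence flags (Bernoulli); the function names are illustrative only.

```python
from collections import Counter


def multinomial_features(tokens, vocabulary):
    """Word-frequency vector: how many times each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def bernoulli_features(tokens, vocabulary):
    """Boolean vector: 1 if the vocabulary word occurs at all, else 0."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


# Hypothetical vocabulary and tokenized review.
vocabulary = ["good", "mobile", "bad"]
tokens = ["good", "mobile", "good"]

print(multinomial_features(tokens, vocabulary))  # [2, 1, 0]
print(bernoulli_features(tokens, vocabulary))    # [1, 1, 0]
```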
Model Prediction :
Tokenization:
Tokenize the review into individual words: ["Mobile", "condition", "is", "good"].
Calculate probabilities:
For each class (positive and negative), calculate the probability of the review belonging to
that class using Bayes' theorem:
P(Positive | review) ∝ P(Positive) × P("Mobile" | Positive) × P("condition" | Positive) × …
P(Negative | review) ∝ P(Negative) × P("Mobile" | Negative) × P("condition" | Negative) × …
Prediction: Select the class (positive or negative) with the highest probability as the
predicted sentiment for the review.
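The training and prediction steps described above can be sketched as a small from-scratch classifier. This is an illustration under assumptions, not the project's implementation: the training reviews are invented, add-one (Laplace) smoothing is added so that unseen word/class pairs do not produce zero probabilities, and log-probabilities are summed instead of multiplying raw probabilities to avoid numerical underflow.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over token lists with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)          # used for the priors
        self.word_counts = defaultdict(Counter)      # per-class word frequencies
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        """Return log P(class) + sum of log P(word | class) for each class."""
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / self.total_docs)            # log prior
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue                                       # skip unseen words
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, tokens):
        """Pick the class with the highest score."""
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Hypothetical, already tokenized and lowercased training reviews.
train_docs = [
    ["mobile", "condition", "is", "good"],
    ["good", "battery", "good", "camera"],
    ["mobile", "is", "bad"],
    ["bad", "battery", "poor", "condition"],
]
train_labels = ["Positive", "Positive", "Negative", "Negative"]

model = NaiveBayesTextClassifier().fit(train_docs, train_labels)
print(model.predict(["mobile", "condition", "is", "good"]))  # Positive
print(model.predict(["bad", "poor"]))                        # Negative
```

In the first prediction the word "good" is far more frequent in the Positive class, which outweighs the words shared by both classes.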
2.Support Vector Machine
Support Vector Machine (SVM) is a powerful supervised learning algorithm used for
classification and regression tasks. In the context of text classification, SVM is often used
for tasks like spam detection. SVM works by finding the optimal hyperplane that separates
data points of different classes with the largest possible margin. Here's an in-depth
explanation of how SVM works:
Linear svm : Consider a binary classification problem where we have two classes: positive
(+1) and negative (-1). SVM aims to find a hyperplane that best separates the data points of
these two classes in feature space.
Margin: The margin is the distance between the hyperplane and the nearest data point from
either class. SVM aims to maximize this margin because a larger margin typically results in
better generalization to unseen data.
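The hyperplane is defined by:
$$w^T x + b = 0$$
Where,
w = weight vector
b = bias term
Objective function: The objective of SVM is to find the hyperplane that maximizes the
margin while minimizing the classification error. This can be formulated as an optimization
problem:
$$\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^2$$
Subject to:
$$y_i\,(w^T x_i + b) \ge 1, \quad i = 1, \dots, N$$
Where,
x_i = feature vector of the i-th training review
y_i = class label of the i-th training review (+1 or -1)
N = number of training reviews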
Let's consider a simple example of spam reviews detection using SVM. Suppose we have a
dataset of reviews labeled as spam or non-spam. Each review is represented as a feature
vector x (e.g., TF-IDF vector) and belongs to a class y (spam or non-spam).
Training:
• We feed the labeled training data (feature vectors and corresponding labels) into the
SVM algorithm.
• SVM learns the optimal hyperplane that separates spam reviews from non-spam
reviews with the largest margin.
Prediction:
• We use the learned SVM model to predict the class label for the new review.
• If the value of $w^T x_{new} + b$ is positive, the review is classified as spam; otherwise, it's
classified as non-spam.
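The prediction rule can be sketched as follows. The weight vector, bias, and feature vectors are hypothetical values picked for illustration (a trained SVM would supply w and b); the sketch only shows the decision rule, the margin constraint, and the margin 1/||w|| of a hyperplane in this canonical form.

```python
import math


def decision_value(w, x, b):
    """Compute w^T x + b for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Spam if the decision value is positive, otherwise non-spam."""
    return "spam" if decision_value(w, x, b) > 0 else "non-spam"


def satisfies_margin(w, b, samples, labels):
    """Check the constraint y_i (w^T x_i + b) >= 1 for every training sample."""
    return all(y * decision_value(w, x, b) >= 1 for x, y in zip(samples, labels))


def margin(w):
    """Distance from the hyperplane to the closest point when the constraint is tight."""
    return 1.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical learned parameters and two toy feature vectors.
w = [2.0, -1.0]
b = -0.5
samples = [[1.0, 0.5], [0.0, 1.0]]
labels = [+1, -1]                      # +1 = spam, -1 = non-spam

print(classify(w, samples[0], b))      # spam      (decision value  1.0)
print(classify(w, samples[1], b))      # non-spam  (decision value -1.5)
print(satisfies_margin(w, b, samples, labels))  # True
print(round(margin(w), 3))             # 0.447
```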
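Mathematical Formulas:
Primal problem:
$$\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^2$$
subject to:
$$y_i\,(w^T x_i + b) \ge 1$$
Dual problem:
$$\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j\, y_i y_j\, x_i^T x_j$$
subject to:
$$\sum_{i=1}^{N} \alpha_i y_i = 0, \quad \alpha_i \ge 0$$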
Confusion Matrix:
4.3 CREATE AN UI
4.3.1 Streamlit:
Streamlit is an open-source Python library that allows you to create interactive web
applications directly from Python scripts. It's designed to make it easy and fast for data
scientists and machine learning engineers to build and share data-driven applications without
needing to have web development skills.
Write Python Script : With Streamlit, you write your application logic in Python scripts
using simple and intuitive syntax. You can use familiar Python libraries like NumPy, Pandas,
and Matplotlib for data processing, visualization, and machine learning tasks.
Declarative Programming : Streamlit uses a declarative programming model, which means
that you specify what you want to appear on the web page, and Streamlit takes care of the
underlying HTML, CSS, and JavaScript code to render the user interface.
Realtime Updates : Streamlit automatically updates the web page in real-time as you
modify your Python script. This allows you to see changes immediately as you make them,
without needing to manually reload the web page.
Widgets and Components : Streamlit provides a wide range of widgets and components
that you can use to build interactive elements in your application, such as sliders,
dropdowns, buttons, and text inputs. These widgets allow users to interact with your data and
control the behavior of your application.
Installation of Streamlit:
Install Streamlit using the pip command, then import it in the Python script:
pip install streamlit
import streamlit as st
4.4 REQUIREMENTS
H/W Configuration:
S/W requirements:
Web Browser : Microsoft Edge
Technology : Python
Fig:-8 Histogram of Ham Reviews
5.3 HISTOGRAM OF SPAM REVIEWS
Fig:-9 Confusion Matrix
5.6 SVM PREDICTION
5.8 ACCURACY OF SVM
5.11 USER INTERFACE
Fig:-10 API
Fig:-11 Result
6. CONCLUSION AND FUTURE ENHANCEMENTS
6.1 CONCLUSION:
The implementation of SVM models allows for the classification of reviews into spam or
non-spam categories based on their textual content, leveraging features such as TF-IDF
representations. These models are trained on labeled datasets, enabling them to learn patterns
and characteristics indicative of spam reviews, such as excessive promotional language,
irrelevant content, or deceptive practices.
Spam review detection systems represent a critical component in maintaining the integrity
and trustworthiness of online platforms and services. By leveraging advanced machine
learning techniques and feature engineering methodologies, these systems can effectively
identify and filter out spam content, thereby enhancing user experience, trust, and credibility
in online communities.
7. REFERENCES
1. Ai-jun Li, Lei Shi. Product spam reviews detection based on index optimization.
Provides an overview of spam reviews and detection methods.
2. N. Abdelmageed, H. Tork, Hussein. Survey of fake reviews [2020]. Provides an
overview of various techniques and challenges in detecting fake reviews across different
platforms.
3. Han Yutan. False comment recognition [2018]. False comment recognition based on
CNN.
4. Li Xiao, Ding Shengchun. Research on the identification of spam comment information
[2013].
5. You Guirong, Wu Wei, Qian Yuntao. Feature extraction method of spam review detection
in e-commerce [2014].