Minor New Report
A
MINOR PROJECT REPORT
Submitted for the partial fulfillment of the requirement for the award of Degree
B.Tech.
IN
COMPUTER SCIENCE & ENGINEERING
CERTIFICATE
This is to certify that Aditi Agrawal, Kavery Pandey, Kshitij Yadav of B.Tech
Third Year, Computer Science & Engineering have completed their Minor
Project entitled "Sentimental Analysis using NLP" during the year 2023 under
our guidance and supervision.
We approve the project for the submission for the partial fulfillment of the
requirement for the award of degree of B.E. in Computer Science &
Engineering.
Head of Department,
DECLARATION BY CANDIDATE
We hereby declare that the work presented in the minor project
entitled "Sentimental Analysis using NLP", submitted in partial fulfillment of the
requirement for the award of Bachelor degree in Computer Science and
Engineering, has been carried out at University Institute of Technology RGPV,
Bhopal and is an authentic record of our work carried out under the guidance of
Dr. Shikha Agrawal (Project Guide) and Prof. Manish Mishra (Project
Guide), Department of Computer Science and Engineering, UIT RGPV, Bhopal.
The matter in this project has not been submitted by us for the award of any
other degree.
After the completion of our minor project work, words are not enough to express our
feelings about all those who helped us to reach our goal. Above all, we express
our indebtedness to the Almighty for providing us this moment in life.
First and foremost, we take this opportunity to express our deep regards and
heartfelt gratitude to our project guide Dr. Shikha Agrawal and Prof.
Manish Mishra of Computer Science and Engineering Department,
RGPV Bhopal for their inspiring guidance and timely suggestions in carrying
out our project successfully. They have also been a constant source of inspiration
for us.
This project can provide valuable insights for businesses, brands, and
researchers to understand public opinion, customer satisfaction, and trends.
Table of Contents
CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
ABSTRACT
1. Introduction
1.1 Basics
1.2 Need of sentiment analysis
1.3 Use Cases
1.3.1 Social Media Monitoring for Brand Management
1.3.2 Product/Service Analysis
1.3.3 Stock Price Prediction
1.4 Working of system
1.5 Approach
6. Implementation
6.1 Description
6.2 Raw Data
6.3 Split into Train/Test
6.4 Data exploration
6.5 Correlations
6.6 Sentimental Analysis
6.6.1 Bag of Words Model
6.6.2 Multinomial Naive Bayes
S No   Description
1      Sentiment Analyzer
3      Descriptive statistics
4      Visual representation
10     Classifier code
Chapter 1
Introduction
The aim of this project is to develop an accurate and robust sentiment analysis system that can effectively
classify sentiment in textual data, enabling businesses to gain insights, make informed
decisions, and take appropriate actions based on public sentiment and customer feedback.
We use various natural language processing (NLP) and text analysis tools to figure out
what could be subjective information. We need to identify, extract and quantify such
details from the text for easier classification and working with the data.
a product. This way they can easily sort through the comments or questions and prioritize
what they need to handle first and even order them in a way that looks better. Companies
sometimes even try to delete content that has a negative sentiment attached to it.
It is an easy way to understand and analyze public reception and perception of different ideas
and concepts, such as a newly launched product, an event or a government policy. Emotion
understanding and sentiment analysis also play a major role in collaborative-filtering-based
recommendation systems, which group together people who have similar reactions to a certain
product and show them related products, for example recommending movies to people by
grouping them with others who have similar perceptions of a certain show or movie. Lastly,
they are also used for spam filtering and removing unwanted content.
Also, a very common classification is based on what needs to be done with the data or the
reason for sentiment analysis. Examples of which are:
● Simple classification of text into positive, negative or neutral. It may also advance into
fine grained answers like very positive or moderately positive.
● Aspect-based sentiment analysis- where we figure out the sentiment along with a
specific aspect it is related to. Like identifying sentiment regarding various aspects or
parts of a car in user reviews, identifying what feature or aspect was appreciated or
disliked.
● The sentiment along with an action associated with it, as in mails written to customer
support, where we need to understand whether a mail is a query, a complaint, a suggestion, etc.
1.5 Approach
Based on what needs to be done and what kind of data we need to work with there are two
major methods of tackling this problem.
● Rule-based (matching) sentiment analysis: There is a predefined list of words for each
type of sentiment needed, and the text or document is matched against these lists. The
algorithm then determines which type of words, and hence which sentiment, is more prevalent in it.
This type of rule-based sentiment analysis is easy to implement, but lacks flexibility and does
not account for context.
● Automatic sentiment analysis: These methods are mostly based on supervised machine learning
algorithms and are very useful in understanding complicated texts.
Algorithms in this category include support vector machines, logistic regression, and recurrent
neural networks (RNNs) and their variants.
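The rule-based approach described above can be sketched in a few lines. This is a minimal illustration only: the word lists and the function name rule_based_sentiment are hypothetical and are not taken from the project.

```python
import re

# Hypothetical word lists; a real system would use a curated sentiment lexicon.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "fast"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "slow", "heavy"}


def rule_based_sentiment(text):
    """Label text by counting matches against each word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    if positive_hits > negative_hits:
        return "Positive"
    if negative_hits > positive_hits:
        return "Negative"
    return "Neutral"


print(rule_based_sentiment("Great screen and fast delivery"))
print(rule_based_sentiment("The battery is not good"))
```

The second call is labelled Positive because only the word "good" is matched and the negation "not" is ignored, which illustrates the lack of context handling mentioned above.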
Chapter 2
Objectives of this project
There have been numerous studies and research papers on sentiment analysis, exploring various
aspects, techniques, and applications of the field. Here are a few notable research papers that have
contributed to the advancement of sentiment analysis:
3.1 "Opinion Mining and Sentiment Analysis" by Pang and Lee (2008): [1]
This seminal paper provides an overview of sentiment analysis, discusses challenges and techniques
in opinion mining, and introduces the use of machine learning algorithms for sentiment
classification.
The survey covers techniques and approaches that promise to directly enable opinion-oriented
information seeking systems. Its focus is on methods that seek to address the new challenges raised
by sentiment-aware applications, as compared to those that are already present in more traditional
fact-based analysis. It includes material on summarization of evaluative text and on broader issues
regarding privacy, manipulation, and economic impact that the development of opinion-oriented
information-access services gives rise to. To facilitate future work, a discussion of available
resources, benchmark datasets, and evaluation campaigns is also provided.
3.2 "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank" by Socher et
al. (2013): [2]
This paper introduces the Recursive Neural Tensor Network (RNTN) model for sentiment
analysis, which captures compositional structure in sentences to improve sentiment prediction.
The paper discusses various compositional methods for combining words and phrases (n-grams) to
predict binary (positive or negative) as well as fine-grained (very positive, positive, neutral,
negative, very negative) sentiment for words, phrases and whole sentences in a bottom-up
fashion. Its main contributions are a parse-tree-based dataset with fine-grained sentiment
labels, the "Stanford Sentiment Treebank", and a neural compositional model, the Recursive
Neural Tensor Network (RNTN), which outperformed the earlier recursive models it was compared
with and achieved state-of-the-art performance at the time.
The paper offers several important insights and observations. The models were compared with Naive
Bayes, SVMs, BiNB (Naive Bayes with bigram features) and VecAvg (an average of word vectors). On
fine-grained classification over all phrases (at all node levels of the parse trees), RNTN achieves
the best performance, followed by MV-RNN, RNN and the other models. For binary classification at
the sentence level, RNTN pushes state-of-the-art accuracy from 80% to 85.4%.
Optimal performance for all the models was achieved with word vector dimensions between 25
and 35; performance deteriorates for smaller and larger dimensions. This suggests that the
RNTN's improvement does not come simply from a larger parameter count, since MV-RNN
has the largest number of parameters.
RNTN also captures the effect of negation in both positive and negative sentences. It has the highest
accuracy when positive sentences are negated; it also increases the non-negative activation (the degree of
non-negative sentiment in a sentence) when negative sentences are negated, which
indicates that the model learns negation well beyond simple negation rules.
The RNTN model is powerful in capturing the structural composition of the words and phrases in a
sentence and in learning the effect of composition on sentiment in a principled and
efficient way. The treebank dataset captures intricacies of linguistic phenomena; all models show
substantial improvements in performance when trained on this new dataset. However, it should be
noted that, because RNTN requires a parse tree of each input sentence, the
model might not perform well on text with poor grammatical structure, such as chatbot dialogues
or tweets. Another interesting experiment would be to observe the effect of pre-trained word
embeddings such as word2vec, GloVe or fastText on the overall performance of the model, instead of
learning the word vector embeddings as parameters during training.
Chapter 4
Problem Description
Any business needs to understand its clients: their needs, their opinions, and their satisfaction
with the product. Large web-based companies must analyse hundreds of
thousands or even millions of opinions on different products, and simply searching for
pre-defined "good" or "bad" words in the comments is not enough. With the rise of machine
learning, and deep neural networks in particular, sentiment analysis (the problem of
understanding the emotional tone of a text) can now be performed with high accuracy.
According to Wikipedia:
A basic task in sentiment analysis is classifying the polarity of a given text at the document,
sentence, or feature/aspect level — whether the expressed opinion in a document, a sentence or an
entity feature/aspect is positive, negative, or neutral. Advanced, “beyond polarity” sentiment
classification looks, for instance, at emotional states such as “angry”, “sad”, and “happy”.
These problems and challenges have driven the development and advancement of sentiment
analysis techniques and technologies, aiming to automate the analysis of sentiment in text data,
derive valuable insights, and support decision-making processes.
Chapter 5
Proposed Work
Overall, sentiment analysis aims to address the challenge of automatically understanding and
classifying sentiment in text data, considering factors such as context, subjectivity, ambiguity,
domain specificity, and real-time analysis requirements. Researchers and practitioners in the field
work on developing robust and accurate sentiment analysis models and systems to tackle these
challenges effectively.
Chapter 6
Implementation
6.1 Dataset
This dataset covers Amazon-branded and Amazon-manufactured products only, and customer
satisfaction with Amazon products seems to be the main focus. By using sentiment analysis, we can
predict scores for reviews based on certain words.
Potential suggestion for product reviews:
Product X is highly rated on the market, it seems most people like its lightweight sleek design and fast
speeds. Most products that were associated with negative reviews seemed to indicate that they were too
heavy and they couldn't fit them in the bags. We suggest that next gen models for e-readers are
lightweight and portable, based on this data.
6.1.1 Assumptions:
● Assuming that a sample size of 30K examples is sufficient to represent the entire population of
sales/reviews.
● Assuming that the information in the text reviews of each product will be rich enough to train a
sentiment analysis classifier with accuracy (hopefully) > 70%.
■ Most reviews were found helpful by between 0 and 13 people (reviews.numHelpful).
● reviews.numHelpful: Outliers in this case are valuable, so we want to weight reviews that
more than 50 people found helpful.
● reviews.rating: The majority of examples were rated highly (looking at the rating distribution). There are
twice as many 5-star ratings as all other ratings combined.
● asins
● name
● reviews.rating
● reviews.doRecommend
● (reviews.numHelpful - not possible since numHelpful is only between 0-13 as per previous
analysis in Raw Data)
● (reviews.text - not possible since text is in long words)
Working hypothesis: there are only 35 products based on the training data ASINs
Fig. 6.4.1 Bar graph showing product sales frequency based on ASINS
● Based on the bar graph for ASINs, certain products have significantly more reviews than other
products, which may indicate higher sales of those specific products.
● The ASIN frequencies have a "right-tailed" distribution, which also suggests that certain products have
higher sales, corresponding to their higher ASIN frequencies in the reviews.
● We took the log of the ASIN frequencies to normalize the data and display a more detailed picture of
each ASIN, and we see that the distribution is still "right-tailed".
This answers the first question: certain ASINs (products) have better sales while other ASINs have
lower sales, which in turn informs which products should be kept or dropped.
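As a minimal sketch of the frequency and log-frequency view discussed above, the snippet below counts reviews per ASIN and takes the natural log of each count. The ASIN values and counts are made up for illustration and are not the project's data.

```python
import math
from collections import Counter

# Made-up ASIN column, one entry per review (not the project's data).
review_asins = ["A1"] * 100 + ["B2"] * 10 + ["C3"] * 1

frequency = Counter(review_asins)
# The log compresses the long right tail while preserving the ordering.
log_frequency = {asin: math.log(count) for asin, count in frequency.items()}

for asin, count in frequency.most_common():
    print(asin, count, round(log_frequency[asin], 2))
```

Even after the log transform, the most reviewed product still stands well apart from the rest, which is the "still right-tailed" observation in miniature.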
6.4.2 reviews.rating / ASINs
● 1a) The most frequently reviewed products have average review ratings in the 4.5 - 4.8
range, with little variance.
● 1b) Although there is a slight inverse relationship between ASIN frequency and
average review rating for the first 4 ASINs, this relationship is not significant, since the
average ratings for the first 4 ASINs are all between 4.5 and 4.8, which is considered a good
overall rating.
● 2a) For ASINs with lower frequencies, as shown on the bar graph (top), we see that their
corresponding average review ratings on the point-plot graph (bottom) have significantly higher
variance, as shown by the length of the vertical lines. As a result, we suggest that the average
review ratings for ASINs with lower frequencies are not significant for our analysis due to
high variance.
● 2b) On the other hand, the lower frequencies of these ASINs may themselves be
a result of lower-quality products.
● 2c) Furthermore, the last 4 ASINs have no variance due to their significantly lower
frequencies. Although their review ratings are a perfect 5.0, we should not treat
these ratings as significant because of the low frequency, as explained in 2a).
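The points above rest on the mean and variance of ratings per ASIN. A minimal sketch of that computation follows, using only the standard library; the (asin, rating) pairs are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Made-up (asin, rating) pairs for illustration only.
reviews = [
    ("A1", 5), ("A1", 4), ("A1", 5), ("A1", 5), ("A1", 4), ("A1", 5),
    ("B2", 5), ("B2", 1), ("B2", 3),
    ("C3", 5),
]

ratings_by_asin = defaultdict(list)
for asin, rating in reviews:
    ratings_by_asin[asin].append(rating)

# Review count, mean rating and population variance for each product.
summary = {
    asin: {"count": len(ratings), "mean": mean(ratings), "variance": pvariance(ratings)}
    for asin, ratings in ratings_by_asin.items()
}
for asin, stats in summary.items():
    print(asin, stats)
```

In this toy data the product with a single review has a perfect mean of 5 and zero variance, which is why such ratings carry little weight on their own.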
6.5 Correlations
Table 6.5.1 Correlations for each attribute
From our data exploration of ASINs and reviews.rating above, we discovered that
many ASINs with low occurrence have high variance. As a result, we concluded that these low-occurrence
ASINs are not significant in our analysis, given the small sample size.
Similarly, in our correlation analysis between ASINs and reviews.rating, we see almost no
correlation, which is consistent with these findings.
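For reference, a correlation coefficient of the kind reported in such a table can be computed as below. This is a sketch under assumptions: the report does not say how ASINs were encoded numerically, so the review count per ASIN is used here as a hypothetical stand-in, and all values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up values: reviews per ASIN and that ASIN's average rating.
asin_review_counts = [6000, 3000, 1500, 40, 12, 3]
asin_mean_ratings = [4.6, 4.5, 4.7, 3.9, 4.9, 4.2]

print(round(pearson(asin_review_counts, asin_mean_ratings), 3))
```

A value near 0 means no linear relationship, while values near +1 or -1 indicate a strong positive or negative one.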
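Test code:
def sentiments(rating):
    # Map a star rating to a sentiment label. The first condition was lost
    # from this listing; "4 or 5 stars is Positive" is assumed here.
    if rating >= 4:
        return "Positive"
    elif rating == 3:
        return "Neutral"
    else:
        return "Negative"

# Add the sentiment label to both splits and preview the first 20 labels.
strat_train["Sentiment"] = strat_train["reviews.rating"].apply(sentiments)
strat_test["Sentiment"] = strat_test["reviews.rating"].apply(sentiments)
strat_train["Sentiment"][:20]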
6.6.2 Extract Features
Here we will turn content into numerical feature vectors using the Bag of Words strategy:
- Assign a fixed integer id to each word occurring in the training set (building a dictionary
from words to integer indices).
- Store the counts in X[i,j], where i is the index of a document, j is the integer index of a word, and
X[i,j] is the number of times word j occurs in document i (over our training set).
In order to implement the Bag of Words strategy, we will use SciKit-Learn's CountVectorizer, which
performs the following:
* Text preprocessing:
* Tokenization (breaking sentences into words)
* Stopwords (filtering "the", "are", etc.)
* Occurrence counting (builds a dictionary mapping words to integer indices and counts word
occurrences)
* Feature Vector (converts the collection of text documents into a matrix of feature vectors)
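To make the steps above concrete, here is a small pure-Python sketch of the Bag of Words idea. It is not CountVectorizer itself; the stop-word list, the helper names tokenize and bag_of_words, and the example sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "are", "is", "a", "and"}  # tiny illustrative list


def tokenize(text):
    """Lowercase, split into word tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def bag_of_words(documents):
    """Return (vocabulary, X) where X[i][j] counts word j in document i."""
    tokenized = [tokenize(doc) for doc in documents]
    words = sorted({token for tokens in tokenized for token in tokens})
    vocabulary = {word: index for index, word in enumerate(words)}
    X = []
    for tokens in tokenized:
        counts = Counter(tokens)
        X.append([counts.get(word, 0) for word in words])
    return vocabulary, X


docs = ["The screen is great", "Great battery and great price"]
vocabulary, X = bag_of_words(docs)
print(vocabulary)
print(X)
```

Each row of X is one document and each column is one vocabulary word, so the second row records that "great" occurs twice in the second review.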
6.8 TFIDF
Here we have 27,701 training samples and 12,526 distinct words in our training sample.
Also, longer documents typically have higher average count values for words that carry
very little meaning, which overshadows shorter documents that have lower average counts for the
same relative frequencies. To reduce this effect, we will use TfidfTransformer:
Term Frequency (Tf) divides the number of occurrences of each word in a document by the total
number of words in that document.
Term Frequency times Inverse Document Frequency (Tfidf) downscales the weights of words that
occur in many documents (assigning less value to unimportant stop words such as "the" and "are").
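The weighting described above can be sketched as follows, using the textbook form tf * log(N / df). This is an assumption-laden illustration: scikit-learn's TfidfTransformer applies a smoothed variant by default, so its numbers will differ, and the count matrix here is made up.

```python
import math


def tf_idf(count_matrix):
    """Weight a document-term count matrix by tf * idf (textbook form)."""
    n_docs = len(count_matrix)
    n_terms = len(count_matrix[0])
    # Document frequency: in how many documents each term appears.
    doc_freq = [sum(1 for row in count_matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / df) if df else 0.0 for df in doc_freq]
    weighted = []
    for row in count_matrix:
        total = sum(row) or 1  # avoid dividing by zero for empty documents
        weighted.append([(count / total) * idf[j] for j, count in enumerate(row)])
    return weighted


# Columns: "the", "great", "heavy" (made-up counts for three reviews).
counts = [[3, 1, 0], [2, 0, 1], [4, 1, 0]]
for row in tf_idf(counts):
    print([round(value, 3) for value in row])
```

Because "the" appears in every document its weight drops to zero, while the rarer word "heavy" receives the largest weight.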
Naive Bayes is a simple and effective algorithm that is widely used for text analysis and for problems
with multiple classes. Bayes' theorem, formulated by Thomas Bayes, calculates the probability of an
event occurring based on prior knowledge of conditions related to that event. It is based on the following formula:
P(A|B) = P(A) * P(B|A) / P(B)    (eq. 1.1)
where we calculate the probability of class A given that predictor B has been observed.
● Multinomial Naive Bayes is most suitable for word counts, where data are typically represented
as word count vectors (the number of times outcome X[i,j] is observed over the n trials),
while ignoring non-occurrences of a feature i.
● Naive Bayes is a simplified application of Bayes' theorem in which all features are assumed to be
conditionally independent of each other given the class, so the model only needs P(x|y), where x is
a feature and y is the class.
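As an illustration of the two points above, the following is a toy Multinomial Naive Bayes written from scratch with add-one (Laplace) smoothing. It is a sketch, not the project's classifier: the class name TinyMultinomialNB and the four training sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(documents, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # log P(y) + sum over words of log P(x | y)
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocabulary)
            for word in text.lower().split():
                if word in self.vocabulary:  # skip words never seen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


train_texts = ["great tablet love it", "love the screen",
               "too heavy and slow", "slow and bad battery"]
train_labels = ["Positive", "Positive", "Negative", "Negative"]
model = TinyMultinomialNB().fit(train_texts, train_labels)
print(model.predict("love this great screen"))
print(model.predict("heavy and slow"))
```

Working in log space turns the product of per-word probabilities into a sum, which avoids numerical underflow on long reviews.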
Here we see that our Multinomial Naive Bayes classifier has a 93.45% accuracy level based on
the features.
Fig 6.6 Code for classifier
● Here we will run a Grid Search for the best parameters over a grid of possible values, instead of
tweaking the parameters of the various components of the chain by hand (e.g. use_idf in the tfidf transformer).
● We will also run the grid search with a LinearSVC classifier pipeline, its parameters and CPU core
maximization.
● Then we will fit the grid search to our training data set.
● Next we will use our final classifier (after fine-tuning) to test some arbitrary reviews.
● Finally we will test the accuracy of our final classifier (after fine-tuning).
Note that Support Vector Machines are well suited to classification because they place the decision
boundary using the examples that lie closest to it (the support vectors), that is, the hardest cases
to separate, so that the classifier can distinguish between Positive, Neutral and Negative correctly.
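The core of a grid search is an exhaustive loop over every parameter combination. The sketch below shows that loop in plain Python; it is not scikit-learn's implementation, the function names grid_search and toy_score are hypothetical, and the scoring function returns invented numbers in place of a real cross-validated accuracy.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid; return the best one and its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Stand-in for "mean cross-validated accuracy"; the numbers are invented.
    score = 0.90
    score += 0.02 if params["use_idf"] else 0.0
    score += 0.01 if params["ngram_range"] == (1, 2) else 0.0
    return score


param_grid = {"use_idf": [True, False], "ngram_range": [(1, 1), (1, 2)]}
best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, round(best_score, 2))
```

In a real run, the scoring function would train the pipeline with the given parameters and return its validation score, and the independent combinations can be evaluated in parallel across CPU cores.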
Chapter 7
● Analyze the best mean score of the grid search (classifier, parameters, CPU core)
● Analyze the best estimator
● Analyze the best parameter
● Here the best mean score of the grid search is 93.65%, which is very close to our accuracy
level of 94.08%
● The best estimator is also displayed
● Lastly, the best parameters are use_idf = True in tfidf and ngram_range = (1, 2)
Results:
● After testing some arbitrary reviews, it seems that our classifier is performing correctly,
returning Positive, Neutral and Negative results
● We also see that after running the grid search, our Support Vector Machine classifier has
improved to a 94.08% accuracy level
The results of this analysis confirm the previous data exploration, where the data are heavily
skewed towards positive reviews, as shown by the lower support counts for the neutral and negative
classes in the classification report.
Also, both neutral and negative reviews have a large standard deviation and small frequencies,
so we do not consider them significant, as reflected in their lower precision, recall and F1 scores in
the classification report.
However, although Neutral and Negative are not predicted strongly in this data set,
the classifier still shows a 94.08% accuracy level in predicting sentiment, and it
worked very well when we tested it on arbitrary input text (new_text). Therefore, we are comfortable here
with the skewed data set. Also, as more balanced data is added in the future, the model
can be retrained into a more balanced classifier, which we expect to increase the
accuracy level.
Finally, the overall result here explains that the products in this dataset are generally positively
rated.
Considering only rows 2-4 and columns 2-4, labeled negative, neutral and positive, we see that
positive sentiment is sometimes confused with neutral and negative ratings,
with counts of 246 and 104 respectively. However, compared with the count of 6445 for
positive sentiment, the confusion counts of 246 and 104 for neutral and negative
ratings respectively are considered insignificant.
Also, this is a result of a positively skewed dataset, which is consistent with both the data exploration
and the sentiment analysis. Therefore, we conclude that the products in this dataset are generally
positively rated and should be kept in Amazon's product roster.
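A confusion matrix like the one discussed above is built by counting (actual, predicted) label pairs. The following minimal sketch shows the construction; the label lists are made up and deliberately skewed towards Positive, and they do not reproduce the project's counts.

```python
LABELS = ["Negative", "Neutral", "Positive"]


def confusion_matrix(actual, predicted, labels=LABELS):
    """matrix[i][j] = number of items with actual labels[i] predicted as labels[j]."""
    index = {label: position for position, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(actual, predicted):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Made-up labels, skewed towards Positive like the review data.
actual = ["Positive"] * 6 + ["Neutral"] * 2 + ["Negative"] * 2
predicted = ["Positive"] * 6 + ["Positive", "Neutral"] + ["Positive", "Negative"]

for label, row in zip(LABELS, confusion_matrix(actual, predicted)):
    print(f"{label:>8}: {row}")
```

Off-diagonal entries in the Positive column are neutral or negative reviews that were predicted as positive, which is the kind of confusion described in the text.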
Results
The results of a sentiment analysis project can be evaluated based on several metrics and factors:
1. Accuracy: Accuracy is the most commonly used metric to measure the performance of a
sentiment analysis model. It indicates the proportion of correctly classified instances out of the
total number of instances. Higher accuracy indicates better performance.
2. Precision and Recall: Precision measures the proportion of true positive predictions (correctly
identified positive sentiments) out of all positive predictions made by the model. Recall, on the
other hand, measures the proportion of true positive predictions out of all actual positive instances in the data.
Both precision and recall are important in sentiment analysis as they provide insights into the
model's ability to correctly identify positive or negative sentiments.
3. F1-Score: The F1-score is the harmonic mean of precision and recall, providing a balanced
measure of the model's performance. It takes into account both precision and recall, making it a
useful metric when classes are imbalanced.
5. Error Analysis: Analyzing the misclassified instances can provide insights into the limitations
and challenges faced by the sentiment analysis model. Understanding the types of errors made
(e.g., misinterpreting sarcasm, handling negations, or context-specific sentiments) can guide
further improvements in the model or the preprocessing techniques.
It's important to note that the performance and results of a sentiment analysis project can vary
based on the quality and size of the training data, the choice of algorithms and models, and the
preprocessing techniques employed. Continuous monitoring and evaluation are necessary to
ensure the model's performance remains consistent and reliable.
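The metrics listed above can all be derived from a confusion matrix. The sketch below shows one way to compute them per class; the helper names are hypothetical and the counts are invented (rows are actual classes, columns are predicted classes), so they do not correspond to the project's reported figures.

```python
def per_class_metrics(matrix, labels):
    """Precision, recall and F1 per class from matrix[actual][predicted]."""
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)  # column sum
        actual_k = sum(matrix[k])                    # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(matrix):
    """Share of all instances that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


# Invented counts (rows = actual, columns = predicted), skewed to Positive.
labels = ["Negative", "Neutral", "Positive"]
matrix = [[1, 0, 1], [0, 1, 1], [0, 0, 6]]
print(round(accuracy(matrix), 2))
for label, scores in per_class_metrics(matrix, labels).items():
    print(label, {name: round(value, 2) for name, value in scores.items()})
```

In this toy example overall accuracy is high while recall for the Negative and Neutral classes is only 0.5, which shows why per-class metrics matter on imbalanced data.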
Chapter 8
As a result, we need to input more data in order to consider the significance of lower rated
products, in order to determine which products should be dropped from Amazon's product roster.
In conclusion, although we need more data to balance out the lower rated products to consider
their significance, we were still able to successfully associate positive, neutral and negative
sentiments for each product in Amazon's Catalog.
Conclusion
In conclusion, sentiment analysis is a valuable technique for automatically analyzing and
understanding the sentiments expressed in text data. The success of a sentiment analysis project
relies on data collection, preprocessing, model selection, training, and evaluation. By considering
metrics such as accuracy, precision, recall, F1-score, and analyzing the confusion matrix, we can
assess the performance of the sentiment analysis model.
Future Work
4. Emotion Analysis: Sentiment analysis primarily focuses on classifying text into positive,
negative, or neutral sentiments. However, emotions play a crucial role in understanding
human sentiments. Future work can involve developing models that can detect and
classify specific emotions such as joy, sadness, anger, or surprise.
5. Handling Sarcasm, Irony, and Figurative Language: Sentiment analysis models often
struggle with identifying sarcasm, irony, or sentiments expressed through figurative
language. Future research can explore techniques to better handle these linguistic
phenomena to improve the accuracy of sentiment analysis models.
6. Real-Time and Dynamic Sentiment Analysis: As sentiments change over time and in
response to events, real-time and dynamic sentiment analysis becomes crucial. Future
work can focus on developing models and algorithms that can analyze sentiments in
real-time, capture sentiment shifts, and adapt to changing sentiment patterns.
In summary, future work in sentiment analysis should aim to address the challenges of
domain-specific sentiments, context-aware analysis, multimodal data, emotions, linguistic
nuances, and real-time analysis to further enhance the accuracy and applicability of
sentiment analysis models in various domains and applications.
Chapter 9
References
[1]: B. Pang and L. Lee, "Opinion Mining and Sentiment Analysis" (2008).
https://www.cs.cornell.edu/home/llee/omsa/omsa.pdf
[2]: R. Socher et al., "Recursive Deep Models for Semantic Compositionality Over a Sentiment
Treebank" (EMNLP 2013).
https://www.researchgate.net/publication/284039049_Recursive_deep_models_for_semantic_compositionality_over_a_sentiment_treebank
Also available as EMNLP2013_RNTN.pdf (stanford.edu) and in the ACL Anthology.