Fake News Spreader Detection Using Naïve Bayes Classifier and Logistic Regression
ISSN No:-2456-2165
Abstract:- To date, most work on fake news has covered only one domain, namely political news. I have worked on political news as well as crime and film industry news.

I collected the dataset from Kaggle.com. The data is in text form, so it has to be converted to numeric form, which I did using CountVectorizer. To see which algorithm performs best, I compared Naïve Bayes classifiers and logistic regression; logistic regression gave the best accuracy.

Logistic regression gave the highest accuracy, 95%, and BernoulliNB gave the lowest, 78%.

Keywords:- ML, Naïve Bayes, Logistic Regression.

I. INTRODUCTION

Online news may be found in a variety of places, including social networking websites, computer programmes, news agency homepages, and fact-checking websites. There are several publicly available datasets for the categorization of fake news on the internet, including those from Buzzfeed News, BS Detector, and Kaggle. Since we spend an ever-increasing amount of our time online communicating with others through social media platforms, many people choose to search for and consume news from social media over traditional news organisations. In comparison to more traditional forms of journalism, such as newspapers or television, news is typically more current and less expensive to consume on social media. It is also simpler to share, discuss, and debate the news with friends or other readers on social media.

I could not try more algorithms while working in Jupyter because the dataset was large (more than 200000). The program took a long time to run, and the kernel died many times.

II. WHAT IS FAKE NEWS

Fake news spreads mostly on online platforms. Social media platforms such as Twitter, Facebook, WhatsApp, and Instagram can spread fake news, and a story can go viral within half an hour. Because everyone from small children to the elderly now uses social media, news goes viral very quickly.

During the COVID-19 period a great deal of fake news spread, for example a claim that the Maharashtra PM was in serious condition because of COVID, which was false.

I do not think that social media is all bad. Fake news is its disadvantage, but social media also updates us on world current affairs within seconds.

III. LITERATURE SURVEY

[1] "Detecting Fake News in Social Media Networks" by Shu et al. (2017): This paper proposes a framework for detecting fake news in social media networks based on textual and network features.
[2] "Fake News Detection on Social Media: A Data Mining Perspective" by Kumar et al. (2018): This paper presents a data mining approach to detecting fake news on social media using features such as user profile information, post content, and social network structure.
[3] "Combating Fake News: A Survey on Detection and Mitigation Techniques" by Karim et al. (2019): This paper provides a comprehensive survey of existing techniques for detecting and mitigating fake news, including both machine learning and rule-based approaches.
[4] "Fake News Detection on Twitter Using Machine Learning: A Comparative Study" by Iqbal et al. (2019): This paper compares the performance of several machine learning algorithms for detecting fake news on Twitter, using features such as sentiment analysis, user credibility, and linguistic patterns.
[5] "Fake News Detection on Social Media: A Review" by Nair and Singh (2020): This paper provides a comprehensive review of the existing literature on fake news detection on social media, including the challenges and future research directions in this field.
[6] "A Deep Learning Approach for Fake News Detection in Social Media" by Khan et al. (2021): This paper proposes a deep learning approach for detecting fake news on social media, using features such as word embeddings and attention mechanisms.
[7] "Fake News Detection on Social Media Using Geolocation Information" by Ahmadi et al. (2020): This paper investigates the use of geolocation information for detecting fake news spreaders on social media, by analyzing the location of users and the geographic distribution of content.
[8] "Identifying Fake News Spreading Accounts on Twitter" by Yang et al. (2020): This paper proposes a method for identifying fake news spreading accounts on Twitter, using features such as tweet content, user profile information, and network structure.
Undermining trust in institutions: Fake news spreaders can undermine trust in institutions such as the government, media, and scientific community. This can lead to a lack of confidence in public health measures or scientific research, for example.

Exacerbating social tensions: Fake news spreaders can exacerbate social tensions by spreading false information about various groups of people. This can lead to discrimination, prejudice, and even violence.

Overall, the impact of fake news spreaders on individuals can be significant and can lead to confusion, anxiety, and mistrust. It is important to take steps to detect and prevent the spread of fake news in order to ensure that accurate information is available to people and that they can make informed decisions based on facts and evidence.

But we have text data. Text data needs its own counterpart to feature scaling, namely vectorization. We use one of the best-known vectorization methods, CountVectorizer.

Classification algorithm:- There are many classification algorithms, such as decision tree, random forest, SVM, and KNN, but we use Multinomial Naive Bayes, Bernoulli Naive Bayes, Gaussian Naive Bayes, and Logistic Regression.

Generally, we need a procedure for representing text information for the ML algorithm. The bag-of-words (BoW) model is useful for this task and is simple to implement. It is one of the methods for extracting features from text for machine learning models: the input text is pre-processed by changing it into a bag of words. The bag of words can be represented as a table that lists each word together with its count.
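The bag-of-words table described above can be illustrated with a minimal sketch. It uses only the Python standard library and is a conceptual stand-in, not the CountVectorizer implementation used in this work; the helper names (`tokenize`, `build_vocabulary`, `to_count_vector`) and the toy headlines are assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct word to a column index, in alphabetical order."""
    words = sorted({word for doc in documents for word in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}


def to_count_vector(text, vocabulary):
    """Return the word counts of `text` as a list aligned with `vocabulary`."""
    counts = Counter(tokenize(text))
    return [counts.get(word, 0) for word in vocabulary]


# Hypothetical toy headlines, invented for illustration only.
docs = ["Fake news spreads fast", "Real news spreads slowly", "Fake fake claims"]
vocab = build_vocabulary(docs)
matrix = [to_count_vector(doc, vocab) for doc in docs]

for doc, row in zip(docs, matrix):
    print(row, "<-", doc)
```

Each row of `matrix` is one document and each column is one vocabulary word, so the third headline gets a count of 2 in the column for "fake". A library vectorizer may tokenize differently (for example by ignoring very short tokens), so the exact columns can differ from this sketch.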
Each news item is labeled as real or fake: 1 for real news and 0 for fake news. The dataset contains 20800 news items and is balanced, with 10413 positive (real) items and 10387 fake items.
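With labels encoded this way, a multinomial Naive Bayes classifier, one of the models compared in this work, can be sketched from raw word counts. This is a simplified standard-library illustration, not the code used for the reported results; the function names, the smoothing value `alpha=1.0`, and the toy training sentences are assumptions.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(texts, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log word likelihoods per class."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}

    model = {}
    for label, n_docs in class_docs.items():
        # Denominator: all word occurrences in the class plus smoothing mass.
        total = sum(word_counts[label].values()) + alpha * len(vocabulary)
        model[label] = {
            "log_prior": math.log(n_docs / len(labels)),
            "log_likelihood": {
                word: math.log((word_counts[label][word] + alpha) / total)
                for word in vocabulary
            },
        }
    return model


def predict(model, text):
    """Return the label with the highest log score; unseen words are skipped."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for word in text.lower().split():
            if word in params["log_likelihood"]:
                score += params["log_likelihood"][word]
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: 1 = real news, 0 = fake news.
texts = [
    "government publishes official budget report",
    "court confirms official verdict",
    "shocking miracle cure doctors hate",
    "shocking secret celebrity miracle",
]
labels = [1, 1, 0, 0]
model = train_multinomial_nb(texts, labels)

print(predict(model, "shocking miracle"))
print(predict(model, "official report"))
```

On this toy data the first query is assigned label 0 (fake) and the second label 1 (real), because each query's words occur more often in the corresponding class.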
Model Performance:-
1. TN / True Negative: when a case was negative and
predicted negative
2. TP / True Positive: when a case was positive and
predicted positive
3. FN / False Negative: when a case was positive but
predicted negative
4. FP / False Positive: when a case was negative but
predicted positive
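The four outcomes above can be counted directly from the true and predicted labels, and accuracy then follows as (TP + TN) divided by the total number of cases. The sketch below assumes 1 is the positive class; the function names and the label lists are hypothetical examples, not results from this study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TN, TP, FN and FP for binary labels."""
    counts = {"TN": 0, "TP": 0, "FN": 0, "FP": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            counts["TP" if predicted == positive else "FN"] += 1
        else:
            counts["FP" if predicted == positive else "TN"] += 1
    return counts


def accuracy(counts):
    """Fraction of cases predicted correctly: (TP + TN) / all cases."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


# Hypothetical labels, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(counts))
```

For these eight example cases the counts are TP = 3, TN = 3, FN = 1 and FP = 1, which gives an accuracy of 0.75.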
In conclusion, the Naive Bayes classifier and logistic regression are two popular and effective machine learning algorithms that can be used for fake news spreader detection. The Naive Bayes classifier is a probabilistic model based on Bayes' theorem; it is a simple and fast algorithm that can handle large datasets. Logistic regression is a linear model used to predict binary outcomes, and it is widely used for classification tasks.

Both algorithms can be trained on a dataset of labeled examples of fake news spreaders and non-spreaders, and then used to classify new instances of news stories or social media posts. The Naive Bayes classifier and logistic regression have been shown to achieve high accuracy in detecting fake news spreaders, and they can be used in combination with other techniques such as feature engineering and ensemble learning to improve their performance.

Overall, the detection of fake news spreaders using machine learning algorithms is an important area of research that can help to mitigate the impact of fake news on society. While no single algorithm can guarantee perfect accuracy, the use of multiple algorithms and techniques can help to improve the accuracy and reliability of fake news detection systems.