An efficient fake account identification in social media networks: Facebook and Instagram using NSGA-II algorithm

Sallah, Amine; Abdellaoui Alaoui, El Arbi; Hessane, Abdelaaziz; Agoujil, Said; Nayyar, Anand

doi:10.1007/s00521-024-10350-8

Original Article
Published: 28 August 2024

Neural Computing and Applications, Volume 36, pages 21487–21515 (2024)

Amine Sallah¹,
El Arbi Abdellaoui Alaoui²,
Abdelaaziz Hessane¹,
Said Agoujil³ &
…
Anand Nayyar⁴,⁵ (ORCID: orcid.org/0000-0002-9821-6146)

Abstract

The widespread use of online social networks (OSNs) has made them prime targets for cyber attackers, who exploit these platforms for various malicious activities. As a result, a whole black-market industry has emerged around the sale of fake accounts. As OSNs have grown, the number of fraudulent accounts has expanded rapidly. This research therefore focuses on detecting fraudulent profiles on Instagram and Facebook and aims to find an optimal subset of features that can effectively differentiate between real and fake accounts. The problem is formulated as a multiobjective optimization task that maximizes classification accuracy while minimizing the number of selected features. NSGA-II (non-dominated sorting genetic algorithm II) is employed as the optimization algorithm to explore the trade-offs between these conflicting objectives. The study proposes a novel feature selection approach based on NSGA-II for detecting fake accounts. The methodology takes as input the features characterizing the profiles under investigation, and the selected features are used to train a machine learning model. The model's performance is evaluated using several metrics, including precision, recall, F1-score, and the receiver operating characteristic (ROC) curve. The final prediction model achieved accuracy values ranging from 90 to 99.88%. The results indicate that the model, using features selected by NSGA-II, delivered high prediction accuracy while using less than 31% of the total feature space. This efficient feature selection allowed precise differentiation between fake and real users, demonstrating the model's effectiveness with a minimal number of input variables. Furthermore, the experimental results show that the proposed approach outperforms other existing approaches.
The paper also focuses on explainability, that is, the ability to understand and interpret the decisions and outcomes of machine learning models.
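
The trade-off described in the abstract can be illustrated with a minimal sketch of Pareto dominance, the relation on which NSGA-II's non-dominated sorting is built. This is not the authors' implementation: the candidate names, error rates, and feature counts below are hypothetical values chosen for illustration, and only the first non-dominated front is extracted (a full NSGA-II would also rank later fronts, compute crowding distance, and evolve the population). Maximizing accuracy is expressed here as minimizing the error rate, so both objectives are minimized.

```python
from typing import List, Tuple

# A candidate feature subset: (label, error_rate, n_selected_features).
# Both objectives are minimized. All values are hypothetical, not from the paper.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one of them."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates
    (the first front of a non-dominated sort)."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


candidates: List[Candidate] = [
    ("A", 0.10, 3),
    ("B", 0.05, 5),
    ("C", 0.05, 8),  # same error as B but more features: dominated by B
    ("D", 0.02, 9),
    ("E", 0.12, 4),  # worse than A on both objectives: dominated by A
]

front = pareto_front(candidates)
front_names = [c[0] for c in front]
print(front_names)  # ['A', 'B', 'D']
```

Each surviving candidate represents a different compromise: A uses the fewest features, D has the lowest error, and B sits between them. Choosing one of them is a separate decision that the optimization itself does not make.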

Data availability

We declare that all the data associated with the manuscript are mentioned in the manuscript.

Funding

The authors received no specific funding for this study.

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Sciences and Techniques, Moulay Ismail University of Meknes, Errachidia, Morocco
Amine Sallah & Abdelaaziz Hessane
Department of Sciences, Ecole Normale Supérieure, Moulay Ismail University of Meknes, Errachidia, Morocco
El Arbi Abdellaoui Alaoui
École Nationale de Commerce et de Gestion, Moulay Ismail University of Meknes, El Hajeb, Morocco
Said Agoujil
Faculty of Information Technology, Duy Tan University, Da Nang, 550000, Viet Nam
Anand Nayyar
Graduate School, Duy Tan University, Da Nang, 550000, Viet Nam
Anand Nayyar

Corresponding author

Correspondence to Anand Nayyar.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest to report regarding the present study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sallah, A., Abdellaoui Alaoui, E.A., Hessane, A. et al. An efficient fake account identification in social media networks: Facebook and Instagram using NSGA-II algorithm. Neural Comput & Applic 36, 21487–21515 (2024). https://doi.org/10.1007/s00521-024-10350-8

Received: 03 July 2023
Accepted: 29 July 2024
Published: 28 August 2024
Issue Date: December 2024
DOI: https://doi.org/10.1007/s00521-024-10350-8
