Analysing the email data using stylometric method and deep learning to mitigate phishing attack

  • Original Research
International Journal of Information Technology

A Correction to this article was published on 01 August 2024

This article has been updated

Abstract

The high volume of email use has attracted cybercriminals to the platform. Criminals are aware that users often struggle to separate legitimate from illegitimate emails, and they exploit this difficulty by impersonating staff of a trusted organisation to persuade users into divulging private information. To help users detect phishing attacks, a system is proposed. Recent approaches use stylometric, gender and personality features to carry out sender verification. These existing approaches are complex, and if the system fails to detect a malicious email and it reaches the user, the likelihood of the user becoming a victim is high unless the user detects it. The proposed framework adds Colour Code to Email Verification (CCEV). It conducts sender verification at the recipient's end based on three features related to the sender: writing pattern, gender, and header.
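
The recipient-side check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the five stylometric features, the relative-difference similarity score, the From/Return-Path domain comparison, the thresholds and the green/amber/red labels are all assumptions chosen for illustration, and the gender feature and the deep learning model are omitted.

```python
import re
from email import message_from_string
from email.utils import parseaddr


def stylometric_features(text: str) -> list[float]:
    """Return a small, illustrative writing-pattern vector for one email body."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)
    return [
        sum(len(w) for w in words) / n_words,        # mean word length
        n_words / max(len(sentences), 1),            # mean sentence length (words)
        len({w.lower() for w in words}) / n_words,   # type-token ratio
        sum(c in ",;:" for c in text) / n_chars,     # punctuation rate
        sum(c.isupper() for c in text) / n_chars,    # uppercase rate
    ]


def profile_similarity(a: list[float], b: list[float]) -> float:
    """Score in [0, 1] for non-negative features: 1 minus the mean relative difference."""
    diffs = []
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        diffs.append(abs(x - y) / scale if scale else 0.0)
    return 1.0 - sum(diffs) / len(diffs)


def header_consistent(raw_email: str) -> bool:
    """Hypothetical header check: From and Return-Path share the same domain."""
    msg = message_from_string(raw_email)
    from_addr = parseaddr(msg.get("From", ""))[1]
    return_addr = parseaddr(msg.get("Return-Path", from_addr))[1]
    from_domain = from_addr.split("@")[-1].lower()
    return bool(from_domain) and from_domain == return_addr.split("@")[-1].lower()


def colour_code(similarity: float, header_ok: bool,
                high: float = 0.95, low: float = 0.80) -> str:
    """Map the two signals to a colour; the thresholds are illustrative assumptions."""
    if header_ok and similarity >= high:
        return "green"
    if header_ok and similarity >= low:
        return "amber"
    return "red"
```

In this sketch a sender profile would be the feature vector computed from that sender's earlier emails, and each incoming message is scored against it before a colour is shown to the recipient. The relative-difference score is used so that features on very different scales contribute equally.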

Fig. 1 Reproduced from Fig. 2 in [4]

Data availability

The dataset utilised was obtained from the web page of William Cohen, which can be accessed at http://www-2.cs.cmu.edu/~enron/

References

  1. Petelka J, Zou Y, Schaub F (2019) Put your warning where your link is: Improving and evaluating email phishing warnings. In: Proceedings of the 2019 CHI conference on human factors in computing systems. pp. 1–15.

  2. Li Q, Cheng M, Wang J, Sun B (2020) LSTM based phishing detection for big email data. IEEE Trans Big Data 8(1):278–288

  3. Halgaš L, Agrafiotis I, Nurse JR (2020) Catching the phish: detecting phishing attacks using recurrent neural networks (RNNs). In: Information security applications: 20th international conference, WISA 2019, Jeju Island, South Korea, August 21–24, 2019, revised selected papers. Springer International Publishing, pp. 219–233

  4. Rastenis J, Ramanauskaitė S, Janulevičius J, Čenys A, Slotkienė A, Pakrijauskas K (2020) E-mail-based phishing attack taxonomy. Appl Sci 10(7):2363

  5. Nurse JR (2018) Cybercrime and you: How criminals attack and the human factors that they seek to exploit. arXiv preprint arXiv:1811.06624.

  6. Alkhalil Z, Hewage C, Nawaf L, Khan I (2021) Phishing attacks: a recent comprehensive study and a new anatomy. Front Comput Sci 3:563060

  7. Humayun M, Jhanjhi NZ, Alsayat A, Ponnusamy V (2021) Internet of things and ransomware: evolution, mitigation and prevention. Egypt Inf J 22(1):105–117

  8. GOV.UK. Cyber security breaches survey 2022. [Online] Available from: https://www.gov.uk/government/statistics/cyber-security-breaches-survey-2022/cyber-security-breaches-survey-2022 [cited 2023 May 22].

  9. Anjana SA (2019) Security concerns and countermeasures in cloud computing: a qualitative analysis. Int J Inf Technol 11:683–690

  10. Goodman R, Hahn M, Marella M, Ojar C, Westcott S (2007) The use of stylometry for email author identification: a feasibility study. Proc Student/Faculty Research Day, CSIS, Pace University, White Plains, NY. 1-7

  11. Widup S, Rudis B, Hylender D, Spitler M, Thompson K, Baker WH, Bassett G, Karambelkar B, Brannon SK, Kennedy D (2015) Verizon Data Breach Investigations Report. URL: 1–2-DBIR-Widup (nist.gov) [Accessed 2022–03–22].

  12. Alzahrani SM, Salim N, Abraham A (2011) Understanding plagiarism linguistic patterns, textual features, and detection methods. IEEE Trans Syst Man Cybern Part C Appl Rev 42(2):133–149

  13. Vayansky I, Kumar S (2018) Phishing–challenges and solutions. Comput Fraud Secur 2018(1):15–20

  14. Nmachi WP, Win T (2021) Mitigating phishing attack in organisations: a literature review. In: CS & IT conference proceedings 2021 (Vol. 11, No. 1). CS & IT conference proceedings.

  15. Sharma P, Dash B, Ansari MF (2022) Anti-phishing techniques–a review of cyber defense mechanisms. Int J Adv Res Comput Commun Eng ISO 31(3297):2007

  16. Evans K, Abuadbba A, Wu T, Moore K, Ahmed M, Pogrebna G, Nepal S, Johnstone M (2022) Raider: Reinforcement-aided spear phishing detector. In: International conference on network and system security. Cham: Springer Nature Switzerland. pp 23–50

  17. Al-Hamar Y, Kolivand H, Tajdini M, Saba T, Ramachandran V (2021) Enterprise credential spear-phishing attack detection. Comput Electr Eng 1(94):107363

  18. Fette I, Sadeh N, Tomasic A (2007) Learning to detect phishing emails. In: Proceedings of the 16th international conference on world wide web, pp. 649–656

  19. Khonji M, Iraqi Y, Jones A (2012) Enhancing phishing e-mail classifiers: a lexical url analysis approach. Int J Inf Secur Res (IJISR) 2(1/2):40

  20. Smadi S, Aslam N, Zhang L (2018) Detection of online phishing email using dynamic evolving neural network based on reinforcement learning. Decis Support Syst 1(107):88–102

  21. Hota HS, Shrivas AK, Hota R (2018) An ensemble model for detecting phishing attack with proposed remove-replace feature selection technique. Procedia Comput Sci 1(132):900–907

  22. Lötter A, Futcher L (2015) A framework to assist email users in the identification of phishing attacks. Inf Comput Secur 23(4):370–381

  23. Li JS, Chen LC, Monaco JV, Singh P, Tappert CC (2017) A comparison of classifiers and features for authorship authentication of social networking messages. Concurr Comput: Pract Exp 29(14):e3918

  24. Abbasi A, Chen H (2008) Writeprints: a stylometric approach to identity-level identification and similarity detection in cyberspace. ACM Trans Inf Syst (TOIS) 26(2):1–29

  25. Beigi G, Liu H (2020) A survey on privacy in social media: Identification, mitigation, and applications. ACM Trans Data Sci 1(1):1–38

  26. Afroz S, Brennan M, Greenstadt R (2012) Detecting hoaxes, frauds, and deception in writing style online. In: 2012 IEEE symposium on security and privacy. IEEE. pp 461–475

  27. Liu Y, Wu YF (2020) Fned: a deep network for fake news early detection on social media. ACM Trans Inf Syst (TOIS) 38(3):1–33

  28. Afroz S, Islam AC, Stolerman A, Greenstadt R, McCoy D (2014) Doppelgänger finder: taking stylometry to the underground. In: 2014 IEEE symposium on security and privacy. IEEE. pp 212–226

  29. McDonald AW, Afroz S, Caliskan A, Stolerman A, Greenstadt R (2012) Use fewer instances of the letter “i”: Toward writing style anonymization. In: Privacy enhancing technologies: 12th international symposium, PETS 2012, Vigo, Spain, July 11–13, 2012, proceedings. Springer Berlin Heidelberg, pp 299–318

  30. Narayanan A, Paskov H, Gong NZ, Bethencourt J, Stefanov E, Shin EC, Song D (2012) On the feasibility of internet-scale author identification. In: 2012 IEEE symposium on security and privacy. IEEE. pp 300–314

  31. Ledger G, Merriam T (1994) Shakespeare, Fletcher, and The Two Noble Kinsmen. Lit Linguist Comput 9(3):235–248

  32. De Vel O, Anderson A, Corney M, Mohay G (2001) Mining e-mail content for author identification forensics. ACM SIGMOD Rec 30(4):55–64

  33. Nizamani S, Memon N (2013) CEAI: CCM-based email authorship identification model. Egypt Inf J 14(3):239–249

  34. Iqbal F, Khan LA, Fung BC, Debbabi M (2010) E-mail authorship verification for forensic investigation. In: Proceedings of the 2010 ACM symposium on applied computing. pp. 1591–1598

  35. Lin E, Aycock J, Mannan M (2012) Lightweight client-side methods for detecting email forgery. In: Information security applications: 13th international workshop, WISA 2012, Jeju Island, Korea, August 16–18, 2012, revised selected papers. Springer Berlin Heidelberg, pp 254–269

  36. Brocardo ML, Traore I, Saad S, Woungang I (2013) Authorship verification for short messages using stylometry. In: 2013 international conference on computer, information and telecommunication systems (CITS). IEEE. pp 1–6

  37. Stringhini G, Thonnard O (2014) That ain't you: detecting spearphishing emails before they are sent. arXiv preprint arXiv:1410.6629

  38. Duman S, Kalkan-Cakmakci K, Egele M, Robertson W, Kirda E (2016) Emailprofiler: spearphishing filtering with header and stylometric features of emails. In: 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), Vol. 1. IEEE. pp 408–416

  39. Xiujuan W, Chenxi Z, Kangfeng Z, Haoyang T, Yuanrui T (2019) Detecting spear-phishing emails based on authentication. In: 2019 IEEE 4th international conference on computer and communication systems (ICCCS). IEEE. pp 450–456

  40. Ding X, Liu B, Jiang Z, Wang Q, Xin L (2021) Spear phishing emails detection based on machine learning. In: 2021 IEEE 24th international conference on computer supported cooperative work in design (CSCWD). IEEE. pp 354–359

  41. Mishra S, Jabin S (2023) Anomaly detection in surveillance videos using deep autoencoder. Int J Inf Technol 24:1–2

  42. Rajak A, Tripathi R (2023) DL-SkLSTM approach for cyber security threats detection in 5G enabled IIoT. Int J Inf Technol 18:1–8

  43. Jain G, Sharma M, Agarwal B (2019) Optimizing semantic LSTM for spam detection. Int J Inf Technol 4(11):239–250

  44. Priya CS, Deepalakshmi P (2023) Sentiment analysis from unstructured hotel reviews data in social network using deep learning techniques. Int J Inf Technol 15(7):3563–3574

  45. Nmachi Wosah P (2023) A framework for securing email entrances and mitigating phishing impersonation attacks. arXiv e-prints. arXiv-2312.

Author information

Corresponding author

Correspondence to Peace Nmachi Wosah.

Ethics declarations

Conflict of interest

There are no conflicts of interest to disclose as all views presented in this paper belong to the author alone, and not any institution. I declare that I have no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wosah, P.N., Ali Mirza, Q. & Sayers, W. Analysing the email data using stylometric method and deep learning to mitigate phishing attack. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01839-5

  • DOI: https://doi.org/10.1007/s41870-024-01839-5
