Analysing the email data using stylometric method and deep learning to mitigate phishing attack

Wosah, Peace Nmachi; Ali Mirza, Qublai; Sayers, Will

doi:10.1007/s41870-024-01839-5

Original Research
Published: 05 May 2024

International Journal of Information Technology

A Correction to this article was published on 01 August 2024

This article has been updated

Abstract

The high volume of email usage has attracted cybercriminals to the platform. Criminals are aware that users often have difficulty separating legitimate from illegitimate emails, and they seek to exploit that difficulty by impersonating staff of a trusted organisation to persuade users into divulging their private information. To help users overcome the difficulty of detecting phishing attacks, a system is proposed. Recent approaches use stylometric, gender and personality features to carry out sender verification. These existing approaches are complex, and if the system fails to detect a malicious email and it reaches the user, the likelihood of the user becoming a victim is high unless the user detects it. The proposed framework adds Colour Code to Email Verification (CCEV). It conducts sender verification at the recipient's end based on three features related to the sender: writing pattern, gender, and header.
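As a rough illustration of the idea in the abstract, the sketch below extracts a few writing-pattern features from an email body, compares them with a stored profile of the claimed sender, and maps the result to a colour. It is not the authors' implementation: the feature set, the similarity measure, the function names, the thresholds and the green/amber/red palette are all assumptions made for this example, the gender feature is omitted, and the header check is reduced to a single boolean.

```python
import re
import string


def stylometric_features(text: str) -> list[float]:
    """Return a small, illustrative writing-pattern vector for one email body."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)
    n_words = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n_words,                  # mean word length
        len(words) / max(len(sentences), 1),                   # mean words per sentence
        sum(c in string.punctuation for c in text) / n_chars,  # punctuation ratio
        sum(c.isupper() for c in text) / n_chars,              # uppercase ratio
        len({w.lower() for w in words}) / n_words,             # type-token ratio
    ]


def profile_similarity(a: list[float], b: list[float]) -> float:
    """Mean per-feature agreement in [0, 1]; 1.0 means identical vectors.

    Assumed measure: 1 - relative difference, so no feature dominates by scale.
    """
    scores = []
    for x, y in zip(a, b):
        denom = max(abs(x), abs(y))
        scores.append(1.0 if denom == 0 else 1.0 - abs(x - y) / denom)
    return sum(scores) / len(scores)


def colour_code(similarity: float, header_ok: bool,
                green_at: float = 0.95, amber_at: float = 0.80) -> str:
    """Map a similarity score and a header check to a colour (hypothetical thresholds)."""
    if header_ok and similarity >= green_at:
        return "green"
    if similarity >= amber_at:
        return "amber"
    return "red"


# Usage: compare an incoming message with a profile built from earlier emails.
profile = stylometric_features("Hi team. Please find the report attached. Thanks!")
incoming = stylometric_features("URGENT!!! VERIFY YOUR ACCOUNT NOW OR LOSE ACCESS")
print(colour_code(profile_similarity(profile, incoming), header_ok=False))
```

In a real system the profile would be built from many messages per sender and the decision would come from a trained model rather than fixed thresholds; the point here is only how a per-sender score could be surfaced to the recipient as a colour.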


Data availability

The dataset utilised was obtained from the web page of William Cohen, which can be accessed at http://www-2.cs.cmu.edu/~enron/
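For readers who want to work with this dataset, the following minimal sketch shows one way to separate header fields from the body of a single message using Python's standard `email` package. It assumes each message is stored as a plain-text, RFC 822-style file; the sample message, addresses and the `parse_message` helper are invented for illustration and are not taken from the dataset or the paper.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample in the plain-text header/body layout assumed above.
RAW = """\
Message-ID: <123@example.invalid>
Date: Mon, 14 May 2001 16:39:00 -0700 (PDT)
From: alice@example.com
To: bob@example.com
Subject: Meeting

Please see the attached agenda.
"""


def parse_message(raw: str) -> dict:
    """Split one raw message into the sender address, subject and body text."""
    msg = message_from_string(raw)
    _, sender = parseaddr(msg.get("From", ""))
    return {
        "sender": sender.lower(),
        "subject": msg.get("Subject", ""),
        "body": msg.get_payload(),  # a plain string for non-multipart messages
    }


record = parse_message(RAW)
print(record["sender"], "|", record["subject"])
```

Grouping the resulting records by `sender` would give the per-author collections of bodies and headers that a sender verification experiment needs.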

Change history

01 August 2024
A Correction to this paper has been published: https://doi.org/10.1007/s41870-024-02025-3


Author information

Authors and Affiliations

University of Gloucestershire, Cheltenham, UK
Peace Nmachi Wosah, Qublai Ali Mirza & Will Sayers

Corresponding author

Correspondence to Peace Nmachi Wosah.

Ethics declarations

Conflict of interest

There are no conflicts of interest to disclose as all views presented in this paper belong to the author alone, and not any institution. I declare that I have no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wosah, P.N., Ali Mirza, Q. & Sayers, W. Analysing the email data using stylometric method and deep learning to mitigate phishing attack. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01839-5

Received: 17 December 2023
Accepted: 23 March 2024
Published: 05 May 2024
DOI: https://doi.org/10.1007/s41870-024-01839-5
