Social engineering email attacks remain a persistent threat despite commercial solutions and user training focused on identifying phishing indicators like urgency, unusual greetings, or inconsistent email addresses.
However, relying on users to detect phishing during routine email checks is prone to error.
This research investigates the use of NLP to assist users by automatically identifying weak explainable phishing indicators (WEPIs): subtle signals that may appear in legitimate emails but are often exploited in phishing attacks.
The study presents an annotated corpus of 940 emails labeled with 32 WEPIs, some of which are novel indicators.
Security analysts from the Information Sciences Institute in Los Angeles have recently provided insights into how frequently each WEPI occurs, where user training could be improved, and how well machine learning models perform at automating WEPI detection to enhance user vigilance.
Leveraging AI to Detect Phishing Attacks
Previous work has employed NLP and machine learning techniques, such as statistical methods and neural networks, to detect phishing emails based on extracted language features.
However, this study does not propose a new phishing detection algorithm. Instead, it highlights the need to update anti-phishing training for both humans and machines by defining 32 weak explainable phishing indicators (WEPIs) derived from analyzing anti-phishing recommendations and malicious emails.
WEPIs capture content associated with potential phishing (e.g., urgency, unusual requests) and verifiable discrepancies between stated identities or information and metadata or publicly available facts.
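To make the second kind of indicator concrete, the following is a minimal sketch of one verifiable discrepancy check: a sender whose display name mentions an organization while the address uses an unrelated domain. It is an illustration only, not the study's method. The function name `sender_discrepancy` and the `KNOWN_DOMAINS` table are hypothetical, and a real system would need a far more complete source of organization-to-domain facts.

```python
from email.utils import parseaddr

# Hypothetical lookup table (an assumption for this sketch): organization
# names mapped to domains they are assumed to legitimately send mail from.
KNOWN_DOMAINS = {
    "paypal": {"paypal.com"},
    "microsoft": {"microsoft.com", "outlook.com"},
}


def sender_discrepancy(from_header: str) -> bool:
    """Return True when the display name mentions a listed organization
    but the address domain is not one of that organization's domains."""
    display_name, address = parseaddr(from_header)
    domain = address.rpartition("@")[2].lower()
    name = display_name.lower()
    for org, domains in KNOWN_DOMAINS.items():
        if org in name:
            # Accept an exact match or any subdomain of a listed domain.
            matches = any(domain == d or domain.endswith("." + d) for d in domains)
            if not matches:
                return True
    return False


if __name__ == "__main__":
    print(sender_discrepancy("PayPal Support <help@paypa1-secure.net>"))
    print(sender_discrepancy("PayPal <service@mail.paypal.com>"))
```

A mismatch like this is a weak indicator in the sense described above: it can also occur in legitimate mail (for example, messages sent through a third-party mailing service), so it is better surfaced to the user as an explanation than treated as a verdict.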
An annotated corpus of 940 emails, labeled with these WEPIs across different linguistic scopes (words, sentences, messages), is presented to enable training and benchmarking of automated WEPI detection models to complement human vigilance.
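One way to picture labels at different linguistic scopes is as character spans attached to each email. The sketch below is an assumed representation, not the corpus's actual file format; the class names, the label names (`urgency`, `generic_greeting`, `threat`), and the example email are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Scope(Enum):
    WORD = "word"
    SENTENCE = "sentence"
    MESSAGE = "message"


@dataclass(frozen=True)
class Annotation:
    label: str    # indicator name (hypothetical names in this sketch)
    scope: Scope  # linguistic scope the label applies to
    start: int    # character offset, inclusive
    end: int      # character offset, exclusive


@dataclass
class AnnotatedEmail:
    text: str
    annotations: list[Annotation] = field(default_factory=list)

    def spans(self, label: str) -> list[str]:
        """Return the text covered by every annotation with this label."""
        return [self.text[a.start:a.end] for a in self.annotations if a.label == label]


def label_frequencies(corpus: list[AnnotatedEmail]) -> Counter:
    """Count how many emails contain each label at least once."""
    counts: Counter = Counter()
    for email in corpus:
        counts.update({a.label for a in email.annotations})
    return counts


def annotate(email: AnnotatedEmail, phrase: str, label: str, scope: Scope) -> None:
    """Helper for the example: label the first occurrence of a phrase."""
    start = email.text.index(phrase)
    email.annotations.append(Annotation(label, scope, start, start + len(phrase)))


if __name__ == "__main__":
    msg = AnnotatedEmail("Dear customer, act now or your account will be closed.")
    annotate(msg, "Dear customer", "generic_greeting", Scope.WORD)
    annotate(msg, "act now", "urgency", Scope.WORD)
    annotate(msg, msg.text, "threat", Scope.MESSAGE)
    print(msg.spans("urgency"))
    print(label_frequencies([msg]))
```

Storing offsets rather than copied strings keeps word-, sentence-, and message-level labels in one structure, and counting per email rather than per span gives the kind of indicator frequency that can be compared across a corpus.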
The annotation process involved a combination of paid students and authors who followed specific guidelines, iteratively refining their work until achieving high inter-annotator agreement.
The performance of pre-trained language models, such as BERT and RoBERTa, on the 32 WEPI labels across different linguistic scopes served as the baseline.
The corpus is intended to expose both the challenges machines face in understanding natural language and the difficulty humans have in detecting phishing emails. Rather than automating everything, the goal is to facilitate combined human-machine approaches in which model predictions about interpretable indicators help users remain vigilant while reducing their cognitive load.
In summary, the researchers present an annotated dataset and trained models for identifying phishing email indicators. The study illustrates the benefits of applying natural language understanding models to phishing email detection and supports the development of a comprehensive phishing email identification curriculum.