"Evaluating the Effectiveness of LLM-Generated Phishing Campaigns" by Nathan Sniegowski

Date of Award

Spring 2025

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Despoina Perouli

Second Advisor

Charles Woodward

Third Advisor

Despoina Perouli

Fourth Advisor

Keyang Yu

Abstract

This paper investigates the effectiveness and security implications of Large Language Models (LLMs) in phishing campaigns. Existing research has explored using LLMs and AI for enterprise security tools and for automating phishing email processes. However, few studies have evaluated the effectiveness of fine-tuned LLMs in generating phishing email content and measured real-world user interaction. This research used LLMs to produce the body content of phishing emails, with phishing links manually inserted post-generation. The researcher performed three core experiments: (1) evaluating the success rate of jailbreaking three commercial LLMs to generate phishing content, (2) analyzing ChatGPT-4o mini's phishing emails based on institution-specific context and a historical phishing email, and (3) comparing user email interaction metrics between real-world traditional and LLM-generated phishing campaigns. The results revealed that, under certain conditions, LLM-generated emails are as effective as or better than traditional ones at compromising users, because they avoid common phishing indicators such as poor grammar, urgency, and formatting inconsistencies. Fine-tuned phishing emails demonstrated that credibility achieved through writing style transfer, contextual relevance, authority, and timing significantly influences recipient behavior. These findings emphasize the growing threat of AI-assisted phishing and highlight the need for enhanced detection methods, user awareness, and responsible LLM deployment.