Introduction
The CIA Triad—Confidentiality, Integrity, and Availability—has been the foundational framework for information security since the 1970s. Every security control, policy, and architecture decision ultimately maps back to protecting one or more of these three properties. While the triad remains relevant, AI systems introduce new dimensions that require us to revisit and extend this classic model.
This section examines each pillar of the CIA Triad through the lens of AI and machine learning systems, explores necessary extensions like Authenticity and Non-repudiation, and identifies AI-specific risks that traditional security frameworks were never designed to address.
Confidentiality, Integrity, Availability
Confidentiality ensures that information is accessible only to authorized parties. In traditional systems, this means encryption, access controls, and data classification. For AI systems, confidentiality extends to training data, model weights, hyperparameters, and inference outputs. A leaked model can expose proprietary algorithms worth millions in R&D investment.
Integrity ensures that information is accurate and has not been tampered with. For AI, integrity means that training data has not been poisoned, model weights have not been modified, and inference results can be trusted. A compromised model that produces subtly wrong outputs can be more dangerous than one that fails completely, because incorrect results may go undetected for extended periods.
Availability ensures that systems and data are accessible when needed. AI systems face unique availability challenges: models require significant computational resources, inference latency must meet operational requirements, and adversarial inputs designed to cause excessive computation (sponge examples) can create targeted denial-of-service conditions.
Think About It: When a traditional database is breached, you know data was stolen. When an AI model's training data is poisoned, the model may appear to function normally while producing subtly biased or incorrect outputs. This makes integrity attacks on AI systems particularly insidious and difficult to detect.
Extending the Triad for AI
Modern security frameworks increasingly recognize that three pillars are insufficient for comprehensive security. Two additional properties have gained prominence, and both have heightened importance in AI systems.
Authenticity ensures that the origin of data or communications can be verified. In AI systems, authenticity extends to model provenance—can you verify that a pre-trained model came from the claimed source and was not tampered with during distribution? As organizations increasingly rely on third-party models from platforms like Hugging Face, verifying model authenticity becomes critical.
- Authenticity: Verifying the origin and identity of data, models, and system components. Prevents model supply chain attacks where trojaned models are distributed as legitimate.
- Non-repudiation: Ensuring that actions cannot be denied after the fact. Critical for AI-driven decisions in regulated industries where audit trails must prove who (or what system) made specific decisions and when.
Together, these five properties—Confidentiality, Integrity, Availability, Authenticity, and Non-repudiation—form a more complete framework for securing AI systems. Every security architecture we discuss in this book will reference these properties as design objectives.
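One concrete way to answer the provenance question raised above, as an added example that is not the book's own code: the publisher records a SHA-256 digest for every file in a model release, the manifest travels out of band (for instance pinned in the deployment repository), and the consumer checks every file, and the absence of unpinned extras, before the model is loaded. The directory layout, file names and file contents below are constructed for the demo.

```python
import hashlib
import hmac
import json
import os
import tempfile

def sha256_file(path, chunk_size=1 << 20):
    """Stream the file so multi-gigabyte weight files never sit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(model_dir):
    """Producer side: map every file's relative path to its SHA-256 digest."""
    manifest = {}
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, model_dir).replace(os.sep, "/")
            manifest[rel] = sha256_file(full)
    return manifest

def verify_release(model_dir, manifest):
    """Consumer side: list every deviation from the pinned manifest (empty means
    safe to load). Extra, unpinned files are reported as well."""
    actual = build_manifest(model_dir)
    problems = []
    for rel, expected in manifest.items():
        if rel not in actual:
            problems.append(f"missing: {rel}")
        elif not hmac.compare_digest(actual[rel], expected):
            problems.append(f"digest mismatch: {rel}")
    problems += [f"unexpected file: {rel}" for rel in actual if rel not in manifest]
    return sorted(problems)

def demo():
    """Constructed release: pin two files, then tamper one and drop in another."""
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "config.json"), "w") as f:
            json.dump({"hidden_size": 16}, f)
        with open(os.path.join(d, "model.bin"), "wb") as f:
            f.write(bytes(range(256)) * 16)  # deterministic stand-in for weights
        manifest = build_manifest(d)      # what the publisher would ship
        clean = verify_release(d, manifest)
        with open(os.path.join(d, "model.bin"), "r+b") as f:
            f.seek(100)
            f.write(b"\xff")              # single-byte change in transit
        with open(os.path.join(d, "extra.pkl"), "wb") as f:
            f.write(b"not in the manifest")
        tampered = verify_release(d, manifest)
    return clean, tampered
```

Calling demo() returns an empty problem list for the untouched release and two findings for the tampered one; in production the manifest itself should be signed or pinned in version control, since a manifest fetched from the same place as the model proves nothing.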
AI-Specific Risks to the Triad
AI systems introduce threat categories that have no direct equivalent in traditional information security. Understanding these risks is essential for any security professional working with AI/ML systems.
- Model Poisoning (Integrity): An attacker injects malicious samples into training data to cause the model to learn incorrect patterns. One example is poisoning a spam classifier so that it always allows emails containing a specific trigger phrase.
- Adversarial Inputs (Integrity): Carefully crafted inputs that cause a model to misclassify or produce incorrect outputs. A stop sign with subtle stickers might be classified as a speed limit sign by a vision model.
- Data Leakage (Confidentiality): Models can memorize and reveal training data through targeted queries. Language models have been shown to reproduce verbatim text, phone numbers, and email addresses from their training sets when prompted with specific patterns.
- Model Extraction (Confidentiality): Attackers query a model systematically to reconstruct a functionally equivalent copy, stealing the intellectual property embedded in the model's learned parameters.
Why This Matters: Traditional security tools like firewalls, IDS, and antivirus were designed to protect data at rest and in transit. They have no understanding of model integrity, adversarial robustness, or training data provenance. Securing AI requires new tools and frameworks built specifically for these threats—topics we will explore in depth throughout Parts III and IV of this book.