McKinsey Hired an AI Agent to Test Its Security. It Found Full Database Access in Two Hours.
Meta’s AI agent triggered a Sev-1. A supply chain attack hit 300,000 AI agent users. The visibility gap is now an active attack surface.
Our previous briefing documented the visibility gap: 85% of enterprises have deployed AI, but only 25% can see what employees are doing with it. Attackers have started exploiting that gap. A red-team AI agent gained full database access to McKinsey’s internal AI platform in two hours, exposing 46.5 million messages, per CodeWall’s published findings. An AI agent operating within Meta’s internal systems triggered a Sev-1 incident after an engineer followed its guidance, resulting in unauthorized access to sensitive repositories, The Information and Digitimes reported. A coordinated supply chain attack planted 335 malicious skills across an AI agent marketplace, compromising 20% of the registry and exposing 300,000 users. These incidents suggest the gap between AI deployment and AI governance has become an active attack surface.
An AI Agent Hacked McKinsey in Two Hours. The Vulnerability Was Twenty Years Old.
CodeWall, a red-team security startup, pointed an autonomous AI agent at McKinsey’s internal AI platform Lilli on March 9. Within two hours, the agent discovered 22 unauthenticated API endpoints and exploited a SQL injection vulnerability through JSON field name concatenation, CodeWall reported. The result, per their findings: full read and write access to the entire production database.
CodeWall reported access to 46.5 million plaintext chat messages covering 18 months of internal conversations, including what the researchers described as client engagement details and internal strategy discussions. The vulnerability class (SQL injection) has been documented since the early 2000s. CodeWall found it present in a platform that, per public reporting, was built in 2023. Beyond the chat logs, the agent accessed 728,000 files and discovered that Lilli’s system prompts (the instructions controlling how the AI behaves) were stored in the same writable database. As CodeWall noted, an attacker with the same access could have rewritten the AI’s behavior without any deployment.
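CodeWall has not published the vulnerable code, but the vulnerability class it names (SQL injection through JSON field name concatenation) is well documented. The sketch below is a hypothetical illustration of that class: field *names* from attacker-supplied JSON are interpolated into SQL text, which is just as dangerous as interpolating values. Table and column names are invented for the example.

```python
import json
import sqlite3

def vulnerable_update(conn, raw_json: str) -> None:
    # BUG: the JSON field NAME is concatenated into the SQL text.
    # A key like "name = 'x' WHERE 1=1; --" rewrites the statement,
    # even though the value itself is safely parameterized.
    payload = json.loads(raw_json)
    for field, value in payload.items():
        conn.execute(f"UPDATE chats SET {field} = ? WHERE id = 1", (value,))

def safer_update(conn, raw_json: str) -> None:
    # FIX: treat field names as untrusted input and allowlist them;
    # parameter binding only protects values, never identifiers.
    allowed = {"name", "title"}
    payload = json.loads(raw_json)
    for field, value in payload.items():
        if field not in allowed:
            raise ValueError(f"unexpected field: {field}")
        conn.execute(f"UPDATE chats SET {field} = ? WHERE id = 1", (value,))
```

The design point: parameterized queries defend values, but any identifier built from user input needs an explicit allowlist.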
McKinsey’s response stated that its investigation “identified no evidence that client data [...] were accessed by this researcher or any other unauthorized third party.” The distinction matters, but so does the timeline: the vulnerability existed in production, and an AI agent following what CodeWall described as a standard reconnaissance playbook found and exploited it in two hours.
Meta’s AI Agent Went Rogue. Detection Took Two Hours.
On March 19, an AI agent operating within Meta’s internal systems autonomously generated and posted a flawed technical response on an internal discussion forum, The Information and Digitimes reported. An engineer followed the agent’s guidance, per the reports, triggering a privilege escalation that temporarily granted unauthorized engineers access to sensitive source code repositories. Meta classified the incident as Sev-1, its second-highest severity level. The pattern resembles what security researchers call a “confused deputy,” a process that uses its own elevated permissions to execute requests it shouldn’t. The two-hour exposure window between the incident and containment is worth noting for risk modeling purposes.
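The confused-deputy pattern is easier to see in code than in prose. The toy sketch below (not Meta’s systems; all names invented) shows a privileged helper that authorizes an action against its *own* permissions rather than the original requester’s, which is exactly how a low-privilege caller ends up with repository access.

```python
def grant_repo_access(repo_acl: dict, agent_perms: set, caller: str,
                      caller_perms: set, repo: str) -> bool:
    # BUG (confused deputy): only the agent's own elevated permissions
    # are checked, so any caller can borrow the agent's authority.
    if "admin" in agent_perms:
        repo_acl.setdefault(repo, set()).add(caller)
        return True
    return False

def grant_repo_access_safe(repo_acl: dict, agent_perms: set, caller: str,
                           caller_perms: set, repo: str) -> bool:
    # FIX: the deputy must also verify the ORIGINAL requester's
    # authority before acting on their behalf.
    if "admin" in agent_perms and "admin" in caller_perms:
        repo_acl.setdefault(repo, set()).add(caller)
        return True
    return False
```

This is the inventory question in the Bottom Line section: for each agent, does anything check the human requester’s permissions, or only the agent’s?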
The AI Agent Marketplace Is the New npm. It Has the Same Problems.
Repello AI researchers documented ClawHavoc, a coordinated supply chain attack that planted 335 malicious skills across the ClawHub marketplace targeting OpenClaw’s 300,000+ active users. At peak compromise, 20% of the registry contained malicious content, the researchers found. The attack used three vectors: prompt injection via SKILL.md files, reverse shell deployment through hidden Python scripts, and credential harvesting from runtime environment variables. As the researchers noted, “AI agent skill marketplaces are the new npm. They have the same growth dynamics, the same trust model problems, and now demonstrably the same attacker interest.”
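The three vectors the researchers describe lend themselves to automated triage. The sketch below is a naive heuristic scanner (not Repello AI’s tooling, and nowhere near a complete detector) that flags the corresponding signals in a skill’s SKILL.md and bundled Python: injection-style instructions, shell-spawning primitives, and environment-variable reads. All patterns are illustrative.

```python
import re

# Illustrative heuristics only; real malicious skills will evade
# simple pattern matching, so treat hits as triage signals, not verdicts.
INJECTION_HINTS = [r"ignore (all |previous )*instructions", r"do not tell the user"]
SHELL_HINTS = [r"subprocess\.Popen", r"os\.system", r"socket\.socket", r"/bin/sh"]
ENV_HINTS = [r"os\.environ", r"getenv\("]

def triage_skill(skill_md: str, python_src: str) -> list[str]:
    findings = []
    for pat in INJECTION_HINTS:
        if re.search(pat, skill_md, re.IGNORECASE):
            findings.append(f"possible prompt injection: {pat}")
    for pat in SHELL_HINTS:
        if re.search(pat, python_src):
            findings.append(f"possible reverse shell primitive: {pat}")
    for pat in ENV_HINTS:
        if re.search(pat, python_src):
            findings.append(f"reads environment variables: {pat}")
    return findings
```

Even a crude pass like this would have surfaced two of the three ClawHavoc vectors before install time, which is the npm lesson the researchers are pointing at.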
MCP Cannot Be Patched. The Risk Is Architectural.
Gianpietro Cutolo of Netskope presented research at RSAC 2026 demonstrating that Anthropic’s Model Context Protocol (MCP), the open standard for connecting LLMs to external data sources, introduces security risks that Cutolo characterized as architectural, not implementational. In Cutolo’s demonstration, a single poisoned email triggered coordinated actions across connected services: exfiltrating files, sending messages, and modifying records. “Organizations cannot patch or update their way out of risk,” Cutolo stated. The protocol’s design grants agents cross-service access by default. Each additional service connection extends the attack surface, Cutolo argued.
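If the risk is architectural, the mitigation direction is architectural too: deny cross-service tool calls by default and scope each agent to an explicit allowlist. The sketch below is a hypothetical gate, not part of the MCP specification; agent and tool names are invented.

```python
# Hypothetical per-agent tool allowlist: every tool call passes through
# an explicit scope check instead of inheriting cross-service access.
ALLOWED_TOOLS = {
    "email-triage-agent": {"mail.read"},                 # read-only by design
    "records-agent": {"records.read", "records.update"},
}

def authorize_tool_call(agent: str, tool: str) -> bool:
    """Deny by default; permit only tools explicitly scoped to the agent."""
    return tool in ALLOWED_TOOLS.get(agent, set())
```

Under this model, Cutolo’s poisoned-email scenario stops at the first unscoped call: an email-reading agent has no authority to send messages or modify records, whatever the injected text asks for.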
76% Now Call Shadow AI a Problem. 31% Don’t Know If They’ve Been Breached.
HiddenLayer’s 2026 AI Threat Landscape Report, released March 19, found that 76% of organizations now cite shadow AI as a definite or probable problem, up from 61% in 2025. One in eight companies reported AI breaches linked to agentic systems. Thirty-five percent of breaches were traced to malware in public model repositories. The number that should concern your security team: 31% of organizations are unaware whether they suffered an AI security breach at all. If a third of organizations cannot confirm whether they have been breached, incident response planning across the industry may be operating on incomplete data.
Your SOC 2 Report Might Be Fabricated
A collaborative investigation by former clients accused Delve, a compliance automation startup, of fabricating SOC 2 audit reports for hundreds of customers. The platform marketed itself as AI-driven but was, per the investigation, “practically devoid of any real AI,” relying on what the investigators described as pre-populated templates and certification partners that issued reports without genuine independent verification. Delve acted as both preparer and auditor of its own compliance documentation, an arrangement that the investigators argued conflicts with AICPA independence standards. If your organization has accepted a SOC 2 report from any vendor, the Delve case is a reason to verify who actually performed the audit.
Regulatory Radar
RSAC 2026 (Week of March 17-20): MCP security research presented publicly at a major conference. Netskope’s findings that MCP risks are architectural (not patchable) will likely inform future framework guidance on agent integration security. No regulatory action yet, but the research establishes the technical basis for governance requirements around agent-to-service connectivity.
The Bottom Line
Audit your AI platforms for authentication gaps. CodeWall’s assessment of McKinsey’s AI platform found 22 unauthenticated API endpoints. Run an API discovery scan (tools like Burp Suite, OWASP ZAP, or your existing DAST platform) against every AI tool your organization operates. If endpoints accept unauthenticated requests, that is a finding.
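A first-pass version of that scan takes a few lines: probe candidate API paths with no credentials attached and flag anything that answers 2xx. This is a minimal stdlib sketch, not a substitute for Burp Suite or ZAP; the base URL and path list are placeholders, and it should only ever be run against systems you are authorized to test.

```python
import urllib.error
import urllib.request

def find_unauthenticated(base_url: str, paths: list[str]) -> list[str]:
    """Return the paths that accept a request carrying no credentials."""
    exposed = []
    for path in paths:
        # Deliberately no Authorization header or cookie on the request.
        req = urllib.request.Request(base_url.rstrip("/") + path)
        try:
            with urllib.request.urlopen(req, timeout=5) as resp:
                if 200 <= resp.status < 300:
                    exposed.append(path)
        except urllib.error.HTTPError:
            pass  # 401/403/404: the endpoint rejected the anonymous request
        except urllib.error.URLError:
            pass  # unreachable; skip
    return exposed
```

Any path this returns is, in CodeWall’s terms, a finding: the endpoint served a response to a request with no identity attached.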
Inventory which AI agents have write access to enterprise systems. Meta’s incident occurred because an agent inherited permissions beyond its intended scope. List every agent with access to internal systems, what permissions it holds, and whether a human must approve actions before execution. If no such inventory exists, start one.
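The inventory described above can start as a simple structured record per agent. The field names below (permissions, human-approval flag) are assumptions for illustration, not from any cited standard; the useful query is the one at the end, which surfaces agents that can write without a human gate.

```python
from dataclasses import dataclass

@dataclass
class AgentRecord:
    name: str
    systems: list[str]              # internal systems the agent can reach
    permissions: set[str]           # e.g. {"read"} or {"read", "write"}
    requires_human_approval: bool   # must a human approve before execution?

def high_risk(inventory: list[AgentRecord]) -> list[str]:
    """Agents with write access to internal systems and no human gate."""
    return [a.name for a in inventory
            if "write" in a.permissions and not a.requires_human_approval]
```

Even this minimal schema answers the three questions the paragraph poses: which systems, which permissions, and whether a human is in the loop.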
Verify your vendors’ SOC 2 auditors independently. The Delve investigation revealed that compliance reports can be fabricated at scale. For every SOC 2 report you rely on, confirm the auditing firm is a licensed CPA practice and that the engagement letter names them as the independent assessor, not the vendor itself.
Powered by Common Nexus
