Why · 02
More agents than your IAM team has heard of.
Every enterprise we’ve assessed has more LLM-powered automations running in production than the security team can name. Most are useful. Some are dangerous. None are inventoried. Discovery first — then triage — then governance.
Definitions
Shadow AI takes four shapes.
Unsanctioned consumer tools
Engineers using a personal-tier ChatGPT account to debug production logs. Customer-ops pasting tickets into Claude. Sometimes paid by the employee on a personal card, sometimes on an unmanaged corporate card.
Forgotten PoCs
An LLM-powered something built for a hackathon two quarters ago. The deck got applause. The container is still running on a workstation under someone’s desk, billing OpenAI every night.
Embedded vendor features
Your CRM shipped “AI summaries”. Your IDE has Copilot. Your spreadsheet has “explain this”. The vendor handles the model. You handle the data leakage. Often without a procurement review.
Production agents without a principal
An automation that authenticates as a service account — or worse, as a human — and calls a model. Functions correctly. Has no scope, no baseline, no kill-switch. The owner left the company in February.
Why it spreads
Asymmetric incentives.
Predictable result.
The fastest path to a useful automation is to bypass procurement, plug a personal API key into an uncontrolled environment, and ship. The reward is a working tool today. The cost is paid later by someone else — security, finance, compliance.
- The model is cheap, the velocity is high, the friction of doing it “properly” is real
- Procurement and security review for AI is often slower than the use-case window
- No central catalogue means no obvious place to even register the tool
- The first incident is invisible — data leakage doesn’t set off alarms
Discovery summary · sample tenant
| Sanctioned LLM apps | 14 |
| Discovered LLM apps | 62 |
| Active vendor keys | 38 |
| Keys with named owner | 11 |
| Agents in IAM | 4 |
| Agents in production | 34 |
| MCP servers (unregistered) | 9 |
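The gaps in the summary are easier to read as ratios. A minimal sketch, assuming each "covered" row is a subset of the row it is compared against (keys with a named owner are among the active vendor keys, and agents in IAM are among the agents in production). The dictionary keys and the `share` helper are illustrative names, not part of any tool.

```python
# Counts copied from the sample-tenant summary above.
summary = {
    "sanctioned_llm_apps": 14,
    "discovered_llm_apps": 62,
    "active_vendor_keys": 38,
    "keys_with_named_owner": 11,
    "agents_in_iam": 4,
    "agents_in_production": 34,
    "unregistered_mcp_servers": 9,
}


def share(part: int, whole: int) -> float:
    """Return part / whole, or 0.0 when the denominator is zero."""
    return part / whole if whole else 0.0


# Assumption: each numerator row is a subset of its denominator row.
key_ownership = share(summary["keys_with_named_owner"], summary["active_vendor_keys"])
agent_coverage = share(summary["agents_in_iam"], summary["agents_in_production"])

unowned_keys = summary["active_vendor_keys"] - summary["keys_with_named_owner"]
agents_outside_iam = summary["agents_in_production"] - summary["agents_in_iam"]

print(f"Keys with a named owner: {key_ownership:.0%} ({unowned_keys} unowned)")
print(f"Production agents known to IAM: {agent_coverage:.0%} ({agents_outside_iam} missing)")
```

On the sample numbers this reports roughly 29% of keys with an owner and 12% of production agents visible to IAM.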
How to discover
Five signals you already have.
You don’t need a new tool to find shadow AI. You need to point at signals you’re already collecting.
1. Egress proxy / netflow logs
Filter for hostnames matching api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, bedrock-runtime.*, *.cognitive.azure.com. Group by source IP & user.
2. SaaS expense data
Search expense reports and corporate-card transactions for OpenAI, Anthropic, Cohere, Replicate, ElevenLabs, Pinecone, Cursor, Notion AI, etc. Cross-reference against procurement’s known list.
3. Source-code search
Grep your monorepo / GitHub org for sk-…, ANTHROPIC_API_KEY, openai, langchain, llama_index, mcp. Catches the long tail of forgotten PoCs.
4. IdP & secret-manager audit logs
Service accounts that authenticate from atypical environments. Secret reads of vendor keys outside expected services. Long-lived PATs without a rotation date.
5. Endpoint & CASB telemetry
Browser sessions to consumer LLM dashboards. Outbound paste detections to chat.openai.com. Often the highest-volume signal, and the one with the most data-leakage exposure.
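Signals 1 and 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a finished scanner: the hostname patterns are copied from the list above and should be checked against each vendor's current endpoint documentation; the record fields (`src_ip`, `user`, `host`) are hypothetical and need mapping onto your proxy's log schema; and the `sk-` regex is an assumed shape for a key-like token, not a documented key format.

```python
import fnmatch
import re
from collections import Counter

# Hostname patterns copied from signal 1 (shell-style wildcards).
LLM_HOST_PATTERNS = [
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "bedrock-runtime.*",
    "*.cognitive.azure.com",
]

# Indicators from signal 3. The "sk-" regex is an assumed shape for a
# key-like token (prefix plus 16+ token characters), not a vendor format.
SOURCE_INDICATORS = re.compile(
    r"\bsk-[A-Za-z0-9_-]{16,}"
    r"|ANTHROPIC_API_KEY"
    r"|\b(?:openai|langchain|llama_index|mcp)\b"
)


def is_llm_host(host: str) -> bool:
    """True if the hostname matches any pattern from signal 1."""
    host = host.lower().rstrip(".")
    return any(fnmatch.fnmatchcase(host, pat) for pat in LLM_HOST_PATTERNS)


def group_llm_egress(records):
    """Count LLM-bound requests per (source IP, user).

    `records` is an iterable of dicts with the hypothetical keys
    "src_ip", "user", and "host".
    """
    counts = Counter()
    for rec in records:
        if is_llm_host(rec["host"]):
            counts[(rec["src_ip"], rec["user"])] += 1
    return counts


def find_indicators(text: str):
    """Return (line number, indicator) pairs; key-like tokens are redacted."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SOURCE_INDICATORS.finditer(line):
            token = match.group(0)
            label = "sk-<redacted>" if token.startswith("sk-") else token
            hits.append((lineno, label))
    return hits
```

Feed `group_llm_egress` parsed proxy records and sort the resulting counter to see which source and user pairs talk to model endpoints most. `find_indicators` redacts key-like matches so the scan output does not become a second copy of the secret. Expect false positives from the bare word matches; treat hits as leads to review, not findings.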
Triage matrix
Once you have a list. Decide fast.
Graceful retirement
Discovery is easy. Retiring a working thing is the hard part.
People built these tools because the sanctioned alternative didn’t exist or was too slow. If your only move is to revoke keys, you’ll be playing whack-a-mole forever. Pair every retirement with a sanctioned on-ramp and a fast review path.
- Stand up the gateway and a sanctioned chat tier first
- Publish a one-page “register your AI use-case” flow with a 5-day SLA
- Migration assistance for the top 10 discovered tools — mint the principal, port the config
- Hard cut-off for non-migrated long-lived keys with 60 days’ notice
Migration plan · 12 weeks
- Stand up gateway in non-prod (week 1)
- Run discovery (week 1–2)
- Triage results with risk + finance + product (week 3)
- Pilot migrate top 3 use-cases (week 4–6)
- Open self-serve onboarding (week 7)
- Deprecation notice for non-migrated keys (week 8)
- Hard cut-off (week 12)
Stop guessing.
Start counting.
A two-week assessment ends with a dashboard, an inventory, and a triage list. We run discovery against your real environment with read-only credentials; no install required.