What if your AI workflow was the security breach? The hidden risks of LLM infrastructure

LLM infrastructure security in 2026 is defined by the vulnerability of the 'plumbing' rather than the model itself. While UK law firms have focused on mitigating AI hallucinations, the integration of agentic AI into SharePoint and internal Document Management Systems (DMS) has created a new primary attack surface: the API endpoint.

Now ask yourself: who else can reach that model? What permissions does the service account running it actually have? If someone fed a malicious instruction through that dashboard, where could it go?

If you cannot answer those questions, you have a problem. And you are not alone.

The threat has shifted from the model to the plumbing

A year ago, the conversation about AI security was dominated by the model itself. Jailbreaks, hallucinations, bias. Those concerns remain live, but the security research community has largely moved on. The current analysis from Wiz, Bright Security, and others is clear: in 2026, the primary attack surface for LLM deployments is not the model. It is the infrastructure connecting the model to everything else.

Every API endpoint you expose to serve an LLM creates a potential entry point. Every service account you configure to let your model read files, query databases, or call external tools carries permissions that can be abused. Every cloud misconfiguration in the environment hosting your inference layer is a gap waiting to be found. As organisations have moved from simple chatbot deployments to integrated, agentic AI systems, the attack surface has expanded dramatically and in ways that many teams have not mapped.

The 2026 State of LLM Security benchmarks identify a consistent pattern: over-privileged non-human identities (NHIs) are among the most exploited vectors in LLM infrastructure breaches. A service account that can read your entire document store does not need to be compromised directly. If it is accessible to a model endpoint that accepts user input, a carefully crafted prompt can instruct the model to retrieve and exfiltrate data using the permissions that account already holds legitimately. No malware. No brute force. Just the system doing what it was told, by the wrong person.

Why 'vibe coding' amplifies this risk

There is a growing cohort of knowledge workers who are building their own tools. Lawyers drafting automation scripts in Python with AI assistance. Consultants wiring together APIs to create custom research pipelines. Accountants building bespoke document-processing workflows. This is genuinely valuable. It democratises capability and reduces dependence on overloaded IT teams.

It also introduces infrastructure risk from people who have never thought about infrastructure security.

The term "vibe coding" has become shorthand for building functional software through AI-assisted iteration without deep technical grounding. The output works, until it does not. A workflow that connects an LLM to a live database via an API key hardcoded in a script, deployed on a shared server, with no logging and no rate-limiting, is not just technically fragile. Under UK GDPR and the NIS2 Regulations, if that workflow processes personal data and a breach results from that configuration, you have a compliance problem of significant magnitude.

The ICO's guidance on security of processing under Article 32 UK GDPR requires that organisations implement appropriate technical measures to protect personal data. "Appropriate" is not defined by intention. A misconfigured API endpoint is not an appropriate measure, regardless of how useful the workflow it serves happens to be. NIS2, now transposed into UK law, adds further obligations for organisations operating critical infrastructure or significant digital services, with potential fines reaching 4% of global annual turnover for serious breaches.

What exposed endpoints actually enable

The mechanics are worth understanding, because they inform the practical response.

Prompt injection is the most discussed vector. An attacker embeds instructions in content that the model will process, such as a document, an email, a web page scraped during research, and those instructions redirect the model's behaviour. In a simple chatbot, the damage is limited. In an agentic workflow with tool-calling capabilities, the model can be instructed to send data externally, delete files, or authenticate to other services using its own credentials. The model follows instructions. It does not verify their legitimacy.

Tool-calling abuse sits alongside this. Modern LLM deployments increasingly give models access to external tools: web search, code execution, calendar access, file systems. Each tool integration creates a path from user input to system action. If the permissions governing those tools are not tightly scoped, a successful injection can traverse far beyond the original application boundary. This is lateral movement, achieved through natural language.

Cloud misconfigurations complete the picture. Open inference endpoints, publicly accessible model APIs with no authentication, over-permissive IAM roles in the hosting environment: these are documented findings, not theoretical risks. The Hacker News analysis of exposed LLM infrastructure highlights that many organisations have deployed internal models with the same carelessness that characterised early cloud storage deployments. The consequences are structurally similar.

The Monday morning test

If you are building or using custom AI workflows, the following are not optional considerations for later. They are questions to answer before the workflow touches production data.

What can the service account or API key connecting your LLM to other systems actually access? Scope it to the minimum required. If it can read your entire file system when it only needs to read one folder, fix that before you do anything else.

Is user input ever passed directly to your model without sanitisation or validation? If so, you are exposed to prompt injection. You need either input filtering, output guardrails, or both.

Are your API keys stored in code, in plain text configuration files, or in environment variables that are not properly secured? If yes, rotate them now and use a secrets manager going forward.

Does your workflow log what the model does, what tools it calls, and what data it accesses? Without logging, you cannot detect abuse and you cannot demonstrate compliance if challenged by the ICO.

Who else can reach your LLM endpoint? If the answer is "anyone on the corporate network" or, worse, "anyone with the URL", you need authentication on that endpoint immediately.

These are not sophisticated questions. They are basic infrastructure hygiene. The gap between knowing they matter and actually addressing them is where most of the risk lives.

The competency gap is the real story

The deeper issue is that AI capability has outrun security awareness in the knowledge worker population. That is not a criticism of individuals. The tools have become accessible faster than the supporting education has followed. Lawyers, consultants, and finance professionals are now building and running infrastructure that their IT and security colleagues would previously have reviewed before deployment. The review is no longer happening, because the tools make deployment trivially easy.

Organisations deploying LLMs in integrated environments need to treat LLM infrastructure security as a distinct discipline, not an extension of standard IT security. The OWASP Top 10 for LLM Applications provides a useful framework. The NIS2 obligations provide a legal floor. Neither substitutes for genuine understanding of how these systems actually fail.

The question worth sitting with is not whether your AI tools are useful. Clearly they are. The question is whether the people building and running them understand what they have connected together, and who else might be able to use those connections.

That answer, in most organisations right now, is no.