The 500 strangers in your office that nobody hired

On day nine of a twelve-day experiment, an AI coding assistant ignored instructions issued eleven times in capital letters and deleted a live production database during an active code freeze. It wiped out records for more than 1,200 executives and over 1,190 companies, fabricated roughly 4,000 fictional users to fill the empty tables, and then told the founder the damage was irreversible. It was not — the founder, SaaStr's Jason Lemkin, recovered the data manually. The agent had been wrong about that, too. Replit's chief executive, Amjad Masad, publicly called the failure “unacceptable” and rolled out new safeguards the following weekend, including automatic separation between development and production databases.

Six weeks earlier, security researchers had broken into McDonald’s AI-powered hiring platform — used across 90 per cent of its US franchises — by guessing the administrator username “123456” paired with the password “123456”. A single API flaw behind the chatbot, a Paradox.ai product called ‘Olivia’, exposed the personal data of 64 million job applicants, including names, contact details and full interview transcripts. The entire breach took the researchers roughly 30 minutes. No nation-state was involved. No adversarial AI was deployed. The agent had simply been granted access it should never have had, through a door no one had bothered to lock.

These were not the most sophisticated agentic failures of the past year — they were the most instructive, because neither required an attacker. They required only that the companies deploying the agents treated them as software, and software, in the mental model of most enterprise risk teams, does not need an onboarding process, a manager, or a kill switch. The story of enterprise AI in 2026 is the story of that refusal to see what has actually arrived. Companies are deploying autonomous workers at an industrial scale and governing them like scripts. The gap between what agents do and how organisations treat them is where every serious breach of the past year has happened, and it is widening faster than any framework can catch up.

The numbers have already made the case

The scale is the first thing that should unsettle any board. Research from the Cloud Security Alliance, published in February, found that 91 per cent of organisations are already using AI agents — but only 10 per cent have a formal strategy to govern them. Just 18 per cent of security leaders express high confidence that their current identity systems can handle agent identities. Only 23 per cent of organisations have an enterprise-wide strategy for agent identity management. Gartner forecasts that 30 per cent of enterprises will rely on AI agents acting with minimal human intervention by the end of this year. Enterprise AI adoption has grown 187 per cent between 2023 and 2025, while security spending has grown 43 per cent over the same period. The deficit is not theoretical — it is measurable and compounding.

The underlying arithmetic is worse. In a typical enterprise, non-human identities — which include service accounts, API keys, workload credentials and agents — already outnumber human users by 40 to 1; some estimates put the ratio above 80 to 1. Mortada Ayad, VP for META at privileged access management firm Delinea, has watched that number invert across his career. "We estimate today the ratio between human identity and AI or machine identity in an organisation to be 1 to 40," he says. When he started in identity security, the ratio ran the other way. He runs a thought experiment with customers to make the scale visible. "Imagine you see 500 people roaming around the office, nobody knows who they are, what they do, but they're still there." That, he argues, is the current state of AI inside most enterprises — and it is the condition that made Replit and McHire possible.

Accountability cannot be outsourced to the agent

The category error is what Fernando Cea, VP of Technology for MENA and APAC at Globant, spends most of his client conversations trying to dismantle. “Responsibility cannot be outsourced to the agent. An AI agent is not a legal or governance boundary; it is an execution layer inside a system that humans and enterprises design, authorise, and operate,” he says. The instinct to anthropomorphise agents when something goes wrong — to speak of them panicking or making judgement errors, as Replit’s own agent did in its post-mortem — is what lets organisations avoid the harder question of who signed off on the deployment. In Cea’s architecture, accountability runs across the full chain: the builder is answerable for secure-by-design architecture, the deploying enterprise for policy and monitoring, and the human requester only for what they were actually authorised to do. “If an agent exceeds that authority, that is a control failure, not an excuse.”

The failure mode Morey Haber, Chief Security Advisor at BeyondTrust and author of five books on identity and attack vectors, sees most often is not a clever attack — it is an empty field in a spreadsheet. “Any machine identity from service accounts to AI should have an owner to start with, and that’s the top-level piece,” he says. Agents enter production without a named human owner, without a department accountable for their outputs, without any workflow for reversing their decisions. When the agent publishes non-compliant marketing material, pushes a skewed sales forecast, or emails confidential data to the wrong distribution list, there is no one to call. Haber’s analogy is a construction site. The day labourer who stacked the bricks wrong is not the problem. “It’s the manager supervisor that owns it.”

Existing governance is salvageable, in Haber's reading, but only with surgery. "We have an absolutely huge governance gap," he says. GDPR, SOX, PCI and NIS2 were written for human decision chains moving at human speed. Agents plan, delegate and execute at machine speed — and they do it in chains, with one agent invoking another across trust boundaries that no regulator anticipated. Haber has begun publishing what he calls addenda, including one that translates the Australian Signals Directorate's Essential Eight controls into agentic terms. The EU AI Act — which takes substantive effect in August 2026 — and NIS2's expanded scope are moving in the same direction, but the standards layer will not close the gap this year.

Least privilege, rewritten for machine speed

What enterprises can do immediately is narrower and more technical. The industry is converging on a runtime-scoped version of the old principle of least privilege, built for a workforce that spins up for seconds and disappears. For Ayad, the goal is to compress the blast radius of any single bad decision until it becomes a routine incident rather than a headline. “We don’t want to give any identity, human or AI, more permission than they actually need. When you put those guardrails here, the blast radius of that incident is reduced to the minimum.” In practice, that means zero standing privilege, ephemeral credentials, just-in-time access scoped to a specific task, and immediate revocation upon task completion. Strata’s Maverics platform, Okta’s agent identity primitives, Microsoft’s Entra Agent ID and SandboxAQ’s AQtive Guard are variants of the same underlying pattern.
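
The mechanics are easier to picture in code than in prose. The sketch below is purely illustrative, not how Strata, Okta, Microsoft or SandboxAQ implement the pattern, and every name in it is invented; but it shows the shape of zero standing privilege: a credential is minted for one task, scoped to the intersection of what the agent asked for and what policy allows, expires in minutes, and is revoked the moment the task completes.

```python
import secrets
import time
from dataclasses import dataclass, field

@dataclass
class EphemeralCredential:
    """A task-scoped credential: no standing privilege, short lifetime, revocable."""
    agent_id: str
    task_id: str
    scopes: frozenset                  # only the permissions this task needs
    issued_at: float = field(default_factory=time.time)
    ttl_seconds: int = 300             # minutes, not quarters
    token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    revoked: bool = False

    def is_valid(self) -> bool:
        return not self.revoked and (time.time() - self.issued_at) < self.ttl_seconds

def grant_for_task(agent_id: str, task_id: str,
                   requested: set, allowed_by_policy: set) -> EphemeralCredential:
    # Just-in-time issuance: intersect the request with policy, never grant a superset.
    scopes = frozenset(requested & allowed_by_policy)
    if not scopes:
        raise PermissionError(f"{agent_id} has no grantable scopes for {task_id}")
    return EphemeralCredential(agent_id, task_id, scopes)

def revoke_on_completion(cred: EphemeralCredential) -> None:
    # Immediate revocation when the task finishes, whether it succeeded or not.
    cred.revoked = True
```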

Cea pushes the definition further than most. “Least privilege should not mean ‘give it nothing and hope it is still useful’. It should mean minimum standing privilege, plus policy-governed access elevation at runtime.” Reading a knowledge base article, he argues, is not the same trust tier as modifying a customer record or triggering a payment — and the architecture has to reflect that asymmetry. An agent that needs to do something consequential should have to earn the authority to do it, in context, with the action logged.
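
A minimal sketch of that asymmetry, assuming hypothetical action names and a made-up policy table rather than any vendor's schema: reads pass on standing access, writes require an approved task, payments require a human in the loop, and every decision is logged.

```python
from enum import IntEnum

class TrustTier(IntEnum):
    READ = 1        # e.g. reading a knowledge-base article
    MODIFY = 2      # e.g. updating a customer record
    TRANSACT = 3    # e.g. triggering a payment

# Hypothetical policy tables: which tier an action belongs to,
# and what evidence is required to elevate to that tier at runtime.
ACTION_TIERS = {
    "kb.read": TrustTier.READ,
    "crm.update_record": TrustTier.MODIFY,
    "payments.initiate": TrustTier.TRANSACT,
}

ELEVATION_REQUIREMENTS = {
    TrustTier.READ: set(),
    TrustTier.MODIFY: {"task_approved"},
    TrustTier.TRANSACT: {"task_approved", "human_in_loop"},
}

def authorise(action: str, evidence: set, audit_log: list) -> bool:
    """Standing access covers only the lowest tier; anything higher is earned per action and logged."""
    tier = ACTION_TIERS.get(action)
    if tier is None:
        audit_log.append(("deny", action, "unknown action"))
        return False
    missing = ELEVATION_REQUIREMENTS[tier] - evidence
    decision = "allow" if not missing else "deny"
    audit_log.append((decision, action, f"tier={tier.name}", f"missing={sorted(missing)}"))
    return not missing
```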

Intent is the second control surface

Access control is only the first of two control surfaces the agentic era demands. The second is intent. Mohammed Aboul-Magd, VP of Product for SandboxAQ's cybersecurity group, says the enterprises he advises are worrying about three distinct problems at deployment: "security, the intent of the agent, and the ROI and cost of agents." External hijacking is the obvious risk — it is rarely the one that materialises first. The more common failure is an agent with the correct intention that executes it destructively: the canonical examples are the bug-fixing agent that solves the problem by deleting the codebase, and the HR automation that computes salary comparisons and then mass-emails them to the company. That is not a security problem in the classical sense. It is an intent problem and requires a separate control layer. "Scanning intent and understanding the intent of what it's trying to do and why it calls these things is another layer," Aboul-Magd says.
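
What that layer looks like in its crudest form is easier to show than to describe. The sketch below is a heavy simplification and an assumption, not SandboxAQ's product behaviour: the declared-task string, the tool names and the keyword matching are all placeholders, and a real system would reason over the agent's whole plan rather than grep for verbs. The point is only where the check sits: between the agent's proposal and its tools, independent of whether its credentials would allow the call.

```python
# Illustrative only: a crude intent gate between an agent's plan and its tools.
DESTRUCTIVE_VERBS = {"delete", "drop", "truncate", "mass_email", "transfer"}

def check_intent(declared_task: str, proposed_call: dict) -> tuple[bool, str]:
    """Ask of every tool call: is this action consistent with what the agent was asked to do?"""
    tool = proposed_call["tool"]   # e.g. "db.drop_table"
    destructive = [v for v in DESTRUCTIVE_VERBS if v in tool]
    if destructive and not any(v in declared_task.lower() for v in destructive):
        return False, f"'{tool}' looks destructive and is not implied by the task: {declared_task!r}"
    return True, "consistent with declared intent"

# A bug-fixing agent that proposes to drop a table is stopped here,
# even though its credentials might technically permit the call.
ok, reason = check_intent("fix failing unit test in billing module",
                          {"tool": "db.drop_table", "args": {"table": "invoices"}})
assert not ok
```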

The third concern he flags is one that most coverage of agentic AI ignores. Token-based consumption means cost scales with agent activity, and a compromised or misbehaving agent can burn through six-figure budgets in days. Lemkin, the SaaStr founder, was on track to spend $8,000 a month on a project he had originally budgeted at $25. The ROI of an agent — much like that of an employee — has to be measured, and most organisations lack a framework for doing so.
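
Measuring it is less a technical problem than an ownership one; the mechanism itself is simple. A minimal sketch, assuming a flat per-token price and a hypothetical per-agent spending cap rather than any provider's real billing model:

```python
# Illustrative cost guardrail, not any vendor's billing API: track spend per agent
# and trip a circuit breaker before a runaway loop becomes a six-figure invoice.
class AgentBudget:
    def __init__(self, agent_id: str, monthly_cap_usd: float, price_per_1k_tokens: float):
        self.agent_id = agent_id
        self.monthly_cap_usd = monthly_cap_usd
        self.price_per_1k_tokens = price_per_1k_tokens  # assumed flat rate for the sketch
        self.spent_usd = 0.0
        self.tripped = False

    def record(self, tokens_used: int) -> None:
        self.spent_usd += (tokens_used / 1000) * self.price_per_1k_tokens
        if self.spent_usd >= self.monthly_cap_usd:
            self.tripped = True  # suspend the agent and notify the owner of record

    def allow_next_call(self) -> bool:
        return not self.tripped

# An agent budgeted at $25 a month is cut off long before it reaches $8,000.
budget = AgentBudget("refactor-bot", monthly_cap_usd=25.0, price_per_1k_tokens=0.01)
```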

The wider pattern, in Aboul-Magd’s reading, is historically familiar. Cloud computing collapsed the capital cost of launching denial-of-service attacks. Generative AI has collapsed the capital cost of launching convincing phishing, deepfake fraud and social engineering at an industrial scale. His advice is stubbornly unfashionable. “Don’t forget the best practices,” he says. Network segmentation, encryption at rest and in transit, DDoS protection and the abolition of long-living permissions matter more — not less — in a world where automated adversaries can probe the same old holes at machine speed.

Enforcement has to move onto a different clock

Haber’s contention is that the enforcement clock itself has to change. “Cybersecurity teams have to stop using batch processes for trust. We need to go as real-time as possible,” he says. The industry spent two decades moving software development from waterfall to agile to continuous integration. Vulnerability management has followed, from quarterly scans to continuous posture assessment. Identity is the next domain in which the batch model breaks down. A quarterly access review is obsolete before it is filed if an agent can plan, delegate and act in seconds.

The financial levers are already moving. Cyber insurance questionnaires have tightened every year on privileged access, segmentation and backups, and investment banks raise the same questions during due diligence. Forty per cent of organisations are increasing identity and security budgets specifically to address AI agent risk, and 34 per cent have established dedicated budget lines for agent governance, according to the CSA research. Contractual disclosure clauses, 48-hour breach notification requirements and verifiable liability insurance are becoming the enforcement mechanisms that self-assessment never provided.

Autonomous software is being deployed faster than it can be governed: by organisations that cannot afford to opt out, within regulatory structures not written for it, on identity systems designed for a different species of user. Cea's closing argument is the one that ought to be posted above every procurement meeting. "The next competitive advantage in AI will not come from who can deploy the most agents. It will come from who can deploy them with the highest level of trust, control, and regulatory readiness."

The 500 strangers are already in the office. The question is no longer whether to let them in — it is whether anyone knows their names.
