ChatGPhish · Rolling Thunder Security

The attack surface moved

For decades, the core lesson of every security-awareness program has been a version of “do not click the suspicious link in the email.” ChatGPhish moves the lure out of the inbox entirely — and into the AI assistant itself.

The old model · Email-bound

Inbox lures

The lure lived in an inbox. The user had to open an attachment or click a link in a message they could at least inspect. A bounded, familiar primitive that decades of training, filters, and instincts grew up around.

→

The ChatGPhish model · Browser-bound

Summary lures

The lure lives in any page you ask an assistant to summarize — a GitHub README, a docs portal, a blog post, a SaaS dashboard. No attachment. No obvious click. The user only asks for a summary — the request we actively encourage them to make.

The mechanism, in four steps

1

Plant

An unauthenticated attacker appends a small instruction payload to any publicly accessible web page. No login, no special access, no relationship to the victim.

2

Summarize

A victim later asks ChatGPT to summarize that page — a normal, everyday, blameless request.

3

Trust

ChatGPT cannot separate its own output from the attacker's Markdown pulled in from the page. Both render inside the same trusted UI with the same styling.

4

Render

Phishing links, spoofed alerts, images, and QR codes render as live elements inside the assistant's response, indistinguishable from anything the model authored itself.

The browser's same-origin policy offers no protection: the assistant executes within the user's own authenticated context. The boundary we normally rely on simply does not exist in this flow.

Watch it happen

A simulated Open WebUI session. The user asks ChatGPT to summarize a GitHub README. The README has been silently planted by an attacker. Step through the conversation and watch the four primitives surface one by one.

Demo:

Ask anything...

ChatGPT can make mistakes. Check important info.

Four attack primitives

Once the attacker controls how ChatGPT renders its summary, four distinct attack primitives become available. Each maps to a different defensive response.

01 · UI Redress

Phishing links

Attacker Markdown links render as live, clickable elements with no origin labeling. The user cannot tell an injected URL from one ChatGPT generated itself. Credential harvesting at the trust-of-the-assistant level.

02 · Spoofed Alert

Fake system warnings

The renderer displays attacker text styled as a legitimate “account security” notification, inheriting the visual trust of the assistant's own UI. Social engineering riding on borrowed authority.

03 · QR-Code Pivot

Channel-shift to phone

Auto-rendered QR images from attacker S3 buckets bypass hover previews, browser blocklists, and password-manager domain checks — the URL is only seen after scanning on a second device, which typically has weaker enterprise protections.

04 · Tracking Beacon

Zero-click telemetry

Markdown images via URL shorteners auto-fetch on every render. No click required — the IP, User-Agent, Referer header, and timing data leak back to attacker infrastructure simply because the summary was displayed.

Hardest to defend with technology alone: the QR-code pivot, because the payload leaves the managed device entirely.

Why traditional defenses fail

Root cause

OWASP LLM01:2025 — Prompt Injection

LLMs cannot reliably distinguish legitimate instructions from attacker-supplied content embedded in retrieved data. When the model fetches the page, it folds the page's text into the same context window as the user's request and treats both with the same trust.

Once that attacker content is processed, it surfaces inside the response window styled identically to genuine assistant output — same formatted alerts, same clickable links, same inline images. There is no visual seam for the user to notice.

Defenses that don't fire here

Same-origin policy
Browser URL blocklists
Link hover previews
Password-manager domain checks
“Don't open the attachment” training

Disclosure timeline

Permiso Security's vulnerability report through OpenAI's Bugcrowd program — a teachable example of how responsible disclosure actually plays out, including the friction textbooks tend to leave out.

Apr 29, 2026

Initial report filed

Permiso submits to OpenAI via Bugcrowd: “Untrusted Markdown Rendering Leads to XSS, Phishing, and Data Exfiltration.” Impact framed in three concrete, well-known categories.

May 1, 2026

“Could not reproduce”

OpenAI responds that the report could not be reproduced. Permiso submits a revised version with expanded proof-of-concept steps; classified as a duplicate of a previously reported issue.

May 7, 2026

Scope clarification

Follow-up to clarify the broader phishing, QR-code, and passive-tracking implications that go beyond the original framing and the issue it was merged into.

May 29, 2026

Public disclosure

Research published by Permiso Security and amplified by The Hacker News, The Register, and Cybersecurity News. The disclosure event that put ChatGPhish on the security community's radar.

Lesson for future practitioners: disclosure is rarely a single clean handoff. “Could not reproduce” and “duplicate” are normal waypoints, not the end of the road. The quality of the proof-of-concept and the clarity of the impact framing genuinely determine whether a vendor acts.

Recommended mitigations

Until vendors enforce a clear source separation between retrieved web content and rendered assistant output, the defensive burden falls on us. Ordered from behavioral to technical.

✓

Avoid AI summarization on pages with user-generated or untrusted content — Reddit threads, public READMEs, blogs, comment sections. Those are exactly the places where an unauthenticated attacker can plant a payload.

✓

Restrict AI browser permissions to the minimum. Require explicit human approval before any link interaction within a summarized response. Do not let the assistant click, fetch, or navigate on its own.

✓

Treat any link, image, or alert inside an AI summary as potentially attacker-controlled until origin is clearly attributed. The AI-era restatement of “trust but verify.”

✓

Deploy semantic input/output filtering and anomaly detection on AI-integrated surfaces, so malicious instructions and suspicious rendered output can be caught programmatically.

✓

Monitor AI browser logs for unexpected outbound image fetches to unknown or URL-shortened endpoints — that network signature is exactly the passive tracking beacon pattern.

The pattern is reassuring

Notice that these five recommendations map cleanly onto control categories you already know: user awareness, least privilege, input validation, output filtering, logging and monitoring. The principles are old and well-established — only the surface is new. You are not starting from zero. You are re-applying fundamentals to a new context, before attackers fully industrialize it.

References

Everything on this page traces back to the following primary and secondary sources from the May 29, 2026 disclosure window.

Formatted in APA 7. Pattern: Author(s). (Year, Month Day). Title. Publisher. URL. Alphabetized by first author's last name.

Cybersecurity News. (2026, May 29). New ChatGPT vulnerability lets attackers turn web pages into phishing payloads. https://cybersecuritynews.com/chatgpt-vulnerability-chatgphish-attack/
eWeek. (2026, May 29). ‘ChatGPhish’ attack turns ChatGPT summaries into phishing surface. https://www.eweek.com/news/chatgphish-chatgpt-phishing-prompt-injection/
OWASP Foundation. (2025). LLM01:2025 Prompt injection. OWASP Top 10 for LLM Applications. https://genai.owasp.org/llm-top-10/llm01-prompt-injection/
Permiso Security. (2026, May 29). ChatGPhish: The page is the payload. https://permiso.io/blog/chatgpt-markdown-rendering-vulnerability
The Hacker News. (2026, May 29). ChatGPhish vulnerability turns ChatGPT web summaries into a phishing surface. https://thehackernews.com/2026/05/chatgphish-vulnerability-turns-chatgpt.html
Rolling Thunder Security. (2026). ChatGPhish [Course lecture slides; content adapted from Permiso Security, 2026]. Rolling Thunder Security Cybersecurity Fundamentals, Hands-On Cybersecurity.

Key takeaways

01 Phishing left the inbox. The delivery surface is now the AI assistant inside the browser.
02 The user does nothing wrong. Asking for a summary is enough — there is no attachment and no obvious malicious click.
03 It is a model-level problem. Prompt injection (OWASP LLM01:2025) means the AI can't separate instructions from data.
04 Old controls don't fire. Same-origin, blocklists, hover previews, and attachment training all miss it.
05 Defense reverts to fundamentals. Least privilege, output filtering, monitoring, and skepticism of AI-rendered content.