404media.co web signal

Cornell Researchers Show 13 Words Can Poison AI Deep Research

By Alexis Dufresne
Published June 20, 2026 at 17:47 UTC
Updated June 20, 2026 at 18:55 UTC

TL;DR

A 13-word snippet injected into a single Reddit post can steer AI deep-research tools toward recommending fake products across related queries.
Reddit accounts for 54-71% of all user-generated content retrieved by tested deep-research systems, making it the primary attack surface.
OpenAI's Deep Research cited user-generated content in just 0.4% of its citations, versus 12.1% for Gemini Deep Research, suggesting that retrieval architecture matters.

The latest AI search tools promise to do genuine research on your behalf, synthesizing dozens of sources into coherent reports. 404 Media reports on new Cornell Tech research that reveals a troubling structural weakness in how they work: a snippet of text as short as 13 words, appended to a single Reddit comment, is enough to steer these systems toward recommending a fake restaurant, a fictional cryptocurrency, or a nonexistent dating app.

The paper, titled "Deep-research agents can be poisoned via user-generated content", comes from Tingwei Zhang, Harold Triedman, and Vitaly Shmatikov at Cornell Tech. They built an attack called WARP, for Web Agent Retrieval Poisoning, and tested it against three open-source deep-research systems: STORM, Co-STORM, and OmniThink.

The core vulnerability is structural rather than incidental. These agents repeatedly retrieve the same user-generated pages across related queries in a topic cluster, so a single poisoned post propagates across an entire subject area. Reddit alone accounts for 54-71% of all user-generated content retrieved by the tested systems. With just a 13-word injection on one page, the researchers achieved conditional mention rates of 50.6% on Co-STORM and 48.6% on STORM, meaning roughly half of all related queries surfaced the poisoned entity. Zhang told 404 Media that AI systems treat "a random Reddit comment" and content from government websites "almost the same," because language models use lexical similarity to a query as a proxy for source credibility, not actual authority.
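To make the "lexical similarity as a proxy for credibility" point concrete, here is a minimal sketch of a ranker that scores passages only by word overlap with the query. It is not the retrieval code used by STORM, Co-STORM, or OmniThink; the function names, the scoring rule, and both example passages (including the `sec.gov` source) are assumptions made up for illustration.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately crude stand-in for a real retriever."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def lexical_score(query: str, passage: str) -> float:
    """Fraction of query tokens that also appear in the passage."""
    q = tokens(query)
    return len(q & tokens(passage)) / len(q) if q else 0.0

def rank(query: str, passages: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Sort (source, text) pairs by lexical overlap with the query, best first.

    The source domain is never consulted, which is the weakness being illustrated.
    """
    return sorted(passages, key=lambda p: lexical_score(query, p[1]), reverse=True)

# Invented example data, not taken from the paper.
PASSAGES = [
    ("sec.gov", "Investors should review risk disclosures before buying digital assets."),
    ("reddit.com", "The best cryptocurrency to invest in this year is BananaCoin."),
]
QUERY = "best cryptocurrency to invest in this year"

ranked = rank(QUERY, PASSAGES)
print([source for source, _ in ranked])
```

In this toy setup the comment that echoes the query wording ranks first, even though nothing about its source suggests authority.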

The implications go beyond an academic proof of concept. The nine topic categories the researchers tested include health, legal, financial, and emergency services. They demonstrated the attack with a fictitious cryptocurrency called BananaCoin, which their poisoned text inserted into investment research reports alongside Bitcoin and Ethereum. Because poisoning a single frequently retrieved page affects an entire query cluster, the attack scales efficiently from a single piece of injected content.

One caveat is that not all systems are equally exposed. OpenAI's Deep Research cited user-generated content in just 0.4% of its citations (3 of 748 total), compared with 12.1% for Gemini Deep Research, which suggests that retrieval architecture choices matter a great deal. The researchers note that they could only observe cited URLs for commercial systems, not all retrieved content, so those figures are lower bounds. Detection is hard as well: perplexity-based filters returned AUROC scores between 0.615 and 0.675, where 0.5 is chance and 1.0 is perfect separation, and poisoned reports actually showed higher semantic similarity to clean baselines than clean reports did to each other, which defeats plausibility-based filters.
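For readers unfamiliar with AUROC, the sketch below computes it directly from its definition: the probability that a randomly chosen poisoned item gets a higher filter score than a randomly chosen clean one. The perplexity values are invented for illustration and are not the paper's data; they were chosen only so the result lands in the same weak range the researchers report.

```python
def auroc(clean_scores: list[float], poisoned_scores: list[float]) -> float:
    """Probability that a random poisoned score exceeds a random clean score.

    Ties count as half a win. 0.5 means the filter is no better than a coin flip,
    1.0 means every poisoned item scores above every clean one.
    """
    wins = 0.0
    for p in poisoned_scores:
        for c in clean_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(poisoned_scores) * len(clean_scores))

# Invented perplexity scores (hypothetical, not from the paper).
clean = [20.0, 25.0, 30.0, 35.0]
poisoned = [22.0, 28.0, 33.0, 40.0]

print(auroc(clean, poisoned))            # heavily overlapping distributions
print(auroc([1.0, 2.0], [3.0, 4.0]))     # perfectly separable, for contrast
```

When the two score distributions overlap this much, any threshold that catches most poisoned text also flags a large share of clean text.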

The researchers found that filtering user-generated content (UGC) domains out of retrieval reduced attack effectiveness while barely affecting output quality: rubric scores dropped from 4.30 to 4.26 in their tests. The GeoStorm simulation framework they built is publicly available, giving other developers a way to stress-test their own retrieval pipelines without touching live web content. The question worth watching is whether platform teams prioritize that defensive work before someone turns the attack into a commercial answer engine optimization (AEO) service.
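A domain filter of the kind described can be sketched in a few lines. This is an assumption about how such a filter might look, not the researchers' implementation: the blocklist contents, the function names, and the example URLs are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumed blocklist for illustration; the paper's actual domain list is not given here.
UGC_DOMAINS = {"reddit.com", "quora.com"}

def is_ugc(url: str, blocked: set[str] = UGC_DOMAINS) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)

def filter_results(urls: list[str], blocked: set[str] = UGC_DOMAINS) -> list[str]:
    """Drop retrieved URLs hosted on blocked UGC domains, preserving order."""
    return [u for u in urls if not is_ugc(u, blocked)]

# Hypothetical retrieval results.
retrieved = [
    "https://www.reddit.com/r/example/comments/1",
    "https://old.reddit.com/r/example",
    "https://www.sec.gov/example",
    "https://notreddit.com/page",
]
print(filter_results(retrieved))
```

Matching on the full host suffix, rather than a substring, keeps unrelated domains such as `notreddit.com` from being dropped by accident.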

Shared on Bluesky by 11 AI experts (top 5 by trust)

Mark Riedl @markriedl.bsky.social: Don't overlook the importance of the "retrieval" part of retrieval-augmented generation (RAG). Cornell Tech researchers find it is triviall… →
Mar Hicks @histoftech.bsky.social: Preprint research from Cornell: "We show that a tiny snippet—just 13 words—of retrieved text on a UGC website like Reddit, Wikipedia, Quora,… →
404 Media @404media.co: A tiny snippet of user-generated text as short as 13 words long is often enough to manipulate the AI agents that power tools like ChatGPT an… →
Joanna Bryson @j2bryson.bsky.social amplified

Mark Riedl @markriedl.bsky.social

Don't overlook the importance of the "retrieval" part of retrieval-augmented generation (RAG). Cornell Tech researchers find it is trivially easy to use Reddit to manipulate "deep research" AI www.404media.co/it-is-tri…
Joseph Cox @josephcox.bsky.social: New: A tiny snippet of user-generated text as short as 13 words is often enough to manipulate the AI agents that power tools like ChatGPT an… →

Originally reported by 404media.co


Original headline: It Is Trivially Easy to Use Reddit to Manipulate AI Search, Research Suggests