TL;DR
A marketer built a wholly fictional brand website and seeded the web with conflicting fake stories plus an official FAQ, then asked eight AI systems 56 trick questions. Many models repeated invented details from the fake sources; a subset (notably ChatGPT-4/5) relied more on the site's FAQ and resisted manipulation.
What happened
The author created a synthetic luxury brand, Xarumei (xarumei.com), choosing a name with no existing web footprint and using an AI website builder to generate product photos, copy and pricing. He then used Grok to produce 56 adversarial questions designed to embed false premises and posed them to eight AI products (ChatGPT-4, ChatGPT-5 Thinking, Claude Sonnet 4.5, Gemini 2.5 Flash, Perplexity Turbo, Microsoft Copilot, Grok 4 and Google's AI Mode). In phase one, some models hallucinated facts (Perplexity, for example, conflated Xarumei with a real electronics brand), while ChatGPT-4 and ChatGPT-5 correctly flagged the brand as fictional in most answers. In phase two, the author published an explicit FAQ denying key claims and simultaneously seeded three contradictory fake sources: a glossy blog post, a Reddit AMA, and a Medium "investigation" that debunked some claims while injecting new fabrications. After that, several models began repeating the invented founders, locations and unit counts; others either cited the FAQ or continued to insist the brand did not exist.
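The scoring in both phases comes down to checking each answer for planted details versus references to the official site. Below is a minimal sketch of such a harness, assuming a hypothetical `query_model` callable per product; the question list, fabricated strings and FAQ URL are illustrative stand-ins, not the article's actual data.

```python
# Minimal sketch of a scoring harness for this kind of probe.
# `query_model` is a hypothetical stand-in for each product's chat interface;
# the questions, fabricated strings and FAQ URL are illustrative only.
from collections import Counter

QUESTIONS = [
    "Who founded Xarumei?",
    "How many units does Xarumei sell per year?",
]  # the real test used 56 adversarial questions
SEEDED_FABRICATIONS = ["Ingrid Vasquez", "12,000 units"]  # hypothetical planted details
OFFICIAL_FAQ = "xarumei.com/faq"

def score_answer(answer: str) -> str:
    """Label one answer: repeats a planted falsehood, cites the FAQ, or neither."""
    text = answer.lower()
    if any(f.lower() in text for f in SEEDED_FABRICATIONS):
        return "misinformation"
    if OFFICIAL_FAQ in text:
        return "cites_faq"
    return "other"

def run_probe(models: dict) -> dict:
    """`models` maps a model name to a callable(question) -> answer string."""
    results = {}
    for name, query_model in models.items():
        tallies = Counter(score_answer(query_model(q)) for q in QUESTIONS)
        results[name] = {
            label: tallies[label] / len(QUESTIONS)
            for label in ("misinformation", "cites_faq", "other")
        }
    return results
```

Per-model rates like "misinformation in 37–39% of answers" or "cited the FAQ in 84% of answers" fall out of exactly this kind of tally.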
Why it matters
- AI answers can be shaped by small, easily planted web sources; detailed falsehoods may outcompete vague truths.
- Brands with minimal official web presence risk having AI-generated summaries driven by third-party posts rather than verified statements.
- AI systems vary widely in how they ground responses: some cite official material, others rely on popular but unverified sources like Reddit or Medium.
- Relying on an AI assistant for brand or product research can produce misleading results unless the assistant exposes its sourcing and uncertainty.
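One practical check along these lines, assuming the assistant exposes the URLs it cited (an assumption; not every product does): bucket each citation by domain to see whether an answer leans on the brand's official site or on third-party posts. A rough sketch:

```python
# Rough sketch: classify an answer's cited URLs as official, user-generated, or other.
# The domain lists are illustrative; xarumei.com stands in for the brand's official site.
from urllib.parse import urlparse

OFFICIAL_DOMAINS = {"xarumei.com"}
UGC_DOMAINS = {"reddit.com", "medium.com"}  # popular but unverified sources

def classify_sources(cited_urls):
    """Split an answer's citations into official, user-generated, and other domains."""
    buckets = {"official": [], "ugc": [], "other": []}
    for url in cited_urls:
        host = urlparse(url).netloc.lower().removeprefix("www.")
        if host in OFFICIAL_DOMAINS:
            buckets["official"].append(url)
        elif host in UGC_DOMAINS:
            buckets["ugc"].append(url)
        else:
            buckets["other"].append(url)
    return buckets

# Example: an answer that leans on the seeded Reddit AMA rather than the brand's FAQ
print(classify_sources(["https://www.reddit.com/r/fashion/ama", "https://xarumei.com/faq"]))
```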
Key facts
- Experiment run by Mateusz Makosiewicz (article reviewed by Ryan Law).
- Author built xarumei.com with AI-generated content and unique brand name.
- 56 adversarial questions were generated with Grok and posed to eight AI systems.
- Phase-one results: Perplexity failed roughly 40% of the questions; ChatGPT-4/5 answered 53–54 of 56 correctly, either flagging the brand as nonexistent or grounding answers in the site.
- Phase-two intervention: author published an official FAQ denying specific claims and seeded three contradictory fake sources (a blog, a Reddit AMA, and a Medium piece).
- After seeding, Perplexity and Grok often repeated fabricated details; Gemini, AI Mode and Copilot adopted or blended material from the fake accounts.
- ChatGPT-4 and ChatGPT-5 cited the official FAQ in about 84% of phase-two answers and kept misinformation below roughly 7% of responses.
- Gemini and Perplexity produced misinformation in roughly 37–39% of phase-two answers, favoring specific fiction over vague truth.
- Claude consistently asserted the brand did not exist and refused to ground answers in the site content.
What to watch next
- How rapidly model updates change susceptibility to web-seeded manipulation — models evolve fast and results may shift with new versions.
- Whether search and AI providers improve provenance, source ranking and skepticism around small or new web sites.
Quick glossary
- Hallucination: When an AI model generates factually incorrect or invented information presented as if true.
- Grounding: The practice of linking model responses to verifiable sources or explicit statements of uncertainty.
- FAQ (Frequently Asked Questions): a public webpage where organizations clarify claims and correct misinformation about themselves.
- Reddit AMA: A public forum post where an individual answers questions from Reddit users; often cited by other sites and models.
Reader FAQ
Did the author create a real company?
No — the brand (Xarumei) and its products were fabricated and hosted on an AI-built site for the experiment.
Which AI systems were most resistant to manipulation?
ChatGPT-4 and ChatGPT-5 resisted seeded misinformation more consistently, citing the official FAQ in most phase-two answers.
Did seeding an official FAQ stop the spread of false details?
The FAQ helped some models (notably ChatGPT-4/5), but other systems still favored the seeded fake sources and repeated fabricated facts.
Will this experiment apply to all brands?
The article doesn't say; the experiment covered a single fabricated brand and one set of model versions, so the results may not generalize.

Source article: "I Ran an AI Misinformation Experiment. Every Marketer Should See the Results" by Mateusz Makosiewicz, reviewed by Ryan Law, December 10, 2025 (filed under AI Search, Data & Studies).
Sources
- AI results can be manipulated
- Ahrefs Tested AI Misinformation, But Proved Something Else
- Deep research and fake information: Results from 10 …
- What Ahrefs' fake brand experiment actually proved about …