Field note · Citation patterns · 27 April 2026 · 4 min read

Why LLMs pick third-party pages over your own site

Semrush finds LLMs prefer independent pages over brand-owned content. The implication: earned media and review presence now drive AI visibility, not your corporate domain.

Key takeaways

LLMs prefer third-party pages over brand-owned content for commercial and comparison queries.
The corporate homepage matches your brand, not the user's question. Independent reviews match both.
B2B marketers should track citation share across LLM answers, not just traffic to owned domains.
PR and earned media budgets are now AEO (answer engine optimisation) budgets. Trade press mentions feed retrieval and training.
Compliance-hardened thought leadership loses citation slots to opinionated, specific third-party pages.

What happened

Per Semrush, large language models routinely cite third-party publishers, review sites, and forums ahead of a brand's own domain when answering buyer questions. The Semrush analysis frames this as a structural feature of how ChatGPT, Perplexity, and Google's AI surfaces choose sources, not a temporary glitch. The models prefer pages that look like independent verification of a claim over pages that look like the claim itself.

Semrush reports that the gap shows up most clearly on commercial and comparison queries. Ask an LLM "best enterprise CRM for financial services" and the answer leans on G2, Reddit threads, Forrester summaries, and trade press. The vendor's own product page tends to appear later, if at all, even when it ranks well in classical Google search.

The mechanism Semrush describes is straightforward. Models are trained and retrieval-tuned to weight pages that match the intent of the prompt and carry external trust signals. A vendor's homepage matches the brand, not the question. A third-party review matches both.

Why it matters for your brand

For most B2B marketers, this finding rewires the assumption that owned content is the centre of gravity. It is not. In LLM answers, owned content is one input among many, and often not the decisive one. The decisive inputs are the pages that look like a third party answering the user's actual question.

For financial services brands, this is acute. A global bank or asset manager publishing thought leadership on its own domain will frequently lose the citation slot to the FT, Risk.net, the Banker, or a Bogleheads thread. Compliance teams have spent a decade hardening corporate sites against anything that reads like a strong opinion or a comparative claim. That same hardening is what makes those pages weak LLM citations. The pages that win are opinionated and specific, and they answer a narrow question. Your bond desk's quarterly outlook PDF does none of these things.

For multilaterals and UN-system bodies, the citation pattern is different but the lesson is the same. UNDRR, WHO, and the World Bank do get cited heavily on policy and statistics queries, because the models treat them as primary sources of fact. They lose ground on applied questions ("how should a city finance climate adaptation") where the model prefers case studies, academic syntheses, and trade press write-ups of the multilateral's own work. The implication: distribution of your findings through secondary outlets matters as much as the original report. A WHO study cited by the Lancet and Reuters will surface in LLM answers more reliably than the WHO PDF on its own.

For major industrial groups (Holcim, Siemens, ArcelorMittal types), the citation gap is widest on sustainability and procurement questions. Buyers asking ChatGPT about low-carbon cement specifications get answers built from Carbon Brief, ENR, and academic press, not from corporate sustainability reports. If your strategy is to publish a 120-page ESG report once a year and call it content, you are invisible in the layer where procurement research now starts.

For philanthropic and policy institutions, the model behaviour rewards being quoted, not just publishing. A foundation cited in Devex, Alliance, or SSIR will turn up in LLM answers about funding strategy. The same foundation publishing the same insight only on its own site will not. This is a media relations problem dressed up as a content problem.

The content strategy shift is concrete. First, stop measuring owned content only by traffic to your domain. Add a citation-share metric: of the LLM answers to your top 50 buyer questions, how often does any source mention your brand, and how often is your domain the cited URL rather than a third party's? Second, fund the third-party layer deliberately: bylines in trade press, contributed analysis to research firms, structured data feeds to comparison sites, and yes, presence in the Reddit and Stack Overflow communities where your buyers actually congregate. Third, write owned pages that look like answers, not brochures. A page titled "How [Bank] thinks about counterparty risk in tokenised settlement" will be cited. A page titled "Our innovation approach" will not.
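One way to make the citation-share metric concrete is a small script run over a log of LLM answers. The sketch below is illustrative only. The `Answer` record shape, the brand name "ExampleCo", the domain `example.com`, and the sample answers are all hypothetical, and detecting a brand mention by simple substring match is an assumption that a real tracker would likely need to refine.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Answer:
    """One LLM answer to a tracked buyer question (hypothetical record shape)."""
    question: str
    text: str
    cited_urls: list[str]


def host(url: str) -> str:
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    name = (urlparse(url).hostname or "").lower()
    return name[4:] if name.startswith("www.") else name


def is_owned(url: str, owned_domain: str) -> bool:
    """True if the URL sits on the owned domain or one of its subdomains."""
    h = host(url)
    return h == owned_domain or h.endswith("." + owned_domain)


def citation_share(answers: list[Answer], brand: str, owned_domain: str) -> dict[str, float]:
    """Share of answers that mention the brand, cite the owned domain,
    or mention the brand without citing the owned domain."""
    keys = ("mention_rate", "owned_citation_rate", "mentioned_without_owned_citation_rate")
    total = len(answers)
    if total == 0:
        return dict.fromkeys(keys, 0.0)
    mentioned = owned = mentioned_not_owned = 0
    for a in answers:
        # Assumption: a case-insensitive substring match counts as a mention.
        brand_in_text = brand.lower() in a.text.lower()
        cites_owned = any(is_owned(u, owned_domain) for u in a.cited_urls)
        mentioned += brand_in_text
        owned += cites_owned
        mentioned_not_owned += brand_in_text and not cites_owned
    return dict(zip(keys, (mentioned / total, owned / total, mentioned_not_owned / total)))


# Hypothetical sample: four answers collected for tracked buyer questions.
answers = [
    Answer("best enterprise CRM for financial services",
           "Reviewers rate ExampleCo highly.",
           ["https://www.g2.com/categories/crm"]),
    Answer("how to assess counterparty risk",
           "ExampleCo outlines its approach.",
           ["https://insights.example.com/counterparty-risk"]),
    Answer("top CRM vendors",
           "Analysts list several vendors.",
           ["https://www.reddit.com/r/CRM/"]),
    Answer("CRM pricing",
           "Forum users discuss ExampleCo pricing.",
           []),
]

shares = citation_share(answers, brand="ExampleCo", owned_domain="example.com")
print(shares)
```

Tracking the third figure separately is the point of the exercise: it isolates the answers where the brand is visible but a third party, not the owned domain, carries the citation.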

Brand-building budgets should rebalance accordingly. The PR function, long treated as a defensive cost centre at many large enterprises, becomes an AEO function. Earned mentions in trusted outlets are now training data and retrieval data. They compound.

The signal in context

The Semrush finding sits alongside a growing body of evidence that LLM citation behaviour favours independent verification over self-publication. Studies from Profound, Ahrefs, and Authoritas through 2024 and 2025 have shown Reddit, Wikipedia, YouTube, and a narrow set of trade publishers punching far above their classical search weight in ChatGPT and Perplexity answers. Google's AI Overviews behave similarly, pulling from forums and review sites for commercial queries. The pattern is not vendor-specific; it is a property of how retrieval-augmented generation evaluates trust.

What is new in the Semrush framing is the diagnostic angle: the question is no longer "are LLMs citing my category" but "are they citing me, and if not, who are they citing instead." That second question is the one B2B brands targeting regulated buyers should be running monthly. The answer dictates where the next dollar of content and PR budget goes, and which third-party relationships are now mission-critical rather than nice-to-have.
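The "who are they citing instead" question can be answered with a simple tally. The following is a minimal sketch, assuming each answer's citations have already been collected as a list of URLs. The function name `cited_instead`, the domain `example.com`, and the sample data are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def host(url):
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    name = (urlparse(url).hostname or "").lower()
    return name[4:] if name.startswith("www.") else name


def cited_instead(answers_urls, owned_domain):
    """Count the hosts cited in answers that never cite the owned domain.

    Each host is counted at most once per answer, so the tally reads as
    "number of answers in which this host took the citation slot".
    """
    counts = Counter()
    for urls in answers_urls:
        hosts = {host(u) for u in urls}
        hosts.discard("")  # drop URLs that had no parseable hostname
        if any(h == owned_domain or h.endswith("." + owned_domain) for h in hosts):
            continue  # the owned domain was cited; nothing was cited "instead"
        counts.update(hosts)
    return counts


# Hypothetical month of citations, one inner list per LLM answer.
monthly_citations = [
    ["https://www.g2.com/a", "https://www.reddit.com/r/x"],
    ["https://example.com/page", "https://www.g2.com/b"],
    ["https://www.g2.com/c", "https://www.g2.com/d"],
    [],
]

ranking = cited_instead(monthly_citations, "example.com")
for name, n in ranking.most_common():
    print(name, n)
```

Run monthly over the same question set, a ranking like this suggests which third-party outlets are worth prioritising.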

Source: Semrush blog

AI-authored, editor reviewed