ChatGPT Only Cites Half the Pages It Retrieves
A study of 1.4 million ChatGPT prompts, using February data from the ChatGPT 5.2 desktop client, found that the model retrieves far more pages than it actually cites. Out of 46.8 million total URLs retrieved across those prompts, roughly half (49.98%) ended up as numbered citations in the response. The other half was evaluated and dropped without ever appearing in an answer.
Getting retrieved is not the same as getting cited. A page can show up in ChatGPT’s search results, help shape the model’s understanding of a topic, and still appear nowhere in the final answer. No citation, no link, no attribution. The data from this study points at two specific elements that decide which half survives: the page title and the URL.
Before ChatGPT reads a page, it reads the title
When ChatGPT responds to a prompt that needs web information, it does not just search and cite whatever comes back. There is a step in between. Each retrieved result arrives with a small set of metadata: the page title, a short snippet, the URL, and an internal ID. ChatGPT looks at this metadata first and decides which pages are worth opening and reading in full.
The filtering happens before the model reads any actual page content. A page with a clear, relevant title and a clean URL has a better chance of making the cut. A page with a vague title, a messy URL, or metadata that does not line up with the question being asked gets filtered out at this stage, no matter how good the content behind the link might be.
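To make that step concrete, here is a minimal sketch of metadata-level filtering. Everything in it is illustrative: the field set comes from the study's description of what each result carries, but the Jaccard-overlap scoring, the threshold, and the function names are stand-ins for whatever ChatGPT actually does internally.

```python
from dataclasses import dataclass

@dataclass
class RetrievedResult:
    # The metadata fields the study says arrive with each search result.
    title: str
    snippet: str
    url: str
    result_id: str

def token_overlap(a: str, b: str) -> float:
    """Crude stand-in for the model's internal relevance scoring:
    Jaccard overlap between lowercase token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def worth_opening(result: RetrievedResult, query: str, threshold: float = 0.15) -> bool:
    # The filter sees only metadata, never the page body.
    text = f"{result.title} {result.snippet} {result.url.replace('-', ' ').replace('/', ' ')}"
    return token_overlap(text, query) >= threshold

r = RetrievedResult(
    title="CRM Pricing Comparison for Small Teams",
    snippet="Side-by-side pricing for eight CRMs aimed at teams under 20 seats.",
    url="https://example.com/crm-pricing-comparison-small-business",
    result_id="r42",
)
print(worth_opening(r, "CRM pricing comparison for small teams"))  # True
```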
Most AI visibility strategies focus on content quality, page authority, and ranking position, and those factors do determine whether a page enters the retrieval pool. But the conversion from retrieved to cited depends on metadata that many content teams barely think about.
ChatGPT breaks every prompt into narrower questions
One of the more useful findings from the study is about how ChatGPT actually searches for information. When a user submits a prompt, ChatGPT does not just run that prompt as a search query. It generates a set of narrower sub-questions internally (sometimes called fanout queries) and searches for pages relevant to each one separately.
Someone asking “what is the best CRM for small businesses” might trigger internal sub-questions like “CRM pricing comparison for small teams,” “CRM features for sales pipeline management,” and “CRM integrations with accounting software.” ChatGPT retrieves pages for each of those sub-questions and assembles the final answer from the combined results.
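A rough sketch of that retrieval pattern looks like the following. The sub-questions are hardcoded from the CRM example above (in ChatGPT they are generated by the model), and the search function is left as a placeholder for any search API.

```python
def fanout_retrieve(prompt: str, search) -> list:
    """Sketch of the retrieval pattern the study describes: break the
    prompt into narrower sub-questions, search each one separately,
    and pool the results."""
    subquestions = [
        "CRM pricing comparison for small teams",
        "CRM features for sales pipeline management",
        "CRM integrations with accounting software",
    ]
    pool = []
    for sq in subquestions:
        # A page only needs to match one sub-question, not the
        # original prompt, to enter the retrieval pool.
        pool.extend(search(sq))
    return pool

# `search` is any callable mapping a query string to a list of results,
# for example a thin wrapper around a search API.
results = fanout_retrieve("what is the best CRM for small businesses",
                          search=lambda q: [])
```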
The consequence for content strategy is immediate. A page titled “Best CRM Software” may match the original prompt reasonably well, but a page titled “CRM Pricing Comparison for Small Teams” matches one of the fanout queries precisely. The study measured this using cosine similarity, a standard way of computing how closely two pieces of text relate to each other, and found a clear gap between cited and non-cited pages.
Cited pages scored 0.602 on title-to-prompt similarity. Non-cited pages scored 0.484. When measured against the fanout queries instead of the original prompt, cited pages scored 0.656, which confirms that matching the sub-questions matters more than matching the broad prompt.
That gap is not small. A difference of roughly 0.12 on the same title-to-prompt metric (0.602 versus 0.484) represents a meaningfully different level of relevance. Pages whose titles closely match the specific sub-questions ChatGPT generates get cited. Pages whose titles only vaguely relate to the general topic get retrieved and then ignored.
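The study did not disclose which embedding model produced those scores. A sketch using the open-source sentence-transformers library shows the same kind of measurement; the model choice and example strings here are assumptions, so the absolute numbers will differ from the study's.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Any sentence-embedding model works for illustration; the study's
# actual embedding model is not disclosed.
model = SentenceTransformer("all-MiniLM-L6-v2")

prompt = "what is the best CRM for small businesses"
fanout = "CRM pricing comparison for small teams"
titles = [
    "Best CRM Software",                                       # broad
    "CRM Pricing Comparison for Small Business Sales Teams",   # specific
]

for title, emb in zip(titles, model.encode(titles)):
    sim_prompt = util.cos_sim(emb, model.encode(prompt)).item()
    sim_fanout = util.cos_sim(emb, model.encode(fanout)).item()
    print(f"{title!r}: prompt={sim_prompt:.3f}, fanout={sim_fanout:.3f}")
```

The specific title should score visibly higher against the fanout query than the broad title does, which is the pattern the study measured at scale.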
Readable URLs get cited more often
The study also found that URL readability correlates with citation rates. Pages with descriptive, human-readable URL slugs (paths like /crm-pricing-comparison-small-business) were cited 89.78% of the time when they appeared in search results. Pages with opaque or parameter-heavy URLs were cited at 81.11%.
Nearly nine percentage points is a big enough gap to matter across a site with hundreds of pages. Every page with an ugly, parameter-filled URL is slightly less likely to earn a citation when it enters the retrieval pool.
The reason likely connects to the same metadata filtering step. When ChatGPT evaluates a retrieved result before deciding whether to open it, the URL is one of the fields it can see. A descriptive URL gives the model a second signal, alongside the title, that the page is relevant. An opaque URL gives it nothing to work with.
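The study did not publish its exact criteria for "readable," but the distinction is easy to approximate. Here is a rough Python heuristic, with rules that are an illustrative guess rather than the study's definition:

```python
import re
from urllib.parse import urlparse

def readable_slug(url: str) -> bool:
    """Rough heuristic for the distinction the study draws: descriptive,
    hyphenated word slugs versus opaque IDs and query-string parameters."""
    parsed = urlparse(url)
    if parsed.query:                      # ?id=83791&cat=7 style URLs
        return False
    slug = parsed.path.rstrip("/").split("/")[-1]
    words = [w for w in slug.split("-") if w]
    # Descriptive slugs tend to be several real words, not hex or digits.
    return len(words) >= 2 and all(re.fullmatch(r"[a-z]+", w) for w in words)

print(readable_slug("https://example.com/crm-pricing-comparison-small-business"))  # True
print(readable_slug("https://example.com/p?id=83791&cat=7"))                       # False
```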
For content teams that have treated URL structure as a technical SEO box to check rather than a visibility factor, this data makes a case for raising the priority. A clean, descriptive URL slug now does double duty: it supports traditional search ranking and it supports AI citation probability.
88% of cited URLs come from the search channel
The study identified five retrieval channels inside ChatGPT, each labeled internally as a ref_type: search, news, reddit, youtube, and academia. The citation rates across them are wildly uneven.
The general search channel dominates: it accounts for 88.46% of all cited URLs, and pages retrieved through it are cited at an 88.46% rate. The news channel has a 12.01% citation rate. Reddit, despite contributing over 16 million URLs to the retrieval pool, gets cited at just 1.93%. YouTube and academia fall below 1%.
In plain terms: if a page does not rank in organic search, its path to a ChatGPT citation runs through channels where fewer than 2% of retrieved pages actually get cited. Ranking in Google remains the primary entry point, which means every factor that supports organic ranking, including link building and digital PR, also supports citation eligibility.
The search channel dominance also explains why broad comparisons of “cited vs non-cited” pages can paint a misleading picture. Because Reddit makes up 67.8% of all non-cited URLs, any comparison that mixes all channels together is really comparing search-index pages against Reddit content rather than comparing like with like. The study separated its analysis by ref_type to avoid this distortion, and the clearer patterns only became visible after that separation.
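The distortion is easy to reproduce. A toy pandas example, with invented row counts chosen only to echo the per-channel rates above, shows how a pooled figure blends two very different channels:

```python
import pandas as pd

# Toy counts, not the study's raw data: chosen only to show how a
# pooled citation rate hides per-channel differences.
df = pd.DataFrame({
    "ref_type": ["search"] * 100 + ["reddit"] * 100,
    "cited":    [1] * 88 + [0] * 12 + [1] * 2 + [0] * 98,
})

print(df["cited"].mean())                       # pooled: 0.45
print(df.groupby("ref_type")["cited"].mean())   # reddit 0.02, search 0.88
```

The pooled 45% rate describes neither channel; only the grouped view reflects what is actually happening.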
Writing titles for a model instead of a person
Traditional title optimization targets click-through rate. The goal is to attract a human scanning a list of ten blue links, so titles are written to create curiosity, include the primary keyword, and stand apart from competitors.
Title optimization for AI citation works differently. The reader is a language model deciding whether a page’s metadata lines up with a specific sub-question it generated on its own. ChatGPT does not care about curiosity gaps, emotional triggers, or branded modifiers. It cares about whether the title text and the sub-question text are about the same specific thing.
A title like “7 CRM Tools You Need to Try in 2026” performs well for human CTR because it creates curiosity and includes a current year modifier. A title like “CRM Pricing Comparison for Small Business Sales Teams” performs better for AI citation because it matches the kind of sub-question ChatGPT would generate when a user asks about CRM recommendations.
The two goals do not always conflict, but they reward different instincts. CTR optimization rewards distinctiveness and emotional pull. Citation optimization rewards precision and specificity. The titles that work for both tend to be descriptive and specific enough for a model to match, while still reading naturally to a human. Specificity wins over cleverness.
The keyword layer that no keyword tool reports
Fanout queries introduce a type of keyword targeting that traditional keyword research cannot surface. Keyword tools report search volume for queries that humans type into search engines. Fanout queries are generated by the model internally and never show up in any search volume database. A page can be perfectly optimized for every high-volume keyword in its category and still miss the sub-questions ChatGPT generates from conversational prompts.
The study did not publish a list of fanout queries, but the direction is clear from the data. These sub-questions tend to be more specific, more question-shaped, and narrower than the original prompt. A prompt about “best project management tools” generates sub-questions about specific use cases, pricing tiers, integration capabilities, and team size fit. Each sub-question is a citation opening for a page whose title matches it.
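One practical workaround is to ask a model to guess them. A sketch using the OpenAI Python SDK follows; the model name and prompt wording are assumptions, and the output approximates rather than reproduces ChatGPT's internal fanout.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def approximate_fanout(prompt: str, n: int = 5) -> list:
    """Ask a model to guess the narrower sub-questions a conversational
    prompt might fan out into."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable chat model works here
        messages=[{
            "role": "user",
            "content": (
                f"List {n} narrower, specific search queries a user asking "
                f"'{prompt}' would also need answered. One per line, no numbering."
            ),
        }],
    )
    text = response.choices[0].message.content
    return [line.strip() for line in text.splitlines() if line.strip()]

for q in approximate_fanout("best project management tools"):
    print(q)
```

The resulting list is a hypothesis, not ground truth, but it gives content teams concrete sub-question angles to title pages against.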
Content strategies built around broad category pages and general-purpose guides may get retrieved often but cited rarely if their titles do not align with those narrower angles. Pages built around specific comparisons, specific use cases, and specific feature evaluations are a more natural fit for what fanout queries are actually asking.
Guest posting connects to this directly. A guest post on an authoritative domain with a tightly scoped title like “How Mid-Size Retailers Use CRM Integrations to Reduce Cart Abandonment” matches fanout queries that a broad category page on the brand’s own site would not. Each placed article on an authoritative, indexable domain adds a page to the retrieval pool that already targets a specific fanout angle, and the third-party domain’s authority supports its position in the search channel where 88% of citations come from.
Link insertions into existing high-authority pages that already rank for relevant terms work on a similar principle. If an established page already shows up in ChatGPT’s retrieval results for related queries, a brand reference inserted into that page rides the existing page’s authority and its title-to-fanout alignment into the citation pipeline, without waiting for a new page to build up enough trust on its own.
Two gates, two different problems
The 50/50 split between retrieved and cited pages creates a useful way to diagnose where a brand is falling short. If pages are being retrieved by ChatGPT but not cited in the final output, the problem is probably at the metadata layer. The title may be too broad, the URL may be unreadable, or the page may not align with the specific sub-questions the model is generating.
If pages are not being retrieved at all, the problem is further upstream. The page either does not rank well enough in organic search or does not carry enough authority to enter the retrieval pool in the first place. That problem responds to the same interventions it always has: building backlinks from authoritative sources, earning editorial coverage, and producing content that ranks.
Retrieval depends on search ranking and authority. Citation depends on whether the title and URL match the model’s internal questions. Both gates need to open for a page to earn a visible citation, and knowing which one is closed determines which fix to apply. Building links and authority gets pages into the retrieval pool. Revising titles, URL slugs, and topical focus gets them from retrieved to cited.
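That diagnostic logic is simple enough to write down. A minimal sketch, with hypothetical field names standing in for whatever retrieval logging a team has:

```python
def diagnose(page: dict) -> str:
    """Map the two-gate model onto a simple triage."""
    if not page["retrieved"]:
        # Gate 1 closed: never entered the pool. Authority and ranking problem.
        return "upstream: build links, earn coverage, improve rankings"
    if not page["cited"]:
        # Gate 2 closed: retrieved but filtered at the metadata layer.
        return "metadata: tighten title, clean URL slug, match fanout angles"
    return "cited: both gates open"

print(diagnose({"retrieved": True, "cited": False}))
```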
The 1.4 million prompts in this study covered a single month of ChatGPT desktop usage, and the model’s retrieval behavior will keep evolving. But the two-gate structure, where being found and being cited are separate hurdles with different criteria, is likely to persist. Language models will keep retrieving more pages than they cite, and the metadata layer will keep deciding which ones make the cut.
