ChatGPT Often Retrieves But Rarely Cites Reddit Pages, Data Shows via @sejournal, @MattGSouthern
New Ahrefs data shows Reddit pages appeared often in ChatGPT retrievals but rarely as visible citations. The post ChatGPT Often Retrieves But Rarely Cites Reddit Pages, Data Shows appeared first on Search Engine Journal.
An Ahrefs analysis of 1.4 million ChatGPT prompts found that pages from a dedicated Reddit source were rarely cited in ChatGPT responses, even though they were often retrieved.
Ahrefs highlights this pattern in a new report.
What The Report Looked At
Ahrefs examined 1.4 million ChatGPT 5.2 prompts, tracking which pages were retrieved and later cited in the final response. About half of the retrieved pages were cited overall.
The citation rate varied by source, with pages from general web searches cited most frequently. In contrast, pages from what Ahrefs describes as a dedicated Reddit source were cited only 1.93% of the time. This is the Reddit gap: the Reddit source was often retrieved but rarely appeared as a visible citation.
The Reddit Finding
Of all the pages retrieved but not cited in Ahrefs’ dataset, 67.8% originated from the specific Reddit source Ahrefs identified.
Ahrefs writes that ChatGPT “is using Reddit extensively to understand topics, gauge consensus, and build context—but it almost never gives Reddit the credit.”
One point to clarify is that Reddit pages can still be cited by ChatGPT when they appear in standard web search results. The 1.93% figure refers to what Ahrefs calls a separate Reddit source, distinct from general web searches. In May 2024, OpenAI and Reddit announced a data partnership granting OpenAI access to Reddit’s data.
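The 1.93% and 67.8% figures measure different things: the first is a citation rate within one source, while the second is that source's share of all uncited pages. The sketch below shows the difference on a tiny made-up log. The record layout, the source labels, and both function names are assumptions for illustration, not Ahrefs' actual data format or method.

```python
from collections import Counter

# Hypothetical retrieval log: (source label, was the page cited?).
# These rows are invented for illustration and are not Ahrefs data.
records = [
    ("web_search", True), ("web_search", True), ("web_search", False),
    ("reddit_source", False), ("reddit_source", False),
    ("reddit_source", False), ("reddit_source", True),
]

def citation_rate_by_source(records):
    """Of the pages retrieved from each source, what fraction was cited?"""
    retrieved = Counter(source for source, _ in records)
    cited = Counter(source for source, was_cited in records if was_cited)
    return {source: cited[source] / retrieved[source] for source in retrieved}

def share_of_uncited(records):
    """Of all retrieved-but-uncited pages, what fraction came from each source?"""
    uncited = Counter(source for source, was_cited in records if not was_cited)
    total = sum(uncited.values())
    return {source: n / total for source, n in uncited.items()} if total else {}

if __name__ == "__main__":
    print(citation_rate_by_source(records))
    print(share_of_uncited(records))
```

A source can have a low citation rate and still account for most uncited pages if it is retrieved very often, which is the pattern the report describes for the Reddit source.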
What Does Help A Page Get Cited
Ahrefs examined how closely page titles and URLs aligned with the specific sub-questions generated by ChatGPT during the search process. To do this, Ahrefs used open-source tools to compute similarity scores, approximating ChatGPT’s internal matching process. Pages with higher scores for matching those sub-questions were cited more frequently in the dataset.
When ChatGPT Search responds to a prompt, it often breaks the prompt down into several narrower queries and searches for pages related to each. In Ahrefs’ data, pages whose titles and URLs matched these narrower queries correlated more strongly with citations than pages that only broadly matched the original prompt. URL structure also played a role. Pages with clear, descriptive URL slugs were cited 89.78% of the time they appeared in search results, compared to 81.11% for pages with less descriptive URLs. This aligns with SE Ranking’s analysis, which found that ChatGPT tends to favor URLs describing broader topics over those focused on a single keyword.
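To make the idea of scoring a page against sub-queries concrete, here is a minimal sketch. The article does not say which open-source tools Ahrefs used, so this swaps in a simple bag-of-words cosine similarity as a stand-in. The function names, the example prompt, the sub-queries, and the page title and slug are all hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Bag-of-words cosine similarity between two strings (0.0 to 1.0)."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def best_subquery_match(title, url_slug, sub_queries):
    """Highest similarity between a page's title plus slug and any sub-query."""
    page_text = f"{title} {url_slug}"
    return max((cosine(page_text, q) for q in sub_queries), default=0.0)

if __name__ == "__main__":
    # Hypothetical prompt and the narrower queries it might be split into.
    prompt = "best running shoes"
    sub_queries = [
        "best running shoes for flat feet",
        "running shoe cushioning comparison",
    ]
    title = "Best Running Shoes for Flat Feet"
    slug = "best-running-shoes-flat-feet"

    print("broad prompt score:", cosine(f"{title} {slug}", prompt))
    print("best sub-query score:", best_subquery_match(title, slug, sub_queries))
```

In this toy case the page scores higher against its closest sub-query than against the broad prompt. A real analysis would more likely use embedding-based similarity, and these scores only approximate whatever matching ChatGPT performs internally.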
Why This Matters
Ahrefs’ data indicates that Reddit’s influence on answers differs from what businesses might expect. Reddit appears to shape answers indirectly, without being explicitly cited. That influence still matters, but it operates upstream of the answer rather than through visible citation credit.
For clear citation credit, Ahrefs’ data shows the best indicator is whether your page titles and URLs align with the specific sub-queries that ChatGPT Search produces from a prompt. Simply matching the broad keyword doesn’t suffice.
Looking Ahead
The study evaluates ChatGPT 5.2 on desktop in February 2026. Since then, OpenAI has launched several model updates, such as the GPT-5.3 Instant transition, which Resoneo links to a 20% decrease in the number of cited domains per ChatGPT response. It’s uncertain whether the Reddit gap and title-matching patterns observed by Ahrefs still apply to these newer models.
Featured Image: Koshiro K/Shutterstock