
Adding Schema Doesn’t Boost AI Citations


Rasit Cakir

May 13, 2026 · 8 min read

Ahrefs, the SEO and AI search analytics platform, published a study on May 11 that tested one of the more popular assumptions in AI visibility advice: that adding schema markup to a page will get it cited more often in AI responses. The study tracked 1,885 pages that added JSON-LD schema between August 2025 and March 2026, matched them against 4,000 control pages with similar citation history, and measured what happened to AI citations on Google AI Overview, AI Mode, and ChatGPT in the 30 days before and after schema was added. Adding schema produced no meaningful citation boost on any of the three platforms.

That finding cuts against a lot of marketing content from the past 18 months that has positioned schema as an AI visibility lever. The argument usually starts with a correlation: pages cited by AI are much more likely to have schema markup than pages that aren’t. The Ahrefs data confirms the correlation. What it doesn’t confirm is the cause-and-effect story people have been telling about it.

The correlation everyone has been quoting

Before the study, the team at Ahrefs analyzed six million URLs and found that schema markup is much more common on AI-cited pages than on pages that aren’t cited. Among non-cited pages, only 18.5% had JSON-LD schema. Among AI-cited pages, 53.6% had it for reference citations and 71.7% had it for inline citations. AI-cited pages were almost three times more likely to have schema than non-cited ones.

That gap is the kind of stat that gets shared in conference slides and LinkedIn carousels as evidence that schema is the unlock for AI visibility. But correlation and causation are two different things, and a difference this clean usually has a confounding explanation. Schema markup tends to live on better-maintained, more technically sophisticated sites. Those same sites publish stronger content, build more authority, earn more links, and do all the other things that get pages cited. So the team designed an experiment to find out whether schema specifically was doing the work, or whether it was just along for the ride.

The study Ahrefs ran on 1,885 pages

The team built a dataset of 1,885 pages that added JSON-LD schema between August 2025 and March 2026. They identified the exact date schema went live on each page by combing through historical crawl data and spotting the first day the page had a JSON-LD script tag.
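Ahrefs hasn’t published the detection code, but the logic is straightforward to sketch. The snippet below (Python, with a made-up snapshot structure keyed by crawl date) shows one way to find the first crawl in which a page carries a JSON-LD script tag; it’s an illustration of the idea, not the study’s actual pipeline.

```python
# Sketch: find the first crawl date on which a page carries JSON-LD.
# The snapshot format (a dict of date -> raw HTML) is an assumption for
# illustration; Ahrefs' real crawl data is not structured like this.
from bs4 import BeautifulSoup


def first_jsonld_date(snapshots: dict[str, str]) -> str | None:
    """Return the earliest crawl date whose HTML contains a JSON-LD script tag."""
    for date in sorted(snapshots):
        soup = BeautifulSoup(snapshots[date], "html.parser")
        if soup.find("script", type="application/ld+json"):
            return date
    return None


snapshots = {
    "2025-08-01": "<html><body><p>No schema yet.</p></body></html>",
    "2025-09-15": (
        '<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Article", "headline": "Example"}'
        "</script></head><body><p>Same page, now with schema.</p></body></html>"
    ),
}
print(first_jsonld_date(snapshots))  # -> 2025-09-15
```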

For each of those treated pages, they then picked three control pages from different domains with similar pre-period citation levels, none of which had added schema during the same window. The matched sets let them compare apples to apples: a page that added schema against similar pages that didn’t, all starting from a comparable baseline.
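The matching procedure isn’t published either, but the selection rule described above is simple to express. The sketch below assumes a candidate table with domain and pre-period citation columns (hypothetical names) and picks the closest matches from other domains.

```python
# Sketch of the matched-control selection: for one treated page, pick the
# n candidate pages from other domains with the closest pre-period citation
# counts. Column names and numbers are placeholders, not the study's data.
import pandas as pd


def pick_controls(treated: pd.Series, candidates: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    pool = candidates[candidates["domain"] != treated["domain"]].copy()
    pool["gap"] = (pool["pre_citations"] - treated["pre_citations"]).abs()
    return pool.nsmallest(n, "gap")


treated = pd.Series({"domain": "example.com", "pre_citations": 140})
candidates = pd.DataFrame({
    "domain": ["a.com", "b.com", "c.com", "example.com", "d.com"],
    "pre_citations": [150, 90, 135, 142, 300],
})
print(pick_controls(treated, candidates))  # -> c.com, a.com, b.com (closest baselines)
```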

The reason for the matched comparison comes down to how noisy AI citation data has been over the past year. AI Overview citations were contracting during the study period while AI Mode citations were expanding. If the team had just compared each page to itself before and after schema, they would have been measuring the platform-wide trend, not the effect of schema. The matched comparison let them strip out those platform swings and isolate what adding schema actually did.

The numbers across three AI platforms

After running the matched comparison across the full dataset, the results came out like this:

- Google AI Overview: citations on treated pages fell 4.6% relative to control pages. The decline was small but statistically significant.

- Google AI Mode: citations on treated pages rose 2.4% relative to control pages. The result is statistically indistinguishable from zero.

- ChatGPT: citations on treated pages rose 0.8% relative to control pages. Also indistinguishable from zero.

Two of the three numbers are essentially noise. The AI Overview decline is real in the sense that it’s unlikely to be random chance, but the absolute size is small (an average loss of around 12 daily citations on pages that were already getting hundreds), and the team can’t cleanly attribute the gap to schema itself. Treated and control pages were both already on a downward trajectory in AI Overview citations during the study window, which suggests something else (a Google update, content staleness, AI Overview pulling back from certain content types) may explain part of the decline.

To make sure the conclusion held up, the team ran the test four different ways: a basic t-test, a difference-in-differences analysis, an event study tracking week-by-week trends, and a symmetric-window version of the DiD analysis. All four pointed in the same direction. No citation growth in AI Mode, no growth in ChatGPT, and a small AI Overview decline that’s real but can’t be definitively pinned on schema.
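For readers unfamiliar with difference-in-differences, the core arithmetic is a comparison of changes: how much treated pages moved minus how much control pages moved over the same window, which nets out the platform-wide trend. The sketch below uses made-up averages to show the mechanics; the actual analysis runs on page-level panels with proper significance testing.

```python
# Back-of-the-envelope difference-in-differences. The citation numbers are
# invented placeholders, not figures from the Ahrefs study.
import pandas as pd

panel = pd.DataFrame({
    "group": ["treated", "treated", "control", "control"],
    "period": ["pre", "post", "pre", "post"],
    "citations": [200.0, 188.0, 210.0, 207.0],  # mean daily citations
})

means = panel.pivot(index="group", columns="period", values="citations")
treated_change = means.loc["treated", "post"] - means.loc["treated", "pre"]   # -12.0
control_change = means.loc["control", "post"] - means.loc["control", "pre"]   # -3.0
print("DiD estimate:", treated_change - control_change)  # -9.0: net change after stripping the shared trend
```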

The small AI Overview decline that nobody can explain

The 4.6% AI Overview decline deserves a closer look, since it’s the one finding that isn’t statistical noise. The team flagged it as real but unexplained. A few possible explanations:

The pages in the study were already getting heavy AI Overview citations going in. Every page in the dataset had over 100 AI Overview citations in February 2025, before any schema was added. Pages with that much citation volume tend to be the ones AI Overview is actively reviewing and refreshing, which means small changes to those pages get noticed. It’s possible that adding schema triggered some kind of re-evaluation that pulled some of these pages slightly out of favor.

It’s also possible the decline reflects content-type patterns that happen to correlate with the kinds of sites that added schema during the study window. A more granular follow-up looking at which schema types and which content categories drove the decline could clarify the question.

For now, the only thing the team can say with confidence is that the decline is real, small, and not the kind of result anyone betting on schema as an AI visibility lever would have predicted.

Where schema might still play a role

The study has one important caveat that anyone reading the headline should keep in mind. Every page in the dataset was already being cited heavily by AI before any schema was added. These were pages already inside the consideration set, being crawled and surfaced by language models. The study tested whether adding schema pushed those pages higher. It did not test whether schema helps pages that aren’t being cited at all get into the consideration set in the first place.

For pages that aren’t being seen by AI search, schema markup might still help with crawling, parsing, or indexing. A separate experiment from searchVIU tested whether five major AI systems (ChatGPT, Claude, Perplexity, Gemini, and Google AI Mode) actually parsed schema markup when fetching a page in real time. None of them did. All five extracted only visible HTML content during direct retrieval, ignoring JSON-LD, hidden Microdata, and hidden RDFa.
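That outcome is easy to reason about mechanically: a visible-text extraction step strips script tags before the model ever sees the page, so JSON-LD never makes it into the retrieved content. The snippet below is a rough illustration of that behavior, not any AI system’s actual retrieval code.

```python
# Illustration: plain-text extraction drops JSON-LD along with every other
# script tag, leaving only the content a human reader would see.
from bs4 import BeautifulSoup

html = """
<html>
  <head>
    <script type="application/ld+json">
      {"@context": "https://schema.org", "@type": "Article", "headline": "Invisible to readers"}
    </script>
  </head>
  <body><h1>Visible headline</h1><p>Visible body text.</p></body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")
for tag in soup(["script", "style"]):  # remove non-visible elements
    tag.decompose()
print(soup.get_text(" ", strip=True))  # -> Visible headline Visible body text.
```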

That doesn’t rule out schema playing a role in earlier stages of the pipeline (the crawl, the index, the entity recognition layer that decides what a page is about before any user prompt comes in). But it does suggest that at the moment of retrieval, when an AI system pulls content to compose a response, the content that matters is the content humans can see on the page.

What the data suggests actually drives AI citations

Schema being three times more common on AI-cited pages was the original observation that kicked off the schema-helps-AI-visibility theory. The study confirms the correlation and dismantles the causal claim built on top of it. So the question becomes: why do more than half of AI-cited pages run schema if schema isn’t what’s getting them cited?

The Ahrefs analysis offers the answer directly. Sites that add structured data tend to also invest in technical SEO, publish authoritative content, build links, maintain their pages, and rank well in regular search. AI systems are more likely to retrieve and cite that kind of content. Schema and citation eligibility share a common cause, which is the broader investment in content quality and authority that makes a page citation-worthy in the first place.

For brands trying to build AI visibility, the takeaway is to focus the investment on what causes citations rather than what correlates with them. Link building and digital PR build the third-party authority signals that AI retrieval systems use to decide which pages to trust. Guest posting on credible domains and link insertions into authoritative content put a brand inside the citation pool through pages that retrieval systems already trust.

Schema can still earn its place on a page for other reasons. Rich results in regular search (where they still apply), voice assistant compatibility, knowledge graph contributions, and downstream entity recognition all benefit from structured data. But if the only reason for adding JSON-LD was to get more AI citations on pages already being indexed, the Ahrefs data doesn’t support that as a working theory.

The simpler story is the one the study lands on: pages that get cited tend to be pages whose owners do a lot of things right, and schema is one of those things, alongside everything else that actually moves the needle. Adding the markup without the underlying work isn’t going to get a page over the line.