Measuring GEO: What to Track When Rankings Don't Apply

The measurement problem GEO has

For twenty years, SEO has had a clear default metric. Rankings.

Where do you rank for your target keywords? Are you moving up or down? How does your visibility compare to competitors? The metric was imperfect — rankings vary by location, by personalisation, by intent — but it was concrete, comparable, and easy to communicate.

GEO has no equivalent. You can’t hold “ranking position five for [keyword]” when the AI answer is a single paragraph with maybe one citation in it. There’s no clear competitive ladder. There’s no canonical metric every measurement tool agrees on.

This isn’t a failure of the field. It’s a structural reality of what AI citation actually is. Trying to force GEO into ranking-style metrics produces numbers that look reassuring and don’t measure anything useful. The honest approach is to measure differently — using metrics that match what GEO actually does.

Four metrics matter. None of them are perfect. Together, they tell you most of what you need to know.

1. Citation presence

The simplest question: do you appear in AI answers to questions you should appear in?

This is what the manual testing from lesson 23 measures. You run category prompts — questions your business should be the answer to — across multiple AI systems, and you record whether you appear, whether the appearance is accurate, and how prominently you’re cited.

Citation presence is binary at the level of any single prompt — you’re either there or you aren’t — but becomes a richer signal across many prompts and over time. Are you appearing in more of the prompts you should appear in? In more AI systems? More prominently?

Citation presence is the closest thing GEO has to a ranking metric, and it’s the most useful single thing to track.

A reasonable way to operationalise this: pick ten prompts that genuinely matter for your business — the questions a buyer might ask before hiring you. Run them quarterly across three or four AI systems. Track presence over time as a simple count. “I appeared in 4 of 10 prompts last quarter; I appear in 7 of 10 this quarter” tells you the work is paying off. Numbers don’t have to be sophisticated to be useful.
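
As a rough illustration of that count, here is a minimal Python sketch. It assumes you keep a simple log with one row per quarter, AI system and prompt. The quarter labels, system names and prompts are made-up placeholders, and counting a prompt as "present" when at least one system cites you is one possible choice, not a standard.

```python
from collections import defaultdict

# Hypothetical log: (quarter, AI system, prompt, did the business appear?).
# Every value here is an illustrative placeholder, not real data.
observations = [
    ("2025-Q1", "system_a", "prompt A", False),
    ("2025-Q1", "system_b", "prompt A", True),
    ("2025-Q1", "system_a", "prompt B", False),
    ("2025-Q1", "system_b", "prompt B", False),
    ("2025-Q1", "system_a", "prompt C", False),
    ("2025-Q1", "system_b", "prompt C", False),
    ("2025-Q2", "system_a", "prompt A", True),
    ("2025-Q2", "system_b", "prompt A", True),
    ("2025-Q2", "system_a", "prompt B", False),
    ("2025-Q2", "system_b", "prompt B", True),
    ("2025-Q2", "system_a", "prompt C", False),
    ("2025-Q2", "system_b", "prompt C", False),
]


def presence_by_quarter(rows):
    """Return {quarter: (prompts_with_presence, prompts_tested)}.

    Assumption: a prompt counts as present for a quarter if the business
    appeared in at least one of the systems tested.
    """
    seen = defaultdict(dict)
    for quarter, _system, prompt, appeared in rows:
        seen[quarter][prompt] = seen[quarter].get(prompt, False) or appeared
    return {q: (sum(p.values()), len(p)) for q, p in sorted(seen.items())}


for quarter, (present, tested) in presence_by_quarter(observations).items():
    print(f"{quarter}: appeared in {present} of {tested} prompts")
```

A spreadsheet does the same job. The point is only that the log stays small and the count stays comparable from one quarter to the next.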

2. Citation accuracy

Appearing isn’t enough. Being described accurately matters as much.

If an AI mentions your business but gets the description wrong — wrong location, wrong specialism, wrong type of work, wrong audience — you’ve achieved presence without value. Worse, you’ve achieved presence with damage. Inaccurate citation can actively mislead the people the AI is supposed to be helping.

This is why branded prompts (lesson 23) matter alongside category prompts. When an AI is asked to describe you directly, what does it say? Does the description match what you’d write about yourself? Are the facts right? Is the framing one you’d choose?

Accuracy is hard to quantify cleanly, but easy to evaluate qualitatively. A simple approach: every quarter, ask each major AI system to describe your business. Save the responses. Compare to what you’d write yourself. Note the gaps — what’s wrong, what’s missing, what’s vague.

The gaps tell you where to do more work. Inaccurate location → reinforce location in schema and on the About page. Wrong audience → strengthen the audience relationships covered in Module 3. Missing specialism → make the specialism more prominent on your site. The accuracy data points directly to the work that closes the gap.
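
If you save the responses as text, a first pass over the gaps can be sketched in Python. This is a deliberately crude check under stated assumptions: the business details and the sample response are invented, and a substring match can only flag facts that are missing, not facts that are wrong or badly framed. It narrows down where to read closely; it does not replace reading.

```python
# Hypothetical facts you would expect in an accurate description.
# Each field lists phrases that would count as a match.
expected_facts = {
    "location": ["Leeds"],
    "specialism": ["bookkeeping"],
    "audience": ["freelancers", "sole traders"],
}

# Follow-up work for each kind of gap, taken from the lesson text.
next_steps = {
    "location": "reinforce location in schema and on the About page",
    "audience": "strengthen the audience relationships covered in Module 3",
    "specialism": "make the specialism more prominent on your site",
}


def find_gaps(expected, response):
    """Return the fields for which none of the expected phrases appear."""
    text = response.lower()
    return [
        field
        for field, phrases in expected.items()
        if not any(phrase.lower() in text for phrase in phrases)
    ]


# Invented example of a saved AI response.
saved_response = (
    "Example Co is an accounting firm based in Manchester that offers "
    "bookkeeping for small businesses."
)

for field in find_gaps(expected_facts, saved_response):
    print(f"{field}: {next_steps[field]}")
```

Here the location and audience are flagged and the specialism is not, which matches what a careful reader would note by hand.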

3. AI referral traffic

A growing share of website traffic now arrives from AI tools — people who clicked a citation in an answer and visited the source. This is direct, measurable, and increasing.

In your analytics, AI referral traffic shows up as visits from referrers like chatgpt.com (formerly chat.openai.com), claude.ai, perplexity.ai, and (increasingly) Google AI Overview links. Most analytics platforms, including Google Analytics, Plausible, Fathom and server-log-based tools, surface these as referrers in the same way they surface traffic from any other site.
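
If you work from raw referrer strings, for example in server logs, labelling AI referrals might look like the Python sketch below. The domain list is an assumption that will need maintaining: check it against the referrers your own analytics actually reports. The function name ai_source is hypothetical.

```python
from urllib.parse import urlparse

# Assumed mapping of referrer domains to AI tools. Extend or correct it
# based on what appears in your own analytics.
AI_REFERRER_DOMAINS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",  # ChatGPT's newer domain; verify in your own data
    "claude.ai": "Claude",
    "perplexity.ai": "Perplexity",
}


def ai_source(referrer):
    """Return the AI tool name for a referrer string, or None if it isn't one."""
    # urlparse only fills in the hostname when the string has a scheme or
    # starts with "//", so bare hosts like "claude.ai" get "//" prepended.
    parsed = urlparse(referrer if "//" in referrer else "//" + referrer)
    host = parsed.hostname or ""
    for domain, name in AI_REFERRER_DOMAINS.items():
        # Match the domain itself or any subdomain, but not lookalikes
        # such as "notclaude.ai".
        if host == domain or host.endswith("." + domain):
            return name
    return None


for ref in ["https://www.perplexity.ai/search?q=x", "https://www.google.com/"]:
    print(ref, "->", ai_source(ref))
```

Matching on the parsed hostname rather than searching the whole URL avoids counting pages that merely mention one of these domains in their path or query string.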

The numbers will be small at first. AI referral traffic is currently a fraction of organic search traffic for most sites. But it’s growing, and it’s a real signal. People who arrive from an AI citation tend to be high-intent — they’ve already received some information about you, decided it was useful, and chosen to visit. Conversion rates on AI-referred traffic are often markedly higher than on traffic from broad keyword searches.

Track AI referral traffic monthly. Look at the trend rather than the absolute number. A site doing GEO well should see this traffic grow steadily over months and years, even if it stays small in absolute terms.
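
One way to read the trend without over-reading any single month is to compare the average of the last few months with the average of the few before. The Python sketch below does that. The three-month window and the visit counts are illustrative assumptions, not recommendations or real data.

```python
def trend(monthly_visits, window=3):
    """Compare the mean of the latest `window` months with the window before it.

    Returns (recent_average, previous_average, change).
    """
    if len(monthly_visits) < 2 * window:
        raise ValueError("need at least two full windows of monthly data")
    recent = monthly_visits[-window:]
    previous = monthly_visits[-2 * window:-window]
    recent_avg = sum(recent) / window
    previous_avg = sum(previous) / window
    return recent_avg, previous_avg, recent_avg - previous_avg


# Hypothetical AI-referred visits per month, oldest first.
visits = [12, 9, 15, 14, 20, 17]

recent_avg, previous_avg, change = trend(visits)
print(f"previous 3 months: {previous_avg:.1f} visits/month")
print(f"latest 3 months:   {recent_avg:.1f} visits/month")
print(f"change:            {change:+.1f}")
```

With numbers this small, a single month can swing a long way, so averaging over a window gives a steadier picture than month-to-month differences.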

4. Brand mention frequency

The fourth metric is the most diffuse — and the most likely to age oddly — but worth understanding.

As AI systems take over informational queries that used to be searches, some traffic doesn’t arrive as a click. The user gets the answer in the AI tool and never visits anyone’s site. But your brand might still have been mentioned in the answer. Over time, repeated mentions across many users build awareness even without traffic.

This is called brand mention frequency — the rate at which your business is referenced in AI answers, even when no click results. It’s harder to measure than the other three metrics (you can’t always tell when you’ve been mentioned without traffic), but tools in category 1 from lesson 24 are starting to try. The data is currently noisy.

For most businesses today, brand mention frequency is best monitored qualitatively. Are people finding you and saying “ChatGPT recommended you” or “I asked Perplexity and your name came up”? That’s brand mention frequency in action. The metric isn’t precise, but the underlying phenomenon is real.

What to ignore

A few metrics that look useful and aren’t.

“GEO score” or “AI visibility score” from a single tool. Black-box composite scores tell you what the tool thinks of you, not what AI systems do. Useful as a directional signal at best; treated as a definitive metric, they mislead.

Total impressions across all AI systems. Some tools count every time your name appears in any AI answer across millions of prompts. The number sounds impressive and means almost nothing. It includes appearances in irrelevant contexts, in prompts no real buyer would ask, in misidentifications. Volume without quality is a vanity metric.

Ranking-style “position” inside AI answers. Some tools assign you a “position” within an AI answer based on where you appear in the citation list. This forces AI citation into an SEO frame that doesn’t fit. The order of citations in most AI answers isn’t ranked the way search results are. Treating position as meaningful overinterprets the data.

Anything that looks like SEO metrics applied unchanged. GEO works differently. Metrics that pretend it works the same as SEO produce false confidence — usually in directions that mislead the people relying on them.

The honest position: GEO measurement is real but rougher than SEO measurement. Resist the temptation to over-quantify it just because precision is reassuring. The four metrics above, tracked simply and consistently, tell you more than any sophisticated dashboard.

A simple measurement plan

If you want a measurement plan you can maintain in an hour a quarter, this is the one I’d suggest.

Quarterly:

Run 10 category prompts and 3 branded prompts across 3 AI systems. Record citation presence and accuracy.
Pull AI referral traffic data from your analytics for the quarter. Note the trend.
Note any qualitative brand mention feedback received that quarter (from clients, prospects, peers).

Annually:

Review the full year’s trends. Are citation presence, accuracy, and referral traffic moving in the right direction?
Identify the gaps in citation accuracy. Plan the work that closes them.
Reassess the prompts you’re tracking. Are they still the right questions?

That’s the whole plan. No dashboards, no expensive tools, no formal reporting. An hour a quarter, an afternoon a year. It’s enough.

A useful mindset

The right metrics for GEO are the ones that tell you whether the work is paying off, not the ones that look impressive in a slide. Simple, consistent measurement beats complex measurement done sporadically.

If you’ve done the work in the previous modules and you’re measuring with these four metrics, you’ll know more about your GEO performance than the vast majority of businesses. The bar is genuinely low, and the rewards for clearing it are real.

Coming up in the final module: Judgment, restraint, and what’s coming. Two lessons that close the course — covering the GEO myths that are already forming, what to stop chasing, and when to trust the work you’ve done to do its job. The same restraint message the on-page SEO course ended on, applied to AI hype.