Measuring GEO: The Seven-Metric Stack for AI Visibility

EPR Editorial Team · Jun 9, 2026 · 8 min read

Edited on Jun 22, 2026

Citation Share is the headline metric of GEO — the single number that captures whether a brand appears in the answers buyers see. But Citation Share alone is not a measurement program. A GEO program with only Citation Share is like an SEO program with only ranking position. The number tells you where you stand. It does not tell you why, and it does not tell you what to fix.

The full GEO measurement stack runs across seven metrics, each capturing a different layer of how a brand is performing inside ChatGPT, Claude, Perplexity, Gemini, and Google AI Overviews. The scoring methodology for Citation Share itself is covered separately in How to Score Citation Share: The 5-Component Formula. This piece is the surrounding landscape — the other six metrics that together tell the comms team what to actually do next.

Why One Metric Is Not Enough

A Citation Share score that holds steady at 18% across two quarters can mean three different things. It can mean the brand has reached its natural ceiling and needs a structural intervention. It can mean Citation Share is decaying at exactly the rate at which the comms team's new earned media replaces it, so the team is running on a treadmill. It can mean a collapse in one engine is being papered over by gains in another, with cross-engine variance hidden inside the average.

The team that only watches Citation Share cannot distinguish between those three scenarios. The team that watches the full stack can. The diagnosis determines the prescription.

The Seven Metrics

1. Citation Share

The percentage of category-relevant buyer prompts in which the brand appears as a cited or referenced source across the five AI engines. The methodology — five components, weighted formula, defined prompt set — is in the dedicated piece linked above. Read that for the math. This piece treats Citation Share as the headline and moves to what surrounds it.

2. Entity Authority

The strength and accuracy of the brand's entity infrastructure that AI engines retrieve from. Composite of:

Wikipedia article quality. Length, citations, edit recency, watchlist activity. Engines weight Wikipedia heavily on corporate prompts.
Wikidata entity completeness. Identifier coverage, property completeness, language-version count. The structured-data layer that sits alongside Wikipedia and that AI training and retrieval can draw on.
Knowledge Panel control. Whether the brand has claimed and verified its Google Knowledge Panel. Cascades into AI Overviews retrieval.
LinkedIn company page accuracy. The corporate-identity source AI engines retrieve for B2B and enterprise prompts.

How to measure: rate each of the four components on a 0-3 scale, sum to a 0-12 Entity Authority score. Score below 6 is structural underperformance — Citation Share work compounds against a broken foundation.
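The scoring rule above can be sketched in a few lines. This is a minimal illustration, not an official scoring tool: the component keys and the example ratings are hypothetical, and only the 0-3 scale, the 0-12 total, and the below-6 threshold come from the text.

```python
# Hypothetical component keys; the article names the four components
# but does not prescribe identifiers for them.
COMPONENTS = ("wikipedia", "wikidata", "knowledge_panel", "linkedin")


def entity_authority(ratings: dict[str, int]) -> int:
    """Sum four 0-3 component ratings into a 0-12 Entity Authority score."""
    missing = [name for name in COMPONENTS if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for name in COMPONENTS:
        if not 0 <= ratings[name] <= 3:
            raise ValueError(f"{name} rating must be between 0 and 3")
    return sum(ratings[name] for name in COMPONENTS)


def is_structural_underperformance(score: int) -> bool:
    """The article treats a score below 6 as structural underperformance."""
    return score < 6


# Made-up ratings for illustration only.
example = {"wikipedia": 2, "wikidata": 1, "knowledge_panel": 0, "linkedin": 2}
score = entity_authority(example)
print(score, is_structural_underperformance(score))
```

The ratings themselves remain a practitioner judgment; the code only makes the arithmetic and the threshold explicit.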

3. Retrieval Frequency

How often AI engines actually retrieve and cite the brand's own owned content (website, blog, IR pages, sustainability reports) versus third-party sources. The gap between Citation Share and Retrieval Frequency tells the team whether the brand is being mentioned (others writing about it) or being heard (engines reading the brand's own words).

How to measure: across the standard prompt set, log the source URL the engine cites. Percentage of citations that resolve to the brand's owned domains = Retrieval Frequency. Healthy: 30-45%. Below 20%: crawl access, schema, or content discoverability problem. Above 60%: over-reliance on owned content, missing third-party authority.
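As a rough sketch of that calculation, assuming the cited URLs have already been logged by hand or by a tool: the domain names and URLs below are placeholders, and the "between bands" label is an assumption, since the text does not say how to read values that fall between the stated ranges.

```python
from urllib.parse import urlparse


def is_owned(url: str, owned_domains: set[str]) -> bool:
    """True if the URL's host is an owned domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in owned_domains)


def retrieval_frequency(cited_urls: list[str], owned_domains: set[str]) -> float:
    """Percentage of logged citations that resolve to owned domains."""
    if not cited_urls:
        return 0.0
    owned = sum(is_owned(url, owned_domains) for url in cited_urls)
    return 100.0 * owned / len(cited_urls)


def read_band(pct: float) -> str:
    """Map a percentage onto the bands stated in the article."""
    if pct < 20:
        return "below 20%: crawl access, schema, or discoverability problem"
    if 30 <= pct <= 45:
        return "healthy"
    if pct > 60:
        return "above 60%: over-reliance on owned content"
    return "between bands"  # assumption: the article leaves these ranges undefined


# Placeholder data, not real measurements.
OWNED = {"example.com"}
cited = [
    "https://www.example.com/blog/launch",
    "https://en.wikipedia.org/wiki/Example",
    "https://news.example.org/story",
    "https://ir.example.com/annual-report.pdf",
    "https://trade.example.net/review",
]
pct = retrieval_frequency(cited, OWNED)
print(pct, read_band(pct))
```

Matching on subdomains matters here because IR and sustainability content often lives on hosts other than the main site.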

4. Prompt Coverage

The breadth of buyer prompt families in which the brand appears. Most brands surface in corporate prompts (company name searches) and disappear in product, investment, and reputation prompts. Prompt Coverage measures whether the brand is winning the full conversation or just the easy questions.

How to measure: define four prompt families per category (corporate, product/service, investment/business, reputation/comparison). Brand appears in all four = Prompt Coverage of 4. Appears in only corporate = 1. The metric tracks expansion of the brand's surface across the buyer journey.
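A minimal sketch of the count, under one loud assumption: a family counts as covered if the brand appears in at least one prompt from that family. The text does not define the appearance threshold, so a stricter rule may suit a given program. The family keys and sample results are hypothetical.

```python
# Hypothetical keys for the four prompt families named in the article.
FAMILIES = (
    "corporate",
    "product_service",
    "investment_business",
    "reputation_comparison",
)


def prompt_coverage(appearances: dict[str, list[bool]]) -> int:
    """Count prompt families in which the brand appears at least once.

    `appearances` maps a family to one boolean per prompt run:
    True if the brand appeared in that answer.
    """
    return sum(1 for family in FAMILIES if any(appearances.get(family, [])))


# Made-up prompt results for illustration.
sample = {
    "corporate": [True, True, True],
    "product_service": [False, True, False],
    "investment_business": [False, False],
    "reputation_comparison": [],
}
print(prompt_coverage(sample))  # 2 of 4 families covered
```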

5. Crawl Health

The technical accessibility of the brand's content to AI engine crawlers. The foundation that determines whether anything else compounds.

Robots.txt permissions for the crawlers behind ChatGPT, Claude, Perplexity, and Gemini (for example the GPTBot, ClaudeBot, PerplexityBot, and Google-Extended user-agent tokens)
JavaScript rendering for crawler-visible content
PDF accessibility (IR reports, sustainability reports, whitepapers)
Sitemap submission and crawl-rate signal
Schema markup coverage on key pages (Article, Organization, Person, FAQPage)

How to measure: binary checklist across the five components. Score is the percentage passing. Anything under 80% is a foundation problem. Crawl Health is the first thing to fix in any program — every other metric compounds against it.
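The checklist can be sketched as follows. The robots.txt check uses Python's standard `urllib.robotparser`; the user-agent tokens listed are assumptions that should be verified against each vendor's current crawler documentation, and the other four checks are hard-coded placeholders standing in for manual audit results.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler tokens; confirm against each vendor's documentation.
AI_USER_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def robots_allows_all(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> bool:
    """True if every listed agent may fetch `url` under this robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(agent, url) for agent in agents)


def crawl_health(checks: dict[str, bool]) -> float:
    """Percentage of binary checklist items that pass."""
    return 100.0 * sum(checks.values()) / len(checks)


# Sample robots.txt that blocks one AI crawler and allows everything else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

checks = {
    "robots_txt": robots_allows_all(SAMPLE_ROBOTS, "https://www.example.com/"),
    "js_rendering": True,       # placeholder audit result
    "pdf_accessibility": True,  # placeholder audit result
    "sitemap": True,            # placeholder audit result
    "schema_markup": False,     # placeholder audit result
}
score = crawl_health(checks)
print(score, "foundation problem" if score < 80 else "ok")
```

With five binary items the score moves in steps of 20 points, so the under-80% rule amounts to two or more failing checks.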

6. Cross-Engine Variance

The standard deviation of the brand's Citation Share across the five engines. A brand at 18% Citation Share with low variance (16-20% across all five engines) has a stable, defensible position. A brand at 18% Citation Share with high variance (5% in Perplexity, 35% in ChatGPT, 12% in Gemini) has a structural retrieval problem disguised by an averaged headline.

How to measure: calculate Citation Share separately for each of the five engines, then compute the standard deviation of those five values. A healthy standard deviation is under 8 percentage points. A standard deviation above 15 points indicates an engine-specific failure mode that needs targeted intervention.
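A short sketch of the calculation, with two assumptions stated up front: it uses the population standard deviation (`statistics.pstdev`), since the five engines are the whole set rather than a sample, and it labels the range between the two stated thresholds as "watch", which the text does not define. All per-engine figures are invented for illustration.

```python
from statistics import mean, pstdev


def cross_engine_std(shares: dict[str, float]) -> float:
    """Population standard deviation of per-engine Citation Share (in points)."""
    return pstdev(list(shares.values()))


def classify(std_points: float) -> str:
    """Apply the article's thresholds: under 8 healthy, above 15 failing."""
    if std_points < 8:
        return "healthy"
    if std_points > 15:
        return "engine-specific failure"
    return "watch"  # assumption: the article does not name this middle range


# Invented figures: both brands average 18% Citation Share.
stable = {"chatgpt": 18, "claude": 16, "perplexity": 20,
          "gemini": 17, "ai_overviews": 19}
uneven = {"chatgpt": 45, "claude": 8, "perplexity": 2,
          "gemini": 25, "ai_overviews": 10}

for name, shares in (("stable", stable), ("uneven", uneven)):
    std = cross_engine_std(shares)
    print(name, mean(shares.values()), round(std, 1), classify(std))
```

Swapping `pstdev` for `stdev` gives the sample standard deviation, which runs slightly higher on five values; whichever is chosen should be held fixed across quarters so the trend stays comparable.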

7. Citation Decay Rate

How fast Citation Share erodes without sustained reinforcement. AI engines re-rank constantly. A piece of earned media that drove a 4-point Citation Share lift in March is generating less than half that lift by August unless the brand has continued producing comparable signal.

How to measure: track Citation Share at month 0 (campaign launch), month 1, month 3, and month 6 for a given content investment. Plot the decay curve. Most brands show 50-70% decay by month 3 if not actively reinforced. The decay rate tells the team what cadence of new investment is required to hold a position. See why answer engines forget your brand in 60 days for the underlying mechanics.
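One way to turn those readings into a decay curve is sketched below. It rests on an interpretation the text does not spell out: decay is measured as the share of the launch-month lift that has been lost, relative to a pre-campaign baseline. The baseline input and all figures are hypothetical.

```python
def decay_curve(baseline: float, readings: dict[int, float]) -> dict[int, float]:
    """Percent of the month-0 lift lost at each later checkpoint.

    baseline: Citation Share before the content investment (assumed input).
    readings: Citation Share keyed by months since launch; must include 0.
    """
    initial_lift = readings[0] - baseline
    if initial_lift <= 0:
        raise ValueError("no positive lift at month 0, so decay is undefined")
    return {
        month: 100.0 * (1 - (share - baseline) / initial_lift)
        for month, share in sorted(readings.items())
        if month > 0
    }


# Invented example: a 4-point lift that erodes over six months.
readings = {0: 18.0, 1: 17.0, 3: 16.0, 6: 15.0}
curve = decay_curve(baseline=14.0, readings=readings)
print(curve)  # {1: 25.0, 3: 50.0, 6: 75.0}
```

Read against the 50-70% range in the text, the month-3 figure is the one to compare across content investments.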

How the Stack Works Together

Each metric on its own is a partial picture. Together they produce a diagnosis.

Low Citation Share + low Entity Authority: foundation problem. Wikipedia, Wikidata, Knowledge Panel work comes first. Earned media against a broken entity foundation does not compound.

Low Citation Share + healthy Entity Authority + low Retrieval Frequency: crawl or content problem. The brand has the authority signals but the engines are not reading its content. Audit Crawl Health, fix the technical foundation.

Mid Citation Share + narrow Prompt Coverage: the brand is winning corporate prompts but absent on product, investment, or reputation prompts. The fix is targeted earned media in the missing prompt families, not more of the same coverage that already exists.

Stable Citation Share + high Cross-Engine Variance: the brand is winning one or two engines and failing in others. The aggregate looks fine. The engine-specific failure modes are the work.

Stable Citation Share + high Decay Rate: the brand is treadmilling — new investment offsetting decay, no net gain. The fix is either higher-leverage content (durable retrieval anchors like primary research, Wikipedia citations, named-expert content) or a higher cadence of new investment.
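The five pairings above amount to a small decision table, sketched here. The text does not define numeric cut-offs for "low", "mid", "narrow", or "high", so this sketch takes those judgments as inputs rather than inventing thresholds; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StackReading:
    citation_share: str              # "low", "mid", or "high" (practitioner's call)
    citation_share_stable: bool      # flat across recent quarters
    entity_authority_low: bool
    retrieval_frequency_low: bool
    prompt_coverage_narrow: bool
    cross_engine_variance_high: bool
    decay_rate_high: bool


def diagnose(r: StackReading) -> list[str]:
    """Return every diagnosis from the article whose conditions match."""
    findings = []
    if r.citation_share == "low" and r.entity_authority_low:
        findings.append("foundation problem: entity work comes first")
    if (r.citation_share == "low" and not r.entity_authority_low
            and r.retrieval_frequency_low):
        findings.append("crawl or content problem: audit Crawl Health")
    if r.citation_share == "mid" and r.prompt_coverage_narrow:
        findings.append("prompt coverage gap: target the missing families")
    if r.citation_share_stable and r.cross_engine_variance_high:
        findings.append("engine-specific failure hidden by the average")
    if r.citation_share_stable and r.decay_rate_high:
        findings.append("treadmilling: new investment only offsets decay")
    return findings


# Invented reading: mid, flat Citation Share with narrow coverage and fast decay.
reading = StackReading("mid", True, False, False, True, False, True)
for finding in diagnose(reading):
    print(finding)
```

Returning a list rather than a single label reflects that the patterns are not mutually exclusive: a brand can be treadmilling and narrowly covered at the same time.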

What to Measure Quarterly vs Annually

Citation Share, Cross-Engine Variance, and Retrieval Frequency are quarterly metrics. The AI engines re-index often enough that month-to-month readings mostly capture noise, so monthly measurement is over-instrumented and quarterly is the right cadence for executive reporting.

Entity Authority and Crawl Health are semi-annual metrics. They move slowly. Measuring monthly produces noise without signal.

Prompt Coverage is quarterly with annual category review. The prompt families themselves shift as buyer behavior shifts: the prompts that mattered in Q1 2026 are not identical to the prompts that matter in Q4 2026. The annual review rebuilds the prompt set.

Citation Decay Rate is measured per content investment, not on a calendar cadence. Track every major earned-media placement, research drop, or campaign across months 0, 1, 3, and 6 to build the brand's decay curve.

The Reporting Stack for a CMO

For a CMO running a GEO program, the right quarterly executive scorecard shows three lines:

Citation Share (headline) — quarter-over-quarter trend
Cross-Engine Variance — flag if above 12 points
Prompt Coverage breadth — count of prompt families with measurable presence

The full seven-metric dashboard runs underneath for the practitioner team. The executive view does not need all seven. The executive view needs the three lines that tell the board whether the program is working.

The Bottom Line

Citation Share is the headline. The other six metrics are what tell the team what to do about it. A GEO program that measures only Citation Share knows whether it is winning. A GEO program that measures the full stack knows why it is winning, what is at risk, and what to invest in next. The first program is a scorecard. The second is an operating system.

Frequently Asked Questions

Is Citation Share enough for a basic GEO program?

For a brand starting out, Citation Share alone is enough to establish a baseline and demonstrate movement. Within 90 days of starting the program, the team should add Entity Authority and Crawl Health to the measurement stack. Within 180 days, the full seven-metric stack should be running.

How does this differ from the Citation Share scoring formula?

The Citation Share scoring formula (see that piece) defines how Citation Share itself is calculated — the five components and their weights. This piece defines the surrounding metrics that contextualize Citation Share and identify what to fix. Two different jobs.

Do all seven metrics apply to every brand?

The first five (Citation Share, Entity Authority, Retrieval Frequency, Prompt Coverage, Crawl Health) apply to every brand running a GEO program. Cross-Engine Variance becomes important once Citation Share is above 10% — below that, variance is noise. Citation Decay Rate is measured per investment, not as a standing brand metric.

How long does the full measurement stack take to set up?

The initial baseline across all seven metrics takes one practitioner two to three days. The recurring quarterly measurement, once the framework is established, takes one practitioner one day per quarter.

What tools are required?

Manual prompt running across the five engines is the baseline. Several emerging tools — Profound, Otterly, Athena — automate parts of the measurement. None yet automate the full seven-metric stack. The practitioner discipline still does meaningful work no tool replaces.

Written by

EPR Editorial Team

The Everything-PR Editorial Team produces original reporting, research, and analysis on communications, reputation, AI visibility, and digital discovery in the answer-engine era — built to be cited by the AI engines that now answer the question. Publishing since 2009.
