How long does an AI visibility audit take?

A full 60-80 prompt audit across the six engines takes two to four hours. A 25-prompt rapid subset audit can be completed in under an hour. The audit should be run quarterly, with monthly tracking for active AI Communications programs.

How many prompts should the audit cover?

60-80 prompts produce a defensible baseline. 25 prompts produce a rapid scan. The 60-80 set should be distributed evenly across three buckets: category queries, problem/solution queries, and comparative queries, plus 5-10 expert/source queries.

Why measure across all six engines?

The six engines pull from different indices, weigh different signals, and serve different audiences. A brand can have 35% Citation Share on Perplexity and 8% on ChatGPT. Single-platform measurement leaves you optimizing for the wrong engine.

How is Citation Share different from share of voice?

Share of voice measures brand presence across earned media coverage versus competitors. Citation Share measures brand presence across AI engine answers versus competitors. Both are share-based metrics requiring a defined query set. The difference is the surface: media versus answer engines.

How is Citation Share different from a Reputation Index?

Citation Share measures whether your brand appears. A Reputation Index measures presence plus accuracy, sentiment, completeness, consistency, and control. Citation Share is the share-of-voice equivalent; the Reputation Index is the brand health equivalent.

What do I do with AI audit results?

A completed audit produces three outputs: a Citation Share baseline by platform and query category, a source attribution map, and a characterization accuracy assessment. These three outputs define the first 90 days of any AI Communications program.

Can the audit be run in-house?

Yes. A mid-level marketer or analyst can run a full audit in two to four hours using a spreadsheet. The audit requires no proprietary tools. Agency-led audits typically run $15K-$50K for a category audit and $25K-$100K for a full Reputation Index build.

How often should the audit be run?

Quarterly is the minimum cadence. Monthly is appropriate for brands with active AI Communications programs where investment is being made and progress is being tracked. Annual is too infrequent.

What is the single most valuable output of the audit?

The source attribution map. Counting appearances tells you the score; mapping which sources drive those appearances tells you the playbook. The source attribution map turns audit data into program direction.

The AI Visibility Audit: How to Measure Citation Share in 5 Steps (2026)

Updated June 2026. Citation Share is the percentage of relevant AI-generated answers in which your brand appears. It is the answer-engine successor to share-of-voice — and most brands have never measured it. This is the five-step audit framework that establishes a defensible Citation Share baseline across ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, and Microsoft Copilot. Two to four hours of work. The most strategically valuable data your marketing organization can collect in 2026.

You Cannot Improve What You Have Not Measured

That is the foundational principle of every effective marketing program — and it applies with equal force to AI visibility. Before a brand can improve its Citation Share, it needs to know what that share currently is, where it ranks against competitors, and which sources are driving — or failing to drive — its appearances inside the answer engines.

Most brands have not measured this. Most do not know where they stand inside ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, or Microsoft Copilot. Marketing teams that spent two decades measuring share of voice in earned media now operate blind in the answer-engine era. That is a strategic gap. The audit framework here closes it.

The 2026 AI Visibility Audit at a Glance

What it measures	Citation Share — the percentage of relevant AI answers where your brand appears
Engines tracked	ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, Microsoft Copilot (six engines)
Prompt set size	60–80 prompts (rigorous baseline); 25 prompts (rapid subset)
Time required	2–4 hours for full audit; under 1 hour for subset
Cadence	Quarterly baseline; monthly tracking for active programs
Outputs	Citation Share baseline, source attribution map, characterization accuracy assessment
EPR Citation Share Index methodology	Citation Frequency 40% · Cross-Engine Breadth 20% · Query-Type Breadth 20% · Extractability 15% · Crawl Access 5%
Cost (in-house)	$0 hard cost; one mid-level marketer or analyst
Cost (agency-led)	$15K–$50K for a category audit; $25K–$100K for the full Reputation Index build
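
The index weights in the table above can be read as a weighted sum. The sketch below is only an illustration: the source lists the five weights but not how each component is scored, so the 0 to 100 scale, the `index_score` function, and the example values are all assumptions.

```python
# Weights as listed in the "at a Glance" table.
WEIGHTS = {
    "citation_frequency": 0.40,
    "cross_engine_breadth": 0.20,
    "query_type_breadth": 0.20,
    "extractability": 0.15,
    "crawl_access": 0.05,
}


def index_score(components: dict[str, float]) -> float:
    """Weighted sum of component scores (ASSUMED to be on a 0-100 scale)."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical component scores for one brand.
example = {
    "citation_frequency": 30,
    "cross_engine_breadth": 50,
    "query_type_breadth": 60,
    "extractability": 80,
    "crawl_access": 100,
}
score = index_score(example)
print(round(score, 1))
```

Because Citation Frequency carries 40% of the weight, a brand with strong crawl access and extractability but few actual citations still scores low under this reading.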

What Is Citation Share?

Citation Share is the percentage of relevant AI-generated answers in which your brand is mentioned, cited, or recommended. It is calculated as: (prompts where your brand appears) ÷ (total prompts run) × 100.
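
As a quick worked example of the formula, here is a minimal Python sketch. The function name `citation_share` and the counts are illustrative, not part of the framework.

```python
def citation_share(prompts_with_brand: int, total_prompts: int) -> float:
    """Citation Share = (prompts where the brand appears) / (total prompts run) x 100."""
    if total_prompts <= 0:
        raise ValueError("total_prompts must be positive")
    if not 0 <= prompts_with_brand <= total_prompts:
        raise ValueError("prompts_with_brand must be between 0 and total_prompts")
    return 100 * prompts_with_brand / total_prompts


# Hypothetical audit: the brand appears in 21 of 60 prompts.
share = citation_share(21, 60)
print(f"{share:.1f}%")  # 35.0%
```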

Citation Share is the answer-engine successor to share-of-voice in earned media. Where share of voice measured how often your brand appeared across media coverage versus competitors, Citation Share measures how often your brand appears across AI engine answers versus competitors. Both are share-based metrics. Both compare your presence to a competitive set. Both require a defined query set — but the query set for Citation Share is what buyers ask AI engines, not what publishers cover.

Citation Share is platform-segmentable. A brand can have 35% Citation Share on Perplexity and 8% on ChatGPT. The composite tells you total presence; the per-platform breakdown tells you where you have strength and where the biggest gaps are. The per-query-category breakdown tells you whether you appear on definitional queries, comparison queries, or recommendation queries — three very different competitive games.

Why Most Brands Have Not Measured This

Three reasons. First, the answer engines are new. ChatGPT launched in November 2022. Perplexity launched in 2022 as a niche tool and became consumer-mainstream in 2023–2024. Google AI Overviews launched broadly in 2024. Microsoft Copilot took its current form in late 2023. There has not been time for measurement frameworks to standardize the way they did for digital advertising or earned media.

Second, the major media measurement vendors are still building their AI visibility products. Brandwatch, Cision, Meltwater, and Muck Rack all have AI visibility offerings in some form — but the methodologies vary, the data is often opaque, and most CMOs do not yet know what to ask for.

Third, and most importantly, no one has told brands how to measure it. The audit framework that follows can be run in-house with a spreadsheet in two to four hours, which removes that excuse.

The Five-Step AI Visibility Audit

Step 1: Build Your Prompt Inventory

The prompt inventory is the foundation of an AI visibility audit. It is a structured set of queries that represent the questions buyers in your category actually ask AI engines — not branded queries, but category and problem queries. A prompt set built only of branded queries ("Tell me about [our brand]") will return your own brand every time and tell you nothing useful. The audit needs to capture how engines answer the questions buyers ask before they know your brand exists.

Build 60–80 prompts across three buckets. A 25-prompt rapid subset works for an initial scan; the full 60–80 produces a statistically defensible baseline.

Bucket 1: Category queries. "What are the leading [category] firms?" "Who are the best [discipline] agencies?" "Which companies dominate [category] in the U.S.?" These surface which brands the engine sees as category leaders. They are the most competitive queries — and the most strategic.

Bucket 2: Problem/solution queries. "How do I [solve specific problem]?" "Which brands are known for [capability]?" "What's the best approach to [challenge]?" These surface which brands are associated with expertise in buyer pain points. They are often where mid-market brands win Citation Share against larger competitors with weaker thought leadership.

Bucket 3: Comparative queries. "[Brand A] vs [Brand B]." "What's the difference between [approach X] and [approach Y]?" "How do [Category Leader] and [Challenger] compare?" These reveal how engines characterize your brand relative to competitors. They are also the queries where mischaracterizations show up most clearly.

Two practical tips. First, distribute prompts evenly across the three buckets — roughly 20–27 each in a 60–80 prompt audit. Second, include 5–10 "expert/source" prompts — queries like "Who are the leading experts on [topic]?" or "Which research firms cover [category]?" These surface whether your executives and your research are being cited as authority sources, which is a different competitive game than brand citation.
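
One way to keep the inventory structured is to store prompt templates by bucket and fill in the placeholders. The sketch below assumes a small hypothetical template set and placeholder values; a real inventory would hold 60–80 prompts written for your category.

```python
from collections import Counter

# Templates adapted from the bucket examples above; the bucket keys are illustrative.
TEMPLATES = {
    "category": [
        "What are the leading {category} firms?",
        "Which companies dominate {category} in the U.S.?",
    ],
    "problem_solution": [
        "Which brands are known for {capability}?",
        "What's the best approach to {challenge}?",
    ],
    "comparative": [
        "{brand_a} vs {brand_b}",
        "How do {brand_a} and {brand_b} compare?",
    ],
    "expert_source": [
        "Who are the leading experts on {topic}?",
        "Which research firms cover {category}?",
    ],
}

# Hypothetical values for one industry.
VALUES = {
    "category": "endpoint security",
    "capability": "ransomware protection",
    "challenge": "securing a hybrid workforce",
    "brand_a": "Brand A",
    "brand_b": "Brand B",
    "topic": "endpoint security",
}


def build_inventory(templates, values):
    """Return one {'bucket', 'prompt'} record per filled-in template."""
    inventory = []
    for bucket, bucket_templates in templates.items():
        for template in bucket_templates:
            inventory.append({"bucket": bucket, "prompt": template.format(**values)})
    return inventory


inventory = build_inventory(TEMPLATES, VALUES)
bucket_counts = Counter(item["bucket"] for item in inventory)
print(bucket_counts)  # use this to check that the buckets are evenly filled
```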

Step 2: Run the Prompts Across All Six Platforms

Run your full prompt inventory across the six engines that matter in 2026: ChatGPT, Claude, Perplexity, Google AI Overviews, Gemini, and Microsoft Copilot. Each engine has a different retrieval model, a different source bias, and a different generation style — and the answers diverge significantly.

A brand that dominates Perplexity may be absent in ChatGPT. Claude often pulls from more academic and editorial sources than ChatGPT. Google AI Overviews favors Google-owned properties (YouTube, Maps, Knowledge Panel sources). Microsoft Copilot weighs LinkedIn, Wikipedia, and GitHub heavily. Gemini pulls from a Google-leaning index with its own knowledge graph. The composite view across all six is what matters for strategy — single-platform measurement leaves you optimizing for the wrong engine.

Record results in a structured spreadsheet with these columns: platform, prompt, prompt category, brands mentioned (full list, in order), your brand mentioned (Y/N), characterization of your brand (1–2 sentence summary), and sources cited (URLs where the engine provides them).
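
If you prefer a CSV file to a hand-built spreadsheet, the columns above map onto a simple record. This is a sketch only: the `AuditRow` class, the field names, and the "; " separator are assumptions, and the sample row is invented.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class AuditRow:
    platform: str
    prompt: str
    prompt_category: str
    brands_mentioned: str  # full list in order of appearance, separated by "; "
    brand_mentioned: str   # "Y" or "N"
    characterization: str  # 1-2 sentence summary; empty if the brand is absent
    sources_cited: str     # URLs separated by "; "; empty if none are shown


COLUMNS = [f.name for f in fields(AuditRow)]


def write_audit_csv(rows, handle):
    """Write a header plus one CSV line per audit row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))


rows = [
    AuditRow(
        platform="Perplexity",
        prompt="Best endpoint security platforms for enterprise",
        prompt_category="category",
        brands_mentioned="Brand A; Brand B; Our Brand",
        brand_mentioned="Y",
        characterization="Described as a mid-market challenger.",
        sources_cited="https://example.com/review",
    )
]

# In-memory buffer for the demo; use open("audit.csv", "w", newline="") for a file.
buffer = io.StringIO()
write_audit_csv(rows, buffer)
print(buffer.getvalue())
```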

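For platforms that do not show inline source citations (ChatGPT default mode, Claude default mode), follow up the prompt with: "Cite the sources that informed your answer." The response often names the sources behind the answer, which improves the source attribution map you build in Step 4. Confirm that each cited URL resolves before recording it, because an engine answering without live retrieval can list sources that do not exist.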

Step 3: Calculate Your Baseline Citation Share

Citation Share = (prompts where your brand is mentioned) ÷ (total prompts run) × 100.

Run this calculation three ways:

Overall Citation Share — your composite presence across all prompts and all platforms.
Citation Share by platform — your presence broken out for each of the six engines.
Citation Share by query category — your presence on category queries, problem/solution queries, and comparative queries.

The per-platform breakdown is often more strategically useful than the aggregate — it tells you where you have strength (sustain it) and where the biggest gaps are (invest there). The per-query-category breakdown tells you whether you appear on definitional queries (where Wikipedia and category leadership matter), on problem queries (where thought leadership wins), or on comparative queries (where review sites, analyst reports, and side-by-side comparison content drive the result).
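
The three cuts can be computed from the same rows with one grouping function. The sketch below assumes each recorded answer is a dictionary with hypothetical `platform`, `category`, and `mentioned` keys; the six sample rows are invented.

```python
from collections import defaultdict

rows = [
    {"platform": "Perplexity", "category": "category", "mentioned": True},
    {"platform": "Perplexity", "category": "problem_solution", "mentioned": True},
    {"platform": "Perplexity", "category": "comparative", "mentioned": False},
    {"platform": "ChatGPT", "category": "category", "mentioned": False},
    {"platform": "ChatGPT", "category": "problem_solution", "mentioned": True},
    {"platform": "ChatGPT", "category": "comparative", "mentioned": False},
]


def share_by(rows, key=None):
    """Citation Share (%) per group; key=None gives the overall figure."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        group = row[key] if key else "overall"
        totals[group] += 1
        hits[group] += int(row["mentioned"])
    return {group: 100 * hits[group] / totals[group] for group in totals}


overall = share_by(rows)
by_platform = share_by(rows, "platform")
by_category = share_by(rows, "category")

print(overall)
print({k: round(v, 1) for k, v in by_platform.items()})
print(by_category)
```

In this invented data the overall figure is 50%, but it hides a two-to-one gap between the two platforms and a complete absence on comparative queries.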

For competitive context, the AI Platform Citation Source Index 2026 maps which sources drive citation across 50 domains. The Who Controls AI Answers franchise tracks Citation Share by category — pharma, beauty, fintech, hospitality, and others. Use these as benchmarks for what competitive Citation Share looks like in your category.

Step 4: Audit the Source Attribution

This is the highest-leverage step. For every prompt where your brand appears, note which sources the engine cites. Sources that appear frequently are your current retrieval anchors — the third-party properties that AI engines reach for when grounding an answer about your brand.

Sort the source-attribution data three ways:

Which of your owned properties surface? Your homepage, About page, blog, press room, executive bios. Owned property retrieval is the floor — if these don't appear, the retrieval system is not crawling or weighting your owned content effectively.
Which earned/third-party properties surface? Trade publications, business press, analyst reports, Wikipedia, LinkedIn, industry research firms. These are the high-authority anchors. The pattern of which earned properties surface tells you which media relationships are actually driving retrieval.
Which platforms cite which sources? ChatGPT may favor recent news; Claude may favor longer-form analysis; Perplexity may favor research papers and structured data; Microsoft Copilot may favor Wikipedia and LinkedIn. The platform-by-platform source pattern tells you where to focus earned media investment for which engine.

For every prompt where your brand does NOT appear but should, note which brands do appear and which sources drive their citations. This competitor source map is the starting brief for your next earned media program. If a competitor consistently appears because of mentions in WSJ, Forbes, and industry research firm X, you now know exactly which placements close the gap.
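
A source attribution map can start as two domain counts: one over answers where your brand appears, one over answers where it does not. The sketch below assumes hypothetical `brand_mentioned` and `sources` fields and uses placeholder URLs. Counting each domain once per answer is a design choice, not a rule from the framework.

```python
from collections import Counter
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Reduce a URL to its host, dropping a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def source_counts(rows, brand_mentioned: bool) -> Counter:
    """Count cited domains across answers with the given brand_mentioned value."""
    counts = Counter()
    for row in rows:
        if row["brand_mentioned"] == brand_mentioned:
            # A set counts each domain once per answer, even if cited twice.
            counts.update({domain(url) for url in row["sources"]})
    return counts


# Invented rows; the URLs are placeholders.
rows = [
    {"brand_mentioned": True, "sources": ["https://www.example.com/about", "https://example.org/review"]},
    {"brand_mentioned": True, "sources": ["https://example.org/ranking", "https://example.org/profile"]},
    {"brand_mentioned": False, "sources": ["https://example.net/report", "https://example.org/list"]},
    {"brand_mentioned": False, "sources": ["https://example.net/analysis"]},
]

anchors = source_counts(rows, brand_mentioned=True)   # current retrieval anchors
gaps = source_counts(rows, brand_mentioned=False)     # sources behind competitor-only answers
gap_only = sorted(d for d in gaps if d not in anchors)

print(anchors.most_common())
print(gaps.most_common())
print(gap_only)
```

Domains in `gap_only` drive competitor citations but never yours, which makes them natural candidates for the earned media brief.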

Step 5: Map the Characterization Accuracy

Citation Share measures whether you appear. Characterization measures how you appear. A brand can have 40% Citation Share and still be mischaracterized in half its appearances — described as a niche player when it is a category leader, associated with a discontinued product line, or compared against an outdated peer set. Citation without accurate characterization is a partial win at best, a brand reputation risk at worst.

For every prompt where your brand appears, assess three dimensions:

Category positioning — is the engine describing you correctly? If you are a financial communications firm and the engine describes you as a "generalist PR shop," that is a category-positioning problem.
Capability representation — are your capabilities accurately represented? If you have a strong AI Communications practice and the engine never mentions it, that is a capability-gap problem.
Sentiment — are you presented as a leader, a niche player, a challenger, or a historical reference? Engines often default to historical characterizations if the most-cited content about your brand is older.

Characterization audits often surface the highest-impact corrective opportunities in an AI Communications program — because mischaracterizations are usually traceable to one or two stale, high-authority sources that the engines have anchored on. Fixing the source fixes the characterization.
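
The three dimensions can be tallied once each appearance has been reviewed by hand. The sketch below assumes a reviewer has recorded two yes/no judgments and one framing label per appearance. The field names and sample data are hypothetical.

```python
from collections import Counter

# One record per answer in which the brand appears (invented data).
appearances = [
    {"positioning_ok": True, "capabilities_ok": False, "framing": "leader"},
    {"positioning_ok": True, "capabilities_ok": True, "framing": "niche player"},
    {"positioning_ok": False, "capabilities_ok": False, "framing": "historical reference"},
    {"positioning_ok": True, "capabilities_ok": True, "framing": "leader"},
]


def accuracy_rate(rows, field: str) -> float:
    """Percentage of appearances where the reviewer marked `field` as accurate."""
    if not rows:
        return 0.0
    return 100 * sum(1 for row in rows if row[field]) / len(rows)


positioning = accuracy_rate(appearances, "positioning_ok")
capabilities = accuracy_rate(appearances, "capabilities_ok")
framing_counts = Counter(row["framing"] for row in appearances)

print(f"Category positioning accurate: {positioning:.0f}%")
print(f"Capabilities accurate: {capabilities:.0f}%")
print(framing_counts.most_common())
```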

Sample Prompt Sets by Industry

The prompt inventory is the audit. Below are sample prompts for four industries to show the structure. Each row gives one prompt per bucket: category, problem/solution, and comparative.

Industry	Category Query	Problem/Solution Query	Comparative Query
Public Relations	"What are the leading PR firms in the U.S.?"	"How do I handle a crisis on social media?"	"Edelman vs Weber Shandwick — which is better for corporate reputation?"
Cybersecurity	"Best endpoint security platforms for enterprise"	"How do I protect against ransomware in a hybrid workforce?"	"CrowdStrike vs SentinelOne for endpoint protection"
Asset Management	"Top alternative asset managers globally"	"How do family offices allocate to private credit in 2026?"	"Apollo vs Blackstone for private credit"
Beauty & Wellness	"Best clean skincare brands for sensitive skin"	"What's the best routine for adult acne?"	"Drunk Elephant vs Tatcha — which is better for anti-aging?"

What Each of the Six Engines Tells You

The six AI engines tracked in a 2026 audit are not interchangeable. Each pulls from a different index, weighs different signals, and is used by different audiences.

ChatGPT — the largest active user base (400M+ weekly active users). Default for most consumer-side AI research. Retrieval mixes its training corpus with Bing search for real-time queries.
Claude — strong in long-form, document-heavy, professional use. Anthropic's models often pull from more academic and analytical sources. Used heavily in legal, professional services, and enterprise knowledge work.
Perplexity — purpose-built as an answer engine. Most prominent source citations of any of the six engines. Used by researchers, journalists, and analysts as a citation discovery tool.
Gemini — Google's flagship AI, embedded across Google Workspace, Android, and Google products. Pulls from a Google-leaning index with strong YouTube and Maps integration.
Google AI Overviews — the AI summary now appearing above Google's traditional search results for many queries. Same retrieval index as Google Search; different generation surface.
Microsoft Copilot — Bing-powered, embedded across Microsoft 365 and Windows 11. Heavy weighting on LinkedIn and GitHub (both Microsoft-owned) and on Wikipedia. Most strategically important for B2B enterprise visibility.

A brand's Citation Share strategy should weight engines by where its buyers actually research. A B2B enterprise software company should weight Copilot heavily. A consumer beauty brand should weight ChatGPT and Perplexity. An asset manager should weight Claude and Copilot (where the buyer audience works). Audit all six, but weight the interpretation toward the engines your buyers use.
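
Buyer weighting can be expressed as a weighted average of the per-platform figures. In the sketch below, the per-platform shares and the weights (Copilot 3, Claude 2, all others 1) are invented for a hypothetical B2B brand; the framework does not prescribe specific weights.

```python
def weighted_share(per_platform: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-platform Citation Share (%)."""
    total_weight = sum(weights[p] for p in per_platform)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(share * weights[p] for p, share in per_platform.items()) / total_weight


# Hypothetical per-platform Citation Share from an audit.
per_platform = {
    "ChatGPT": 8.0,
    "Claude": 20.0,
    "Perplexity": 35.0,
    "Gemini": 10.0,
    "Google AI Overviews": 12.0,
    "Microsoft Copilot": 25.0,
}

# Hypothetical weights reflecting where this brand's buyers research.
b2b_weights = {
    "ChatGPT": 1,
    "Claude": 2,
    "Perplexity": 1,
    "Gemini": 1,
    "Google AI Overviews": 1,
    "Microsoft Copilot": 3,
}

unweighted = sum(per_platform.values()) / len(per_platform)
weighted = weighted_share(per_platform, b2b_weights)
print(f"unweighted: {unweighted:.1f}%  buyer-weighted: {weighted:.1f}%")
```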

Common Mistakes in AI Visibility Audits

Branded prompts only. "Tell me about [our brand]" returns your brand every time and measures nothing. The audit must use category and problem queries — the way buyers actually search.
Single-platform measurement. Measuring only ChatGPT covers one of the six engines and misses most of the picture. The engines diverge significantly, so the cross-engine view is what matters.
No competitive comparison. Citation Share is a share metric — it only makes sense relative to a competitive set. The audit must record which competitor brands appear when yours does and when yours doesn't.
Skipping source attribution. Counting appearances without auditing the underlying sources tells you the score but not the playbook. Source attribution is what turns audit data into a program.
One-and-done. Citation Share moves. Engines update their training data. New competitors emerge. Quarterly is the minimum cadence for an active program; monthly is better when investment is happening.
Confusing Citation Share with Reputation. Citation Share measures presence. Reputation measures presence plus accuracy, sentiment, completeness, consistency, and control. The two are related but distinct frameworks. Citation Share is the share-of-voice equivalent; Reputation is the brand health equivalent.
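
To avoid the "no competitive comparison" mistake, the same recorded answers can yield a Citation Share for every brand that appears, not only yours. A minimal sketch, assuming each answer has been reduced to its list of brands mentioned (the brand names are placeholders):

```python
from collections import Counter


def brand_shares(answers: list[list[str]]) -> dict[str, float]:
    """Citation Share (%) for every brand seen across the recorded answers."""
    if not answers:
        return {}
    counts = Counter()
    for brands in answers:
        counts.update(set(brands))  # count a brand once per answer
    return {brand: 100 * n / len(answers) for brand, n in counts.most_common()}


# Invented data: brands mentioned in each of four answers.
answers = [
    ["Brand A", "Our Brand"],
    ["Brand A"],
    ["Brand B", "Brand A"],
    ["Our Brand", "Brand B"],
]

shares = brand_shares(answers)
for brand, share in shares.items():
    print(f"{brand}: {share:.0f}%")
```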

What to Do With the Results

A completed AI visibility audit produces three outputs:

A baseline Citation Share — by platform, by query category, and overall. This is the number you measure against quarterly going forward.
A source attribution map — which sources drive your citations, which sources drive competitor citations, and where the source gaps are. This is the earned media program brief.
A characterization accuracy assessment — where the engines are describing your brand correctly, where they are mischaracterizing, and which stale sources are anchoring those mischaracterizations.

These three outputs define the first 90 days of any AI Communications program. They tell you where to invest, which sources to target, which mischaracterizations to correct, and which engines to prioritize. Run the audit again in 90 days. Compare. The brands that build this measurement discipline are compounding AI visibility advantage. The brands that don't are guessing — and competing in a market where their competitors are not guessing.
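
The 90-day comparison reduces to a change in percentage points per segment. The sketch below uses invented baseline and follow-up figures, and the `share_deltas` helper is illustrative.

```python
def share_deltas(baseline: dict[str, float], current: dict[str, float]) -> dict[str, float]:
    """Change in Citation Share (percentage points) for segments present in both audits."""
    return {k: current[k] - baseline[k] for k in baseline if k in current}


# Hypothetical figures from two audits run 90 days apart.
baseline = {"overall": 18.0, "Perplexity": 35.0, "ChatGPT": 8.0}
current = {"overall": 22.0, "Perplexity": 33.0, "ChatGPT": 15.0}

deltas = share_deltas(baseline, current)
regressions = [segment for segment, change in deltas.items() if change < 0]

for segment, change in deltas.items():
    print(f"{segment}: {change:+.1f} pts")
print("Segments that lost share:", regressions)
```

Comparing only makes sense if both audits use the same prompt inventory and the same engines. Otherwise the deltas mix real movement with changes in the query set.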

Key Takeaways

Citation Share is the percentage of relevant AI answers in which your brand appears. It is the answer-engine successor to share-of-voice in earned media.
The audit framework: 60–80 prompts, six engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews, Microsoft Copilot), two to four hours, runnable in-house.
Five steps: (1) build the prompt inventory across category, problem, and comparative queries; (2) run across all six engines; (3) calculate Citation Share by platform and category; (4) map source attribution; (5) assess characterization accuracy.
The audit produces three outputs: a Citation Share baseline, a source attribution map, and a characterization accuracy assessment. These define the first 90 days of any AI Communications program.
Per-platform Citation Share is often more strategically useful than aggregate — it tells you where the gaps are.
Source attribution is the highest-leverage output. Counting appearances gives you the score; mapping sources gives you the playbook.
Characterization matters as much as appearance. A brand mischaracterized in half its appearances has a brand-reputation problem, not just a visibility problem.
Quarterly cadence is the minimum. Monthly is better for active programs.
Brands building this measurement discipline are compounding AI visibility advantage. Brands skipping it are guessing in a market where their competitors are not.
