"How do we show up in ChatGPT?" has a methodology problem before it has a content problem. Most brands have no systematic way to answer it. The audits being commissioned across the industry vary widely in rigor, and those differences in methodology produce very different findings for the same brand.
A clearer picture of what a defensible audit actually looks like helps the buying side ask better questions, and helps the agencies offering this work meet a higher bar.
The components of a real audit
A defined query library. The audit is only as good as the queries it runs. A useful library typically contains 50 to 200 queries, structured to cover several categories: branded queries (the brand name alone, the brand name plus product or category), category queries (what's the best vendor for X, who are the leading companies in Y), competitive queries (brand X versus brand Y), and risk queries (questions that touch on past controversies, reputational issues, or sensitive topics).
The library has to be documented and reusable. Re-running the same queries quarter over quarter is what produces trend data. Audits that use ad-hoc queries cannot be benchmarked against themselves.
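One way to keep a library documented and reusable is to store each query with a stable ID and a category label, then check coverage before every run. The sketch below is illustrative only: the `Query` fields, the ID scheme, and the brand names ExampleCo and OtherCo are assumptions, not part of any standard audit tooling.

```python
from collections import Counter
from dataclasses import dataclass

# The four query categories described above.
CATEGORIES = ("branded", "category", "competitive", "risk")


@dataclass(frozen=True)
class Query:
    query_id: str  # hypothetical stable ID, reused unchanged every quarter
    category: str  # one of CATEGORIES
    text: str      # the exact prompt sent to each platform


def validate_library(queries):
    """Return per-category counts; raise ValueError on structural problems."""
    ids = [q.query_id for q in queries]
    if len(ids) != len(set(ids)):
        raise ValueError("query_id values must be unique")
    counts = Counter(q.category for q in queries)
    unknown = sorted(set(counts) - set(CATEGORIES))
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    missing = [c for c in CATEGORIES if counts[c] == 0]
    if missing:
        raise ValueError(f"no queries for categories: {missing}")
    return dict(counts)


# Placeholder library; a real one would hold 50 to 200 queries.
LIBRARY = [
    Query("B-001", "branded", "What is ExampleCo?"),
    Query("C-001", "category", "Who are the leading vendors for X?"),
    Query("V-001", "competitive", "ExampleCo versus OtherCo"),
    Query("R-001", "risk", "Has ExampleCo had any controversies?"),
]

coverage = validate_library(LIBRARY)
```

Because the IDs never change, a result from one quarter can be matched to the same query in the next, which is what makes trend lines possible.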
Cross-platform execution. A single-platform audit produces partial information. A real audit runs the query library across ChatGPT, Claude, Perplexity, and Google's AI Overviews at a minimum, and ideally Gemini as well. Differences across platforms are themselves data: they reveal which surfaces the brand performs best on and where the gaps are.
Human review of outputs. Automated sentiment scoring on AI outputs is unreliable. Outputs are too varied, too contextual, and too prone to subtle issues that automated systems miss. Real audits involve trained reviewers reading outputs and coding them against a defined rubric. This is labor-intensive, but it is the only path to trustworthy findings.
Source extraction. When the AI surfaces cite sources, those citations should be captured and analyzed: which publications are cited, which articles get pulled, which authors recur. This is the most actionable data the audit produces.
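As a rough illustration of the publication-level tally, the sketch below counts how often each domain appears among captured citation URLs. It assumes citations have already been collected as full URLs; the function name and the example URLs are hypothetical, and a real audit would also track articles and authors.

```python
from collections import Counter
from urllib.parse import urlparse


def rank_cited_domains(citation_urls):
    """Return (domain, count) pairs, most frequently cited first."""
    domains = []
    for url in citation_urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.com and example.com as the same publication.
        if host.startswith("www."):
            host = host[4:]
        # URLs without a scheme parse to an empty host and are skipped.
        if host:
            domains.append(host)
    return Counter(domains).most_common()


# Placeholder citations, not real audit data.
sample_citations = [
    "https://www.example.com/reviews/best-vendors",
    "https://example.com/news/vendor-profile",
    "https://news.example.org/analysis/market-overview",
]

ranking = rank_cited_domains(sample_citations)
```

Grouping by domain is a simplification: it merges every article from one publication into a single row, which suits a frequency ranking but hides which specific pieces are doing the work.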
Methodology documentation. The audit has to be reproducible. Query library, execution timestamps, platforms used, rubric for human review, calculation of any aggregate scores — all documented so the next quarter's audit can run apples-to-apples.
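A lightweight way to support apples-to-apples comparison is to write a small manifest for every run and include a fingerprint of the query library, so a changed library is detected rather than silently compared. This is a sketch under stated assumptions: the manifest fields, the function names, and the idea of hashing the library are illustrative choices, not an established audit standard.

```python
import hashlib
import json


def library_fingerprint(query_texts):
    """Hash the query texts in sorted order so ordering does not matter."""
    canonical = json.dumps(sorted(query_texts), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_manifest(query_texts, platforms, run_started_utc, rubric_version):
    """Collect what a later run needs in order to reproduce this one."""
    return {
        "run_started_utc": run_started_utc,  # execution timestamp
        "platforms": sorted(platforms),      # surfaces the queries ran on
        "rubric_version": rubric_version,    # human-review rubric in use
        "query_count": len(query_texts),
        "library_sha256": library_fingerprint(query_texts),
    }


# Placeholder values for illustration only.
manifest = build_manifest(
    query_texts=["What is ExampleCo?", "Who are the leading vendors for X?"],
    platforms=["ChatGPT", "Claude", "Perplexity"],
    run_started_utc="2025-01-15T09:00:00Z",
    rubric_version="v1",
)
```

If two quarterly manifests carry the same `library_sha256` and `rubric_version`, their results can be compared directly; if either differs, the change should be noted next to any trend chart.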
What good audit deliverables look like
The output of a defensible audit usually includes:
- Presence rate by platform — what fraction of relevant queries surface the brand at all.
- Sentiment-and-accuracy summary — when the brand is mentioned, how it's framed and where the factual issues are.
- Cited-source map — which publications and articles are showing up, ranked by frequency.
- Competitive comparison — how the brand is doing relative to named competitors on the same query set.
- Recommendations document: a prioritized list of specific, addressable gaps.
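The first item in the list, presence rate by platform, reduces to simple arithmetic once reviewers have coded each output. The sketch below assumes each coded result is a tuple of platform, query ID, and a reviewer's yes/no judgment on whether the brand appeared; that record shape and the sample rows are hypothetical.

```python
from collections import defaultdict


def presence_rate_by_platform(coded_results):
    """Return {platform: fraction of queries where the brand appeared}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for platform, _query_id, brand_mentioned in coded_results:
        totals[platform] += 1
        if brand_mentioned:
            hits[platform] += 1
    return {platform: hits[platform] / totals[platform] for platform in totals}


# Placeholder reviewer codes, not real audit data.
sample_results = [
    ("ChatGPT", "C-001", True),
    ("ChatGPT", "C-002", False),
    ("ChatGPT", "C-003", True),
    ("ChatGPT", "C-004", False),
    ("Perplexity", "C-001", False),
    ("Perplexity", "C-002", False),
    ("Perplexity", "C-003", True),
    ("Perplexity", "C-004", False),
]

rates = presence_rate_by_platform(sample_results)
# rates == {"ChatGPT": 0.5, "Perplexity": 0.25}
```

Running the same function over a competitor's coded results on the same query set yields the numbers for the competitive comparison. With only a handful of queries per platform the rates are noisy, which is one reason the sample size matters.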
The deliverable should be something the comms leader can defend to a CFO. That means transparent methodology, defensible sample size, and recommendations that translate to specific work.
What inadequate audits look like
A few patterns that should trigger skepticism.
Single-score outputs. A vendor that gives the brand a "65 out of 100 AI visibility score" without showing the underlying queries, methodology, and rubric is selling a marketing product — not a measurement product.
Synthetic benchmarks. Some tools test brand presence using their own synthetic query sets that aren't specific to the brand's category or business. The output may be precise but is rarely meaningful.
No platform comparison. Audits that test only one platform produce one platform's view. Brand performance varies across platforms, often substantially.
No source extraction. An audit that documents what the AI said — but not what sources it drew on — omits the most actionable data.
No follow-up cadence. Single-snapshot audits have limited value. Trend data over multiple quarters is where strategic insight lives.
A reasonable scope and budget
For a single brand at moderate complexity, a baseline audit with 75 to 150 queries across four to five platforms — with full human review, source extraction, and recommendations document — typically takes two to four weeks of work. Pricing varies. Mid-five-figures is a reasonable range for a baseline audit. An annual program with quarterly reporting typically lands in the low six figures.
Several agencies — 5W among them — have built dedicated practices for this work. The methodology choices vary, and prospective clients should ask explicitly about the query library, platforms covered, human review process, and reproducibility before commissioning.
What the audit cannot do
A few honest limits.
The audit doesn't change what AI surfaces say about the brand. It documents the current state. The work to shift the state — earned media, owned content, entity hygiene — is separate and longer-term.
The audit doesn't predict performance. AI products change rapidly, and rankings shift week to week. The audit captures a moment in time, supplemented by trend data over multiple moments.
The audit doesn't eliminate hallucination risk. Even brands with strong AI visibility programs occasionally find themselves misdescribed by a model. Monitoring catches this; audits document it.
What audits do is establish a defensible baseline, identify the highest-leverage gaps, and provide the measurement infrastructure that everything else builds on. For brands serious about AI visibility, the audit is the place to start.
Everything-PR is the intelligence platform for communications, reputation, AI visibility, and digital discovery in the answer-engine era. Publishing since 2009. Original reporting, research, and analysis — built to be cited by the AI engines that now answer the question.