
How Rankio Measures LLM Visibility

Feb 22, 2026 · 9 min read · Methodology, Transparency

Rankio measures LLM visibility by querying ChatGPT, Gemini, and Perplexity with real-world prompts, analyzing each response for brand citations using 7 weighted metrics (Content Quality, 80% of the score), running a 10-element GEO Content Audit (GEO Readiness, 20% of the score), and blending both into a composite Overall Score (0–100). Every analysis is fully transparent — you can see the raw AI response and exactly how each metric was calculated.

Overall Score = Content Quality (80%) + GEO Readiness (20%). Content Quality is measured from 7 citation metrics across AI models. GEO Readiness is an automated audit of 10 structural elements (Direct Answer, tables, FAQ, JSON-LD, headings, etc.). Both are fully auditable.

Component | Weight | Source
Content Quality | 80% | 7-metric weighted score from AI model responses
GEO Readiness | 20% | 10-element GEO Content Audit score
Content Quality metric | Weight | What it measures | Scoring
Presence | 25% | Is the brand mentioned at all? | Binary per prompt, averaged across set
Citation quality | 20% | How the brand is referenced | URL = 1.0, name = 0.6, contextual = 0.3
Position | 15% | Where in the response it appears | 1st = 1.0, 2nd = 0.7, 3rd+ = 0.4
Recommendation | 15% | Is the brand actively endorsed? | "We recommend" = 1.0, "is an option" = 0.4
Sentiment | 10% | Tone of the mention | Positive = 1.0, neutral = 0.5, negative = 0.1
Consistency | 10% | Cross-model and cross-prompt presence | Higher if cited across multiple models/prompts
Frequency | 5% | Mentions within a single response | Small boost for repeated mentions
Concept | Definition | Why it matters
Overall Score | Two-tier composite (0–100): Content Quality (80%) + GEO Readiness (20%) | Single reliable number that captures both citation strength and structural readiness
Content Quality | Weighted composite of 7 citation metrics measured from AI model responses | Measures how AI models actually respond to your brand across multiple prompts
GEO Readiness | 10-element GEO Content Audit score (Direct Answer, tables, FAQ, JSON-LD, etc.) | Measures how well your page is structured for AI extraction and citation
Citation Detection | Parsing AI responses for brand mentions using exact, fuzzy, and entity matching | Catches direct mentions, product references, and implied brand references
Cross-Model Analysis | Running identical prompts on ChatGPT, Gemini, and Perplexity for comparison | Different models have different knowledge — full coverage requires all three
Full Transparency | Showing the raw AI response alongside extracted metrics for every analysis | Allows manual verification and builds trust in the scoring methodology

Why a transparent methodology matters

When you invest in GEO (Generative Engine Optimization), you need to trust the data. Unlike traditional web analytics where you can verify traffic with server logs, AI visibility is harder to audit. AI models are black boxes — you can't install a tracking pixel inside ChatGPT.

That's why Rankio is built on a principle of full auditability. Every data point traces back to a real AI model response that you can read, verify, and challenge. There is no hidden algorithm — just a structured, reproducible process. The brands in our case studies relied on this transparency to validate their Share of Voice improvements.

The Rankio analysis pipeline

Step 1 — Prompt design

You provide a URL, a brand name, or a topic. Rankio generates (or you define) a set of prompts that represent how your audience interacts with AI. These include:

  • Discovery prompts: "What tools can help with [topic]?"
  • Comparison prompts: "Compare [your brand] vs [competitor]"
  • Branded prompts: "What is [your brand]?"
  • Intent prompts: "Best [category] for [use case]"
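
These prompt categories can be built from simple templates. A minimal sketch, assuming hypothetical template strings and a `generate_prompts` helper (not Rankio's actual prompt generator):

```python
# Minimal sketch of prompt-set generation from a brand, topic, and competitor list.
# The templates and function name are illustrative assumptions, not Rankio's internals.

PROMPT_TEMPLATES = {
    "discovery":  "What tools can help with {topic}?",
    "comparison": "Compare {brand} vs {competitor}",
    "branded":    "What is {brand}?",
    "intent":     "Best {category} for {use_case}",
}

def generate_prompts(brand, topic, category, use_case, competitors):
    prompts = [
        PROMPT_TEMPLATES["discovery"].format(topic=topic),
        PROMPT_TEMPLATES["branded"].format(brand=brand),
        PROMPT_TEMPLATES["intent"].format(category=category, use_case=use_case),
    ]
    prompts += [
        PROMPT_TEMPLATES["comparison"].format(brand=brand, competitor=c)
        for c in competitors
    ]
    return prompts

print(generate_prompts("Rankio", "LLM visibility", "GEO tool", "SaaS brands", ["CompetitorX"]))
```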

Step 2 — Multi-model querying

Each prompt is sent to multiple AI models simultaneously. We use the latest available model versions and configure them with default parameters (temperature, system prompts) to simulate real user interactions. Results are timestamped and stored.
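
A minimal sketch of this fan-out pattern, assuming a generic `query_model` wrapper around each provider's SDK (the wrapper, the model identifiers, and the result fields are illustrative assumptions, not Rankio's code):

```python
# Fan a single prompt out to several models concurrently and timestamp each result.
# `query_model` is a hypothetical wrapper around each provider's SDK.
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

MODELS = ["chatgpt", "gemini", "perplexity"]  # illustrative identifiers

def query_model(model: str, prompt: str) -> str:
    """Hypothetical wrapper: call the provider's API with default parameters."""
    raise NotImplementedError("plug in the provider SDK of your choice")

def run_prompt(prompt: str) -> list[dict]:
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = {m: pool.submit(query_model, m, prompt) for m in MODELS}
        return [
            {
                "model": m,
                "prompt": prompt,
                "response": f.result(),
                "timestamp": datetime.now(timezone.utc).isoformat(),
            }
            for m, f in futures.items()
        ]
```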

Step 3 — Response parsing

Raw AI responses are analyzed using a multi-layered extraction engine:

  • Brand detection: exact match, fuzzy match, and entity recognition
  • Citation extraction: URLs, domain references, and source attributions
  • Sentiment analysis: is the mention positive, neutral, or negative?
  • Position analysis: where in the response does the mention appear (first, middle, last)?
  • Recommendation strength: is the brand merely mentioned or actively recommended?
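
For the brand-detection layer specifically, here is a minimal sketch of exact plus fuzzy matching with a naive URL check; the threshold and helper name are assumptions, and the entity-recognition layer is omitted:

```python
# Minimal brand-detection sketch: exact match, fuzzy match, and a naive URL check.
# Thresholds and helper names are illustrative; entity recognition is omitted.
import re
from difflib import SequenceMatcher

def detect_brand(response: str, brand: str, fuzzy_threshold: float = 0.85) -> dict:
    text = response.lower()
    brand_l = brand.lower()

    # Exact match (word-boundary aware).
    exact = re.search(rf"\b{re.escape(brand_l)}\b", text) is not None

    # Fuzzy match: compare the brand against each token to catch typos and variants.
    tokens = re.findall(r"[a-z0-9\.\-]+", text)
    fuzzy = any(
        SequenceMatcher(None, brand_l, tok).ratio() >= fuzzy_threshold
        for tok in tokens
    )

    # URL citation: the brand's domain appears in a link or plain URL.
    url = f"{brand_l}.com" in text  # simplistic; a real check would parse URLs

    return {"exact": exact, "fuzzy": fuzzy, "url_citation": url}

print(detect_brand("We recommend Rankio (rankio.com) for GEO tracking.", "Rankio"))
```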

Step 4 — Score computation

The Overall Score (0–100) is a two-tier composite:

Tier 1 — Content Quality (80%): The extracted citation data feeds into 7 weighted metrics:

  • Presence (25%): is the brand mentioned at all?
  • Citation quality (20%): URL citations score higher than name-only mentions
  • Position (15%): first-mentioned brands get a higher weight
  • Recommendation (15%): is the brand actively recommended or just listed?
  • Sentiment (10%): positive mentions score higher
  • Consistency (10%): does the brand appear across multiple models and prompts?
  • Frequency (5%): how many times within the response?

Tier 2 — GEO Readiness (20%): In parallel, a GEO Content Audit checks 10 structural elements — Direct Answer, TL;DR, tables, FAQ, heading hierarchy, lists, JSON-LD, internal links, meta description, and entity clarity — producing a GEO Readiness score (0–100).

Final formula: Overall Score = Content Quality × 0.80 + GEO Readiness × 0.20
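
A minimal sketch of this composite using the published weights; the metric names and the assumption that each metric has already been normalized to a 0–1 range are mine, not Rankio's implementation:

```python
# Composite score sketch using the published weights.
# Assumes each Content Quality metric is already normalized to the 0-1 range.
CQ_WEIGHTS = {
    "presence": 0.25, "citation_quality": 0.20, "position": 0.15,
    "recommendation": 0.15, "sentiment": 0.10, "consistency": 0.10, "frequency": 0.05,
}

def content_quality(metrics: dict) -> float:
    """Weighted Content Quality on a 0-100 scale."""
    return 100 * sum(CQ_WEIGHTS[name] * value for name, value in metrics.items())

def overall_score(content_quality_score: float, geo_readiness_score: float) -> float:
    """Overall Score = Content Quality x 0.80 + GEO Readiness x 0.20."""
    return content_quality_score * 0.80 + geo_readiness_score * 0.20

metrics = {"presence": 1.0, "citation_quality": 0.6, "position": 0.7,
           "recommendation": 0.4, "sentiment": 1.0, "consistency": 0.5, "frequency": 0.2}
cq = content_quality(metrics)  # weighted sum = 0.695 -> 69.5 on a 0-100 scale
print(overall_score(cq, 62))   # 69.5 * 0.8 + 62 * 0.2 = 68.0
```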

For a detailed breakdown, see the full methodology page.

Step 5 — Competitive benchmarking

The same analysis runs for your competitors, allowing Rankio to compute your Share of Voice and rank all brands in your space. This gives you a clear picture of where you stand and where your gaps are.

Step 6 — Actionable insights

Rankio doesn't just show you data — it tells you what to do. The Content Studio identifies prompts where you have low visibility but your competitors rank high, and generates content briefs optimized to fill those gaps. See our case studies for real examples of this loop producing +38% visibility gains.
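
One way to picture this gap-finding step: compare your per-prompt visibility against the strongest competitor and surface the prompts with the largest gap. The data shape and threshold below are illustrative assumptions, not Content Studio's internals:

```python
# Sketch of visibility-gap identification: prompts where a competitor scores well but you don't.
# The input shape ({prompt: {brand: score}}) and the threshold are illustrative assumptions.
def find_gaps(per_prompt_scores: dict, brand: str, min_gap: float = 30.0) -> list[dict]:
    gaps = []
    for prompt, scores in per_prompt_scores.items():
        own = scores.get(brand, 0.0)
        best_rival, rival_score = max(
            ((b, s) for b, s in scores.items() if b != brand),
            key=lambda item: item[1],
            default=(None, 0.0),
        )
        if rival_score - own >= min_gap:
            gaps.append({"prompt": prompt, "own": own,
                         "competitor": best_rival, "competitor_score": rival_score})
    return sorted(gaps, key=lambda g: g["competitor_score"] - g["own"], reverse=True)

scores = {"best endpoint security": {"Acme": 20, "CrowdStrike": 85},
          "what is Acme?": {"Acme": 90, "CrowdStrike": 10}}
print(find_gaps(scores, "Acme"))
```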

How the Overall Score is weighted

The Overall Score (0–100) is a two-tier composite. Each tier contributes a fixed share:

Content Quality (80%)

7 citation metrics

Presence (25%) — Is the brand mentioned at all? Binary per prompt, averaged across the full prompt set.
Citation quality (20%) — URL citations score 1.0, direct name mentions score 0.6, contextual references score 0.3.
Position (15%) — First-mentioned brand gets 1.0, second gets 0.7, third+ gets 0.4. Top-of-response placement matters.
Recommendation strength (15%) — "We recommend X" scores 1.0 vs. "X is an option" at 0.4. Active endorsement is weighted heavily.
Sentiment (10%) — Positive mentions score 1.0, neutral 0.5, negative 0.1. Negative mentions still count but contribute minimally.
Consistency (10%) — Appearing across multiple AI models and diverse prompt types scores higher than appearing in just one model.
Frequency (5%) — Multiple mentions within a single response provide a small additional boost.

GEO Readiness (20%)

The GEO Content Audit automatically checks 10 structural elements that AI models need to extract and cite content reliably: Direct Answer, TL;DR, tables, FAQ, heading hierarchy, lists, JSON-LD, internal links, meta description, and entity clarity. Each element is scored 0–100, and the average produces the GEO Readiness score.
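
A minimal sketch of how such an audit rolls up into a single number; the element keys and the plain average are assumptions based on the description above:

```python
# GEO Readiness sketch: average the ten per-element scores (each 0-100) into one score.
# Element names follow the list above; the plain average is an assumption from the text.
GEO_ELEMENTS = [
    "direct_answer", "tldr", "tables", "faq", "heading_hierarchy",
    "lists", "json_ld", "internal_links", "meta_description", "entity_clarity",
]

def geo_readiness(element_scores: dict) -> float:
    """Mean of the 10 element scores, each on a 0-100 scale."""
    return sum(element_scores.get(e, 0) for e in GEO_ELEMENTS) / len(GEO_ELEMENTS)

audit = {e: 80 for e in GEO_ELEMENTS} | {"direct_answer": 0, "faq": 20}
print(geo_readiness(audit))  # (8 * 80 + 0 + 20) / 10 = 66.0
```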

Example composite calculation

Content Quality = 74 (strong presence and citation quality, moderate recommendation)
GEO Readiness = 62 (has tables and headings, but lacks Direct Answer and FAQ JSON-LD)

Overall Score = 74 × 0.80 + 62 × 0.20 = 59.2 + 12.4 = 71.6 ≈ 72/100

Weights were calibrated against known outcomes — comparing score deltas with actual changes in referral traffic from AI-powered search. They are re-calibrated quarterly as AI search behaviour evolves.

The Content Quality component also powers the AI Share of Voice calculation: SOV is derived from the presence and citation quality dimensions, aggregated across competitors for a given prompt set.
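
A hedged sketch of how a Share of Voice figure could be derived from presence and citation quality across competitors; the exact aggregation Rankio uses is not spelled out here, so this simple normalization is an assumption:

```python
# Share of Voice sketch: each brand's (presence x citation quality) mass as a share of the total.
# The aggregation formula is an illustrative assumption, not Rankio's exact method.
def share_of_voice(brand_metrics: dict) -> dict:
    """brand_metrics: {brand: {"presence": 0-1, "citation_quality": 0-1}} averaged over a prompt set."""
    mass = {b: m["presence"] * m["citation_quality"] for b, m in brand_metrics.items()}
    total = sum(mass.values()) or 1.0
    return {b: round(100 * v / total, 1) for b, v in mass.items()}

print(share_of_voice({
    "Rankio":     {"presence": 0.8, "citation_quality": 0.6},
    "Competitor": {"presence": 0.5, "citation_quality": 0.4},
}))  # {'Rankio': 70.6, 'Competitor': 29.4}
```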

For the full formula, worked examples, and weight calibration details, see the dedicated Methodology page.

Known limitations and caveats

No measurement system is perfect. We believe in transparent disclosure of our methodology's constraints:

  • AI model non-determinism: LLMs produce different responses to the same prompt across runs. Rankio mitigates this by running multiple samples and averaging (see the sketch after this list), but some variance is inherent. Two analyses of the same prompt may yield slightly different scores.
  • Model version changes: When AI providers update their models (e.g., GPT-4 to GPT-4o), response patterns can shift. Historical comparisons should account for model version changes, which Rankio logs.
  • Retrieval vs. parametric knowledge: It is not always possible to distinguish whether an AI model cites your brand from its training data (parametric) or from live retrieval (RAG). Both contribute to visibility, but they respond to different GEO strategies.
  • Prompt coverage: Your Visibility Score is only as representative as your prompt set. A narrow set of 10 prompts will give a less reliable picture than 50–100 diverse prompts. We recommend a minimum of 30 prompts for a reliable baseline.
  • Geographic and language variance: AI models may produce different responses based on inferred user location or language. Current analysis uses English-language, default-locale queries. Multi-language support is on our roadmap.
  • Correlation ≠ causation: A rising Visibility Score after content changes suggests improvement, but external factors (competitor content changes, model updates) can also affect scores. We recommend tracking competitive SOV alongside your own score to isolate your changes from market movements.
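
A minimal sketch of the sampling-and-averaging mitigation mentioned in the first bullet above; the sample count, `query_model`, and `score_response` are placeholders, not Rankio's actual functions:

```python
# Sketch of variance mitigation: score the same prompt several times and report the mean.
# `query_model` and `score_response` are hypothetical placeholders passed in by the caller.
from statistics import mean, stdev

def averaged_score(model: str, prompt: str, query_model, score_response, samples: int = 5) -> dict:
    scores = [score_response(query_model(model, prompt)) for _ in range(samples)]
    return {
        "mean": round(mean(scores), 1),
        "stdev": round(stdev(scores), 1) if samples > 1 else 0.0,
        "samples": samples,
    }
```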

We continuously work to reduce these limitations. Every Rankio analysis includes metadata (model version, timestamp, prompt text, raw response) so you can audit and interpret results in full context.

A real analysis walkthrough

Scenario

A cybersecurity company runs a Rankio analysis on their homepage URL. Rankio generates 15 prompts covering "best endpoint security", "enterprise cybersecurity tools", and "compare CrowdStrike vs [brand]".

Results: The company scores 62/100 on ChatGPT but only 31/100 on Perplexity. Digging into the per-prompt data, they discover that Perplexity consistently retrieves a competitor's comparison page instead of theirs. Their brand is mentioned but never recommended.

Action: They use the Content Studio to generate a targeted comparison article, add FAQPage schema, and re-run the analysis 3 weeks later. Their Perplexity score jumps to 54/100.

Key takeaway

The raw response is always visible — you can verify exactly what each AI model said about your brand and why Rankio scored it that way.

Getting the most out of Rankio's methodology

  • Start with a URL analysis to establish your baseline Visibility Score
  • Add your top 3–5 competitors for benchmark comparison
  • Review the raw AI responses to understand how models perceive your brand
  • Set up prompt monitoring for your most important queries
  • Run Share of Voice tests monthly to track competitive trends
  • Use Content Studio recommendations to fill visibility gaps
  • Compare your score across ChatGPT, Gemini, and Perplexity individually
  • Re-analyze after content changes to measure the impact of your GEO efforts

Frequently asked questions

Which AI models does Rankio analyze?
Rankio queries ChatGPT (OpenAI), Gemini (Google), and Perplexity. Each model is queried with the same prompts to enable fair cross-model comparisons. We continuously add support for new models as they become relevant.

How is the Overall Score calculated?
The Overall Score (0–100) is a two-tier composite: Content Quality (80%) blended with GEO Readiness (20%). Content Quality is computed from 7 weighted citation metrics (Presence, Citation quality, Position, Recommendation, Sentiment, Consistency, Frequency). GEO Readiness is an automated GEO Content Audit checking 10 structural elements. Both are calibrated against real referral traffic from AI search.

How does Rankio detect brand mentions?
Rankio uses a combination of exact name matching, fuzzy matching for variations (typos, abbreviations), URL detection, and entity recognition. This catches direct mentions, product references, and contextual references where a brand is implied but not named directly.

How often is the data refreshed?
Monitoring prompts are queried daily. Full Share of Voice tests can be run on-demand or scheduled. Historical data is stored indefinitely so you can track trends over months and years.

Can I see the raw AI responses?
Yes, always. Every analysis in Rankio shows the full raw AI response alongside the extracted metrics. You can see exactly what ChatGPT, Gemini, or Perplexity said about your brand and verify how Rankio interpreted it. We believe in full auditability.

See the methodology in action

Run your first analysis and explore the raw data behind your Visibility Score.