Methodology

Reporting Signal Evaluation Framework

How community reports are counted, weighted, and turned into classifications - including thresholds, confidence levels, risk assessment, trend analysis, and mixed-signal handling.


Classification Categories

When submitting a report, contributors select an interaction category that reflects their experience. The six standard categories are:

These categories reflect contributor-selected assessments based on individual call experiences.

Minimum Report Threshold

Every number page displays submitted reports regardless of volume. However, signal reliability varies significantly with report count:

Consensus Determination

A number's displayed classification reflects the majority vote across all qualifying reports. The process runs as follows:

  1. Report collection - reports enter through the reporting pipeline, which handles normalisation, de-duplication, and abuse screening before anything is added to the dataset
  2. Equal weighting - each report contributes one vote for its selected category; no report carries more weight than another, regardless of submission timing or reporter history
  3. Majority classification - the category with the most votes becomes the primary classification displayed on the number's page. If two categories share equal vote counts, the higher-risk category takes precedence
  4. Summary indicator - the displayed rating summarises the dominant category, total report count, and activity tier. It is derived from the vote distribution described above, not from a separate scoring system
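
The vote-counting and tie-break steps above can be sketched in a few lines of Python. This is only an illustration, not Reverseau's implementation: the function name `primary_classification` and the `RISK_RANK` ordering are assumptions, since the framework states that the higher-risk category wins a tie but does not publish a ranking.

```python
from collections import Counter

# Hypothetical risk ordering (higher number = higher risk). The category names
# come from the worked example; the ranking itself is an assumption.
RISK_RANK = {"Legitimate": 0, "Nuisance": 1, "Scam": 2}


def primary_classification(votes):
    """Return the category with the most votes, or None if there are no votes.

    Every report counts as exactly one vote. Ties go to the category with
    the higher (assumed) RISK_RANK value.
    """
    counts = Counter(votes)
    if not counts:
        return None
    # Compare by vote count first, then by assumed risk rank as the tie-breaker.
    return max(counts, key=lambda cat: (counts[cat], RISK_RANK.get(cat, 0)))


print(primary_classification(["Scam"] * 5 + ["Legitimate", "Nuisance"]))
print(primary_classification(["Legitimate", "Scam"]))
```

The first call has a clear winner; the second is a one-to-one tie, which the assumed ranking resolves in favour of the higher-risk category.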

Confidence Levels & Risk Assessment

Each number page displays two independent indicators: a confidence level (based on report volume) and a risk level (based on vote distribution). These are computed separately and serve different purposes.

Confidence Levels

Confidence reflects how much reporting evidence supports the assessment. Higher volume produces more stable classifications.

Confidence | Report Volume | What It Means
Limited    | 1–2 reports   | Very limited reporting volume. Treat as an early signal only.
Emerging   | 3–5 reports   | Based on limited reporting volume. Pattern may change as more reports arrive.
Moderate   | 6–15 reports  | Based on moderate reporting volume. Classification is becoming more reliable.
High       | 16+ reports   | Strong reporting volume supports the assessment.
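
The volume bands above map directly onto a small lookup. A minimal sketch, assuming a hypothetical function name `confidence_level` and that a number with no reports has no confidence level at all:

```python
def confidence_level(report_count):
    """Map a report count to the confidence label from the table above."""
    if report_count < 1:
        # Assumption: the table starts at 1 report, so 0 is treated as invalid.
        raise ValueError("confidence is only defined for 1 or more reports")
    if report_count <= 2:
        return "Limited"
    if report_count <= 5:
        return "Emerging"
    if report_count <= 15:
        return "Moderate"
    return "High"


for n in (1, 4, 7, 16):
    print(n, confidence_level(n))
```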

Risk Levels

Risk levels are derived from the vote distribution - specifically, which category holds the majority and what percentage it represents.

Risk Level         | Condition                          | What It Means
Preliminary Signal | ≤2 reports                         | Not enough reports for a risk assessment yet. Early community signals only.
Low Risk           | Legitimate ≥60%                    | Community reports indicate mostly safe and expected interactions.
Elevated           | Non-legitimate category ≥60%       | Community reports show a developing pattern of unwanted contact.
Emerging Risk      | Non-legitimate category ≥40%       | An emerging pattern. Classification may change as more reports come in.
Mixed Signals      | Top category ≤30% or 3+ categories | No dominant classification. Reports reflect mixed interaction experiences.
Under Review       | None of the above conditions met   | Not enough data to determine a consistent risk classification.
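
One way to read the risk conditions is as a first-match rule list, checked from top to bottom. The sketch below rests on that assumption. It is consistent with the worked example, where a number with three categories is still rated Elevated, but the framework does not state the evaluation order explicitly. The function name `risk_level` and the treatment of every category other than "Legitimate" as non-legitimate are also assumptions.

```python
from collections import Counter


def risk_level(votes):
    """Return a risk label, checking the table's conditions in listed order."""
    counts = Counter(votes)
    total = sum(counts.values())

    def share_at_least(count, percent):
        # Integer comparison avoids floating-point rounding at the thresholds.
        return count * 100 >= percent * total

    if total <= 2:
        return "Preliminary Signal"

    legitimate = counts.get("Legitimate", 0)
    # Assumption: any category other than "Legitimate" is non-legitimate.
    top_non_legit = max(
        (n for cat, n in counts.items() if cat != "Legitimate"), default=0
    )
    top_any = max(counts.values())

    if share_at_least(legitimate, 60):
        return "Low Risk"
    if share_at_least(top_non_legit, 60):
        return "Elevated"
    if share_at_least(top_non_legit, 40):
        return "Emerging Risk"
    if top_any * 100 <= 30 * total or len(counts) >= 3:
        return "Mixed Signals"
    return "Under Review"


print(risk_level(["Scam"] * 5 + ["Legitimate", "Nuisance"]))
print(risk_level(["Legitimate"] * 3 + ["Scam"] * 2 + ["Nuisance"] * 2))
```

Under this first-match reading the final `Under Review` fallback is hard to reach; it is kept to mirror the table.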

Both indicators update automatically as new reports arrive. They are computed at display time from the current dataset - no manual review is involved.

Trend & Recency Analysis

For numbers with 4 or more reports, Reverseau calculates a trend indicator by comparing report frequency in two time windows:

The comparison produces one of three trend labels:

All reports remain in the dataset permanently and are visible on the number's page regardless of age. Recency context is surfaced through timestamps on individual reports and through the recently updated feed, which lists numbers with activity in the past 24 hours.

Worked Example

Suppose a number has received 7 reports with the following breakdown:

  • 5 × Scam
  • 1 × Legitimate
  • 1 × Nuisance

Step 1 - Majority vote: Scam holds 5 of 7 votes (71%). Primary classification = Scam.

Step 2 - Confidence level: 7 reports fall in the 6–15 range = Moderate confidence.

Step 3 - Risk level: Scam (non-legitimate) holds 71% (≥60%) = Elevated risk.

Step 4 - Mixed signal note: Two non-Scam reports are visible on the number page. The distribution (5 Scam / 1 Legitimate / 1 Nuisance) is shown so visitors can assess the consensus strength themselves. In this case the Legitimate report may reflect a reallocation event - the number may have been reassigned since earlier Scam reports were filed.

No manual review is triggered by this scenario. The classification updates automatically if new reports shift the vote distribution.
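
The percentages in this example can be reproduced with a short script. This is a sketch only: `vote_distribution` is a hypothetical helper, and rounding to the nearest whole percent is an assumption based on the 71% figure quoted in Step 1.

```python
from collections import Counter


def vote_distribution(votes):
    """Return {category: (vote_count, rounded_percent)} for a list of votes."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() lists the dominant category first.
    return {cat: (n, round(100 * n / total)) for cat, n in counts.most_common()}


reports = ["Scam"] * 5 + ["Legitimate", "Nuisance"]
for category, (count, percent) in vote_distribution(reports).items():
    print(f"{category}: {count} of {len(reports)} ({percent}%)")
```

Scam comes out at 5 of 7, which rounds to 71% and clears the 60% threshold used in Step 3.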

Mixed Classification Scenarios

Some numbers attract reports across multiple categories - a number may carry both "Scam" and "Legitimate" reports, for example. Mixed classifications are common for:

The majority category still becomes the primary classification. The full vote distribution remains visible on the number page. Mixed signals are expected behaviour, not a system error - the dataset shows what contributors reported, not a single verified truth.

What This Framework Does Not Do

The reporting signal evaluation framework:

Classifications are informational indicators built from community-reported data. They provide contextual awareness, not definitive findings.

What To Do With This Information

Classifications are informational signals, not directives. What you do with them depends on the risk level and your situation: