Classification Categories
When submitting a report, contributors select an interaction category that reflects their experience. The six standard categories are:
- Scam / Fraud attempt - attempt to obtain money, credentials, or verification codes, often involving impersonation or deception
- Spam / Telemarketing - unsolicited marketing, robocall, survey, lead-generation, or automated message
- Nuisance / Silent / Hang-up - silent call, immediate hang-up, repeated callbacks with no clear purpose (available for phone calls only)
- Suspicious - unverifiable claims, inconsistent story, or behaviour that does not fit other categories
- Uncertain - not enough information to determine the nature of the interaction
- Legitimate - expected or verified legitimate contact
These categories reflect contributor-selected assessments based on individual call experiences.
Minimum Report Threshold
Every number page displays submitted reports regardless of volume. However, signal reliability varies significantly with report count:
- 1–2 reports - classified as Preliminary Signal. The category is shown, but the system flags it as an early signal rather than an established pattern. No risk assessment is published at this stage.
- 3+ reports - the system begins applying risk assessment logic (see below). Three independent reports from different contributors are the minimum needed to suggest a pattern worth surfacing.
Consensus Determination
A number's displayed classification reflects the majority vote across all qualifying reports. The process runs as follows:
- Report collection - reports enter through the reporting pipeline, which handles normalisation, de-duplication, and abuse screening before any report enters the dataset
- Equal weighting - each report contributes one vote for its selected category; no report carries more weight than another regardless of submission timing or reporter history
- Majority classification - the category with the most votes becomes the primary classification displayed on the number's page. If two categories share equal vote counts, the higher-risk category takes precedence
- Summary indicator - the displayed rating summarises the dominant category, total report count, and activity tier. It is derived from the vote distribution described above, not from a separate scoring system
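The equal-weight majority vote and its tie-break can be sketched as follows. This is a minimal illustration, not the production logic: the function name `primary_classification` and the `RISK_ORDER` ranking are assumptions, since the text says the higher-risk category wins a tie but does not publish the ordering.

```python
from collections import Counter

# ASSUMPTION: the source does not publish a risk ranking. This order (lowest
# to highest risk) is hypothetical and used only to break ties.
RISK_ORDER = ["Legitimate", "Uncertain", "Suspicious", "Nuisance", "Spam", "Scam"]


def primary_classification(reports):
    """Return (category, votes, total_reports) for a list of category labels."""
    if not reports:
        return None
    counts = Counter(reports)  # equal weighting: one vote per report
    # Most votes wins; on equal counts the category later in RISK_ORDER wins.
    category = max(counts, key=lambda c: (counts[c], RISK_ORDER.index(c)))
    return category, counts[category], len(reports)
```

For example, `primary_classification(["Scam", "Legitimate"])` returns `("Scam", 1, 2)` under the assumed ordering, because both categories hold one vote and Scam ranks higher.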
Confidence Levels & Risk Assessment
Each number page displays two independent indicators: a confidence level (based on report volume) and a risk level (based on vote distribution). These are computed separately and serve different purposes.
Confidence Levels
Confidence reflects how much reporting evidence supports the assessment. Higher volume produces more stable classifications.
| Confidence | Report Volume | What It Means |
|---|---|---|
| Limited | 1–2 reports | Very limited reporting volume. Treat as an early signal only. |
| Emerging | 3–5 reports | Based on limited reporting volume. Pattern may change as more reports arrive. |
| Moderate | 6–15 reports | Based on moderate reporting volume. Classification is becoming more reliable. |
| High | 16+ reports | Strong reporting volume supports the assessment. |
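The table maps directly to a threshold lookup. The sketch below uses a hypothetical function name, `confidence_level`, and rejects a count of zero, which the table does not cover.

```python
def confidence_level(report_count: int) -> str:
    """Map a report count to the confidence labels in the table above."""
    if report_count < 1:
        # ASSUMPTION: the table starts at 1 report; zero is treated as invalid here.
        raise ValueError("report_count must be at least 1")
    if report_count <= 2:
        return "Limited"
    if report_count <= 5:
        return "Emerging"
    if report_count <= 15:
        return "Moderate"
    return "High"
```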
Risk Levels
Risk levels are derived from the vote distribution - specifically, which category holds the majority and what percentage it represents.
| Risk Level | Condition | What It Means |
|---|---|---|
| Preliminary Signal | ≤2 reports | Not enough reports for a risk assessment yet. Early community signals only. |
| Low Risk | Legitimate ≥60% | Community reports indicate mostly safe and expected interactions. |
| Elevated | Non-legitimate category ≥60% | Community reports show a developing pattern of unwanted contact. |
| Emerging Risk | Non-legitimate category ≥40% but <60% | An emerging pattern. Classification may change as more reports come in. |
| Mixed Signals | Top category ≤30% or 3+ categories | No dominant classification. Reports reflect mixed interaction experiences. |
| Under Review | None of the above conditions met | Not enough data to determine a consistent risk classification. |
Both indicators update automatically as new reports arrive. They are computed at display time from the current dataset - no manual review is involved.
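One way to read the risk table is as a sequence of checks applied from top to bottom, with the first matching row winning. The sketch below follows that reading. It rests on two assumptions that the text does not confirm: rows are evaluated in table order, and every category other than Legitimate (including Uncertain) counts as non-legitimate. The function name `risk_level` is hypothetical.

```python
from collections import Counter


def risk_level(reports):
    """Return a risk label for a list of category labels (one per report)."""
    total = len(reports)
    if total <= 2:
        return "Preliminary Signal"

    counts = Counter(reports)

    def share_at_least(votes, pct):
        # Integer comparison avoids floating-point edge cases at exactly 40% / 60%.
        return votes * 100 >= pct * total

    legit = counts.get("Legitimate", 0)
    # ASSUMPTION: "non-legitimate" means any single category except Legitimate.
    top_other = max((n for c, n in counts.items() if c != "Legitimate"), default=0)
    top_any = max(counts.values())

    # ASSUMPTION: rows are checked in table order; the first match wins.
    if share_at_least(legit, 60):
        return "Low Risk"
    if share_at_least(top_other, 60):
        return "Elevated"
    if share_at_least(top_other, 40):
        return "Emerging Risk"
    if top_any * 100 <= 30 * total or len(counts) >= 3:
        return "Mixed Signals"
    return "Under Review"  # fallback row from the table
```

Under this ordering, a number with 5 Scam, 1 Legitimate, and 1 Nuisance report is labelled Elevated even though three categories are present, because the 60% check runs before the Mixed Signals check.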
Trend & Recency Analysis
For numbers with 4 or more reports, Reverseau calculates a trend indicator by comparing report frequency in two time windows:
- Recent window - reports submitted in the last 30 days
- Older window - reports submitted between 31 and 90 days ago
The comparison produces one of three trend labels:
- Increasing - recent reports are more than double the older window count (minimum 2 recent reports)
- Decreasing - the older window count is more than double the recent count (minimum 2 older reports)
- Stable - neither condition is met; reporting rate is consistent
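The window comparison above can be sketched as below. The function name `trend_label`, the use of `datetime` timestamps, and the exact boundary handling (30 days or less counts as recent; more than 30 and up to 90 counts as older) are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def trend_label(timestamps, now):
    """Return 'Increasing', 'Decreasing', 'Stable', or None if under 4 reports."""
    if len(timestamps) < 4:
        return None  # trend is only calculated for numbers with 4 or more reports

    recent = 0
    older = 0
    for ts in timestamps:
        age = now - ts
        if timedelta(0) <= age <= timedelta(days=30):
            recent += 1  # recent window: last 30 days
        elif timedelta(days=30) < age <= timedelta(days=90):
            older += 1   # older window: 31 to 90 days ago
        # Reports older than 90 days are not part of the comparison.

    if recent >= 2 and recent > 2 * older:
        return "Increasing"
    if older >= 2 and older > 2 * recent:
        return "Decreasing"
    return "Stable"
```

With reports filed 1, 2, 3, and 40 days ago, the recent count is 3 and the older count is 1, so the label is Increasing.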
All reports remain in the dataset permanently and are visible on the number's page regardless of age. Recency context is surfaced through timestamps on individual reports and through the recently updated feed, which lists numbers with activity in the past 24 hours.
Worked Example
Suppose a number has received 7 reports with the following breakdown:
- 5 × Scam
- 1 × Legitimate
- 1 × Nuisance
Step 1 - Majority vote: Scam holds 5 of 7 votes (71%). Primary classification = Scam.
Step 2 - Confidence level: 7 reports fall in the 6–15 range = Moderate confidence.
Step 3 - Risk level: Scam (non-legitimate) holds 71% (≥60%) = Elevated risk.
Step 4 - Mixed signal note: Two non-Scam reports are visible on the number page. The distribution (5 Scam / 1 Legitimate / 1 Nuisance) is shown so visitors can assess the consensus strength themselves. In this case the Legitimate report may reflect a reallocation event - the number may have been reassigned since earlier Scam reports were filed.
No manual review is triggered by this scenario. The classification updates automatically if new reports shift the vote distribution.
Mixed Classification Scenarios
Some numbers attract reports across multiple categories - a number may carry both "Scam" and "Legitimate" reports, for example. Mixed classifications are common for:
- Numbers that have been reallocated and now serve a different entity than when earlier reports were filed
- Numbers used by large organisations with varied call purposes (e.g., a bank that also runs marketing campaigns)
- Numbers where individual experiences genuinely differ
The majority category still becomes the primary classification. The full vote distribution remains visible on the number page. Mixed signals are expected behaviour, not a system error - the dataset shows what contributors reported, not a single verified truth.
What This Framework Does Not Do
The reporting signal evaluation framework:
- Does not predict future caller behaviour
- Does not assign probability of fraud or scam activity
- Does not confirm or deny the identity of the caller
- Does not independently investigate or verify reported incidents
- Does not constitute a legal, regulatory, or investigative determination
Classifications are informational indicators built from community-reported data. They provide contextual awareness, not definitive findings.
What To Do With This Information
Classifications are informational signals, not directives. What you do with them depends on the risk level and your situation:
- Elevated or Emerging Risk - exercise caution before returning the call. If the number contacted you unsolicited, consider reporting it to Scamwatch or blocking it on your device
- Mixed Signals - review the individual reports on the number page. A mix of Scam and Legitimate reports may indicate number reallocation or a large organisation with varied call purposes
- Low Risk - community reports suggest mostly safe interactions, but no classification substitutes for your own judgement
- Limited or Emerging confidence - few reports exist. The classification may shift as more contributors report. Consider adding your own experience to strengthen the signal
Related Documentation
- Community Reporting & Processing Model - how reports enter the system, including abuse screening and de-duplication logic
- Number Classification System - telecommunications numbering structure and what allocation metadata does and does not tell you
- Data Limitations - where the dataset ends: unverified reports, volume-dependent signals, spoofing risks, and guidance for responsible interpretation