Data Collection, Classification & Publication Framework
Transparent documentation of Reverseau's data intake, processing, evaluation, and publication framework.
Applies to all publicly accessible Reverseau phone record pages.
Last Updated:
Every phone record page on Reverseau is built from the same pipeline. Community reports enter the system anonymously, pass through normalisation and de-duplication, and are evaluated against threshold rules before any classification is published.
The pages below document that pipeline end to end, so that contributors, researchers, and regulators can verify what the data represents and where its boundaries lie. Reverseau does not perform caller identity verification and does not make determinations of legal wrongdoing.
Key Methodology Decisions
- Minimum report threshold: A number requires at least 3 independent community reports before any classification is published - enough to suggest a pattern beyond a single frustrated caller, while keeping the barrier low enough that genuine scam numbers surface quickly.
- Confidence tiers: Classifications range from "Low Activity" to "High Risk" based on report volume, consensus level, and recency weighting.
- Moderation: Reports pass through automated abuse screening (rate limiting, duplicate detection, pattern analysis) before entering the dataset.
- Data retention: Reports older than 24 months are downweighted. Number reallocation cycles in Australia typically run 12–18 months, so a 24-month window captures the full lifecycle while letting stale signals fade.
- No identity claims: ACMA allocation data tells us the service type and original carrier - not who currently holds or uses the number.
In simple terms: we collect anonymous reports, check them for abuse, count them, and only publish a safety classification when enough independent reports agree.
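The summary above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Reverseau's actual implementation: the `Report` fields, the `reporter_token` used to judge independence, the label values, and the `STALE_WEIGHT` factor are all hypothetical. Only the 3-report minimum and the 24-month downweighting window come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MIN_INDEPENDENT_REPORTS = 3   # from the text: at least 3 independent reports
DOWNWEIGHT_AFTER_DAYS = 730   # roughly 24 months, from the text
STALE_WEIGHT = 0.5            # assumption: the real downweighting factor is not documented here


@dataclass(frozen=True)
class Report:
    reporter_token: str  # hypothetical anonymous token, used only to judge independence
    label: str           # hypothetical labels such as "unsafe" or "safe"
    age_days: int


def summarise(reports: list[Report]) -> Optional[dict]:
    """Return a consensus summary, or None when the threshold is not met."""
    # Keep only the most recent report per reporter so one source counts once.
    latest: dict[str, Report] = {}
    for r in reports:
        kept = latest.get(r.reporter_token)
        if kept is None or r.age_days < kept.age_days:
            latest[r.reporter_token] = r

    independent = list(latest.values())
    if len(independent) < MIN_INDEPENDENT_REPORTS:
        return None  # nothing is published below the threshold

    # Sum the votes per label, downweighting reports older than the window.
    totals: dict[str, float] = {}
    for r in independent:
        weight = 1.0 if r.age_days <= DOWNWEIGHT_AFTER_DAYS else STALE_WEIGHT
        totals[r.label] = totals.get(r.label, 0.0) + weight

    top_label = max(totals, key=totals.get)
    return {
        "label": top_label,
        "independent_reports": len(independent),
        "consensus": totals[top_label] / sum(totals.values()),
    }
```

In this sketch, two reports from the same reporter plus one from another yield `None`, because only two independent sources exist. Mapping the consensus value onto published tiers is deliberately left out, since the tier boundaries are not stated here.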
Data Processing & Evaluation Framework
Community Reporting & Processing Model
From submission to storage: anonymous reports are normalised, de-duplicated, screened for abuse, then aggregated into the dataset with full audit trail.
View Full Documentation →
Reporting Signal Evaluation Framework
How Reverseau classifies phone numbers: 3-report minimum threshold, equal-weight consensus voting, four confidence levels, six risk levels from Preliminary Signal to Elevated, 30/90-day trend analysis, and mixed-signal handling.
View Full Documentation →
Transparency & Data Integrity
Moderation workflow, false-positive handling, correction requests, refresh frequency, retention policy, and the safeguards that keep the dataset honest.
View Full Documentation →
Telecommunications Structure & Data Context
Number Classification System
ACMA numbering allocations, service-type prefixes, and why allocation metadata does not confirm caller identity - including portability and reallocation limits.
View Full Documentation →
Data Sources & Cross-Referencing
Two primary inputs - community reports and ACMA allocation records - plus cross-referencing, enrichment, consistency checks, and what we deliberately exclude.
View Full Documentation →
Data Limitations & Interpretation Boundaries
Where the dataset ends: unverified reports, volume-dependent signals, allocation vs ownership gaps, spoofing risks, and guidance for responsible interpretation.
View Full Documentation →
Outside the scope of this documentation:
- Advertising policies, sponsorship, or commercial partnerships
- User account data, session tracking, or privacy/cookie policies
- Third-party API integrations or data sharing agreements
- Internal operational or infrastructure documentation