Status: versioned, public, OpenAPI-published.
Stability: v1 is the contract. Backwards-compatible additions only;
breaking changes require shipping a v2 alongside v1 with a Sunset
header on v1 endpoints.
BeaconEvent v1 is the versioned contract between the **browser
capture layer and the deterministic check layer**. The check
analyzers consume only this schema; they never see raw CDP. This
isolation protects the trust story from provider / protocol drift
(PLAN.md A1).
Two artifacts:
not code; reviewed, hot-fixable, diffable). The drift watchdog
(T9) verifies these against live vendor endpoints.
The defensibility argument is network effects, not format opacity.
Every agent / SDK / integration that speaks BeaconEvent v1 makes
the format more standard, which makes more agents adopt Prufa. The
format is therefore public, versioned, and published — *not* a
competitive moat (PLAN § Public spec commitment, 2026-06-11).
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://prufa.dev/schemas/beacon_event/v1.json",
  "title": "BeaconEvent",
  "type": "object",
  "required": ["schema_version", "vendor", "request_url", "page_url", "timestamp"],
  "properties": {
    "schema_version": { "const": "1" },
    "vendor": {
      "type": "string",
      "description": "Vendor identifier — one of: ga4, gtm, ua_legacy, gtag_loader, meta, tiktok, tiktok_loader, linkedin, google_ads, unknown"
    },
    "event_name": {
      "type": ["string", "null"],
      "description": "Vendor event name (page_view, ViewContent, Lead, etc.)"
    },
    "account_id": {
      "type": ["string", "null"],
      "description": "Account / pixel / container id (G-XXXX, GTM-XXXX, pixel id)"
    },
    "request_url": { "type": "string", "format": "uri" },
    "method": { "type": "string", "enum": ["GET", "POST"], "default": "GET" },
    "params": {
      "type": "object",
      "additionalProperties": { "type": "string" },
      "description": "Query / form parameters (stringified)"
    },
    "post_body_excerpt": {
      "type": ["string", "null"],
      "description": "Capped excerpt of POST body (never the full payload)"
    },
    "page_url": { "type": "string", "format": "uri" },
    "consent_state": {
      "type": "string",
      "enum": ["not_applicable", "pending", "accepted", "rejected", "unknown"]
    },
    "timestamp": { "type": "string", "format": "date-time" }
  },
  "additionalProperties": false
}
```
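
As an illustration of the smallest event the schema accepts, the sketch below builds a dict carrying only the five required fields. The helper name `minimal_event` and the example URLs are hypothetical and not part of the published SDK; the field names and the `"1"` version constant come from the schema above.

```python
from datetime import datetime, timezone

# The five fields listed under "required" in the schema.
REQUIRED = ("schema_version", "vendor", "request_url", "page_url", "timestamp")


def minimal_event(vendor: str, request_url: str, page_url: str) -> dict:
    """Hypothetical helper: build an event with only the required fields."""
    return {
        "schema_version": "1",  # the schema pins this to the constant "1"
        "vendor": vendor,
        "request_url": request_url,
        "page_url": page_url,
        # Timezone-aware ISO 8601 string, compatible with format: date-time.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


event = minimal_event(
    "ga4",
    "https://www.google-analytics.com/g/collect?v=2",
    "https://example.com/pricing",
)
missing = [key for key in REQUIRED if key not in event]
```

Optional fields such as event_name or consent_state can be added to the dict afterwards. Because the schema sets additionalProperties to false, any key outside the declared properties would fail validation.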
The v1 signature table recognizes 9 vendors, plus an unknown fallback
row. New vendors are additive: a v1.1 release adds the row, and
consumers need no change.
| vendor | Name | Match |
|---|---|---|
| ga4 | Google Analytics 4 | host *.google-analytics.com path /g/collect |
| ua_legacy | Universal Analytics (sunset) | host *.google-analytics.com path /(j/)?collect |
| gtm | Google Tag Manager container | host www.googletagmanager.com path /gtm.js |
| gtag_loader | gtag.js loader | host www.googletagmanager.com path /gtag/js |
| meta | Meta Pixel | host www.facebook.com path /tr |
| tiktok | TikTok Pixel | host analytics*.tiktok.com path /api/v2/pixel |
| tiktok_loader | TikTok events.js loader | host analytics.tiktok.com path /i18n/pixel/events.js |
| linkedin | LinkedIn Insight Tag | host px.ads.linkedin.com path /collect |
| google_ads | Google Ads conversion | host *.googleadservices.com path /pagead/conversion |
| unknown | Unrecognized beacon | — |
A request that matches none of the rows above is dropped, or
emitted with vendor: "unknown" if the project opts into
capture-everything mode. The deterministic check layer then flags
these "unknown beacons" as a verified finding. This helps a customer
whose marketing team says a particular tag is installed when Prufa
cannot recognize it.
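
The matching rule can be sketched as follows. This is a minimal illustration, not the shipped matcher: it assumes the host column is a shell-style glob and the path column is a regular expression that must match the whole URL path, and the names `SIGNATURES`, `classify`, and `capture_everything` are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from typing import Optional
from urllib.parse import urlsplit

# (vendor, host glob, path regex) rows transcribed from the table above.
# Assumption: paths are full-match regexes; literal dots are escaped here.
SIGNATURES = [
    ("ga4", "*.google-analytics.com", r"/g/collect"),
    ("ua_legacy", "*.google-analytics.com", r"/(j/)?collect"),
    ("gtm", "www.googletagmanager.com", r"/gtm\.js"),
    ("gtag_loader", "www.googletagmanager.com", r"/gtag/js"),
    ("meta", "www.facebook.com", r"/tr"),
    ("tiktok", "analytics*.tiktok.com", r"/api/v2/pixel"),
    ("tiktok_loader", "analytics.tiktok.com", r"/i18n/pixel/events\.js"),
    ("linkedin", "px.ads.linkedin.com", r"/collect"),
    ("google_ads", "*.googleadservices.com", r"/pagead/conversion"),
]


def classify(request_url: str, capture_everything: bool = False) -> Optional[str]:
    """Return the vendor id, "unknown" in capture-everything mode, or None (dropped)."""
    parts = urlsplit(request_url)
    host = (parts.hostname or "").lower()
    for vendor, host_glob, path_regex in SIGNATURES:
        if fnmatchcase(host, host_glob) and re.fullmatch(path_regex, parts.path):
            return vendor
    # No row matched: drop by default, or tag as unknown when opted in.
    return "unknown" if capture_everything else None
```

Under these assumptions, `classify("https://www.google-analytics.com/g/collect?v=2")` yields `"ga4"`, while a request to an unlisted host yields `None` by default and `"unknown"` when `capture_everything=True`.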
| State | Meaning |
|---|---|
| not_applicable | No CMP detected on the page. |
| pending | Banner shown, no choice made yet. |
| accepted | User accepted. |
| rejected | User rejected. |
| unknown | CMP detected but state unverifiable. |
The unknown state is the EXPECTED common case: many CMPs don't
expose their state to the page. The deterministic check
reports unknown as an advisory finding ("consent state
could not be verified") rather than a verified failure. The
trust rule is: if we can't verify, we don't claim broken.
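
The trust rule can be sketched in a few lines. The finding shape (a dict with `severity` and `message` keys) and the function name `consent_advisory` are assumptions made for illustration; only the five state names and the advisory wording come from the text above. How the check layer treats the other four states is out of scope here.

```python
from typing import Optional

# The five values allowed by the consent_state enum in the schema.
CONSENT_STATES = {"not_applicable", "pending", "accepted", "rejected", "unknown"}


def consent_advisory(consent_state: str) -> Optional[dict]:
    """Return an advisory finding for an unverifiable state, else None."""
    if consent_state not in CONSENT_STATES:
        raise ValueError(f"not a v1 consent_state: {consent_state!r}")
    if consent_state == "unknown":
        # Cannot verify, so do not claim broken: advisory, never a failure.
        return {
            "severity": "advisory",  # hypothetical field name
            "message": "consent state could not be verified",
        }
    return None
```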
Live sites emit *a lot* of beacons. The runner caps captured
events per page (max_events_per_page, default 5000). The raw
stream is persisted for 30 days, and the report always includes
a truncation_note when a cap is hit. This is the "no silent
failures" invariant applied to capture.
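
A minimal sketch of the capping behavior, assuming the runner keeps the first events it sees and counts the rest. The function name `cap_events` and the wording of the note are hypothetical; `max_events_per_page` and its default of 5000 come from the text above.

```python
from typing import Iterable, List, Optional, Tuple


def cap_events(
    events: Iterable[dict], max_events_per_page: int = 5000
) -> Tuple[List[dict], Optional[str]]:
    """Keep at most max_events_per_page events; return a note if any were cut."""
    kept: List[dict] = []
    dropped = 0
    for event in events:
        if len(kept) < max_events_per_page:
            kept.append(event)
        else:
            dropped += 1
    truncation_note = None
    if dropped:
        # Never truncate silently: record how much was cut and why.
        total = len(kept) + dropped
        truncation_note = (
            f"capture truncated: kept {len(kept)} of {total} events "
            f"(max_events_per_page={max_events_per_page})"
        )
    return kept, truncation_note
```

Returning the note alongside the kept events lets the report attach it unconditionally whenever it is not `None`, so a cap can never be hit without leaving a trace.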
v1.x. No consumer change required.
rename) → v2.0 alongside v1 with Sunset header on v1.
it can change without re-versioning the schema, but the drift
watchdog (T9) must verify every change against a live vendor
endpoint before it ships.
A consumer of BeaconEvent v1 MUST:
from matches against the signature table.
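```python
import json
from pathlib import Path

import jsonschema

# `event` is a BeaconEvent v1 dict obtained elsewhere (e.g. parsed from JSON).
schema = json.loads(Path("beacon_event.schema.json").read_text())
jsonschema.validate(event, schema)  # raises jsonschema.ValidationError on failure
```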
The schema is a single file; the signature table is a single
YAML file. Both are checked into the repo and shipped as part of
the public SDK.