
BeaconEvent v1 — Public Spec

Status: versioned, public, OpenAPI-published.

Stability: v1 is the contract. Backwards-compatible additions only;
breaking changes require shipping a v2 alongside v1 with a Sunset
header on v1 endpoints.

What this is

BeaconEvent v1 is the versioned contract between the **browser
capture layer and the deterministic check layer**. The check
analyzers consume only this schema; they never see raw CDP. This
isolation protects the trust story from provider / protocol drift
(PLAN.md A1).

Two artifacts:

- The JSON Schema (canonical).
- The vendor signature table (a YAML file, not code; reviewed,
  hot-fixable, diffable). The drift watchdog (T9) verifies the
  signatures against live vendor endpoints.

Why public

The defensibility argument is network effects, not format opacity.
Every agent / SDK / integration that speaks BeaconEvent v1 makes
the format more standard, which makes more agents adopt Prufa. The
format is therefore public, versioned, and published, *not* a
competitive moat (PLAN § Public spec commitment, 2026-06-11).

JSON Schema (canonical)


{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://prufa.dev/schemas/beacon_event/v1.json",
  "title": "BeaconEvent",
  "type": "object",
  "required": ["schema_version", "vendor", "request_url", "page_url", "timestamp"],
  "properties": {
    "schema_version": { "const": "1" },
    "vendor": {
      "type": "string",
      "description": "Vendor identifier — one of: ga4, gtm, ua_legacy, gtag_loader, meta, tiktok, tiktok_loader, linkedin, google_ads, unknown"
    },
    "event_name": {
      "type": ["string", "null"],
      "description": "Vendor event name (page_view, ViewContent, Lead, etc.)"
    },
    "account_id": {
      "type": ["string", "null"],
      "description": "Account / pixel / container id (G-XXXX, GTM-XXXX, pixel id)"
    },
    "request_url": { "type": "string", "format": "uri" },
    "method": { "type": "string", "enum": ["GET", "POST"], "default": "GET" },
    "params": {
      "type": "object",
      "additionalProperties": { "type": "string" },
      "description": "Query / form parameters (stringified)"
    },
    "post_body_excerpt": {
      "type": ["string", "null"],
      "description": "Capped excerpt of POST body (never the full payload)"
    },
    "page_url": { "type": "string", "format": "uri" },
    "consent_state": {
      "type": "string",
      "enum": ["not_applicable", "pending", "accepted", "rejected", "unknown"]
    },
    "timestamp": { "type": "string", "format": "date-time" }
  },
  "additionalProperties": false
}
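
As an illustration of the schema above, here is a minimal sketch of one event and a required-field check. All values in `example_event` are hypothetical placeholders, not captured data, and `missing_required` is a helper name invented for this sketch; it covers only the `required` list and does not replace full JSON Schema validation.

```python
import json
from typing import List

# Hypothetical example event; every value here is illustrative only.
example_event = {
    "schema_version": "1",
    "vendor": "ga4",
    "event_name": "page_view",
    "account_id": "G-XXXX",
    "request_url": "https://www.google-analytics.com/g/collect?v=2&en=page_view",
    "method": "GET",
    "params": {"v": "2", "en": "page_view"},
    "post_body_excerpt": None,
    "page_url": "https://example.com/",
    "consent_state": "unknown",
    "timestamp": "2026-06-11T12:00:00Z",
}

# Copied from the "required" array of the schema.
REQUIRED = ["schema_version", "vendor", "request_url", "page_url", "timestamp"]


def missing_required(event: dict) -> List[str]:
    """Return the required field names that are absent from the event."""
    return [name for name in REQUIRED if name not in event]


if __name__ == "__main__":
    print(json.dumps(example_event, indent=2))
    print("missing:", missing_required(example_event))
```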

Vendors (v1)

The v1 signature table recognizes 9 vendors, plus the `unknown`
fallback. New vendors are additive: a v1.1 release adds the row,
with no consumer change required.

| vendor | Name | Match (host, path) |
|---|---|---|
| `ga4` | Google Analytics 4 | host `*.google-analytics.com`, path `/g/collect` |
| `ua_legacy` | Universal Analytics (sunset) | host `*.google-analytics.com`, path `/(j/)?collect` |
| `gtm` | Google Tag Manager container | host `www.googletagmanager.com`, path `/gtm.js` |
| `gtag_loader` | gtag.js loader | host `www.googletagmanager.com`, path `/gtag/js` |
| `meta` | Meta Pixel | host `www.facebook.com`, path `/tr` |
| `tiktok` | TikTok Pixel | host `analytics*.tiktok.com`, path `/api/v2/pixel` |
| `tiktok_loader` | TikTok events.js loader | host `analytics.tiktok.com`, path `/i18n/pixel/events.js` |
| `linkedin` | LinkedIn Insight Tag | host `px.ads.linkedin.com`, path `/collect` |
| `google_ads` | Google Ads conversion | host `*.googleadservices.com`, path `/pagead/conversion` |
| `unknown` | Unrecognized beacon | (none) |

A request that doesn't match any of the above is dropped (or
emitted with `vendor: "unknown"` if a project opts into
capture-everything mode). The deterministic check layer then flags
"unknown beacons" as a verified finding, which is useful for the
customer who's been told by their marketing team that a particular
tag is installed but Prufa can't see it.
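
The matching described by the table can be sketched roughly as follows. This is not the shipped matcher: the exact matching semantics are not stated here, so the sketch assumes the host column is a shell-style glob, the path column is a regular expression that must match the whole path or a leading path segment, and rows are tried in order. `SIGNATURES` and `match_vendor` are names invented for this example.

```python
import re
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# (vendor, host glob, path regex), transcribed from the table above.
# Dots in file names are escaped because the path column is treated as a regex.
SIGNATURES = [
    ("ga4", "*.google-analytics.com", r"/g/collect"),
    ("ua_legacy", "*.google-analytics.com", r"/(j/)?collect"),
    ("gtm", "www.googletagmanager.com", r"/gtm\.js"),
    ("gtag_loader", "www.googletagmanager.com", r"/gtag/js"),
    ("meta", "www.facebook.com", r"/tr"),
    ("tiktok", "analytics*.tiktok.com", r"/api/v2/pixel"),
    ("tiktok_loader", "analytics.tiktok.com", r"/i18n/pixel/events\.js"),
    ("linkedin", "px.ads.linkedin.com", r"/collect"),
    ("google_ads", "*.googleadservices.com", r"/pagead/conversion"),
]


def match_vendor(request_url: str) -> str:
    """Return the first vendor whose host and path patterns match, else "unknown"."""
    parts = urlsplit(request_url)
    host = (parts.hostname or "").lower()
    for vendor, host_glob, path_re in SIGNATURES:
        # Assumption: a trailing "/..." after the listed path still counts as a match.
        path_ok = re.fullmatch(rf"(?:{path_re})(?:/.*)?", parts.path) is not None
        if fnmatchcase(host, host_glob) and path_ok:
            return vendor
    return "unknown"


if __name__ == "__main__":
    print(match_vendor("https://www.google-analytics.com/g/collect?v=2"))
    print(match_vendor("https://example.com/collect"))
```

Whether an unmatched request is dropped or emitted as `unknown` is a project setting; the sketch only labels it.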

Consent states

| State | Meaning |
|---|---|
| `not_applicable` | No CMP detected on the page. |
| `pending` | Banner shown, no choice made yet. |
| `accepted` | User accepted. |
| `rejected` | User rejected. |
| `unknown` | CMP detected but state unverifiable. |

The `unknown` state is the expected common case: many CMPs don't
expose their state to the page. The deterministic check reports
`unknown` as an advisory finding ("consent state could not be
verified") rather than a verified failure. The trust rule is: if we
can't verify, we don't claim broken.
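
The trust rule can be sketched as a small function. The shape of the returned finding (a dict with `level` and `message`) is an assumption made for illustration; the spec here does not define a finding format, and what a checker does with the other four states is out of scope for this sketch.

```python
from typing import Optional

# The five values allowed by the schema's consent_state enum.
CONSENT_STATES = {"not_applicable", "pending", "accepted", "rejected", "unknown"}


def consent_finding(consent_state: str) -> Optional[dict]:
    """Return an advisory finding for "unknown", and no finding otherwise."""
    if consent_state not in CONSENT_STATES:
        raise ValueError(f"not a BeaconEvent v1 consent state: {consent_state!r}")
    if consent_state == "unknown":
        # Advisory only: unverifiable consent is never reported as a verified failure.
        return {"level": "advisory", "message": "consent state could not be verified"}
    return None


if __name__ == "__main__":
    print(consent_finding("unknown"))
    print(consent_finding("accepted"))
```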

Beacon storm protection

Live sites emit *a lot* of beacons. The runner caps captured
events per page (`max_events_per_page`, default 5000) and notes
truncation in the report. The raw stream is persisted for 30
days; the report always includes a `truncation_note` when a cap is
hit. This is the "no silent failures" invariant applied to
capture.
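
A minimal sketch of the cap, assuming events arrive as an iterable and the report is a plain dict. The function name `cap_events`, the report layout, and the wording of the note are illustrative assumptions; only `max_events_per_page`, its default of 5000, and the `truncation_note` key come from the text above. The sketch treats the cap as hit only when at least one event is actually left out.

```python
from typing import Any, Dict, Iterable, List


def cap_events(events: Iterable[Any], max_events_per_page: int = 5000) -> Dict[str, Any]:
    """Keep at most max_events_per_page events and record any truncation."""
    kept: List[Any] = []
    dropped = 0
    for event in events:
        if len(kept) < max_events_per_page:
            kept.append(event)
        else:
            dropped += 1  # counted, so the truncation is never silent
    report: Dict[str, Any] = {"events": kept}
    if dropped:
        report["truncation_note"] = (
            f"kept the first {max_events_per_page} events; "
            f"{dropped} more were captured but left out of the report"
        )
    return report


if __name__ == "__main__":
    print(cap_events(range(10), max_events_per_page=3))
```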

Versioning policy

- Additive changes (such as a new vendor row) → v1.x. No consumer
  change required.
- Breaking changes (such as a rename) → v2.0 alongside v1 with
  Sunset header on v1.
- Signature table: it can change without re-versioning the schema,
  but the drift watchdog (T9) must verify every change against a
  live vendor endpoint before it ships.
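
For consumers that want to notice a v1 retirement early, the Sunset header can be read from a response. The sketch below assumes the header carries an HTTP-date, as the Sunset header standard (RFC 8594) specifies, and that response headers are available as a plain mapping; `sunset_date` and `is_past_sunset` are hypothetical helper names.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def sunset_date(headers: Mapping[str, str]) -> Optional[datetime]:
    """Return the Sunset header as a datetime, or None if the header is absent."""
    for name, value in headers.items():
        if name.lower() == "sunset":  # header names are case-insensitive
            return parsedate_to_datetime(value)
    return None


def is_past_sunset(headers: Mapping[str, str], now: Optional[datetime] = None) -> bool:
    """True if a Sunset date is present and has already passed."""
    when = sunset_date(headers)
    if when is None:
        return False
    # HTTP-dates end in "GMT", so `when` is timezone-aware; compare in UTC.
    now = now or datetime.now(timezone.utc)
    return now >= when


if __name__ == "__main__":
    # Hypothetical header value, for illustration only.
    print(sunset_date({"Sunset": "Tue, 01 Jan 2030 00:00:00 GMT"}))
```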

Conformance

A consumer of BeaconEvent v1 MUST:

  1. Validate `schema_version == "1"`.
  2. Tolerate unknown `vendor` values gracefully (drop or quarantine).
  3. Treat `unknown` consent state as advisory, never verified.
  4. Surface `events_truncated_note: true` to the user when present.
  5. Never infer a tracking problem from `vendor: "unknown"`; infer
     only from matches against the signature table.
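
Rules 1, 2, 4, and 5 might be wired together along these lines. This is a sketch under assumptions: the routing labels (`check`, `no_inference`, `quarantine`), the function names, and the idea that `events_truncated_note` sits on a report-level dict are all invented for illustration and are not defined by the spec.

```python
from typing import List

# The 9 vendors in the v1 signature table ("unknown" is handled separately).
SIGNATURE_VENDORS = frozenset({
    "ga4", "gtm", "ua_legacy", "gtag_loader", "meta",
    "tiktok", "tiktok_loader", "linkedin", "google_ads",
})


def triage_event(event: dict) -> str:
    """Route one event: "check", "no_inference", or "quarantine"."""
    if event.get("schema_version") != "1":  # rule 1
        raise ValueError("unsupported schema_version")
    vendor = event.get("vendor")
    if vendor in SIGNATURE_VENDORS:
        return "check"
    if vendor == "unknown":
        return "no_inference"  # rule 5: never the basis for a tracking problem
    return "quarantine"  # rule 2: unrecognized vendor value, e.g. from a later v1.x


def user_notices(report: dict) -> List[str]:
    """Rule 4: surface truncation to the user when the flag is present."""
    notices = []
    if report.get("events_truncated_note"):
        notices.append("Event capture was truncated for this page.")
    return notices


if __name__ == "__main__":
    print(triage_event({"schema_version": "1", "vendor": "ga4"}))
    print(user_notices({"events_truncated_note": True}))
```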

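Authoring a new consumer

    import json
    from pathlib import Path

    import jsonschema

    # Load the canonical schema once.
    schema = json.loads(Path("beacon_event.schema.json").read_text(encoding="utf-8"))

    def validate_event(event: dict) -> None:
        """Raise jsonschema.ValidationError if `event` is not a valid BeaconEvent v1."""
        jsonschema.validate(event, schema)

The schema is a single file; the signature table is a single
YAML file. Both are checked into the repo and shipped as part of
the public SDK.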