Blog
Notes on AI QA: practical testing guides, the engineering behind Prufa, and QA for the agentic era. The product is machine-verified, and the writing is held to the same standard: no claims we can't back.
-
QA for vibe-coded apps: what actually breaks
What actually breaks in vibe-coded web apps, in what order, and what to do before launch. Data from 49 fresh launches, June 2026.
-
llms.txt in 2026: who actually reads it (and why we shipped one anyway)
Major AI crawlers don't fetch llms.txt — coding agents do. The honest state of the spec in 2026, and the 30-minute version that's still worth shipping.
-
We audited 49 Show HN launches. 38 had a critical bug on day one.
Free QA audits on 49 fresh Show HN launches: 78% had a critical finding. The full breakdown of what actually breaks at launch — analytics, links, cookies.
-
Website QA checklist before launch, ordered by what actually breaks
A pre-launch QA checklist ordered by real failure data from 49 audited launches: analytics first, then forms, links, SEO basics, consent, and user flows.
-
Why Prufa exists: QA built for the agentic era
AI writes 10× more code than humans can validate. The QA gap isn't a missing test suite — it's surface area. Here's the architecture we bet on.
-
How Prufa verifies a signup flow (and why the LLM never grades the result)
One signup run, end to end: an LLM-backed agent drives the browser, plain code grades the outcome against a public flow-spec. Same input, same verdict.