Critical Document Audit

Paste or upload a policy, contract, incident report, or medical note. Extract findings, then audit the evidence spans with HLM3-Mix signals.

Status

The demo is a ready-to-run browser workflow. The default checkpoint is the validated HLM3-Mix 35M K=16 (validation PPL 10.66); the all-models edition exposes the bundle's broader catalog of language checkpoints as selectable audit models.

What the demo proves

The same Hopfield core that drives language generation produces span-level audit signals over a document:

Per-finding PPL, entropy, and prefix delta for each evidence span
A document hash so the audit can be replayed
A clear fragile / review / stable status per finding
Saved JSON and Markdown reports for downstream review
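
The signals listed above can be sketched in a few lines of Python. This is a minimal illustration and not the demo's implementation: it assumes the model exposes per-token log-probabilities and predictive distributions for each evidence span. The definition of prefix delta used here (span PPL scored in isolation minus span PPL scored after the preceding document text), the status thresholds, and the report field names are all assumptions made for the example.

```python
import hashlib
import json
import math


def span_perplexity(token_logprobs):
    """Perplexity of a span: exp of the mean negative log-likelihood (natural log)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def mean_entropy(token_distributions):
    """Mean Shannon entropy in nats over the per-token predictive distributions."""
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0)
        for dist in token_distributions
    ]
    return sum(entropies) / len(entropies)


def prefix_delta(logprobs_without_prefix, logprobs_with_prefix):
    """ASSUMED definition: span PPL scored in isolation minus span PPL scored
    after the preceding document text. Positive means the prefix helps."""
    return span_perplexity(logprobs_without_prefix) - span_perplexity(logprobs_with_prefix)


def finding_status(ppl, review_at=20.0, fragile_at=50.0):
    """Map PPL to a status. Both thresholds are hypothetical placeholders."""
    if ppl >= fragile_at:
        return "fragile"
    if ppl >= review_at:
        return "review"
    return "stable"


def document_hash(text):
    """SHA-256 of the UTF-8 document text, so a replay can confirm the input."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_report(document, findings):
    """Assemble a JSON-serializable audit record (field names are illustrative)."""
    return {"document_sha256": document_hash(document), "findings": findings}


# Example with made-up log-probabilities for one three-token evidence span.
with_prefix = [math.log(0.5), math.log(0.5), math.log(0.5)]
without_prefix = [math.log(0.25), math.log(0.25), math.log(0.25)]
ppl = span_perplexity(with_prefix)
report = build_report(
    "Either party may terminate this agreement.",
    [{
        "span": "terminate this agreement",
        "ppl": ppl,
        "entropy": mean_entropy([[0.5, 0.5]] * 3),
        "prefix_delta": prefix_delta(without_prefix, with_prefix),
        "status": finding_status(ppl),
    }],
)
print(json.dumps(report, indent=2))
```

Hashing the exact input text is what makes a replay checkable: a later run can recompute the hash and refuse to compare results if the document has changed.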

What the reviewer sees

A profile selector (contract / policy / incident / medical)
The original document with findings extracted by deterministic rules
Each finding's evidence span audited by HLM3-Mix
A downloadable JSON of the full audit run
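
As a rough illustration of profile-driven, deterministic extraction, the sketch below uses regular expressions keyed by profile and records each finding's character offsets as its evidence span. The rule IDs, the patterns, and the `Finding` type are hypothetical; the demo's actual rules are not documented here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str
    start: int  # character offset where the evidence span begins
    end: int    # character offset one past the end of the span
    text: str


# Hypothetical rules: one or two illustrative patterns per profile.
RULES = {
    "contract": [
        ("termination", re.compile(r"\bterminat\w*\b[^.]*\.", re.IGNORECASE)),
        ("liability", re.compile(r"\bliab\w*\b[^.]*\.", re.IGNORECASE)),
    ],
    "policy": [("obligation", re.compile(r"\b(?:must|shall)\b[^.]*\.", re.IGNORECASE))],
    "incident": [("root_cause", re.compile(r"\broot cause\b[^.]*\.", re.IGNORECASE))],
    "medical": [("allergy", re.compile(r"\ballerg\w*\b[^.]*\.", re.IGNORECASE))],
}


def extract_findings(document, profile):
    """Apply every rule for the profile; the same input always gives the same output."""
    findings = []
    for rule_id, pattern in RULES[profile]:
        for match in pattern.finditer(document):
            findings.append(Finding(rule_id, match.start(), match.end(), match.group(0)))
    # Sort by position so the report order does not depend on rule order.
    return sorted(findings, key=lambda f: (f.start, f.rule_id))


doc = "Either party may terminate this agreement with 30 days notice. Payment is due monthly."
for finding in extract_findings(doc, "contract"):
    print(finding.rule_id, finding.start, finding.end, repr(finding.text))
```

Keeping the offsets alongside the matched text lets the audit step score exactly the span the rule flagged, and lets a reviewer jump back to it in the original document.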

Two editions

Edition	Models available	Use it for
Default	HLM3-Mix 35M K=16 only	Reliable client demo with the validated checkpoint
All-models	The full packaged checkpoint catalog	Side-by-side audit behavior across the bundle

The all-models edition uses the same audit pipeline; it adds a sidebar checkpoint selector with the tier badge and caveat for each option. Non-validated checkpoints surface a clear warning.
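
One way to picture the selector is as a small catalog in which every entry carries its tier and caveat, with the edition deciding which entries are offered. The sketch below is an assumption-laden illustration: the tier names, the `Checkpoint` type, the caveat wording, and the second catalog entry are hypothetical placeholders, and only the validated checkpoint name comes from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    name: str
    tier: str    # assumed tier names: "validated" or "experimental"
    caveat: str  # short text shown next to the option


CATALOG = [
    Checkpoint("HLM3-Mix 35M K=16", "validated", "Quality reference."),
    # Placeholder entry, not a real checkpoint name from the bundle.
    Checkpoint("example-experimental", "experimental", "Diagnostic only; not a quality claim."),
]


def models_for_edition(edition, catalog=CATALOG):
    """Default edition offers only validated entries; all-models offers everything."""
    if edition == "default":
        return [cp for cp in catalog if cp.tier == "validated"]
    if edition == "all-models":
        return list(catalog)
    raise ValueError(f"unknown edition: {edition!r}")


def selector_label(cp):
    """Text for a sidebar option, with the tier shown as a badge."""
    return f"[{cp.tier}] {cp.name}"


def warning_for(cp):
    """Return a warning string for non-validated checkpoints, else None."""
    if cp.tier == "validated":
        return None
    return f"Warning: {cp.name} is not validated. {cp.caveat}"


for cp in models_for_edition("all-models"):
    print(selector_label(cp), "|", warning_for(cp))
```

Because both editions read the same catalog, the audit pipeline does not need to know which edition is running; only the list of selectable entries changes.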

What "all-models" reveals

Selecting different checkpoints lets a reviewer see how the audit signals shift with the underlying model. The validated checkpoint is the quality reference; other entries in the bundle are diagnostic surfaces with explicit in-UI caveats so a reviewer cannot mistake an experimental artifact for a quality claim.

Caveats

The deterministic rules cover four profiles; broader risk schemas need a partner integration.
Larger checkpoints in the bundle may require substantial VRAM. The UI surfaces this requirement and offers a CPU fallback.
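
The fallback decision can be sketched as a pure function that compares a checkpoint's memory estimate with the free VRAM reported by the host. The function name, the safety margin, and the message text are hypothetical; how the demo actually measures memory is not described on this page.

```python
def choose_device(required_gib, free_vram_gib, margin_gib=0.5):
    """Pick a device for a checkpoint and return (device, message).

    required_gib: estimated memory the checkpoint needs (assumed to be known).
    free_vram_gib: free GPU memory, or None when no GPU is present.
    margin_gib: hypothetical headroom kept free for activations.
    """
    if free_vram_gib is None:
        return "cpu", "No GPU detected; running on CPU."
    if required_gib + margin_gib <= free_vram_gib:
        return "cuda", f"Using GPU ({required_gib:.1f} GiB needed, {free_vram_gib:.1f} GiB free)."
    return "cpu", (
        f"Checkpoint needs about {required_gib:.1f} GiB but only "
        f"{free_vram_gib:.1f} GiB of VRAM is free; falling back to CPU."
    )


print(choose_device(0.3, 8.0))
print(choose_device(12.0, 8.0))
print(choose_device(0.3, None))
```

Returning the message together with the device keeps the warning the reviewer sees in step with the decision that was actually made.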

Where it fits

This is the right demo for compliance, legal, clinical, or operations reviewers who care more about evidence traceability than about single-answer generation.

Related

Critical AI Audit Showcase: automated multi-document variant with perturbation comparison
Use case: EU AI Act compliance (narrative deployment for regulated decisioning)
Validation Summary: public benchmark numbers
HLM3-Mix Model Lab: prompt-test the same catalog
