Verifier-grounded training data

Train on reality, not noise

Model capability breakthroughs require data breakthroughs.

Position

Four Convictions

01

Data is key.

Progress in model performance is gated on high-quality data.

02

Human annotators don't scale.

Human-generated data is important, but collecting it is slow and expensive. More importantly, superintelligence won't come from mimicking humans.

03

Verifiers as oracles.

Formal proof systems, simulators, executable tests, and oracle databases can produce more trustworthy data than human annotators can.

04

Enhanced natural language.

Models pre-trained on noisy web-scraped text need information-dense supplements about the natural world. We're projecting the laws of physics, biological facts, self-consistent logic, and more into natural language.

Team

We're researchers and engineers

Our background is in foundation model training across LLMs, image, video, and speech. We're selective about the engagements we take on.

Open a pilot conversation