Epoch AI and the AI safety organization METR published the full results of MirrorCode on June 26, 2026, a benchmark that addresses a question the field has been unable to answer cleanly: how much ...