A five-character fix turned a failing Lighthouse Agentic Browsing audit into a clean pass. What that reveals about what the audit actually measures.
Menell] have shown that AI Large Language Models (LLMs) can fail to correctly distinguish between different instruction ...