The model learns that hedging is a signal of lower-quality output. This creates a systematic bias toward sounding certain.
I see the issue clearly from your screenshot.