Self-Distillation Silences Uncertainty and Kills Math Reasoning

Based on research by Jeonghye Kim, Xufang Luo, Minbeom Kim, Sangmook Lee, and Dohyung Kim

Language models trained with self-distillation improve quickly, but a new study reveals a dangerous side effect: they learn to hide their doubts. The technique shortens reasoning traces, making outputs more concise, yet at a steep cost to mathematical accuracy: when the researchers applied it to major model families such as Qwen and DeepSeek, math performance dropped by up to 40%. The culprit is not simple forgetting but the active suppression of "epistemic verbalization", the natural way a model voices uncertainty while working through a problem. Forced to sound confident even when it has no solid basis for an answer, the model becomes brittle and breaks down on problems it has never seen. The shortcut works well on familiar, in-distribution data, but it destroys the flexibility that general-purpose reasoning requires. The takeaway: optimizing for speed and confidence alone is a trap; genuine robustness requires letting a model admit when it does not know.
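For readers who want a concrete picture of the training loop in question, here is a minimal sketch of one common self-distillation recipe: rejection-sampling fine-tuning, where the model is tuned on its own shortest correct traces. The paper's exact pipeline may differ; the model name, the shortest-trace selection rule, and the substring answer check below are illustrative assumptions, not the authors' implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical setup: any causal LM works here; Qwen is one of the
# model families the post mentions.
model_name = "Qwen/Qwen2.5-1.5B-Instruct"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

def self_distill_step(problem: str, reference_answer: str, n_samples: int = 8):
    """One self-distillation update: sample traces, keep the shortest
    correct one, and fine-tune the model on it."""
    # 1. Sample several candidate reasoning traces from the current model.
    inputs = tok(problem, return_tensors="pt")
    outs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.8,
        max_new_tokens=512,
        num_return_sequences=n_samples,
    )
    prompt_len = inputs["input_ids"].shape[1]
    traces = [tok.decode(o[prompt_len:], skip_special_tokens=True) for o in outs]

    # 2. Keep only traces that reach the reference answer (crude substring
    #    check, for illustration). Selecting the *shortest* winner is what
    #    drives traces to shrink, and, per the paper, what squeezes out
    #    hedging phrases like "let me double-check".
    correct = [t for t in traces if reference_answer in t]
    if not correct:
        return None  # no usable training target this round
    target = min(correct, key=len)

    # 3. Standard causal-LM fine-tuning on the model's own output.
    #    (In practice the prompt tokens are usually masked out of the loss.)
    batch = tok(problem + target, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Note what this loop never does: it never rewards a trace for saying "I'm not sure" or "let me verify that". Uncertainty language is selected out by construction, which is exactly the suppression of epistemic verbalization the study identifies as the failure mode.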

Source: "Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?" by Jeonghye Kim, Xufang Luo, Minbeom Kim, Sangmook Lee, and Dohyung Kim (arXiv:2603.24472, https://arxiv.org/abs/2603.24472)

This post was generated by staik AI based on the academic publication above.