Cognition(@cognition_labs)

The second technique was two-phase post-training. We first trained purely for capability, then added...

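AI in-depth summary
  • Two-phase training builds capability first and optimizes latency second, and works better than joint training
  • The latency penalty is calibrated from the CDF of how long real users stay on SWE-check
  • Joint training tends to trap the model in shallow-but-fast local optima
#LLM #PostTraining #LatencyOptimization #AIEngineering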

Cognition

@cognition

The second technique was two-phase post-training. We first trained purely for capability, then added a latency penalty calibrated from real dogfooding data based on the CDF of how long users stay on SWE-check before switching off. Training capability and latency jointly from the start caused the model to collapse into shallow-but-fast local optima; separating the phases let it build real bug-detection skill first and then learn to compress it.
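
To make the idea concrete, here is a minimal Python sketch of one way a CDF-calibrated latency penalty and a two-phase reward could fit together. It is not Cognition's implementation: the post does not give the penalty's functional form, so this sketch assumes the penalty is the fraction of users who would already have switched away by the time the answer arrives. The names `empirical_cdf`, `shaped_reward`, and `weight`, and the dwell-time samples, are all hypothetical.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def empirical_cdf(samples: Sequence[float]) -> Callable[[float], float]:
    """Build F(t) = fraction of observed dwell times that are <= t."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one dwell-time sample")

    def cdf(t: float) -> float:
        # bisect_right counts how many samples are <= t.
        return bisect_right(ordered, t) / n

    return cdf


def shaped_reward(
    capability: float,
    latency_s: float,
    cdf: Callable[[float], float],
    phase: int,
    weight: float = 0.5,  # hypothetical penalty strength, not from the post
) -> float:
    """Phase 1: capability only. Phase 2: capability minus a latency penalty.

    Assumption: the penalty is weight * F(latency), i.e. proportional to the
    share of users who would have left before the result came back.
    """
    if phase == 1:
        return capability
    return capability - weight * cdf(latency_s)


# Hypothetical dwell times (seconds) before users switch away.
dwell_times = [2.0, 4.0, 6.0, 8.0]
cdf = empirical_cdf(dwell_times)

phase1 = shaped_reward(1.0, 5.0, cdf, phase=1)  # 1.0, latency ignored
phase2 = shaped_reward(1.0, 5.0, cdf, phase=2)  # 1.0 - 0.5 * 0.5 = 0.75
```

Under this assumption the penalty is bounded in [0, `weight`] and grows fastest where most users actually give up, so the phase-two reward pushes on latency in the range that matters to users instead of penalizing every second equally.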

![Image 2: Image](https://x.com/cognition/status/2044174520312049873/photo/1)

10:02 PM · Apr 14, 2026

4,712 Views