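# Goblin and related magical mentions were overrewarded in training, and the behavior was reinforced o...

- Canonical URL: https://www.traeai.com/articles/e3e74194-0fc6-4df4-81a2-250590a89828
- Original source: https://x.com/OpenAI/status/2049690541269688478
- Source name: OpenAI(@OpenAI)
- Content type: tweet
- Language: English
- Score: 6.0
- Reading time: 2 minutes
- Published: 2026-04-30T03:21:21+00:00
- Tags: AI, Machine Learning, OpenAI

## Summary

OpenAI found that mentions of goblins and related magical creatures were over-rewarded during training, and that the behavior was reinforced over successive models. To address this, it removed the goblin-related reward signal and filtered training data in which creatures appeared in irrelevant contexts.

## Key Takeaways

- Mentions of goblins and other magical creatures were over-rewarded during training.
- OpenAI has removed the goblin-related reward signal.
- Training data in which creatures appeared in irrelevant contexts has been filtered.

## Outline

- Introduction: the problem OpenAI encountered during training and how it responded.
- Problem description: how mentions of goblins and other magical creatures came to be over-rewarded in training.
- Solution: how removing the specific reward signal and filtering the data addresses the problem.

## Highlights

- > Goblin and related magical mentions were overrewarded in training, and the behavior was reinforced over successive models. (paragraph 1)
- > We removed the goblin-affine reward signal for future models, and filtered training data where creatures appeared in irrelevant contexts. (paragraph 1)

## Citation Guidance

When citing this item, prefer the canonical traeai article URL for the AI-readable summary and include the original source URL when discussing the underlying source material.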