# Research we co-authored on subliminal learning—how LLMs can pass on traits like preferences or misal...

Canonical URL: https://www.traeai.com/articles/b2f9f290-e35e-4b9a-9a6b-053fedb3755f
Original source: https://x.com/AnthropicAI/status/2044493337835802948
Source name: Anthropic(@AnthropicAI)
Content type: tweet
Language: English
Score: 7.5
Reading time: 2 minutes
Published: 2026-04-15T19:09:31+00:00
Tags: large language models, AI safety, subliminal learning, model alignment, Nature

## Summary

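Anthropic and its collaborators have published a paper in *Nature* showing that large language models can pass on traits such as preferences or misalignment through hidden signals in data.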

## Key Takeaways

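- LLMs can transmit specific preferences through seemingly unrelated data, such as meaningless numbers
- The phenomenon is called "subliminal learning" and may affect model alignment and safety
- The research is now formally published in *Nature*; the preprint was released earlier, in July 2025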

## Citation Guidance

When citing this item, prefer the canonical traeai article URL for the AI-readable summary and include the original source URL when discussing the underlying source material.