
People

Who is Jan Leike

Also known as: janleike

AI safety researcher, formerly a researcher at DeepMind, now focused on LLM interpretability and alignment.

Why is he worth watching now?

Recent changes

2026-05-08 · Jan Leike thanked the talented people he has worked with in AI alignment over the years

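When Jan Leike is mentioned repeatedly, it usually means his work is influencing product roadmaps, developer workflows, or judgments about the AI industry. This page consolidates scattered material into a single entry point that can be updated over time.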

📰 Latest on Jan Leike

4 AI news items and analyses related to "Jan Leike" have been collected.

I'm really excited about this as a new tool in our interpretability tool kit

NLAs are an unsupervised method that converts LLM internal states into human-readable text, which supports model transparency and safety auditing.

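Why it was selected: NLAs are an unsupervised technique that converts an LLM's internal activation vectors into natural-language descriptions.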

Featured · Tweet · #LLM #Interpretability #AI Safety #Anthropic · English
When I started to work on the alignment problem more than 10 years ago, we had no idea how AGI was g...

Jan Leike on X: The Evolution of AI Alignment Research Over a Decade

Jan Leike (@janleike) · 292 words (about 2 minutes)
75

Jan Leike reflects on how AI alignment research has changed over the past decade: from a niche field with only about 12 researchers and unclear methods to one now driven by RLHF, scalable oversight, and automated techniques like constitutional AI in models such as Claude.

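Why it was selected: Ten years ago, only about 12 people worked on AI alignment, and they did so as a side project with disorganized methods.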

Featured · Tweet · #AI Alignment #AGI #RLHF #Machine Learning · English

Jan Leike on X: Grateful for talented people in AI alignment

Jan Leike (@janleike) · 120 words (about 1 minute)
45

Jan Leike thanks talented collaborators in AI alignment, calling it a privilege to work with those deeply motivated to make the future better.

Why it was selected: Jan Leike thanked the talented people he has worked with in AI alignment over the years.

Featured · Tweet · #AI Alignment #OpenAI #Ethics · English

AI terms that frequently appear alongside "Jan Leike".

💡 Want to track long-term trends for "Jan Leike"? Go to Entity Radar · Jan Leike for detailed analysis and cross-source Q&A.

AI may generate inaccurate information. Please verify important content.