Person

Who is Joe Carlsmith?

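Q: What's new with Joe Carlsmith?

traeai has indexed 2 items related to Joe Carlsmith. The most recent is "What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah", published by the 80,000 Hours Podcast.

AI safety researcher who has argued that power-seeking AI poses an existential risk.

Why is he worth following now?

If you read only 3 items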

What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah

80,000 Hours Podcast · 9 points

Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmi...

Anthropic (@AnthropicAI) · 4.5 points

📰 Latest on Joe Carlsmith

2 AI news and analysis items related to "Joe Carlsmith" have been indexed.

What it's really like to run AGI safety at Google DeepMind (and where I disagree with 'doomers') | Rohin Shah

Rohin Shah on What It's Really Like to Run AGI Safety at Google DeepMind (and Where I Disagree with 'Doomers')

80,000 Hours Podcast · June 2 · 27,820 words (about 112 minutes)

Rohin Shah argues that while AGI safety risks deserve attention, catastrophic misalignment is not inevitable, and prosaic alignment techniques are likely sufficient to prevent worst-case outcomes, especially since current concerns like deception are not default behaviors in real training.

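Why it was selected: Rohin Shah argues that catastrophic AGI misalignment is not the default outcome, and that no sufficiently strong argument shows it is inevitable.

Featured · Podcast · #AGI #AI Safety #DeepMind #Alignment #Rohin Shah · English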

Claude's Constitution is now an audiobook, read by two of its authors, Amanda Askell and Joe Carlsmi...

Anthropic on X: 'Claude's Constitution is now an audiobook, read by two of its authors'

Anthropic (@AnthropicAI) · May 11 · 173 words (about 1 minute)

Anthropic released an audiobook version of Claude's Constitution, read by two of its authors, including a Q&A on the writing process and how it might change as models become more capable.

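Why it was selected: The audiobook of Claude's Constitution is read by two of its authors.

Featured · Tweet · #AI #Anthropic #Claude · Chinese

AI terms that frequently appear alongside "Joe Carlsmith".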

Google DeepMind · AI Safety · RLHF · Ajeya Cotra · AGI · Rohin Shah · Yudkowsky-style argument · Amanda Askell · Claude · Anthropic
