
Model

Claude Opus 4.8

Aliases: Opus 4.8, Claude Opus

The latest language model released by Anthropic.

Related materials

9 items related to Claude Opus 4.8 have been indexed, sorted by score.

Claude 4.8 makes a splash! Some capabilities surpass Mythos, with support for hundreds of parallel sub-agents

Claude Opus 4.8 launched: code defect omission rate reduced to 25% of Opus 4.7’s, hallucination probability dropped to 10%; new Dynamic Workflows enable hundreds of sub-agents in parallel—Bun migration case produced 750K lines of Rust with 99.8% test pass rate.

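Why it was selected: Opus 4.8's rate of missed code defects is only 25% of Opus 4.7's, and the probability of hard-coding answers dropped to one tenth.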

Featured · Article · #Claude #LLM #Agent Collaboration #Code Generation #Anthropic · Chinese

Claude Opus 4.8 is here. Is it as good as they say?

Lenny's Newsletter · 1002 words (about 5 minutes)
Score: 87

Opus 4.8 scores 69.2% on SWE-bench Pro, about 5 points above Opus 4.7 and about 10 above GPT-4.5, but real-world coding reveals persistent ‘last 10%’ failures and hallucinations; pricing is steep at $5/k input tokens.

Why it was selected: Opus 4.8 scores 69.2% on SWE-bench Pro, well ahead of Opus 4.7 (+5 pts), GPT-4.5 (+10 pts), and Gemini 3.1 (+15 pts).

Featured · Article · #Claude #LLM #Anthropic #AI coding #benchmark · English

llm-anthropic 0.25.1

Simon Willison's Weblog · 256 words (about 2 minutes)
Score: 85

llm-anthropic 0.25.1 has been released, adding the Claude Opus 4.8 model and a fast mode option, and improving the default maximum output token count.

Why it was selected: adds the Claude Opus 4.8 model, with somewhat improved performance.

Featured · Article · #Anthropic #LLM #Claude · English
[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode

Anthropic raised $65B in Series H at a $965B post-money valuation, with $47B annualized revenue; simultaneously launched Claude Opus 4.8 (fixing 4.7 issues, SOTA on economic benchmarks) and Dynamic Workflows (ultracode), enabling hundreds of parallel subagents for coding—demonstrated by rewriting 750k LOC of Bun in 6 days.

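Why it was selected: Anthropic's Series H raised $65 billion at a $965 billion post-money valuation, with annualized revenue of $47 billion (it was $9 billion in December 2025).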

Featured · Article · #Anthropic #Claude #LLM Funding #AI Programming #Dynamic Workflows · English

HackerNews Highlights: May 29, 2026

SuperTechFans · 13231 characters (about 53 minutes)
Score: 78

AI boosts white-collar productivity, sparking 4-day workweek proposals—but gains mostly captured by capital; YouTube auto-labels realistic AI videos; Opus 4.8 shows modest improvements, with community favoring GRAM-enhanced small models; LLM fact-checking remains inconsistent; Win10 can run SimCity 3000 at 4K.

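Why it was selected: AI-driven productivity gains have not noticeably improved ordinary developers' pay or time off and have instead deepened wealth concentration; policy and collective action by unions are needed to protect workers' interests.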

Featured · Article · #AI Ethics #Generative AI #LLM #Work Policy #Content Governance · Chinese

Anthropic Just Dropped Opus 4.8... (WOAH)

Matthew Berman · 4141 words (about 17 minutes)
Score: 78

Anthropic released Claude Opus 4.8, significantly improving performance: 69.2% on SWE-bench Pro (+5 pts vs 4.7), 2.5× faster inference (~250 tokens/sec), plus new dynamic workflows and long-horizon autonomy—all at the same price.

Why it was selected: Opus 4.8 reaches 69.2% on SWE-bench Pro, 5 percentage points above Opus 4.7, which was released six weeks earlier.

Featured · Video · #Anthropic #Claude #LLM #SWE-bench #AI coding · English

Claude Opus 4.8 Is Too Smart… and TOO HONEST

Wes Roth · 4700 words (about 19 minutes)
Score: 78

Claude Opus 4.8 introduces Ultra Code effort level and enhanced agents, enabling long-running sessions, hundreds of parallel sub-agents, output self-verification, and end-to-end codebase migrations across 100k+ lines; its ‘honesty’ manifests in disclosing limitations and hiding features like Ultra Code by default.

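Why it was selected: adds five effort levels (low to maximum) plus an Ultra Code mode; the latter must be enabled manually and is set to "odd" mode by default.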

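Featured · Video · #Claude #AI Agents #Ultra Code #LLM Engineering · English
Morning Brief | Apple iOS 27 interface revealed, Siri moves onto the Island too / Jensen Huang joins Tsinghua University / HarmonyOS ecosystem devices exceed 1.3 billion in total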

iOS 27 reveals dual-entry Siri with standalone app; Claude Opus 4.8 cuts fast-mode cost to one-third and reduces undetected code defects to 1/4 of Opus 4.7; HarmonyOS ecosystem exceeds 1.3 billion devices; DeepSeek suffered a 22-minute outage; Xiaomi ranks 7th globally in NEV sales, surpassing Volkswagen and Toyota.

Why it was selected: iOS 27 adds a ‘Search or Ask’ pull-down entry point that supports cross-app multi-step tasks and multimodal attachment uploads.

Featured · Article · #iOS #AI #HarmonyOS #Claude #NEV · Chinese

OPUS 4.8!!! (also maybe GPT5.6??)

Matthew Berman · 25152 words (about 101 minutes)
Score: 42

Anthropic released Claude Opus 4.8, claiming improved judgment, self-honesty, and longer autonomous task duration over 4.7—at the same price; however, the author tested it for only ~10 minutes with no benchmarks or technical details, and the content is a live-stream transcript with low information density.

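Why it was selected: Opus 4.8 is claimed to improve on 4.7 in judgment, self-honesty, and the length of time it can work independently, with pricing unchanged.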

Featured · Video · #Claude #Anthropic #LLM #Opus · English