Opus 4.8 (Fully Tested): Is IT ACTUALLY GOOD?
AICodeKing · 3,777 words (about 16 minutes)
Claude Opus 4.8 scores 87.14% (61/70) on the author's custom benchmark, significantly outperforming prior models. It adds Fast mode (2.5× speed at one-third the price), a High Effort default with X-High and Max options, dynamic workflows, in-stream system messages in the API, and a reported 4× improvement in coding honesty.
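Why it was selected: Opus 4.8 scored 61 out of 70 (87.14%) on the author's self-run 70-question benchmark, higher than mainstream models such as GPT-4.5 and Gemini 3.5 Flash.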
Featured · Video · #Claude #LLM #Anthropic #AI Coding #Benchmark · English
