Skywork Benchmark Results on the OpenClaw Environment

Skywork (@Skywork_ai), May 19, 2026

Score: 4.5

TL;DR · AI Summary

Skywork releases benchmark results for its AI models under the OpenClaw environment, claiming that v1.0 and v1.0-lite versions outperform Minimax 2.7, DeepSeek V4 Flash, and Qwen 3.6 in PinchBench, Claw-Eval, and Skywork-Claw-Bench tests, though specific performance data and detailed technical explanations are lacking.

Key Takeaways

Skywork conducts model evaluation in a self-constructed OpenClaw environment using high-quality tools and synthesized tasks derived from real user patterns.
Both v1.0 and v1.0-lite outperform Minimax 2.7, DeepSeek V4 Flash, and Qwen 3.6 35B A3B / 27B.
Claw-Eval achieves ^3 stability, though the specific meaning and data are not disclosed.

Outline

§Test Environment Introduction
Skywork conducts model evaluation in a self-constructed OpenClaw environment using high-quality tools and synthesized tasks derived from real user patterns.
§Benchmark Overview
Evaluation covers three test suites: PinchBench, Claw-Eval (with ^3 stability), and Skywork-Claw-Bench.
§Performance Comparison Results
Skywork v1.0 and v1.0-lite outperform Minimax 2.7, DeepSeek V4 Flash, and Qwen 3.6 35B A3B/27B in all tests.

Highlights

Built on a self-constructed OpenClaw environment with high-quality tools and synthesized tasks derived from real user patterns.
— Post content
Both v1.0 and v1.0-lite outperform Minimax 2.7, DeepSeek V4 Flash, and Qwen 3.6 35B A3B / 27B across PinchBench, Claw-Eval (with ^3 stability), and Skywork-Claw-Bench.
— Post content

#AI Model #Benchmark #Skywork #Performance Comparison #OpenClaw

Skywork

@Skywork_ai

Built on a self-constructed OpenClaw environment with high-quality tools and synthesized tasks derived from real user patterns. Across PinchBench, Claw-Eval (with ^3 stability), and Skywork-Claw-Bench, both v1.0 and v1.0-lite outperform Minimax 2.7, DeepSeek V4 Flash, and Qwen 3.6 35B A3B / 27B.

12:23 PM · May 19, 2026 · 4,145 Views
