Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

- Qwen3.6-27B outperforms the previous-generation Qwen3.5-397B-A17B on major coding benchmarks.
- The new model drops from 807GB to 55.6GB, with a quantized version at just 16.8GB.
- Run locally it generates complex SVG images impressively well, making it a good fit for resource-constrained environments.
22nd April 2026 - Link Blog
**Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model** ([via](https://news.ycombinator.com/item?id=47863217 "Hacker News")) Big claims from Qwen about their latest open weight model:
> Qwen3.6-27B delivers flagship-level agentic coding performance, surpassing the previous-generation open-source flagship Qwen3.5-397B-A17B (397B total / 17B active MoE) across all major coding benchmarks.
On Hugging Face Qwen3.5-397B-A17B is 807GB; this new Qwen3.6-27B is 55.6GB.
I tried it out with the 16.8GB Unsloth Qwen3.6-27B-GGUF:Q4_K_M quantized version and `llama-server`, using this recipe shared by benob on Hacker News, after first installing `llama-server` with `brew install llama.cpp`:
```bash
llama-server \
  -hf unsloth/Qwen3.6-27B-GGUF:Q4_K_M \
  --no-mmproj \
  --fit on \
  -np 1 \
  -c 65536 \
  --cache-ram 4096 -ctxcp 2 \
  --jinja \
  --temp 0.6 \
  --top-p 0.95 \
  --top-k 20 \
  --min-p 0.0 \
  --presence-penalty 0.0 \
  --repeat-penalty 1.0 \
  --reasoning on \
  --chat-template-kwargs '{"preserve_thinking": true}'
```
On first run that saved the ~17GB model to `~/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-GGUF`.
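Once running, `llama-server` exposes an OpenAI-compatible API. Here's a minimal sketch of sending the pelican prompt with `curl`, assuming the default port of 8080 (this example is mine, not part of benob's recipe):
```bash
# Hit the local llama-server's OpenAI-compatible chat endpoint
# (port 8080 is the llama-server default - adjust if you changed it)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Generate an SVG of a pelican riding a bicycle"}
    ]
  }'
```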
Here's the transcript for "Generate an SVG of a pelican riding a bicycle". This is an _outstanding_ result for a 16.8GB local model:

Performance numbers reported by `llama-server`:
- Prompt processing: 20 tokens, 0.4s, 54.32 tokens/s
- Generation: 4,444 tokens, 2min 53s, 25.57 tokens/s
For good measure, here's "Generate an SVG of a NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER" (a prompt I previously ran against GLM-5.1):

That one took 6,575 tokens in 4min 25s, at 24.74 tokens/s.
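As a quick sanity check, the reported generation rates follow from tokens divided by wall-clock seconds; the small differences come from the durations being rounded to whole seconds:
```bash
# Tokens / elapsed seconds should roughly match the reported rates
echo "scale=2; 4444 / (2*60 + 53)" | bc   # ~25.68 vs reported 25.57 tokens/s
echo "scale=2; 6575 / (4*60 + 25)" | bc   # ~24.81 vs reported 24.74 tokens/s
```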