Cohere (@cohere), April 22, 2026

For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on lo...

Score: 8.5


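AI in-depth summary

Short-context calibration is not enough for complex workloads.
Token masking excludes repetitive templates, which improves calibration.
Quantization-aware distillation (QAD) is introduced to match BF16 model quality.

#MachineLearning #Optimization #Cohere #AWQ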


Cohere on X: "For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on long internal agentic traces (up to 64k tokens) and added token masking in llm-compressor to exclude repetitive chat templates/tool descriptions from calibration stats. Plus QAD https://t.co/n8riV16WKc" / X


Cohere

@cohere

For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on long internal agentic traces (up to 64k tokens) and added token masking in llm-compressor to exclude repetitive chat templates/tool descriptions from calibration stats. Plus QAD (quant-aware distillation) to close the last gap — matching the quality of our BF16 MoE model with W4A8.
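The token-masking idea in the post can be illustrated with a small sketch. This is not llm-compressor's API and not Cohere's implementation; the function name `masked_channel_scale` and the toy data are hypothetical. It assumes the calibration statistic is the per-channel mean absolute activation, which is the kind of salience statistic AWQ-style methods collect, and shows how excluding template tokens changes it.

```python
import numpy as np

def masked_channel_scale(activations, keep_mask):
    """Mean absolute activation per channel, computed over kept tokens only.

    activations: (num_tokens, hidden_dim) array from one calibration sample.
    keep_mask:   (num_tokens,) boolean array; False marks tokens to exclude,
                 e.g. a repeated chat template or tool description.
    Hypothetical helper for illustration only.
    """
    activations = np.asarray(activations, dtype=np.float64)
    keep_mask = np.asarray(keep_mask, dtype=bool)
    if activations.shape[0] != keep_mask.shape[0]:
        raise ValueError("mask length must match number of tokens")
    kept = activations[keep_mask]
    if kept.shape[0] == 0:
        raise ValueError("mask excludes every token")
    return np.abs(kept).mean(axis=0)

# Toy trace: the first two tokens stand in for a repeated chat template
# that produces large activations on channel 0.
acts = np.array([[10.0, 0.0],
                 [10.0, 0.0],
                 [1.0, 2.0],
                 [-1.0, 4.0]])
mask = np.array([False, False, True, True])

unmasked = masked_channel_scale(acts, np.ones(4, dtype=bool))
masked = masked_channel_scale(acts, mask)
```

In this toy case the template tokens make channel 0 look like the most salient channel when no mask is applied, while the masked statistic reflects only the task-specific tokens.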

![Image 2: Image](http://x.com/cohere/status/2047052562793414687/photo/1)
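The post does not say which loss QAD uses. As a rough sketch under that caveat, a common distillation objective is the KL divergence between the full-precision teacher's token distribution and the quantized student's. The function names below are hypothetical, and the logits are made up for illustration.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def distillation_kl(teacher_logits, student_logits):
    """KL(teacher || student), averaged over tokens.

    teacher_logits: (num_tokens, vocab) logits from the BF16 model.
    student_logits: (num_tokens, vocab) logits from the quantized model.
    Assumed loss for illustration; the post does not specify one.
    """
    t = log_softmax(np.asarray(teacher_logits, dtype=np.float64))
    s = log_softmax(np.asarray(student_logits, dtype=np.float64))
    return float((np.exp(t) * (t - s)).sum(axis=-1).mean())

# Made-up logits for two tokens over a three-word vocabulary.
teacher = np.array([[2.0, 0.5, -1.0],
                    [0.0, 1.0, 3.0]])
student = np.array([[1.5, 0.7, -0.8],
                    [0.2, 0.9, 2.5]])

loss_same = distillation_kl(teacher, teacher)
loss_diff = distillation_kl(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution and positive otherwise, so minimizing it pushes the quantized model toward the BF16 model's outputs.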

8:38 PM · Apr 22, 2026

793 Views
