cohere(@cohere)

For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on lo...

Score: 8.5
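AI deep summary
  • Short-context calibration is not enough to meet the needs of complex workloads.
  • Token masking excludes repetitive templates from calibration, which improves calibration quality.
  • Quantization-aware distillation (QAD) is added to match the quality of the BF16 model.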
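
The post does not say how the QAD objective is defined. As a rough illustration only, a common way to set up distillation is to penalize the KL divergence between the next-token distributions of the full-precision teacher and the quantized student. The sketch below assumes that formulation; the function names and toy logits are hypothetical and are not taken from Cohere's training code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one token's vocabulary logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student).

    Assumption: the teacher is the BF16 model and the student is the
    quantized model; each argument is a list of per-token logit vectors.
    """
    per_token = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_token) / len(per_token)


# Toy example: two tokens, vocabulary of three entries.
teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
student = [[1.5, 0.7, -0.8], [0.0, 0.4, 2.5]]
loss = distillation_loss(teacher, student)
```

In an actual training loop this loss would be minimized with respect to the student's weights while the teacher stays frozen; the loss is zero when the two models produce identical distributions.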
#MachineLearning #Optimization #Cohere #AWQ

Cohere on X: "For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on long internal agentic traces (up to 64k tokens) and added token masking in llm-compressor to exclude repetitive chat templates/tool descriptions from calibration stats. Plus QAD https://t.co/n8riV16WKc" / X


Cohere

@cohere

For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on long internal agentic traces (up to 64k tokens) and added token masking in llm-compressor to exclude repetitive chat templates/tool descriptions from calibration stats. Plus QAD (quant-aware distillation) to close the last gap — matching the quality of our BF16 MoE model with W4A8.
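
To make the token-masking idea concrete: AWQ derives its per-channel scaling from the average activation magnitude seen during calibration, so tokens that repeat in every trace (chat templates, tool descriptions) can dominate those averages. The sketch below is a minimal pure-Python illustration of excluding masked token positions from that statistic. It does not use or reproduce the llm-compressor API; `build_keep_mask`, `masked_channel_magnitude`, and the span format are hypothetical names chosen for this example.

```python
def build_keep_mask(n_tokens, masked_spans):
    """Return a per-token list of booleans, False for positions to exclude.

    masked_spans is a list of (start, end) half-open index ranges, assumed
    here to mark chat-template or tool-description tokens.
    """
    keep = [True] * n_tokens
    for start, end in masked_spans:
        for i in range(max(0, start), min(n_tokens, end)):
            keep[i] = False
    return keep


def masked_channel_magnitude(activations, keep_mask):
    """Mean absolute activation per channel over the kept tokens only.

    activations is a list of per-token vectors (one float per channel).
    """
    kept = [row for row, keep in zip(activations, keep_mask) if keep]
    if not kept:
        raise ValueError("all tokens are masked; no calibration statistics")
    n_channels = len(kept[0])
    return [sum(abs(row[c]) for row in kept) / len(kept) for c in range(n_channels)]


# Toy trace: the first two tokens stand in for a repeated template.
acts = [[9.0, 0.1], [9.0, 0.1], [1.0, 2.0], [3.0, 4.0]]
mask = build_keep_mask(len(acts), [(0, 2)])
stats = masked_channel_magnitude(acts, mask)  # [2.0, 3.0]
```

Without the mask, the template tokens would pull the first channel's average well above what the task-specific tokens produce, which is the kind of skew the post describes removing from the calibration stats.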

![Image 2: Image](http://x.com/cohere/status/2047052562793414687/photo/1)

8:38 PM · Apr 22, 2026

