Benchmarking inference at scale: coding agents
The Together Inference Engine delivers 31% more tokens per second (TPS) than the next-fastest OSS engine on the same hardware and maintains 2× better time to first token (TTFT) at saturation. The performance gains come from full-stack optimization.
Why it was selected: ThunderMLA, custom kernel rewrites, and end-to-end optimization give the Together engine 31% more TPS than other OSS engines.
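
The summary does not say how the benchmark measured TPS and TTFT. As a rough sketch under common definitions (TTFT as the delay from sending a request to receiving its first output token, TPS as total output tokens divided by the wall-clock window), the two metrics could be computed from per-request timestamps as follows. The `RequestTrace` record and the sample numbers are hypothetical and are not taken from the benchmark.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Timestamps in seconds for one streamed request (hypothetical record format)."""
    sent_at: float
    token_times: List[float]  # arrival time of each output token


def ttft(trace: RequestTrace) -> float:
    """Time to first token: delay between sending the request and the first token."""
    return trace.token_times[0] - trace.sent_at


def aggregate_tps(traces: List[RequestTrace]) -> float:
    """Output tokens per second across all requests over the wall-clock window."""
    start = min(t.sent_at for t in traces)
    end = max(t.token_times[-1] for t in traces)
    total_tokens = sum(len(t.token_times) for t in traces)
    return total_tokens / (end - start)


# Made-up traces for illustration only: two concurrent requests, four tokens each.
traces = [
    RequestTrace(sent_at=0.0, token_times=[0.5, 1.0, 1.5, 2.0]),
    RequestTrace(sent_at=0.0, token_times=[1.0, 2.0, 3.0, 4.0]),
]

print("TTFT of first request (s):", ttft(traces[0]))
print("Aggregate TPS:", aggregate_tps(traces))
```

Under these definitions, "31% more TPS" would correspond to an aggregate TPS ratio of 1.31 between two engines run against the same workload and hardware.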
