How Zo Computer improved AI reliability 20x on Vercel

4 min read
Apr 17, 2026
## Zo Computer on Vercel
- 20x reduction in retry rate (7.5% → 0.34%)
- 99.93% chat success rate (up from 98%)
- P99 latency cut 38% (131s → 81s)
- New models added in less than 1 minute
Every company has servers that store data, run services, and do work around the clock. Consumers just have apps. Rob Cheung, co-founder of Zo Computer, is closing that gap. Zo is a personal AI cloud: your own servers and data that power an always-on agent.
"Cloud is one of the best computing models of all time, and consumers have zero direct access because it's so complicated," explained Rob Cheung, co-founder and CEO of Zo. "Now, with AI, it's finally possible for all of us to have cloud computers."
Zo is a full computing environment, not just a chatbot. Rob laughs about his mom running servers and databases without knowing it. People use Zo to manage small businesses, do research, organize finances, and track health data.
The 8-person company is two and a half years old and has an ambitious goal: to onboard one million new users to personal cloud computing in 2026. That means millions of AI model calls every day, and when Zo users text their agent like a friend, they expect the same responsiveness.
> We’re building a new model of personal cloud computing that's always-on, elastic, and private by default for every user. Vercel gives us the AI infrastructure to make it possible.
>
> **Rob Cheung,** co-founder and CEO @ Zo Computer
## Death by a thousand adapters
Zo gives users access to any model they want, and supports bring-your-own-key. That means their backend has to talk to every major provider: OpenAI, Anthropic, MiniMax, GLM, Fireworks, and more.
Before they moved to Vercel, that meant custom adapter code for each model. Every provider required different handling for images, different key management, and different edge cases. On top of the code complexity, Zo's team was managing retries, provider routing, and fallback logic themselves.
Every time a provider shipped a new model, an engineer had to write a new adapter, test the edge cases, and run the deployment pipeline. With new models released weekly, it was a constant drag on a small team building a consumer product, and their users felt it.
Zo's baseline for AI model calls was a 98% success rate with a 7.5% retry rate. That means roughly 1 in 50 messages failed outright and about 1 in 13 needed a retry, adding up to tens of thousands of model fallbacks every day.
> We didn't even know what we were missing until after we switched to Vercel's AI Gateway. The revelation came through the numbers. We just had so many failures previously.
>
> **Rob Cheung,** co-founder and CEO @ Zo Computer
## AI SDK + AI Gateway: two layers, one integration
Zo moved to Vercel's AI SDK and AI Gateway, which solved two distinct problems.
AI SDK replaced the custom adapter code. Instead of per-provider implementations with bespoke edge case handling, Zo's engineers got a unified interface for every model, from image support to response format normalization.
AI Gateway replaced the infrastructure-level complexity. Retries, fallback routing, provider health monitoring, and uptime were all handled at the routing layer in Vercel instead of in Zo's codebase.
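To make the routing-layer responsibilities concrete, here is a sketch of the kind of retry-and-fallback loop a team typically hand-rolls before moving to a gateway. This is illustrative only, not Zo's actual code; the `callModel` signature, provider ordering, and attempt counts are assumptions.

```typescript
// Sketch of hand-rolled retry-and-fallback logic that a routing layer
// like AI Gateway absorbs. Provider names here are placeholders.

type CallModel = (provider: string, prompt: string) => Promise<string>;

async function chatWithFallback(
  callModel: CallModel,
  providers: string[], // ordered by preference
  prompt: string,
  maxAttempts = 3 // retries per provider before falling through
): Promise<string> {
  let lastError: unknown;
  for (const provider of providers) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await callModel(provider, prompt);
      } catch (err) {
        lastError = err; // retry this provider, then try the next one
      }
    }
  }
  throw lastError; // every provider exhausted
}
```

Every branch of this loop is code the application team must test, monitor, and keep in sync with provider behavior; moving it into the gateway removes that surface entirely.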
Rob's co-founder built APIs at Stripe, where developer experience was the product. He describes the combined effect of AI SDK and AI Gateway the same way: everything just works, and the pieces you don't see matter most.
> Moving to the gateway is just so ergonomic. We get references to model names, and then rely on you to do the correct implementations and handle the edge cases.
>
> **Ben Guo,** co-founder @ Zo Computer
New model support went from an hour-long, multi-file code change to adding a config string in 30 seconds. The day MiniMax shipped M2.7, Zo had it live for users immediately. No adapter code, no edge case testing, no deploy cycle.
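In config-string terms, adding a model can look like appending one provider/model slug to an allowlist. The slugs and helper below are hypothetical, shown only to illustrate the shape of the change:

```typescript
// Hypothetical model allowlist: with a gateway handling per-provider
// adapters, "supporting" a new model is one string, not new adapter code.
const SUPPORTED_MODELS: string[] = [
  "anthropic/claude-sonnet-4",
  "openai/gpt-4o",
  "minimax/minimax-m2", // new model: one line, no deploy cycle for adapters
];

// Resolve a user's model choice against the allowlist.
function resolveModel(slug: string): string {
  if (!SUPPORTED_MODELS.includes(slug)) {
    throw new Error(`Unsupported model: ${slug}`);
  }
  return slug; // the gateway handles the provider-specific details
}
```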
For an 8-person team focusing on onboarding their first million users to personal cloud computing, cutting out interruptions for model support has been a huge relief.
## 20x improvement in reliability
During the rollout, Zo ran Vercel and non-Vercel routes simultaneously, creating a live A/B comparison under identical production conditions.
The results:
| **Period** | **Route** | **POST error** | **Chat success** | **Retry rate** | **Avg attempts** |
| --- | --- | --- | --- | --- | --- |
| Before switch | Non-Vercel | 4.59% | 99.73% | 7.52% | 1.12 |
| After switch | Non-Vercel | 10.38% | 97.86% | 17.07% | 1.29 |
| After switch | Vercel | 0.45% | 99.93% | 0.34% | 1.00 |
The non-Vercel route actually degraded during the same period that Vercel held steady. Retry rate dropped from 7.5% to 0.34%, a 20x improvement. Average attempts per chat hit 1.00, meaning virtually every request succeeded on the first try.
On MiniMax M2.5, Zo's most-used model, the latency improvement was significant. In an apples-to-apples comparison over the same window, Vercel handled 18,139 chats versus 21,105 on non-Vercel and still performed better across the board:
- Average latency improved 25.7%
- P95: 46s → 34s (25% improvement)
- P99: 131s → 81s (38% improvement)
For Zo's users, the P99 number matters most because they text their agents constantly throughout the day. A 131-second worst-case wait breaks that experience completely, but now 99% of requests complete in under 81 seconds.
> 131 seconds to wait for something is just terrible. Now we can get 99% of our requests in under 80, which is huge.
>
> **Rob Cheung,** co-founder and CEO @ Zo Computer
By the end of the test, 91.88% of Zo's traffic routed through Vercel, handling 3.3x larger context windows (42,500 average input tokens vs. 12,700) at a lower error rate than the non-Vercel path.
## Scaling to a million personal cloud owners
Vercel handles Zo's AI layer through AI SDK and AI Gateway and hosts their public-facing marketing site. With reliable AI infrastructure and no adapter code to maintain, the team can focus on the product instead of the plumbing.
With the pace of model developments in AI, Rob used to worry about the work required to keep up. “Now I don’t worry about it,” he said, “because with Vercel, the infrastructure just works.”
> We're a tiny team, so we want to spend our effort in the right ways. It's really nice to lean on Vercel and trust that we can add hundreds of times more traffic.
>
> **Rob Cheung,** co-founder and CEO @ Zo Computer
Zo Computer is a personal AI cloud platform that gives every user their own cloud computer, housing data, services, and a personal agent. Users interact through conversational interfaces like iMessage, or log in and use the environment directly. Founded two and a half years ago, Zo is an 8-person team based in New York City. Learn more at zo.computer.