Your Enterprise Data Deserves Better Than a Chatbot

Gradient Flow

June 3, 2026


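TL;DR · AI summary

Enterprise data governance should not rely on chatbots. Relational and time-series data are getting breakthroughs from specialized foundation models: KumoRFM-2 outperforms supervised and general-purpose foundation models with few labels, but high-stakes financial and healthcare use cases call for careful validation and governance.

Key points

KumoRFM-2 can predict on multi-table relational data with only a small number of labels, outperforming supervised baselines and general-purpose foundation models and significantly reducing the complexity of the data science pipeline.
Relational foundation models treat the database as a graph and predict directly on the raw tables, compressing the workflow.
In high-stakes domains (quantitative finance, credit, healthcare), validation, calibration, monitoring, and explainability review should come before any production deployment.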

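Outline

§ The paradigm shift in enterprise AI
A move from general-purpose chatbots to specialized foundation models for handling enterprise data.
· Relational foundation models
Kumo models the database as a graph and predicts directly on tables, compressing the workflow.
· KumoRFM-2 technical highlights
Predicts across multiple tables with only a few labels, beating supervised and general-purpose foundation model baselines.
· Use cases and risk tiers
Low-risk scenarios can be switched over directly; high-stakes domains need strict governance and validation.
· TabPFN and time-series foundation models
Prior Labs and time-series foundation models extend to more prediction tasks and reduce preprocessing needs.
· Engineering and governance practices
Emphasizes the need for validation, calibration, monitoring, explainability, and human governance.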

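Mind map

Enterprise data and foundation models
- Paradigm shift
  - From chatbots to specialized foundation models
- Relational data
  - Kumo graph modeling
    - Prediction directly on tables
    - Workflow compression
  - KumoRFM-2
    - Beats baselines with few labels
- Governance and risk
  - Low risk: fast replacement
  - High risk: strict governance
- Time series and TabPFN
  - Extension to prediction tasks
    - Reduced preprocessing needs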

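Highlights

KumoRFM-2 outperforms supervised baselines and general-purpose foundation models while needing only a small number of labels, delivering significant gains on relational tasks.
Source: body, paragraph 3
Treating the database as a graph lets the model work directly on the raw table structure, with no manual feature engineering or data flattening, which compresses the prediction workflow.
Source: body, paragraph 4
In high-stakes domains such as quantitative finance, regulated credit, and healthcare risk scoring, validation, calibration, and explainability review should come before any production deployment.
Source: body, paragraph 6
TabPFN targets row-and-column prediction tasks (such as churn, fraud, pricing, demand forecasting, predictive maintenance, and clinical risk) and significantly reduces preprocessing, feature engineering, and tuning steps.
Source: body, paragraph 7

#Kumo #KumoRFM-2 #TabPFN #FoundationModels #RelationalData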


Title: Your Enterprise Data Deserves Better Than a Chatbot

URL source: https://gradientflow.com/your-enterprise-data-deserves-better-than-a-chatbot/

Published: 2026-06-03T12:59:36+00:00

Markdown content:

Large language models and their multimodal variants remain the foundation models most people encounter first. That makes sense. Text, images, audio, and video cover a huge range of knowledge-work tasks, and today's chatbots are far more capable than the text-only systems many people first tried. But enterprise AI does not run on chat alone. It runs on tables, time series, transactions, telemetry, product catalogs, customer histories, service graphs, and messy operational data that rarely fits neatly into a prompt. Using current coding agents [for data science](https://thedataexchange.media/mikio-braun-2026-03/) makes this gap concrete. Generating Python and SQL is easy for an agent or an LLM, but data science requires a uniquely human skepticism to contextualize messy data and recognize when a result is simply too good to be true. We may finally be moving past the [TINA](https://www.investopedia.com/terms/t/tina-there-no-alternative.asp) phase of foundation models for enterprise AI, where the answer to every problem has been "just use an LLM." A new wave of frontier models looks more specialized and more useful for the prediction and decision problems that actually run businesses.

##### Structured and Semi-structured Data Get Their Foundation Model Moment

[Kumo's](https://kumo.ai/?utm_source=gradientflow&utm_medium=newsletter) relational foundation model takes a different route: it treats a database as a graph, where rows and tables become connected entities.

[…]
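To make the database-as-graph idea concrete, here is a minimal sketch in plain Python. It is not Kumo's implementation: the tables, the `build_graph` helper, and the `neighbor_amounts` function are hypothetical names invented for illustration, and the only assumption is that every row has an `id` primary key and that foreign keys are declared explicitly. Each row becomes a node, and each foreign-key reference becomes an edge between two rows.

```python
from collections import defaultdict

# Hypothetical toy tables; each row is a dict and "id" is the primary key.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}]
orders = [
    {"id": 10, "customer_id": 1, "amount": 25.0},
    {"id": 11, "customer_id": 1, "amount": 40.0},
    {"id": 12, "customer_id": 2, "amount": 15.0},
]


def build_graph(tables, foreign_keys):
    """Turn rows into nodes and foreign-key references into undirected edges.

    tables: {table_name: list of row dicts}
    foreign_keys: list of (child_table, fk_column, parent_table)
    """
    nodes = {}
    for table, rows in tables.items():
        for row in rows:
            nodes[(table, row["id"])] = row

    edges = defaultdict(set)
    for child, column, parent in foreign_keys:
        for row in tables[child]:
            src, dst = (child, row["id"]), (parent, row[column])
            if dst in nodes:  # skip dangling references
                edges[src].add(dst)
                edges[dst].add(src)
    return nodes, edges


nodes, edges = build_graph(
    {"customers": customers, "orders": orders},
    [("orders", "customer_id", "customers")],
)


def neighbor_amounts(node):
    """Collect order amounts from the order rows linked to a node."""
    return sorted(nodes[n]["amount"] for n in edges[node] if n[0] == "orders")


print(neighbor_amounts(("customers", 1)))
```

In this representation a customer's order history is simply its graph neighborhood, so a model that reads neighborhoods never needs the tables joined and flattened into one wide feature table first.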

The **Toto 2.0** release reinforces this narrative. Toto started as a time-series foundation model for observability data, and the new version turns it into a family of open-weights models that scale from 4M to 2.5B parameters. Datadog’s results suggest that larger Toto models continue to improve, run much faster than the first generation, and generalize beyond observability despite being trained largely on observability and synthetic data. For enterprise teams, the practical implication is that telemetry may become a richer prediction layer for incident detection, root-cause analysis, simulation, and agentic remediation. The world model effort is the next step in that direction, moving beyond forecasting individual streams toward a learned model of how distributed systems fail, recover, and respond to change.

##### AI Moves Closer to the Point of Work

The same push toward specialization applies to how models interact with people, not just what they predict. Thinking Machines is training what it calls an **“interaction model”** for continuous, two-way exchange across audio, video, and text, rather than the familiar turn-taking pattern where a user speaks, waits, and receives a finished response. The key design choice is that the model works in real time, listening, responding, and tracking visual cues while deeper reasoning and tool use happen in the background. The enterprise relevance is direct: customer support, field service, sales coaching, clinical workflows, industrial operations, and design reviews all involve interruption, demonstration, and real-time correction. These are not clean prompt-and-response tasks. In many settings, the best interface is less like filing a ticket and more like working alongside a capable colleague. This is still a research preview with real constraints around connectivity and session length, but the direction is one that product and platform teams should be tracking.

There is also a related wave of specialized interface models that expand where AI can be embedded in workflows. Google DeepMind’s AI-enabled pointer imagines assistance that follows users across applications, understands what is on screen, and accepts natural instructions like “fix this” or “move that” without requiring a carefully written prompt. OpenAI’s real-time voice models point in a similar direction, with separate models for live reasoning, translation, and streaming transcription. The common thread is that these are less about building a general chatbot and more about making AI useful at the point of work. Put those alongside the structured data and observability models covered earlier and a clearer picture emerges: enterprise AI is developing a stack of specialized models, from prediction engines on relational and telemetry data to real-time interaction layers, all coordinated around the workflows where value is actually created.

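##### Foundation Models Start to Specialize

World models are an emerging category I have not yet covered in depth. A lot is happening here, and some of it is genuinely interesting. But the world model work I am aware of is not yet focused on the enterprise workflows I have in mind. Recent conversations with Rhoda AI founder Changan Chen and Odyssey founder Jeff Hawke made this clear to me. Rhoda is applying video-native foundation models to robotics tasks such as decanting, returns processing, and container unloading. Odyssey is building interactive world simulations, with early uses mainly in games, robotics, media, and other visual environments. These are important directions, but they sit closer to physical systems and simulated worlds than to the operational data, prediction problems, and knowledge-work interfaces that dominate most enterprise AI roadmaps.

The bigger takeaway is that foundation models are starting to specialize in useful ways. LLMs and multimodal chatbots will remain central, but they will not carry the enterprise AI architecture alone. Enterprise AI will likely rely on a portfolio of more targeted models: relational models for structured data, time-series models for forecasting, observability models for production systems, interaction models for real-time work, and eventually world models where simulation and physical reasoning matter. The better endgame is not one giant model that handles everything, but a routing layer that reads the task and assigns it to the right model for the job.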
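The routing layer described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the `Task` and `Router` classes, the task kinds, and the handler functions are hypothetical stand-ins, and the task kind is supplied explicitly rather than inferred from the request, which a real router would have to do.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Task:
    kind: str      # e.g. "relational", "time_series", "text" (labels are illustrative)
    payload: dict


# Placeholder handlers: each stands in for a call to a specialized model.
def relational_model(task: Task) -> str:
    return "relational model"


def time_series_model(task: Task) -> str:
    return "time-series model"


def general_llm(task: Task) -> str:
    return "general LLM"


class Router:
    """Dispatch a task to the handler registered for its kind."""

    def __init__(self, default: Callable[[Task], str]) -> None:
        self._routes: Dict[str, Callable[[Task], str]] = {}
        self._default = default

    def register(self, kind: str, handler: Callable[[Task], str]) -> None:
        self._routes[kind] = handler

    def dispatch(self, task: Task) -> str:
        # Unknown task kinds fall back to the default handler.
        handler = self._routes.get(task.kind, self._default)
        return handler(task)


router = Router(default=general_llm)
router.register("relational", relational_model)
router.register("time_series", time_series_model)

print(router.dispatch(Task("time_series", {"series": [3, 5, 8]})))
print(router.dispatch(Task("summarize_email", {"text": "..."})))
```

Falling back to the general LLM for unrecognized tasks mirrors the article's point that chatbots stay central while specialized models take over the work they are better suited to.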