OpenAI WebRTC Audio Session, now with document context

Simon Willison's Weblog

TL;DR · AI Summary

OpenAI has released the GPT-Realtime-2 model, and this tool now supports voice conversations over a WebRTC session that draw on pasted document context.

Key Takeaways

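OpenAI introduced GPT-Realtime-2, which it promotes as its first voice model with GPT-5-class reasoning.
The tool uses the WebRTC API to run a voice conversation with document context in the browser.
GPT-Realtime-2 has a knowledge cut-off of September 30, 2024.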

Outline

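- Introduction: the author gives the background of the OpenAI WebRTC Audio Session tool and its latest update.
  - Release of GPT-Realtime-2: OpenAI introduced GPT-Realtime-2, which it promotes as having GPT-5-class reasoning.
    - Document context: users can now paste document context into a WebRTC session so the voice conversation can draw on it.
  - Use cases: the tool can be used to explore a document's contents through spoken, interactive discussion.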

Mindmap

OpenAI WebRTC Audio Session
- GPT-Realtime-2 model
  - GPT-5-class reasoning
  - Knowledge cut-off: September 30, 2024
- Document context
  - Paste document context into a WebRTC session

Highlights

OpenAI promotes GPT-Realtime-2 as its first voice model with GPT-5-class reasoning. (paragraph 2)
Users can now paste document context into a WebRTC session so the voice conversation can draw on it. (paragraph 4)
GPT-Realtime-2 has a knowledge cut-off of September 30, 2024. (paragraph 2)

#OpenAI #WebRTC #GPT-Realtime-2 #VoiceInteraction

12th June 2026 - Link Blog

[OpenAI WebRTC Audio Session, now with document context](https://tools.simonwillison.net/openai-webrtc). I built the first version of this tool in December 2024 to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.

Last month OpenAI introduced a brand new model to that API called GPT‑Realtime‑2, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.

I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.

You can now pick the better model, and you can also paste in a big chunk of document context so you can have an audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.
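
As a rough illustration of how pasted document context might reach the model, here is a minimal TypeScript sketch. It is not taken from the tool's source. It assumes the context is folded into the session instructions and sent as a `session.update` client event over the WebRTC data channel, which is my understanding of the Realtime API's event format and should be checked against OpenAI's documentation. The helper names `buildInstructions` and `buildSessionUpdate`, the `<document>` wrapper, the base prompt and the 20,000 character limit are all hypothetical.

```typescript
// Assumed event shape: a "session.update" client event carrying new
// instructions. Verify the exact schema against the Realtime API docs.
interface SessionUpdateEvent {
  type: "session.update";
  session: { instructions: string };
}

// Hypothetical base prompt used when no document is pasted.
const BASE_PROMPT = "You are a helpful voice assistant.";

// Trim the pasted text, clip it to maxChars, and wrap it so the model
// can tell the document apart from the rest of the instructions.
function buildInstructions(documentText: string, maxChars: number = 20000): string {
  const trimmed = documentText.trim();
  const clipped = trimmed.length > maxChars ? trimmed.slice(0, maxChars) : trimmed;
  if (clipped.length === 0) {
    return BASE_PROMPT;
  }
  return [
    BASE_PROMPT,
    "Answer questions using the document below when it is relevant.",
    "<document>",
    clipped,
    "</document>",
  ].join("\n");
}

// Build the event object; the caller serializes it with JSON.stringify.
function buildSessionUpdate(documentText: string, maxChars: number = 20000): SessionUpdateEvent {
  return {
    type: "session.update",
    session: { instructions: buildInstructions(documentText, maxChars) },
  };
}
```

In a browser, the event would be sent once the data channel is open, for example `dataChannel.send(JSON.stringify(buildSessionUpdate(pastedText)))`, where `dataChannel` is an `RTCDataChannel` created on the peer connection and `pastedText` is the contents of the paste box. Clipping the text is a guess at a sensible safeguard, since very large documents may exceed what the session accepts.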