DGX Spark 上私有本地 AI CUDA 编程辅助

NVIDIA Developer

NVIDIA Developer视频2026年5月29日

DGX Spark 上私有本地 AI CUDA 编程辅助

8.2内容质量

可直接观看的视频资源打开原视频

TL;DR · AI 摘要

Nsight Copilot 可在 DGX Spark 上本地离线运行，利用 128GB 显存部署 GPT OSS 12B NIM + CUDA RAG 管道，为 CUDA 开发者提供隐私安全、零云成本的 AI 编程辅助。

核心要点

Nsight Copilot 支持在 DGX Spark（128GB 显存）上本地部署 GPT OSS 12B NIM + CUDA RAG 管道，实现完全离线
其 autocomplete 模型为 NVIDIA 自研、专为 CUDA 训练，可在无网络环境下提供代码补全与问答能力。
相比主流 AI 编程工具缺乏高质量 CUDA 支持，Nsight Copilot 填补了该领域空白，已上架 VS Code Marketplace 与 Open

结构提纲

按章节快速跳转。

§Nsight Copilot 定位与核心价值
Nsight Copilot 是 NVIDIA 推出的专为 CUDA 开发者设计的 AI 编程助手，解决主流工具缺乏 CUDA 专业支持的问题。
·本地化部署方案：DGX Spark + NIM 蓝图
依托 DGX Spark 的 128GB 显存，可本地运行 GPT OSS 12B NIM 与 CUDA 专用 RAG 管道，实现数据不出域与零云推理成本。
·功能组成：聊天模型与自研自动补全模型
聊天模型基于 GPT OSS 12B NIM + CUDA RAG 提供问答与代码生成，自动补全模型为 NVIDIA 内部训练、专为 CUDA 优化。
·适用场景与发布状态
适用于对数据隐私或 IP 安全有高要求的组织，当前已上架 VS Code Marketplace 和 OpenVSX，支持在线与离线双模式。

思维导图

用一张图看清主题之间的关系。

查看大纲文本（无障碍 / 无 JS 友好）

Nsight Copilot 本地 CUDA AI 编程助手
- 核心能力
  - Chat 模型：GPT OSS 12B NIM + CUDA RAG
  - Autocomplete 模型：NVIDIA 自研、CUDA 专用
  - VS Code 插件（离线/在线双模式）
- 硬件依赖
  - DGX Spark（128GB 显存）
  - 本地 NIM 后端部署
- 核心优势
  - 数据隐私：全程本地处理
  - 成本节约：免云推理费用
  - 领域专精：唯一高质量 CUDA AI 助手

金句 / Highlights

值得收藏与分享的关键句。

Nsight Copilot 的 chat 模型使用 GPT OSS 12B NIM 在 CUDA intelligence RAG 管道中运行，确保响应高度适配 CUDA 开发需求。
— 第 0:39–0:48 段
⬇︎ 下载 PNG 𝕏 分享到 X
DGX Spark 的 128 GB 显存使本地运行完整 Nsight Copilot 蓝图成为可能，从而满足隐私敏感团队对数据不出域的要求。
— 第 1:27–1:35 段
⬇︎ 下载 PNG 𝕏 分享到 X
主流 AI 编程工具目前无法提供高质量 CUDA 编码辅助，Nsight Copilot 是首个专为 CUDA 优化的集成式 AI 助手。
— 第 0:48–0:54 段
⬇︎ 下载 PNG 𝕏 分享到 X

#CUDA#AI 编程助手#NVIDIA#本地大模型#DGX Spark

视频笔记

[0:00]没有 worries！以下是修正后的视频本来：

[0:02]CUDA 是 NVIDIA 的并行计算平台，Nsight Copilot 是一个AI代码助手，设计用于帮助 CUDA 开发人员成功。这里，我们将演示 Nsight Copilot**

[0:12]蓝色图示包含 LLMs 和后端 NIMs 运行在 DGX_spark 上。

[0:18]这针对开发人员和组织，他们需要隐私，其中数据永远不会离开他们的控制，可能希望避免云推断成本。

[0:21]您可以看到，我们 already launched the Nsight Copilot extension in vs code without an internet connection.

[0:28]我们将 prompt the chat model to answer CUDA questions and generate code. chat is using the GPT-3.5T NIM in a CUDA_intelligence RAG pipe to give responses specific to CUDA. And just to take a step back, high-quality CUDA coding assistance isn't available with the most popular AI coding tools today. Nsight Copilot for vs code is available on the vs code market place and openVX and is powered by DGX Cloud and enables developers to get help writing CUDA applications.

[1:00] While the online version of Nsight Copilot is great for CUDA coding assistance, it doesn't meet the needs of those who can't use a cloud-based tool for security or IP safety reasons.

[1:12] For in-editor assistance, Nsight Copilot has an autocomplete model we trained in house specifically for CUDA and we're able to run all of this locally because of the DGX Spark's 128 GB of memory. Nsight Copilot is our newest developer tool for CUDA and we're going to keep it up to date with the latest CUDA libraries and techniques.

[1:35]Thanks to DGX Spark, we're able to run the Nsight Copilot blueprint offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate cloud推断 costs.

[1:44] than to DGX Spark, we are able to run the Nsight Copilot blue print offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate cloud推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight Copilot blue print offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate cloud推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight Copilot blue print offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate cloud推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight Copilot blue print offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight Copilateral blue print offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight Copilateral blue print offline to help teams succeed with CUDA that need a higher level of privacy or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight Copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to eliminate club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to-run club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to-run club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to-run club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with CUDA that need a higher level of primary or want to-run club推断 costs. [1:44] than to DGX Spark, we are able to run the Nsight copilateral blue printoffline to help teams-run with])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])])