DeepInfra on Hugging Face Inference Providers 🔥

Authors: [Aray Sultanbekova](https://huggingface.co/araikin), [Shang-Pin](https://huggingface.co/shang-pin-deepinfra), [Utemuratov](https://huggingface.co/Pernekhan), [Yessen K](https://huggingface.co/yessenzhar), [Oguz Vuruskaner](https://huggingface.co/ovuruska), [Célina Hanouti](https://huggingface.co/celinah), [Simon Brandeis](https://huggingface.co/sbrandeis), [Lucain Pouget](https://huggingface.co/Wauplin)

  • [How it works](http://huggingface.co/blog/inference-providers-deepinfra#how-it-works "How it works")
  • [In the website UI](http://huggingface.co/blog/inference-providers-deepinfra#in-the-website-ui "In the website UI")
  • [From the client SDKs](http://huggingface.co/blog/inference-providers-deepinfra#from-the-client-sdks "From the client SDKs")
  • [Billing](http://huggingface.co/blog/inference-providers-deepinfra#billing "Billing")
  • [Feedback and next steps](http://huggingface.co/blog/inference-providers-deepinfra#feedback-and-next-steps "Feedback and next steps")

![Image 9: banner image](https://huggingface.co/blog/assets/inference-providers/welcome-deepinfra.jpg)

We're thrilled to share that **DeepInfra** is now a supported Inference Provider on the Hugging Face Hub!

DeepInfra joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub's model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.

DeepInfra is a serverless AI inference platform offering some of the most cost-effective per-token pricing in the industry. With a catalog of over 100 models, DeepInfra makes it easy for developers to integrate a wide range of AI capabilities into their applications with minimal setup.

DeepInfra supports a broad spectrum of model types - from LLMs to text-to-image, text-to-video, embeddings, and more. As part of this initial integration, DeepInfra is launching support for **conversational and text-generation tasks** on Hugging Face, enabling access to popular open-weight LLMs such as DeepSeek V4, Kimi-K2.6, GLM-5.1, and many more. **Support for additional tasks** (text-to-image, text-to-video, embeddings, and more) will roll out soon!

Read more about how to use DeepInfra as an Inference Provider on its dedicated documentation page.

See the full list of models supported by DeepInfra here.

Follow DeepInfra on Hugging Face: https://huggingface.co/DeepInfra.

How it works

In the website UI

1. In your user account settings, you are able to:

  • Set your own API keys for the providers you've signed up with. If no custom key is set, your requests will be routed through HF.
  • Order providers by preference. This applies to the widget and code snippets in the model pages.
Image 10: Inference Providers

2. As mentioned, there are two modes when calling Inference Providers (see the sketch after this list):

  • Custom key (calls go directly to the inference provider, using your own API key for that provider)
  • Routed by HF (you don't need a token from the provider; charges are applied to your HF account rather than to the provider's account)
Image 11: Inference Providers

3. Model pages showcase third-party inference providers (the ones that are compatible with the current model, sorted by user preference)

Image 12: Inference Providers
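
For illustration, here is a minimal Python sketch of what these two modes look like from code, using `huggingface_hub`'s `InferenceClient`. It assumes a recent `huggingface_hub` release with the `provider` argument; the model ID is the one used in the snippets below, and the exact routing and billing behavior is the one described in the Inference Providers documentation.

```python
import os
from huggingface_hub import InferenceClient

# Mode 1 - routed by HF: authenticate with your Hugging Face token;
# usage is billed to your HF account.
routed = InferenceClient(provider="deepinfra", api_key=os.environ["HF_TOKEN"])

# Mode 2 - custom key: pass your own DeepInfra API key instead;
# calls go straight to DeepInfra and are billed on your DeepInfra account.
direct = InferenceClient(provider="deepinfra", api_key=os.environ["DEEPINFRA_API_KEY"])

completion = routed.chat_completion(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(completion.choices[0].message)
```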

From the client SDKs

DeepInfra is available through the Hugging Face SDKs - `huggingface_hub` (>= 1.11.2) for Python and `@huggingface/inference` for JavaScript.

The following examples show how to use DeepSeek V4 Pro through DeepInfra. Use a Hugging Face token to authenticate - the request will be routed to DeepInfra automatically.

#### From your favorite Agent Harness

Hugging Face Inference Providers are integrated in most Agent Harnesses - including Pi, OpenCode, Hermes Agents, OpenClaw, and more. This means you can plug DeepInfra-hosted models straight into your favorite tools without any extra glue code. Browse the full list of integrations here.

#### from Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that returns the nth Fibonacci number using memoization."
        }
    ],
)

print(completion.choices[0].message)
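
A note on the model string: the `:deepinfra` suffix pins the request to DeepInfra. If you drop the suffix, the router selects a provider for you based on the provider order configured in your account settings. A minimal variation of the call above (behavior as documented for the Inference Providers router):

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

# No ":deepinfra" suffix: the router picks a provider according to
# the provider order configured in your HF settings.
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message)
```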

#### from JS

import { OpenAI } from "openai";

const client = new OpenAI({
    baseURL: "https://router.huggingface.co/v1",
    apiKey: process.env.HF_TOKEN,
});

const chatCompletion = await client.chat.completions.create({
    model: "deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages: [
        {
            role: "user",
            content: "Write a Python function that returns the nth Fibonacci number using memoization.",
        },
    ],
});

console.log(chatCompletion.choices[0].message);

Billing

For direct requests, i.e. when you use your own API key from an inference provider, you are billed by the corresponding provider. For instance, if you use a DeepInfra API key, you're billed on your DeepInfra account.

For routed requests, i.e. when you authenticate via the Hugging Face Hub, you'll only pay the standard provider API rates. There's no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

**Important Note** ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥

Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.

We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!

Feedback and next steps

We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49
