traeai

Concept

RAG

Alias: Retrieval-Augmented Generation

Technique combining retrieval systems with generative models to improve response accuracy.

30 highly relevant items tracked

TraeAI Observations

Related materials

30 items related to RAG have been collected, sorted by score.

RAG Is Burning Money — I Built a Cost Control Layer to Fix It

Towards Data Science · 4995 words (about 20 min) · Score: 92

RAG systems often incur hidden costs due to context over-fetching, lack of caching, and no model routing; the author built a cost control layer using semantic caching (98.5% hit rate), query routing (81% requests shifted to low-cost models), and token-budget circuit breaking, achieving 85.8% cost reduction at 10k requests/day without quality loss.

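Why it was selected: Context over-fetching consumes an average of 350 extra tokens per query; at 10k requests/day this wastes $52.5/day (at $0.015 per 1K tokens).

Featured · Article · #RAG #Cost Optimization #Semantic Caching #Model Routing #LLM · English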
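
The waste figure in the selection note follows from simple arithmetic. The snippet below is a minimal sketch that reproduces it from the three quoted numbers; the function and constant names are illustrative and do not come from the article.

```python
# Figures quoted in the selection note above.
EXTRA_TOKENS_PER_QUERY = 350   # average over-fetched context tokens per query
REQUESTS_PER_DAY = 10_000
PRICE_PER_1K_TOKENS = 0.015    # USD per 1,000 tokens


def daily_overfetch_cost(extra_tokens, requests, price_per_1k):
    """Return the USD cost per day of tokens that were fetched but not needed."""
    wasted_tokens = extra_tokens * requests
    return wasted_tokens / 1000 * price_per_1k


cost = daily_overfetch_cost(
    EXTRA_TOKENS_PER_QUERY, REQUESTS_PER_DAY, PRICE_PER_1K_TOKENS
)
print(f"${cost:.2f} per day")  # prints: $52.50 per day
```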
Enterprise Document Intelligence: A Series on Building RAG Brick by Brick, from Minimal to Corpus scale

Enterprise RAG systems should focus on document understanding and business logic rather than stacking models and frameworks. Simple Python scripts often outperform complex production systems.

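Why it was selected: Most enterprise RAG deployments underperform because basic parsing and retrieval quality are poor.

Featured · Article · #RAG #Enterprise AI #Document Intelligence #Retrieval-Augmented Generation #LLM Applications · English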
From Regex to Vision Models: Which RAG Technique Fits Which Problem

Towards Data Science · 4997 words (about 20 min) · Score: 90

RAG techniques are not universal; choose based on document structure and query control: use regex for templated docs, LLMs for sarcasm detection in transcripts, and vision models for schematics.

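Why it was selected: Templated documents (such as insurance policies and bank statements) are well suited to field extraction with regular expressions, which avoids a costly RAG pipeline.

Featured · Article · #RAG #LLM #Document Intelligence #Vision Models #Enterprise AI · English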
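
As a rough illustration of the regex route for templated documents, the sketch below pulls fields out of a made-up policy excerpt. The template text, field labels, and patterns are all assumptions for illustration; the article's own patterns are not reproduced here.

```python
import re

# Hypothetical policy excerpt; labels and formats are assumed for illustration.
POLICY_TEXT = """\
Policy Number: AB-1234567
Insured Name: Jane Doe
Premium: $1,250.00
Effective Date: 2025-01-15
"""

# One anchored pattern per field; each captures the value after the label.
FIELD_PATTERNS = {
    "policy_number": re.compile(r"^Policy Number:[ \t]*([A-Z]{2}-\d{7})[ \t]*$", re.MULTILINE),
    "insured_name": re.compile(r"^Insured Name:[ \t]*(.+?)[ \t]*$", re.MULTILINE),
    "premium": re.compile(r"^Premium:[ \t]*\$([\d,]+\.\d{2})[ \t]*$", re.MULTILINE),
    "effective_date": re.compile(r"^Effective Date:[ \t]*(\d{4}-\d{2}-\d{2})[ \t]*$", re.MULTILINE),
}


def extract_fields(text):
    """Return field name -> captured value, or None when a field is not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


fields = extract_fields(POLICY_TEXT)
print(fields)
```

A missing field comes back as None, which gives a natural place to fall back to a heavier pipeline only for documents the template does not cover.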
RAG Is Not Machine Learning, and the ML Toolkit Solves the Wrong Problem

Towards Data Science · 6346 words (about 26 min) · Score: 87

The article argues that, despite its resemblance to ML, RAG is fundamentally a search system rather than a model, which makes hyperparameter tuning and embedding fine-tuning ineffective and misleading.

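Why it was selected: RAG solves a deterministic answer-lookup problem rather than predicting unknown outcomes, so it cannot be optimized with ML methods.

Featured · Article · #RAG #Machine Learning #Enterprise AI #Information Retrieval #LLM · English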

Proxy-Pointer RAG: Solving Entity and Relationship Sprawl in Large Knowledge Graphs

Towards Data Science · 3847 words (about 16 min) · Score: 87

Proxy-Pointer RAG reduces the computational cost of entity and relationship reconciliation in knowledge graphs by over 90% by preserving document structure, enabling millisecond-scale ingestion without full-graph traversal.

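Why it was selected: Proxy-Pointer RAG uses Skeleton Tree and Breadcrumb Injection techniques so that vector retrieval can pinpoint complete structural sections of a document rather than fragmented chunks.

Featured · Article · #RAG #Knowledge Graph #Proxy-Pointer #Entity Resolution #Vector Retrieval · English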
Baseline Enterprise RAG, From PDF to Highlighted Answer

Towards Data Science · 9383 words (about 38 min) · Score: 85

The article introduces a minimal implementation of baseline enterprise RAG from PDF to highlighted answer using 100 lines of Python code, covering document parsing, question parsing, retrieval, and generation, returning a JSON answer with citations and a highlighted PDF.

Why it was selected: The fastest way to understand RAG is to implement a minimal version that actually works.

Featured · Article · #RAG #PDF #Natural Language Processing · Chinese
Production RAG with LangChain & Vector Databases – Full Course

freeCodeCamp.org · 106526 words (about 427 min) · Score: 85

This article is a guide on how to transition from simple RAG (Retrieval-Augmented Generation) prototypes to production-grade systems. It emphasizes the challenges faced in scaling, debugging, and security and provides a comprehensive course that covers the entire RAG pipeline from vector database optimization and observability to advanced agentic and multimodal architectures. Through this course, readers will learn how to ensure that their AI applications are robust, secure, and ready for deploy

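Why it was selected: By addressing key challenges in scaling, debugging, and security, it turns simple RAG prototypes into production-grade systems.

Featured · Video · #RAG #LangChain #vector database #AI application #production-grade system · Chinese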
A lot of the "RAG is dead" arguments have some truth: traditional RAG is a poor fit for agentic work...

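Although traditional RAG has limitations when handling agentic workloads, introducing agentic RAG can address them effectively. Agentic RAG uses query routing, hybrid retrieval, retrieval evaluation, and multi-step retrieval to match the retrieval layer to the workload, improving system performance and reliability.

Why it was selected: When handling agentic workloads, traditional RAG suffers from single-shot retrieval, a mismatch between similarity and relevance, no retrieval quality checks, and a single retrieval strategy.

Featured · Tweet · #RAG #Agentic RAG #Retrieval-Augmented Generation #Artificial Intelligence #Machine Learning · Chinese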
Stanford’s AI Index Report 2026 meets the security reality in financial services

AI is becoming core infrastructure for financial services, but without security and data readiness, it accelerates risk as much as innovation. The Stanford report shows the key from pilot to production is data accessibility and governance.

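Why it was selected: AI use in financial services has moved from the pilot stage to production, while security threats are evolving at machine speed: attackers use AI to accelerate phishing, malware development, and social engineering, and attack response time has shrunk from days to minutes.

Featured · Article · #AI #Financial Services #Cybersecurity #Data Governance #Stanford AI Index · English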
The Hidden Skill Gap: Why Knowing SQL + Python Isn’t Enough Anymore

KDnuggets · 1477 words (about 6 min) · Score: 85

Data job requirements have shifted from SQL+Python basics to AI system building and data engineering capabilities, with LLM, RAG, data modeling and MLOps becoming new differentiators.

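Why it was selected: Among 2026 data job requirements, AI skills rank second, and one third of postings require hands-on experience with LLMs, RAG, or vector databases.

Featured · Article · #Data Science #AI Skills #Data Engineering #Career Development · English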
BestBlogs Weekly | Issue 95: The Full Deployment of Agent Engineering

Gino Notes · 7632 words (about 31 min) · Score: 85

Agent engineering is fully deployed, with Anthropic and OpenAI advancing tools toward production.

Why it was selected: Claude Code drops RAG indexing in favor of Agentic Search for code navigation.

Featured · Article · #Agent #Engineering #AI Tools · Chinese
Presentation: Accelerating LLM-Driven Developer Productivity at Zoox

Zoox systematically enhances developer productivity through an AI-driven platform called Cortex.

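Why it was selected: The Cortex platform integrates RAG, multimodal LLMs, and APIs to make documentation and development workflows intelligent.

Featured · Article · #AI #LLM #Developer Tools #Platform Architecture · English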

Hybrid Search and Re-Ranking in Production RAG

Towards Data Science · 3582 words (about 15 min) · Score: 85

The article discusses hybrid search and re-ranking techniques in production RAG systems, addressing the limitations of dense vector retrieval in specific technical queries.

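Why it was selected: Dense vector retrieval performs well on conceptual queries but falls short on specific technical queries.

Featured · Article · #RAG #Search Engine #Hybrid Search #Re-Ranking · Chinese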
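
One common way to combine dense and keyword results in a hybrid search is reciprocal rank fusion (RRF). The summary does not say which fusion method the article uses, so treat the sketch below as an assumed example; the document IDs and the k=60 constant are illustrative.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1 for the top hit.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["doc_a", "doc_b", "doc_c"]     # hypothetical vector-search results
keyword_hits = ["doc_c", "doc_a", "doc_d"]   # hypothetical keyword-search results

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)  # prints: ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that both retrievers agree on rise to the top, while a hit found by only one retriever still stays in the candidate list that a re-ranker would then reorder.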
How Miro uses Amazon Bedrock to boost software bug routing accuracy and improve time-to-resolution from days to hours

Miro built BugManager using Amazon Bedrock's RAG technology, improving software bug routing accuracy sixfold and cutting resolution time from days to hours.

Why it was selected: Using Amazon Bedrock's RAG technology, Miro reduced bug reassignments between teams sixfold.

Featured · Article · #Amazon Bedrock #RAG #Bug Triage #Miro #AI · English
Search shouldn’t stop at reranking.

Qdrant (@qdrant_engine) · 99 words (about 1 min) · Score: 85

Qdrant 1.17 introduces the first vector index-native relevance feedback mechanism, embedding relevance judgment directly into retrieval for improved efficiency and accuracy.

Why it was selected: Qdrant 1.17 is the first release to implement index-native relevance feedback for vector indexes.

Featured · Tweet · #vector search #RAG #Qdrant #AI retrieval · English
AI HOT 精选

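OncoAgent is an open-source, privacy-preserving clinical decision support system for oncology. It combines a dual-layer LLM architecture with a LangGraph topology, significantly improving the performance and safety of decision support systems.

Why it was selected: OncoAgent combines a dual-layer LLM architecture with a LangGraph topology.

Featured · Article · #oncology #multi-agent #LangGraph #RAG #QLoRA #AMD · Chinese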
RAG Is Blind to Time — I Built a Temporal Layer to Fix It in Production

Towards Data Science · 5126 words (about 21 min) · Score: 85

The article shows that RAG systems have no awareness of time and proposes adding a temporal layer so that outdated information stops surfacing, improving the timeliness of knowledge bases.

Why it was selected: RAG systems cannot tell how current a document is, so outdated content is shown first.

Featured · Article · #RAG #AI #Knowledge Base #Time Perception · Chinese
Agentic RAG Explained in 3 Levels of Difficulty

Machine Learning Mastery · 1374 words (about 6 min) · Score: 85

The article explains Agentic RAG at three levels of difficulty, contrasts it with the limitations of traditional RAG, and introduces how agent mechanisms improve information retrieval and generation.

Why it was selected: Traditional RAG cannot handle integrating information from multiple sources.

Featured · Article · #RAG #AI Agent #Information Retrieval · Chinese
Agentic Search for Context Engineering — Leonie Monigatti, Elastic

AI Engineer · 775 words (about 4 min) · Score: 85

Agentic Search enables AI agents to dynamically construct context, significantly improving LLM performance in complex tasks while reducing reliance on manual prompt engineering.

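Why it was selected: Agentic Search uses AI agents to automatically retrieve document fragments relevant to the task, raising accuracy to 87%.

Featured · Video · #AI Agent #Context Engineering #Search #RAG #Elastic · Chinese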
There are three common ways to chunk documents for RAG....

Three Common Ways to Chunk Documents for RAG and Selection Guide

Milvus (@milvusio) · 129 words (about 1 min) · Score: 82

RAG chunking strategies must match document types: use semantic chunking for tech docs, fixed-length with overlap for chats, and section-based splitting for APIs to avoid retrieval failures.

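Why it was selected: Fixed-length chunking (512/1024 tokens) tends to cut complete answers apart; for example, a 600-token Nginx configuration split at 512 tokens loses information.

Featured · Tweet · #RAG #Chunking #Milvus #Vector Search #LLM · English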
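
The truncation problem in the selection note can be seen with a toy chunker. The sketch below assumes tokens are already available as a list (the tok0 to tok599 placeholders stand in for a real tokenizer's output), and the function name is hypothetical.

```python
def chunk_fixed(tokens, size, overlap=0):
    """Split a token list into windows of `size` tokens.

    Consecutive windows share `overlap` tokens; the last window may be shorter.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


# Stand-in for a 600-token configuration block.
config_tokens = [f"tok{i}" for i in range(600)]

no_overlap = chunk_fixed(config_tokens, size=512)
with_overlap = chunk_fixed(config_tokens, size=512, overlap=128)

print([len(c) for c in no_overlap])    # prints: [512, 88]
print([len(c) for c in with_overlap])  # prints: [512, 216]
```

With a 512-token window the 600-token unit is always split. Overlap carries some shared context across the boundary, but no single chunk holds the whole unit, which is consistent with the post's advice to match the chunking strategy to the document type.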
Private, Local AI CUDA Coding Assistance on DGX Spark

NVIDIA Developer · 354 words (about 2 min) · Score: 82

Nsight Copilot runs offline on DGX Spark using 128GB VRAM to deploy GPT OSS 12B NIM + CUDA RAG pipeline, delivering privacy-preserving, cloud-cost-free AI coding assistance for CUDA developers.

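Why it was selected: Nsight Copilot supports local deployment of the GPT OSS 12B NIM + CUDA RAG pipeline on DGX Spark (128GB of GPU memory), enabling fully offline operation.

Featured · Video · #CUDA #AI Coding Assistant #NVIDIA #Local LLM #DGX Spark · English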
Why your agents need decision traces, not just documents — Zach Blumenfeld, Neo4j

Agent systems relying solely on document retrieval (e.g., RAG) cannot support high-quality decisions; they must incorporate context graphs containing decision traces, causal chains, and precedents to enable explainable, accurate autonomous decisions—Neo4j provides tooling for rapid implementation.

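Why it was selected: A context graph contains not only entities and facts but also decision traces, causal chains, and historical precedents, so an agent can answer "why was this rejected or accepted" rather than only "what is it".

Featured · Video · #Agent #Graph Database #Neo4j #Decision Explainability #RAG · English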
Your RAG tested well and went live, but recall is getting worse. Three common ...

Common Causes of Declining Recall in RAG Systems

Milvus (@milvusio) · 189 words (about 1 min) · Score: 75

The article identifies three common reasons for declining recall in RAG systems post-deployment: stale indexes, embedding model updates causing vector mismatches, and changes in user query patterns.

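Why it was selected: Stale index: a vector index built three months ago does not reflect the latest additions, deletions, and edits to documents.

Featured · Tweet · #RAG #Recall #Milvus #Embedding Model #Vector Database · English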
The model is the least interesting part of a RAG agent.

What actually determines whether an agent s...

RAG Agent System Design: Model is Least Interesting, System Design Determines Success

Weaviate • vector database (@weaviate_io) · 403 words (about 2 min) · Score: 75

The key to building successful AI agents lies in system design rather than the model itself. The article outlines four core architectural layers required for enterprise-grade RAG agents: security, retrieval, instructions, and guardrails.

Why it was selected: In production, an AI agent's success depends mainly on system design rather than model choice.

Featured · Tweet · #RAG #AI Agents #System Design #Weaviate #Vector Database · English

Amazon OpenSearch Serverless is now available in the Vercel Marketplace

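Vercel News · 424 words (about 2 min) · Score: 75

Amazon OpenSearch Serverless is now integrated into the Vercel Marketplace, with unified configuration and management and cost savings of up to 60%.

Why it was selected: Amazon OpenSearch Serverless provides unified support for vector, lexical, hybrid, and agentic search.

Featured · Article · #Amazon OpenSearch #Vercel #Serverless #RAG · English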
At last month’s Unstructured Data Meetup London, Jiang Chen, our Head of Developer Relations, broke ...

Milvus: How to Turn Conversation History into Long-Term Memory

Milvus (@milvusio) · 144 words (about 1 min) · Score: 75

Milvus proposes a method to convert raw conversation history into readable, editable long-term memory using Markdown and semantic search.

Why it was selected: Conversation history should be stored in Markdown so that humans can read and edit it easily.

Featured · Tweet · #Agent Memory #RAG #Vector Search · English
Congrats on the launch! 

Filesystems are all you need (?) 

There wasn't a huge demand for "managed...

Congrats on the launch! Filesystems are all you need (?)

Jerry Liu (@jerryjliu0) · 235 words (about 1 min) · Score: 75

Jerry Liu congratulates Mirage on its launch and suggests that the low demand for "managed RAG" in 2023 may have stemmed from immature infrastructure and an immature market, with filesystems potentially being the right abstraction for production document indexing.

Why it was selected: The Mirage project took 6 weeks and more than 1.1 million lines of code, rebuilding core bash functionality.

Featured · Tweet · #RAG #AI Agents #Filesystem #Mirage #Infrastructure · Chinese
