GitHub Uses eBPF to Eliminate Deployment Risks and Prevent Circular Failures

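- eBPF can monitor and intercept potential problems in real time, reducing deployment risk.
- GitHub uses eBPF to build more efficient failure detection and recovery mechanisms.
- The practice offers a reference case for other large-scale distributed systems.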

Apr 28, 2026 3 min read
by Craig Risi, Software Architect | Game Designer | Writer | Speaker
GitHub has introduced a new approach to improving deployment safety by leveraging eBPF, enabling the company to detect and prevent hidden circular dependencies that could block recovery during outages. The technique, detailed in a recent engineering blog, allows GitHub to monitor and selectively restrict network behavior of deployment processes at the kernel level, ensuring that critical systems can still be updated even when parts of the platform are unavailable.
The innovation addresses a long-standing risk in large-scale systems: circular dependencies, where deployment tooling relies, directly or indirectly, on the very services it is meant to fix. GitHub highlighted scenarios where deployment scripts might attempt to fetch binaries, call internal services, or trigger background updates that depend on GitHub itself. In failure conditions, these dependencies can cascade, preventing remediation and prolonging outages. By using eBPF to isolate deployment processes and control their outbound network access, GitHub can proactively block such calls and surface them to engineers before they cause incidents.
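To make the idea of a circular dependency concrete, the following minimal sketch models services as a directed graph and asks whether the deployment tooling transitively depends on the service it is meant to repair. The service names and the graph are hypothetical, and this static check is only an illustration: according to the article, GitHub detects such dependencies by observing network calls at runtime, not by walking a declared graph.

```python
from typing import Dict, List, Optional

# Hypothetical "X needs Y to be healthy" edges, for illustration only.
DEPENDENCIES: Dict[str, List[str]] = {
    "deploy-script": ["artifact-store", "config-service"],
    "artifact-store": ["git-hosting"],
    "config-service": [],
    "git-hosting": [],
}


def dependency_path(
    graph: Dict[str, List[str]], source: str, target: str
) -> Optional[List[str]]:
    """Return one dependency chain from `source` to `target`, or None.

    Uses an iterative depth-first search so that cycles elsewhere in the
    graph cannot cause infinite recursion.
    """
    stack = [(source, [source])]
    seen = {source}
    while stack:
        node, path = stack.pop()
        for dep in graph.get(node, []):
            if dep == target:
                return path + [dep]
            if dep not in seen:
                seen.add(dep)
                stack.append((dep, path + [dep]))
    return None


# If the tool that deploys fixes to "git-hosting" needs "git-hosting" itself,
# an outage of that service also blocks its own remediation.
risky_chain = dependency_path(DEPENDENCIES, "deploy-script", "git-hosting")
```

In this example the chain runs through `artifact-store`, which is the kind of indirect dependency that is easy to miss in review and only surfaces during an outage.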
At the core of the solution is eBPF's ability to run custom programs inside the Linux kernel, hooking into low-level system events such as network requests. GitHub uses this capability to place deployment scripts inside controlled environments (cgroups), where their network traffic can be inspected, filtered, or blocked based on predefined rules. This allows the platform to enforce fine-grained, per-process network policies without affecting the broader system or production traffic.
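As a rough illustration of the first step, the sketch below creates a cgroup and moves a process into it through the cgroup v2 filesystem interface, where writing a PID to `cgroup.procs` migrates that process. The cgroup name `deploy-sandbox` and the helper `enter_cgroup` are hypothetical, and the kernel-side eBPF program that would then be attached to this cgroup is not shown, since the article does not say which hooks GitHub uses.

```python
from pathlib import Path

# Standard mount point of the unified (v2) cgroup hierarchy on most distros.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def enter_cgroup(name: str, pid: int, root: Path = CGROUP_ROOT) -> Path:
    """Create cgroup `name` under `root` if needed and move `pid` into it.

    `root` is a parameter so the function can be exercised against a
    scratch directory; on a real host it needs root or a delegated subtree.
    """
    cgroup_dir = root / name
    # On cgroupfs, creating a directory creates the cgroup.
    cgroup_dir.mkdir(parents=True, exist_ok=True)
    # Writing a PID to cgroup.procs migrates that process into the cgroup.
    (cgroup_dir / "cgroup.procs").write_text(f"{pid}\n")
    return cgroup_dir
```

A deployment wrapper could call something like `enter_cgroup("deploy-sandbox", os.getpid())` before executing the deploy script, so that the script and its children inherit the cgroup and any network policy attached to it, while the rest of the host is unaffected.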
To overcome the challenge of managing dynamic infrastructure, GitHub extended this approach with DNS-aware filtering. By intercepting DNS queries and routing them through a proxy, the system can evaluate outbound requests based on domain names rather than static IP addresses, making it far more adaptable in large, fast-changing environments. The system also maps blocked requests back to specific processes and commands, giving teams clear visibility into what triggered the issue and how to fix it.
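The decision a DNS-aware proxy has to make can be sketched in user space as follows. This is an assumption-laden model, not GitHub's implementation: the blocklist, the `Decision` type, and the idea that the proxy already knows the requesting PID are all hypothetical. Only the `/proc/<pid>/cmdline` lookup reflects a real Linux interface.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable, Optional


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot."""
    return name.rstrip(".").lower()


def matches_domain(qname: str, domain: str) -> bool:
    """True if `qname` is `domain` itself or one of its subdomains."""
    q, d = normalize(qname), normalize(domain)
    return q == d or q.endswith("." + d)


def read_command(pid: int, proc_root: Path = Path("/proc")) -> Optional[str]:
    """Return the command line of `pid`, or None if it cannot be read."""
    try:
        raw = (proc_root / str(pid) / "cmdline").read_bytes()
    except OSError:
        return None
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip() or None


@dataclass
class Decision:
    qname: str
    allowed: bool
    pid: int
    command: Optional[str]


def evaluate_query(
    qname: str,
    pid: int,
    blocked_domains: Iterable[str],
    proc_root: Path = Path("/proc"),
) -> Decision:
    """Decide whether a DNS query is allowed and record who issued it."""
    blocked = any(matches_domain(qname, d) for d in blocked_domains)
    return Decision(normalize(qname), not blocked, pid, read_command(pid, proc_root))
```

Matching on a label boundary (`"." + domain`) matters here: a plain suffix check would wrongly treat `notgithub.com` as part of `github.com`. Attaching the command line to each decision is what turns a blocked request into something an engineer can act on.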
Traditionally, identifying circular dependencies has been a manual and reactive process, with problems often discovered only during incidents. GitHub's approach shifts this to proactive detection: if a deployment introduces a risky dependency (whether direct, hidden, or transient), the system flags it immediately. This reduces the likelihood of deployment failures during outages and improves mean time to recovery by ensuring that remediation paths remain available.
The system has been rolled out over six months and is now actively used to safeguard deployments across GitHub's infrastructure. It also provides additional benefits, including auditing outbound calls during deployments and enforcing resource limits to prevent runaway scripts from impacting production workloads.
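The article does not say how the resource limits are enforced. Assuming they are applied through the same cgroups that hold the deployment scripts, a minimal sketch using the cgroup v2 interface files `memory.max` and `cpu.max` could look like this. The helper name and the example values are hypothetical.

```python
from pathlib import Path


def set_limits(
    cgroup_dir: Path,
    memory_bytes: int,
    cpu_fraction: float,
    period_us: int = 100_000,
) -> None:
    """Cap memory and CPU for every process in the given cgroup.

    cpu.max takes "<quota> <period>" in microseconds: the group may run for
    at most `quota` microseconds of CPU time in each `period`.
    """
    quota_us = int(cpu_fraction * period_us)
    (cgroup_dir / "memory.max").write_text(f"{memory_bytes}\n")
    (cgroup_dir / "cpu.max").write_text(f"{quota_us} {period_us}\n")
```

On a real host the `memory` and `cpu` controllers must be enabled for the cgroup through the parent's `cgroup.subtree_control` file before these files exist. A call such as `set_limits(cgroup_dir, 512 * 1024 * 1024, 0.5)` would then hold a runaway script to 512 MiB of memory and half of one CPU.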
GitHub's use of eBPF reflects a wider industry trend toward kernel-level observability and control as systems grow more complex. Increasingly, organizations are turning to eBPF not just for monitoring, but for enforcing runtime policies, improving security, and managing system behavior in real time. The approach allows platform teams to move beyond traditional application-level controls and gain deeper visibility into how systems behave under real-world conditions.
The development also highlights a key evolution in deployment practices: ensuring that systems can recover from failure. As platforms become more interconnected, hidden dependencies can create unexpected failure modes. By embedding safeguards directly into the operating system layer, GitHub demonstrates how modern infrastructure can be made more resilient, ensuring that the tools used to fix systems remain independent of the systems themselves.
Other large-scale platforms face similar challenges around hidden dependencies and deployment safety, and many are adopting comparable, but not identical, approaches. For example, Google has long emphasized dependency isolation and hermetic builds, including through Bazel, the open-source version of its internal build system, ensuring that build and deployment processes do not rely on external or runtime state that could fail during incidents. This reduces the risk of circular dependencies by design, as deployments are constructed to be reproducible and self-contained. Similarly, Amazon Web Services promotes cell-based architecture, where services are segmented into isolated units so that failures and their dependencies are contained, ensuring that deployment and recovery paths remain available even when parts of the system are degraded.
In the cloud-native ecosystem, projects like Kubernetes and networking layers such as Cilium are also evolving toward runtime policy enforcement and observability at the kernel and network layers, similar to GitHub's use of eBPF. Meanwhile, platforms like GitLab focus on pipeline isolation and dependency control, encouraging practices such as artifact pinning, offline runners, and restricted network access during CI/CD execution.
Across these approaches, a common theme emerges: rather than relying solely on process or documentation to avoid circular dependencies, leading platforms are embedding guardrails directly into infrastructure and execution environments, ensuring that deployment systems remain reliable even under failure conditions.
About the Author

#### **Craig Risi**
Craig Risi is a man of many talents but has no sense of how to use them. He could be out changing the world but prefers to make software instead. He possesses a passion for software design, but more importantly software quality and designing systems in a technically diverse and constantly evolving tech world. Craig is also the writer of the book, Quality By Design: Designing Quality Software Systems, and writes regular articles on his blog sites and various other tech sites around the world. When not playing with software, he can often be found writing, designing board games, or running long distances for no apparent reason.
InfoQ.com and all content copyright © 2006-2026 C4Media Inc.