Google Research Blog

Private analytics via zero-trust aggregation


TL;DR · AI Summary

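Google proposes an efficient cryptographic aggregation method built on zero-trust principles and combined with hardware protection mechanisms, keeping user data private in private analytics services.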

Key Takeaways

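  • Google applies zero-trust principles to reduce the trust placed in any single entity.
  • The new cryptographic aggregation method guarantees that only anonymized, aggregated insights can be obtained.
  • TEEs provide a strict layer of attestation and transparency.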

Outline

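  1. AI that processes data locally can provide enhanced protection and timely alerts while keeping user information private.

  2. When models are deployed locally, understanding their behavior, effectiveness, or failure modes is challenging.

  3. Hardware-based isolation (TEEs) and cryptographic protocols are two approaches to protecting user data.

  4. TEEs create secure enclaves that shield data from a compromised operating system or a malicious hypervisor.

  5. Cryptographic protocols use mathematical tools to provide their security guarantees.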




By processing data locally, on-device AI can provide enhanced protection and timely alerts while keeping user information private. For example, Android uses a system called SafetyCore to provide privacy-preserving on-device features and common infrastructure to protect users from unwanted content. When developing on-device technologies, teams need to understand how well their systems work across millions of individual smartphones, each with unique data distributions, varying hardware constraints, and different user behaviors. To achieve this in a way that reveals only collective trends without revealing individual user data, teams can leverage cryptographic secure aggregation as a key building block. Like all cryptographic protocols, secure aggregation uses advanced mathematical tools to provide its security assurance.

Today, we set a higher bar for efficient cryptographic aggregation in a private analytics service. We follow a zero-trust principle, which aims to reduce the trust required in any single entity. We achieve this through a new security design that combines cryptographic and hardware protection mechanisms. Our solution leverages a new cryptographic aggregation method that provably guarantees that Google can obtain only anonymized, aggregated insights about a population. Additionally, trusted execution environments (TEEs) are used to provide a strict layer of attestation and transparency.

The challenge

When models are deployed locally on-device, simply knowing that a model is 'running' isn't enough to understand its behavior, effectiveness, or failure modes. This limits the ability to answer critical questions like:

  • Is the model drifting? (e.g., Does a translation model struggle with new slang emerging in a specific region?)
  • Are there hidden biases? (e.g., Is an image classifier less accurate under specific lighting conditions common in certain geographic areas?)
  • What is the real-world error rate? (e.g., Is a "Smart Reply" feature being ignored because its suggestions are technically correct but socially awkward?)

This is where private analytics becomes the essential bridge, enabling anonymized, aggregated insights about a population without ever revealing individual user content.

Google teams use federated analytics for this kind of aggregated, private insight, with applications in Pixel Recorder, Gboard, and more. Federated analytics requires a private aggregation route, where the data from individual devices is protected until combined into a sum. Two paradigms have emerged to protect user data in this setting: hardware-based isolation (TEEs) and cryptographic protocols.

A tale of two protections

The hardware approach centers on TEEs, such as Intel TDX, AMD SEV-SNP and others. The core idea is to create a "secure enclave" — essentially a protected slice of the processor and memory that is isolated from the rest of the device. Inside this enclave, data can be decrypted and processed in plaintext, shielded even from a compromised operating system or a malicious hypervisor.

Through a process called attestation, TEEs can compute a hardware-backed cryptographic "fingerprint" of the exact firmware and software state running inside the enclave. For a user or an auditor, attestation offers a verifiable guarantee that the data is being handled by the specific, tamper-proof program they expect, rather than a modified version designed to leak information. Google has deployed TEE-backed differentially private aggregation for computing insights into AI systems in the Pixel Recorder app.
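The fingerprint comparison at the heart of attestation can be sketched in a few lines. This is a minimal illustration only: the names `measure` and `verify` and the sample byte strings are hypothetical, and real attestation additionally has the hardware sign the measurement with a key rooted in the chip manufacturer, which this sketch omits.

```python
import hashlib
import hmac


def measure(firmware: bytes, workload: bytes) -> str:
    """Hash the firmware and workload images into one hex fingerprint."""
    digest = hashlib.sha256()
    # Length-prefix each part so (b"ab", b"c") and (b"a", b"bc") hash differently.
    for part in (firmware, workload):
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


def verify(reported: str, expected: str) -> bool:
    """Compare the reported fingerprint with the expected one in constant time."""
    return hmac.compare_digest(reported, expected)


# Hypothetical images: the auditor derives `expected` from the published build.
expected = measure(b"firmware-v1", b"aggregator-build")
honest = measure(b"firmware-v1", b"aggregator-build")
tampered = measure(b"firmware-v1", b"aggregator-build-with-logging")

print(verify(honest, expected))    # the expected program is accepted
print(verify(tampered, expected))  # a modified program is rejected
```

Any change to the measured bytes, however small, yields a different fingerprint, which is what lets an auditor distinguish the expected program from a modified one.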

However, TEE isolation mechanisms are constantly evolving. Researchers regularly discover side-channel vulnerabilities that an attacker can exploit to invalidate either the TEE's own guarantees or application-specific guarantees [SNPeek, TDXray]. While the community is working towards hardening existing solutions against known side-channel attacks, new side-channel vulnerabilities are expected to be discovered. Therefore, in an ideal system, data would be protected by multiple layers of security so that even if a TEE’s security model fails, the data is not compromised.

On the other hand, cryptographic protocols rely on mathematical techniques that provide provable guarantees that individual data cannot be reconstructed, with the only visible value being the aggregated and anonymized output. Google has deployed two generations of secure aggregation protocols at scale (detailed in the initial blog post and follow-up). However, their widespread use has been limited by the complexity of requiring user devices to remain online for multiple rounds of interaction over extended periods of time.
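To make the "only the sum is visible" property concrete, here is a simplified sketch of pairwise masking, a textbook construction for secure aggregation. It is not presented as the deployed protocol: the modulus, the function names, and the use of a local random number in place of a key agreement between each pair of devices are all assumptions made for illustration.

```python
import secrets

MODULUS = 2**32  # hypothetical; all arithmetic is done modulo this value


def pairwise_masks(num_clients: int) -> list[int]:
    """Return one mask per client such that all masks sum to zero mod MODULUS."""
    masks = [0] * num_clients
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            # Stand-in for a secret that clients i and j would agree on privately.
            shared = secrets.randbelow(MODULUS)
            masks[i] = (masks[i] + shared) % MODULUS
            masks[j] = (masks[j] - shared) % MODULUS
    return masks


def mask_inputs(values: list[int]) -> list[int]:
    """Each client adds its mask; a single masked value looks uniformly random."""
    masks = pairwise_masks(len(values))
    return [(v + m) % MODULUS for v, m in zip(values, masks)]


def aggregate(masked: list[int]) -> int:
    """The server sums the masked values; the pairwise masks cancel out."""
    return sum(masked) % MODULUS


values = [3, 5, 7, 11]
total = aggregate(mask_inputs(values))
print(total)  # 26
```

In a scheme of this shape the masks cancel only if every client's contribution arrives, so agreeing on the shared secrets and recovering from dropped devices takes extra rounds of communication. That is one way to see why interactive protocols need devices to stay online.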

Encryption meets isolation

Our new solution introduces a novel cryptographic protocol that allows user devices to securely submit their information in a single, one-shot message, overcoming the barriers of traditional interactive schemes. By enabling a single-message submission, we eliminate the need for devices to remain online for multiple rounds of interaction with a server.

We integrate this higher-efficiency protocol into Google’s confidential federated analytics system and combine it with execution within a TEE to create a multi-layered defense architecture. With this solution, confidentiality no longer relies entirely on hardware protection. The cryptographic layer ensures that individual raw data is never exposed or reconstructed in any server memory, not even within the hardware-protected perimeter. The only time unencrypted data is processed off-device is at the final stage, when the data has already been aggregated and anonymized. Furthermore, our solution leverages TEE attestation mechanisms to provide high-assurance, verifiable proof to all participants that the secure aggregation protocol is being executed exactly as intended, i.e., by correctly compiling and running publicly available code.

The cryptographic engine with one-shot efficiency

At its core, our cryptographic solution is powered by an innovative lattice-based protocol that allows clients to encrypt their data so that aggregating the resulting ciphertexts also aggregates the underlying messages and the encryption keys. The server then needs only one more thing to obtain the aggregated values: a decryption key that can decrypt the aggregated value and nothing else. To aid with this task, we form small committees among the clients that hold hints which help unlock the aggregated value, masked with additional differential privacy noise. Clients serve on committees infrequently, according to their availability. This arrangement ensures that any decryption key is shared across a number of parties, each of which protects the confidentiality of the encrypted data.
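The key-and-message homomorphism described above can be illustrated with a toy learning-with-errors style scheme. This is a sketch under stated assumptions, not the protocol itself: the parameters are far too small to be secure, the differential privacy noise is omitted, and the committee "hints" are modeled here as plain additive shares of each client's key, which is a stand-in chosen for illustration. All names (`encrypt`, `share_key`, `decrypt_sum`, `run_round`) are hypothetical.

```python
import secrets

Q = 2**32        # ciphertext modulus (toy value)
DELTA = 2**16    # scaling factor that keeps the message above the noise
DIM = 8          # key dimension (far too small for real security)
NOISE_BOUND = 4  # each client's noise is drawn from [-4, 4]


def rand_vec() -> list[int]:
    return [secrets.randbelow(Q) for _ in range(DIM)]


def dot(u: list[int], v: list[int]) -> int:
    return sum(x * y for x, y in zip(u, v)) % Q


def add_vec(u: list[int], v: list[int]) -> list[int]:
    return [(x + y) % Q for x, y in zip(u, v)]


def encrypt(public_a: list[int], key: list[int], message: int) -> int:
    """One-shot ciphertext: <a, key> + small noise + DELTA * message (mod Q)."""
    noise = secrets.randbelow(2 * NOISE_BOUND + 1) - NOISE_BOUND
    return (dot(public_a, key) + noise + DELTA * message) % Q


def share_key(key: list[int], committee_size: int) -> list[list[int]]:
    """Split a key into additive shares, one per committee member."""
    shares = [rand_vec() for _ in range(committee_size - 1)]
    last = key
    for share in shares:
        last = [(x - y) % Q for x, y in zip(last, share)]
    return shares + [last]


def decrypt_sum(public_a: list[int], ciphertext_sum: int, key_sum: list[int]) -> int:
    """Remove <a, sum of keys>, then round away the accumulated noise."""
    noisy = (ciphertext_sum - dot(public_a, key_sum)) % Q
    return ((noisy + DELTA // 2) // DELTA) % (Q // DELTA)


def run_round(messages: list[int], committee_size: int) -> int:
    public_a = rand_vec()  # public vector shared by all clients
    ciphertext_sum = 0
    committee_totals = [[0] * DIM for _ in range(committee_size)]
    for message in messages:
        key = rand_vec()  # fresh secret key, never sent in the clear
        ciphertext_sum = (ciphertext_sum + encrypt(public_a, key, message)) % Q
        for member, share in enumerate(share_key(key, committee_size)):
            committee_totals[member] = add_vec(committee_totals[member], share)
    # Each member reveals only its total; together they form the sum of all keys.
    key_sum = [0] * DIM
    for member_total in committee_totals:
        key_sum = add_vec(key_sum, member_total)
    return decrypt_sum(public_a, ciphertext_sum, key_sum)


print(run_round([3, 5, 7, 11], committee_size=3))  # 26
```

Adding the ciphertexts yields a ciphertext of the summed messages under the summed keys, so the summed key opens the aggregate but no individual ciphertext. Decryption is correct in this toy as long as the total noise stays below `DELTA / 2` and the true sum stays below `Q / DELTA`.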

Private analytics for SafetyCore

Android System SafetyCore is a Google system service for Android 9+ devices that provides privacy-preserving on-device support for Android safety features. In the realm of on-device safety, tools like SafetyCore play a critical role. However, for these tools to evolve, developers need to understand their real-world performance—specifically, which threats are being caught and where there are opportunities to further refine detection capabilities, all without compromising user privacy.

To bridge this gap, in partnership with the Android SafetyCore team, we are using our state-of-the-art private analytics solution to improve the accuracy of classifiers while preserving privacy.

Relying on aggregate, privacy-preserving, anonymized insights is essential here; it allows engineers to measure the "true positive" rate of safety models across a diverse global fleet without ever seeing the private, sensitive content that triggered a local alert. By observing these high-level trends, developers can refine model thresholds and deploy updates that better protect the user, ensuring the safety system remains effective against emerging threats while keeping the raw data private and strictly isolated on the device. Android SafetyCore will leverage our zero-trust private analytics to evaluate metadata indicative of the effectiveness of its tools while respecting its privacy commitment that user content stays only on the device. We are excited to introduce a technology that aids Android’s broader mission to protect users' safety while preserving their privacy.
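As a rough illustration of how a fleet-wide true positive rate could be derived from aggregates alone, the sketch below assumes each device contributes two hypothetical counters (true positives and false negatives) and that the analyst sees only their sums. The counter names and sample numbers are invented for illustration and do not describe the metadata SafetyCore actually reports; in the system described above, the summation would happen under secure aggregation rather than in the clear.

```python
from typing import Optional


def sum_reports(reports: list[tuple[int, int]]) -> tuple[int, int]:
    """Stand-in for secure aggregation: only these two totals leave the pipeline."""
    true_positives = sum(tp for tp, _ in reports)
    false_negatives = sum(fn for _, fn in reports)
    return true_positives, false_negatives


def true_positive_rate(totals: tuple[int, int]) -> Optional[float]:
    """TPR = TP / (TP + FN), or None when there were no actual positives."""
    true_positives, false_negatives = totals
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        return None
    return true_positives / actual_positives


# Hypothetical per-device counters: (true positives, false negatives).
device_reports = [(2, 1), (0, 1), (4, 0)]
fleet_totals = sum_reports(device_reports)
print(fleet_totals)                      # (6, 2)
print(true_positive_rate(fleet_totals))  # 0.75
```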

Conclusion

Cryptographic techniques for secure computation offer strong security guarantees grounded in mathematical proofs. We demonstrated how to design secure aggregation protocols compatible with deployment in large-scale distributed systems. The resulting solution, integrated with existing security mechanisms, raises the security bar for private analytics. Moving forward, we are exploring opportunities to expand the set of computations supported in this model.

Acknowledgements

_The contents of this blog post reflect the contributions of many people, including Bruno Alves, Carlos Balduz, Nacho Ballester Tester, James Bell-Clark, Oleg Chernyakhovskiy, Stanislav Chiknavaryan, Jim Choncholas, Stefan Dierauf, Emily Glanz, Shruthi Gorantala, Mira Holford, Mihaela Ion, Artem Lagzdin, Jean-Christophe Lilot, Peter Kairouz, Jonathan Katz, Baiyu Li, Ben Kreuter, Brett McLarnon, Mekhola Mukherjee, Amanda Nascimento, Timon Van Overveldt, Javed Ramjohn, Philipp Schoppmann, Karn Seth, Debora Silva, Rakshita Tandon, and Pierre Tholoniat. We would like to thank Elie Bursztein, Bryant Gipson, Marco Gruteser, Alex Freire, Xavier Llorà, Dan Ramage, David Sehr, and Amanda Walker for their leadership and Corinna Cortes, Brian Roddy, Pankaj Rohatgi, and Eduardo Tejada for their continued support._
