Introduction
The rapid ascent of generative artificial intelligence has ushered in a new era of data‑driven products, from conversational agents that can draft legal documents to sophisticated image‑generation tools that help designers prototype concepts in seconds. Yet the same data that fuels these models—often containing highly sensitive personal or corporate information—poses a formidable privacy risk. The industry’s response has been a surge of privacy‑enhancing technologies that aim to reconcile the need for powerful AI with the imperative to protect data. Duality Technologies, a pioneer in secure data collaboration, has taken a significant step forward by integrating its platform with Google Cloud’s Confidential Computing portfolio, specifically the NVIDIA GPU‑backed confidential virtual machines. This partnership unlocks the possibility of training and deploying large language models (LLMs) on encrypted data without exposing the underlying information to the cloud provider or any other third party.
The announcement is more than a technical milestone; it represents a convergence of three critical trends. First, the commoditization of high‑performance GPUs in the cloud has made it feasible to train state‑of‑the‑art LLMs at scale. Second, the growing regulatory landscape—encompassing GDPR, CCPA, and emerging AI‑specific frameworks—demands robust safeguards around data usage. Third, the business community is increasingly recognizing that the competitive advantage of AI lies not only in model accuracy but also in the trustworthiness of the data pipeline. By enabling secure GenAI workflows on NVIDIA GPUs, Duality Technologies positions itself at the nexus of these forces, offering a solution that is both technically sound and commercially compelling.
In the sections that follow, we will unpack how Duality’s integration works, explore the practical implications for organizations that rely on large‑scale AI, and consider the broader impact on the AI ecosystem.
The Architecture of Confidential GPU Workloads
At the heart of the new offering is Google Cloud's Confidential Computing infrastructure, which uses hardware-based Trusted Execution Environments (TEEs) to isolate workloads from the underlying hypervisor and cloud operator. Combined with NVIDIA's GPU acceleration, this architecture keeps data encrypted whenever it leaves the trusted boundary and decrypts it only inside the protected environment where the GPU cores operate on it. Duality's platform sits atop this stack, providing a set of cryptographic primitives, such as secure multi-party computation (SMPC) and homomorphic encryption (HE), that can be applied to data before it is ever uploaded to the cloud.
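To make the HE side of this concrete, the sketch below uses the open-source TenSEAL library (standing in for Duality's proprietary stack, which is not public) to encrypt a small vector under the CKKS scheme and compute on it while it remains ciphertext. The parameters are illustrative tutorial defaults, not a production configuration.

```python
import tenseal as ts

# Create a CKKS context; poly_modulus_degree and the modulus chain trade
# off precision, ciphertext size, and multiplicative depth.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

# Encrypt on the data owner's side; only ciphertext leaves this machine.
enc = ts.ckks_vector(context, [0.5, 1.5, 2.5])

# Compute directly on the ciphertext, e.g. an elementwise affine transform.
result = enc * [2.0, 2.0, 2.0] + [1.0, 1.0, 1.0]

# Only the secret-key holder can decrypt; CKKS results are approximate.
print(result.decrypt())  # roughly [2.0, 4.0, 6.0]
```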
The workflow begins with data owners encrypting their datasets using Duality’s key management system. Once encrypted, the data is uploaded to a confidential virtual machine that runs on an NVIDIA GPU. Inside the TEE, the GPU processes the data in a way that is invisible to the host operating system, ensuring that even privileged users cannot read the plaintext. Duality’s runtime then orchestrates the training or inference job, handling tasks such as model checkpointing, gradient aggregation, and result decryption. Because the entire pipeline is executed within a secure enclave, the risk of data leakage is dramatically reduced.
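Duality has not published a public API for this workflow, so the following sketch is purely illustrative: the duality_client module and every name in it (KeyManager, ConfidentialJob, and so on) are invented here to show the shape of the pipeline just described, not an actual SDK.

```python
# Hypothetical client-side sketch of the encrypt-upload-train-decrypt
# workflow described above. All names are invented for illustration.
from duality_client import KeyManager, ConfidentialJob  # hypothetical

keys = KeyManager.load("org-master-key")              # data owner's keys
ciphertext = keys.encrypt_dataset("records.parquet")  # encrypt locally

job = ConfidentialJob(
    image="llm-finetune:latest",              # runs inside the TEE
    accelerator="nvidia-h100",                # GPU-backed confidential VM
    attestation="google-confidential-vm",     # verify the enclave before
)                                             # any ciphertext is sent
job.upload(ciphertext)
handle = job.run()            # checkpointing, gradient aggregation, and
                              # training all happen inside the enclave

encrypted_weights = handle.result()
weights = keys.decrypt(encrypted_weights)     # only the key owner decrypts
```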
Benefits for Large‑Scale LLM Training
Training a large language model is a resource‑intensive endeavor that typically requires thousands of GPU hours and terabytes of training data. For many organizations, the data involved—customer interactions, proprietary code, or internal research—cannot be exposed to external parties. Duality’s solution removes this barrier by allowing the training to occur on encrypted data without sacrificing performance.
One of the key advantages is the ability to perform secure aggregation of gradients. In a typical distributed training setup, gradients from multiple workers are combined to update the model weights. Duality’s secure aggregation protocol ensures that each worker’s contribution remains confidential, preventing any single party from reconstructing the original data. This is particularly valuable in multi‑tenant environments where different departments or partners collaborate on a shared model.
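Duality's exact protocol is not public, but the pairwise-masking idea that underlies secure aggregation schemes of this kind (in the style of Bonawitz et al., 2017) fits in a short sketch: each pair of workers agrees on a random mask that one adds and the other subtracts, so any individual masked update is unreadable on its own while the masks cancel exactly in the aggregate.

```python
import numpy as np

rng = np.random.default_rng(0)
num_workers, dim = 3, 4
gradients = [rng.normal(size=dim) for _ in range(num_workers)]

# Each unordered pair (i, j), i < j, shares a random mask: worker i adds
# it and worker j subtracts it, so every mask cancels in the sum.
masks = {(i, j): rng.normal(size=dim)
         for i in range(num_workers) for j in range(i + 1, num_workers)}

def masked_update(i):
    update = gradients[i].copy()
    for (a, b), mask in masks.items():
        if a == i:
            update += mask
        elif b == i:
            update -= mask
    return update  # hides gradients[i] from the aggregator

aggregate = sum(masked_update(i) for i in range(num_workers))
assert np.allclose(aggregate, sum(gradients))  # masks cancel exactly
```

Production protocols mask over a finite field for information-theoretic hiding and add key agreement plus dropout recovery; the sketch keeps only the cancellation idea.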
Moreover, the integration with NVIDIA GPUs keeps the cost of protecting data in motion low. In confidential-computing mode, data moving between the CPU and the GPU is encrypted and decrypted in hardware, and computation inside the GPU's protected boundary runs at near-native speed, so the overhead is concentrated at the enclave boundary rather than in every operation. As a result, the performance penalty compared to unencrypted training is modest, making the approach viable for production workloads.
Real‑World Use Cases
Several industries stand to benefit from secure GenAI workflows. In healthcare, for instance, training a language model on patient records could enable advanced diagnostic assistants, but the data must remain confidential to comply with HIPAA. By using Duality's platform, a hospital could train a model on encrypted electronic health record (EHR) data without exposing patient information to the cloud provider.
Financial institutions face similar challenges. Privacy and security frameworks such as the Gramm-Leach-Bliley Act (GLBA) and PCI DSS impose strict controls on how customer and transaction data may be shared. A bank could leverage secure GPU training to develop fraud-detection models that learn from transaction histories while keeping the data encrypted throughout the process.
The legal sector is another fertile ground. Law firms often deal with sensitive client documents that cannot be shared outside the firm. Secure GenAI could automate document review or contract analysis, allowing the firm to benefit from AI without compromising client confidentiality.
Challenges and Future Directions
While the integration marks a significant leap forward, it is not without challenges. First, the cryptographic protocols used by Duality, such as SMPC and HE, still introduce computational overhead, especially for models with billions of parameters. Continued research into more efficient approaches, such as hardware-accelerated HE, hybrid designs that reserve the heaviest operations for the TEE, and differential-privacy-friendly training, will be essential to keep training times practical.
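As one concrete example of the differential-privacy-friendly direction, the widely used DP-SGD recipe (Abadi et al., 2016) clips each example's gradient to bound its influence and then adds calibrated Gaussian noise to the averaged update. The NumPy sketch below uses illustrative hyperparameters and is independent of Duality's platform.

```python
import numpy as np

def dp_sgd_update(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                  lr=0.01, rng=None):
    """One DP-SGD step: clip per-example gradients to bound sensitivity,
    average them, then add Gaussian noise scaled to the clip norm."""
    rng = rng or np.random.default_rng()
    n = len(per_example_grads)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    noisy_mean = (np.mean(clipped, axis=0)
                  + rng.normal(0.0, noise_multiplier * clip_norm / n,
                               size=clipped[0].shape))
    return -lr * noisy_mean  # apply as a parameter delta

# Toy usage: three per-example gradients for a 4-parameter model.
grads = [np.array([0.2, -0.5, 1.3, 0.1]),
         np.array([2.0, 0.4, -0.7, 0.9]),
         np.array([-0.3, 0.8, 0.2, -1.1])]
delta = dp_sgd_update(grads)
```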
Second, the regulatory landscape is evolving. Some jurisdictions may require explicit disclosure of encryption methods or may impose limits on the use of certain cryptographic primitives. Duality’s platform will need to remain agile, offering compliance‑ready configurations that can adapt to new legal requirements.
Finally, the broader AI ecosystem must consider the implications of widespread adoption of confidential GPU workloads. Cloud providers may need to invest in more robust TEE hardware, and hardware vendors like NVIDIA will likely continue to enhance GPU support for secure enclaves. The interplay between hardware, software, and policy will shape the next generation of privacy‑preserving AI.
Conclusion
The partnership between Duality Technologies and Google Cloud’s Confidential Computing demonstrates that privacy and performance need not be mutually exclusive in the realm of generative AI. By enabling secure, GPU‑accelerated training and inference on encrypted data, the solution opens the door for organizations to harness the power of large language models while maintaining compliance with stringent data protection regulations. As the AI landscape continues to evolve, such privacy‑enhancing infrastructures will become indispensable, ensuring that the benefits of AI are accessible to all sectors without compromising the trust that underpins our digital society.
Call to Action
If your organization is exploring the deployment of large language models but is constrained by data privacy concerns, consider evaluating Duality Technologies’ secure GenAI workflow on NVIDIA GPUs. Reach out to our team to schedule a technical deep dive, or sign up for a free trial to experience firsthand how confidential GPU workloads can transform your AI strategy. By integrating privacy from the ground up, you can accelerate innovation, satisfy regulatory obligations, and build trust with your customers and partners.