Research by cybersecurity firm Trend Micro has identified security vulnerabilities in NVIDIA Riva API endpoints. NVIDIA Riva is a set of GPU-accelerated features for developers that NVIDIA describes as “multilingual speech and translation microservices for building fully customizable, real-time conversational AI pipelines” for cloud deployment.
Launched in 2024, the features provide automatic speech recognition (ASR), text-to-speech (TTS), and AI translation. They can be used in multiple scenarios and work in combination with large language models (LLMs) and retrieval-augmented generation (RAG).
An article summarizing the Trend Micro research findings lists “new and unique security challenges” and risks associated with misconfigured deployments of NVIDIA Riva working without proper authentication.
According to the article, the vulnerabilities and misconfigurations identified can lead to unauthorized access, abuse of GPU resources and API keys, data leakage, denial-of-service (DoS) attacks, system disruptions, and other risks.
The article also highlights how easily these exposed services can be discovered and exploited because of default network configurations. A major risk for organizations is theft of intellectual property, “particularly if their models or inference services are exposed through misconfigured APIs.”
The article also warns that configuring TLS/SSL, the only security measure the server itself enforces, gives a false sense of security. Even with all necessary certificate details supplied in the NVIDIA QuickStart configuration, the resulting TLS/SSL connection encrypts traffic and confirms the server’s identity, but it does not authenticate the client, leaving the service open for anyone to use.
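The gap between server-only TLS and client authentication can be illustrated in general terms. The following is a minimal sketch using Python's standard ssl module, not Riva's own configuration; the function name make_server_context and its parameters are hypothetical and chosen only for illustration. It shows that a server-side TLS context does not check client certificates unless verification is explicitly required.

```python
import ssl


def make_server_context(require_client_cert, certfile=None, keyfile=None, client_ca=None):
    """Build a server-side TLS context (hypothetical helper, for illustration only).

    certfile/keyfile: the server's own certificate and private key.
    client_ca: CA bundle used to validate client certificates (mutual TLS).
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

    # The server certificate lets clients verify the server's identity
    # and enables encryption. It says nothing about who the client is.
    if certfile is not None:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)

    if require_client_cert:
        # Mutual TLS: the handshake fails unless the client presents a
        # certificate that chains to a trusted CA.
        ctx.verify_mode = ssl.CERT_REQUIRED
        if client_ca is not None:
            ctx.load_verify_locations(cafile=client_ca)

    return ctx


# Server-only TLS: traffic is encrypted, but any client may connect.
server_only = make_server_context(require_client_cert=False)

# Mutual TLS: clients must also prove their identity.
mutual = make_server_context(require_client_cert=True)
```

In the first context, verify_mode stays at ssl.CERT_NONE, which corresponds to the situation the article describes: the channel is encrypted, yet the client is never asked to identify itself.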
Trend Micro recommends that organizations deploying NVIDIA Riva use secure API gateways; apply network segmentation, strong authentication, and role-based access control; harden containers; disable unnecessary services; enable logging and monitoring; apply rate limiting and API request throttling; and keep the framework, server, and dependencies updated.
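As one illustration of the rate limiting recommendation, the sketch below implements a token bucket, a common throttling technique. It is not taken from the Trend Micro research or from Riva; the class name TokenBucket, its parameters, and the injectable clock are assumptions made for this example. In practice such limits would more likely be enforced at an API gateway than in application code.

```python
import time


class TokenBucket:
    """Token-bucket limiter (hypothetical example, not part of Riva).

    rate: tokens added per second.
    capacity: maximum burst size.
    clock: callable returning the current time in seconds; injectable for testing.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True if the request may proceed, consuming `cost` tokens."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock so the behaviour is deterministic.
fake_time = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_time[0])

first = bucket.allow()    # allowed: bucket starts full
second = bucket.allow()   # allowed: second token of the burst
third = bucket.allow()    # rejected: bucket is empty
fake_time[0] = 1.0        # one second passes, one token is refilled
fourth = bucket.allow()   # allowed again
```

A deployment would typically keep one bucket per API key or client address, so that a single abusive caller cannot exhaust shared GPU capacity.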