NVIDIA Expands Riva ASR Capabilities with Whisper and Canary Models

WikiBit, 2025-02-21 22:01

Rebeca Moen, Feb 21, 2025 10:54

NVIDIA enhances its Riva ASR with new multilingual capabilities using Whisper and Canary models, integrating advanced…

NVIDIA has expanded its Automatic Speech Recognition (ASR) systems with new capabilities in the Riva 2.18.0 container and SDK. The release is part of NVIDIA's ongoing effort to refine its GPU-accelerated speech and translation AI microservices, as detailed by Sven Chilton on the NVIDIA Developer Blog.

Integration of New Models

The latest release of Riva adds support for the Parakeet architecture, which enables streaming multilingual ASR, and for the Whisper and Canary models, which handle offline ASR and Automatic Speech Translation (AST). Whisper, developed by OpenAI, and the Distil-Whisper models from Hugging Face are now part of Riva's offline ASR capabilities, allowing audio recordings in numerous languages to be transcribed or translated directly into English.

Canary models further extend Riva's functionality by supporting offline ASR and AST across multiple language combinations, including Any-to-English, English-to-Any, and Any-to-Any translation. These models also support language detection alongside translation tasks.
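The division of labor described above can be summarized in a few lines of Python. This is only a sketch of the article's description, not part of the Riva SDK: the function name `candidate_model_families`, the short language codes, and the rule that a missing target language means plain transcription are all assumptions made for illustration. It also does not check which specific languages each model actually covers.

```python
from typing import List, Optional


def candidate_model_families(streaming: bool,
                             source_lang: str,
                             target_lang: Optional[str] = None) -> List[str]:
    """Return the model families the article associates with a request.

    Hypothetical helper: target_lang=None means transcription only (ASR);
    any other value means speech translation (AST).
    """
    if streaming:
        # The article mentions Parakeet only for streaming multilingual ASR.
        return ["Parakeet"] if target_lang is None else []

    if target_lang is None or target_lang == source_lang:
        # Offline transcription: both Whisper and Canary are described as options.
        return ["Whisper", "Canary"]

    # Offline translation: Canary is described as covering Any-to-English,
    # English-to-Any, and Any-to-Any; Whisper only translates into English.
    families = ["Canary"]
    if target_lang == "en":
        families.insert(0, "Whisper")
    return families
```

For example, an offline German-to-English request yields both families, while an offline English-to-French request leaves only Canary.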

Selective NMT Deactivation

One notable feature in this update is the ability to selectively deactivate parts of the Neural Machine Translation (NMT) process using an SSML tag. Users can mark text segments that should not be translated, which gives them greater control over the translation output. In addition, a new do-not-translate (DNT) dictionary lets users specify how certain words or phrases should be translated, allowing further customization.
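To make the idea concrete, here is a minimal, self-contained sketch of how tag-based protection and a phrase dictionary could behave. It does not call Riva and is not NVIDIA's implementation. The article does not name the tag, so the default `dnt` is an assumption, as are the function names and the dictionary format. The `translate` argument is a stand-in for any real translation function.

```python
import re
from typing import Callable, Dict, Optional


def _translate_free_text(segment: str,
                         translate: Callable[[str], str],
                         dnt_dict: Dict[str, str]) -> str:
    """Translate unprotected text, substituting fixed renderings for dictionary phrases."""
    if not segment:
        return ""
    if not dnt_dict:
        return translate(segment)
    # Longest phrases first so overlapping entries resolve predictably.
    phrases = sorted(dnt_dict, key=len, reverse=True)
    splitter = re.compile("(" + "|".join(re.escape(p) for p in phrases) + ")")
    pieces = splitter.split(segment)  # the capture group keeps matched phrases
    return "".join(
        dnt_dict[p] if p in dnt_dict else (translate(p) if p else "")
        for p in pieces
    )


def translate_with_dnt(text: str,
                       translate: Callable[[str], str],
                       dnt_dict: Optional[Dict[str, str]] = None,
                       tag: str = "dnt") -> str:
    """Translate text while leaving <tag>...</tag> spans untouched.

    `tag` is a hypothetical tag name chosen for this sketch.
    """
    dnt_dict = dnt_dict or {}
    name = re.escape(tag)
    pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
    out = []
    pos = 0
    for match in pattern.finditer(text):
        out.append(_translate_free_text(text[pos:match.start()], translate, dnt_dict))
        out.append(match.group(1))  # protected span is copied verbatim, tags dropped
        pos = match.end()
    out.append(_translate_free_text(text[pos:], translate, dnt_dict))
    return "".join(out)
```

With `str.upper` standing in for a translator, `translate_with_dnt("Please restart <dnt>Riva</dnt> before using the GPU.", str.upper, {"GPU": "processeur graphique"})` returns `"PLEASE RESTART Riva BEFORE USING THE processeur graphique."`: the tagged span is untouched, the dictionary phrase gets its fixed rendering, and everything else goes through the translator.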

Deployment and Usage

These capabilities are deployed through the Riva Skills Quick Start resource folder, which contains the scripts and configuration files needed to set up a Riva server with Whisper and Canary support. Users can choose between the Whisper and Canary models based on their ASR needs, and use the provided scripts to optimize model deployment for their GPU architecture.

NVIDIA's commitment to expanding the linguistic and functional scope of its ASR systems is evident in the integration of these models and features. By supporting a wider range of languages and offering finer translation controls, Riva continues to set industry standards in speech recognition and translation technology.

Disclaimer:

The views in this article only represent the author's personal views, and do not constitute investment advice on this platform. This platform does not guarantee the accuracy, completeness and timeliness of the information in the article, and will not be liable for any loss caused by the use of or reliance on the information in the article.
